Joint Design of Advanced Computing Solutions for Cancer (JDACS4C)
JDACS4C is a collaborative, cross-agency partnership between the National Cancer Institute (NCI) and the Department of Energy (DOE). JDACS4C was created in 2016 under the NCI Cancer Moonshot℠ to accelerate cancer research using emerging exascale computing capabilities. The capabilities developed through this collaboration are anticipated to lead to better risk assessment and treatments for cancer patients. Investigators from NCI and the Frederick National Laboratory for Cancer Research work collaboratively with experts in computational, data, and physical sciences from four DOE national laboratories: Argonne, Los Alamos, Lawrence Livermore, and Oak Ridge. Based on a multidisciplinary team science approach, the JDACS4C program has three research pilot efforts at the molecular, cellular, and population levels that align with several existing NCI and DOE program investments. Each effort is jointly led by scientists from DOE national laboratories and NCI-funded programs and laboratories, and each pilot uses cutting-edge high-performance computing from the DOE Exascale Computing Project to accelerate progress on key cancer priorities. Looking to the future, these new capabilities and technologies are expected to significantly reduce the time required for large-scale data analysis, simulation runs, and model development and validation. As a result, basic, translational, clinical, and population-level cancer research is expected to accelerate dramatically, helping to fulfill the Cancer Moonshot’s goal of accomplishing ten years’ worth of research in five years.
Pilot 1: Predictive Modeling for Pre-Clinical Screening
The cellular-level pilot, focused on developing predictive models to improve pre-clinical therapeutic drug screening, is jointly led by scientists from the NCI Division of Cancer Treatment and Diagnosis, Frederick National Laboratory for Cancer Research, and Argonne National Laboratory, with involvement across the JDACS4C collaboration. The goal of this pilot is to employ advanced computational technologies, including machine learning and deep learning, to rapidly develop, test, and validate predictive pre-clinical drug efficacy models, accelerating the identification of promising new treatment options for precision oncology. This pilot is developing machine learning-based predictive models trained on experimental data from many sources, including cancer cell lines, organoids, and patient-derived xenografts (PDXs). Already, the team has established a large training database that integrates NCI data sources, including the NCI ALMANAC study, the NCI Genomic Data Commons, and over a dozen additional sources, and, using this integrated database, has developed deep learning models for predicting tumor response to single drugs and drug combinations. Looking forward, the pilot expects to focus on hybrid computational models, those employing both data and biological/mechanistic understanding, to continue improving the initial computational models, offering the potential to improve risk identification, pre-clinical drug screening, and treatment selection when translated into the clinic.
Pilot 2: Improving Outcomes for RAS-related Cancers
The molecular-level effort, which builds on the accomplishments of the ongoing NCI RAS Initiative, is co-led by scientists from the Frederick National Laboratory for Cancer Research and Lawrence Livermore National Laboratory, with involvement across the JDACS4C collaboration. The team is deepening the understanding of RAS protein biology to reveal possible new treatment options for RAS-related cancers by integrating next-generation experimental data with large-scale computational simulations. Already, by leveraging computational approaches to address experimental gaps and by developing multi-scale modeling and machine learning capabilities, the team has experimentally measured key protein-protein and protein-membrane interactions and integrated these observations into novel multi-scale simulations, corroborating biophysics and computation on the path to developing new insights for future treatments. In the future, the ability to simulate important cancer protein interactions and cell-membrane-initiated signaling cascades at unprecedented scale and fidelity will be used to deepen our biological understanding of the disease while accelerating the development of drugs for RAS-driven cancers as well as other cancers associated with undruggable targets.
Pilot 3: Population Information Integration, Analysis, and Modeling for Precision Surveillance
The population-level pilot, based on cancer statistics collected in the NCI Surveillance, Epidemiology, and End Results (SEER) program database, is co-led by scientists from the NCI Division of Cancer Control and Population Sciences and Oak Ridge National Laboratory, with involvement across the JDACS4C collaboration. The goal of this effort is to pilot the transformation of cancer care by applying advanced computational capabilities to population-based cancer data, leading to new understanding of the impact of new diagnostics and treatments, as well as other factors affecting patient outcomes, in a real-world setting. Already, the team has developed, deployed, and refined pathology report annotation processes to produce critical training data and validation processes, established agreements to access real-world cancer registry data, and applied the new tools to automatically extract and code key features in pathology reports. In the future, approaches that accelerate the development of new integrated sources of health and cancer information, together with the knowledge gained from the JDACS4C molecular- and cellular-level pilots, aim to yield insights that improve the prediction of cancer patient outcomes and clarify the real-world factors affecting patient health trajectories and clinical trial eligibility.
CANDLE (CANcer Distributed Learning Environment)
Collectively, these efforts are supported with the latest deep learning capabilities through the CANDLE (CANcer Distributed Learning Environment) effort of the DOE’s Exascale Computing Project. CANDLE is an open-source, collaboratively developed software platform that provides deep learning methodologies and capabilities to accelerate cancer research. The CANDLE project has already delivered software for use within the JDACS4C pilots as well as other cancer research projects, and has conducted hands-on workshops with the cancer research community to share these new capabilities. Work is underway on new releases of CANDLE. Focus areas include model optimization to increase the capability to handle even larger amounts of data with parallel computing, and the development of new areas in which deep learning can accelerate cancer research.