AACR Project GENIE: Data

The first set of cancer genomic data aggregated through the AACR Project Genomics Evidence Neoplasia Information Exchange (GENIE) was made available to the global community in January 2017. The fourth data set, GENIE 4.0-public, was released in July 2018 and added more than 8,000 records to the database. The combined data set now includes more than 48,000 de-identified genomic records collected from patients treated at the consortium's participating institutions, making it among the largest fully public cancer genomic data sets released to date. The GENIE 4.1-public release, a patch correcting a few outstanding issues found in the original 4.0-public release, was posted to both cBioPortal and Synapse on September 19, 2018.

The fifth data set, GENIE 5.0-public, is scheduled for public release in January 2019.

The combined data set now includes data for over 80 major cancer types, including data from more than 7,500 patients with lung cancer, nearly 5,500 patients with breast cancer, and more than 5,100 patients with colorectal cancer.
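Per-cancer-type totals like those above can be recomputed from a downloaded copy of the data. The following is a minimal sketch, not an official GENIE tool. It assumes the clinical records arrive as a tab-delimited text file with optional leading comment lines starting with `#` and a cancer type column named `CANCER_TYPE`. Both the column name and the sample rows are assumptions for illustration and should be checked against the data guide.

```python
import csv
from collections import Counter
from io import StringIO


def count_cancer_types(handle, column="CANCER_TYPE"):
    """Count records per cancer type in a tab-delimited clinical file.

    `column` is an assumed column name; adjust it to match the actual file.
    """
    # Skip comment lines that begin with '#', if any are present.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    # Ignore records where the cancer type is missing or empty.
    return Counter(row[column] for row in reader if row.get(column))


# Made-up sample rows standing in for a downloaded clinical file.
sample = StringIO(
    "#hypothetical header comment\n"
    "SAMPLE_ID\tCANCER_TYPE\n"
    "S1\tBreast Cancer\n"
    "S2\tNon-Small Cell Lung Cancer\n"
    "S3\tBreast Cancer\n"
)
counts = count_cancer_types(sample)
for cancer_type, n in counts.most_common():
    print(cancer_type, n)
```

With a real download, the `StringIO` object would be replaced by an open file handle, for example `open(path, newline="")`, where `path` points at the clinical file.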

Additional details, analyses, and summaries of the data attributes can be visualized here. For information regarding the use of the data, consult the data guide.

Users can access the data directly via cBioPortal or download it from Sage Bionetworks. Users will need to create an account on either site and agree to the terms of access.

For frequently asked questions, visit our FAQ page.

Date Updated: 9/28/18