uc-0006: Gathering blood cancer data sets

Completion Date: ✅ September 2020

Tutorial walkthrough of this use case

NIH Goal:

Enhance the ability to ask scientific questions across data sets

Persona

p-001: Clinical Researcher

Objective

obj-0001: Multi DCC Comparison

Description

Acute Myeloid Leukemia (AML) is a type of blood cancer. In AML, the affected myeloid cells, a type of white blood cell, are not functional and build up in the bone marrow, leaving reduced capacity for healthy white and red blood cells. While risk factors for developing AML exist, the underlying cause often remains unknown. Gene mutations and chromosomal abnormalities in the leukemia cells occur sporadically. Characterizing the wide spectrum of genetic events involved in AML will aid in better understanding its etiology and, ultimately, in developing improved therapies.

Amberose would like to combine whole genome sequencing (WGS) data with global transcriptomic profiling using RNA-sequencing (RNA-seq) to look for functional dysregulation of a few genes. They know that there are likely already data sets created by NIH researchers that they could use for their initial hypothesis generation, and decide to start by searching Common Fund data at the CFDE portal.

Amberose navigates to the CFDE portal and searches by Biosample, then filters that list by Anatomy, searching within those results for 'blood' and 'venous blood'. They then filter these results to keep only the whole genome sequencing (WGS) and RNA-seq assay values. Amberose then looks at the "Project" filter and finds that the several thousand results in their search belong to only 17 projects. By reading the project information for each, Amberose narrows their search down to the two that seem most applicable: Genotype-Tissue Expression (GTEx) and TARGET: Acute Myeloid Leukemia. Amberose exports the list of files that belong to those two projects, which they can use to obtain the files (or request access to them) at their parent portals (Kids First and GTEx).
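The same filter sequence could also be reproduced offline on an exported table. The following is a minimal sketch only: the column names (`file_id`, `anatomy`, `assay`, `project`), the helper functions, and every row in `ROWS` are hypothetical placeholders, not the CFDE portal's actual export schema or real records.

```python
import csv
import io

# Hypothetical rows standing in for a portal export. The column names and
# file IDs are assumptions made for illustration, not the real CFDE schema.
ROWS = [
    {"file_id": "file-001", "anatomy": "blood", "assay": "WGS",
     "project": "TARGET: Acute Myeloid Leukemia"},
    {"file_id": "file-002", "anatomy": "venous blood", "assay": "RNA-seq",
     "project": "Genotype-Tissue Expression (GTEx)"},
    {"file_id": "file-003", "anatomy": "blood", "assay": "RNA-seq",
     "project": "Placeholder Project"},
    {"file_id": "file-004", "anatomy": "liver", "assay": "WGS",
     "project": "Genotype-Tissue Expression (GTEx)"},
    {"file_id": "file-005", "anatomy": "blood", "assay": "ChIP-seq",
     "project": "TARGET: Acute Myeloid Leukemia"},
]


def filter_files(rows, anatomy_terms, assays, projects=None):
    """Keep rows matching the anatomy and assay filters, optionally by project."""
    kept = []
    for row in rows:
        if row["anatomy"] not in anatomy_terms:
            continue
        if row["assay"] not in assays:
            continue
        if projects is not None and row["project"] not in projects:
            continue
        kept.append(row)
    return kept


def projects_in(rows):
    """List the distinct projects behind a result set, sorted by name."""
    return sorted({row["project"] for row in rows})


def to_manifest(rows):
    """Serialize the selected rows as CSV text, one line per file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["file_id", "anatomy", "assay", "project"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Steps 1 and 2: anatomy filter, then assay filter.
candidates = filter_files(ROWS, {"blood", "venous blood"}, {"WGS", "RNA-seq"})

# Step 3: inspect which projects the remaining files belong to.
candidate_projects = projects_in(candidates)

# Step 4: keep only the two projects chosen after reading their descriptions.
selected = filter_files(
    ROWS,
    {"blood", "venous blood"},
    {"WGS", "RNA-seq"},
    projects={"Genotype-Tissue Expression (GTEx)",
              "TARGET: Acute Myeloid Leukemia"},
)
manifest = to_manifest(selected)
```

Passing the project set as an optional argument mirrors the order of the walkthrough: the project list is only known after the anatomy and assay filters have been applied.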

Tasks for this use case:

Requirements for this use case: