EGAD00000000001 
   
  
    
    WTCCC1 project samples from 1958 British Birth Cohort 
    
   
  
    
      
      Affymetrix 500K 
      
    
   
  1504 
 
  
    EGAD00000000002 
   
  
    
    WTCCC1 project samples from UK National Blood Service 
    
   
  
    
      
      Affymetrix 500K 
      
    
   
  1500 
 
  
    EGAD00000000003 
   
  
    
    WTCCC1 project Bipolar Disorder (BD) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000004 
   
  
    
    WTCCC1 project Coronary Artery Disease (CAD) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000005 
   
  
    
    WTCCC1 project Inflammatory Bowel Disease (IBD) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000006 
   
  
    
    WTCCC1 project Hypertension (HT) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000007 
   
  
    
     WTCCC1 project Rheumatoid arthritis (RA) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000008 
   
  
    
    WTCCC1 project Type 1 Diabetes (T1D) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000009 
   
  
    
    WTCCC1 project Type 2 Diabetes (T2D) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000010 
   
  
    
    WTCCC1 project Ankylosing Spondylitis (AS) samples 
    
   
  
    
      
      Illumina 15K 
      
    
   
  957 
 
  
    EGAD00000000011 
   
  
    
    WTCCC1 project Autoimmune Thyroid Disease (ATD) samples 
    
   
  
    
      
      Illumina 15K 
      
    
   
  900 
 
  
    EGAD00000000012 
   
  
    
    WTCCC1 project Multiple Sclerosis (MS) samples 
    
   
  
    
   
  975 
 
  
    EGAD00000000013 
   
  
    
    WTCCC1 project Breast cancer (BC) samples 
    
   
  
    
      
      Illumina 15K 
      
    
   
  1004 
 
  
    EGAD00000000014 
   
  
    
    WTCCC1 project samples from 1958 British Birth Cohort 
    
   
  
    
   
  1504 
 
  
    EGAD00000000015 
   
  
    
    WTCCC project African control samples 
    
   
  
    
      
      Affymetrix 500K 
      
    
   
  1496 
 
  
    EGAD00000000016 
   
  
    
    WTCCC project Tuberculosis (TB) samples 
    
   
  
    
      
      Affymetrix 500K 
      
    
   
  1498 
 
  
    EGAD00000000017 
   
  
    
    Cord blood control samples from Gambia 
    
   
  
    
   
  - 
 
  
    EGAD00000000018 
   
  
    
    Severe malaria cases from Gambia 
    
   
  
    
   
  - 
 
  
    EGAD00000000019 
   
  
    
    840 families where both parents have been genotyped together with the child with severe malaria 
    
   
  
    
   
  1 
 
  
    EGAD00000000020 
   
  
    
    685 families where both parents have been genotyped together with the child with severe malaria 
    
   
  
    
   
  - 
 
  
    EGAD00000000021 
   
  
    
    WTCCC2 project samples from 1958 British Birth Cohort 
    
   
  
    
   
  3000 
 
  
    EGAD00000000022 
   
  
    
    WTCCC2 project samples from 1958 British Birth Cohort 
    
   
  
    
   
  3000 
 
  
    EGAD00000000023 
   
  
    
    WTCCC2 project samples from National Blood Donors (NBS) Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000024 
   
  
    
    WTCCC2 project samples from National Blood Donors (NBS) Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000025 
   
  
    
    WTCCC2 project Ulcerative Colitis (UC) samples 
    
   
  
    
      
      Affymetrix 6.0 
      
    
   
  2869 
 
  
    EGAD00000000026 
   
  
    
    Randomly-selected, unrelated individuals 
    
   
  
    
      
      Illumina 610-Quad 
      
    
   
  518 
 
  
    EGAD00000000027 
   
  
    
    eQTL data for European newborns 
    
   
  
    
      
       Illumina HumanHap550-2v3_B-Beadstudio 
      
    
   
  176 
 
  
    EGAD00000000028 
   
  
    
     Aggregate results from a GWAS study on 3352 cases and 3145 controls 
    
   
  
    
   
  6497 
 
  
    EGAD00000000029 
   
  
    
    Aggregate results from a case-control study on stroke and ischemic stroke. 
    
   
  
    
   
  19602 
 
  
    EGAD00000000030 
   
  
    
    T1DGC project 1958 British Birth Cohort samples 
    
   
  
    
   
  2604 
 
  
    EGAD00000000031 
   
  
    
    HLA genotyping of 1958 British Birth Cohort samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000032 
   
  
    
    NcOEDG Helsinki 1 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000033 
   
  
    
    NcOEDG Helsinki 2 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000034 
   
  
    
    NcOEDG Helsinki 3 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000035 
   
  
    
    NcOEDG Helsinki 4 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000036 
   
  
    
    NcOEDG Stockholm 1 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000037 
   
  
    
    NcOEDG Stockholm 2 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000038 
   
  
    
    NcOEDG Stockholm 3 samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000039 
   
  
    
    NcOEDG Malmo - Lund samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000040 
   
  
    
    GenomEUtwin Danish (DK) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000041 
   
  
    
    GenomEUtwin Swedish (SWE) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000042 
   
  
    
    GenomEUtwin Finnish (FIN) samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000043 
   
  
    
     GenomEUtwin control samples 
    
   
  
    
      
      Illumina HumanHap300-Duo 
      
      Illumina HumanHap 550K 
      
    
   
  2099 
 
  
    EGAD00000000044 
   
  
    
    Northern Finland Birth Cohort 1966 samples 
    
   
  
    
      
      Illumina HumanHap370 
      
    
   
  5844 
 
  
    EGAD00000000045 
   
  
    
    Genomic sequencing and transcriptome shotgun sequencing of a metastatic tumour and its recurrence after drug therapy in a single patient 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  1 
 
  
    EGAD00000000046 
   
  
    
    RNA-SEQ data from 3 recurrent and 1 ovarian primary Granulosa Cell Tumour samples 
    
   
  
    
   
  4 
 
  
    EGAD00000000047 
   
  
    
     Signal data from 3 recurrent and 1 ovarian primary Granulosa Cell Tumour samples 
    
   
  
    
   
  4 
 
  
    EGAD00000000048 
   
  
    
    Sequencing data from oestrogen-receptor-alpha-positive metastatic lobular breast cancer sample 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  1 
 
  
    EGAD00000000049 
   
  
    
    RNA-SEQ data from oestrogen-receptor-alpha-positive metastatic lobular breast cancer sample 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  1 
 
  
    EGAD00000000051 
   
  
    
    Sequencing data from matching Renal Carcinoma samples 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  25 
 
  
    EGAD00000000052 
   
  
    
     Sequencing data from matching Pancreatic Carcinoma samples 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  25 
 
  
    EGAD00000000053 
   
  
    
    Sequencing data from Breast Cancer samples 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  1 
 
  
    EGAD00000000054 
   
  
    
    NCI-H209 is an immortal cell line derived from a bone marrow metastasis of a patient with small cell lung cancer, taken before chemotherapy. The specimen showed histologically typical small cells with classic neuroendocrine features. NCI-BL209 is an EBV-transformed B-cell line derived from the same patient as the small cell lung cancer cell line, NCI-H209 
    
   
  
    
      
      Life Tech - Solid 
      
    
   
  1 
 
  
    EGAD00000000055 
   
  
    
    COLO-829 is a publicly available immortal cancer cell line and COLO-829BL is a lymphoblastoid cell line derived from the same patient 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  2 
 
  
    EGAD00000000056 
   
  
    
    WTCCC project samples from the primary biliary cirrhosis cohort 
    
   
  
    
      
      Illumina 610K Quad 
      
    
   
  1705 
 
  
    EGAD00000000057 
   
  
    
     WTCCC project samples from the Parkinson's disease cohort 
    
   
  
    
      
      Illumina 610K Quad 
      
    
   
  1705 
 
  
    EGAD00000000058 
   
  
    
    Aggregate results from 22 Carbamazepine-induced hypersensitivity syndrome patients and 2691 UK National Blood Service (NBS) control samples 
    
   
  
    
   
  2713 
 
  
    EGAD00000000059 
   
  
    
    Aggregate results from 43 Carbamazepine-induced hypersensitivity syndrome patients and 1296 1958 British Birth Cohort control samples 
    
   
  
    
   
  1 
 
  
    EGAD00000000060 
   
  
    
    Samples from the UK Glomerulonephritis DNA bank 
    
   
  
    
   
  - 
 
  
    EGAD00000000073 
   
  
    
    Gabriel samples from the 1958 British Birth Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000074 
   
  
    
    Gabriel samples from the Swedish BAMSE Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000075 
   
  
    
    Gabriel samples from the Swedish BAMSE Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000076 
   
  
    
     Gabriel samples from the Australian Busselton Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000077 
   
  
    
     Gabriel samples from the Australian Busselton Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000082 
   
  
    
    Gabriel samples from the French EGEA Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000083 
   
  
    
    Gabriel samples from the French EGEA Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000084 
   
  
    
    Gabriel samples from the German Gabriel Advanced Survey 
    
   
  
    
   
  1 
 
  
    EGAD00000000085 
   
  
    
    Gabriel samples from the German Gabriel Advanced Survey 
    
   
  
    
   
  1 
 
  
    EGAD00000000086 
   
  
    
    Gabriel samples from the multicenter GAIN cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000087 
   
  
    
    Gabriel samples from the multicenter GAIN cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000088 
   
  
    
    Gabriel samples from the Karelia Allergy Study 
    
   
  
    
   
  1 
 
  
    EGAD00000000089 
   
  
    
    Gabriel samples from the Karelia Allergy Study 
    
   
  
    
   
  1 
 
  
    EGAD00000000090 
   
  
    
    Gabriel samples from the Russian KMSU cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000091 
   
  
    
    Gabriel samples from the Russian KMSU cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000092 
   
  
    
    Gabriel samples from the German MAGIS cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000093 
   
  
    
    Gabriel samples from the German MAGIS cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000094 
   
  
    
    Gabriel samples from the UK MRCA cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000101 
   
  
    
    Gabriel samples from the Russian TOMSK cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000102 
   
  
    
    Gabriel samples from the Russian TOMSK cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000103 
   
  
    
    Gabriel samples from the Russian UFA cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000104 
   
  
    
    Gabriel samples from the Russian UFA cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000105 
   
  
    
    Gabriel samples from the multicenter occupational cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000106 
   
  
    
    Gabriel samples from the multicenter occupational cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000107 
   
  
    
    Gabriel samples from the multicenter occupational cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000108 
   
  
    
    Gabriel samples from the UK AUGOSA cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000109 
   
  
    
    Gabriel samples from the UK SEVERE cohort 
    
   
  
    
   
  1 
 
  
    EGAD00000000114 
   
  
    
    Whole transcriptome sequence data from 18 ovarian clear-cell carcinoma samples and one TOV21G ovarian clear-cell carcinoma cell line 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  1 
 
  
    EGAD00000000115 
   
  
    
     Summary data from GWAS analysis on 856 cases and 2836 controls 
    
   
  
    
   
  3719 
 
  
    EGAD00000000119 
   
  
    
    Genotypes from cell lines derived from breast carcinoma tissue 
    
   
  
    
      
      Affymetrix 6.0 
      
    
   
  51 
 
  
    EGAD00000000120 
   
  
    
    WTCCC2 project Multiple Sclerosis (MS) samples 
    
   
  
    
      
      Human670-QuadCustom v1 
      
    
   
  11375 
 
  
    EGAD00000000121 
   
  
    
    Genotypes at MITF E318K variant 
    
   
  
    
      
      Taqman and sequencing 
      
    
   
  2488 
 
  
    EGAD00000000122 
   
  
    
    Genotypes at MITF E318K variant 
    
   
  
    
      
      Illumina Human660W-Quad 
      
      Illumina HumanCNV370 
      
      Illumina HumanHap 300 v2 Duo 
      
    
   
  1925 
 
  
    EGAD00001000001 
   
  
    
    Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  18 
 
  
    EGAD00001000002 
   
  
    
    Massive genomic rearrangement acquired in a single catastrophic event during cancer development 
    
   
  
    
      
      Illumina Genome Analyzer 
      
      Illumina Genome Analyzer II 
      
    
   
  11 
 
  
    EGAD00001000003 
   
  
    
    Gencode Exome Pilot 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  7 
 
  
    EGAD00001000004 
   
  
    
    CLL cancer Sample Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer 
      
      Illumina Genome Analyzer II 
      
    
   
  5 
 
  
    EGAD00001000005 
   
  
    
    Various Cancer Fusion Gene Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  14 
 
  
    EGAD00001000007 
   
  
    
    Osteosarcoma Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  43 
 
  
    EGAD00001000013 
   
  
    
    CLL Cancer Whole Genome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  19 
 
  
    EGAD00001000014 
   
  
    
    Agilent whole exome hybridisation capture will be performed on genomic DNA derived from 25 renal cancers and matched normal DNA from the same patients. Three lanes of Illumina GA sequencing will be performed on the resulting 50 exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  54 
 
  
    EGAD00001000015 
   
  
    
    Exome sequencing of hyperplastic polyposis patients. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  84 
 
  
    EGAD00001000016 
   
  
    
    Familial Melanoma Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  89 
 
  
    EGAD00001000017 
   
  
    
    PAS Pedigrees: Identification of novel genetic variants contributing to cardiovascular disease in pedigrees with premature atherosclerosis. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000018 
   
  
    
    Identifying causative mutations for Thrombocytopenia with Absent Radii 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  5 
 
  
    EGAD00001000019 
   
  
    
    Lethal malformation syndrome 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000021 
   
  
    
    Paroxysmal neurological disorders 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  97 
 
  
    EGAD00001000022 
   
  
    
    Exome sequencing in patients with cardiac arrhythmias 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  20 
 
  
    EGAD00001000023 
   
  
    
    Recurrent Somatic Mutations in CLL 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  11 
 
  
    EGAD00001000024 
   
  
    
    Whole Exome Sequencing for Characterization of Disease Causing Mutations in two Pakistani Families Suffering from Autosomal Recessive Ocular Disorders. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  4 
 
  
    EGAD00001000025 
   
  
    
    Determination of the molecular nature of the Vel blood group by exome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  4 
 
  
    EGAD00001000026 
   
  
    
    Investigation of the genetic basis of the rare syndrome Post-Transfusion Purpura (PTP) 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  5 
 
  
    EGAD00001000027 
   
  
    
    ICGC Germany PedBrain Medulloblastoma Pilot_2_LM 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000029 
   
  
    
    Grey Platelet Syndrome (GPS) 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  5 
 
  
    EGAD00001000030 
   
  
    
    Analysis of genomic integrity of disease-corrected human induced pluripotent stem cells by exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000031 
   
  
    
    Human Colorectal Cancer Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  16 
 
  
    EGAD00001000032 
   
  
    
    Hepatitis C IL28B pooled resequencing study with 100 responders and 100 non-responders 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  4 
 
  
    EGAD00001000033 
   
  
    
    "SNV detection from formalin fixed paraffin embedded (FFPE) samples" 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000034 
   
  
    
    "Usage of small amounts of DNA for Illumina sequencing" 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  3 
 
  
    EGAD00001000035 
   
  
    
    "Single nucleotide variant detection in multiple foci of three prostate cancer tumors" 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  9 
 
  
    EGAD00001000036 
   
  
    
    "Copy number variant detection in multiple foci of three prostate cancer tumors" 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  9 
 
  
    EGAD00001000037 
   
  
    
    An evaluation of different strategies for large-scale pooled sequencing study design. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  7 
 
  
    EGAD00001000038 
   
  
    
    Hyperfibrinolysis 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  5 
 
  
    EGAD00001000039 
   
  
    
    Platelet collagen defect 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000040 
   
  
    
    Bleeding 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000041 
   
  
    
    Various Platelet Disorders 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  7 
 
  
    EGAD00001000042 
   
  
    
    Whole-Exome-Seq-Dataset 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  30 
 
  
    EGAD00001000043 
   
  
    
    RNA-Seq-Dataset 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  16 
 
  
    EGAD00001000044 
   
  
    
    Recurrent Somatic Mutations in CLL 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  212 
 
  
    EGAD00001000045 
   
  
    
    Somatic mutation of SF3B1 in myelodysplasia with ring sideroblasts and other cancers 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001000046 
   
  
    
    Gastric Cancer Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  43 
 
  
    EGAD00001000047 
   
  
    
    exome sequence data for 49 HIV elite long term non-progressors and rapid progressors. Partial dataset (overlap with EGAD00001000087) of raw BAMs mapped to GRCh37_53. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  49 
 
  
    EGAD00001000048 
   
  
    
    monozygotic twin discordant for schizophrenia 
    
   
  
    
      
      Complete Genomics 
      
    
   
  2 
 
  
    EGAD00001000049 
   
  
    
    Pancreatic adenocarcinoma QCMG 20110901 
    
   
  
    
      
      AB SOLiD 4 System 
      
      AB SOLiD System 3.0 
      
    
   
  26 
 
  
    EGAD00001000050 
   
  
    
    Tandem duplication of chromosomal segments is common in ovarian and breast cancer genomes 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  13 
 
  
    EGAD00001000052 
   
  
    
    UK10K_NEURO_MUIR REL-2011-01-28 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  104 
 
  
    EGAD00001000053 
   
  
    
    Exome sequencing in patients with Calcific Aortic Valve Stenosis 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000054 
   
  
    
     Mutational Screening of Human Acute Myeloid Leukaemia Samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000055 
   
  
    
    Genetic variation in Kuusamo 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  434 
 
  
    EGAD00001000057 
   
  
    
    RNA-Seq analysis 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  15 
 
  
    EGAD00001000058 
   
  
    
    Exome Sequencing analysis 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  21 
 
  
    EGAD00001000059 
   
  
    
    Screening for human epigenetic variation at CpG islands 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  116 
 
  
    EGAD00001000060 
   
  
    
    Acral melanoma study whole genomes 
    
   
  
    
      
      Complete Genomics 
      
    
   
  3 
 
  
    EGAD00001000061 
   
  
    
    Acral melanoma study whole exomes 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  3 
 
  
    EGAD00001000062 
   
  
    
    ADCC Rearrangement Screen 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000063 
   
  
    
    Triple Negative Breast Cancer sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000064 
   
  
    
    Cell Line Sub Clone Rearrangement Screen 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000065 
   
  
    
    Mixed Leukemia Rearrangement Screen 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  5 
 
  
    EGAD00001000066 
   
  
    
    Breast Cancer Follow Up Series 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  288 
 
  
    EGAD00001000067 
   
  
    
    Cancer Single Cell Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000068 
   
  
    
    Multifocal Breast Project 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001000069 
   
  
    
    Lung Rearrangement Study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000070 
   
  
    
    TMD_AMLK Exome Study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001000071 
   
  
    
    Kaposi sarcoma exome 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000072 
   
  
    
    Fanconi Anemia transformation to AML 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000073 
   
  
    
    MDSMPN Rearrangement Screen 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000074 
   
  
    
    Integrative Oncogenomics of Multiple Myeloma 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  174 
 
  
    EGAD00001000075 
   
  
    
    Gastric and Esophageal tumour rearrangement screen 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001000076 
   
  
    
    CRLF2 sequencing project 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001000077 
   
  
    
    CRLF2 sequencing project Exomes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  26 
 
  
    EGAD00001000078 
   
  
    
    ALK inhibitors in the context of ALK-dependent cancer cell lines 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000079 
   
  
    
    PREDICT 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  186 
 
  
    EGAD00001000080 
   
  
    
    Genomics of Colorectal Cancer Metastases - Massively Parallel Sequencing of Matched Primary and Metastatic tumours to Identify a Metastatic Signature of Somatic Mutations (MOSAIC) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  351 
 
  
    EGAD00001000081 
   
  
    
    Splenic Marginal Zone Lymphoma with villous lymphocytes exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000082 
   
  
    
    20 Matched Pair Breast Cancer Genomes 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001000083 
   
  
    
    Recurrent Somatic Mutations in CLL 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina Genome Analyzer IIx 
      
    
   
  61 
 
  
    EGAD00001000084 
   
  
    
    Matched Ovarian Cancer Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  23 
 
  
    EGAD00001000085 
   
  
    
    Somatic Histone H3 mutations 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000086 
   
  
    
    Analysis of genomic integrity of disease-corrected human induced pluripotent stem cells by exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000087 
   
  
    
    exome sequence data for 25 HIV elite long term non-progressors and rapid progressors. Partial dataset (overlap with EGAD00001000047) of raw BAMs mapped to GRCh37_53. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001000088 
   
  
    
    ER-, HER2-, PR- breast Cancer genome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000089 
   
  
    
    Acute Lymphoblastic Leukemia Exome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  20 
 
  
    EGAD00001000090 
   
  
    
    Glioma cell lines rearrangement screen 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  3 
 
  
    EGAD00001000091 
   
  
    
    Non Tumour Renal Cell Line Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  1 
 
  
    EGAD00001000092 
   
  
    
    Cancer Exome Resequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  58 
 
  
    EGAD00001000093 
   
  
    
    Breast Cancer Exome Resequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  21 
 
  
    EGAD00001000094 
   
  
    
    Cancer Genome Libraries Tests 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  16 
 
  
    EGAD00001000095 
   
  
    
    Acute Myeloid Leukemia Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001000096 
   
  
    
    Pancreatic adenocarcinoma QCMG 20120201 
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  166 
 
  
    EGAD00001000097 
   
  
    
    Matched breast cancer fusion gene study 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000098 
   
  
    
    FRCC Exome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  16 
 
  
    EGAD00001000099 
   
  
    
    Meningioma Exome 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  26 
 
  
    EGAD00001000100 
   
  
    
    Renal Matched Pair Cell Line Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  10 
 
  
    EGAD00001000101 
   
  
    
    ADCC Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  125 
 
  
    EGAD00001000102 
   
  
    
    Myeloproliferative Disorder Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000103 
   
  
    
    Myeloproliferative Disorder Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  4 
 
  
    EGAD00001000104 
   
  
    
    Acute Lymphoblastic Leukemia Exome sequencing 2 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  97 
 
  
    EGAD00001000105 
   
  
    
    MuTHER adipose tissue small RNA expression 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  130 
 
  
    EGAD00001000106 
   
  
    
    Primary Myelofibrosis Myeloproliferative Disease exome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  67 
 
  
    EGAD00001000107 
   
  
    
    SCAT osteosarcoma sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001000108 
   
  
    
    Paroxysmal neurological disorders 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  327 
 
  
    EGAD00001000109 
   
  
    
     Unraveling the genetic basis of a collagen migration defect in patients with a combined platelet dysfunction and reduced bone density 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001000110 
   
  
    
    Breast Cancer Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  179 
 
  
    EGAD00001000111 
   
  
    
    CML Discovery Project 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000112 
   
  
    
    Identifying Novel Fusion Genes in Myeloma 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000113 
   
  
    
    Mutational landscapes of primary triple negative breast cancers - Exomes 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  108 
 
  
    EGAD00001000115 
   
  
    
    Mutational landscapes of primary triple negative breast cancers - WGS 
    
   
  
    
      
      ABI_SOLID 
      
    
   
  32 
 
  
    EGAD00001000116 
   
  
    
    Acute Lymphoblastic Leukemia Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  61 
 
  
    EGAD00001000117 
   
  
    
    Myelodysplastic Syndrome Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  152 
 
  
    EGAD00001000118 
   
  
    
    Osteosarcoma Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  102 
 
  
    EGAD00001000119 
   
  
    
    Chordoma Exome Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001000121 
   
  
    
    Breast Cancer Whole Genome Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000122 
   
  
    
    DATA_SET_ICGC_PedBrainTumor_Medulloblastoma 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  206 
 
  
    EGAD00001000123 
   
  
    
    Polycythemia Vera Myeloproliferative Disease exome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  119 
 
  
    EGAD00001000124 
   
  
    
    Sequencing Acute Myeloid Leukaemia 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000125 
   
  
    
    Chondrosarcoma Exome 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  104 
 
  
    EGAD00001000126 
   
  
    
    HER2 positive Breast Cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  101 
 
  
    EGAD00001000127 
   
  
    
    Burden of Disease in Sarcoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  220 
 
  
    EGAD00001000128 
   
  
    
    Familial Thrombocytosis germline exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000129 
   
  
    
    Essential Thrombocythemia Myeloproliferative Disease exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  189 
 
  
    EGAD00001000130 
   
  
    
    Breast Cancer Matched Pair Cell Line Whole Genomes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001000131 
   
  
    
    Genetic landscape of hepatocellular carcinoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000132 
   
  
    
    Mutational landscapes of primary triple negative breast cancers - RNA seq 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  80 
 
  
    EGAD00001000133 
   
  
    
    The landscape of cancer genes and mutational processes in breast cancer 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  199 
 
  
    EGAD00001000134 
   
  
    
    Sequence reads for pediatric GBM samples for manuscript: Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  54 
 
  
    EGAD00001000135 
   
  
    
    Neuroblastoma whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001000136 
   
  
    
    CML blast phase rearrangement screen 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000138 
   
  
    
    The expression data for this study can be found here: http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1088/ and its SNP6 data can be found here: http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1087/ 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001000139 
   
  
    
    Tumor sample of a serous ovarian carcinoma 
    
   
  
    
      
      Complete Genomics 
      
    
   
  1 
 
  
    EGAD00001000140 
   
  
    
    Blood sample of a serous ovarian carcinoma patient 
    
   
  
    
      
      Complete Genomics 
      
    
   
  1 
 
  
    EGAD00001000141 
   
  
    
    Triple Negative Breast Cancer Whole Genomes 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  243 
 
  
    EGAD00001000142 
   
  
    
    Renal Follow Up Series 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  637 
 
  
    EGAD00001000143 
   
  
    
    Xenograft Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000144 
   
  
    
    Lung Cancer Whole Genomes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000145 
   
  
    
    Matched Pair Cancer Cell line Whole Genomes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001000147 
   
  
    
    Osteosarcoma Whole Genome 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  108 
 
  
    EGAD00001000149 
   
  
    
    A Comprehensive Catalogue of Somatic Mutations from a Human Cancer Genome 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000150 
   
  
    
    Targeted re-sequencing of 97 genes in T-ALL 
    
   
  
    
      
      454 GS FLX Titanium 
      
    
   
  33 
 
  
    EGAD00001000151 
   
  
    
    UK10K OBESITY REL-2011-07-14 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  88 
 
  
    EGAD00001000152 
   
  
    
    UK10K_RARE_THYROID REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001000153 
   
  
    
    UK10K_RARE_SIR REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001000154 
   
  
    
    Single-cell genome sequencing reveals DNA-mutation per cell cycle 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000158 
   
  
    
    Subgroup-specific structural variation across 1,000 medulloblastoma genomes 
    
   
  
    
   
  23 
 
  
    EGAD00001000159 
   
  
    
    DATA FILES FOR SJOS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001000160 
   
  
    
    DATA FILES FOR SJACT 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000161 
   
  
    
    DATA FILES FOR SJLGG 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001000162 
   
  
    
    DATA FILES FOR SJEPD 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001000163 
   
  
    
    DATA FILES FOR SJPHALL 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000164 
   
  
    
    Whole Genome Sequencing accompanying Genetic landscape of pediatric Rhabdomyosarcoma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001000165 
   
  
    
    DATA FILES FOR SJINF 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000167 
   
  
    
    UK10K_RARE_HYPERCHOL REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000168 
   
  
    
    UK10K_RARE_CILIOPATHIES REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001000170 
   
  
    
    UK10K_NEURO_MUIR REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  167 
 
  
    EGAD00001000171 
   
  
    
    UK10K_RARE_FIND REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001000173 
   
  
    
    UK10K_NEURO_ASD_FI REL-2012-01-13 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  85 
 
  
    EGAD00001000174 
   
  
    
    DATA_SET_Coverage_bias_sensitivity_of_variant_calling_for_4_WG_seq_tech 
    
   
  
    
      
      AB SOLiD 4 System 
      
      Complete Genomics 
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001000175 
   
  
    
    Identification of SPEN as a novel cancer gene and FGFR2 as a potential therapeutic target in adenoid cystic carcinoma 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  48 
 
  
    EGAD00001000176 
   
  
    
    DATA_SET_Comparing_sequencing_four_proto-typical_Burkitt_lymphomas_BL_IG-MYC_translocation 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000177 
   
  
    
    Whole Genome Methylation in CLL 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  6 
 
  
    EGAD00001000178 
   
  
    
    UK10K_RARE_CHD REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000179 
   
  
    
    UK10K_RARE_COLOBOMA REL-2012-01-13 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  75 
 
  
    EGAD00001000180 
   
  
    
    UK10K_RARE_NEUROMUSCULAR REL-2012-01-13 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  47 
 
  
    EGAD00001000181 
   
  
    
    UK10K_OBESITY_SCOOP REL-2012-01-13 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  212 
 
  
    EGAD00001000182 
   
  
    
    UK10K_NEURO_UKSCZ REL-2012-01-13 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  95 
 
  
    EGAD00001000183 
   
  
    
    UK10K_NEURO_FSZNK REL-2012-01-13 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  273 
 
  
    EGAD00001000184 
   
  
    
    UK10K_NEURO_FSZ_REL_2012_01_13 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  120 
 
  
    EGAD00001000185 
   
  
    
    UK10K_RARE_COLOBOMA REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  98 
 
  
    EGAD00001000186 
   
  
    
    UK10K_RARE_HYPERCHOL REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  71 
 
  
    EGAD00001000187 
   
  
    
    UK10K_RARE_THYROID REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  65 
 
  
    EGAD00001000188 
   
  
    
    UK10K_RARE_SIR REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  63 
 
  
    EGAD00001000189 
   
  
    
    UK10K_RARE_NEUROMUSCULAR REL-2012-02-22 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001000190 
   
  
    
    UK10K_RARE_FIND REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  90 
 
  
    EGAD00001000191 
   
  
    
    UK10K_RARE_CILIOPATHIES REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  128 
 
  
    EGAD00001000192 
   
  
    
    UK10K_RARE_CHD REL-2012-02-22 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000193 
   
  
    
    UK10K_OBESITY_SCOOP REL-2012-02-22 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  573 
 
  
    EGAD00001000194 
   
  
    
    UK10K_COHORT_TWINS REL-2011-12-01 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  1713 
 
  
    EGAD00001000195 
   
  
    
    For information about this sample set, please contact the sample custodian Nic Timpson: N.J.Timpson@bristol.ac.uk 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  740 
 
  
    EGAD00001000196 
   
  
    
    Neuroblastoma samples 
    
   
  
    
      
      Complete Genomics 
      
    
   
  203 
 
  
    EGAD00001000197 
   
  
    
    Progressive Hearing Loss 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  8 
 
  
    EGAD00001000198 
   
  
    
    Gene Discovery in Age-Related Hearing Loss 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000199 
   
  
    
    ORCADES_WGA 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  400 
 
  
    EGAD00001000200 
   
  
    
    Dilgom Exome 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  130 
 
  
    EGAD00001000201 
   
  
    
    MDACC-endo 
    
   
  
    
      
      AB SOLiD System 3.0 
      
    
   
  28 
 
  
    EGAD00001000202 
   
  
    
    Neuroblastoma samples (Analyses_vcf files) 
    
   
  
    
   
  204 
 
  
    EGAD00001000203 
   
  
    
    Otosclerosis gene discovery 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000204 
   
  
    
    Hearing loss in adults from South Carolina 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000205 
   
  
    
    BRAF and MEK resistant cell line clones 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000206 
   
  
    
    UK10K_RARE_COLOBOMA REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  123 
 
  
    EGAD00001000207 
   
  
    
    UK10K_RARE_HYPERCHOL REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  88 
 
  
    EGAD00001000208 
   
  
    
    UK10K_RARE_THYROID REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  65 
 
  
    EGAD00001000209 
   
  
    
    UK10K_RARE_FIND REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  121 
 
  
    EGAD00001000210 
   
  
    
    UK10K_RARE_CHD REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  124 
 
  
    EGAD00001000212 
   
  
    
    Functional characterisation of CpG islands in human tissues 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  26 
 
  
    EGAD00001000213 
   
  
    
    Screening for abnormal CGI methylation in primary colorectal tumours 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  21 
 
  
    EGAD00001000214 
   
  
    
    Whole genome sequencing of colon samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000215 
   
  
    
    RNA sequencing of colon tumor/normal sample pairs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  139 
 
  
    EGAD00001000216 
   
  
    
    Exome capture sequencing of colon tumor/normal pairs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  144 
 
  
    EGAD00001000217 
   
  
    
    UK10K_RARE_CILIOPATHIES REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  150 
 
  
    EGAD00001000218 
   
  
    
    UK10K_RARE_SIR REL-2012-07-05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  81 
 
  
    EGAD00001000219 
   
  
    
    UK10K_RARE_NEUROMUSCULAR REL-2012-07-05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  117 
 
  
    EGAD00001000220 
   
  
    
    Deep sequencing of CTCs 
    
   
  
    
      
      454 GS FLX Titanium 
      
      Illumina MiSeq 
      
    
   
  3 
 
  
    EGAD00001000221 
   
  
    
    Whole genome sequencing of SCLC tumor/normal samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000222 
   
  
    
    Exome capture sequencing of SCLC tumor/normal pairs and cell lines 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  103 
 
  
    EGAD00001000223 
   
  
    
    RNA sequencing of SCLC tumor/normal sample pairs and cell lines 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  79 
 
  
    EGAD00001000224 
   
  
    
    Enrichment of CRC 
    
   
  
    
      
      454 GS FLX Titanium 
      
    
   
  2 
 
  
    EGAD00001000225 
   
  
    
    Deep sequencing of KRAS 
    
   
  
    
      
      454 GS FLX Titanium 
      
    
   
  8 
 
  
    EGAD00001000226 
   
  
    
    Chordoma is a rare malignant bone tumor that expresses the transcription factor T. We conducted an association study of 40 patients with chordoma and 358 ancestry-matched, unaffected individuals with replication in an independent cohort. Whole-exome and Sanger sequencing of T exons reveals a strong risk association (allelic odds ratio (OR) = 4.9, P = 3.3 × 10^-11, CI = 2.9-8.1) with the common (minor allele frequency >5%) non-synonymous SNP rs2305089 in chordoma, which is exceptional in cancer genetics. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000227 
   
  
    
    EGAD00001000227_UK10K_NEURO_ABERDEEN_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  347 
 
  
    EGAD00001000228 
   
  
    
    EGAD00001000228_UK10K_NEURO_ASD_BIONED_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001000229 
   
  
    
    EGAD00001000229_UK10K_NEURO_ASD_FI_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  85 
 
  
    EGAD00001000230 
   
  
    
    EGAD00001000230_UK10K_NEURO_ASD_GALLAGHER_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  72 
 
  
    EGAD00001000231 
   
  
    
    EGAD00001000231_UK10K_NEURO_ASD_SKUSE_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  320 
 
  
    EGAD00001000232 
   
  
    
    EGAD00001000232_UK10K_NEURO_ASD_TAMPERE_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  54 
 
  
    EGAD00001000233 
   
  
    
    EGAD00001000233_UK10K_NEURO_EDINBURGH_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  219 
 
  
    EGAD00001000234 
   
  
    
    EGAD00001000234_UK10K_NEURO_FSZNK_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  281 
 
  
    EGAD00001000235 
   
  
    
    EGAD00001000235_UK10K_NEURO_IOP_COLLIER_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  170 
 
  
    EGAD00001000236 
   
  
    
    EGAD00001000236_UK10K_NEURO_MUIR_REL_2012_07_05 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  167 
 
  
    EGAD00001000237 
   
  
    
    EGAD00001000237_UK10K_NEURO_GURLING_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  43 
 
  
    EGAD00001000239 
   
  
    
    EGAD00001000239_UK10K_NEURO_IMGSAC_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001000240 
   
  
    
    UK10K_NEURO_FSZ_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  120 
 
  
    EGAD00001000241 
   
  
    
    EGAD00001000241_UK10K_OBESITY_SCOOP_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  674 
 
  
    EGAD00001000242 
   
  
    
    EGAD00001000242_UK10K_NEURO_ASD_MGAS_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001000243 
   
  
    
    Melanoma-TIL Study Exomes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  43 
 
  
    EGAD00001000245 
   
  
    
    Pulldown cytosine deaminases 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000246 
   
  
    
    Integrative Oncogenomics of multiple myeloma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  106 
 
  
    EGAD00001000247 
   
  
    
    Integrative Oncogenomics of multiple myeloma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  51 
 
  
    EGAD00001000248 
   
  
    
    RNAseq Pulldown 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000249 
   
  
    
    This is the BAM file generated after alignment with the BWA program for the SAIF genome 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000251 
   
  
    
    De novo mutations in schizophrenia 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  611 
 
  
    EGAD00001000252 
   
  
    
    Evaluation of PCR library method on whole genome samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000253 
   
  
    
    AML targeted resequencing study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001000254 
   
  
    
    This dataset contains the raw files generated for the SAIF genome project 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000255 
   
  
    
    Testing the feasibility of genome scale sequencing in routinely collected FFPE cancer specimens versus matched fresh frozen samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001000256 
   
  
    
    UK10K_NEURO_UKSCZ REL-2012-07-05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  595 
 
  
    EGAD00001000258 
   
  
    
    Deep RNA sequencing in CLL 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  107 
 
  
    EGAD00001000259 
   
  
    
    DATA FILES FOR SJAMLM7 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000260 
   
  
    
    Hypodiploid acute lymphoblastic leukemia whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001000261 
   
  
    
    Retinoblastoma whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000262 
   
  
    
    OICR PANCREATIC CANCER DATASET 
    
   
  
    
   
  4 
 
  
    EGAD00001000263 
   
  
    
    A small subsample of EGAD00001000689. Please do not use. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000264 
   
  
    
    Resistance towards chemotherapy is one of the main causes of treatment failure and death among breast cancer patients. The main objective of this project is to identify genetic mechanisms causing some breast cancer patients not to respond to a particular type of chemotherapy (epirubicin) while other patients respond very well to the same treatment. In the project we will perform genome/exome sequencing of a selection of breast cancer patients (n=30). These patients are drawn from a cohort where all patients have received treatment with epirubicin monotherapy before surgical removal of a locally advanced breast tumour, and where all patients have been subjected to objective evaluation of the response to the therapy. Subsequent to sequencing, we will analyse the data and compare with the clinical data for each patient (objective response to therapy). The main aim is to identify mutations that are associated with resistance to epirubicin. Identification of mutations with strong predictive value may have a direct impact on cancer treatment, since it opens the possibility of genetic testing of a tumour, and a decision on which drug is likely to work best, prior to treatment start. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001000265 
   
  
    
    This study uses a focused, bespoke bait pull-down library method to target the findings of chondrosarcoma whole-genome and whole-exome sequencing studies in order to validate them. The method will also be used on a larger set of tumour-only samples to determine the prevalence of these findings in a larger set of patient samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001000266 
   
  
    
    This study uses a focused, bespoke bait pull-down library method to target the findings of osteosarcoma whole-genome and whole-exome sequencing studies in order to validate them. The method will also be used on a larger set of tumour-only samples to determine the prevalence of these findings in a larger set of patient samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  110 
 
  
    EGAD00001000267 
   
  
    
    This study uses a focused, bespoke bait pull-down library method to target the findings of chordoma whole-genome and whole-exome sequencing studies in order to validate them. The method will also be used on a larger set of tumour-only samples to determine the prevalence of these findings in a larger set of patient samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000268 
   
  
    
    DATA FILES FOR SJCBF 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001000269 
   
  
    
    OLD DATA FILES FOR SJMB - Superseded by EGAD00001001864 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  68 
 
  
    EGAD00001000270 
   
  
    
    DATA_SET_EOP-PCA-LargeAndSmallTumors1 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000271 
   
  
    
    Pilot study Pilocytic Astrocytoma ICGC PedBrain, whole genome sequencing of 5 tumors and matched blood 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000272 
   
  
    
    Genomic Alterations in Gingivo-buccal Cancer: ICGC-India Project_YR01 
    
   
  
    
      
      454 GS FLX Titanium 
      
      Illumina HiSeq 2000 
      
    
   
  200 
 
  
    EGAD00001000273 
   
  
    
    This study uses a focused, bespoke bait pull-down library method to target the findings of meningioma whole-genome and whole-exome sequencing studies in order to validate them. The method will also be used on a larger set of tumour-only samples to determine the prevalence of these findings in a larger set of patient samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  147 
 
  
    EGAD00001000274 
   
  
    
    DATA_SET_TRANSCIPTOME_Comparing_sequencing_four_proto-typical_Burkitt_lymphomas_BL_IG-MYC_translocation 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000275 
   
  
    
    Data set for Whole-genome-Sequencing of adult medulloblastoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000276 
   
  
    
    OICR PANCREATIC CANCER DATASET 2 
    
   
  
    
   
  10 
 
  
    EGAD00001000277 
   
  
    
    High Quality Variant Call files, generated by bioscope, converted to vcf format. Complete dataset for all 300 samples. 
    
   
  
    
   
  202 
 
  
    EGAD00001000278 
   
  
    
    ICGC MMML-seq Data Freeze November 2012 whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000279 
   
  
    
    ICGC MMML-seq Data Freeze November 2012 whole exome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  4 
 
  
    EGAD00001000280 
   
  
    
    This experiment is to validate putative somatic substitutions and indels identified in an exome screen of ~50 osteosarcoma tumour/normal pairs. It is the first stage in our ICGC commitment to study osteosarcoma. The validation process is an important component of our analysis to clarify the data prior to looking for evidence of new cancer genes, or subverted pathways important in the development of cancer. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  112 
 
  
    EGAD00001000281 
   
  
    
    ICGC MMML-seq Data Freeze November 2012 transcriptome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000282 
   
  
    
    Neuroblastomas are tumors of peripheral sympathetic neurons and are the most common solid tumor in children. To determine the genetic basis for neuroblastoma we performed whole-genome sequencing (6 cases), exome sequencing (16 cases), genome-wide rearrangement analyses (32 cases), and targeted analyses of specific genomic loci (40 cases) using massively parallel sequencing. On average each tumor had 19 somatic alterations in coding genes (range, 3-70). Among genes not previously known to be involved in neuroblastoma, chromosomal deletions and sequence alterations of chromatin remodeling genes, ARID1A and ARID1B, were identified in 8 of 71 tumors (11%) and were associated with early treatment failure and decreased survival. Using tumor-specific structural alterations, we developed an approach to identify rearranged DNA fragments in sera, providing personalized biomarkers for minimal residual disease detection and monitoring. These results highlight dysregulation of chromatin remodeling in pediatric tumorigenesis and provide new approaches for the management of neuroblastoma patients. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001000283 
   
  
    
    Agilent whole exome hybridisation capture was performed on genomic DNA derived from MDS and matched normal DNA from the same patients. Next-generation sequencing was performed on the resulting exome libraries, and the reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. We now aim to determine the prevalence of our findings by using bespoke pulldown methods and sequencing the products from a larger set of patient DNA. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  764 
 
  
    EGAD00001000284 
   
  
    
    Cancer Genome Scanning in Plasma: Detection of Tumor-Associated Copy Number Aberrations, Single-Nucleotide Variants, and Tumoral Heterogeneity by Massively Parallel Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  1 
 
  
    EGAD00001000285 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  55 
 
  
    EGAD00001000286 
   
  
    
    Whole-exome study of congenital macrothrombocytopenia 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001000287 
   
  
    
    Agilent whole exome hybridisation capture will be performed on genomic DNA derived from 25 renal cancers and matched normal DNA from the same patients. Three lanes of Illumina GA sequencing will be performed on the resulting 50 exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  54 
 
  
    EGAD00001000288 
   
  
    
    Invasive lobular carcinoma (ILC) is the second most common histological subtype of breast cancer, accounting for 10-15% of cases. ILC differs from invasive ductal carcinoma (IDC) with respect to epidemiology, histology, and clinical presentation. Moreover, ILC is less sensitive to chemotherapy, more frequently bilateral, and more prone to form gastrointestinal, peritoneal, and ovarian metastases than IDC. In contrast to IDC, the prognostic value of histological grade (HG) in ILC is controversial. One of the three major components of histological grading (tubule formation) is missing in ILC, which hinders the process of grading in this histological subtype and results in the classification of approximately two thirds of ILC as HG 2. Over the last decade, a number of gene expression signatures have shed light onto breast cancer classification, allowing breast cancer care to become more personalized. With respect to the management of estrogen receptor (ER)-positive breast cancer, several gene expression signatures provide prognostic and/or predictive information beyond what is possible with current classical clinico-pathological parameters alone. Nevertheless, most studies using gene expression signatures have not considered different histologic subtypes separately. Recently, a comprehensive research program has elucidated some of the biological underpinnings of invasive lobular carcinoma. Genetic material extracted from 200 ILC tumor samples was studied using gene expression profiling, which identified ILC molecular subtypes. These proliferation-driven gene signatures of ILC appear to have prognostic significance. In particular, the Genomic Grade (GG) gene signature improved upon HG in ILC and added prognostic value to classic clinico-pathologic factors. In addition, this study demonstrated that most ILC are molecularly characterized as luminal-A (~75%), followed by luminal-B (~20%) and HER2-positive tumors (~5%). 
Moreover, we investigated the prognostic value of known gene signatures/gene modules in the same cohort of ILC. As a second step within the scope of this project, we aim to investigate the interactions between somatic ILC tumor mutations and observed transcriptome findings. To this end, we aim to perform somatic mutation analysis for the ILC tumors for which Affymetrix gene expression profiling is available, using a gene screen assay that specifically interrogates the mutational status of a few hundred cancer genes. We believe that this pioneering effort will be fundamental for a tailored treatment of ILC with improvement in patient outcomes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1130 
 
  
    EGAD00001000289 
   
  
    
    Agilent whole exome hybridisation capture was performed on genomic DNA derived from cancer and matched normal DNA from the same patients. Next-generation sequencing was performed on the resulting exome libraries, and the reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. We now aim to re-find and validate the findings from those exome libraries by using bespoke pulldown methods and sequencing the products. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000290 
   
  
    
    Cancer Genome Scanning in Plasma: Detection of Tumor-Associated Copy Number Aberrations, Single-Nucleotide Variants, and Tumoral Heterogeneity by Massively Parallel Sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  1 
 
  
    EGAD00001000291 
   
  
    
    Exome sequencing identifies mutation of the ribosome in T-cell acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  128 
 
  
    EGAD00001000292 
   
  
    
    Whole genome sequencing analysis was performed on 6 patients with matched germline, follicular lymphoma and transformed follicular lymphoma samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000293 
   
  
    
    Sequencing data for Australian Ovarian Cancer study submitted 20121116 
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  72 
 
  
    EGAD00001000294 
   
  
    
    UK10K_RARE_CHD REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  124 
 
  
    EGAD00001000295 
   
  
    
    UK10K_RARE_HYPERCHOL REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  120 
 
  
    EGAD00001000296 
   
  
    
    UK10K_RARE_CILIOPATHIES REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  108 
 
  
    EGAD00001000297 
   
  
    
    UK10K_RARE_FIND REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  124 
 
  
    EGAD00001000298 
   
  
    
    UK10K_RARE_NEUROMUSCULAR REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  130 
 
  
    EGAD00001000299 
   
  
    
    Whole exome sequencing of samples selected from the Finrisk sample collection. The samples sequenced in this study have all been collected in Kuusamo, Finland. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001000300 
   
  
    
    UK10K_OBESITY_GS_REL_2012_07_05 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  430 
 
  
    EGAD00001000301 
   
  
    
    A couple of previously characterized and sequenced libraries will be repeated using a couple of differing size selection criteria and skim sequenced using an Illumina HiSeq. The resulting sequence will be analyzed to determine the optimal DNA library size for our specific downstream analysis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000302 
   
  
    
    This experiment is looking at the mutational signatures generated by engineered HRAS mutations by using whole genome sequence generated on massively parallel next generation sequencers. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000303 
   
  
    
    ICGC prostate cancer whole genome mate-pair sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  22 
 
  
    EGAD00001000304 
   
  
    
    ICGC prostate cancer miRNA sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000305 
   
  
    
    ICGC prostate cancer RNA sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000306 
   
  
    
    ICGC prostate cancer whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001000307 
   
  
    
    UK10K_RARE_COLOBOMA REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  117 
 
  
    EGAD00001000308 
   
  
    
    Cancer Genome Scanning in Plasma: Detection of Tumor-Associated Copy Number Aberrations, Single-Nucleotide Variants, and Tumoral Heterogeneity by Massively Parallel Sequencing 
    
   
  
    
   
  1 
 
  
    EGAD00001000309 
   
  
    
    UK10K_OBESITY_GS REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  424 
 
  
    EGAD00001000310 
   
  
    
    UK10K_NEURO_ASD_BIONED REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  76 
 
  
    EGAD00001000311 
   
  
    
    UK10K_NEURO_ASD_FI REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  84 
 
  
    EGAD00001000312 
   
  
    
    UK10K_NEURO_ASD_MGAS REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  96 
 
  
    EGAD00001000313 
   
  
    
    UK10K_NEURO_ASD_SKUSE REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  305 
 
  
    EGAD00001000314 
   
  
    
    UK10K_NEURO_ASD_TAMPERE REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000315 
   
  
    
    UK10K_NEURO_ABERDEEN REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  313 
 
  
    EGAD00001000316 
   
  
    
    UK10K_NEURO_ASD_GALLAGHER REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  75 
 
  
    EGAD00001000317 
   
  
    
    UK10K_NEURO_EDINBURGH REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  214 
 
  
    EGAD00001000318 
   
  
    
    UK10K_NEURO_FSZ REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  119 
 
  
    EGAD00001000319 
   
  
    
    UK10K_NEURO_GURLING REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000320 
   
  
    
    UK10K_NEURO_IMGSAC REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  111 
 
  
    EGAD00001000321 
   
  
    
    UK10K_NEURO_IOP_COLLIER REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  158 
 
  
    EGAD00001000322 
   
  
    
    UK10K_NEURO_MUIR REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  166 
 
  
    EGAD00001000323 
   
  
    
    Sequencing data for Australian Pancreatic Cancer study submitted 20130102 
    
   
  
    
      
      AB SOLiD 4 System 
      
      Illumina HiSeq 2000 
      
    
   
  200 
 
  
    EGAD00001000324 
   
  
    
    We will sequence the RNA of lymphoblast samples, transformed with EBV, which have poikiloderma syndrome with mutations in c16orf57. The aim of the experiment is to characterise RNA structural effects in this disease. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000325 
   
  
    
    In this study, mutations present in a series of human melanomas (stage IV disease) will be determined, using autologous blood cells to obtain a reference genome. From each of the samples that are analyzed, tumour-infiltrating T lymphocytes have also been isolated. This offers a unique opportunity to determine which (fraction of) mutations in human cancer leads to epitopes that are recognized by T cells. The resulting information is likely to be of value to understand how T cell activating drugs exert their action. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001000327 
   
  
    
    release_2: ICGC PedBrain: whole genome mate-pair sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  70 
 
  
    EGAD00001000328 
   
  
    
    ICGC PedBrain: RNA sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001000329 
   
  
    
    UK10K_RARE_THYROID REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  113 
 
  
    EGAD00001000332 
   
  
    
    UK10K_NEURO_FSZNK REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  258 
 
  
    EGAD00001000333 
   
  
    
    Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001000334 
   
  
    
    UK10K_RARE_SIR REL-2012-11-27 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  111 
 
  
    EGAD00001000335 
   
  
    
    UK10K_NEURO_UKSCZ REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  527 
 
  
    EGAD00001000336 
   
  
    
    UK10K_OBESITY_SCOOP REL-2012-11-27 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  784 
 
  
    EGAD00001000337 
   
  
    
    Illumina RNA-Seq will be performed on four Ewing's sarcoma cell lines and two control cell lines. RNA was extracted from all the lines using a basic Trizol extraction protocol. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000338 
   
  
    
    We propose to definitively characterise the somatic genetics of ER+ve, HER2-ve breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000339 
   
  
    
    Multiple myeloma is an incurable plasma cell malignancy whose molecular pathogenesis is incompletely understood. We used whole exome sequencing, copy number profiling and cytogenetics to analyse 84 samples from 67 patients with myeloma. In addition to known myeloma genes, we identify new candidate genes, including truncations of SP140, ROBO1 and FAT3 and clustered missense mutations in EGR1. We find oncogenic mutations in cancer genes not previously implicated in myeloma, including SF3B1, PIK3CA and PTEN. We define diverse processes contributing to the mutational repertoire, including kataegis and somatic hypermutation. Most cases have at least one cluster of subclonal variants, including subclonal driver mutations, implying on-going tumor evolution. Serial samples revealed diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Our findings reveal the myeloma genome to be heterogeneous across patients and, within individual patients, to exhibit diversity in clonal admixture and dynamics in response to therapy. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  154 
 
  
    EGAD00001000340 
   
  
    
    The objective of this study is to resequence targeted intervals containing autosomal recessive variants causing neurological disorders in consanguineous pedigrees. Using homozygosity mapping, three intervals of very different sizes have previously been unambiguously mapped for three different neurological diseases: 2.4Mb, 8Mb and 14.3Mb in size, for Microlissencephaly, Severe Mental Retardation and Complicated hereditary spastic paraplegia respectively. This study is a pilot to assess how well custom targeted resequencing performs across a broad size range of intervals. The study design is to use a different custom capture probe set for each interval, pulldown from a single patient from each family, and sequence 1 lane using Illumina paired-reads for each sample. Candidate variants will be followed up in the families themselves, and in patients with similar phenotypes from outbred populations. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  3 
 
  
    EGAD00001000341 
   
  
    
    This pilot study aims to generate pilot data to inform future study designs in consanguineous families or inbred populations by resequencing the exome of six individuals from five families with neurodevelopmental diseases. For all of these families a single mapping interval containing the causal variant has previously been identified. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000342 
   
  
    
    This project aims to find causal variants in 50 patients diagnosed with Microcephalic Osteodysplastic Primordial Dwarfism (MOPD), of presumed recessive inheritance, by performing whole exome sequencing to ~50x mean depth. This is a collaboration with Prof A. Jackson, MRC Human Genetics Unit, Edinburgh. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD00001000343 
   
  
    
    This project aims to identify highly penetrant coding variants increasing the risk of Congenital Heart Disease (CHD) by performing whole exome sequencing on DNA samples from 23 affected individuals, selected from 10 families with presumed Autosomal Recessive Inheritance. This is a collaboration with Prof. Eamonn Maher and Dr. Chirag Patel from the Department of Medical and Molecular Genetics, University of Birmingham. The plan is to sequence 23 indexed Agilent whole exome pulldown libraries on 75 bp PE HiSeq (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001000344 
   
  
    
    Exome sequencing of 30 parent-offspring trios to >50X mean depth, where the offspring has sporadic TOF, to identify potential causal de novo mutations. We will use the exome plus design for pulldown that incorporates ~6.8Mb of additional regulatory sequences in addition to the ~50Mb GENCODE exome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  90 
 
  
    EGAD00001000345 
   
  
    
    Exome sequencing of 12 DNA samples obtained from patients with structural brain malformations. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001000346 
   
  
    
    Exome sequencing of patients and their families with diverse rare neurological disorders. Some families have prior linkage data identifying a specific chromosomal interval of interest; other families do not have linkage data available. Many of these families come from special populations whose demography or preference for consanguineous marriages makes them particularly tractable for genetic studies. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001000347 
   
  
    
    These samples include exome sequences of family members of Finnish origin with dyslipidemias. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  95 
 
  
    EGAD00001000348 
   
  
    
    This pilot study aims to generate pilot data to inform future study designs by resequencing the whole exomes of 10 unrelated individuals diagnosed with Bilateral Anophthalmia. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  16 
 
  
    EGAD00001000349 
   
  
    
    These samples are from locally advanced breast cancers that have been treated with epirubicin monotherapy before surgery. We will sequence some samples from patients with good response to the therapy and some with poor response to the therapy. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001000350 
   
  
    
    We propose to definitively characterise the somatic genetics of a number of pediatric malignant tumours including ependymoma, high grade glioma and central nervous system primitive neurectodermal tumours through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  17 
 
  
    EGAD00001000351 
   
  
    
    This pilot study aims to generate pilot data to inform future study designs by resequencing the whole exomes of 10 unrelated individuals diagnosed with Congenital Heart Disease (CHD). 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  16 
 
  
    EGAD00001000352 
   
  
    
    DATA FILES FOR SJLGG 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000353 
   
  
    
    DATA FILES FOR SJLGG 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  45 
 
  
    EGAD00001000354 
   
  
    
    Testing the feasibility of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  81 
 
  
    EGAD00001000355 
   
  
    
    ICGC MMML-seq Data Freeze March 2013 whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000356 
   
  
    
    ICGC MMML-seq Data Freeze March 2013 transcriptome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001000357 
   
  
    
    PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, PCR products were pooled and subjected to Illumina library preparation. The library will be sequenced by MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001000358 
   
  
    
    Chondrosarcoma (CHS) is a heterogeneous collection of malignant bone tumours and is the second most common primary malignancy of bone after osteosarcoma. Recent work has identified frequent, recurrent mutations in IDH1/2 in nearly half of central CHS. However, there has been little systematic genomic analysis of this tumour type and thus the contribution of other genes is unclear. Here we report comprehensive genomic analyses of 49 cases of CHS. We identified hypermutability of the major cartilage collagen COL2A1 with insertions, deletions and rearrangements identified in 37% of cases. The patterns of mutation were consistent with selection for variants likely to impair normal collagen biosynthesis. In addition we identified mutations in IDH1/2 (59%), TP53 (20%), the RB1 pathway (27%) and hedgehog signaling (22%). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  17 
 
  
    EGAD00001000359 
   
  
    
    In this study we will sequence the transcriptome of Verified Cancer Cell lines. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000360 
   
  
    
    The genome-wide landscape of somatically acquired mutations in mesothelioma has not been deeply characterised to date, but advances in DNA sequencing technology now allow this to be addressed comprehensively. Harnessing massively parallel DNA sequencing platforms, we will identify somatically acquired point mutations in all coding regions of the genome from patients with mesothelioma. In addition, using paired-end sequencing, we will map copy number changes and genomic rearrangements from the same patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  232 
 
  
    EGAD00001000361 
   
  
    
    This is a small pilot data set to test the feasibility of cDNA exomes across a panel of 1200 cancer cell lines. cDNA exomes, or Fus-seq, are further explained in this study's abstract. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000362 
   
  
    
    Human induced pluripotent stem (hiPS) cells hold great promise for regenerative medicine. Safety issues surrounding the use of hiPS cells, however, remain to be addressed. One such issue is mutations derived from somatic donor cells and introduced during genome manipulation. We sequenced whole genomes of hiPS cells and analyzed mutations. Our study brings hiPS cell technology one step closer to application in regenerative medicine. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000363 
   
  
    
    Common variable immunodeficiency (CVID) is the most common form of primary immunodeficiency with an estimated incidence of 1:10,000. It has been apparent for many years that CVID has a genetic component, occurs frequently in families and can have either a recessive or dominant mode of inheritance. In recent years, 4 genes underlying CVID have been identified; however, mutations within them are estimated to account for no more than 10% of all cases of CVID.
We have identified a multi-generational family with autosomal dominant CVID. Genome-wide linkage analysis has mapped the locus underlying CVID in this family to an approximately 9.2 Mb interval on chromosome 3q27.3-q29, between the markers D3S3570 and D3S1265. This locus is distinct from any of the previously mapped susceptibility loci suggesting a novel genetic variant is responsible for disease in this family. The aim of this study is to use exome sequencing of affected (n = 4) and unaffected (n = 4) individuals, in tandem with the available genetic mapping data, to identify the causal variant underlying CVID in this family. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000364 
   
  
    
    We performed low coverage whole genome sequencing of plasma DNA from prostate cancer patients to establish copy number profiles on both a genome-wide and a gene-specific level. The data include plasma samples from prostate cancer patients (n=13), non-malignant controls (males, n=10 and females, n=9), and plasma samples from pregnancies with aneuploid and euploid fetuses (n=4). Furthermore, we sequenced different tumor samples (n=6) of one patient and a serial dilution of HT29 in a background of normal DNA (n=9). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  50 
 
  
    EGAD00001000365 
   
  
    
    In this study we analysed patients with metastatic prostate cancer to scan their tumor genomes noninvasively in plasma DNA. We enriched 1.3 Mbp of seven plasma DNAs (4 CRPC cases: CRPC1-3 and CRPC5; 3 CSPC cases: CSPC1-2 and CSPC4) including exonic sequences of 55 cancer genes and 38 introns of 18 genes, where fusion breakpoints have been described using Sure Select Custom DNA Kit. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  7 
 
  
    EGAD00001000366 
   
  
    
    WGBS data of whole blood samples from smoking and non-smoking mothers and their children at gestation/birth and follow-up years. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001000367 
   
  
    
    Genomic libraries (500 bps) will be generated from total genomic DNA derived from lung cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000368 
   
  
    
    Genomic libraries (500 bps) will be generated from total genomic DNA derived from Osteosarcoma cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000369 
   
  
    
    We propose to definitively characterise the somatic genetics of a number of pediatric malignant tumours including ependymoma, high grade glioma and central nervous system primitive neurectodermal tumours through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000370 
   
  
    
    This dataset comprises 5 sequencing experiments from a single patient with sporadic and recurring parathyroid carcinoma. The samples include whole genome sequence of the primary tumor, the first recurrent tumor and peripheral blood. Whole transcriptome sequence of the first and second recurrent tumors is also included. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000371 
   
  
    
    Sequencing data for PDAC cell lines generated by QCMG 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001000372 
   
  
    
    We conducted whole genome sequencing and DNA SNP array of 12 uveal melanoma genomes and their matched DNA from blood. We also conducted RNA-seq of the 12 tumour samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001000380 
   
  
    
    Illumina paired-end sequencing of whole- exome pulldown DNA from Severe Insulin Resistant patients. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  64 
 
  
    EGAD00001000381 
   
  
    
    Illumina paired-end sequencing of whole- exome pulldown DNA from Severe Insulin Resistant patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000382 
   
  
    
    Whole Exome Sequencing of Permanent Neonatal Diabetes Patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001000383 
   
  
    
    In collaboration with Dr Robert Semple we have identified a family harbouring an autosomal dominant variant, which leads to severe insulin resistance (SIR), short stature and facial dysmorphism. This family is unique within the SIR cohort in having normal lipid profiles, preserved adiponectin and normal INSR expression and phosphorylation. DNA is available for 7 affected and 7 unaffected family members across 3 generations. All 14 samples have been genotyped using microsatellites and the Affymetrix 6.0 SNP chip. Linkage analysis identified an 18.8Mb haplotype on chromosome 19 as a possible location of the causative variant. However, Exome sequencing of 3 affected and 1 unaffected family members has not identified the causative variant suggesting the possibility of an intronic or intergenic variant in this region or elsewhere in the genome. We propose to conduct whole genome sequencing of 5 members of the pedigree at a depth of 20X. The chosen samples are two sets of parents plus one member of an unaffected branch of the pedigree who shares the risk haplotype on chromosome 19. Sequencing of the two sets of parents will be used along with the genome-wide SNP data to impute 4 affected children giving an effect sample size of 6 affected individuals. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000384 
   
  
    
    In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs, but it is unclear whether some cell types carry fewer mutations through reprogramming (either due to mutations present in the primary cells, or mutations accumulated during reprogramming). Through in depth analysis of hiPSCs generated from different somatic cells, it will be possible to assess the variation in genetic stability of different cell types. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  35 
 
  
    EGAD00001000385 
   
  
    
    Whole-genome libraries will be prepared from at least two serial samples reflecting different stages of disease progression and matched constitutional DNA for 30 Myeloproliferative Disease samples. Five lanes of Illumina HiSeq sequencing will be performed on each of the tumour samples and four lanes for each of the constitutional DNA samples. Sequencing data will be mapped to build 37 of the human reference genome and analysis will be performed to characterize the spectrum of somatic variation present in these samples, including single base pair mutations, insertions and deletions, as well as larger structural variants and genomic rearrangements. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  108 
 
  
    EGAD00001000386 
   
  
    
    Whole-genome libraries will be prepared from at least two serial samples reflecting different stages of disease progression and matched constitutional DNA for 30 Myelodysplastic syndrome patient samples. Five lanes of Illumina HiSeq sequencing will be performed on each of the tumour samples and four lanes for each of the constitutional DNA samples. Sequencing data will be mapped to build 37 of the human reference genome and analysis will be performed to characterize the spectrum of somatic variation present in these samples, including single base pair mutations, insertions and deletions, as well as larger structural variants and genomic rearrangements. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  83 
 
  
    EGAD00001000387 
   
  
    
    This study aims to whole genome sequence DNA derived from breast cancer patients who received neo-adjuvant chemotherapy. All patients had multiple biopsies performed before chemotherapy. Patients who had residual disease after the course of treatment underwent a further biopsy. We aim to characterise the mutations involved. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  35 
 
  
    EGAD00001000388 
   
  
    
    Genomic libraries (500 bp) will be generated from total genomic DNA derived from lung cancer patients and subjected to short paired-end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome-wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001000389 
   
  
    
    Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000390 
   
  
    
    We propose to definitively characterise the somatic genetics of triple negative breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  101 
 
  
    EGAD00001000392 
   
  
    
    Agilent whole exome hybridisation capture was performed on genomic DNA derived from chondrosarcoma and matched normal DNA from the same patients. Next-generation sequencing was performed on the resulting exome libraries and the reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. We now aim to re-find and validate the findings from those exome libraries using bespoke pulldown methods and sequencing of the products. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  60 
 
  
    EGAD00001000393 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001000394 
   
  
    
    DNA methylation has been shown to play a major role in determining cellular phenotype by regulating gene expression. Moreover, dysregulation of differentially methylated genes has been implicated in the pathogenesis of various conditions, including cancer development as well as autoimmune diseases such as systemic lupus erythematosus and rheumatoid arthritis. Evidence is rapidly accumulating for a role of DNA methylation in regulating immune responses in health and disease. However, the exact mechanisms remain unknown. The overall aim of the project is to investigate the role of epigenetic mechanisms in regulating immunity and their impact on autoimmune disease pathogenesis. The aim of this pilot study is to perform whole genome methylation analysis in peripheral blood mononuclear cells (PBMCs) and cell subsets (CD4, CD8, CD14, CD19, CD16 and whole PBMCs) obtained from 6 healthy volunteers. Whole genome methylation analysis will be performed using two methodological approaches, the Infinium Methylation Bead Array K450 (Illumina) and MeDIP-seq. mRNA expression arrays will also be performed in order to correlate DNA methylation with gene expression, as well as genotyping on the Illumina OmniExpress chip. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001000395 
   
  
    
    Noninvasive Prenatal Molecular Karyotyping from Maternal Plasma 
    
   
  
    
   
  1 
 
  
    EGAD00001000396 
   
  
    
    We performed serial plasma-Seq analyses on a male who progressed from castration-sensitive to castration-resistant prostate cancer within 10 months following treatment with androgen-deprivation therapy. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  2 
 
  
    EGAD00001000397 
   
  
    
    The Cardiogenics re-sequencing study will consist of three parts: eight pools of 25 individuals will be sequenced using a Nimblegen hybrid-capture solution specific to miRNA sequences; 80 pools of 25 individuals will be sequenced using a custom Agilent SureSelect array covering genes associated with coronary artery disease (CAD) and myocardial infarction (MI); and 10 individuals from families with a history of CAD/MI will be exome sequenced using the Sanger exome array. The experiment will use the early-onset patients from the German MI cohort and the UK BHF CAD/MI cohort, both of which have strong family history. For controls we will consider individuals from the UKBS and KORA cohorts. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  47 
 
  
    EGAD00001000398 
   
  
    
    The Cardiogenics re-sequencing study will consist of three parts: eight pools of 25 individuals will be sequenced using a Nimblegen hybrid-capture solution specific to miRNA sequences; 80 pools of 25 individuals will be sequenced using a custom Agilent SureSelect array covering genes associated with coronary artery disease (CAD) and myocardial infarction (MI); and 10 individuals from families with a history of CAD/MI will be exome sequenced using the Sanger exome array. The experiment will use the early-onset patients from the German MI cohort and the UK BHF CAD/MI cohort, both of which have strong family history. For controls we will consider individuals from the UKBS and KORA cohorts. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  8 
 
  
    EGAD00001000399 
   
  
    
    In 2009 we identified a four-generation family with over 700 members and 41 affected with Crohn's disease (CD). At the time we sequenced the exome of 6 affected individuals but did not identify any coding variants which appear to explain the high prevalence of disease. Since then we have collected DNA from a large number of additional family members, genotyped linkage arrays on the entire family to refine genomic regions shared identical by descent, and genotyped affected and unaffected members at known CD risk loci identified by Genome Wide Association Studies (GWAS). These analyses have confirmed that a significant unexplained excess of disease remains after accounting for all known genetic factors, and that several regions of the genome are shared by a large fraction of affected individuals. We therefore perform whole-genome sequencing of 8 individuals, which will allow us to impute the complete sequence of nearly all the members of the two largest and most severely affected branches of the family. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000400 
   
  
    
    The Cardiogenics re-sequencing study will consist of three parts: eight pools of 25 individuals will be sequenced using a Nimblegen hybrid-capture solution specific to miRNA sequences; 80 pools of 25 individuals will be sequenced using a custom Agilent SureSelect array covering genes associated with coronary artery disease (CAD) and myocardial infarction (MI); and 10 individuals from families with a history of CAD/MI will be exome sequenced using the Sanger exome array. The experiment will use the early-onset patients from the German MI cohort and the UK BHF CAD/MI cohort, both of which have strong family history. For controls we will consider individuals from the UKBS and KORA cohorts. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000401 
   
  
    
    Population based sequencing of whole genomes of Crohn's disease patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2926 
 
  
    EGAD00001000402 
   
  
    
    The study will analyse by exome sequencing 42 Greek patients with premature MI and no vessel disease to identify genetic factors underlying this condition. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000403 
   
  
    
    The ENGAGE project is an FP7-funded EU project aiming to combine genetic and phenotype information from European population-based cohorts. In this sub-project we aim to do whole exome sequencing of individuals selected from the Health 2000 and FINRISK cohorts. Individuals have been selected based on their metabolic trait phenotypes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  394 
 
  
    EGAD00001000404 
   
  
    
    Acute myeloid leukaemia (AML) is an aggressive and molecularly diverse disease with a poor overall survival of 20-25%. With an annual incidence of 2.9 per 100,000, AML is currently the commonest myeloid malignancy in Europe, yet the two main therapeutic options for this disease, anthracyclines and purine analogues, have remained unchanged for over 20 years. 
Currently patients are stratified at diagnosis according to a series of clinicopathological parameters (e.g. age, white cell count and presence/absence of previous clonal haematological disease) and molecular markers (e.g. chromosomal translocations/deletions, aneuploidy and mutations in genes such as FLT3 and NPM1). Patients with adverse prognostic features, whose prognosis is particularly poor (e.g. <15% long-term survival), are offered treatment with allogeneic bone marrow transplantation (allo-BMT) if a sibling or unrelated donor is available. This can significantly improve survival (e.g. up to 40% long-term survival in some contexts), albeit at the expense of significant toxicity and transplant-related mortality (TRM). 
Allo-BMT is thought to work in part by allowing the delivery of large doses of chemotherapy followed by haemopoietic "rescue" with donor haemopoietic stem cells (haemopoietic failure would otherwise ensue).  However, potentially the most potent effect of allo-BMT is the cytotoxic effect of donor lymphocytes against AML blasts, a phenomenon known as graft-vs-leukaemia (GVL) effect. Increasingly, transplants using reduced chemotherapy intensity (mini-allografts) are being used that partially circumvent the toxicity from chemotherapy and rely on GVL to effect cure. 
Nevertheless, AML relapse after allo-BMT still occurs at a significant rate of up to 80% depending on the type of transplant. There is accumulating evidence that genetic events in residual leukaemic cells enable them to evade immunodetection and therefore survive the GVL effect and expand to cause relapse. The most striking example of this is the loss of HLA antigens after transplants in which donor and recipient are not fully HLA-matched. In these cases, the leukaemia "deletes" the genomic region containing the disparate HLA antigen which was preferentially targeted as "foreign" by the GVL effect. However, the genetic basis of immune evasion in the majority of transplants, which are fully HLA-matched, is not known. One possibility is the loss of genes coding for antigens that lie outside the HLA locus but are also targets of GVL; alternatively, genetic events that affect processes downstream of immunological cytotoxicity may be responsible.
The identification of genetic events that mediate immune evasion would not only facilitate the understanding of this process but could also help plan therapeutic interventions that improve the outcomes of allogeneic transplantation for AML and other disorders. We intend to study this by conducting exome sequencing on 6 cases of AML from patients who attend my clinic at Addenbrooke's hospital and have relapsed after allogeneic transplantation. Samples from AML diagnosis, remission/normal and AML relapse (total n=18) will be studied to identify somatic mutations in the primary AML and those acquired by the relapsed clone. The 18 samples will also be studied by array CGH to detect regions of genomic amplification or deletion. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001000405 
   
  
    
    In this project we will sequence the exomes of 250 patients with Parkinson's disease 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  247 
 
  
    EGAD00001000406 
   
  
    
    Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare and aggressive haematological malignancy derived from precursors of plasmacytoid dendritic cells. Due to the rarity of BPDCNs, our knowledge of their molecular pathogenesis was until recently confined to observations describing recurrent chromosomal deletions involving chromosomes 5q, 12p, 13q, 6q, 15q and 9. A recent publication went on to delineate the common deleted regions using aCGH and demonstrated that these centred around known tumour suppressor genes including CDKN2A/B (9p21.3), RB1 (12p13.2-14.3), CDKN1B (13q11-q12) and IKZF1 (7p12.2).
These mutations are found recurrently in several different cancers and in most cases are thought to be involved in tumour progression rather than initiation. However, the well-defined nature and cellular ontogeny of these neoplasms suggests strongly that they share one or a few characteristic mutations as has been demonstrated for other uncommon but well-defined neoplasms such as Hairy Cell Leukemia (BRAF) and ovarian Granulosa Cell tumours (FOXL2). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000407 
   
  
    
    We are sequencing the exomes of patients with paroxysmal neurological disorders, mainly focusing on migraine and epilepsy. Cases are collected from performance sites of members of the International Headache Genetics consortium and EuroEPINOMICS. Most cases have a strong family history. The study sample will include both cases and controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  327 
 
  
    EGAD00001000408 
   
  
    
    We aim to whole-exome sequence DNA samples from 75 individuals with severe forms of Inflammatory Bowel Disease and related autoimmune diseases to identify the rare, highly penetrant variants that we believe underlie these phenotypes. Case samples will be obtained from both new and existing (UK IBD Genetics Consortium) collaborators to ensure only the most extreme cases are sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000409 
   
  
    
    2000 ulcerative colitis cases were drawn from the UKIBD Genetics Consortium cohort and whole-genome sequenced at 2X depth. A case-control association study using control samples whole-genome sequenced by UK10K will be undertaken to identify common, low-frequency and rare variants associated with ulcerative colitis. Data will be combined with similar data across 3000 Crohn's disease cases from the same cohort to identify inflammatory bowel disease (IBD) loci and better understand the genetic differences and similarities of the two common forms of IBD. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1992 
 
  
    EGAD00001000410 
   
  
    
    We will perform exome sequencing on selected cases of splenic marginal zone lymphoma (SMZL) and diffuse large B-cell lymphoma (DLBCL) in order to characterise their genetic makeup and identify biomarkers for prognosis and prediction of treatment response. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001000411 
   
  
    
    These samples include exome sequences of family members with dyslipidemias of northern Finnish origin. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  68 
 
  
    EGAD00001000412 
   
  
    
    We are sequencing the exomes of patients with paroxysmal neurological disorders, mainly focusing on migraine and epilepsy. Cases are collected from performance sites of members of the International Headache Genetics consortium and EuroEPINOMICS. Most cases have a strong family history. The study sample will include both cases and controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  477 
 
  
    EGAD00001000413 
   
  
    
    UK10K_RARE_CHD REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  125 
 
  
    EGAD00001000414 
   
  
    
    UK10K_RARE_CILIOPATHIES REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  122 
 
  
    EGAD00001000415 
   
  
    
    UK10K_RARE_COLOBOMA REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  123 
 
  
    EGAD00001000416 
   
  
    
    UK10K_RARE_FIND REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  124 
 
  
    EGAD00001000417 
   
  
    
    UK10K_RARE_HYPERCHOL REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  125 
 
  
    EGAD00001000418 
   
  
    
    UK10K_RARE_NEUROMUSCULAR REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  140 
 
  
    EGAD00001000419 
   
  
    
    UK10K_RARE_SIR REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  121 
 
  
    EGAD00001000420 
   
  
    
    UK10K_RARE_THYROID REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  124 
 
  
    EGAD00001000421 
   
  
    
    The aim of this project is to identify rare variants in the 1q region associated with type 2 diabetes. To this end 651 case samples and 651 control samples from six populations have been pooled (pool sizes range from 27-33 individuals), and are being sequenced. The hybridization solution being used captures the exons and UTRs of genes in the 1q region. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000422 
   
  
    
    We perform whole exome sequencing on samples from a large IBD pedigree. The selected samples are from more distantly related family members (healthy and with IBD) and a set of matched population (Ashkenazi Jewish ancestry) samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001000423 
   
  
    
    The aim is to find rare variants of intermediate penetrance in those at risk of Crohn's disease 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  10 
 
  
    EGAD00001000424 
   
  
    
    The aim of this project is to identify rare variants in the 1q region associated with type 2 diabetes. To this end 651 case samples and 651 control samples from six populations have been pooled (pool sizes range from 27-33 individuals), and are being sequenced. The hybridization solution being used captures the exons and UTRs of genes in the 1q region. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001000425 
   
  
    
    GENCORD2 RNA-seq BAM files using BWA 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  568 
 
  
    EGAD00001000427 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001000428 
   
  
    
    204 individuals were genotyped with the Illumina 2.5M Omni chip. Filtered genotypes were imputed into the 1000 genomes project European panel SNPs. Beagle R2 is indicated in VCF files for further filtering. See Materials and Methods in publication for details. 
    
   
  
    
   
  204 
 
  
    EGAD00001000429 
   
  
    
    UK10K_OBESITY_TWINSUK REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  68 
 
  
    EGAD00001000430 
   
  
    
    UK10K_NEURO_UKSCZ REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  554 
 
  
    EGAD00001000431 
   
  
    
    UK10K_OBESITY_GS REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  428 
 
  
    EGAD00001000432 
   
  
    
    UK10K_OBESITY_SCOOP REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  985 
 
  
    EGAD00001000433 
   
  
    
    UK10K_NEURO_ABERDEEN REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  392 
 
  
    EGAD00001000434 
   
  
    
    UK10K_NEURO_ASD_BIONED REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  77 
 
  
    EGAD00001000435 
   
  
    
    UK10K_NEURO_ASD_FI REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  84 
 
  
    EGAD00001000436 
   
  
    
    UK10K_NEURO_ASD_GALLAGHER REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  77 
 
  
    EGAD00001000437 
   
  
    
    UK10K_NEURO_ASD_TAMPERE REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  55 
 
  
    EGAD00001000438 
   
  
    
    UK10K_NEURO_EDINBURGH REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  234 
 
  
    EGAD00001000439 
   
  
    
    UK10K_NEURO_FSZNK REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  285 
 
  
    EGAD00001000440 
   
  
    
    UK10K_NEURO_GURLING REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000441 
   
  
    
    UK10K_NEURO_IMGSAC REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  113 
 
  
    EGAD00001000442 
   
  
    
    UK10K_NEURO_IOP_COLLIER REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  172 
 
  
    EGAD00001000443 
   
  
    
    UK10K_NEURO_MUIR REL-2013-04-20 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  175 
 
  
    EGAD00001000444 
   
  
    
    Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000445 
   
  
    
    We recently worked up a pulldown protocol for studying 21 genes recurrently mutated in AML (Study1770). Our manuscript is currently under revision and, to address the reviewers' comments, we need to validate some mutations by re-sequencing. In this add-on study we will be using PCR followed by MiSeq for this purpose. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  9 
 
  
    EGAD00001000446 
   
  
    
    Fastq files of 213 samples of hepatocellular carcinoma (NCCRI) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  213 
 
  
    EGAD00001000596 
   
  
    
    This project is to develop and validate a method to detect de novo mutations in a foetal genome through deep sequencing of cell-free DNA from the plasma of pregnant women.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000597 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  212 
 
  
    EGAD00001000598 
   
  
    
    The Ethiopian area stands among the most ancient ones ever occupied by human populations and their ancestors. In particular, according to archaeological evidence, it is possible to trace the presence of hominids back to at least 3 million years ago. Furthermore, the present-day human populations show great cultural, linguistic and historic diversity, which makes them essential candidates for investigating a considerable part of African variability. Following the typing of 300 Ethiopian samples on Illumina Omni 1M (see Human Variability in Ethiopia project, previously approved by the Genotyping committee) we now have a clearer idea of which populations living in the area capture most of the diversity. This project therefore aims to sequence the whole genome of 300 individuals at low (4-8x) depth belonging to the six most representative populations of the Ethiopian area to produce a unique catalogue of variants peculiar to North East Africa. Furthermore, 6 samples (one from each population) will also be sequenced at high (30x) depth to ensure full coverage of the diversity spectrum. The retrieved variants will be of great help in evaluating the demographic dynamics of those populations as well as shedding light on the migrations out of Africa. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  120 
 
  
    EGAD00001000599 
   
  
    
    We have collected material from a patient who had BRAF V600E-mutant melanoma that was treated with PLX4032. We have germline DNA from the patient and DNA and RNA from distinct lesions before and after treatment with PLX4032. We have transcriptome sequenced these samples to obtain a snapshot of the mechanisms of resistance that are operative. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000601 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000602 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000603 
   
  
    
    We recently used the Agilent SureSelect platform to re-sequence a set of genes known to be mutated in human AML. The results from 10 AML DNA samples were very satisfactory, but the effort required was significant.
Thus, we decided to re-sequence the same genes using the HaloPlex system for target enrichment in 48 AML samples. We planned to do this using MiSeq and have data from a pilot of 3 samples. The data is promising but coverage appears patchy so far.
However, in order to get a better understanding of the data we will need deeper sequencing. We will need two lanes of HiSeq to get the same degree of coverage as SureSelect.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  54 
 
  
    EGAD00001000604 
   
  
    
    In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs and, from studies in the mouse, it appears that an epigenetic memory of the starting cell type is carried over to hiPSCs. However, a comprehensive comparative study of the characteristics of these hiPSCs has been missing from the literature. Importantly, studies which aimed to address these aspects of hiPSCs have used cells from different patients. In order to avoid this important confounding variable and to keep the genetic background constant, tissue samples were procured from the patients and reprogrammed to iPS cells. The transcriptomes of these iPS cells will be compared.
Protocol: primary cultures of cells were reprogrammed to iPS cells. RNA was extracted using a standard column extraction kit. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  47 
 
  
    EGAD00001000605 
   
  
    
    PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  10 
 
  
    EGAD00001000606 
   
  
    
    Background: Massively parallel sequencing technology has transformed cancer genomics. It is now feasible, in a clinically relevant time-frame, for a clinically manageable cost, to screen DNA from patient tumours for mutations essentially genome-wide. The challenge for personalised medicine will be to increase the sample size to thousands or tens of thousands of well-characterised cases in order to attain sufficient statistical power to stratify patients accurately across the complexity and genomic heterogeneity expected for most of the common tumour types. Currently, whole genome sequencing on this scale is not feasible, and targeted sequencing of relevant portions of the genome will be required. Pilot data: We have developed protocols for large-scale, multiplexed sequencing of 100-200 genes in thousands of samples. Essentially, using robotic technology, genomic DNA from the cancer specimen is processed into sequencing libraries with unique DNA barcodes, thereby allowing sequencing reads to be attributed to the sample they derive from. Currently, these sequencing libraries can be generated in a 96-well format using fully automated protocols, and we are exploring methods to expand this to a 384-well format. The sequencing libraries are pooled and hybridized to custom sets of RNA baits representing the genomic regions of interest. Sequencing of the pulled-down libraries is done in pools of 48-96 samples per lane of an Illumina HiSeq. This protocol is already implemented at the Sanger Institute. We have published proof that somatic mutations in novel cancer genes can be identified from exome-wide sequencing. In unpublished pilot data, we have established the feasibility of robotic library production, custom pull-down, and multiplexed sequencing of barcoded libraries for 100 known myeloid cancer genes across 760 myelodysplasia samples. 
Highlights of the data thus far analysed reveal that the coverage is remarkably even between samples; when 96 samples are run, average coverage per lane of sequencing is ~250, with 90-95% of targeted exons covered by >25 reads; known mutations can be discovered in the data set; and the protocol is amenable to whole genome amplified DNA. The bioinformatic algorithms for identification of substitutions and indels in pull-down data are well-established; we have pilot data proving that copy number changes, LOH and genomic rearrangements in specific regions of interest can also be identified by tiling of baits across the relevant loci. Proposal: We propose to apply this methodology to 10000 samples from patients with AML enrolled in clinical trials over the last 10-20 years. Oncogenic point mutations and potentially genomic rearrangements will be identified, and linked to clinical outcome data, with a view to undertaking the following sorts of analyses: (1) identification of co-occurrence, mutual exclusivity and clusters of driver mutations; (2) correlation of prognosis with driver mutations and potentially gene-gene interactions; (3) exploration of genomic markers of drug response. Ultimately, we would like to be in a position to release the mutation data together with matched clinical outcome data to genuine medical researchers via a controlled access approach, possibly within the COSMIC framework (www.sanger.ac.uk/genetics/CGP/cosmic/). The vision here is to generate a portal whereby a clinician faced with an AML patient and his / her mutational profile can obtain a "personalised" prediction of outcome, together with a fair assessment of the uncertainty of the estimate. With a sufficient sample size, there would also be the potential to develop decision support algorithms for therapeutic choices based on such data. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  38 
 
  
    EGAD00001000607 
   
  
    
    PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, the PCR products were pooled and subjected to Illumina library preparation. The library will be sequenced by either HiSeq or MiSeq. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  2 
 
  
    EGAD00001000608 
   
  
    
    PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, the PCR products were pooled and subjected to Illumina library preparation. The library will be sequenced by either HiSeq or MiSeq. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  60 
 
  
    EGAD00001000609 
   
  
    
    Whole transcriptome sequencing of 28 untreated prostate cancers, 13 castration resistant prostate cancers, and 12 benign prostatic hyperplasias. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  53 
 
  
    EGAD00001000610 
   
  
    
    Methylated DNA immunoprecipitation sequencing of 28 untreated prostate cancers, 11 castration resistant prostate cancers, and 12 benign prostatic hyperplasias. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  51 
 
  
    EGAD00001000611 
   
  
    
    Small RNA sequencing of 28 untreated prostate cancers, 12 castration resistant prostate cancers, and 3 benign prostatic hyperplasias. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  43 
 
  
    EGAD00001000612 
   
  
    
    Low coverage whole genome sequencing of 27 untreated prostate cancers, 9 castration resistant prostate cancers, and 4 benign prostatic hyperplasias. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001000613 
   
  
    
    UK10K_NEURO_ASD_MGAS REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  97 
 
  
    EGAD00001000614 
   
  
    
    UK10K_NEURO_ASD_SKUSE REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  341 
 
  
    EGAD00001000615 
   
  
    
    UK10K_NEURO_FSZ REL-2013-04-20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  128 
 
  
    EGAD00001000616 
   
  
    
    Pilocytic Astrocytoma ICGC PedBrain whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  192 
 
  
    EGAD00001000617 
   
  
    
    Pilocytic Astrocytoma ICGC PedBrain RNA sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  73 
 
  
    EGAD00001000618 
   
  
    
    1204 Sardinian males 
    
   
  
    
   
  1195 
 
  
    EGAD00001000619 
   
  
    
    Experiments using targeted pulldown methods will be sequenced to validate findings in the exomes of patients with Myeloproliferative Neoplasms (MPN). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  360 
 
  
    EGAD00001000620 
   
  
    
    A bespoke targeted pulldown experiment will be performed on patients with Angiosarcoma. The resulting products will be sequenced to determine the prevalence of previously found mutations in these patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000621 
   
  
    
    We propose to definitively characterise the somatic genetics of Prostate cancer through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. This study will aim to validate the findings of the whole genome study by re-sequencing regions of interest using a bespoke pulldown bait. See ICGC website for more information: http://icgc.org/icgc/cgp/70/508/71331 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  18 
 
  
    EGAD00001000623 
   
  
    
    This VCF contains the full sequence data post QC. This consists of 41,911 individuals. All polymorphic sites are present in this VCF. 
    
   
  
    
   
  41911 
 
  
    EGAD00001000624 
   
  
    
    Multifocality or multicentricity in breast cancer may be defined as the presence of two or more tumor foci within a single quadrant of the breast or within different quadrants of the same breast, respectively. This original classification of breast cancer as multicentric or multifocal was based on the assumption that cancers arising in the same quadrant were more likely to arise from the same ductal structures than those occurring in separate areas of the breast. The problem with these definitions is that the "quadrants" of the breast are arbitrary external designations, as no internal boundaries exist. This project will therefore focus both on synchronous multifocal and multicentric tumors. The incidence of multifocal and multicentric breast cancers was reported to be between 13 and 75% depending on the definition used, the extent of the pathologic sampling of the breast and whether in situ disease is considered evidence of multicentricity (1). Although this incidence is variable, those figures show that it is a frequent phenomenon. Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic management of the cancer. Multifocality or multicentricity has been associated with a number of more aggressive features including an increased rate of regional lymph node metastases and adverse patient outcome when compared with unifocal tumors (2-3), and a possible increased risk of local recurrence following breast conserving surgery (4). For the moment, the literature is divided on whether there is a corresponding impact on survival outcomes. 
Today, the current convention to stage and to treat multifocal and multicentric tumors is the classical tumor-node-metastasis (TNM) staging guidelines, with which tumor size is assessed by the largest tumor focus without taking other foci of disease into consideration. While some papers, such as the recent one from Lynch and colleagues, support the current staging convention (3), others, such as Boyages et al., suggest that aggregate size and not the size of the largest lesion should be considered in order to refine the prognostic assessment of those tumors (5). On top of that, the question of whether multifocal/multicentric carcinomas are due to the spread of a single carcinoma throughout the breast or to multiple carcinomas arising simultaneously has been a matter of debate. Some studies suggested that multifocal breast cancer may result from either intramammary spread from a single primary tumor or multiple synchronous primary tumors, whereas others suggest that multiple breast carcinomas always arise from the same clone (6-8). Recently, Pietri and colleagues analyzed the biological characteristics of a series of 113 multifocal/multicentric breast cancers (8) which were diagnosed over a 5-year period. The expression of estrogen (ER) and progesterone (PgR) receptors, Ki-67 proliferative index, expression of HER2 and tumor grading were prospectively determined in each tumor focus, and mismatches among foci were recorded. Mismatches in ER status were present in 5 (4.4%) cases and PgR in 18 (15.9%) cases. Mismatches in tumor grading were present in 21 cases (18.6%), proliferative index (Ki-67) in 17 (15%) cases and HER2 status in 11 (9.7%) cases. Interestingly, this heterogeneity among foci has led to 14 (12.4%) patients receiving different adjuvant treatments compared with what would have been indicated if we had only taken into account the biologic status of the primary tumor. 
This study therefore showed that differences in biological characteristics of multifocal/multicentric lesions play a crucial role in the adjuvant treatment decision making process. In this study, we will concentrate on a larger series of patients with multifocal invasive ductal breast cancer lesions. We aim at: 1. Evaluating the incidence of multifocality according to the different breast cancer molecular subtypes (ER-/HER2-, HER2+, ER+/HER2-). 2. Evaluating the incidence of multifocality in patients with hereditary breast cancer disease (presence of germline BRCA1 or BRCA2 mutations). Moreover, we would like to investigate if multifocal lesions with BRCA1 or BRCA2 mutations exhibit a characteristic combination of substitution mutation signatures and a distinctive profile of deletions as demonstrated recently by Nik-Zainal and colleagues (9). 3. Correlating multifocality with clinical information in order to define its influence on patients' survival (DFS and OS). 4. Carrying out high-coverage targeted gene sequencing of driver cancer genes and genes whose mutation is of therapeutic importance in order to compare clinically-relevant genetic differences between several multifocal breast cancer lesions. 5. Evaluating the impact of the distance between the different lesions on the clinical outcome but also on the genetic differences. 6. Comparing gene expression patterns between several multifocal breast cancer lesions and correlating them with the results of the targeted genes screen. 7. Characterizing the genomic and transcriptomic status of cancer related genes in metastatic lesions (local recurrence, positive lymph node or distant metastatic sites) from the same multifocal invasive ductal breast cancer patients in order to evaluate the consequence of genomic and transcriptomic heterogeneity of multifocal lesions on metastatic lesions. 
Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic choice. This project has the potential to identify genetic/transcriptomic differences existing between several lesions constituting multifocal breast cancers, which in the routine clinical practice are usually considered to be homogeneous among them. We foresee validating significant results in a larger series of patients and this, in turn, could have a remarkable impact on the treatment and clinical management of multifocal breast cancers. Indeed, we hope to provide some evidence whether or not each focus matters in multifocal and multicentric breast cancer to define the adequate therapeutic approach, especially in the context of targeted therapies. The work to be done at Sanger will be target gene screen pooling of 1400 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  908 
 
  
    EGAD00001000625 
   
  
    
    The main objective of this benchmark is the comparison of the full sequencing pipeline of different ICGC partners, including procedures, methods and performance of library preparation and whole-genome deep-sequencing. A secondary objective will be a follow-up comparison of data analysis pipelines for identification of germline and somatic variants subsequent to the results of the ICGC Somatic Variant Calling Pipeline Benchmark. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000626 
   
  
    
    Exome sequencing data for tumor and matched normal samples of the EGAS00001000495 project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001000627 
   
  
    
    Transcriptome sequencing data of tumor and 10 matched normal samples of the EGAS00001000495 project 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  68 
 
  
    EGAD00001000628 
   
  
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD00001000630 
   
  
    
    In this study we will sequence the transcriptome of Verified Matched Pair Cancer Cell line tumour samples. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000631 
   
  
    
    PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, the PCR products were pooled and subjected to Illumina library preparation. The library will be sequenced by either HiSeq or MiSeq. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001000632 
   
  
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  12 
 
  
    EGAD00001000634 
   
  
    
    The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints, incorporation of non-templated sequence at the junction, and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000635 
   
  
    
    The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints, incorporation of non-templated sequence at the junction, and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001000636 
   
  
    
    The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints, incorporation of non-templated sequence at the junction, and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  117 
 
  
    EGAD00001000637 
   
  
    
    Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000638 
   
  
    
    Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000639 
   
  
    
    Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000640 
   
  
    
    Transcriptome studies in patients with rare genetic diseases can potentially aid in the
interpretation of likely causal genetic variation through identification of altered transcript
abundance and/or structure. RNA-Seq is the most sensitive assay for both investigating
transcript structure and abundance.
The primary aim of this pilot project is to investigate to what degree integrating exome-Seq
and RNA-Seq data on the same individual can accelerate the identification of causal alleles
for rare genetic diseases. There are two main strands to this: (i) identifying which variants
discovered in exome-seq appear to be having a functional impact on transcripts, and (ii)
identifying transcript outliers, especially among known causal genes, that may not necessarily
have a causal variant identified from exome sequencing. The latter may identify the presence
of causal variants that lie far from coding regions (e.g. the formation of cryptic splice sites
deep within introns, or loss of long range regulatory elements), which can be confirmed with
further targeted genetic assays. Just over 50% of all disease-causing variants recorded in the
Human Gene Mutation Database (HGMD) affect transcript structure and abundance (e.g.
nonsense SNVs, essential splice site SNVs, frameshifting indels, CNVs).
This pilot project will study RNA from lymphoblastoid cell-lines from 12 patients with
primordial dwarfism syndromes; for 10 of these samples we have previously generated exome
data as part of our collaboration with the group of Prof Andrew Jackson. The two remaining
samples are positive controls where the causal mutation is known, and is known to affect
transcript structure and/or abundance.
Primordial dwarfism is a prime candidate for these RNA-seq studies because all known
causal mutations to date have key roles in DNA replication and thus, unsurprisingly, the
products of the causal genes are typically ubiquitously expressed.
Each RNA will be sequenced, with two technical replicates (independent RT-PCR and libraries) per
sample, and each replicate run in 1/2 of a HiSeq lane using 100bp paired reads. 
Sample preparation was as follows: the cells were grown to confluency, then pellets were frozen at -80. RNA samples were prepared using the Qiagen RNeasy kit, then nanodropped and analyzed using the bioanalyzer to determine concentration and purity.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001000641 
   
  
    
    DNA replication errors occurring in mismatch repair (MMR) deficient cells persist as mismatch mutations and predispose to a range of tumors. Here, we sequenced the first whole-genomes from MMR-deficient endometrial tumors. 
    
   
  
    
      
      Complete Genomics 
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001000642 
   
  
    
   
  
    
      
      Illumina HiScanSQ 
      
    
   
  2 
 
  
    EGAD00001000643 
   
  
    
   
  
    
      
      Illumina HiScanSQ 
      
    
   
  2 
 
  
    EGAD00001000644 
   
  
    
    ICGC PedBrain DNA Methylation project 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001000645 
   
  
    
    ICGC MMML-seq Data Freeze July 2013 whole genome sequencing 
    
   
  
    
   
  42 
 
  
    EGAD00001000646 
   
  
    
    A selection of human cancers harbours somatic driver mutations in genes encoding histones, most notably childhood brain tumours with K27M substitutions of the histone 3.3 gene, H3F3A. We performed whole genome sequencing of the benign cartilage tumour, chondroblastoma, and targeted sequencing of histone 3.3 genes, H3F3A and H3F3B, in seven further skeletal tumour types. We identified an exceptionally high prevalence of novel histone 3.3 driver mutations at glycine 34 and at lysine 36. Histone 3.3 gene mutations were found in 91% of giant cell tumours of bone (48/53), mainly H3F3A G34W variants, and in 92% of chondroblastoma (73/79), predominantly K36M mutations in H3F3B. H3F3B is paralogous to the cancer gene H3F3A. However, H3F3B driver variants have not previously been reported in human cancer. Our observations demonstrate remarkable tumour-specificity of mutations, with respect to which histone 3.3 gene and residue is mutated, indicating that the advantage these mutations confer is tumour dependent. Moreover, tumour-specific mutation of H3F3A and H3F3B suggests that, although both genes encode identical proteins, they are likely non-redundant and employed differentially during skeletal development. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000647 
   
  
    
    We are sequencing the exomes of patients with paroxysmal neurological disorders, mainly focusing on migraine and epilepsy. Cases are collected from performance sites of members of EuroEPINOMICS. Most cases have a strong family history. The study sample will include both cases and controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  110 
 
  
    EGAD00001000648 
   
  
    
    ICGC MMML-seq Data Freeze July 2013 transcriptome sequencing 
    
   
  
    
   
  31 
 
  
    EGAD00001000650 
   
  
    
    ICGC MMML-seq Data Freeze July 2013 miRNA sequencing 
    
   
  
    
   
  52 
 
  
    EGAD00001000652 
   
  
    
    Pulldown experiments will be performed on a number of patients with Myeloproliferative Neoplasms (MPN). The pulldown will be a bespoke design targeting known mutations; it will be sequenced and analysed to determine the prevalence of mutations and to assess its possible use as a diagnostic tool. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1036 
 
  
    EGAD00001000653 
   
  
    
    This is a continuation of the Chordoma Sequencing Project. All cancers arise due to somatically acquired abnormalities in DNA sequence. Systematic sequencing of cancer genomes allows acquisition of complete catalogues of all classes of somatic mutation present in cancer. These mutation catalogues will allow identification of the somatically mutated cancer genes that are operative and characterise patterns of somatic mutation that may reflect previous exogenous and endogenous mutagenic exposures. In this application, we aim to perform whole genome sequencing on 10 chordoma matched genome pairs. RNA Sequencing/Methylation and SNP6 and an additional sequencing of three cancer cell lines will be added to this work. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000654 
   
  
    
    DATA FILES FOR BALL-PAX5 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  153 
 
  
    EGAD00001000655 
   
  
    
    DATA FILES FOR Histone-NSD2_RNASeq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000656 
   
  
    
    FACS phenotype of 1629 Sardinian samples 
    
   
  
    
   
  1629 
 
  
    EGAD00001000657 
   
  
    
    DATA FILES FOR Histone Capture bams 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  962 
 
  
    EGAD00001000658 
   
  
    
    Changes in gene dosage are a major driver of cancer (1), engineered from a finite, but increasingly well annotated, repertoire of mutational mechanisms (2-6). These processes operate over levels ranging from individual exons to whole chromosomes, often generating correlated copy number alterations across hundreds of linked genes. An example of the latter is the 2% of childhood acute lymphoblastic leukemia (ALL) characterized by recurrent intrachromosomal amplification of megabase regions of chromosome 21 (iAMP21) (7,8). To dissect the interplay between mutational processes and selection on this scale, we used genomic, cytogenetic and transcriptional analysis, coupled with novel bioinformatic approaches, to reconstruct the evolution of iAMP21 ALL. We find that individuals born with the rare constitutional Robertsonian translocation between chromosomes 15 and 21, rob(15;21)(q10;q10)c, have ~2700-fold increased risk of developing iAMP21 ALL compared to the general population. In such cases, amplification is initiated by chromothripsis involving both sister chromatids of the dicentric Robertsonian chromosome. In contrast, sporadic iAMP21 is typically initiated by breakage-fusion-bridge (BFB) events, often followed by chromothripsis or other rearrangements. In both sporadic iAMP21 and iAMP21 in rob(15;21)c individuals, the final stages of amplification frequently involve large-scale duplications of the abnormal chromosome. The end-product is a derivative chromosome 21 or a derivative originating from the rob(15;21)c chromosome, der(15;21), respectively, with gene dosage optimised for leukemic potential, showing constrained copy number levels over multiple linked genes. In summary, the constitutional translocation, rob(15;21)c, predisposes to leukemia through a novel mechanism, namely a propensity to undergo chromothripsis, likely related to its dicentric nature. 
More generally, our data illustrate that several cancer-specific mutational processes, applied sequentially, can co-ordinate to fashion copy number profiles over large genomic scales, incrementally refining the fitness benefits of aggregated gene dosage changes. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001000659 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000660 
   
  
    
    Analysis .bam files from HiSeq sequencing of Australian ICGC PDAC study samples, submitted 20130826 
    
   
  
    
   
  353 
 
  
    EGAD00001000661 
   
  
    
    Bespoke validation experiments will be performed on ER+ Breast Cancer cases to confirm the presence of mutations found in whole genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000662 
   
  
    
    We propose to definitively characterise the somatic genetics of Triple negative breast cancer through generation of comprehensive catalogues of somatic mutations in 500 cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. This study will use a bespoke bait set to pulldown regions of interest found in whole genome sequencing to validate mutations found. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000663 
   
  
    
    This study aims to re-sequence findings from whole genome studies using a bespoke pulldown method to validate mutations in those genomes sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  47 
 
  
    EGAD00001000664 
   
  
    
    Whole Genome Seq: Illumina HiSeq sequence data (with >30x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. Paired-end RNA sequencing reads were mapped to the hg19 assembly of the human reference genome using BWA. Each ChIP-seq library was sequenced with two complete lanes on the Illumina HiSeq 2500 in the 101-bases paired-end rapid mode and aligned to hg19 using BWA. This resulted in the following coverage values (genome-wide, after deduplication, including all uniquely mapping reads): GBM103: macroH2A1 17x, H3K36me3 20x; MB59: macroH2A1 11x, H3K36me3 11x. 
    
   
  
    
   
  7 
 
  
    EGAD00001000665 
   
  
    
    Illumina HiSeq sequence data (with >30x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. Sample derived from secondary myelodysplastic syndrome (MDS), arising after treatment for medulloblastoma in an 11-year old female Li-Fraumeni syndrome case (LFS-MB1; Rausch et al., 2012; matching WGS data available under EGAS00001000085). 
    
   
  
    
   
  1 
 
  
    EGAD00001000666 
   
  
    
    HSC73_clone: Bone marrow mononuclear cells from the healthy 73-year-old female were thawed and labeled with Alexa-Fluor 488-conjugated anti-CD34 (581, Biolegend), Alexa-Fluor 700-conjugated anti-CD38 (HIT2, eBioscience), a cocktail of APC-conjugated lineage antibodies consisting of anti-CD4 (RPA-T4), anti-CD8 (RPA-T8), anti-CD11b (ICRF44), anti-CD20 (2H7), anti-CD56 (B159, all BD Biosciences), anti-CD14 (61D3), anti-CD19 (HIB19) and anti-CD235a (HIR2, all eBioscience) and 1 μg/ml propidium iodide (Sigma). Using a BD FACSAria cell sorter, single Lin-CD34+CD38-PI- cells were individually sorted into low-adhesion 96-well tissue culture plates (Corning) containing 100 μl of StemSpan Serum-Free Expansion Medium (Stemcell Technologies) supplemented with 100ng/ml of human SCF and FLT-3L, 50ng/ml of human TPO, 20ng/ml of human IL-3, IL-6 and G-CSF (all cytokines from Peprotech) and 50U/ml of penicillin and 50μg/ml of streptomycin (Sigma). Cells were incubated at 37 degrees C in a humidified atmosphere with 5% CO2 in air. After 5 days in culture, another 100 μl of cytokine-containing medium was added. 13 days after seeding, clones B6 and G2 had expanded to approx. 10^5 cells and were selected for whole genome sequencing (2x101bp, paired-end, Illumina HiSeq2500) after tagmentation-based library preparation (see Extended Experimental Procedures) for clone B6 and standard library preparation for clone G2. For germline control, ~10^6 unsorted bone marrow mononuclear cells from the same donor were used for sequencing. An average of 30-fold sequence coverage was obtained for each of the clones and the matching control.
L4clone: A progenitor cell clone was raised from a peripheral blood sample of the 39-year-old healthy female. Frozen peripheral blood mononuclear cells (PBMCs) were isolated from 2 ml heparinised peripheral blood via Ficoll Paque density centrifugation. A methylcellulose assay was performed as described earlier (Weisse et al., 2012). 
In brief, non-adherent mononuclear cells were incubated in the presence of the recombinant human cytokines IL-3, IL-5 and GM-CSF (R&D systems) over 14 days to induce colony formation. Colonies were detected under an inverted light microscope, and plucked by a pipette when colonies had approximately 10,000 cells/CFU. Each colony was washed three times in PBS and finally frozen as a cell pellet at -80 degrees C. Genomic DNA was isolated using the QIAamp DNA micro kit according to the instructions of the manufacturer (Qiagen, Hilden, Germany). Whole genome sequencing (2x101bp, paired-end, Illumina HiSeq2500) was performed for colony 4 after tagmentation-based library preparation and resulted in 15-fold sequence coverage for each of the colony and the matching whole blood. 
    
   
  
    
   
  5 
 
  
    EGAD00001000667 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  72 
 
  
    EGAD00001000669 
   
  
    
    High-grade serous ovarian cancer (HGSC) is characterized by poor outcome, often attributed to the emergence of treatment-resistant subclones. We sought to measure the degree of genomic diversity within primary, untreated HGSCs to examine the natural state of tumour evolution prior to therapy. We performed exome sequencing, copy number analysis, targeted amplicon deep sequencing and gene expression profiling on 31 spatially and temporally separated HGSC tumour specimens (six patients), including ovarian masses, distant metastases and fallopian tube lesions. We found widespread intratumoural variation in mutation, copy number and gene expression profiles, with key driver alterations in genes present in only a subset of samples (e.g. PIK3CA, CTNNB1, NF1). On average, only 51.5% of mutations were present in every sample of a given case (range 10.2 to 91.4%), with TP53 as the only somatic mutation consistently present in all samples. Complex segmental aneuploidies, such as whole-genome doubling, were present in a subset of samples from the same individual, with divergent copy number changes segregating independently of point mutation acquisition. Reconstruction of evolutionary histories showed one patient with mixed HGSC and endometrioid histology, with common aetiologic origin in the fallopian tube and subsequent selection of different driver mutations in the histologically distinct samples. In this patient, we observed mixed cell populations in the early fallopian tube lesion, indicating that diversity arises at early stages of tumourigenesis. Our results revealed that HGSCs exhibit highly individual evolutionary trajectories and diverse genomic tapestries prior to therapy, exposing an essential biological characteristic to inform future design of personalized therapeutic solutions and investigation of drug-resistance mechanisms. 
    
   
  
    
      
      Illumina Genome Analyzer 
      
    
   
  25 
 
  
    EGAD00001000670 
   
  
    
    A potential and very serious side effect of treating IBD with anti-TNFα therapies (the current gold standard) is the development of systemic lupus erythematosus (SLE). This side effect is rare and unpredictable. Out of several thousand cases having received treatment, the University of Calgary have accumulated 12 individuals with full phenotyping and novel serological antibody discovery panel data. We propose to exome sequence these samples in an effort to identify rare highly-penetrant variants that could be underlying this severe phenotype. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001000671 
   
  
    
    Primary sclerosing cholangitis (PSC) is a rare autoimmune disease of the liver (prevalence = 10/100,000) with a mean age of onset of 40 years. We are currently undertaking GWAS and immunochip experiments to identify loci underlying PSC susceptibility. Through our collaborators at the University of Calgary we have access to DNA from three parent-offspring trios where the children required liver transplants due to PSC before the age of 9. These are extremely rare cases indeed and we believe that exome sequencing represents a powerful means of identifying the causal mutation underlying this severe phenotype. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000672 
   
  
    
    Whole-genome Bisulfite sequencing of two multiple myeloma samples and one pooled sample of plasma cells. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000673 
   
  
    
    WGBS-seq for monocytes and neutrophils 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000674 
   
  
    
    DNaseI-seq for monocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000675 
   
  
    
    RNA-seq for monocytes and neutrophils 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000676 
   
  
    
    ChIP-seq for monocytes and neutrophils 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000677 
   
  
    
    Genome-wide analysis of H3K27me3 occupancy and DNA methylation in
K27M-mutant and H3.3-WT primary pediatric high-grade gliomas (pHGGs)
as well as pediatric pHGG cell lines. The study aims to elucidate the
connection between K27M-induced H3K27me3 reduction and changes in DNA
methylation as well as gene expression. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001000678 
   
  
    
    FFPE CPA accreditation of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  341 
 
  
    EGAD00001000679 
   
  
    
    A bespoke targeted pulldown experiment will be performed on patients with angiosarcoma. The resulting products will be sequenced to determine the prevalence of previously found mutations in these patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  107 
 
  
    EGAD00001000680 
   
  
    
    Single-end short-read (50 bp) SOLiD 4 sequencing data for 300 individuals, constituting 100 patient-parent trios. For more details, please see: http://www.nejm.org/doi/full/10.1056/NEJMoa1206524 
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  202 
 
  
    EGAD00001000688 
   
  
    
    In this study we performed ultra deep sequencing of genes associated with anti-EGFR resistance, such as KRAS, BRAF, PIK3CA, and EGFR in 17 plasma-DNA samples from a total of 10 patients treated with anti-EGFR therapy. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  25 
 
  
    EGAD00001000689 
   
  
    
    Whole genome DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of 3 men. For each of three different prostates, multiple tumour samples (4, 5, and 3 depending on the case) and one normal tissue sample were whole genome sequenced with a matched blood sample using the Illumina HiSeq platform. Tumour samples were sequenced to a target depth of 50X and normals and blood to a target depth of 30X.
As of September 2020, some of the studies using these data include:
Cooper et al, Nature Genetics 2015 (PMID: 25730763)
Wedge et al, Nature Genetics 2018 (PMID: 29662167)
Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001000691 
   
  
    
    Dataset for "Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability" 
    
   
  
    
   
  12 
 
  
    EGAD00001000692 
   
  
    
    Files associated with the dataset: HS1626.bam, HS1484.bam, HS1483.bam, HS1482.bam, HS1481.bam, HS1480.bam, HS1479.bam, HS1478.bam, A13805.bam, A13800.bam, A13799.bam, A05253.bam, A05252.bam, A13806.bam 
    
   
  
    
      
      Illumina Genome Analyzer 
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000693 
   
  
    
    The genetic consequences of cellular transformation by Epstein-Barr-Virus were assessed by comparing whole genome sequences of the original genome (before transformation) and the genome after transformation. 
    
   
  
    
   
  2 
 
  
    EGAD00001000694 
   
  
    
    This is an ongoing project and a continuation of the sequencing we have been doing over the last few years. We have some additional families and probands with syndromes of insulin resistance not previously sequenced within UK10K or other core-funded projects. We would like to complete the sequencing in all of the good-quality families and probands we have; this would require another ~50 samples to be WES sequenced. This cohort has already proven to be a rich source of interesting findings, with papers in Science and Nature Genetics. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  68 
 
  
    EGAD00001000695 
   
  
    
    DATA FILES FOR SJLGG 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001000696 
   
  
    
    The Ethiopian area stands among the most ancient ones ever occupied by human populations and their ancestors. In particular, according to archaeological evidence, it is possible to trace back the presence of hominids up to at least 3 million years ago. Furthermore, the present-day human populations show great cultural, linguistic and historic diversity, which makes them essential candidates for investigating a considerable part of African variability. Following the typing of 300 Ethiopian samples on Illumina Omni 1M (see Human Variability in Ethiopia project, previously approved by the Genotyping committee) we now have a clearer idea of which populations living in the area include most of the diversity.
This project therefore aims to sequence the whole genome of 300 individuals at low (4-8x) depth belonging to the six most representative populations of the Ethiopian area to produce a unique catalogue of variants peculiar to North East Africa. Furthermore, 6 samples (one from each population) will also be sequenced at high (30x) depth to ensure full coverage of the diversity spectrum.
The retrieved variants will be of great help in evaluating the demographic dynamics of those populations as well as shedding light on the migrations out of Africa. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000697 
   
  
    
    Illumina HiSeq sequence data (with >30x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  90 
 
  
    EGAD00001000698 
   
  
    
    Illumina HiSeq sequence data (with >80x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. The whole exome sequencing data of 20 SHH medulloblastomas from phs000504.v1.p1 dataset has been used in our study on SHH medulloblastomas: http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000504.v1.p1 
    
   
  
    
   
  4 
 
  
    EGAD00001000699 
   
  
    
    Illumina HiSeq sequence data (with >80x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001000702 
   
  
    
    Complete set of bam files associated with study EGAS00001000622 
    
   
  
    
   
  190 
 
  
    EGAD00001000703 
   
  
    
    SCLC - Whole genome sequencing data
Publication Peifer et al., 2012, Nature Genetics 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  29 
 
  
    EGAD00001000704 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001000705 
   
  
    
    Whole genome sequencing of 20 tumour and normal pairs of diffuse intrinsic pontine glioma (DIPG) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001000706 
   
  
    
    Whole exome sequencing of 6 tumour and normal pairs of diffuse intrinsic pontine glioma (DIPG) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000707 
   
  
    
    Discovery of resistance mechanisms to the BRAF inhibitor vemurafenib in metastatic BRAF mutant melanoma by massively-parallel sequencing of tumour samples. Comparison of genomic characteristics of pretreatment 'sensitive' to recurrence 'resistant' tumours to identify the genetics of drug resistance. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  57 
 
  
    EGAD00001000708 
   
  
    
    AZIN1 amplicon sequencing data of the EGAS00001000495 project. 
    
   
  
    
      
      454 GS FLX Titanium 
      
    
   
  69 
 
  
    EGAD00001000709 
   
  
    
    Dataset of CageKid Blood DNA samples 
    
   
  
    
   
  95 
 
  
    EGAD00001000710 
   
  
    
    Whole Genome Bisulfite-seq of four B cell samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000711 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001000712 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  72 
 
  
    EGAD00001000713 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000714 
   
  
    
   
  
    
   
  102 
 
  
    EGAD00001000715 
   
  
    
    Exome sequencing was performed for paired tumor/normal samples from patients with corticotropin-independent Cushing's syndrome. Tumor DNA was extracted from adrenocortical adenomas and normal DNA was extracted from adjacent adrenal tissues or peripheral blood. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000716 
   
  
    
    RNAseq data, Publication Fernandez-Cuesta et al., 2014, CD74-NRG1 fusions in lung adenocarcinoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001000717 
   
  
    
    Dataset of CageKid Tumor DNA samples 
    
   
  
    
   
  95 
 
  
    EGAD00001000718 
   
  
    
    Dataset of CageKid Tumor RNA samples 
    
   
  
    
   
  91 
 
  
    EGAD00001000719 
   
  
    
    Dataset of CageKid Normal RNA samples 
    
   
  
    
   
  45 
 
  
    EGAD00001000720 
   
  
    
    Dataset of CageKid tumor-normal paired RNA samples 
    
   
  
    
   
  90 
 
  
    EGAD00001000721 
   
  
    
    This is a continuation of the Chordoma Sequencing Project. All cancers arise due to somatically acquired abnormalities in DNA sequence. Systematic sequencing of cancer genomes allows acquisition of complete catalogues of all classes of somatic mutation present in cancer. These mutation catalogues will allow identification of the somatically mutated cancer genes that are operative and characterise patterns of somatic mutation that may reflect previous exogenous and endogenous mutagenic exposures. In this application, we aim to perform whole genome sequencing on 10 chordoma matched genome pairs. RNA Sequencing/Methylation and SNP6 and an additional sequencing of three cancer cell lines will be added to this work. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000722 
   
  
    
    Extension of angiosarcoma whole genome sequencing study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000723 
   
  
    
    Relative Spatial Homogeneity of Embryonal Brain Tumors of Childhood 
    
   
  
    
   
  42 
 
  
    EGAD00001000724 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  68 
 
  
    EGAD00001000725 
   
  
    
    This dataset contains RNA sequencing data for 675 cancer cell lines. RNA libraries were made with the TruSeq RNA Sample Preparation kit (Illumina) according to the manufacturer's protocol. The libraries were sequenced on an Illumina HiSeq 2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  675 
 
  
    EGAD00001000726 
   
  
    
    In total 30 Acute Myeloid Leukemias with an acquired inv(3)(q21q26) or t(3;3)(q21;q26) have been characterized by whole transcriptome sequencing (RNA-Seq). The 3q-aberration leads to overexpression of the proto-oncogene EVI1, but the mechanism of overexpression has thus far been elusive. The RNA-Seq was integral in determining the precise enhancer inducing the overexpression and led to other key discoveries. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001000727 
   
  
    
    Targeted resequencing of the specific regions chr3:126036241-130672290 and chr3:157712147-175694147 in hg19, centered on the chromosomal regions 3q21 and 3q26 respectively. The focus lies on the detection of the exact breakpoints in Acute Myeloid Leukemia (AML) patients having acquired an inv(3)(q21q26) or t(3;3)(q21;q26). This dataset contains all information needed to detect all structural variants contained within these regions, including the 3q-aberrations inducing the overexpression of the proto-oncogene EVI1. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  38 
 
  
    EGAD00001000728 
   
  
    
    Low coverage whole genome sequencing of samples from individuals from Friuli Venezia Giulia, an Italian genetic isolate population. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  199 
 
  
    EGAD00001000729 
   
  
    
    The Val Borbera is a region characterized by low iodine and a high prevalence of thyroid disorders, the commonest endocrine disorders in the general population. About 30% of the participants of the Val Borbera Project were affected by such disorders and were characterized by several parameters: TSH level, anti-TPO antibodies, echography and family origin. Individuals with extreme phenotypes were identified and could be clustered based on family origin and genotype. 
We propose to exome sequence 6 of them, affected with true goiter, at high depth (40-60x) to obtain information on rare exonic variants. Due to the family structure and to the availability of whole genome sequence information on 110 individuals from the isolated population, we expect to be able to identify putative causative variants for thyroid disorders that may be studied in the remaining affected individuals. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000730 
   
  
    
    The VBSEQ project aims to combine available extensive genetic and phenotypic data to the latest high-throughput genome sequencing technology and ad hoc statistical analysis to identify new rare genetic variants underlying complex traits. Up to 100 Val Borbera samples will be sequenced to a 6x depth. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  110 
 
  
    EGAD00001000731 
   
  
    
    This study includes Phase 2 whole-genome sequencing data (at 4x depth) of 100 individuals from an Italian genetic isolate population (Val Borbera, abbreviated VBI) of the Italian Network of Genetic Isolates (INGI). The INGI-VBI_SEQ2 project aims to combine available extensive genetic and phenotypic data to the latest high-throughput genome sequencing technology and ad hoc statistical analysis to identify new rare genetic variants underlying complex traits. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001000732 
   
  
    
    RNA sequencing to validate findings of somatic pseudogenes acquired during cancer development 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000733 
   
  
    
    The dataset comprises 48 RRBS libraries of 24 sibling pairs. 24 individuals were conceived during the Dutch Famine, a severe 6-month famine at the end of World War 2. A same-sex sibling was added as a control for each, allowing partial matching for (early) familial environment and genetics. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  48 
 
  
    EGAD00001000734 
   
  
    
    Paired end Illumina sequencing of whole exomes of multiple tumour regions. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  88 
 
  
    EGAD00001000735 
   
  
    
    Here we present the genomes of three secondary angiosarcomas 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000737 
   
  
    
    Whole exome sequencing data from 30 donors (46 tumors and 30 non-tumoral whole exome sequencing, paired-end, HiSeq 2000, Illumina) collected by the Inserm U674, PI Jessica Zucman-Rossi - Institut National du Cancer (INCa), PI Fabien Calvo, France. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  76 
 
  
    EGAD00001000738 
   
  
    
    Extension of angiosarcoma whole genome sequencing study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000740 
   
  
    
    UK10K_COHORT_ALSPAC REL-2012-06-02: Low-coverage whole genome sequencing; variant calling, genotype calling and phasing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2307 
 
  
    EGAD00001000741 
   
  
    
    UK10K_COHORT_TWINSUK REL-2012-06-02: Low-coverage whole genome sequencing; variant calling, genotype calling and phasing 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  1854 
 
  
    EGAD00001000743 
   
  
    
    These files contain a total of 20.4M SNVs and the complete information output by the GATK UnifiedGenotyper v1.4 on all 767 GoNL samples. These calls are not trio-aware and all genotypes were reported regardless of their quality. Both filtered and passing calls are reported in these files. Filtered calls include (1) calls failing our VQSR threshold and (2) calls in the GoNL inaccessible genome. 
    
   
  
    
   
  - 
 
  
    EGAD00001000744 
   
  
    
    The samples in this panel come from 250 families: 248 parent-child trios and 2 parent-child duos. As the children do not provide additional haplotypes or population information, they were excluded from the panel. The samples present in the release are composed of 248 couples, 2 single individuals and 1 sample composed from the 2 haplotypes from the duos' children transmitted by their missing parent. The composed sample is named gonl-220c_223c. The files contain a total of 18.9M SNVs and 1.1M INDELs in autosomal chromosomes. They were generated by phasing/imputing the SNVs (a) and INDELs (b) using MVNCall. Only sites passing filters are reported. Sites filtered as part of the GoNL inaccessible genome were kept (but flagged as filtered); they may still contain true positive calls but should be used with care, as they are located in parts of the genome that are less well captured (systematically under- or over-covered, or with low mapping quality). 
    
   
  
    
   
  - 
 
  
    EGAD00001000745 
   
  
    
    Data supporting the paper Transcriptional diversity during lineage commitment of human blood progenitors 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      PacBio RS 
      
    
   
  26 
 
  
    EGAD00001000746 
   
  
    
    Fernandez-Cuesta et al., RNAseq data Pipeline 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001000747 
   
  
    
    Genomic libraries will be generated from total genomic DNA derived from 4000 samples with Acute Myeloid Leukaemia. Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 64 samples will be individually barcoded and subjected to up to one lane of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2734 
 
  
    EGAD00001000748 
   
  
    
    In this study we performed whole genome sequencing of plasma DNA (plasma-Seq) of 19 plasma-DNA samples from a total of 10 patients treated with anti-EGFR therapy. We demonstrated that development of resistance to anti-EGFR therapies is frequently associated with focal amplifications of KRAS, MET, and ERBB2. We also showed that focal KRAS amplifications can be acquired in tumor genomes of patients under cytotoxic chemotherapy. Furthermore, we provide evidence that specific chromosomal polysomies, such as overrepresentations of 12p and 7p, harboring KRAS and EGFR, respectively, determine responsiveness to anti-EGFR therapy. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  19 
 
  
    EGAD00001000749 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000750 
   
  
    
    UK10K_RARE_FIND REL-2013-10-31 variant calling 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1151 
 
  
    EGAD00001000752 
   
  
    
    UK10K_RARE_CILWG REL-2013-09-09 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000753 
   
  
    
    UK10K_RARE_FINDWG REL-2013-09-09 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000754 
   
  
    
    UK10K_RARE_NMWG REL-2013-09-09 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000755 
   
  
    
    UK10K_OBESITY_GS UK10K_EXOME_EXTRAS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000756 
   
  
    
    UK10K_OBESITY_SCOOP UK10K_EXOME_EXTRAS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000757 
   
  
    
    UK10K_RARE_SIR UK10K_EXOME_EXTRAS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000758 
   
  
    
    dataset for BGI bladder cancer project 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  198 
 
  
    EGAD00001000759 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001000760 
   
  
    
    dataset for esophageal cancer, 17 pairs for whole-genome sequencing and 71 pairs for whole-exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  176 
 
  
    EGAD00001000761 
   
  
    
    In order to establish copy number profiles from the various samples we prepared libraries and subjected them to whole-genome sequencing at a shallow sequencing depth (0.1x) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  14 
 
  
    EGAD00001000762 
   
  
    
    We utilized exome sequencing for DNA obtained from saliva (germline DNA) and the four spatially separated tumor foci and 3 corresponding lymph node metastases 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000763 
   
  
    
    We used targeted deep sequencing to accurately establish the allele frequencies of the mutations identified by exome sequencing 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  23 
 
  
    EGAD00001000764 
   
  
    
    Adrenocortical carcinomas (ACC) are aggressive cancers originating in the cortex of the adrenal glands. Despite the overall poor prognosis, ACC outcome is heterogeneous. CTNNB1 and TP53 mutations are frequent in these tumors, but the complete spectrum of genetic changes remains undefined. Exome sequencing and SNP array analysis of 45 ACC revealed recurrent alterations in known drivers (CTNNB1, TP53, CDKN2A, RB1, MEN1) and genes not previously reported to be altered in ACC (ZNRF3, DAXX, TERT and MED12), which were validated in an independent cohort of 77 ACC. The cell-surface transmembrane E3 ubiquitin ligase ZNRF3 was the most frequently altered gene (21%), and appears to be a potential novel tumor suppressor gene related to the β-catenin pathway. Our integrated genomic analyses led to the identification of two distinct molecular subgroups with opposite outcomes. The C1A group of poor-outcome ACC was characterized by numerous mutations and DNA methylation alterations, whereas the C1B group with good prognosis displayed a specific deregulation of two miRNA clusters. Thus, aggressive and indolent ACC correspond to two distinct molecular entities, driven by different oncogenic alterations. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  45 
 
  
    EGAD00001000774 
   
  
    
    This study includes whole-genome sequencing data (at 4x depth) of 100 individuals from an Italian genetic isolate population (Carlantino, abbreviated CARL) of the Italian Network of Genetic Isolates (INGI). The INGI-CARL_SEQ project aims to combine available extensive genetic and phenotypic data to the latest high-throughput genome sequencing technology and ad hoc statistical analysis to identify new rare genetic variants underlying complex traits. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  106 
 
  
    EGAD00001000775 
   
  
    
    Whole exome sequencing of 41 melanomas and normal DNA from Braf mutant mice: 15 tumours from UV exposed mice, 15 tumours from non-exposed mice and 11 from UV exposed, sunscreen-protected mice. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001000776 
   
  
    
    UK10K_COHORT_IMPUTATION REL-2012-06-02: imputation reference panel (20140306); Merged UK10K+1000Genomes Phase 3 imputation reference panel added (20160420) 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  3781 
 
  
    EGAD00001000777 
   
  
    
    Dataset contains MeDIP-Seq, MRE-Seq and H3K4me3 ChIP-Seq data on 5 GBM patients. 
    
   
  
    
   
  16 
 
  
    EGAD00001000779 
   
  
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  2 
 
  
    EGAD00001000780 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001000781 
   
  
    
    Whole genome, high coverage, sequencing of 128 Ashkenazi Jewish controls 
    
   
  
    
   
  128 
 
  
    EGAD00001000782 
   
  
    
    Whole-genome sequencing was performed by Illumina Inc (San Diego, CA). Libraries were constructed with ~300bp insert length and paired-end 100bp reads were sequenced on Illumina HiSeq2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  190 
 
  
    EGAD00001000783 
   
  
    
    Genomic libraries will be generated from total genomic DNA derived from 200+ patients with childhood Transient Myeloproliferative Disorder (TMD) and/or Acute Megakaryocytic Leukemia (AMKL), as well as some matched constitutional samples (n < 50). Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 96 samples will be individually barcoded and subjected to up to two lanes of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  400 
 
  
    EGAD00001000784 
   
  
    
    This study aims to target capture sequence regions of interest from DNA derived from breast cancer patients who received neo-adjuvant chemotherapy. All patients had multiple biopsies performed before chemotherapy. Patients who had residual disease after the course of treatment underwent a further biopsy. We aim to characterise the mutations involved. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  242 
 
  
    EGAD00001000785 
   
  
    
    We propose to definitively characterise the somatic genetics of a selection of rare bone cancers through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001000786 
   
  
    
    We are interested in the contribution mutations in the Shelterin complex protein POT1 may have to the development of melanoma. We have identified a patient who carries a splice site mutation in POT1 and as part of our analysis of this gene we aim to sequence the transcriptome of this patient to see how this mutation influences splicing. RNA has been obtained from lymphocytes collected from the patient. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001000789 
   
  
    
    UK10K_COHORT_ALSPAC REL-2012-06-02: Phenotype data 
    
   
  
    
   
  1927 
 
  
    EGAD00001000790 
   
  
    
    UK10K_COHORT_TWINSUK REL-2012-06-02: Phenotype data 
    
   
  
    
   
  1854 
 
  
    EGAD00001000791 
   
  
    
    Exome sequencing of familial and sporadic small cell cancer of ovary cases. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001000792 
   
  
    
    Whole exome sequencing of paediatric glioblastoma with mutations reported in the manuscript: Mutations in ACVR1, FGFR1 and TP53 associate with tumor location in histone H3 K27M pediatric midline high-grade astrocytoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  38 
 
  
    EGAD00001000794 
   
  
    
    Small cell carcinoma of the ovary of hypercalcemic type (SCCOHT) is an extremely rare, aggressive cancer affecting children and young women. We identified germline and somatic inactivating mutations in the SWI/SNF chromatin-remodeling gene SMARCA4 in 75% (9/12) of SCCOHT patients in addition to SMARCA4 protein loss in 82% (14/17) of SCCOHT tumors, but in only 0.4% (2/485) of other primary ovarian tumors. These data implicate SMARCA4 in SCCOHT oncogenesis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000795 
   
  
    
    Fernandez-Cuesta et al., 2014, Nature Communications, RNA sequencing data set 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  69 
 
  
    EGAD00001000796 
   
  
    
    This project aims to study at least 90 exomes from families with congenital heart disease. The samples have been selected in Leuven in collaboration with Koen Devriendt. Ethical approval has been sought in Leuven, Belgium, and a HDMMC agreement for submitting these samples is in place at the WTSI. The phenotypes on which we will primarily focus our analysis are severe Left Ventricular Outflow Tract Obstructions (LVOTO) and Atrioventricular Septal Defect (AVSD). The indexed Agilent whole exome pulldown libraries will be sequenced on 75bp PE HiSeq (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  167 
 
  
    EGAD00001000797 
   
  
    
    This project aims to study at least 90 exomes from families with congenital heart disease. The samples have been selected at the Royal Brompton Hospital in collaboration with Stuart Cook and Piers Daubeney. Ethical approval has been sought in the UK, and a HDMMC agreement for submitting these samples is in place at the WTSI. The phenotypes on which we will primarily focus our analysis are severe Left Ventricular Outflow Tract Obstructions (LVOTO) and Atrioventricular Septal Defect (AVSD). The indexed Agilent whole exome pulldown libraries will be sequenced on 75bp PE HiSeq (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000798 
   
  
    
    In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs, but it is unclear whether some cell types carry fewer mutations through reprogramming (either due to mutations present in the primary cells, or mutations accumulated during reprogramming). Through in-depth analysis of hiPSCs generated from different somatic cells, it will be possible to assess the variation in genetic stability of different cell types. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  28 
 
  
    EGAD00001000799 
   
  
    
    The exome sequencing is performed using Agilent SureSelect 50Mb exome v3 and HiSeq 75bp paired reads with a mean sequencing coverage target of 50X. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  95 
 
  
    EGAD00001000800 
   
  
    
    This project aims to study exomes from families and trios with
congenital heart disease (CHD). The samples have been collected under
the Competence Network - Congenital Heart Defects in Berlin, Germany.
The phenotypes are mainly left ventricular outflow obstruction (aortic
stenosis, bicuspid aortic valve disease, coarctation and hypoplastic
left heart), but will also include samples with hypoplastic right
heart and atrioventricular septal defects. We will perform whole exome
sequencing using Agilent sequence capture and Illumina HiSeq
sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  406 
 
  
    EGAD00001000802 
   
  
    
    UK10K_RARE_CILWG REL-2013-03-06 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000803 
   
  
    
    UK10K_RARE_FINDWG REL-2013-03-06 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000804 
   
  
    
    UK10K_RARE_NMWG REL-2013-03-06 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000805 
   
  
    
    UK10K_RARE_THYWG REL-2013-03-06 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000806 
   
  
    
    Whole Genome Sequencing (WGS) for St. Jude High Grade Glioma (HGG) study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  63 
 
  
    EGAD00001000807 
   
  
    
    Whole Exome Sequencing (WES) for St. Jude High Grade Glioma (HGG) study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  148 
 
  
    EGAD00001000808 
   
  
    
    RIKEN collection WGS reads for 321 HCC and blood matched samples from 158 donors submitted to ICGC for release 15 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  321 
 
  
    EGAD00001000809 
   
  
    
    RIKEN collection WGS reads for 61 liver cancer and matched blood samples from 30 donors displaying biliary phenotype 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  61 
 
  
    EGAD00001000810 
   
  
    
    Dataset for whole exome sequencing of 49 tumor-blood pairs and transcriptome sequencing of 44 tumors for adrenocortical tumors 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  106 
 
  
    EGAD00001000811 
   
  
    
    Whole exome sequencing of 6 HCCs and matched background liver in children with bile salt export pump deficiency. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000812 
   
  
    
    Sequencing of 350 cancer genes in BC samples from patients treated with either Epirubicin or Paclitaxel monotherapy in the neoadjuvant setting. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  364 
 
  
    EGAD00001000813 
   
  
    
    Fernandez-Cuesta et al., 2014, Nature Communications.
Whole genome sequencing was performed using a read length of 2x100 bp for all
samples. On average, 110 Gb of sequence were produced per sample, aiming for a
mean coverage of 30x for both tumour and matched normal. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001000814 
   
  
    
    Whole genome alignments of DIPG patients 
    
   
  
    
   
  40 
 
  
    EGAD00001000815 
   
  
    
    Exome-seq, RNA-Seq, SNP array profiling of gastric tumor samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  102 
 
  
    EGAD00001000816 
   
  
    
    ICGC medulloblastoma whole genome sequencing data, ICGC release 16 
    
   
  
    
   
  44 
 
  
    EGAD00001000817 
   
  
    
    Alternative splicing plays critical roles in differentiation, development, and cancer (Pettigrew et al., 2008; Chen and Manley, 2009). The recent identification of specific spliceosome inhibitors has generated interest in the therapeutic potential of targeting this cellular process (van Alphen et al., 2009). Using an integrated genomic approach, we have identified PRPF6, an RNA binding component of the pre-mRNA spliceosome, as an essential driver of oncogenesis in colon cancer. Importantly, PRPF6 is both amplified and overexpressed in colon cancer, and only colon cancer cells with high PRPF6 levels are sensitive to its loss. Our data clearly point to an important role for PRPF6 in colon cancer growth and suggest that a better understanding of its role in alternative splicing in colon cancer is warranted. To determine the specific alternative splice forms that PRPF6 regulates in colon cancer, we plan three experiments: 1. The first involves knocking down expression of PRPF6 in two different cancer cell lines with 3 different siRNAs, and then completing RNA-seq to determine the gene expression changes that occur relative to a non-targeting control siRNA. Because of the role for PRPF6 in pre-mRNA splicing, we especially want to quantify the changes in splice-specific forms of all genes genome-wide to identify genes whose splicing is altered upon PRPF6 knockdown. 2. The second involves immunoprecipitating PRPF6 from two different cancer cell lines and isolating any RNA that is bound to PRPF6, since PRPF6 is an RNA-binding protein. We then want to carry out RNA-seq to identify which RNA molecules co-immunoprecipitated with PRPF6. This will help us determine possible functions for PRPF6 in regulating colon cancer growth. 3. The third involves overexpressing PRPF6 in cell lines and then carrying out RNA-seq to identify any changes in splice-specific gene expression. 
This will allow us to determine whether increased PRPF6 expression is sufficient to drive alternative splicing changes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001000818 
   
  
    
    Quiescent Sox2+ cells drive hierarchical growth and relapse in Sonic hedgehog subgroup medulloblastoma 
    
   
  
    
   
  4 
 
  
    EGAD00001000819 
   
  
    
    We are aiming to investigate repair of a double strand break (DSB) within the genome in the presence and absence of the BLOOM protein. Zinc Finger Nucleases introduce DSBs at specified loci within the genome. Using sequencing we will assess the size of the deletion following repair. 
Protocol
1. Transfect normal and BLOOM deficient human iPS cells with ZFNs, using AMXA
2. Harvest cells after 5 days
3. Perform column extraction of DNA
4. PCR-amplify the ZFN region 
5. Sequence and analyse repair of the DSB
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001000820 
   
  
    
    Fernandez-Cuesta et al., 2014, Nature Communications, whole exome sequencing data set 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001000821 
   
  
    
    Raw sequencing data for all samples in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  767 
 
  
    EGAD00001000822 
   
  
    
    Whole exome sequencing and miRNA-seq data of PPB. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  18 
 
  
    EGAD00001000824 
   
  
    
    RNA sequencing will be undertaken to reconstruct rearrangements at the level of transcription to determine pathognomonic genomic events in chondromyxoid fibroma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000825 
   
  
    
    This study aims to define the landscape of somatic mutations in sun exposed human skin by deep sequencing, analyse their frequency and use the data to infer the effect of mutations on proliferating cell behaviour. The frequency of each mutation will reflect the size of the clone of cells in the tissue sample. By analyzing small samples, clones with as few as 100 cells will be detectable. Allele frequency distributions for each mutation will be used to infer cell fate using published methods (Klein et al. 2010). This study will shed unprecedented light on the early clonal events that lead to the emergence of cancer. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  454 
 
  
    EGAD00001000826 
   
  
    
    We propose to definitively characterise the somatic genetics of Osteosarcoma cancer through generation of comprehensive catalogues of somatic mutations by high coverage genome and transcriptome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000827 
   
  
    
    In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs, and from studies in the mouse, it appears that an epigenetic memory of the starting cell type is carried over to hiPSCs. However, a comprehensive comparative study of the characteristics of these hiPSCs has been missing from the literature. Importantly, studies which aimed to address these aspects of hiPSCs have used cells from different patients. In order to avoid this important confounding variable and to keep the genetic background constant, tissue samples were procured from the patients and reprogrammed to iPS cells. The methylation status of these iPS cells will be compared.
Protocol:
Primary cell cultures were generated and reprogrammed to iPS cells. DNA was extracted and immunoprecipitated using anti-methyl cytosine and anti-hydroxymethyl cytosine antibodies.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001000828 
   
  
    
    Fibroblasts have been shown to re-program into induced pluripotent stem (hiPS) cells through over-expression of pluripotency genes. These hiPS cells show similar characteristics to embryonic stem cells, including cell surface markers, epigenetic changes and the ability to differentiate into the three germ layers. However, the extent of changes in gene expression through the re-programming process is unclear. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000829 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000830 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000831 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001000832 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001000833 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000834 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000835 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000836 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  49 
 
  
    EGAD00001000842 
   
  
    
    RIKEN collection WGS reads for 100 HCC and matched blood samples from 50 donors submitted to ICGC for release 16 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001000843 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000844 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001000845 
   
  
    
   
  
    
   
  44 
 
  
    EGAD00001000847 
   
  
    
    Shwachman-Diamond syndrome (SDS) is a rare autosomal recessive disorder characterized by exocrine pancreatic insufficiency, bone marrow dysfunction, leukemia predisposition, and skeletal abnormalities. We aim to characterise the structural effects of SDS in patients with this disorder by exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000848 
   
  
    
    To evaluate the presence of mutations in frequently mutated genes in MPN by performing targeted resequencing of a selected gene panel comprising 111 genes across 40 samples with MPN. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  48 
 
  
    EGAD00001000849 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001000850 
   
  
    
    Small cell carcinoma of the ovary of hypercalcemic type (SCCOHT) is an extremely rare, aggressive cancer affecting children and young women. We identified germline and somatic inactivating mutations in the SWI/SNF chromatin-remodeling gene SMARCA4 in 75% (9/12) of SCCOHT patients in addition to SMARCA4 protein loss in 82% (14/17) of SCCOHT tumors, but in only 0.4% (2/485) of other primary ovarian tumors. These data implicate SMARCA4 in SCCOHT oncogenesis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001000853 
   
  
    
    DATA FILES FOR SJEPD 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001000854 
   
  
    
    DATA FILES FOR SJEPD 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  77 
 
  
    EGAD00001000856 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000865 
   
  
    
    WGS of 14 paired samples from bladder cancer patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001000868 
   
  
    
    FFPE CPA accreditation of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001000869 
   
  
    
    It is the ambition of the team formed by members of the Netherlands Cancer Institute (NKI) and the Cancer Genome Project at the Wellcome Trust Sanger Institute (WTSI) to unravel the genomic and phenotypic complexity of human cancers in order to identify optimal drug combinations for personalized cancer therapy. Our integrated approach will entail (i) deep sequencing of human tumours and cognate mouse tumours; (ii) drug screens in a 1000+ fully characterized tumour cell line panel; (iii) high-throughput in vitro and in vivo shRNA and cDNA drug resistance and enhancement screens; (iv) computational analysis of the acquired data, leading to significant response predictions; (v) rigorous validation of these predictions in genetically engineered mouse models and patient-derived xenografts. This integrated effort is expected to yield a number of combination therapies and companion-diagnostics biomarkers that will be further explored in our existing clinical trial networks. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  62 
 
  
    EGAD00001000870 
   
  
    
    Testing logistics and infrastructure of molecular screening program. Core biopsies taken from invasive recurrent or metastatic breast cancer to evaluate and identify molecular traits rendering them suitable for clinical trials 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  52 
 
  
    EGAD00001000871 
   
  
    
    The purpose of this study is to sequence 500 known cancer genes in 960 newly diagnosed high risk breast cancer patients treated with current standard of care therapies and trastuzumab, for somatic alteration and copy number changes. We will be using next gen sequencing technology to determine the prognostic relevance of these somatic genetic alterations and of the low frequency events to determine if they are associated with trastuzumab benefit or HER2 positive breast cancer, i.e. treatment interaction. The samples will be analysed and correlated with clinical variables including outcome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  993 
 
  
    EGAD00001000872 
   
  
    
    These samples are to be analysed with the CGP-developed cancer panel and the results will be compared with WGS data from 4 different commercial providers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001000873 
   
  
    
    Fastq files of 10 samples of chondrosarcoma 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000874 
   
  
    
    Indel/point mutation of chondrosarcoma 
    
   
  
    
   
  10 
 
  
    EGAD00001000875 
   
  
    
    The CRO7 clinical trial recruited patients with clinically operable rectal adenocarcinoma. Patients were randomized to either pre-operative short-course radiotherapy or surgery followed by chemo-radiotherapy only in those patients at high risk of local relapse. Patients in both arms then received standard 5-FU-based adjuvant chemotherapy as per local policy. We intend to use FFPE-derived DNA from the primary tumours to identify patterns of mutations or copy number alterations that are predictive of local or distant relapse. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  330 
 
  
    EGAD00001000876 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  98 
 
  
    EGAD00001000877 
   
  
    
    Complete WGS and RNA-Seq dataset for Australian ICGC ovarian cancer sequencing project 2014-07-07, representing 93 donors.
Sequencing was performed on Illumina HiSeq.
Alignment of the lane-level fastq data was performed with bwa (WGS data) and RSEM (transcriptome data).
For this dataset lane-level .bam files have been merged and de-duplicated to create a single bam file for each sample type (tumour/normal) for each donor.
This dataset supersedes all previous datasets for this study.
2016-08-08 updated with 14 outstanding RNA-seq samples & corresponding RSEM bams
2016-12-07 updated with 7 outstanding RNA-seq controls and corresponding RSEM bams 
    
   
  
    
   
  331 
 
  
    EGAD00001000878 
   
  
    
    RNA-Seq files accompanying Genetic landscape of pediatric Rhabdomyosarcoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001000879 
   
  
    
    Genomic libraries will be generated from total genomic DNA derived from 200+ patients with childhood Transient Myeloproliferative Disorder (TMD) and/or Acute Megakaryocytic Leukemia (AMKL), as well as some matched constitutional samples (n < 50). Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 96 samples will be individually barcoded and subjected to up to two lanes of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  335 
 
  
    EGAD00001000880 
   
  
    
    Genotyping by array and Transcriptome profiling by high-throughput sequencing 
    
   
  
    
   
  233 
 
  
    EGAD00001000881 
   
  
    
    RNA sequencing of Resistant BCC samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000882 
   
  
    
    Targeted genome sequences of the human X chromosome in 4 colorectal adenomas and 4 matched normal tissues from male patients 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000883 
   
  
    
    Illumina HiSeq paired-end exome sequencing of a trio and singleton. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000884 
   
  
    
    In order to elucidate whether newly acquired genetic alterations during serial transplantation of patient derived primary pancreatic cancer cultures contribute to the observed clonal dynamics in vivo, all coding genes of two patient derived primary cultures and derived genetically marked serial xenografts (1°/2°/3°) were sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000885 
   
  
    
    Exome read sequences for 30 tumor-normal pairs for the study "Diverse modes of genomic alterations in Hepatocellular Carcinoma". 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001000886 
   
  
    
    RNA-Sequencing data (raw read sequences) for 23 samples, from 12 patients, for the study "Diverse modes of genomic alterations in Hepatocellular Carcinoma" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001000887 
   
  
    
    Exome sequencing of Resistant BCC samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001000888 
   
  
    
    NSCLC WGS. 
    
   
  
    
      
      AB 5500 Genetic Analyzer 
      
    
   
  4 
 
  
    EGAD00001000889 
   
  
    
    NSCLC targeted. 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  4 
 
  
    EGAD00001000891 
   
  
    
    To characterize the subclonal genomic architecture of androgen-deprived metastatic prostate cancer, we performed whole-genome sequencing (WGS) of 51 tumours from 10 patients to an average sequencing depth of 55x, including multiple metastases from different anatomic sites in each patient
and, in five cases, the prostate tumour. Noncancerous DNA from blood or other tissue is used as a reference comparison for each patient. The patients are part of the PELICAN (Project to ELIminate Lethal Cancer) rapid autopsy study led by G. Steven Bova at Johns Hopkins University (USA) and Tampere University (Finland). As of September 2020, some of the studies using these data include: Gundem et al, Nature 2015 (PMID: 25830880). Additional EGAD00001000891 sample metadata is contained in Supplementary Information in this report. Tubio et al, Science 2014 (PMID: 25082706). Behjati et al, Nature Comm 2015 (PMID: 27615322). Wedge et al, Nature Genetics 2018 (PMID: 29662167). Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007). Rodriguez-Martin et al, Nature Genetics 2020 (PMID: 32024998). Woodcock et al, Nature Comm 2020 (In Press). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  62 
 
  
    EGAD00001000892 
   
  
    
    Whole Genome Sequencing Illumina HiSeq data from 20 men with prostate cancer. 20 samples were taken from primary tissue obtained at prostatectomy (target sequencing depth 50X) with matched blood control (target sequencing depth 30X). These were submitted for use in the ICGC Pan-Cancer Analysis of Whole Genomes project.
Same raw data submitted in EGAD00001001116.
As of September 2020, some of the studies using these data include:
Wedge et al, Nature Genetics 2018 (PMID: 29662167)
Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001000893 
   
  
    
    HipSci - Healthy Normals - Exome Sequencing - May 2014 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001000894 
   
  
    
    SPECTA comprises a network of participating European clinical sites and NGS screening platforms that can screen individual patients for multiple molecular targets and potentially allow the design of trials that will match the specific biology of the diseases affecting specific patients with cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  64 
 
  
    EGAD00001000896 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001000897 
   
  
    
    HipSci - Healthy Normals - RNA Sequencing - May 2014 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001000898 
   
  
    
    Cancers are ecosystems of genetically related clones, competing across space and time for limited resources. To understand the clonal structure of primary breast cancer, we applied genome and targeted sequencing to 295 samples from 49 patients’ tumors. The extent of subclonal diversification varied considerably among patients and encompassed many spatial patterns, including local growth, intraductal dissemination and clonal intermixture. Landmarks of disease progression, such as acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions, suggesting that subclonal mutations could be relevant if actionable. No defined temporal order of mutation was evident, with the commonest genes, including PIK3CA, TP53, BRCA2, PTEN and MYC, mutated early in some, late in others, often exhibiting parallel evolution across subclones. Signatures of homologous recombination deficiency correlated with response to neoadjuvant chemotherapy. Thus, the interplay of mutation, growth and competition drives clonal structures of breast cancer that are complex, variable across patients and clinically relevant. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001000899 
   
  
    
    We propose to definitively characterise the somatic genetics of metastatic breast cancer through generation of comprehensive catalogues of somatic mutations in metastatic breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  41 
 
  
    EGAD00001000900 
   
  
    
    Multi-region Illumina whole-exome and/or whole-genome sequencing on tumor regions collected from early-stage NSCLC patients who underwent definitive surgical resection prior to receiving adjuvant therapy. Detected variants were validated on Ion AmpliSeq™ Custom Panel and/or Comprehensive Cancer Gene Panels. Patients covered by this dataset: L001, L002, L003, L004, L008 and L011. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Ion Torrent PGM 
      
    
   
  28 
 
  
    EGAD00001000901 
   
  
    
    The dataset includes the whole exome sequencing data from 32 pairs of gallbladder cancer tissues and patient-matched normal tissues. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  64 
 
  
    EGAD00001000902 
   
  
    
    The dataset includes the targeted gene sequencing data from 51 pairs of gallbladder cancer tissues and patient-matched normal tissues. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  102 
 
  
    EGAD00001000903 
   
  
    
    RNA-Seq data for 4 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 22 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000904 
   
  
    
    RNA-Seq data for 7 mature neutrophil sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000905 
   
  
    
    DNase-Hypersensitivity data for 5 CD14-positive, CD16-negative classical monocyte sample(s). 5 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001000906 
   
  
    
    ChIP-Seq data for 1 mature eosinophil sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000907 
   
  
    
    RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000908 
   
  
    
    RNA-Seq data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000909 
   
  
    
    Bisulfite-Seq data for 1 erythroblast sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000910 
   
  
    
    Bisulfite-Seq data for 1 precursor lymphocyte of B lineage sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000911 
   
  
    
    RNA-Seq data for 4 erythroblast sample(s). 22 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000912 
   
  
    
    RNA-Seq data for 1 CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000913 
   
  
    
    ChIP-Seq data for 9 CD14-positive, CD16-negative classical monocyte sample(s). 59 run(s), 55 experiment(s), 55 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001000914 
   
  
    
    Bisulfite-Seq data for 3 inflammatory macrophage sample(s). 38 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000915 
   
  
    
    RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000916 
   
  
    
    ChIP-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000917 
   
  
    
    Bisulfite-Seq data for 1 hematopoietic multipotent progenitor cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000918 
   
  
    
    RNA-Seq data for 3 common lymphoid progenitor sample(s). 15 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000919 
   
  
    
    RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000920 
   
  
    
    Bisulfite-Seq data for 1 alternatively activated macrophage sample(s). 10 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000921 
   
  
    
    Bisulfite-Seq data for 1 CD8-positive, alpha-beta T cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000922 
   
  
    
    RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000923 
   
  
    
    Bisulfite-Seq data for 1 macrophage sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000924 
   
  
    
    ChIP-Seq data for 2 erythroblast sample(s). 14 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000925 
   
  
    
    ChIP-Seq data for 3 CD4-positive, alpha-beta T cell sample(s). 21 run(s), 21 experiment(s), 21 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000926 
   
  
    
    DNase-Hypersensitivity data for 2 inflammatory macrophage sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000927 
   
  
    
    Bisulfite-Seq data for 1 Plasma cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000928 
   
  
    
    RNA-Seq data for 7 CD14-positive, CD16-negative classical monocyte sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000929 
   
  
    
    ChIP-Seq data for 1 macrophage sample(s). 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000930 
   
  
    
    ChIP-Seq data for 7 mature neutrophil sample(s). 68 run(s), 50 experiment(s), 50 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001000931 
   
  
    
    DNase-Hypersensitivity data for 1 macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000932 
   
  
    
    Bisulfite-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000933 
   
  
    
    RNA-Seq data for 1 macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000934 
   
  
    
    Bisulfite-Seq data for 2 Multiple myeloma sample(s). 16 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000935 
   
  
    
    Bisulfite-Seq data for 6 mature neutrophil sample(s). 79 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000936 
   
  
    
    ChIP-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 13 run(s), 13 experiment(s), 13 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001000937 
   
  
    
    RNA-Seq data for 1 alternatively activated macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000938 
   
  
    
    ChIP-Seq data for 4 alternatively activated macrophage sample(s). 29 run(s), 28 experiment(s), 28 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000939 
   
  
    
    RNA-Seq data for 3 hematopoietic stem cell sample(s). 8 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000940 
   
  
    
    ChIP-Seq data for 3 inflammatory macrophage sample(s). 21 run(s), 21 experiment(s), 21 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001000941 
   
  
    
    Bisulfite-Seq data for 6 CD14-positive, CD16-negative classical monocyte sample(s). 86 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000942 
   
  
    
    DNase-Hypersensitivity data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000943 
   
  
    
    Bisulfite-Seq data for 1 germinal center B cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001000944 
   
  
    
    Whole Genome Sequencing of 5 acral melanomas and matched normal samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000945 
   
  
    
    NGS of 10 mucosal melanomas: whole genome sequencing of 5 mucosal melanomas and matched normal DNA; whole exome sequencing of 5 mucosal melanomas and matched normal DNA 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001000946 
   
  
    
    Divergent clonal selection dominates medulloblastoma at recurrence 
    
   
  
    
   
  125 
 
  
    EGAD00001000947 
   
  
    
    Genomic libraries (500 bp) will be generated from total genomic DNA derived from colorectal cancer patients and subjected to short paired-end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome-wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  45 
 
  
    EGAD00001000948 
   
  
    
    A comparison of the somatic variation present in a primary colorectal tumour and three different liver metastases from the same patient. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001000949 
   
  
    
    Validations of variants identified by exome sequencing in sequential samples derived after treatment cycle with AZA. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  170 
 
  
    EGAD00001000950 
   
  
    
    Whole genome sequencing data for ependymomas (5 tumor-control pairs). See Mack, Witt et al. Nature 506(7489):445-50, 2014 (PMID: 24553142). 
    
   
  
    
   
  10 
 
  
    EGAD00001000951 
   
  
    
    Whole exome sequencing data for ependymomas (42 tumor-control pairs). See Mack, Witt et al. Nature 506(7489):445-50, 2014 (PMID: 24553142). 
    
   
  
    
   
  84 
 
  
    EGAD00001000952 
   
  
    
    DNA methylation profiling of 8 control samples from adult (4) and fetal brain (4) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001000963 
   
  
    
    Exome sequencing of sporadic schwannomatosis patients 
    
   
  
    
   
  16 
 
  
    EGAD00001000964 
   
  
    
    Low-coverage whole genome sequencing of sporadic schwannomatosis patients 
    
   
  
    
   
  16 
 
  
    EGAD00001000965 
   
  
    
    Cancers are ecosystems of genetically related clones, competing across space and time for limited resources. To understand the clonal structure of primary breast cancer, we applied genome and targeted sequencing to 295 samples from 49 patients’ tumors. The extent of subclonal diversification varied considerably among patients and encompassed many spatial patterns, including local growth, intraductal dissemination and clonal intermixture. Landmarks of disease progression, such as acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions, suggesting that subclonal mutations could be relevant if actionable. No defined temporal order of mutation was evident, with the commonest genes, including PIK3CA, TP53, BRCA2, PTEN and MYC, mutated early in some, late in others, often exhibiting parallel evolution across subclones. Signatures of homologous recombination deficiency correlated with response to neoadjuvant chemotherapy. Thus, the interplay of mutation, growth and competition drives clonal structures of breast cancer that are complex, variable across patients and clinically relevant. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  331 
 
  
    EGAD00001000966 
   
  
    
    Whole genome bisulfite sequencing data for 6 ependymomas plus 3 fetal controls (f1, f2, f4) and 3 adult controls (a2, a3, a4). See Mack, Witt et al. Nature 506(7489):445-50, 2014 (PMID: 24553142). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001000967 
   
  
    
    This dataset contains the fastq sequencing data collected from bone marrow DNA of a chronic myeloid leukaemia patient at time of diagnosis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001000972 
   
  
    
    Whole Genome Sequencing to track subclonal heterogeneity in 18 samples from 3 Chronic Lymphocytic Leukemia patients subjected to repeated cycles of therapy. NOTE: There are only 12 BAM files available to download. The other 6 files are missing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001000973 
   
  
    
    Von Hippel-Lindau syndrome multi-region exome sequencing of two patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001000974 
   
  
    
    High-grade serous ovarian cancer (HGSC) is characterized by poor outcome, often attributed to emergence of treatment-resistant sub-clones. We sought to measure the degree of genomic diversity within primary, untreated HGSC to examine the natural state of tumor evolution prior to therapy. We performed exome sequencing, copy number analysis, targeted amplicon deep sequencing and gene expression profiling on thirty-one spatially and temporally separated HGSC tumor specimens (six patients) including ovarian masses, distant metastases, and fallopian tube lesions. We found widespread intra-tumoral variation in mutation, copy number, and gene expression profiles, with key driver alterations in genes present in only a subset of samples (e.g. PIK3CA, CTNNB1, NF1). On average, only 51.5% of mutations were present in every sample of a given case (range: 10.2% to 91.4%), with TP53 as the only somatic mutation consistently present in all samples. Complex segmental aneuploidies, such as whole genome doubling, were present in a subset of samples from the same individual, with divergent copy number changes segregating independently of point mutation acquisition. Reconstruction of evolutionary histories showed one patient with mixed HGSC and endometrioid histology with common etiologic origin in the fallopian tube and subsequent selection of different driver mutations in the histologically distinct samples. In this patient, we observed mixed cell populations in the early fallopian tube lesion, indicating diversity arises at early stages of tumorigenesis. Our results reveal that HGSC exhibit highly individual evolutionary trajectories and diverse genomic tapestries prior to therapy, exposing an essential biological characteristic to inform future design of personalized therapeutic solutions and investigation of drug resistance mechanisms. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  131 
 
  
    EGAD00001000975 
   
  
    
    65 prostate cancer cases transcriptome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  130 
 
  
    EGAD00001000976 
   
  
    
    WGS DATA FILES FOR SJPhLike 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001000977 
   
  
    
    WGS dataset LCNEC study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000978 
   
  
    
    Multi-region whole genome sequencing of a high-grade serous ovarian carcinoma sample for characterization of genomic intra-tumoural heterogeneity. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001000979 
   
  
    
    We are developing a protocol to differentiate mouse and human induced pluripotent stem (IPS) and embryonic stem (ES) cells towards the haematopoietic pathway to generate erythrocytes in vitro. This system has many applications such as the study of the role of specific genes and human polymorphisms in infectious diseases such as malaria, as well as haematological diseases such as myelodysplastic syndrome. The nature of the in vitro differentiation process means that a heterogeneous population of cells is generated. In order to understand the types of cells produced with our protocol, we have performed a single cell analysis, which has the power to reveal the different populations of cells and their characteristics. For this, a cDNA library has been made that needs to be sequenced to obtain the gene expression profiles of the different cells. With this information we will be able to assess the quality of the differentiation protocol and improve it in order to produce better cells for the downstream applications. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  192 
 
  
    EGAD00001000980 
   
  
    
    This study involves a forward genetic screen to identify common insertion sites in drug resistant clones. We will be utilising piggybac transposon systems in order to generate multiple drug resistant clones in a range of human cancer cell lines. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  144 
 
  
    EGAD00001000983 
   
  
    
    65 prostate cancer cases whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001000984 
   
  
    
    This is the Whole Exome Sequencing (WES) data from 59 samples from 11 patients with lung adenocarcinomas including 48 tumor samples and 11 peripheral white blood cell samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001000985 
   
  
    
    This is the targeted capture deep sequencing (TCS) data for validation of the mutations discovered in the WES step. There are 58 bam files of TCS data including 48 tumor samples and 10 peripheral blood WBC samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001000986 
   
  
    
    Pheochromocytomas and paragangliomas (PCC/PGL) are neural crest derived tumors with a very strong genetic component. We report the first integrated genomic portrayal of a large collection of PCC/PGL. SNP array analysis revealed distinct copy-number patterns associated with genetic background. Whole-exome sequencing showed a low mutation rate of 0.3 mutations per megabase, with few recurrent somatic mutations in genes not previously associated with PCC/PGL. DNA methylation arrays and miRNA sequencing identified DNA methylation changes and miRNA expression clusters strongly associated with mRNA expression profiling. Overexpression of the miRNA cluster 182/96/183 was specific of SDHB-mutated tumors and induced invasive traits, whereas silencing of the imprinted DLK1-MEG3 miRNA cluster appeared as a potential driver in a subgroup of sporadic tumors. Altogether, the complete genomic landscape of PCC/PGL is mainly driven by distinct germline and/or somatic mutations in susceptibility genes and reveals different molecular entities, characterized by a set of unique genomic alterations. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001000987 
   
  
    
    Whole exome sequencing data from tumor and normal samples from carcinosarcoma (malignant mixed mullerian tumor) patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001000988 
   
  
    
    Validation/deeper sequencing for metastatic prostate cancer samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  94 
 
  
    EGAD00001000989 
   
  
    
    Validation/deeper sequencing for metastatic prostate cancer samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001000990 
   
  
    
    mRNA-Seq on total RNA from primary osteoblastomas and phosphaturic mesenchymal tumours, focussing on fusion transcript expression 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001000992 
   
  
    
    HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving events caused by differential SIX1 binding of the SIX1 Q177R mutants 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001000993 
   
  
    
    HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving gene expression events 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001000994 
   
  
    
    HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving chromosomal aberrations 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  56 
 
  
    EGAD00001000995 
   
  
    
    HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving DNA alterations 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  112 
 
  
    EGAD00001000996 
   
  
    
    Whole exome sequencing data for AML and matched normal samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001000997 
   
  
    
    Whole-exome sequencing of a chronic lymphocytic leukemia (CLL) developed during vemurafenib treatment of a patient with malignant melanoma. Peripheral blood mononuclear cells were separated by Ficoll gradient centrifugation. DNA was extracted from highly purified (>97%) CD19+CD5+ cells obtained from the patient while being under BRAF inhibition versus CD14+ germline control cells (>90% purity). No alterations that could be linked to aberrant RAS activity or paradoxical RAF/MEK/ERK signaling could be identified in the CLL, which shows characteristic copy number alterations. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001000998 
   
  
    
    Targeted capture of exonic and intronic regions of interest for the study of genomic alterations in multiple myeloma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001001000 
   
  
    
    Background: The disease course of patients with diffuse low-grade glioma is notoriously unpredictable. Temporally and spatially distinct samples may provide insight into the evolution of clinically relevant copy number aberrations (CNAs). The purpose of this study is to identify CNAs that are indicative of aggressive tumor behaviour and can thereby complement the prognostically favorable 1p/19q co-deletion. Results: Genome-wide, 50 base pair single-end, sequencing was performed to detect CNAs in a clinically well-characterized cohort of 98 formalin-fixed paraffin-embedded low-grade gliomas. CNAs are correlated with overall survival as an endpoint. Seventy-five additional samples from spatially distinct regions and paired recurrent tumors of the discovery cohort were analysed to interrogate the intratumoral heterogeneity and spatial evolution. Loss of 10q25.2-qter is a frequent subclonal event and significantly correlates with an unfavorable prognosis. A significant correlation is furthermore observed in a validation set of 126 and confirmation set of 184 patients. Loss of 10q25.2-qter arises in a longitudinal manner in paired recurrent tumor specimens, whereas the prognostically favorable 1p/19q co-deletion is the only CNA that is stable across spatial regions and recurrent tumors. Conclusions: CNAs in low-grade gliomas display extensive intratumoral heterogeneity. Distal loss of 10q is a late onset event and a marker for reduced overall survival in low-grade glioma patients. Intratumoral heterogeneity and higher frequencies of distal 10q loss in recurrences suggest this event is involved in outgrowth to the recurrent tumor. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  175 
 
  
    EGAD00001001001 
   
  
    
   
  
    
   
  2 
 
  
    EGAD00001001002 
   
  
    
    Exome sequencing data for 8 pairs of seminomas and matched normal 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001001003 
   
  
    
    Exome sequencing of lymphocyte DNA from 12 affected individuals from six unrelated, non-syndromic Wilms tumor families. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001004 
   
  
    
    Whole genome sequencing (WGS) of 65 prostate cancer cases 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  130 
 
  
    EGAD00001001006 
   
  
    
    Dataset for whole exome sequencing of 113 pairs of tumor and normal DNA samples along with 8 cell lines. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  234 
 
  
    EGAD00001001007 
   
  
    
    Low depth (4x) Illumina HiSeq raw sequence data for 100 unrelated Zulu from Durban area, South Africa. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001001008 
   
  
    
    Low depth (4x) Illumina HiSeq raw sequence data for 100 unrelated Baganda from rural Uganda. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001001009 
   
  
    
    Exome sequencing of peripheral blood from 4 individuals of a family with familial colorectal cancer type X 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001010 
   
  
    
    Sequencing of colorectal tumors and normal tissue using Ion AmpliSeq Cancer Hotspot Panel V2 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  8 
 
  
    EGAD00001001011 
   
  
    
    Monocyte differentiation into macrophages represents a cornerstone process for host defense. Concomitantly, immunological imprinting of either tolerance or trained immunity determines the functional fate of macrophages and susceptibility to secondary infections. Transcriptomes (RNA-Seq) and epigenomes (ChIP-Seq H3K4me1, H3K4me3, H3K27ac) in four primary cell types: monocytes, in vitro differentiated naive, tolerized and trained macrophages were characterized. Inflammatory and metabolic pathways were modulated in macrophages, including decreased inflammasome activation, and pathways functionally implicated in trained immunity were identified. Strikingly, β-glucan training elicits an exclusive epigenetic signature, revealing a complex network of enhancers and promoters. Analysis of transcription factor motifs in DNase I hypersensitive sites at cell-type specific epigenetic loci unveiled differentiation and treatment specific repertoires. Altogether, this study provides a resource to understand the epigenetic changes that underlie innate immunity in humans. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  57 
 
  
    EGAD00001001012 
   
  
    
    The need for a detailed catalogue of local variability for the study of rare diseases within the context of the Medical Genome Project motivated the whole exome sequencing of 267 unrelated individuals, representative of the healthy Spanish population. 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  267 
 
  
    EGAD00001001013 
   
  
    
    RNAseq and exome sequencing data of gastric cancer cell lines. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001001014 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2597 
 
  
    EGAD00001001015 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  76 
 
  
    EGAD00001001016 
   
  
    
    DATA FILES FOR SJPhLike-RNASeq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  125 
 
  
    EGAD00001001017 
   
  
    
    DNA extracted from multiple biopsies taken from different areas of primary lung tumours will be subjected to targeted re-sequencing and analysed in order to assess intra-tumour heterogeneity with respect to mutations in a selection of cancer related genes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  31 
 
  
    EGAD00001001018 
   
  
    
    The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations. This dataset contains all the data available for this study on 2014-09-24 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  374 
 
  
    EGAD00001001019 
   
  
    
    RNA-seq dataset used for the validation of CDK6 cis-regulatory mutation annotated by OncoCis. NB: BAM files for manuscript A_Proteomic_Chronology_of_Gene_Expression_through_the_Cell_Cycle_in_Human_Myeloid_Leukemia_Cells are now available at the following link: http://www.ebi.ac.uk/ena/data/view/ERP008483 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001020 
   
  
    
    DATA FILES FOR SJEWS-WGS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001001021 
   
  
    
    Exome sequencing of 1000 samples from the UK 1958 Birth Cohort. DNA library preps prepared with Illumina TruSeq sample preparation kit. The captured DNA libraries were PCR amplified using the supplied paired-end PCR primers. Sequencing was performed with an Illumina HiSeq2000 (SBS Kit v3, one pool per lane) generating 2x101-bp reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1000 
 
  
    EGAD00001001022 
   
  
    
    nccRCC RNA-Seq data of consented samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  139 
 
  
    EGAD00001001023 
   
  
    
    nccRCC Whole Exome sequencing data (consented samples only) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  137 
 
  
    EGAD00001001024 
   
  
    
    Fastq files of 52 samples of hepatocellular carcinoma (RCAST, THCC) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  104 
 
  
    EGAD00001001025 
   
  
    
    The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry.  In the UK there are large populations with very high first cousin marriage rates of 20-50%.  Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations.  This pilot study based on existing British-Pakistani cohort samples from Birmingham will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed.  The data deposited in the EGA consist of low coverage whole exome sequencing on these samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1156 
 
  
    EGAD00001001026 
   
  
    
    The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry.  In the UK there are large populations with very high first cousin marriage rates of 20-50%.  Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations.  This pilot study based on existing British-Pakistani cohort samples from Birmingham will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed.  The data deposited in the EGA consists of low coverage whole exome sequencing on these samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  452 
 
  
    EGAD00001001027 
   
  
    
    The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry.  In the UK there are large populations with very high first cousin marriage rates of 20-50%.  Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations.  This pilot study based on existing British-Pakistani cohort samples will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed.  The data deposited in the EGA consists of low coverage whole exome sequencing on these samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  130 
 
  
    EGAD00001001028 
   
  
    
    DNA belonging to 16 tumour/normal samples was treated with bisulfite, then up to 5 different bisulfite PCRs were performed in each one of the samples. Amplicons from the same sample were pooled and submitted to sequencing on a MiSeq platform. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  18 
 
  
    EGAD00001001029 
   
  
    
    The dataset covers the sequencing of coding and putative regulatory sequences of 38 genes associated with either sporadic or Mendelian forms of Parkinson's disease 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  394 
 
  
    EGAD00001001031 
   
  
    
    These are only the whole exome sequences 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001001032 
   
  
    
    DATA FILES FOR SJMEL-WGS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001033 
   
  
    
    Whole exome sequencing (WES) was performed on genomic DNA derived from two patients with Sotos Syndrome Features. Sequencing (100 base pair paired-end) was performed on an Illumina HiSeq 2000 sequencer after enrichment of 62Mb of exonic and adjacent intronic sequences with TruSeq Exome Enrichment Kit (Illumina, San Diego, CA, USA). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001034 
   
  
    
    Whole genome data (Complete Genomics platform) for the study EGAS00001000824 
    
   
  
    
   
  24 
 
  
    EGAD00001001035 
   
  
    
    RIKEN collection WGS and RNA-seq reads for 66 HBV-associated HCC and matched blood or liver samples from 22 donors. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD00001001036 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  26 
 
  
    EGAD00001001037 
   
  
    
    A total of 395 couples were subjected to IVF-PGD treatment, including 129 couples with NGS-based test and 266 couples with SNP array based test for the detection of embryonic chromosomal abnormalities. The NGS test was performed using low coverage whole genome sequencing on the HiSeq 2000 platform. The SNP array test used the Affymetrix Gene Chip Mapping Nsp I 262K. The average age of patients was 32.1 years (age range 20-44 years). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  188 
 
  
    EGAD00001001038 
   
  
    
    We mapped the data to the UCSC human reference genome build 37 using BWA 0.5.9-r16. We first mapped each read pair separately using bwa aln. Then we used bwa sampe to map the paired reads together to a BAM file. The BAM file was then sorted by genomic position and indexed using PicardTools-1.32 SortSam. To prevent PCR artifacts from influencing the downstream analysis of our data, we used Picard to mark the duplicate reads, which were ignored in downstream analysis. We used GATK IndelRealigner on our data around known indels (from 1KG Pilot). The IndelRealigner creates all possible read alignments using the source and computes the likelihood of the data containing the indel based on the read pileup. Whenever the maximum likelihood contains an indel, the reads are realigned accordingly. Each base is associated with a Phred-scaled base quality score. Calibration of Phred scores is crucial as they are used in some of the downstream analysis models. We used GATK to recalibrate the base qualities with respect to (i) the base cycle, (ii) original quality score, and (iii) dinucleotide context. To minimize issues stemming from mapping problems around indels, we performed a second round of indel realignment using the GATK IndelRealigner by family rather than by individual. For this second round, we considered two sources of possible indels: 1KG Phase 1 indels and indels aligned by BWA in the GoNL data. 
    
   
  
    
   
  - 
 
  
    EGAD00001001039 
   
  
    
    Genomic characterisation of a large series of cancer cell lines. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1072 
 
  
    EGAD00001001040 
   
  
    
    This is the complete dataset (exome and genome) for the EGAS00001000974 study. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001001041 
   
  
    
    Comparison of genomic rearrangements and DNA methylation patterns between different foci of multiple synchronous (multifocal and multicentric) invasive breast cancers. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  305 
 
  
    EGAD00001001042 
   
  
    
    In this work, using exome sequencing, we identified biallelic PNPLA6 mutations in patients with childhood blindness due to severe photoreceptor death and clinical features of Leber congenital amaurosis (LCA) and, interestingly, also of the rare Oliver-McFarlane syndrome. 
    
   
  
    
      
      AB SOLiD 4 System 
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001043 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001044 
   
  
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  2 
 
  
    EGAD00001001045 
   
  
    
    DATA FILES FOR SJRB 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001001046 
   
  
    
    We propose to biopsy 20 consented BRAF mutant melanoma patients at Addenbrooke's Hospital pre-treatment with vemurafenib and also upon the development of resistant disease, with the aim of using exome sequence and SNP6 data to identify novel sequence variants and copy number alterations that can be used to validate observed resistance mechanisms in our cell line models and also to use these models to inform as to likely candidate small molecule inhibitors to overcome resistance and that could be tested in the clinical trial setting. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001001047 
   
  
    
    Targeted exome sequencing of 375 genes 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001001048 
   
  
    
    Samples from Edwards et al 2015 - doi:10.1186/s12864-015-1685-z 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001001050 
   
  
    
    We propose to biopsy 20 consented BRAF mutant melanoma patients at Addenbrooke's Hospital pre-treatment with vemurafenib and also upon the development of resistant disease, with the aim of using exome sequence and SNP6 data to identify novel sequence variants and copy number alterations that can be used to validate observed resistance mechanisms in our cell line models and also to use these models to inform as to likely candidate small molecule inhibitors to overcome resistance and that could be tested in the clinical trial setting. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001051 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  200 
 
  
    EGAD00001001052 
   
  
    
    DATA FILES FOR SJTALL 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001001053 
   
  
    
    DATA FILES FOR SJOS-WGS-2ndBatch 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001001054 
   
  
    
    DATA FILES FOR Ph-likeALL WES 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001001055 
   
  
    
    BAM files for the whole exome sequencing from the study on spatial homogeneity in pediatric brain tumors. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  53 
 
  
    EGAD00001001056 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001057 
   
  
    
    RNA-seq from normal human tissues (2 x 75 bp) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001058 
   
  
    
    Cancer exome reads consisting of FASTQ paired end reads from bone marrow samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001001059 
   
  
    
    Whole Exome Sequencing files accompanying Genetic landscape of pediatric Rhabdomyosarcoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001001060 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  112 
 
  
    EGAD00001001061 
   
  
    
    This experiment is to inform us of the validity of using pre-made library material to perform a bespoke pulldown experiment to validate the mutations found between the whole genome sequencing of the DNA from the same individual's cancer and normal material. This is to identify the valid and informative mutations in cancer genomes. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001001062 
   
  
    
    Patient (who has had multiple malignancies) has previously been found to harbour a pathogenic p53 variant which is probably mosaic. This finding is based on exome sequencing performed elsewhere. In this study we will resequence the locus in question to ascertain whether the variant is indeed mosaic. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001001063 
   
  
    
    Chondromyxoid fibroma is a benign tumour of bone with unknown underlying pathogenesis. To determine the pathognomonic genomic event in chondromyxoid fibroma, whole genome sequencing will be undertaken to reconstruct rearrangements and find underlying mutations. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001064 
   
  
    
    Extension of angiosarcoma whole genome sequencing study 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001001065 
   
  
    
    DATA FILES FOR SJCPC-WGS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001066 
   
  
    
    Dynamics of genomic clones in breast cancer patient xenografts at single cell resolution 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  188 
 
  
    EGAD00001001071 
   
  
    
    Samples from the "100" project that are in the ICGC PanCancer project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001072 
   
  
    
    (ShallowSeq CopyNumber) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  5 
 
  
    EGAD00001001073 
   
  
    
    miRNA-seq Cohort of 140 Formalin Fixed Paraffin Embedded Diffuse Large B-cell Lymphoma Patient Samples 
    
   
  
    
   
  140 
 
  
    EGAD00001001074 
   
  
    
    miRNA-seq Cohort of 92 Fresh Frozen Diffuse Large B-cell Lymphoma Patient Samples 
    
   
  
    
   
  92 
 
  
    EGAD00001001075 
   
  
    
    miRNA-seq Cohort of 15 Benign Centroblasts 
    
   
  
    
   
  15 
 
  
    EGAD00001001076 
   
  
    
    Fastq files of 239 samples of biliary tract cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  239 
 
  
    EGAD00001001079 
   
  
    
    The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry.  In the UK there are large populations with very high first cousin marriage rates of 20-50%.  Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations.  This pilot study based on existing cohort samples from the Born In Bradford study will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed.  The data deposited in the EGA consist of low coverage whole exome sequencing on these samples. Data access is controlled by the Wellcome Trust Sanger Institute DAC and the Born In Bradford Executive Group.
This dataset contains all the data available for this study on 2014-11-20. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2702 
 
  
    EGAD00001001080 
   
  
    
    MDS patients 
    
   
  
    
   
  5 
 
  
    EGAD00001001081 
   
  
    
    Healthy reference samples 
    
   
  
    
   
  3 
 
  
    EGAD00001001083 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001084 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  209 
 
  
    EGAD00001001085 
   
  
    
    This dataset includes 2 pairs of tumour/normal whole genome sequence data as well as MEN1 gene targeted sequencing of an additional 87 specimens. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  91 
 
  
    EGAD00001001086 
   
  
    
    These analyses are the BAM files for the LCL samples of the EUROBATS project. 
    
   
  
    
   
  765 
 
  
    EGAD00001001087 
   
  
    
    RNAseq BAM files for the Skin samples of the EUROBATS project. 
    
   
  
    
   
  672 
 
  
    EGAD00001001088 
   
  
    
    RNAseq BAM files for the blood samples of the EUROBATS project 
    
   
  
    
   
  391 
 
  
    EGAD00001001089 
   
  
    
    RNAseq BAM files for the Fat samples of the EUROBATS project 
    
   
  
    
   
  685 
 
  
    EGAD00001001090 
   
  
    
    This study aims to define the landscape of somatic mutations in sun exposed human skin by deep sequencing, analyse their frequency and use the data to infer the effect of mutations on proliferating cell behaviour. The frequency of each mutation will reflect the size of the clone of cells in the tissue sample. By analyzing small samples, clones with as few as 100 cells will be detectable. Allele frequency distributions for each mutation will be used to infer cell fate using published methods (Klein et al. 2010). This study will shed unprecedented light on the early clonal events that lead to the emergence of cancer. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  166 
 
  
    EGAD00001001091 
   
  
    
    We established and validated a sequence capture based NGS testing approach for PKD1. The presence of six PKD1 pseudogenes and tremendous allelic heterogeneity make molecular genetic testing of PKD1 variants challenging. In the publication accompanying this dataset (An efficient and comprehensive strategy for genetic diagnostics of polycystic kidney disease, Eisenberger et al., PLoS ONE), we demonstrate that the applied standard mapping algorithm specifically aligns reads to the PKD1 locus and overcomes the complication of unspecific capture of pseudogenes. This dataset contains the raw PKD1 reads of all patients from the publication. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  55 
 
  
    EGAD00001001092 
   
  
    
    Approximately 80% of clinically clearly diagnosed patients suffering from primary ciliary dyskinesia (PCD) cannot be assigned to a specific gene defect. Despite extensive research on PCD and despite the increasing number of PCD genes and knowledge about their sites of action, e.g. as structural components or cytoplasmic pre-assembly factors, the biology of motile cilia and the pathomechanism leading to PCD are largely unknown. The aim of this study is to identify novel PCD related genes and processes relevant for motile cilia function. We will perform exome sequencing, focusing on the analysis of family trios.  In these families, the diagnosis of PCD is secured, but the underlying gene defect has so far not been identified. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  150 
 
  
    EGAD00001001093 
   
  
    
    Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001094 
   
  
    
    200PG : WGS Raw Sequence (fastq) : Raw WG sequence data (fastq) in this dataset are from the 124 CPCGene Tumour/Normal Pairs used in the 200PG Study.  https://www.ncbi.nlm.nih.gov/pubmed/28068672 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  247 
 
  
    EGAD00001001095 
   
  
    
    Supporting data for ICGC PACA-CA Release 18 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  506 
 
  
    EGAD00001001096 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  419 
 
  
    EGAD00001001098 
   
  
    
    DATA FILES FOR SJINF RNASeq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  63 
 
  
    EGAD00001001100 
   
  
    
    DCC Project Code: SKCA-BR (Skin Adenocarcinoma - BR, Brazil) 
    
   
  
    
      
      AB 5500 Genetic Analyzer 
      
      Illumina HiSeq 2500 
      
    
   
  200 
 
  
    EGAD00001001104 
   
  
    
    MMP-seq tumor samples, UDG treated (FASTQ) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  16 
 
  
    EGAD00001001105 
   
  
    
    Whole-exome sequencing in 16 RMS cases. Whole-transcriptome sequencing in 8 RMS cases. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001001106 
   
  
    
    In the first part of this project, we will differentiate IPS cells from 5 human donors into macrophages, and extract RNA from unstimulated and LPS stimulated macrophages to perform RNA sequencing. We will also extract RNA before and after stimulation in blood-derived macrophages from 5 additional, unrelated healthy samples. In the second part of the project, RNA-seq data will be analysed to compare LPS response of these two macrophage populations. In summary, we will perform 75bp PE RNA-seq on 20 samples (10 pre and post stimulus), on the HiSeq 2500 platform. Samples will be multiplexed at 5 samples / lane, so we will require 4 flow cells in total. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001001107 
   
  
    
    MMP-seq cell lines (FASTQ) 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  154 
 
  
    EGAD00001001108 
   
  
    
    MMP-seq tumor samples (FASTQ) 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  218 
 
  
    EGAD00001001109 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001001110 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001001111 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001001112 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001001113 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001001114 
   
  
    
    DDD DATAFREEZE 2013-12-18: 1133 trios - exome sequence BAM files (Ref: DDD Nature 2015) 
    
   
  
    
   
  - 
 
  
    EGAD00001001115 
   
  
    
    SeqControl 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001001116 
   
  
    
    Whole Genome Sequencing Illumina HiSeq data from 95 men with prostate cancer. Samples were taken from primary tissue obtained at prostatectomy (target sequencing depth 50X) with matched blood control (target sequencing depth 30X). This data is from batches 1 to 3 and is the bulk of the data used in Wedge et al, Nature Genetics 2018 (PMID: 29662167).
As of September 2020, some of the studies using these data include:
Wedge et al, Nature Genetics 2018 (PMID: 29662167)
Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  150 
 
  
    EGAD00001001118 
   
  
    
    Gastric Cancer (GC) is a highly heterogeneous disease. To identify potential clinically actionable therapeutic targets that may inform individualized treatment strategies, we performed whole-exome sequencing on 78 GCs of differing histologies and anatomic locations, as well as whole-genome sequencing on two GC cases, each with 3 primary tumours and 2 matching lymph node metastases. The data showed two distinct GC subtypes with either high-clonality (HiC) or low-clonality (LoC). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  168 
 
  
    EGAD00001001119 
   
  
    
    Whole Genome Bisulfite Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001001120 
   
  
    
    Whole Genome Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001001121 
   
  
    
    RNA Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001122 
   
  
    
    FFPE normal panel generation for use with V3 cancer panel 0618521 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  94 
 
  
    EGAD00001001123 
   
  
    
    Deep sequencing of two skin biopsies to study the landscape of somatic mutations in human adult tissues. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001124 
   
  
    
    Our aim is to analyze the genome of human melanoma cell lines and short-term cultures from human melanoma samples in order to identify genes that confer drug resistance to clinically relevant targeted therapies. We will perform whole-exome sequencing, copy number variation analysis and methylome analysis in a collection of human melanoma cell lines and short-term cultures that will then be screened for drug sensitivity/resistance through a library of clinically relevant drugs and drug combinations. By the combined analysis of the genomic lesions and the drug sensitivity/resistance profile of different cell lines, we will look for genes whose mutation is associated with sensitivity or resistance to a specific drug in different samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001001125 
   
  
    
    Exome sequencing of Untreated BCC samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  91 
 
  
    EGAD00001001126 
   
  
    
   
  
    
   
  340 
 
  
    EGAD00001001127 
   
  
    
    ChIP-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001128 
   
  
    
    Bisulfite-Seq data for 3 cytotoxic CD56-dim natural killer cell sample(s). 38 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001129 
   
  
    
    RNA-Seq data for 10 mature neutrophil sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001130 
   
  
    
    DNase-Hypersensitivity data for 5 CD14-positive, CD16-negative classical monocyte sample(s). 5 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001131 
   
  
    
    Bisulfite-Seq data for 1 memory B cell sample(s). 20 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001132 
   
  
    
    RNA-Seq data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001133 
   
  
    
    Bisulfite-Seq data for 2 erythroblast sample(s). 35 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001134 
   
  
    
    Bisulfite-Seq data for 1 precursor lymphocyte of B lineage sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001135 
   
  
    
    Bisulfite-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001136 
   
  
    
    ChIP-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 13 run(s), 13 experiment(s), 13 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001137 
   
  
    
    RNA-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001138 
   
  
    
    ChIP-Seq data for 6 Acute promyelocytic leukemia sample(s). 25 run(s), 23 experiment(s), 23 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001139 
   
  
    
    Bisulfite-Seq data for 3 inflammatory macrophage sample(s). 38 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001140 
   
  
    
    RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001141 
   
  
    
    Bisulfite-Seq data for 1 hematopoietic multipotent progenitor cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001142 
   
  
    
    RNA-Seq data for 1 endothelial cell of umbilical vein (resting) sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001143 
   
  
    
    Bisulfite-Seq data for 4 alternatively activated macrophage sample(s). 64 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001144 
   
  
    
    ChIP-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001145 
   
  
    
    RNA-Seq data for 2 CD38-negative naive B cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001146 
   
  
    
    RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001147 
   
  
    
    ChIP-Seq data for 7 CD4-positive, alpha-beta T cell sample(s). 46 run(s), 45 experiment(s), 45 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001148 
   
  
    
    RNA-Seq data for 8 CD14-positive, CD16-negative classical monocyte sample(s). 8 run(s), 8 experiment(s), 8 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001149 
   
  
    
    ChIP-Seq data for 7 mature neutrophil sample(s). 78 run(s), 60 experiment(s), 60 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001150 
   
  
    
    Bisulfite-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001151 
   
  
    
    Bisulfite-Seq data for 1 endothelial cell of umbilical vein (proliferating) sample(s). 21 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001152 
   
  
    
    Bisulfite-Seq data for 2 Multiple myeloma sample(s). 16 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001153 
   
  
    
    RNA-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001154 
   
  
    
    ChIP-Seq data for 5 CD8-positive, alpha-beta T cell sample(s). 28 run(s), 28 experiment(s), 28 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001155 
   
  
    
    ChIP-Seq data for 5 alternatively activated macrophage sample(s). 36 run(s), 35 experiment(s), 35 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001156 
   
  
    
    RNA-Seq data for 6 hematopoietic stem cell sample(s). 13 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001157 
   
  
    
    Bisulfite-Seq data for 3 CD4-positive, alpha-beta T cell sample(s). 61 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001158 
   
  
    
    ChIP-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 16 run(s), 16 experiment(s), 16 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001159 
   
  
    
    RNA-Seq data for 3 cytotoxic CD56-dim natural killer cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001160 
   
  
    
    Bisulfite-Seq data for 1 plasma cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001161 
   
  
    
    DNase-Hypersensitivity data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001162 
   
  
    
    Bisulfite-Seq data for 1 Acute myeloid leukemia sample(s). 18 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001163 
   
  
    
    RNA-Seq data for 1 effector memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001164 
   
  
    
    RNA-Seq data for 1 class switched memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001165 
   
  
    
    RNA-Seq data for 5 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 23 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001166 
   
  
    
    RNA-Seq data for 1 endothelial cell of umbilical vein (proliferating) sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001167 
   
  
    
    Bisulfite-Seq data for 3 Acute promyelocytic leukemia sample(s). 24 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001168 
   
  
    
    ChIP-Seq data for 2 mature eosinophil sample(s). 12 run(s), 12 experiment(s), 12 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001169 
   
  
    
    RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001170 
   
  
    
    RNA-Seq data for 1 conventional dendritic cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001171 
   
  
    
    RNA-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001172 
   
  
    
    RNA-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001173 
   
  
    
    RNA-Seq data for 10 CD4-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001174 
   
  
    
    RNA-Seq data for 1 regulatory T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001175 
   
  
    
    RNA-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001176 
   
  
    
    Bisulfite-Seq data for 1 class switched memory B cell sample(s). 20 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001177 
   
  
    
    RNA-Seq data for 7 erythroblast sample(s). 29 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001178 
   
  
    
    RNA-Seq data for 1 Leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001179 
   
  
    
    ChIP-Seq data for 10 CD14-positive, CD16-negative classical monocyte sample(s). 73 run(s), 69 experiment(s), 69 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001180 
   
  
    
    Bisulfite-Seq data for 2 central memory CD8-positive, alpha-beta T cell sample(s). 27 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001181 
   
  
    
    RNA-Seq data for 7 Acute promyelocytic leukemia sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001182 
   
  
    
    ChIP-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001183 
   
  
    
    ChIP-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001184 
   
  
    
    RNA-Seq data for 5 common lymphoid progenitor sample(s). 20 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001185 
   
  
    
    DNase-Hypersensitivity data for 2 monocyte sample(s). 4 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001186 
   
  
    
    RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001187 
   
  
    
    ChIP-Seq data for 3 Chronic lymphocytic leukemia sample(s). 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001188 
   
  
    
    ChIP-Seq data for 7 Acute myeloid leukemia sample(s). 23 run(s), 23 experiment(s), 23 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001189 
   
  
    
    Bisulfite-Seq data for 4 CD8-positive, alpha-beta T cell sample(s). 56 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001190 
   
  
    
    DNase-Hypersensitivity data for 1 Acute myeloid leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001191 
   
  
    
    RNA-Seq data for 8 monocyte sample(s). 8 run(s), 8 experiment(s), 8 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001192 
   
  
    
    Bisulfite-Seq data for 5 macrophage sample(s). 72 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001193 
   
  
    
    DNase-Hypersensitivity data for 2 inflammatory macrophage sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001194 
   
  
    
    ChIP-Seq data for 2 erythroblast sample(s). 14 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001195 
   
  
    
    ChIP-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001196 
   
  
    
    ChIP-Seq data for 13 macrophage sample(s). 55 run(s), 55 experiment(s), 55 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  13 
 
  
    EGAD00001001197 
   
  
    
    ChIP-Seq data for 2 monocyte sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001001198 
   
  
    
    DNase-Hypersensitivity data for 14 macrophage sample(s). 18 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001001199 
   
  
    
    RNA-Seq data for 18 macrophage sample(s). 19 run(s), 18 experiment(s), 18 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001001200 
   
  
    
    Bisulfite-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001201 
   
  
    
    Bisulfite-Seq data for 6 mature neutrophil sample(s). 79 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001202 
   
  
    
    RNA-Seq data for 4 alternatively activated macrophage sample(s). 6 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001203 
   
  
    
    Bisulfite-Seq data for 1 germinal center B cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001204 
   
  
    
    ChIP-Seq data for 6 inflammatory macrophage sample(s). 35 run(s), 35 experiment(s), 35 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001205 
   
  
    
    Bisulfite-Seq data for 3 CD38-negative naive B cell sample(s). 29 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001206 
   
  
    
    Bisulfite-Seq data for 6 CD14-positive, CD16-negative classical monocyte sample(s). 86 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001207 
   
  
    
    ChIP-Seq data for 4 CD38-negative naive B cell sample(s). 14 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001208 
   
  
    
    Targeted capture of cancer gene panel bait set in single cell derived organoids from colon tissue and colorectal cancer from 1 patient. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  105 
 
  
    EGAD00001001209 
   
  
    
    To examine the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA, we performed targeted sequencing of all known human protein kinase genes (kinome) (~3.3 Mb) using the SOLiD v4 platform. This data set contains 17 breast cancer samples that were sequenced in duplicate (n=14) or triplicate (n=3), in order to assess concordance of all calls and single nucleotide variant (SNV) calls. 
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  37 
 
  
    EGAD00001001210 
   
  
    
    Medulloblastoma-associated DDX3 variant selectively alters the translational response to stress 
    
   
  
    
   
  28 
 
  
    EGAD00001001212 
   
  
    
    RNAseq profile of purified plasma cells from multiple myeloma patients and tonsils of healthy donors 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001001213 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001214 
   
  
    
    Deep (>25x mean coverage) whole genome sequencing on 5-10 families drawn from the Scottish Family Health Study with four or more children. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001001215 
   
  
    
    Targeted sequencing follow-up of genomic lesions in multiple myeloma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  424 
 
  
    EGAD00001001216 
   
  
    
    The aim of this project is to genotype and sequence single spermatozoa from two men, one in his twenties and the other in his seventies. The resulting data is used to quantify the mutations that have arisen in the gametes of both individuals in order to better understand the effect of aging on mutation rates and modes. Project Outline. In order to quantify mutations, semen from two individuals is sequenced. 48 single sperm cells are isolated from each individual, and their DNA is extracted. The resulting genomes are amplified using PicoPlex, GenomiPhi MDA, Repli-G MDA, and MALBAC. A QC step is applied to check the quality of WGA DNA using a standard Sequenom plex (26 SNPs). A subset of 32 amplification products that pass the initial QC are genotyped using Affymetrix SNP6 chips. 12 of the genotyped amplification products are also sequenced. In addition, one multi-cell sample per individual is sequenced as a reference and for validation purposes. Altogether, 12 single cell sperm genomes and two multi-cell genomes are sequenced, coming to a total of 14 genomes. Of the single cell sperm genomes, 2 are sequenced to 50x coverage, and the other 10 to 25x coverage. Both multi-cell genomes are sequenced to 25x coverage. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001217 
   
  
    
   
  
    
   
  15 
 
  
    EGAD00001001218 
   
  
    
   
  
    
   
  10 
 
  
    EGAD00001001220 
   
  
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  10 
 
  
    EGAD00001001221 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001001222 
   
  
    
    TGCT Whole Exome Sequencing data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  84 
 
  
    EGAD00001001226 
   
  
    
    smRNA-Seq assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency,  Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
   
  28 
 
  
    EGAD00001001227 
   
  
    
    Strand-specific mRNA-Seq assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency,  Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
   
  32 
 
  
    EGAD00001001228 
   
  
    
    Whole genome shotgun sequencing assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
   
  27 
 
  
    EGAD00001001229 
   
  
    
    ChIP-Seq (H3K27ac) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000;ILLUMINA 
      
    
   
  48 
 
  
    EGAD00001001230 
   
  
    
    ChIP-Seq (H3K27me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency,  Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000;ILLUMINA 
      
    
   
  48 
 
  
    EGAD00001001231 
   
  
    
    ChIP-Seq (H3K36me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000;ILLUMINA 
      
    
   
  48 
 
  
    EGAD00001001232 
   
  
    
    ChIP-Seq (H3K4me1) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency,  Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000;ILLUMINA 
      
    
   
  48 
 
  
    EGAD00001001233 
   
  
    
    ChIP-Seq (H3K4me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000;ILLUMINA 
      
    
   
  48 
 
  
    EGAD00001001234 
   
  
    
    ChIP-Seq (H3K9me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000;ILLUMINA 
      
    
   
  48 
 
  
    EGAD00001001235 
   
  
    
    ChIP-Seq (Input) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
   
  48 
 
  
    EGAD00001001236 
   
  
    
    Targeted capture and resequencing of 94 known myeloid genes across MPN trials (PT1 and Voriconazole study) and other MPN samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1860 
 
  
    EGAD00001001237 
   
  
    
    This is a pilot project to determine whether the TAPG FFPE DNAs are suitable for deep sequencing. If successful, an investigation of SNP distribution in a larger cohort will follow. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001001238 
   
  
    
    Extension analysis to pursue candidate genes of interest in chordoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  262 
 
  
    EGAD00001001239 
   
  
    
    Extension analysis to pursue candidate genes of interest in chordoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  262 
 
  
    EGAD00001001240 
   
  
    
    VCF files of somatic variants from tumor-normal pairs of Asian lung cancer patients 
    
   
  
    
   
  30 
 
  
    EGAD00001001242 
   
  
    
    Pilot study to set up sequencing protocols for targeted pulldown methylation profiling 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  2 
 
  
    EGAD00001001243 
   
  
    
    Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001001244 
   
  
    
    RNA-sequencing (RNA-seq) was performed with RNA extracted from fresh-frozen
human tumor tissue samples. cDNA libraries were prepared from poly-A selected
RNA applying the Illumina TruSeq protocol for mRNA. The libraries were then
sequenced with a 2 x 100bp paired-end protocol to a minimum mean coverage of 30x
of the annotated transcriptome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001001245 
   
  
    
    DATA FILES FOR PCGP SJINF WES 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001001246 
   
  
    
    DATA FILES FOR PCGP SJMEL WXS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001001247 
   
  
    
    DATA FILES FOR PCGP SJMEL RNASEQ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001248 
   
  
    
    DATA FILES FOR PCGP SJETP WXS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001001249 
   
  
    
    WES of HCC by HiSeq 2000, 71 samples in total, including hepatocellular carcinoma cell lines and normal samples (peripheral blood or tissue adjacent to the cancer) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  71 
 
  
    EGAD00001001250 
   
  
    
    Low coverage (4-6x) sequencing on samples from population cohorts (Finrisk, Health2000) will be done at Wellcome Trust Sanger Institute (WTSI) using Illumina HiSeq sequencing technology. We will produce 100bp paired end reads. Variants will be called using the 1000 Genomes Project pipeline.  The samples have been selected from a national representative set of 8028 samples from persons of 30 years or older, which were screened for psychotic and bipolar disorders using the Composite International Diagnostic Interview, self-reported diagnoses, medical examination, and national registers. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  731 
 
  
    EGAD00001001251 
   
  
    
    Low coverage (4-6x) sequencing on samples from population cohorts (Finrisk, Health2000) will be done at Wellcome Trust Sanger Institute (WTSI) using Illumina HiSeq sequencing technology. We will produce 100bp paired end reads. Variants will be called using the 1000 Genomes Project pipeline.  The samples have been selected from a national representative set of approximately 30,300 samples and comprises 500 individuals of each gender in the extreme tail of high density lipoprotein (HDL) concentrations. Included individuals were between 25 and 65 years of age. Individuals with a diagnosis of diabetes or BMI>30 were excluded from the study. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  966 
 
  
    EGAD00001001252 
   
  
    
    DNA was derived from the primary tumour, lung metastasis, and peri-aortic lymph node metastasis. DNA from the spleen was used as a normal control. For WE sequencing we used hybrid capture (Nimblegen version 3.0) of the lymph node and lung metastases, primary tumour and spleen normal; we generated ~100-fold coverage. 
    
   
  
    
   
  4 
 
  
    EGAD00001001253 
   
  
    
    DNA was derived from the primary tumour, lung metastasis, and peri-aortic lymph node metastasis. DNA from the spleen was used as a normal control. WG sequencing produced ~30-fold (primary tumour, spleen normal) to 50-fold (lung metastasis) coverage. 
    
   
  
    
   
  3 
 
  
    EGAD00001001256 
   
  
    
    Clonal hematopoiesis was investigated in patients with aplastic anemia using next-generation sequencing and single-nucleotide polymorphism (SNP) array-based karyotyping. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  186 
 
  
    EGAD00001001257 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001258 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001259 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001260 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001261 
   
  
    
    Bisulfite-Seq of CD14-positive, CD16-negative classical monocyte samples for methylome saturation and COMET analysis 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001262 
   
  
    
    Unaligned bam of 31 samples derived from primary tumor 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  31 
 
  
    EGAD00001001263 
   
  
    
    Unaligned bam of 31 samples derived from blood 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  31 
 
  
    EGAD00001001264 
   
  
    
    We propose to definitively characterise the somatic genetics of ER+ve, HER2-ve breast cancer through generation of comprehensive catalogues of somatic mutations in 500 cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  223 
 
  
    EGAD00001001265 
   
  
    
    Genomic architecture of mesothelioma; the parent study is project 925. This project is set up in parallel to project 925 in order to whole-genome sequence ten of the 59 tumours in that project. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  18 
 
  
    EGAD00001001266 
   
  
    
    Whole genome sequencing of primary angiosarcoma 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001001267 
   
  
    
    Anaplastic meningiomas are a rare, malignant variant of meningioma. At present there is no effective treatment for this cancer. The aim of the study is to identify somatic mutations in anaplastic meningiomas. We plan to sequence a set of 500 known cancer genes in 50 anaplastic meningioma and corresponding peripheral blood DNA samples. Bioinformatics will be used to analyse the results to assess the probability of these mutations being causal and so likely of critical importance for the tumour growth. Identification of these mutations will guide selection of appropriate compounds to effectively treat the disease. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  60 
 
  
    EGAD00001001268 
   
  
    
    H9 human embryonic stem cells (hESCs) were cultured in feeder-free chemically-defined conditions in medium containing 10ng/ml Activin A and 12ng/ml FGF2 (Vallier L. 2011, Methods in Molecular Biology, 690: 57-66). Chromatin immunoprecipitation was performed as described in Brown S. et al. 2011. Stem Cells 29: 1176-85 by using 5ug of anti-DPY30 antibody (Sigma, cat. number HPA043761). This protocol was performed in control hESCs (expressing a scrambled shRNA) and in hESCs stably expressing an shRNA against DPY30 (Sigma, clone n. TRCN0000131112). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001269 
   
  
    
    Exome bam files of 75 Individuals From Multiply Affected Coeliac Families 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina Genome Analyzer IIx 
      
    
   
  75 
 
  
    EGAD00001001271 
   
  
    
    Around 50 samples of pre-invasive lung cancer lesions showing subsequent clinical and pathological progression or regression 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  50 
 
  
    EGAD00001001272 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001001273 
   
  
    
    Whole genome sequencing was performed with DNA extracted from fresh-frozen
tumor and normal material. Short insert DNA libraries were prepared with the TruSeq
DNA PCRfree sample preparation kit (Illumina) for paired-end sequencing at a
minimum read length of 2x100bp. Human DNA libraries were sequenced to an
average coverage of minimum 30x for both tumor and matched normal. Murine DNA
libraries of tumor and matched normal were both sequenced to a coverage of 25x. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001001274 
   
  
    
    Brain samples for  this dataset were provided by the Medical Research Council Sudden Death Brain and Tissue Bank (Edinburgh, UK). All four individuals sampled were of European descent, neurologically normal during life and confirmed to be neuropathologically normal by a consultant neuropathologist using histology performed on sections prepared from paraffin-embedded tissue blocks. Twelve regions of the central nervous system were sampled from each individual. The regions studied were: cerebellar cortex, frontal cortex, temporal cortex, occipital cortex, hippocampus, the inferior olivary nucleus (sub-dissected from the medulla), putamen, substantia nigra, thalamus, hypothalamus, intralobular white matter and cervical spinal cord. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001001275 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001276 
   
  
    
    McGill EMC Release 4 for cell type "induced pluripotent stem cell" 
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD00001001277 
   
  
    
    McGill EMC Release 4 in tissue "fat pad" for cell type "fat cell" 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001001278 
   
  
    
    McGill EMC Release 4 in tissue "venous blood" for cell type "B cell" 
    
   
  
    
      
      unspecified 
      
    
   
  41 
 
  
    EGAD00001001279 
   
  
    
    McGill EMC Release 4 in tissue "venous blood" for cell type "CD4-positive helper T cell" 
    
   
  
    
      
      unspecified 
      
    
   
  55 
 
  
    EGAD00001001280 
   
  
    
    McGill EMC Release 4 in tissue "venous blood" for cell type "CD4-positive, alpha-beta T cell" 
    
   
  
    
      
      unspecified 
      
    
   
  40 
 
  
    EGAD00001001281 
   
  
    
    McGill EMC Release 4 in tissue "venous blood" for cell type "eosinophil" 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001001282 
   
  
    
    McGill EMC Release 4 in tissue "venous blood" for cell type "Monocyte" 
    
   
  
    
      
      unspecified 
      
    
   
  82 
 
  
    EGAD00001001283 
   
  
    
    McGill EMC Release 4 in tissue "venous blood" for cell type "T cell" 
    
   
  
    
      
      unspecified 
      
    
   
  20 
 
  
    EGAD00001001284 
   
  
    
    McGill EMC Release 4 in tissue "Brodmann (1909) area 11" 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001001285 
   
  
    
    McGill EMC Release 4 in tissue "Brodmann (1909) area 44" 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001001286 
   
  
    
    McGill EMC Release 4 in tissue "Brodmann (1909) area 8;Brodmann (1909) area 9" 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001001287 
   
  
    
    McGill EMC Release 4 in tissue "kidney" 
    
   
  
    
      
      unspecified 
      
    
   
  2 
 
  
    EGAD00001001288 
   
  
    
    McGill EMC Release 4 in tissue "skeletal muscle tissue" 
    
   
  
    
      
      unspecified 
      
    
   
  29 
 
  
    EGAD00001001289 
   
  
    
    McGill EMC Release 4 for assay "Bisulfite-seq": Methylation profiling by high-throughput sequencing 
    
   
  
    
      
      unspecified 
      
    
   
  44 
 
  
    EGAD00001001290 
   
  
    
    McGill EMC Release 4 for assay "RNA-seq": Transcriptome profiling by high-throughput sequencing 
    
   
  
    
      
      unspecified 
      
    
   
  261 
 
  
    EGAD00001001291 
   
  
    
    McGill EMC Release 4 for assay "mRNA-seq": Transcriptome profiling by high-throughput sequencing 
    
   
  
    
      
      unspecified 
      
    
   
  40 
 
  
    EGAD00001001292 
   
  
    
    McGill EMC Release 4 for assay "smRNA-seq": Transcriptome profiling by high-throughput sequencing 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001001293 
   
  
    
    McGill EMC Release 4 for assay "ChIP-Seq Input" 
    
   
  
    
      
      unspecified 
      
    
   
  52 
 
  
    EGAD00001001294 
   
  
    
    McGill EMC Release 4 for assay "H3K27me3" 
    
   
  
    
      
      unspecified 
      
    
   
  32 
 
  
    EGAD00001001295 
   
  
    
    McGill EMC Release 4 for assay "H3K36me3" 
    
   
  
    
      
      unspecified 
      
    
   
  37 
 
  
    EGAD00001001296 
   
  
    
    McGill EMC Release 4 for assay "H3K4me1" 
    
   
  
    
      
      unspecified 
      
    
   
  41 
 
  
    EGAD00001001297 
   
  
    
    McGill EMC Release 4 for assay "H3K4me3" 
    
   
  
    
      
      unspecified 
      
    
   
  42 
 
  
    EGAD00001001298 
   
  
    
    McGill EMC Release 4 for assay "H3K27ac" 
    
   
  
    
      
      unspecified 
      
    
   
  36 
 
  
    EGAD00001001299 
   
  
    
    McGill EMC Release 4 for assay "H3K9me3" 
    
   
  
    
      
      unspecified 
      
    
   
  29 
 
  
    EGAD00001001300 
   
  
    
    McGill EMC Release 4 for assay "ATAC-seq": Sequencing of transposase-accessible chromatin as described by Buenrostro et al. (Nature Methods 10, 1213–1218 (2013) doi:10.1038/nmeth.2688) 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001001301 
   
  
    
    Whole exome sequencing data of 5 patients diagnosed with FL that had undergone several relapse episodes without evidence of transformation 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  29 
 
  
    EGAD00001001302 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001001303 
   
  
    
    The dataset for the PROP1 study consists of samples of patients with combined pituitary hormone deficiency due to two most prevalent mutations in the PROP1 gene (c.301_302delGA and c.150delA) and healthy relatives and controls. All subjects were genotyped for 21 single nucleotide polymorphisms surrounding the PROP1 gene in order to assess the potential ancestral origin of the respective mutations. The genotype data are displayed in the vcf format. 
    
   
  
    
   
  328 
 
  
    EGAD00001001304 
   
  
    
    We used whole-genome bisulfite sequencing (WGBS) to generate unbiased DNA methylation maps of six purified B-cell subpopulations: hematopoietic progenitor cells (HPC); pre-B-II cells (preB2C); naive B cells from peripheral blood (naiBC); germinal center B cells (gcBC); memory B cells from peripheral blood (memBC) and plasma cells from bone marrow (bm-PC). WGBS was performed in 2 biological replicates from each subpopulation. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001305 
   
  
    
    Dataset contains WES data from 3 astrocytoma patients: blood as control, primary tumor and recurrent tumor 
    
   
  
    
   
  9 
 
  
    EGAD00001001306 
   
  
    
    Human melanoma samples were collected before, during, and at progression on BRAF inhibitor therapy. RNA was extracted and analysed by RNA-seq. This has provided insights into different categories of BRAF inhibitor resistance mechanisms. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001001307 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001308 
   
  
    
    Genome and transcriptome sequence data from a primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      MinION 
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001001309 
   
  
    
    Genome and transcriptome sequence data from an appendix cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001310 
   
  
    
    Genome and transcriptome sequence data from a peritoneal mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001001311 
   
  
    
    Genome and transcriptome sequence data from a peritoneal mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001312 
   
  
    
    Fastq data for whole genome bisulfite sequencing assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001001313 
   
  
    
    We enriched a panel of cancer associated genes using the Custom Sure Select Target Enrichment Kit. Identified mutations were validated with deep sequencing in order to assess mutated allele frequencies more accurately. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  10 
 
  
    EGAD00001001314 
   
  
    
    Sequence data from L1-amplicon libraries prepared from plasma-DNA from a set of 24 female controls and 18 male controls without malignant disease, and from samples from breast cancer patients (n=28) and prostate cancer patients (n=61). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  125 
 
  
    EGAD00001001315 
   
  
    
    Phenotype determination by SNP typing using PCR and snapshotPCR with subsequent fragment analysis. We investigated 400 individuals from Northern Germany and detected up to 12 different SNPs to determine eye, hair and skin colour. More than 1000 different runs on an ABI3130 were performed. This dataset includes: phenotype information for 400 samples; summary and complete genotype calls for 12 SNPs on 400 samples. 
    
   
  
    
   
  399 
 
  
    EGAD00001001316 
   
  
    
    Exome sequence analysis of individuals with severe early onset inflammatory bowel disease, and their families. Individuals are ascertained through the COLORS in IBD study, which includes centres throughout UK and Europe. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  149 
 
  
    EGAD00001001317 
   
  
    
    This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  12 
 
  
    EGAD00001001319 
   
  
    
    The aim of this study is to ascertain whether leukaemic mutations exist within the blood of people with otherwise normal haematopoiesis. To satisfy this aim we plan to look for 7 known leukaemic mutations in the whole blood DNA of a large cohort of blood donors who have normal haematopoiesis. Genomic regions around mutational sites have been amplified using a 2-step PCR process which involves barcoding of individual patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  5817 
 
  
    EGAD00001001320 
   
  
    
    This is a study to test ATAC-seq protocols. CD4+ and CD8+ cells have been obtained from three different anatomical compartments. We aim to assay open-chromatin regions across these cells and perform comparative analyses. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  138 
 
  
    EGAD00001001321 
   
  
    
    This dataset includes WGS & WTS alignment data generated from 1 ATC tumor, its matched peripheral blood specimen and 3 authenticated ATC cell lines, THJ-16T, THJ-21T and THJ-29T. In addition, it includes WTS data from 4 additional unique anaplastic cell lines, ACT-1, C643, HTh7 and T238. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  13 
 
  
    EGAD00001001322 
   
  
    
    A comprehensive characterisation and analysis of human breast cancers through whole-genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  196 
 
  
    EGAD00001001326 
   
  
    
    Whole genome sequencing of a single adult T-cell leukemia/lymphoma case 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001329 
   
  
    
    Aligned Sequence (bam format), Duplicates removed 
    
   
  
    
   
  28 
 
  
    EGAD00001001330 
   
  
    
    In this experiment we have sequenced tumour normal pairs from patients presenting with CRC who have a prior history of inflammatory bowel disease. The idea is to identify driver mutations, new genes and novel pathways associated with the development of these malignancies. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  70 
 
  
    EGAD00001001331 
   
  
    
    The aim of this work is to apply an integrated systems approach to understand the biological underpinnings of large joint (hip and knee) osteoarthritis which culminates in the need for total joint replacement (TJR). In this pilot we will assess the feasibility of the approach in the relevant tissue. We will obtain diseased and non-diseased tissue (cartilage and endochondral bone) following TJR, coupled with a blood sample, from 12 patients. We will characterise the 12 pairs of diseased and non-diseased tissue samples in terms of transcription (RNASeq). The pilot will help assess the feasibility of isolating sufficient levels of starting material for the different approaches, and will instigate the development of analytical approaches to synthesising the resulting data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001001332 
   
  
    
    Development of a method for separation and parallel sequencing of the genomes and transcriptomes of single cells. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  700 
 
  
    EGAD00001001333 
   
  
    
    Whole exome sequencing BAM files for samples from the BRIDGE Consortium with pathogenic or likely pathogenic variants on genes linked to bleeding or platelet disorders. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001001334 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  99 
 
  
    EGAD00001001335 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001001336 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001337 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  607 
 
  
    EGAD00001001338 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  49 
 
  
    EGAD00001001339 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  76 
 
  
    EGAD00001001340 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001001341 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  158 
 
  
    EGAD00001001343 
   
  
    
    Data from the study of subclonal metastatic expansion in prostate cancer. Whole genome shotgun sequencing of fifteen samples, tumour and whole blood, from the four initial patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001001344 
   
  
    
    Data from the study of subclonal metastatic expansion in prostate cancer. Whole genome shotgun sequencing of six samples, tumour and whole blood, from the three additional patients whose somatic variants were examined in depth. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001345 
   
  
    
    Data from the study of subclonal metastatic expansion in prostate cancer. RNA-seq of twelve samples, tumour and benign tissue, from the four initial patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001347 
   
  
    
    Exome sequencing of a case of lethal EBV-driven LPD 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001349 
   
  
    
    Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001350 
   
  
    
    Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001351 
   
  
    
    Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001352 
   
  
    
    Data files for CONSERTING (WGS) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001001353 
   
  
    
    Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001354 
   
  
    
    Whole exome sequencing of around 700 inflammatory bowel disease cases. This data can only be used for the identification of IBD/immune-mediated disease loci. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  702 
 
  
    EGAD00001001355 
   
  
    
    DDD DATAFREEZE 2013-12-18: 1133 trios - VCF files (Ref: DDD Nature 2015) 
    
   
  
    
   
  - 
 
  
    EGAD00001001356 
   
  
    
    Neuroblastoma, a clinically heterogeneous pediatric cancer, is characterized by distinct genomic profiles but few recurrent mutations. As neuroblastoma is expected to have a high degree of genetic heterogeneity, study of neuroblastoma's clonal evolution with deep coverage whole-genome sequencing of diagnosis and relapse samples will lead to a better understanding of the molecular events associated with relapse. Samples were included in this study if sufficient DNA from constitutional, diagnosis and relapse tumors was available for WGS. Whole genome sequencing was performed on trios (constitutional, diagnosis and relapse DNA) from eight patients using Illumina HiSeq 2500, yielding paired-end (PE) 90x90 reads for six of them and 100x100 for two. Expected coverage for sample NB0175 100x100bp was 30X for tumor and constitutional samples. For the seven other patients expected coverage was 80X for tumor samples with PE 100x100, 100X in the other tumor samples and 50X for all constitutional samples (see table 1). Following alignment with BWA (Li et al., Oxford J, 2009 Jul) allowing up to 4% of mismatches, bam files were cleaned up according to the Genome Analysis Toolkit (GATK) recommendations (Van der Auwera et al., Current Protocols in Bioinformatics, 2013, picard-1.45, GenomeAnalysisTK-2.2-16). Variant calling was performed in parallel using 3 variant callers: GenomeAnalysisTK-2.2-16, Samtools-0.1.18 and MuTect-1.1.4 (McKenna et al., Genome Res, 2010; Li et al., Oxford J, 2009 Aug; Cibulskis et al., Nature, 2013). Annovar-v2012-10-23 with cosmic-v64 and dbsnp-v137 were used for the annotation and RefSeq for the structural annotation. For GATK and Samtools, single nucleotide variants (SNVs) with a quality under 30, a depth of coverage under 6 or with fewer than 2 reads supporting the variant were filtered out. MuTect, with parameters following the GATK and Samtools thresholds, was used to filter out irrelevant variants. 
We focused on SNVs within and around exons of coding genes overlapping splice sites. Then, variants reported in more than 1% of the population in the 1000 genomes (1000gAprl_2012) or Exome Sequencing Project (ESP6500) were discarded in order to filter polymorphisms. Finally, synonymous variants were filtered out. MuTect focuses on somatic variants by filtering against the constitutional sample. Mpileup comparison between constitutional and somatic DNAs allowed us to focus also on tumor-specific SNVs with GATK and Samtools. Finally, every SNV called by our pipeline and also supported in any constitutional sample was filtered out in order to guard against putative coverage deficiency in constitutional DNA. Then we analyzed CNVs (copy number variants) with HMMcopy-v0.1.1 (Gavin et al., Genome Res, 2012) and control-FREEC-v6.7 (Boeva et al., Bioinformatics 2011) with windows of 2000 bp and 1000 bp, respectively, and auto-correction of normal contamination of tumor samples for Control-FREEC. Finally we explored structural variants (SVs) including deletions, inversions, tandem duplications and translocations using DELLY-v0.5.5 with standard parameters (Rausch et al., Oxford J, 2012). In tumors, at least 10 supporting reads were required to make a call and 5 supporting reads for the sample NB0175 with a coverage of only 40X (see table 2). To predict SVs in constitutional samples for subsequent somatic filtering, only 2 supporting reads were required in order not to miss any. To identify somatic events, all the SVs in each normal sample were first flanked by 500 bp in both directions, and any SV called in a tumor sample that fell within the combined flanked regions of the respective normal sample was removed (see graph 1). Deletions impacting more than 5 genes or larger than 1 Mb, and inversions or tandem duplications covering more than 4 genes, were removed. We focused on exonic and splicing events for deletions, inversions, and tandem duplications. 
For translocations, we kept all SVs that occurred in intronic, exonic, 5'UTR, upstream or splicing regions. Bioinformatic detection of variations with the deep sequencing approach: once PE reads had been merged and adaptors trimmed by SeqPrep with default parameters, merged reads were aligned with BWA (Li H. and Durbin R. 2009 PMID 19451168), allowing up to 1 difference in the 22-base-long seeds and reporting only unique alignments. Only reads with a mapping quality of 20 or more were analysed further. Variant calling software was not used, since we aimed to predict variations at low frequencies, observed in less than 1% of reads. Such variants require a custom approach. Using DepthOfCoverage functions of the Genome Analysis Toolkit (GATK) v2.13.2 (McKenna A, et al., 2010 Genome Research PMID: 20644199), we focused on high quality coverage of bases A, C, G and T at the targeted variant position. For each base, only coverage with a mapping quality higher than 20 and a base quality higher than 10 was taken into account, in order to focus only on high quality data. To determine the background level of variability in the studied regions, 10 control samples were included in the analysis. The same approach and filtering criteria as introduced above were applied over the entire amplicons. In order to highlight variants, for each sample the frequencies of each base at each amplicon position were then compared to those observed in the set of controls. Statistical analyses were performed with the R statistical software (http://www.R-project.org). Fisher’s exact two-sided tests with a Bonferroni correction were performed to compare percentages of bases between the data sets, i.e. for a given base between a case and the controls. 
Finally, significant variations were retained when (i) a significant increase in the percentage of a variant base and (ii) a significant decrease in the percentage of its reference base were observed according to our p-value criterion (p < 0.05). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  25 
 
  
    EGAD00001001357 
   
  
    
    Genomic characterisation of a large series of cancer cell lines. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  462 
 
  
    EGAD00001001358 
   
  
    
    463 newly diagnosed patients from the UK Myeloma XI clinical trial (NCT01554852) underwent whole exome sequencing plus targeted capture of the IGH/K/L and MYC loci. 200 ng of DNA were processed using NEBNext DNA library preparation kit and hybridised to the SureSelect Human All Exon V5 Plus. Four samples were pooled and run on one lane of a HiSeq 2000 using 76-bp paired end reads. DNA from CD138+ selected bone marrow cells (myeloma tumour) as well as peripheral white blood cells were analysed and somatic mutations detected. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  926 
 
  
    EGAD00001001359 
   
  
    
    Dataset contains Exome-seq and RNA-seq from 2 GBM patients, as well as RNA-seq from the derived cultured cells (GNS). 
    
   
  
    
   
  6 
 
  
    EGAD00001001360 
   
  
    
    The majority of neuroblastoma patients have tumors that initially respond to chemotherapy, but a large proportion of patients will experience therapy-resistant relapses. The molecular basis of this aggressive phenotype is unknown. Whole genome sequencing of 23 paired diagnostic and relapsed neuroblastomas showed clonal evolution from the diagnostic tumor with a median of 29 somatic mutations unique to the relapse sample. Eighteen of the 23 relapse tumors (78%) showed RAS-MAPK pathway mutations. Seven events were detected only in the relapse tumor while the others showed clonal enrichment. In neuroblastoma cell lines we also detected a high frequency of activating mutations in the RAS-MAPK pathway (11/18, 61%) and these lesions predicted for sensitivity to MEK inhibition in vitro and in vivo. Our findings provide a rationale for genetic characterization of relapse neuroblastoma and show that RAS-MAPK pathway mutations may function as a biomarker for new therapeutic approaches to refractory disease. 
    
   
  
    
   
  221 
 
  
    EGAD00001001363 
   
  
    
    To generate an RNA-Seq dataset for organoids apically stimulated with Salmonella Typhimurium. These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001001364 
   
  
    
    This dataset contains whole exome data from 8 esophageal adenocarcinoma tumors that have been subjected to multiregion sequencing, ranging from 3-8 regions per tumor. In total, 40 tumor samples and 8 normal blood samples have been sequenced on Illumina HiSeq 2500 at a median depth of 90x. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  47 
 
  
    EGAD00001001372 
   
  
    
    All humans outside Africa are descendants of the same single exit, usually dated at 50-70 thousand years ago. However, the route taken out of Africa is still debated. The two main candidates are a northern route via Egypt and the Levant, or a southern route via Ethiopia and the Arabian Peninsula. We are generating genetic data to evaluate these two possibilities. In this study we propose to generate low-coverage sequencing data for 100 Egyptian samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001001373 
   
  
    
    The mtDNA and Y chromosome of up to 15 Australian Aborigines,  concentrating on individuals with indigenous lineages, will be sequenced using the standard whole-genome sequencing followed by filtering out of autosomal and X sequences, so that only mtDNA and the Y chromosome will be analysed and released. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001374 
   
  
    
    The mtDNA and Y chromosome of up to 15 Australian Aborigines, concentrating on individuals with indigenous lineages, will be sequenced using the standard whole-genome sequencing followed by filtering out of autosomal and X sequences, so that only mtDNA and the Y chromosome will be analysed and released. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001001375 
   
  
    
    Samples will be from the BRF113683 (BREAK-3) study which is a Phase III Randomized, Open-label Study Comparing GSK2118436 to Dacarbazine (DTIC) in Previously Untreated Subjects With BRAF Mutation Positive Advanced (Stage III) or Metastatic (Stage IV) Melanoma (n=250 enrolled). NGS: Agilent capture (Sanger V2 panel), 360 genes and 20 gene fusions; Illumina HiSeq sequencing. CNV: via NGS or Affy SNP 6.0 or Illumina Omni (TBD). Bioinformatics: Analysis will be performed using core Sanger informatics pipelines similar to those previously described (Papaemmanuil E et al. (2013) Blood. 22:3616-3627). Briefly, copy number analysis will be performed using the ASCAT algorithm, and base substitutions, small insertions and deletions using the CAVEMAN and Pindel algorithms, respectively. Statistical approaches including generalized linear models will be used to predict clinical variables such as maximum clinical response and duration of response using genetic data. Sanger and EBI to conduct analysis; raw data and correlation with clinical endpoints to be analyzed by both EBI/Sanger and GSK (unique pipeline analyses to increase call confidence). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  169 
 
  
    EGAD00001001379 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001001380 
   
  
    
    All humans outside Africa are descendants of the same single exit, usually dated at 50-70 thousand years ago. However, the route taken out of Africa is still debated. The two main candidates are a northern route via Egypt and the Levant, or a southern route via Ethiopia and the Arabian Peninsula. We are generating genetic data to evaluate these two possibilities. In this study we propose to generate high-coverage sequencing data for 3 Egyptian samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001381 
   
  
    
    This dataset includes 69 samples of whole-exome sequencing data of high-grade serous ovarian carcinoma (HGSOC). We included patients with advanced (International Federation of Gynecology and Obstetrics [FIGO] stage III-IV) HGSOC for which biopsies were obtained during debulking surgery, the first at initial diagnosis and the second at disease relapse. Where possible, matched normal DNA from each participating patient was obtained from a whole-blood sample. Written informed consent was obtained from all patients and approved by the local ethics committee. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  69 
 
  
    EGAD00001001382 
   
  
    
    TwinsUK whole exome sequencing using NimbleGen SeqCap EZ 
    
   
  
    
   
  248 
 
  
    EGAD00001001383 
   
  
    
    TwinsUK whole exome sequencing using NimbleGen 2.1M SeqCap 
    
   
  
    
   
  242 
 
  
    EGAD00001001384 
   
  
    
    Mutations that activate the RAF-MEK-ERK signaling pathway, in particular BRAFV600E, occur in many cancers, and mutant BRAF-selective inhibitors have clinical activity in these diseases. Activating BRAF alleles are usually considered to be mutually exclusive with mutant RAS, whereas inactivating mutations in the D594F595G596 motif of the BRAF activation segment can coexist with oncogenic RAS and cooperate via paradoxical MEK/ERK activation. We determined the functional consequences of a largely uncharacterized BRAF mutation, F595L, which was detected along with an HRASQ61R allele by clinical exome sequencing in a patient with histiocytic sarcoma and also occurs in epithelial cancers, melanoma, and neuroblastoma, and investigated its interaction with mutant RAS. We demonstrate that, unlike previously described DFG motif mutants, BRAFF595L is a gain-of-function variant with intermediate activity towards MEK that does not act paradoxically, but nevertheless cooperates with mutant RAS to promote oncogenic signaling. Of immediate clinical relevance, BRAFF595L shows divergent responses to different mutant BRAF-selective inhibitors, whereas signaling driven by BRAFF595L with and without mutant RAS is efficiently blocked by pan-RAF and MEK inhibitors. Mutation data from primary patient samples and cell lines show that BRAFF595L, as well as other BRAF mutations with intermediate activity, frequently coincide with mutant RAS in a broad spectrum of cancers. These data define a novel class of activating BRAF mutations that cooperate with oncogenic RAS in a non-paradoxical fashion to achieve an optimal level of MEK-ERK signaling, extend the spectrum of patients with systemic histiocytic disorders and other malignancies who are candidates for therapeutic blockade of the RAF-MEK-ERK pathway, and underscore the value of comprehensive genetic profiling for understanding the signaling requirements of individual cancers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001001385 
   
  
    
    Exome sequencing in 3 Möbius patients 
    
   
  
    
      
      AB SOLiD 4 System 
      
    
   
  3 
 
  
    EGAD00001001386 
   
  
    
    Whole Genome Sequencing of Huh7 cell lines 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001001387 
   
  
    
    Using high-throughput sequencing technologies and analytical tools, we conduct an exome sequencing study that will help understand the population genetics of a Croatian island isolate, in a sample of 200 subjects from the Adriatic island of Vis who were selected to reflect islanders with at least four known ancestors in the grandparental line who are original islanders. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  193 
 
  
    EGAD00001001388 
   
  
    
    Whole-genome bisulfite sequencing (WGBS) on 30 breast cancer cases from the BASIS project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001001389 
   
  
    
    A genome-wide CRISPR screen was performed to find resistance to targeted drugs for melanoma and lung 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001001390 
   
  
    
    Human monocytes from a healthy male blood donor were obtained after written informed consent and anonymised. Library preparation was performed essentially as described in the “Whole‐genome Bisulfite Sequencing for Methylation Analysis (WGBS)” protocol as released by Illumina. The library was sequenced on an Illumina HiSeq2500 using 101 bp paired-end sequencing. Read mapping was done with BWA. 
    
   
  
    
   
  1 
 
  
    EGAD00001001391 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001393 
   
  
    
    The aim of this study is to assess translational changes in macrophages over a time course of Salmonella infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001001394 
   
  
    
    Samples from Ross-Innes et al. 2015 - doi:10.1038/ng.3357 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001001395 
   
  
    
    Background: Invasive lobular breast cancer (ILBC) is the second most common histological subtype after ductal breast cancer (IDBC). In spite of significant clinical and pathological differences, ILBC is still treated as IDBC. Here, we aimed at identifying recurrent genomic alterations in ILBC with potential clinical implications. Methods: Starting from 630 ILBC primary tumors with a median follow up of 10 years, we interrogated oncogenic substitutions and indels of 360 cancer genes and genome-wide copy number alterations in 413 and 170 ILBC samples, respectively, and correlated those findings with clinical, pathological, and outcome features. The Cancer Genome Atlas database was used for comparison of frequency estimates. Results: Besides the high mutation frequency of CDH1 in 65% of the tumors, alterations in one of the three key genes of the PI3K pathway, PIK3CA, PTEN and AKT1, were present in more than half of the cases. ERBB2 and ERBB3 were mutated in 5.1 and 3.6% of the tumors. FOXA1 mutations and ESR1 copy number gains were detected in 9% and 25% of the samples. All these alterations were more frequent in ILBC than IDBC. The histological diversity of ILBC was associated with specific genomic alterations, such as enrichment for ERBB2 mutations in the mixed, non-classic subtype, and for ARID1A mutations and ESR1 gains in the solid subtype. Finally, ERBB2 and AKT1 mutations were associated with short-term risk of relapse, and chromosome 1q and 11p gain with increased and decreased breast cancer free survival, respectively. Conclusion: ERBB2, ERBB3 and AKT1 mutations represent high prevalence therapeutic targets in ILBC. FOXA1 mutations and ESR1 gains urgently deserve dedicated clinical investigation, especially in the context of endocrine treatment. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  541 
 
  
    EGAD00001001397 
   
  
    
    We sequenced 292 patients with NSCLC using whole genome sequencing or exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  72 
 
  
    EGAD00001001398 
   
  
    
    We sequenced 205 patients with NSCLC using exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  147 
 
  
    EGAD00001001399 
   
  
    
    Data represent genome-wide DNA methylation profiles obtained by MethylCap-seq (Diagenode’s MethylCap-kit based purification followed by Illumina GAIIx sequencing),  for 70 brain tissue samples, including 65 glioblastoma samples and 5 non-tumoral tissues (obtained from epilepsy surgery). 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  70 
 
  
    EGAD00001001400 
   
  
    
    Fastq data for whole genome shotgun sequencing assay for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001001401 
   
  
    
    Fastq data for smRNA-Seq assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001001402 
   
  
    
    Fastq data for stranded mRNA-Seq assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001001403 
   
  
    
    Fastq data for ChIP-Seq (H3K27ac) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001404 
   
  
    
    Fastq data for ChIP-Seq (H3K27me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001405 
   
  
    
    Fastq data for ChIP-Seq (H3K36me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001406 
   
  
    
    Fastq data for ChIP-Seq (H3K4me1) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001407 
   
  
    
    Fastq data for ChIP-Seq (H3K4me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001408 
   
  
    
    Fastq data for ChIP-Seq (H3K9me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001409 
   
  
    
    Fastq data for ChIP-Seq (Input) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001001410 
   
  
    
    Whole-exome sequencing of 81 tumor/normal pairs of adult T-cell leukemia/lymphoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  162 
 
  
    EGAD00001001411 
   
  
    
    RNA sequencing of 57 tumor samples of adult T-cell leukemia/lymphoma as well as 3 samples of HTLV-1 carrier and 3 samples of healthy volunteers. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  63 
 
  
    EGAD00001001412 
   
  
    
    Whole genome sequencing of 48 tumor/normal pairs obtained from adult T-cell leukemia/lymphoma.  This data set includes 11 full-pass WGS and 37 low-pass WGS data. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  96 
 
  
    EGAD00001001413 
   
  
    
    DDD DATAFREEZE 2013-12-18: 1133 trios - README, family trios, phenotypes, validated DNMs (Ref: DDD Nature 2015) 
    
   
  
    
   
  - 
 
  
    EGAD00001001415 
   
  
    
    DATA FILES FOR PCGP Dyer_iPSC WGS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001416 
   
  
    
    DATA FILES FOR PCGP Dyer_iPSC TEBS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001001417 
   
  
    
    bam files associated with the study EGAS00001001205 
    
   
  
    
   
  6 
 
  
    EGAD00001001418 
   
  
    
    DATA FILES FOR PCGP Dyer_iPSC 5hmc 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001421 
   
  
    
    Clinical Implications of Genomic Alterations in the Tumour and Circulation of Pancreatic Cancer Patients 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  125 
 
  
    EGAD00001001422 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Exome Sequencing - April 2015 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001423 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001424 
   
  
    
    We obtained paired longitudinal specimens from a total of 38 glioblastoma (GBM) patients (34 primary and 4 secondary GBM patients). Treatment-naive initial tumors were available for 35 cases; for the other 3 cases, we used the first available recurrent tumors in lieu of initial tumors. Tumor specimens were subjected to whole-exome sequencing (27 of 38 cases, with the matched normal/blood for 22 of the 27 cases) and transcriptome sequencing (30 of 38 cases). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  141 
 
  
    EGAD00001001425 
   
  
    
    The objectives of this project are to identify markers related to cancer therapy resistance in the blood of breast cancer patients and to study the genetic changes in cancer cells during the development of this resistance. Whole genome amplified DNA from Circulating Tumor Cells (CTCs), selected during the course of systemic treatment from blood of metastatic breast cancer patients, will be exome sequenced. The patients selected for this study did not respond to therapy. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  149 
 
  
    EGAD00001001426 
   
  
    
    Systematic next generation sequencing efforts are beginning to define the genomic landscape across a range of primary tumours, but we know very little of the mutational evolution that contributes to disease progression.
We therefore propose to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in a cohort of matched primary and metastatic colorectal cancers, and additionally to explore the extent to which those mutations identified as recurrent in the metastatic setting are able to subvert normal biological processes using both genetically engineered mouse models and established cancer cell lines. This study will enable us to define to what extent primary tumour profiling can capture the biological processes operative in matched metastases as well as the significance of intratumoural heterogeneity.
This dataset contains all the data available for this study on 2015-07-02. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  446 
 
  
    EGAD00001001427 
   
  
    
    Targeted cancer gene sequencing of samples enrolled in the SSGXVIII trial from Finland. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  312 
 
  
    EGAD00001001428 
   
  
    
    Identification of human deubiquitylating enzymes whose knockout results in hypersensitivity to DNA-damaging agents, by comparing the sequence reads of the 'barcode region' from mixed cell culture. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001429 
   
  
    
    Profiling subclonal architecture and phylogeny in tumors by whole-genome sequence data mining and single-cell genome sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001001430 
   
  
    
    Investigation into causal genes underlying anaplastic meningioma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  73 
 
  
    EGAD00001001431 
   
  
    
    SCLC - RNA sequencing data Publication Peifer et al., 2012, Nature Genetics 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001001432 
   
  
    
    PCGP Germline Study Whole Genome Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1337 
 
  
    EGAD00001001433 
   
  
    
    PCGP Germline Study Whole Exome Sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  906 
 
  
    EGAD00001001435 
   
  
    
    Aligned whole genome bisulfite sequencing data for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
   
  30 
 
  
    EGAD00001001436 
   
  
    
   
  
    
      
      AB 5500 Genetic Analyzer 
      
    
   
  4 
 
  
    EGAD00001001437 
   
  
    
    HipSci - Healthy Normals - Exome Sequencing - April 2015 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  122 
 
  
    EGAD00001001438 
   
  
    
    HipSci - Healthy Normals - RNA Sequencing - May 2015 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  116 
 
  
    EGAD00001001439 
   
  
    
    Mammary cell samples from donors 28/32/33. Contains 12 MiSeq sequence files and 12 alignment files derived from HiSeq runs. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  12 
 
  
    EGAD00001001440 
   
  
    
    This project entailed generation of high depth WGS (30x) of 100 individuals from the general Greek population. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  100 
 
  
    EGAD00001001441 
   
  
    
    Despite the established role of the transcription factor MYC in cancer, little is known about the impact of a new class of transcriptional regulators, the long non-coding RNAs (lncRNAs), on the way MYC is able to influence the cellular transcriptome. To this aim we have intersected RNA-sequencing data from two MYC-inducible cell lines and from a cohort of 91 mature B-cell lymphomas carrying, or not carrying, genetic variants resulting in MYC over-expression. By this approach, we identified 13 lncRNAs differentially expressed in IG-MYC-positive Burkitt lymphoma and regulated in the same direction by MYC in the model cell lines. Among them we focused on a lncRNA that we named MINCR, for MYC-Induced long Non-Coding RNA, showing a strong correlation with MYC expression in MYC-positive lymphomas and also in pancreatic ductal adenocarcinomas. To understand its cellular role we performed RNA interference (RNAi) experiments and found that MINCR knock-down is associated with a reduction in cellular viability, due to an impairment in cell cycle progression. Differential gene expression analysis following RNAi showed a strongly significant enrichment of cell cycle genes among the genes down-regulated following MINCR knock-down. Interestingly, these genes are enriched in MYC binding sites in their promoters, suggesting that MINCR acts as a modulator of the MYC transcriptional program. Accordingly, following MINCR knock-down, we observed a reduction in the binding of MYC to the promoters of selected cell cycle genes. Finally, we provide evidence that down-regulation of AURKA, AURKB and CDT1 may explain the reduction in cellular proliferation observed upon MINCR knock-down. We therefore suggest that MINCR is a newly identified player in the MYC transcriptional network able to control the expression of cell cycle genes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  49 
 
  
    EGAD00001001442 
   
  
    
    This project is to explore the contribution of de novo mutations to severe structural malformations diagnosed prenatally using ultrasound. These malformations include heart, CNS, renal and GI abnormalities. In this pilot project we aim to exome sequence 30 parent-foetus trios to ~50X mean coverage and identify de novo functional variants using an algorithm developed in the Hurles group 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001001443 
   
  
    
    RNASeq sequencing.
Each library was sequenced using TruSeq SBS Kit v3-HS, in paired-end mode with a read length of 2 × 76 bp. We generated more than 20 million paired-end reads for each sample in a fraction of a sequencing lane on HiSeq2000 (Illumina Inc.) following the manufacturer’s protocol. Image analysis, base calling and quality scoring of the run were processed using the manufacturer’s software Real Time Analysis (RTA 1.13.48) and followed by generation of FASTQ sequence files. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  199 
 
  
    EGAD00001001444 
   
  
    
    Atypical teratoid/rhabdoid tumor (ATRT) is one of the most common brain tumors in infants and young children. Although the prognosis of ATRT patients is poor, some patients respond very well to current treatments, suggesting inter-tumor molecular heterogeneity. To investigate this further, we genetically and epigenetically analyzed a large cohort of ATRTs (n = 170). Three distinct molecular subgroups of ATRTs, associated with differences in demographics, tumor location and type of SMARCB1 alterations, were identified using DNA-methylation or gene expression analyses. Whole genome DNA- and RNA-sequencing found no other recurrent mutations explaining the differences between subgroups. However, whole genome bisulfite-sequencing and H3K27Ac ChIP-sequencing of primary tumors revealed clear differences in methylation patterns and enhancer landscapes, leading to the identification of subgroup-specific regulatory networks. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  55 
 
  
    EGAD00001001445 
   
  
    
    Deep sequencing of melanoma for driver mutations 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  3 
 
  
    EGAD00001001446 
   
  
    
    Genomic and transcriptomic characterization of drug-resistant colon cancer stem cell lines. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001447 
   
  
    
    Whole genome sequencing of single cell derived organoids from normal colon tissue and colorectal cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  73 
 
  
    EGAD00001001448 
   
  
    
    Testing the feasibility of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  11 
 
  
    EGAD00001001449 
   
  
    
    PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, PCR products were pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001001450 
   
  
    
    This study is to ascertain whether it is feasible to extract a single cell from a tumour, perform amplification, generate a library and sequence a targeted pulldown. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001451 
   
  
    
    JMML targeted sequencing of candidate genes 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  75 
 
  
    EGAD00001001452 
   
  
    
    Anaplastic oligodendrogliomas (AOs) are rare primary brain tumors which are generally incurable, with heterogeneous prognosis and few treatment targets identified. Most oligodendrogliomas have chromosome 1p/19q co-deletion and IDH mutation. We analyzed 51 AOs by whole-exome sequencing, identifying previously reported frequent somatic mutations in CIC and FUBP1. We also identified recurrent mutations in TCF12 and in an additional series of 83 AO. Overall 7.5% of AO are mutated for TCF12, which encodes an oligodendrocyte-related transcription factor. 80% of TCF12 mutations identified were in either the bHLH domain, which is important for TCF12 function as a transcription factor, or were frame shift mutations leading to TCF12 truncated for this domain. We show that these mutations compromise TCF12 transcriptional activity and are associated with a more aggressive tumor type. Our analysis provides further insights into the unique and shared pathways driving AO. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  102 
 
  
    EGAD00001001453 
   
  
    
    The project is to evaluate the genomic binding sites of the histone demethylase JARID1C. This gene was recently identified in CGP as a novel recessive cancer gene in human renal cell carcinoma. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  4 
 
  
    EGAD00001001454 
   
  
    
    Previously we performed deep WGS on 6 parents and 13 children from 3 large families from the Scottish Family Health Study to identify de novo mutations. This prelim is to cover the additional sequencing of one grandchild from one of these three families. The inclusion of a third generation individual will provide additional experimental validation for the de novo mutations found in the initial trio. As in the previous study, the DNA will be WGS to a depth of approximately 25X to achieve this purpose. These data can only be used for the investigation of the genetic causes of the reported clinical phenotypes in these patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001456 
   
  
    
    1000Genomes imputed data set of 581 cases and 417 controls for male-pattern baldness 
    
   
  
    
   
  1 
 
  
    EGAD00001001457 
   
  
    
    All samples from the "100" project 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001001458 
   
  
    
    Whole genome sequencing of EBV-transformed B cells in order to determine whether EBV induction of activation-induced cytidine deaminase (AID) produces genome-wide mutations and/or chromosomal rearrangements. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001001459 
   
  
    
    Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer. 
This dataset contains all the data available for this study on 2015-08-05. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  76 
 
  
    EGAD00001001460 
   
  
    
    Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH).
This dataset contains all the data available for this study on 2015-08-05. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  62 
 
  
    EGAD00001001461 
   
  
    
    CBP has opposing functions during cerebellar development and is a targetable tumor suppressor at late stages of medulloblastoma initiation 
    
   
  
    
   
  30 
 
  
    EGAD00001001462 
   
  
    
    Exome sequencing of 142 samples with corresponding Sanger sequencing results for 416 variants and 288 negative sites. DNA library preps prepared with Illumina TruSeq sample preparation kit. The captured DNA libraries were PCR amplified using the supplied paired-end PCR primers. Sequencing was performed with an Illumina HiSeq2000 (SBS Kit v3, one pool per lane) generating 2x101-bp reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  142 
 
  
    EGAD00001001464 
   
  
    
    Exome Sequencing.
3 μg of genomic DNA from each sample were sheared and used for the construction of a paired-end sequencing library as described in the paired-end sequencing sample preparation protocol provided by Illumina. Enrichment of exonic sequences was then performed for each library using either the Sure Select Human All Exon 50 Mb or All Exon+UTRs v4 kits following the manufacturer’s instructions (Agilent Technologies). Exon-enriched DNA was pulled down by magnetic beads coated with streptavidin (Invitrogen), followed by washing, elution and 18 additional cycles of amplification of the captured library. Enriched libraries were sequenced (2 × 76 bp) in one lane of an Illumina GAIIx sequencer or in two lanes of a HiSeq2000 when using pools of eight samples. 
    
   
  
    
   
  - 
 
  
    EGAD00001001465 
   
  
    
    18 Exomes for discovery set and 60 Targeted panel for prevalence set 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  127 
 
  
    EGAD00001001466 
   
  
    
    Whole Genome sequencing.
2 μg of genomic DNA from each sample was used for the construction of two short-insert paired-end sequencing libraries. Both types of libraries were sequenced in paired-end mode on Illumina GAIIx (2 × 151 bp) using Sequencing kit v4 or Illumina HiSeq2000 (2x101 bp) using TruSeq SBS Kit v3. 
    
   
  
    
   
  - 
 
  
    EGAD00001001467 
   
  
    
    WGS of 8 trios - affected child and both normal parents 
    
   
  
    
   
  24 
 
  
    EGAD00001001468 
   
  
    
    PAR-CLIP was performed on the Argonaute-2 protein (AGO2) in four lymphoma cell lines: Namalwa, Raji, SU-DHL-4, and SU-DHL-6. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001001469 
   
  
    
    RNA-Seq data for 1 T-cell acute leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001470 
   
  
    
    ChIP-Seq data for 2 plasma cell sample(s). 13 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001471 
   
  
    
    RNA-Seq data for 11 Multiple myeloma sample(s). 11 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001001472 
   
  
    
    ChIP-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001473 
   
  
    
    Bisulfite-Seq data for 2 cytotoxic CD56-dim natural killer cell sample(s). 24 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001474 
   
  
    
    RNA-Seq data for 14 mature neutrophil sample(s). 14 run(s), 14 experiment(s), 14 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001001475 
   
  
    
    DNase-Hypersensitivity data for 1 CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001476 
   
  
    
    DNase-Hypersensitivity data for 4 CD14-positive, CD16-negative classical monocyte sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001477 
   
  
    
    RNA-Seq data for 3 neutrophilic myelocyte sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001478 
   
  
    
    RNA-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001479 
   
  
    
    Bisulfite-Seq data for 1 memory B cell sample(s). 20 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001480 
   
  
    
    RNA-Seq data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001481 
   
  
    
    ChIP-Seq data for 15 Acute Myeloid Leukemia sample(s). 75 run(s), 72 experiment(s), 72 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001001482 
   
  
    
    Bisulfite-Seq data for 6 Acute Myeloid Leukemia sample(s). 66 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001483 
   
  
    
    RNA-Seq data for 1 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001484 
   
  
    
    Bisulfite-Seq data for 2 erythroblast sample(s). 35 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001485 
   
  
    
    ChIP-Seq data for 3 Acute Myeloid Leukemia - SAHA sample(s). 11 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001486 
   
  
    
    Bisulfite-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001487 
   
  
    
    ChIP-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 12 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001488 
   
  
    
    RNA-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001489 
   
  
    
    RNA-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001490 
   
  
    
    ChIP-Seq data for 6 Acute promyelocytic leukemia sample(s). 29 run(s), 27 experiment(s), 27 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001491 
   
  
    
    Bisulfite-Seq data for 6 inflammatory macrophage sample(s). 83 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001492 
   
  
    
    RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001493 
   
  
    
    Bisulfite-Seq data for 1 hematopoietic multipotent progenitor cell sample(s). 5 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001494 
   
  
    
    Bisulfite-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001495 
   
  
    
    ChIP-Seq data for 4 neutrophilic metamyelocyte sample(s). 18 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001496 
   
  
    
    RNA-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001497 
   
  
    
    Bisulfite-Seq data for 2 conventional dendritic cell sample(s). 30 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001498 
   
  
    
    Bisulfite-Seq data for 5 alternatively activated macrophage sample(s). 79 run(s), 5 experiment(s), 5 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001499 
   
  
    
    ChIP-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 9 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001500 
   
  
    
    RNA-Seq data for 2 CD38-negative naive B cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001501 
   
  
    
    RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001502 
   
  
    
    ChIP-Seq data for 2 germinal center B cell sample(s). 12 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001503 
   
  
    
    ChIP-Seq data for 1 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001504 
   
  
    
    RNA-Seq data for 3 band form neutrophil sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001505 
   
  
    
    ChIP-Seq data for 7 CD4-positive, alpha-beta T cell sample(s). 39 run(s), 39 experiment(s), 39 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001506 
   
  
    
    RNA-Seq data for 8 CD14-positive, CD16-negative classical monocyte sample(s). 8 run(s), 8 experiment(s), 8 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001507 
   
  
    
    Bisulfite-Seq data for 1 mature eosinophil sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001508 
   
  
    
    ChIP-Seq data for 9 mature neutrophil sample(s). 48 run(s), 45 experiment(s), 45 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001001509 
   
  
    
    Bisulfite-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001510 
   
  
    
    Bisulfite-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 36 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001511 
   
  
    
    ChIP-Seq data for 4 band form neutrophil sample(s). 18 run(s), 17 experiment(s), 17 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001001512 
   
  
    
    RNA-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001513 
   
  
    
    ChIP-Seq data for 5 CD8-positive, alpha-beta T cell sample(s). 26 run(s), 26 experiment(s), 26 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001514 
   
  
    
    ChIP-Seq data for 4 alternatively activated macrophage sample(s). 22 run(s), 22 experiment(s), 22 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001515 
   
  
    
    RNA-Seq data for 6 hematopoietic stem cell sample(s). 13 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001516 
   
  
    
    Bisulfite-Seq data for 3 CD4-positive, alpha-beta T cell sample(s). 61 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001517 
   
  
    
    ChIP-Seq data for 4 neutrophilic myelocyte sample(s). 14 run(s), 14 experiment(s), 14 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001518 
   
  
    
    ChIP-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 17 run(s), 17 experiment(s), 17 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001519 
   
  
    
    ChIP-Seq data for 6 naive B cell sample(s). 34 run(s), 28 experiment(s), 28 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001520 
   
  
    
    RNA-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001521 
   
  
    
    RNA-Seq data for 3 cytotoxic CD56-dim natural killer cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001522 
   
  
    
    Bisulfite-Seq data for 2 plasma cell sample(s). 17 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001523 
   
  
    
    RNA-Seq data for 4 plasma cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001524 
   
  
    
    DNase-Hypersensitivity data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001525 
   
  
    
    RNA-Seq data for 1 mature eosinophil sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001526 
   
  
    
    RNA-Seq data for 1 effector memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001527 
   
  
    
    ChIP-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 18 run(s), 18 experiment(s), 18 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001528 
   
  
    
    ChIP-Seq data for 1 Leukemia sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001529 
   
  
    
    Bisulfite-Seq data for 1 precursor B cell sample(s). 6 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001530 
   
  
    
    Bisulfite-Seq data for 1 Acute Myeloid Leukemia - CTR sample(s). 18 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001531 
   
  
    
    RNA-Seq data for 1 class switched memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001532 
   
  
    
    RNA-Seq data for 4 monocyte - None sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001533 
   
  
    
    ChIP-Seq data for 4 Acute Myeloid Leukemia - CTR sample(s). 21 run(s), 21 experiment(s), 21 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001534 
   
  
    
    RNA-Seq data for 5 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 23 run(s), 5 experiment(s), 5 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001535 
   
  
    
    RNA-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001536 
   
  
    
    ChIP-Seq data for 1 Acute Myeloid Leukemia - MC2884 sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001537 
   
  
    
    Bisulfite-Seq data for 3 Acute promyelocytic leukemia sample(s). 24 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001538 
   
  
    
    RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001539 
   
  
    
    ChIP-Seq data for 2 mature eosinophil sample(s). 12 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001540 
   
  
    
    RNA-Seq data for 1 conventional dendritic cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001541 
   
  
    
    Bisulfite-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001542 
   
  
    
    RNA-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001543 
   
  
    
    RNA-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001544 
   
  
    
    RNA-Seq data for 10 CD4-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001545 
   
  
    
    DNase-Hypersensitivity data for 1 alternatively activated macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001546 
   
  
    
    RNA-Seq data for 1 regulatory T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001547 
   
  
    
    RNA-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001548 
   
  
    
    Bisulfite-Seq data for 2 class switched memory B cell sample(s). 21 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001549 
   
  
    
    DNase-Hypersensitivity data for 1 Acute Myeloid Leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001550 
   
  
    
    RNA-Seq data for 7 erythroblast sample(s). 29 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001551 
   
  
    
    RNA-Seq data for 1 Leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001552 
   
  
    
    ChIP-Seq data for 9 CD14-positive, CD16-negative classical monocyte sample(s). 56 run(s), 53 experiment(s), 53 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001001553 
   
  
    
    Bisulfite-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 13 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001554 
   
  
    
    ChIP-Seq data for 1 adult endothelial progenitor cell sample(s). 8 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001555 
   
  
    
    RNA-Seq data for 7 Acute promyelocytic leukemia sample(s). 7 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001556 
   
  
    
    Bisulfite-Seq data for 1 naive B cell sample(s). 5 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001557 
   
  
    
    ChIP-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 7 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001558 
   
  
    
    RNA-Seq data for 5 common lymphoid progenitor sample(s). 20 run(s), 5 experiment(s), 5 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001559 
   
  
    
    ChIP-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 11 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001560 
   
  
    
    DNase-Hypersensitivity data for 2 monocyte sample(s). 4 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001561 
   
  
    
    RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001562 
   
  
    
    ChIP-Seq data for 5 Chronic lymphocytic leukemia sample(s). 24 run(s), 23 experiment(s), 23 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001563 
   
  
    
    Bisulfite-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001564 
   
  
    
    Bisulfite-Seq data for 1 regulatory T cell sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001565 
   
  
    
    Bisulfite-Seq data for 1 monocytes - T=0days sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001566 
   
  
    
    RNA-Seq data for 2 neutrophilic metamyelocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001567 
   
  
    
    Bisulfite-Seq data for 1 effector memory CD4-positive, alpha-beta T cell sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001568 
   
  
    
    ChIP-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001569 
   
  
    
    ChIP-Seq data for 1 Acute lymphocytic leukemia - CTR sample(s). 7 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001570 
   
  
    
    ChIP-Seq data for 1 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001571 
   
  
    
    Bisulfite-Seq data for 4 CD8-positive, alpha-beta T cell sample(s). 56 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001572 
   
  
    
    RNA-Seq data for 4 monocyte sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001573 
   
  
    
    DNase-Hypersensitivity data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001574 
   
  
    
    ChIP-Seq data for 2 erythroblast sample(s). 12 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001575 
   
  
    
    Bisulfite-Seq data for 8 macrophage sample(s). 117 run(s), 8 experiment(s), 8 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001001576 
   
  
    
    ChIP-Seq data for 12 macrophage sample(s). 49 run(s), 49 experiment(s), 49 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001001577 
   
  
    
    ChIP-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001578 
   
  
    
    ChIP-Seq data for 1 mesenchymal stem cell of the bone marrow sample(s). 9 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001579 
   
  
    
    RNA-Seq data for 3 segmented neutrophil of bone marrow sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001580 
   
  
    
    ChIP-Seq data for 2 monocyte sample(s). 6 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001001581 
   
  
    
    DNase-Hypersensitivity data for 16 macrophage sample(s). 20 run(s), 16 experiment(s), 16 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001001582 
   
  
    
    RNA-Seq data for 18 macrophage sample(s). 19 run(s), 18 experiment(s), 18 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001001583 
   
  
    
    Bisulfite-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001584 
   
  
    
    ChIP-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001585 
   
  
    
    Bisulfite-Seq data for 6 mature neutrophil sample(s). 79 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001586 
   
  
    
    RNA-Seq data for 4 alternatively activated macrophage sample(s). 6 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001587 
   
  
    
    Bisulfite-Seq data for 1 germinal center B cell sample(s). 6 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001588 
   
  
    
    ChIP-Seq data for 4 segmented neutrophil of bone marrow sample(s). 20 run(s), 19 experiment(s), 19 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001001589 
   
  
    
    ChIP-Seq data for 7 inflammatory macrophage sample(s). 36 run(s), 36 experiment(s), 36 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001590 
   
  
    
    Bisulfite-Seq data for 4 CD38-negative naive B cell sample(s). 44 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001591 
   
  
    
    Bisulfite-Seq data for 7 CD14-positive, CD16-negative classical monocyte sample(s). 101 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001592 
   
  
    
    ChIP-Seq data for 2 Multiple myeloma sample(s). 16 run(s), 14 experiment(s), 14 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001593 
   
  
    
    RNA-Seq data for 1 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001594 
   
  
    
    ChIP-Seq data for 6 CD38-negative naive B cell sample(s). 20 run(s), 20 experiment(s), 20 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001001595 
   
  
    
    ICGC PACA-CA Release 20 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  516 
 
  
    EGAD00001001596 
   
  
    
    Whole Exome Sequencing data from the germline of the patient as well as the tumors in bone marrow (T-ALL), Liver (Histiocytic Sarcoma) and ileum (non-Langerhans Cell Histiocytosis). 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  4 
 
  
    EGAD00001001598 
   
  
    
    RNA-sequencing data from the hT-RPE-MycER cell line after MYC activation and after MINCR knock-down in conditions of MYC ON or OFF 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001001600 
   
  
    
    PCR and MiSeq validation for early embryonic substitution candidates from 400 Breast cancer patients.
This dataset contains all the data available for this study on 2015-09-03. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  2 
 
  
    EGAD00001001601 
   
  
    
    The intersection of genome-wide association analyses with physiological and functional data indicates that variants regulating islet gene transcription influence type 2 diabetes (T2D) predisposition and glucose homeostasis. However, the specific genes through which these regulatory variants act remain poorly characterized. To identify such effector transcripts for T2D and glycemic traits, we generated expression quantitative trait locus (eQTL) data in 118 human islet samples using RNA-sequencing and high-density genotyping. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  118 
 
  
    EGAD00001001602 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001607 
   
  
    
    This dataset provides 16 trios (primary tumor, relapse and corresponding normal) for patients with neuroblastoma. For one patient, more than one relapse was available for the analyses. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001001608 
   
  
    
    Aligned BAM files of whole exome sequencing of 20 syCRCs and 10 normal counterparts. Each sample of 4 patients (S13, S3, S12 and S6) underwent two sequencing rounds. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001001609 
   
  
    
    Maternal Plasma RNA Sequencing for Genomewide Transcriptomic Profiling and Identification of Pregnancy-Associated Transcripts 
    
   
  
    
   
  14 
 
  
    EGAD00001001612 
   
  
    
    After overexpression and knockdown of both described novel miRs nmiR-1 and nmiR-2 in BL cell lines (SU-DHL4 for nmiR-1 and Raji for nmiR-2), we performed regular RNA-Seq (including Mock controls for all cell lines) to identify their direct and indirect downstream mRNA targets. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001001613 
   
  
    
   
  
    
   
  10 
 
  
    EGAD00001001614 
   
  
    
   
  
    
   
  26 
 
  
    EGAD00001001615 
   
  
    
   
  
    
   
  10 
 
  
    EGAD00001001616 
   
  
    
   
  
    
   
  2 
 
  
    EGAD00001001618 
   
  
    
    Sequence data from two medullary thyroid carcinoma patients: WGS datasets generated from tumors and matched normal tissues and RNA-Seq from tumors are included. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001001619 
   
  
    
    miRNA seq data of 43 cases out of dataset EGAD00001000650 (MMML) 
    
   
  
    
   
  43 
 
  
    EGAD00001001620 
   
  
    
    release_2: ICGC PedBrain: RNA sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  45 
 
  
    EGAD00001001621 
   
  
    
    release_2: ICGC PedBrain: ChIP-Seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  31 
 
  
    EGAD00001001622 
   
  
    
    BBMRI - BIOS project - Freeze 1 - Fastq files 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2199 
 
  
    EGAD00001001623 
   
  
    
    BBMRI - BIOS project - Freeze 1 - Bam files 
    
   
  
    
   
  2117 
 
  
    EGAD00001001624 
   
  
    
    release_2: ICGC PedBrain: whole exome sequencing and Target-Seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  188 
 
  
    EGAD00001001625 
   
  
    
    release_2: ICGC PedBrain: whole genome sequencing 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  209 
 
  
    EGAD00001001626 
   
  
    
    RNA-Seq Illumina GAII dataset for the TraIT cell-line use case (added reverse and forward reads). 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  6 
 
  
    EGAD00001001627 
   
  
    
    This dataset contains RNA sequencing raw data from four parental tumors that were used for classification of gene expression subtypes (Verhaak, Cancer Cell 2010) using ssGSEA. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001628 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  299 
 
  
    EGAD00001001629 
   
  
    
    Whole-genome somatic rearrangement and point mutation analysis in cell lines with induced telomere fusions. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001001630 
   
  
    
    release_2: ICGC PedBrain: whole genome bisulfite sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  108 
 
  
    EGAD00001001631 
   
  
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  334 
 
  
    EGAD00001001632 
   
  
    
    miRNA seq data of 13 cases (MMML) 
    
   
  
    
   
  13 
 
  
    EGAD00001001633 
   
  
    
    BAM files for two WES TRAIP patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001634 
   
  
    
    This dataset includes the whole genomes, sequenced to high depth (30x) of 25 individuals from Papua New Guinea. The individuals were chosen from several geographically distinct Papuan groups, focusing on the highland regions: Bundi, Kundiawa, Mendi, Marawaka and Tari. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  25 
 
  
    EGAD00001001635 
   
  
    
    Whole genome sequencing detected structural rearrangements of TERT in 17/75 high stage neuroblastoma with 5 cases resulting from chromothripsis. Rearrangements were associated with increased TERT expression and targeted immediate up- and down-stream regions of TERT, placing in 7 cases a super-enhancer close to the breakpoints. TERT rearrangements (23%), ATRX deletions (11%) and MYCN amplifications (37%) identify three almost non-overlapping groups of high stage neuroblastoma, each associated with very poor prognosis. This submission contains all newly sequenced samples only. study_refcenter AMC 
    
   
  
    
   
  42 
 
  
    EGAD00001001636 
   
  
    
    Whole-genome sequencing at 4x of 250 samples from the Greek isolate collection HELIC 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  250 
 
  
    EGAD00001001637 
   
  
    
    Whole-genome sequencing at 1x of samples from the Cretan Greek isolate collection HELIC-MANOLIS. Genome-wide association studies of complex traits have been successful in identifying common variant associations, but a substantial heritability gap remains. The field of complex trait genetics is shifting towards the study of low frequency and rare variants, which are hypothesised to have larger effects. The study of these variants can be empowered by focusing on isolated populations, in which rare variants may have increased in frequency and linkage disequilibrium tends to be extended. This work focuses on an isolated population from Crete, Greece. Sequencing is very efficient in isolated populations, because variants found in a few samples will be shared by others in extended haplotype contexts, supporting accurate imputation. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1003 
 
  
    EGAD00001001638 
   
  
    
    The HELIC study has been whole genome sequencing individuals from 2 Greek isolated populations at 1x depth. The genotype calling process crucially involves a VQSR step followed by imputation-based refinement. We have been investigating optimal ways to increase calling accuracy. To aid us in setting appropriate parameters for VQSR and other QC steps, we have carried out whole exome sequencing of a small number of HELIC samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001001639 
   
  
    
    Low depth (4x) Illumina HiSeq raw sequence data for 2000 Ugandans from various ethno-linguistic groups from rural South-West Uganda (related individuals included). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2000 
 
  
    EGAD00001001642 
   
  
    
    RIKEN collection of WGS reads of 530 liver cancer and matched blood samples from 260 donors. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  530 
 
  
    EGAD00001001643 
   
  
    
    RIKEN collection of WGS reads of 59 multi-centric liver cancers or intra-hepatic metastases and matched blood samples from 19 donors. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001001644 
   
  
    
    MicroRNAs (miRs) have been recognized as promising biomarkers. It is unknown to what extent tumor-derived miRs are differentially expressed between primary colorectal cancers (pCRCs) and metastatic lesions, and to what extent the expression profiles of tumor tissue differ from the surrounding normal tissue. Next-generation sequencing (NGS) of 220 fresh-frozen samples, including paired primary and metastatic tumor tissue and non-tumorous tissue from 38 patients, revealed expression of 2245 known unique mature miRs and 515 novel candidate miRs. Unsupervised clustering of miR expression profiles of pCRC tissue with paired metastases did not separate the two entities, whereas unsupervised clustering of miR expression profiles of pCRC with normal colorectal mucosa demonstrated complete separation of the tumor samples from their paired normal mucosa. Two hundred and twenty-two miRs differentiated both pCRC and metastases from normal tissue samples (false discovery rate (FDR) <0.05). The highest expressed tumor-specific miRs were miR-21 and miR-92a, both previously described to be involved in CRC with potential as circulating biomarker for early detection. Only eight miRs, 0.5% of the analysed miR transcriptome, were differentially expressed between pCRC and the corresponding metastases (FDR <0.1), consisting of five known miRs (miR-320b, miR-320d, miR-3117, miR-1246 and miR-663b) and three novel candidate miRs (chr 1-2552-5p, chr 8-20656-5p and chr 10-25333-3p). These results indicate that previously unrecognized candidate miRs expressed in advanced CRC were identified using NGS. In addition, miR expression profiles of pCRC and metastatic lesions are highly comparable and may be of similar predictive value for prognosis or response to treatment in patients with advanced CRC. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  125 
 
  
    EGAD00001001645 
   
  
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001001646 
   
  
    
    Fastq files corresponding to RNA-Seq dataset for PTPN1 project (EGAS00001000554) 
    
   
  
    
      
      Illumina Genome Analyzer 
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001655 
   
  
    
    Genome and transcriptome sequence data from an atypical teratoid rhabdoid tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001656 
   
  
    
    Genome and transcriptome sequence data from an atypical chronic lymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001001657 
   
  
    
    Genome and transcriptome sequence data from a parotid gland cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001658 
   
  
    
    Genome and transcriptome sequence data from an odontogenic ghost cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001001660 
   
  
    
    Whole exome sequencing was performed to explore the mutational landscape and potential molecular signature of HPV-positive versus HPV-negative OAC. Four hr-HPV-positive and 8 HPV-negative treatment-naive fresh-frozen OAC tissue specimens and matched normal tissue were analysed to identify somatic genomic mutations 
    
   
  
    
   
  24 
 
  
    EGAD00001001661 
   
  
    
    Genotype and exome data for an Australian Aboriginal population: a reference panel for health-based research. 
    
   
  
    
   
  72 
 
  
    EGAD00001001662 
   
  
    
    Whole genome sequences of ACC primagrafts, Histone modification maps and transcription factor binding maps for ACC primagrafts and primary tumors. Processed ChIP-seq data is available on GEO under accession number GSE76465. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  58 
 
  
    EGAD00001001663 
   
  
    
    Low coverage (4x-8x) Illumina HiSeq curated sequence data from 3 African populations from the AGV project; 100 Baganda from Uganda (4x), 100 Zulu from South Africa (4x), and 120 Gumuz, Wolayta, Oromo, Somali and Amhara from Ethiopia (8x). Pre-processed, jointly called and filtered with GATK, refined with Beagle3, phased with SHAPEIT2. 
    
   
  
    
   
  1 
 
  
    EGAD00001001664 
   
  
    
    LGG Epilepsy Cohort WGS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001001665 
   
  
    
    LGG Epilepsy Cohort WXS 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  61 
 
  
    EGAD00001001666 
   
  
    
    LGG Epilepsy Cohort RNA-Seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001001667 
   
  
    
    Data from the paper Context-specific Effects of TGFβ/SMAD3 in Cancer Are Modulated by the Epigenome. Tufegdzic et al, Cell Reports 2015 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  12 
 
  
    EGAD00001001668 
   
  
    
    Data from the paper Context-specific Effects of TGFβ/SMAD3 in Cancer Are Modulated by the Epigenome. Tufegdzic et al, Cell Reports 2015 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001001669 
   
  
    
    Data from the paper Context-specific Effects of TGFβ/SMAD3 in Cancer Are Modulated by the Epigenome. Tufegdzic et al, Cell Reports 2015 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001001672 
   
  
    
    Part of RNA sequencing data of Malignant Lymphoma Study (ICGC) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001001673 
   
  
    
    Part of WGS seq data of Malignant Lymphoma study (ICGC) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  112 
 
  
    EGAD00001001674 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  299 
 
  
    EGAD00001001675 
   
  
    
    RNA-seq of peripheral blood samples from CLL patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001001676 
   
  
    
    Tagmentation-based whole-genome bisulfite sequencing of isolated cell types from healthy controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001686 
   
  
    
    In the autozygosity exome sequencing of Born-in-Bradford samples of Pakistani origin there
is a mother who is homozygous for an apparent truncating stop codon in PRDM9, the gene
responsible for localising recombination during meiosis. We plan to deep sequence mother
and child with X10, and physically phase the mother with PacBio sequencing.
We will use this data to identify recombination locations, and test whether these are
consistent with the known fine scale recombination map.
Data Access is controlled by the Wellcome Trust Sanger Institute DAC and the Born In Bradford Executive Group. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001001687 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001001688 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001001689 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001001690 
   
  
    
    Tumor-Normal paired samples of PTC 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  182 
 
  
    EGAD00001001691 
   
  
    
    Esophageal cancer is one of the most aggressive cancers and the sixth leading cause of cancer death worldwide. Approximately 70% of the global esophageal cancers occur in China and over 90% of the histopathological forms of this disease are esophageal squamous cell carcinoma (ESCC). Currently, there are limited clinical approaches for early diagnosis and treatment for ESCC, resulting in a 10% 5-year survival rate for the patients. Meanwhile, the full repertoire of genomic events leading to the pathogenesis of ESCC remains unclear. Here we show a comprehensive genomic analysis in 158 ESCC cases, as part of the International Cancer Genome Consortium (ICGC) Research Projects (http://icgc.org/icgc/cgp/72/371/1001734). We conducted whole-genome sequencing in 14 ESCC cases and whole-exome sequencing in 90 cases. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  208 
 
  
    EGAD00001001692 
   
  
    
    Whole exome sequencing of germline DNA was performed and subsequent polymorphisms in genes known and putatively involved in the innate immune response to fungi were identified 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001001693 
   
  
    
    FASTQ files from RNA-seq of 182 biliary tract cancer samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  182 
 
  
    EGAD00001001694 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB10_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001695 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB10_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001696 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB10_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001697 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB15_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001698 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB15_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001699 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB15_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001700 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB1_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001701 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB1_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001702 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB1_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001703 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB21_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001704 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB21_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001705 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB21_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001706 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB22_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001707 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB22_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001708 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB22_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001709 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB23_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001710 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB23_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001711 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB23_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001712 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB24_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001713 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB24_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001714 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB24_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001715 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB25_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001716 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB25_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001717 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB25_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001718 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB27_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001719 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB27_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001720 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB27_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001721 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB28_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001722 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB28_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001723 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB28_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001724 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB30_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001725 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB30_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001726 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB30_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001727 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB31_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001728 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB31_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001729 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB31_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001730 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB33_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001731 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB33_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001732 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB33_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001733 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB35_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001734 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB35_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001735 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB35_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001736 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB38_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001737 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB38_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001739 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB40_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001740 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB40_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001741 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB40_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001742 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB41_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001743 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB41_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001744 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB41_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001745 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB42_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001746 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB42_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001747 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB42_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001748 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB43_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001749 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB43_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001750 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB43_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001751 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB44_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001752 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB44_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001753 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB44_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001754 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB4_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001755 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB4_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001756 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB4_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001757 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB50_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001758 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB50_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001759 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB50_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001760 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB51_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001761 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB51_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001762 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB51_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001763 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB52_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001764 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB52_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001765 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB52_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001766 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB55_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001767 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB55_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001768 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB55_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001769 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB57_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001770 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB57_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001771 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB57_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001772 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB58_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001773 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB58_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001774 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB58_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001775 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB60_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001776 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB60_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001777 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB60_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001778 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB62_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001779 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB62_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001780 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB62_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001781 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB8_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001783 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB8_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001784 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW12_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001786 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW12_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001787 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW14_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001788 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW14_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001789 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW14_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001790 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW15_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001792 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW15_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001793 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW18_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001794 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW18_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001795 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW18_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001796 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW20_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001797 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW20_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001798 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW20_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001799 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW22_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001800 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW22_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001802 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW24_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001803 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW24_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001804 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW24_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001805 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW27_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001806 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW27_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001807 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW27_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001808 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW29_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001809 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW29_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001810 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW29_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001811 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW2_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001812 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW2_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001813 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW2_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001814 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW32_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001815 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW32_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001816 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW32_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001817 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW38_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001818 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW38_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001819 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW38_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001820 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW3_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001821 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW3_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001822 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW3_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001823 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW46_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001824 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW46_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001825 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW46_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001826 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW47_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001827 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW47_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001828 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW47_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001829 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW49_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001830 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW49_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001831 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW49_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001833 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW4_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001834 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW4_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001835 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW50_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001836 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW50_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001837 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW50_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001838 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW51_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001839 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW51_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001840 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW51_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001841 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW52_C 
    
   
  
    
   
  1 
 
  
    EGAD00001001842 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW52_F 
    
   
  
    
   
  1 
 
  
    EGAD00001001843 
   
  
    
    50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW52_M 
    
   
  
    
   
  1 
 
  
    EGAD00001001844 
   
  
    
    Whole genome sequencing of 64 HER2-Positive Breast Cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  128 
 
  
    EGAD00001001845 
   
  
    
    Leeds Melanoma Cohort 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001001846 
   
  
    
    2 BRAFV600E cell lines that have been made resistant to 1. the BRAF inhibitor PLX4720 and 2. the combination therapy of dabrafenib and trametinib seem to have an internal duplication in the kinase domain. We would like to know if this is caused by a translocation. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001001847 
   
  
    
    4C-seq data was generated for regions of interest to confirm enhancer-gene promoter interactions 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001848 
   
  
    
    DDD DATAFREEZE 2014-11-04: 4293 trios - VCF files 
    
   
  
    
   
  - 
 
  
    EGAD00001001849 
   
  
    
    The genomic sequence of brain expressed miRNA genes was sequenced in Swedish schizophrenia patients 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  186 
 
  
    EGAD00001001850 
   
  
    
    Genomic DNA from Swedish control individuals was pooled. Then the genomic sequence of brain expressed miRNA genes was determined in the pools. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  149 
 
  
    EGAD00001001851 
   
  
    
    The genomic sequence of brain expressed miRNA genes was sequenced in Belgian epilepsy patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  163 
 
  
    EGAD00001001852 
   
  
    
    Genomic DNA from Belgian control individuals was pooled. Then the genomic sequence of brain expressed miRNA genes was determined in the pools. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  39 
 
  
    EGAD00001001853 
   
  
    
    This dataset contains data from: 17 patients studied by WGS; 49 patients studied by WES; 9 (of the 49) patients studied by RNA-Seq at 2 time points; the same 9 patients studied by ERRBS at 2 time points. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  199 
 
  
    EGAD00001001854 
   
  
    
    Exome sequencing of nine PCC/PGL tumors, SF and FFPE samples 
    
   
  
    
   
  18 
 
  
    EGAD00001001856 
   
  
    
   
  
    
   
  100 
 
  
    EGAD00001001857 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  381 
 
  
    EGAD00001001858 
   
  
    
    Raw fastq files from WGS sequencing of CLL and matching blood normal for the ICGC Techval Benchmark1 study. Sequence data was provided to multiple centers for independent analysis and comparison. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001001859 
   
  
    
    Raw fastq files for sequence data generated at 5 sequencing centers from a Medulloblastoma sample and matching blood normal control. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001001860 
   
  
    
   
  
    
   
  19 
 
  
    EGAD00001001861 
   
  
    
    Exome Sequencing to Define the Landscape of Plasma Cells in Systemic Light chain Amyloidosis 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001001862 
   
  
    
    RNA-seq of PDXs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001863 
   
  
    
    Exome data of PDX models. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001001864 
   
  
    
    DATA FILES FOR PCGP MB WGS - Supersedes (EGAD00001000269) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  76 
 
  
    EGAD00001001865 
   
  
    
    Sequence data of total RNA, miRNA, WGB, mRNA, NOMe, ChIP (H3K27ac, H3K27me, H3K36me3, H3K4me1, H3K4me3, H3K9me3, Input). Short description: Epigenetic profiling of human CD4+ memory T cells reveals their proliferative history and argues in favor of a progressive differentiation model driven by epigenetically controlled master regulators. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001001869 
   
  
    
    We report the first combined analysis of whole genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole genome and transcriptome sequence was obtained from 9 anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 years prior to death. Transcriptome analysis revealed increased expression of AR-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only 1 of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today given this knowledge, use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations is critical for actionability, and can only be determined through analysis of multiple sites of metastasis. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001001870 
   
  
    
    Deep sequencing of 151 cancer genes in 6 synchronous CRC of 3 patients 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001001871 
   
  
    
    Megakaryocytes and erythroblasts derive from the same progenitor cell type but carry out very different functions. In order to understand how the different functional phenotypes arise we have characterised the epigenetic landscape of these cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001001872 
   
  
    
    Targeted exome sequencing of patient-derived xenografts from primary colorectal tumours and liver metastases. This dataset contains all the data available for this study on 2016-01-06. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  333 
 
  
    EGAD00001001873 
   
  
    
    AML emerges as a consequence of accumulating independent genetic aberrations that direct regulation and/or dysfunction of genes resulting in aberrant activation of signalling pathways, resistance to apoptosis and uncontrolled proliferation. Given the significant heterogeneity of AML genomes, AML patients demonstrate a highly variable response rate and poor median survival in response to current chemotherapy regimens. For the past 4 years we have conducted gene expression profiling on purified bone marrow populations equating to normal haematopoietic stem and progenitor cells from healthy subjects and patients with de novo AML in order to identify AML signatures of aberrantly expressed genes in cancer versus normal. We are now applying a series of bioinformatic methodologies combined with clinical and conventional diagnostic data to establish novel genomics strategies for improved prognostication of AML. Additionally, we use our AML signatures to unravel oncogenic signalling pathway activities in AML patients and test inhibitory drugs for these pathways in preclinical therapeutic programmes. We consider that superimposing GEP and clinical data for our AML patient cohort with additional data on their mutational status will significantly improve the prognostic power of the study as well as unravel yet unknown mutations associated with aberrant signalling activities of oncogenic pathways. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  215 
 
  
    EGAD00001001874 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001001876 
   
  
    
    Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study. These data are included in the manuscript entitled, "Response to Angiotensin Blockade with Irbesartan in a Patient with Metastatic Colorectal Cancer". 
    
   
  
    
   
  4 
 
  
    EGAD00001001879 
   
  
    
    A pilot to establish the feasibility of using a custom Agilent targeted pulldown of 110 genes implicated in colorectal tumourigenesis to sequence for driver mutations in a set of 30 FFPE colorectal adenomas. If successful, we propose to sequence an additional 350 adenomas as part of an MRC research study in order to define the pattern of driver mutations across the spectrum of pathological subtypes including conventional adenomas, serrated adenomas and hyperplastic polyps. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001001880 
   
  
    
    RIKEN collection of RNA-seq reads for 458 liver cancer samples and matched normal liver from 247 donors. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  458 
 
  
    EGAD00001001881 
   
  
    
    RIKEN collection of WGS reads for 269 liver cancer tumors and matched normal blood or liver tissue from 258 donors. In total there are 1864 paired fastq sets sequenced on Illumina HiSeq 2000 or Genome Analyzer II instruments with paired reads of 75–101 bp. Quality control and duplication removal have not been performed. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  528 
 
  
    EGAD00001001885 
   
  
    
    January 2016 update of RNA-Seq data (bams, fastqs) for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001001887 
   
  
    
    Exome sequencing VCF files describing mutations during glioma progression. 
    
   
  
    
   
  82 
 
  
    EGAD00001001889 
   
  
    
    ***THIS DATA CAN ONLY BE USED FOR NON-COMMERCIAL CANCER RESEARCH*** Sequencing of organoid cell lines derived from oesophageal tumour sections taken from patients diagnosed with primary oesophageal cancer who underwent tumour resection surgery. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001001891 
   
  
    
    Whole genome bisulfite sequencing of pedbrain - medulloblastoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001001892 
   
  
    
    BLUEPRINT Bisulfite-seq and Whole Genome Sequencing of mantle cell lymphoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001001897 
   
  
    
    15x whole genome sequencing in samples from the Cretan Greek isolate collection HELIC MANOLIS 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1482 
 
  
    EGAD00001001898 
   
  
    
    The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be whole genome sequenced along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2016-01-27. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  131 
 
  
    EGAD00001001899 
   
  
    
    HDAC and PI3K Antagonists Cooperate to Inhibit Growth of MYC-driven Medulloblastoma 
    
   
  
    
   
  102 
 
  
    EGAD00001001900 
   
  
    
    DNA sequencing reads of human adult stem cell cultures from liver, colon and small intestine. Including biopsy or blood samples of the donors. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  61 
 
  
    EGAD00001001901 
   
  
    
    Monoclonal gammopathy of undetermined significance (MGUS) is a premalignant precursor of multiple myeloma (MM) with a 1% risk of progression per year. Although targeted analyses have shown the presence of specific genetic abnormalities such as IGH translocations, RB1 deletion, 1q gain, hyperdiploidy or RAS gene mutations, little is known about the molecular mechanism of malignant transformation. We have performed whole exome sequencing together with SNP array analysis in 33 flow-cytometry separated abnormal PC samples of MGUS patients to describe somatic gene mutations and chromosome changes at the genome-wide level. Non-synonymous mutations (NS-SNVs) and copy number alterations (CNAs) were present in 97.0% and in 63.6% of cases, respectively. Importantly, the number of somatic mutations was significantly lower in MGUS compared to MM (p < 10^-4) and we have identified 6 myeloma significantly mutated genes, which are KRAS, NRAS, DIS3, HIST1H1E, EGR1 and LTB, in the MGUS dataset. We also found a positive correlation between increasing chromosome changes and somatic mutations. IGH translocations were present in 27.3% of cases, comprising t(4;14), t(11;14), t(14;16) or t(14;20), at a frequency similar to that in MM, which is consistent with the primary lesion hypothesis. Data from this study showed MGUS is a genetically comprehensive disease; however, overall genetic instability is significantly lower compared to MM. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD00001001909 
   
  
    
    Paired-end whole exome sequencing (Illumina) of primary enucleated retinoblastoma and matching lymphocyte DNA was performed to find somatic alterations that are related to oncogenesis. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  143 
 
  
    EGAD00001001913 
   
  
    
    Exome sequencing data for Mesothelioma 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  198 
 
  
    EGAD00001001914 
   
  
    
    RNA-seq data for mesothelioma cell lines after spliceostatin (SSA) or control (DMSO) treatment. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001001915 
   
  
    
    RNA-Seq data for Mesothelioma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  211 
 
  
    EGAD00001001916 
   
  
    
    Targeted sequencing using SPET for Mesothelioma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  207 
 
  
    EGAD00001001917 
   
  
    
    PacBio data for mesothelioma cell line NCI-H2595. 
    
   
  
    
      
      PacBio RS II 
      
    
   
  1 
 
  
    EGAD00001001918 
   
  
    
    Multi-region Illumina whole-exome and/or whole-genome sequencing on tumor regions collected from early-stage NSCLC patients who underwent definitive surgical resection prior to receiving adjuvant therapy. Patients covered by this dataset: L012, L013, L015, L017 
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  15 
 
  
    EGAD00001001920 
   
  
    
    TEST3 dataset containing 1 FASTQ file with mRNA reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001001921 
   
  
    
    All pituitary samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  84 
 
  
    EGAD00001001922 
   
  
    
    RNA-seq from normal human tissues (2 x 250 bp) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001001923 
   
  
    
    RNA sequence data for conditionally reprogrammed cells from patient HUB_5 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001001925 
   
  
    
    1461 neuropathologically and clinically characterised cases from the MRC Brain Bank 
    
   
  
    
   
  1461 
 
  
    EGAD00001001926 
   
  
    
    Esophageal Squamous Cell Carcinoma (ESCC) is one of the deadliest cancers worldwide. We performed whole-exome sequencing of 71 Esophageal Squamous Cell Carcinoma cases from Chinese patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  141 
 
  
    EGAD00001001927 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001001928 
   
  
    
    This study will analyse the guide sequences which were used for making mutations in the Cas9-expressing cells. We used the GeCKO v2 library, which was released by Feng Zhang, 2014. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  61 
 
  
    EGAD00001001930 
   
  
    
    Cancer genes can affect ribosomal RNA processing and this can underlie their essentiality to cells, making them cell-essential in the same way as ribosomal genes themselves. We want to confirm this, in order to understand the results of our CRISPR drop-out screens. NOTE FROM BESPOKE TEAM: Run a single read 1 (forward read) of 30 bases, then an index 1 read as normal. This would fit a 50-cycle kit. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001001932 
   
  
    
    HipSci - Healthy Normals - Exome Sequencing - January 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  123 
 
  
    EGAD00001001933 
   
  
    
    HipSci - Healthy Normals - RNA Sequencing - January 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  118 
 
  
    EGAD00001001935 
   
  
    
    Cancer amplicon reads consisting of BAM paired end reads from primary multiple myeloma samples. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  88 
 
  
    EGAD00001001936 
   
  
    
    First 1106 16S rDNA data for the Flemish Gut Flora Project 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1061 
 
  
    EGAD00001001937 
   
  
    
    Targeted amplicon sequencing of samples as part of the study "Methanol-based fixation is superior to buffered formalin for next-generation sequencing of DNA from clinical cancer samples". The amplicon panel consists of 48 amplicons in TP53, PTEN, EGFR, PIK3CA, KRAS and BRAF genes as described previously [Forshew, STM 2012]. All libraries were pooled and quantified using DNA 1000 kit on Agilent 2100 Bioanalyzer and KAPA SYBR FAST ABI Prism qPCR Kit (KAPA Biosystems) on 7900HT Fast Real-Time PCR System (Applied Biosystems) according to the supplier's recommendations. Reads were aligned using bwa-mem v0.7.12-r1039 to the 1000 genomes version of human genome build GRCh37, retaining duplicate reads. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  66 
 
  
    EGAD00001001938 
   
  
    
    Shallow whole-genome sequencing of samples from the study "Methanol-based fixation is superior to buffered formalin for next-generation sequencing of DNA from clinical cancer samples". DNA from each sample (100ng) was sheared on Covaris S220 (Covaris): duty cycle - 10%, intensity -5.0, bursts per sec - 200, duration - 300 sec, mode - frequency sweeping, power - 23V, temperature -5:5 C to 6 C, water level - 13. Libraries were prepared with the TruSeq Nano DNA LT Sample Prep Kit (Illumina) using a modified protocol - Sample Purification Beads were replaced by Agencourt AMPure XP beads (Beckman Coulter) and size selection after the End Repair was done to remove only the short fragments. Quality and quantity for constructed libraries were assessed with DNA 7500 kit on Agilent 2100 Bioanalyzer and with Kapa Quantification kit (KAPA Biosystems) on 7900HT Fast Real-Time PCR System (Applied Biosystems) according to the supplier's recommendations, respectively. Libraries from 18 barcoded samples were pooled together in equimolar amounts and each pool was loaded on a single lane of a HiSeq Single End Flowcell (Illumina), followed by cluster generation on a cBot (Illumina) and sequencing on a HiSeq 2500 (Illumina) in a single-read 50bp mode. Reads were aligned using bwa-mem v0.7.12-r1039 to the 1000 genomes version of human genome build GRCh37. Picard (http://picard.sourceforge.net) was used to remove duplicate reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001001939 
   
  
    
    Mapped whole transcriptome RNA-Seq data from 476 human samples of early stage urothelial carcinoma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  476 
 
  
    EGAD00001001940 
   
  
    
    Un-mapped whole transcriptome RNA-Seq data from 476 human samples of early stage urothelial carcinoma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  476 
 
  
    EGAD00001001941 
   
  
    
    Variants derived from mapped whole transcriptome RNA-Seq data from 476 human samples of early stage urothelial carcinoma. 
    
   
  
    
   
  476 
 
  
    EGAD00001001942 
   
  
    
    We performed target re-sequencing for 1.29 Mb interval of chromosome 9 (chr9:21299764–22590271, hg19). NimbleGen SeqCap EZ choice system was used as a target enrichment method (Roche Diagnostics). A DNA probe set complementary to the target region was designed by NimbleDesign. The libraries were sequenced on the Illumina MiSeq platform with 2×150-bp paired-end module (Illumina). Fastq files for 48 Japanese patients with endometriosis are deposited. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  48 
 
  
    EGAD00001001943 
   
  
    
    Here, we studied well-phenotyped individuals from the Flemish Gut Flora Project (FGFP, N=1,106, Belgium) and the effect of environments on microbiome. The 69 major significant phenotypes found in this study are provided. 
    
   
  
    
   
  1068 
 
  
    EGAD00001001944 
   
  
    
    RNA sequencing of paediatric glioblastoma in the ICGC PedBrain project 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001001947 
   
  
    
    Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001001948 
   
  
    
    Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001001949 
   
  
    
    HipSci - Monogenic Diabetes - Exome Sequencing - April 2015 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001950 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Exome Sequencing - January 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001951 
   
  
    
    HipSci - Monogenic Diabetes - Exome Sequencing - January 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001952 
   
  
    
    HipSci - Bardet-Biedl Syndrome - RNA Sequencing - April 2015 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001001953 
   
  
    
    HipSci - Monogenic Diabetes - RNA Sequencing - April 2015 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001954 
   
  
    
    HipSci - Bardet-Biedl Syndrome - RNA Sequencing - January 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001001955 
   
  
    
    HipSci - Monogenic Diabetes - RNA Sequencing - January 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001001956 
   
  
    
    ICGC Release 21 for PACA-CA from OICR 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  516 
 
  
    EGAD00001001957 
   
  
    
    March 2016 update of Whole genome bisulfite sequencing assay data (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001001958 
   
  
    
    March 2016 update of whole genome shotgun sequencing data (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001001959 
   
  
    
    March 2016 update of smRNA-Seq assays data (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001001960 
   
  
    
    upcoming publication 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001001961 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  4 
 
  
    EGAD00001001962 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001963 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001001964 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001965 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001001966 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001001967 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of right lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001001968 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001001969 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001001973 
   
  
    
    Exome sequencing of 184 samples from consanguineous families with different congenital heart defects collected at KAIMRC, Riyadh, Saudi Arabia. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  179 
 
  
    EGAD00001001977 
   
  
    
    DDD DATAFREEZE 2014-11-04: 4293 trios - phenotypic and family descriptions 
    
   
  
    
   
  - 
 
  
    EGAD00001001978 
   
  
    
    This dataset contains FASTQ files for multi-region exome-sequencing of EGFR-mutant lung adenocarcinomas from Asian patients. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors. There are multiple runs for each sample and 368 FASTQ files in total. Please refer to the sample-ID from filename for merging. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  95 
 
  
    EGAD00001001979 
   
  
    
    This dataset contains BAM files for multi-region exome-sequencing of EGFR-mutant lung adenocarcinomas from Asian patients. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  95 
 
  
    EGAD00001001980 
   
  
    
    This dataset contains BAM files of targeted Amplicon deep-sequencing data, for validation of the mutations found in WES. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  95 
 
  
    EGAD00001001981 
   
  
    
    This dataset contains FASTQ files of targeted Amplicon deep-sequencing data, for validation of the mutations found in WES. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors. There are 140 FASTQ files in total, with multiple runs for some of the samples. Please refer to the sample-ID from filename for merging. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  95 
 
  
    EGAD00001001983 
   
  
    
    Immunoglobulin heavy chain gene high throughput sequencing of paediatric acute lymphoblastic leukaemia samples, for the purpose of MRD on the Illumina MiSeq platform. This dataset contains summary fastq files and raw bcl files from the MiSeq for this study. In the study we identify errors associated with multiplexing that could potentially impact on the accuracy of MRD analysis. We optimise a strategy combining high purity, sequence-optimised oligonucleotides, dual-indexing and an error-aware demultiplexing approach to minimise errors and maximise sensitivity. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  491 
 
  
    EGAD00001001984 
   
  
    
    To identify recurrent somatic alterations in this unique subset of gastric cancers, whole exome and SNP6 analyses were performed using frozen cancer tissue. The somatic mutation analyses were also performed using blood of the same patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  160 
 
  
    EGAD00001001986 
   
  
    
    This study is meant to gain further knowledge in haematological cancers. Patient samples (mainly DNAs or PCR products) from haematological cancer patients will be sequenced, and the outputs will be correlated to their diagnosis and/or prognosis; the findings may also add more insight into the understanding of biology in this type of tumour. We will be sequencing Primary Testicular Lymphomas (PTL) to identify genetic drivers of this rare cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001001987 
   
  
    
    March 2016 update of Whole genome bisulfite sequencing assay data (bams) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
   
  18 
 
  
    EGAD00001001988 
   
  
    
    Cholangiocarcinoma whole genome sequencing data 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  118 
 
  
    EGAD00001001991 
   
  
    
    Meta-genomic sequencing of 1,200 LifeLines-DEEP participants 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1135 
 
  
    EGAD00001001994 
   
  
    
    CCA targeted sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  376 
 
  
    EGAD00001001995 
   
  
    
    Whole genome sequencing (30X) using HiSeq X Ten on 4 HCC cell lines, primary HCCs and early-passage PDCs 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001001996 
   
  
    
    RIKEN collection of WGS reads for 13 multicentric liver cancers or intrahepatic metastasis and matched blood samples for 12 donors. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001001998 
   
  
    
    This dataset consists of sequencing data on 15 patients with Sezary syndrome. On 12 of these patients, we have exome sequencing data while on 10 patients, we have RNA sequencing data. In total for seven patients, we have both exome as well as RNA sequencing data. We looked for gene mutations and fusion events in these patients to identify genes that could be involved in the pathogenesis of the disease. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001001999 
   
  
    
    HipSci - Embryonic Stem Cells - Exome Sequencing - April 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002000 
   
  
    
    HipSci - Embryonic Stem Cells - RNA Sequencing - April 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002001 
   
  
    
    Mapped data (bam files) for high-throughput whole genome sequence data for 83 modern Aboriginal Australians 
    
   
  
    
   
  83 
 
  
    EGAD00001002002 
   
  
    
    To characterize the subclonal genomic architecture of non-androgen-deprived metastatic prostate cancer, we performed whole-genome sequencing (WGS) of pelvic lymph node metastases and matching noncancerous blood from 10 patients to an average sequencing depth of 55x. The patients are part of the PELICAN (Project to ELIminate Lethal Cancer) study led by G. Steven Bova at Johns Hopkins University (USA) and Tampere University (Finland). As of September 2020, the study using these data is:
Wedge et al, Nature Genetics 2018 (PMID: 29662167) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001002003 
   
  
    
    Human subjects (COPD patients or apparently healthy controls) were investigated by bronchoscopy and a 5 mm brush was used to sample the subsegment airways of the right lung. The material obtained mainly consists of bronchial epithelial cells plus some contamination with leukocytes. For further details see Ziegler-Heitbrock et al, European Respiratory Journal, 40:823-829, 2012. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  544 
 
  
    EGAD00001002005 
   
  
    
    Using whole exome sequencing (WES), we identified homozygosity for a missense variant, VPS11: c.2536T>G (p.C846G), as the genetic cause of a leukoencephalopathy syndrome in two individuals from two unrelated Ashkenazi Jewish (AJ) families. Both patients exhibited highly concordant disease progression characterized by infantile onset leukoencephalopathy with brain white matter abnormalities, severe motor impairment, cortical blindness, intellectual disability, and seizures. 
    
   
  
    
   
  2 
 
  
    EGAD00001002006 
   
  
    
    Whole genome sequencing of paediatric glioblastoma in the ICGC PedBrain project 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  115 
 
  
    EGAD00001002007 
   
  
    
    To determine the clinical and genetic landscape of CRLF2 deregulated acute lymphoblastic leukaemia (CRLF2-d ALL). We identified 172 patients with a CRLF2 rearrangement treated on either the UKALL2003 trial for children and adolescents (1-24 years) or the UKALLXII trial for adolescents and adults (15-59 years). Genomic technologies from conventional karyotyping and FISH through to whole genome and exome sequencing were used to characterise the genomes of patients with CRLF2-d ALL. This is the largest study to date to investigate the genomic landscape of CRLF2-d ALL and define CRLF2-d as a unique subgroup of B-other ALL. We have confirmed the high incidence of CRLF2-d in Down syndrome-ALL and demonstrated the co-existence of CRLF2-d with other primary chromosomal rearrangements, suggesting that in these patients CRLF2-d can be a secondary genetic abnormality. Other defining features included enrichment of IKZF1, BTG1 and ADD3 deletions in IGH-CRLF2 patients and specific chromosomal gains seen at much higher frequencies than B-other ALL. We report recurrent established and new co-operating abnormalities and the novel involvement of USP9X and DDX3X in CRLF2-d ALL. It is clear from these data that CRLF2-d ALL is heterogeneous, requiring a combination of genetic abnormalities in functionally relevant genes to work alongside the deregulated expression of CRLF2 in order to initiate and drive leukaemogenesis in this subtype. Although the functional relevance of many of the abnormalities presented here is currently unknown, many are likely to activate alternate pathways or sensitize patients to current therapies. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001002008 
   
  
    
    To determine the clinical and genetic landscape of CRLF2 deregulated acute lymphoblastic leukaemia (CRLF2-d ALL). We identified 172 patients with a CRLF2 rearrangement treated on either the UKALL2003 trial for children and adolescents (1-24 years) or the UKALLXII trial for adolescents and adults (15-59 years). Genomic technologies from conventional karyotyping and FISH through to whole genome and exome sequencing were used to characterise the genomes of patients with CRLF2-d ALL. This is the largest study to date to investigate the genomic landscape of CRLF2-d ALL and define CRLF2-d as a unique subgroup of B-other ALL. We have confirmed the high incidence of CRLF2-d in Down syndrome-ALL and demonstrated the co-existence of CRLF2-d with other primary chromosomal rearrangements, suggesting that in these patients CRLF2-d can be a secondary genetic abnormality. Other defining features included enrichment of IKZF1, BTG1 and ADD3 deletions in IGH-CRLF2 patients and specific chromosomal gains seen at much higher frequencies than B-other ALL. We report recurrent established and new co-operating abnormalities and the novel involvement of USP9X and DDX3X in CRLF2-d ALL. It is clear from these data that CRLF2-d ALL is heterogeneous, requiring a combination of genetic abnormalities in functionally relevant genes to work alongside the deregulated expression of CRLF2 in order to initiate and drive leukaemogenesis in this subtype. Although the functional relevance of many of the abnormalities presented here is currently unknown, many are likely to activate alternate pathways or sensitize patients to current therapies. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001002009 
   
  
    
    Exome sequencing of high-risk prostate cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001002010 
   
  
    
    High-throughput sequencing of methylated and hydroxymethylated DNA from tumor and non-tumor tissue of patients with high-risk prostate cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001002011 
   
  
    
    RNA sequencing data of whole blood samples from smoking and non-smoking mothers and their children at gestation/birth and follow-up years. 
    
   
  
    
   
  64 
 
  
    EGAD00001002012 
   
  
    
    ChIPseq data of whole blood samples from smoking and non-smoking mothers and their children at gestation/birth and follow-up years. 
    
   
  
    
   
  16 
 
  
    EGAD00001002014 
   
  
    
    Isolated populations have unique population genetics characteristics that can help boost power in genetic association studies for complex traits. Leveraging these advantageous characteristics requires an in-depth understanding of parameters that have shaped sequence variation in isolates. This study performs a comprehensive investigation of these parameters using low-depth whole genome sequencing (WGS) across multiple isolates. 
    
   
  
    
   
  6840 
 
  
    EGAD00001002015 
   
  
    
    The use of reference DNA standards generated from cancer cell lines sequenced in the Cancer Genome Project to establish the sensitivity, specificity, accuracy and reproducibility of the WTSI GCLP sequencing pipeline 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  57 
 
  
    EGAD00001002016 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LICA-FR. 
    
   
  
    
   
  12 
 
  
    EGAD00001002017 
   
  
    
    Genome and transcriptome sequence data from a breast primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002018 
   
  
    
    Genome and transcriptome sequence data from a melanoma skin cancer - squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  4 
 
  
    EGAD00001002019 
   
  
    
    Genome and transcriptome sequence data from a  patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002020 
   
  
    
    Genome and transcriptome sequence data from a metastatic NPC patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002021 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002022 
   
  
    
    Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002023 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002024 
   
  
    
    Genome and transcriptome sequence data from an anal rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002025 
   
  
    
    Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002026 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  4 
 
  
    EGAD00001002027 
   
  
    
    Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002028 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002029 
   
  
    
    Genome and transcriptome sequence data from an ovarian granulosa patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002030 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002031 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002032 
   
  
    
    Genome and transcriptome sequence data from an adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002033 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002034 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002035 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002036 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002037 
   
  
    
    Genome and transcriptome sequence data from an adrenal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002038 
   
  
    
    Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  5 
 
  
    EGAD00001002039 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002040 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002041 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002042 
   
  
    
    Genome and transcriptome sequence data from an endometrial cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002043 
   
  
    
    Genome and transcriptome sequence data from a recurrent glioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002044 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002045 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002046 
   
  
    
    Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002047 
   
  
    
    Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002048 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002049 
   
  
    
    Genome and transcriptome sequence data from an adrenal cortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002050 
   
  
    
    In this project we will use exome sequencing to identify somatic mutations in lesions from a patient with a germline mutation in the protection of telomeres 1 gene (POT1). This dataset contains all the data available for this study on 2016-04-20. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  36 
 
  
    EGAD00001002051 
   
  
    
    BRAF V600E colorectal cancers do not respond to the only currently FDA approved targeted therapy for CRC. There is currently a trial underway in the UK recruiting V600E CRC patients for treatment with a triple therapy combination of Cetuximab, Trametinib and Dabrafenib. We have mutagenized a pool of V600E CRC cell lines and treated with this triple therapy to select out drug resistant clones. We will now sequence these drug resistant clones with the aim of identifying common point mutations engendering resistance to this new therapy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001002053 
   
  
    
    dataset CML WGS VCF 
    
   
  
    
   
  29 
 
  
    EGAD00001002054 
   
  
    
    dataset CML WES VCF 
    
   
  
    
   
  24 
 
  
    EGAD00001002055 
   
  
    
    Whole exome sequencing from matched tumor-control samples of 121 primary lymphoma samples. Sequencing was performed on Illumina HiSeq2000. The dataset contains FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  242 
 
  
    EGAD00001002056 
   
  
    
    Paired-end RNA sequencing using total RNA from 136 primary lymphoma samples. Sequencing was performed on the Illumina HiSeq2000 with 300bp insert size. The dataset contains FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  136 
 
  
    EGAD00001002057 
   
  
    
    dataset CML WGS paired-end fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001002058 
   
  
    
    dataset CML WGS paired-end bam 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001002059 
   
  
    
    dataset CML WES paired-end fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001002060 
   
  
    
    dataset CML WES paired-end bam 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001002061 
   
  
    
    BMI1 ChIP-seq on human K562 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001002062 
   
  
    
    BMI1 ChIP-seq on human K562 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001002064 
   
  
    
    Zhong Shan Hospital liver tumor single cell sequencing: 111 single cells and 6 tissues 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  117 
 
  
    EGAD00001002065 
   
  
    
    Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001002066 
   
  
    
    KRAS mutant CRC is currently in clinical trial with a combination of a MEK and Akt inhibitor. These patients will likely develop resistance to this combination. We aim to identify the mechanisms of resistance via ENU mutagenesis, with a view to identifying additional therapeutics which have the ability to overcome this resistance. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  86 
 
  
    EGAD00001002067 
   
  
    
    Renal cell carcinoma (RCC) is a genomically heterogeneous tumor. In the present project, the question whether intratumoral heterogeneity follows a zonal pattern indicating spatial niches was addressed. Whole exome sequencing of 16 paired samples from tumor periphery and center revealed a number of region-specific functional SNVs and Indels. Therefore, RCCs are not composed of evenly admixed tumor cells but show topological differences in their clonal composition. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001002068 
   
  
    
    The dataset consists of 232 RNA-seq samples (whole blood) obtained from healthy females from the TwinsUK adult registry cohort. The samples were obtained at two time points separated on average by 22 months. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  232 
 
  
    EGAD00001002069 
   
  
    
    Complete genomics data for VCaP and PC346c. 
    
   
  
    
   
  2 
 
  
    EGAD00001002070 
   
  
    
    Whole genome sequencing CRAM files for four samples from the BRIDGE Consortium (SPEED project) with pathogenic variants in a gene associated with a movement disorder. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002071 
   
  
    
    qDNAseq shallow sequencing dataset of the cell line use case. 
    
   
  
    
   
  5 
 
  
    EGAD00001002072 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001002073 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001002074 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of WNT reporter of PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002075 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002076 
   
  
    
    RNAseq of Patient-derived xenograft derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001002077 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  87 
 
  
    EGAD00001002078 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001002079 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002080 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001002081 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from Patient-derived xenograft derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002082 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Blood EDTA 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  69 
 
  
    EGAD00001002083 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002084 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001002085 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001002086 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002087 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001002088 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  87 
 
  
    EGAD00001002089 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001002090 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002091 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001002092 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from Patient-derived xenograft derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002093 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Blood EDTA 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001002094 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002095 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001002096 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001002097 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002098 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001002099 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  55 
 
  
    EGAD00001002100 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001002101 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002102 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001002103 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from Patient-derived xenograft derived from colorectal cancer primary tumor sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002104 
   
  
    
    Whole-exome sequencing on AB 5500xl Genetic Analyzer of Blood EDTA 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  76 
 
  
    EGAD00001002105 
   
  
    
    Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer metastasis sample 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
      AB 5500xl-W Genetic Analysis System 
      
    
   
  16 
 
  
    EGAD00001002106 
   
  
    
    Whole-exome sequencing on AB 5500xl Genetic Analyzer of Patient-derived xenograft derived from colorectal cancer metastasis sample 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  1 
 
  
    EGAD00001002107 
   
  
    
    Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer primary tumor sample 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
      AB 5500xl-W Genetic Analysis System 
      
    
   
  66 
 
  
    EGAD00001002108 
   
  
    
    Exome and targeted amplicon sequencing data for tumor, germline and plasma samples from a patient with metastatic breast cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  30 
 
  
    EGAD00001002109 
   
  
    
    TSACP TruSeq Amplicon Panel dataset for the TraIT cell line use case 
    
   
  
    
   
  5 
 
  
    EGAD00001002110 
   
  
    
    Chronic lymphocytic leukemia (CLL) is characterized by substantial clinical heterogeneity, despite relatively few genetic alterations. To provide a basis for studying epigenome deregulation in CLL, we established genome-wide chromatin accessibility maps for 88 CLL samples from 55 patients using the ATAC-seq assay, and we also performed ChIPmentation and RNA-seq profiling for ten representative samples. Based on the resulting dataset, we devised and applied a bioinformatic method that links chromatin profiles to clinical annotations. Our analysis identified sample-specific variation on top of a shared core of CLL regulatory regions. IGHV mutation status – which distinguishes the two major subtypes of CLL – was accurately predicted by the chromatin profiles, and gene regulatory networks inferred for IGHV-mutated vs. IGHV-unmutated samples identified characteristic differences between these two disease subtypes. In summary, we discovered widespread heterogeneity in the chromatin landscape of CLL, established a community resource for studying epigenome deregulation in leukemia, and demonstrated the feasibility of chromatin accessibility mapping in cancer cohorts and clinical research. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  138 
 
  
    EGAD00001002111 
   
  
    
    70 whole exome sequencing samples from 9 patients with DIPG for project Spatial and Temporal Homogeneity of Driver Mutations in Diffuse Intrinsic Pontine Glioma 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  70 
 
  
    EGAD00001002112 
   
  
    
    RNA-seq data from 195 pediatric BCP-ALL cases. Alignment: TopHat 2.0.7. Reference genome: hg19. 
    
   
  
    
      
      Illumina HiScanSQ 
      
    
   
  195 
 
  
    EGAD00001002113 
   
  
    
    Mate pair whole genome sequencing data from 15 pediatric BCP ALL cases. Reference genome: hg19. Alignment: BWA 0.7.9a. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD00001002115 
   
  
    
    Targeted sequencing of 173 genes in 2433 primary breast tumours. Data includes 2433 tumour samples, 523 adjacent normal (breast) samples and 127 blood samples. Libraries were prepared with Illumina's Nextera custom enrichment kit targeting all the exons of the most frequently mutated breast cancer genes. Libraries were multiplexed (48 libraries per lane) and sequenced on Illumina HiSeq 2000 (100bp paired-end reads). Somatic mutations were called with a custom pipeline. 
We identified 40 mutation-driver (Mut-driver) genes, and determined associations between mutations, driver CNA profiles, clinical-pathological parameters and survival. We assessed the clonal states of Mut-driver mutations, and estimated levels of intra-tumour heterogeneity using mutant-allele fractions. The results emphasize the importance of genome-based stratification of breast cancer, and have important implications for designing therapeutic strategies.
Reference: Pereira et al. (2016) The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nature Communications 
    
   
  
    
   
  3083 
 
  
    EGAD00001002116 
   
  
    
    Raw data (fastq files) from whole exome sequencing of AML patients (paired diagnosis and complete remission samples) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001002117 
   
  
    
    Raw data (fastq files) from targeted resequencing of AML patients at diagnosis 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  68 
 
  
    EGAD00001002118 
   
  
    
    Raw data (fastq files) from targeted resequencing of AML patients at relapse 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  24 
 
  
    EGAD00001002119 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LAML-KR. 
    
   
  
    
   
  18 
 
  
    EGAD00001002120 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: ORCA-IN. 
    
   
  
    
   
  26 
 
  
    EGAD00001002121 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BTCA-SG. 
    
   
  
    
   
  24 
 
  
    EGAD00001002122 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BRCA-UK. 
    
   
  
    
   
  90 
 
  
    EGAD00001002123 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: MALY-DE. 
    
   
  
    
   
  202 
 
  
    EGAD00001002124 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: EOPC-DE. 
    
   
  
    
   
  113 
 
  
    EGAD00001002125 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BOCA-UK. 
    
   
  
    
   
  148 
 
  
    EGAD00001002126 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PRAD-UK. 
    
   
  
    
   
  116 
 
  
    EGAD00001002127 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PBCA-DE. 
    
   
  
    
   
  496 
 
  
    EGAD00001002128 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PRAD-CA. 
    
   
  
    
   
  244 
 
  
    EGAD00001002129 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BRCA-EU. 
    
   
  
    
   
  158 
 
  
    EGAD00001002130 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: CLLE-ES. 
    
   
  
    
   
  194 
 
  
    EGAD00001002131 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: RECA-EU. 
    
   
  
    
   
  190 
 
  
    EGAD00001002132 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PACA-AU. 
    
   
  
    
   
  192 
 
  
    EGAD00001002133 
   
  
    
    Dataset contains Whole Exome Sequencing (WES) data from 37 individuals as aligned BAM files. The reads have been aligned to the human genome hg19 build using bowtie2. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001002135 
   
  
    
    ChIPseq data of Atypical teratoid/rhabdoid tumors (ATRT) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001002136 
   
  
    
    RNA sequencing data of Atypical teratoid/rhabdoid tumors (ATRT) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001002137 
   
  
    
    WGBS data of Atypical teratoid/rhabdoid tumors (ATRT) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001002138 
   
  
    
    WGS data of Atypical teratoid/rhabdoid tumors (ATRT) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  36 
 
  
    EGAD00001002142 
   
  
    
    Paired PCR-free whole genome sequencing data of a matched metastatic melanoma cell line (COLO829) and normal across three lineages and across separate institutions, with independent library preparations, sequencing, and analysis. The data was generated with mean mapped coverages of 99X for COLO829 and 103X for the paired normal across three institutions. Overall, common events include >35,000 point mutations, 446 small insertion/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions. 
    
   
  
    
   
  24 
 
  
    EGAD00001002143 
   
  
    
    We expanded our previous collection of longitudinal GBM patients (EGAS00001001041) by recruiting 21 additional patients. Tumor specimens were subjected to whole-exome sequencing (16 of 21 cases, with the matched normal/blood) and transcriptome sequencing (16 of 21 cases). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  86 
 
  
    EGAD00001002144 
   
  
    
    The morphology of the first humans in the Americas (Paleoamericans) differs from that of Native Americans, and has raised the question of whether or not there are also differences in origin or genetics. A few populations who survived until relatively recently have been suggested to retain Paleoamerican morphology. One of these populations is from La Jolla. Here, we have generated genome sequence data from four La Jolla individuals in order to investigate these questions.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002145 
   
  
    
    Whole exome sequencing data of primary, secondary and tertiary tumor from a patient. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001002146 
   
  
    
    The dataset contains the whole genome sequencing data of a family with two unaffected parents and two probands who showed symptoms of hereditary spastic paraplegia. Sequencing reads were aligned to the human genome (GRCh38) using BWA-MEM, followed by indel realignment and PCR-duplicate marking. Alignment results are available for download in BAM format. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001002148 
   
  
    
    Directed differentiation of stem cells offers a scalable solution to the need for human cell models recapitulating islet biology and T2D pathogenesis. We profiled mRNA expression at six stages of an induced pluripotent stem cell (iPSC) model of endocrine pancreas development from two donors, and characterized the distinct transcriptomic profiles associated with each stage. Established regulators of endodermal lineage commitment, such as SOX17 (log2 fold change [FC] compared to iPSCs=14.2, p-value=4.9x10^-5) and the pancreatic agenesis gene GATA6 (log2 FC=12.1, p-value=8.6x10^-5), showed transcriptional variation consistent with their known developmental roles. However, these analyses highlighted many other genes with stage-specific expression patterns, some of which may be novel drivers or markers of islet development. For example, the leptin receptor gene, LEPR, was most highly expressed in published data from in vivo-matured cells compared to the endocrine pancreas-like cells (log2 FC=5.5, p-value=2.0x10^-12), suggesting a role for the leptin pathway in the maturation process. Endocrine pancreas-like cells showed significant stage-selective expression of adult islet genes, including INS, ABCC8, and GLP1R, and enrichment of relevant GO-terms (e.g. “insulin secretion”; odds ratio=4.2, p-value=1.9x10^-3): however, principal component analysis indicated that in vitro-differentiated cells were more immature than adult islets. Integration of the stage-specific expression information with genetic data from T2D genome-wide association studies revealed that 46 of 82 T2D-associated loci harbor genes present in at least one developmental stage, facilitating refinement of potential effector transcripts. Together, these data show that expression profiling in an iPSC islet development model can further understanding of islet biology and T2D pathogenesis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001002149 
   
  
    
    Low coverage whole genome sequencing for the identification of somatic copy number alterations (SCNA) and focal amplification mapping in plasma DNA of prostate cancer patients 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  95 
 
  
    EGAD00001002150 
   
  
    
    Low coverage whole genome sequencing for the identification of somatic copy number alterations (SCNA) and focal amplification mapping of corresponding tumor material 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  8 
 
  
    EGAD00001002151 
   
  
    
    Whole transcriptome sequencing of 231 children with newly-diagnosed ALL 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  231 
 
  
    EGAD00001002152 
   
  
    
    Whole exome sequencing for the matched germline and tumor DNA from 10 ALL cases with ZNF384 rearrangements. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001002153 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PAEN-IT. 
    
   
  
    
   
  74 
 
  
    EGAD00001002154 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PAEN-AU. 
    
   
  
    
   
  98 
 
  
    EGAD00001002155 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LIRI-JP. 
    
   
  
    
   
  524 
 
  
    EGAD00001002156 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: ESAD-UK. 
    
   
  
    
   
  198 
 
  
    EGAD00001002157 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: MELA-AU. 
    
   
  
    
   
  140 
 
  
    EGAD00001002158 
   
  
    
    This is an in vitro genome-wide CRISPR/cas9 screen in human glioblastoma stem cells, screening for genes essential for survival of these cells. These cells express cas9 and have been transfected with a guide RNA library causing gene knockouts. We will analyse the sequencing data for depletion of guide RNAs. This dataset contains all the data available for this study on 2016-06-02. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002159 
   
  
    
    Exome Seq for Study EGAS00001001844 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002160 
   
  
    
    Exome Seq for EGAS00001001845 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001002161 
   
  
    
    Transcriptome from EGAS00001001845 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002162 
   
  
    
    Exome Seq from EGAS00001001846 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001002163 
   
  
    
    Transcriptome from EGAS00001001846 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002164 
   
  
    
    Exome from EGA00001001848 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002165 
   
  
    
    Samples were sequenced from 33 multiple myeloma patients including tumor presentation and relapse samples and a matched patient control sample. Tumor DNA was isolated from CD138-positive plasma cells. Control DNA originated from peripheral blood leukapheresis products collected after induction therapy. Libraries were prepared using the SureSelectQXT sample prep kit and the SureSelect Clinical Research Exome kit (Agilent), with additional baits covering the Ig and MYC loci. Paired-end sequencing was performed to an average sequencing depth of 118× on a HiSeq2500 (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  99 
 
  
    EGAD00001002166 
   
  
    
    Exome from EGAS00001001861 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  17 
 
  
    EGAD00001002167 
   
  
    
    A KNIH001 mRNA-seq paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002168 
   
  
    
    A KNIH002 mRNA-seq paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002169 
   
  
    
    A KNIH003 mRNA-seq paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002170 
   
  
    
    A KNIH004 mRNA-seq paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002171 
   
  
    
    A KNIH005 mRNA-seq paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002172 
   
  
    
    A KNIH006 mRNA-seq paired end data for beta cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002173 
   
  
    
    A KNIH007 mRNA-seq paired end data for adipocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002174 
   
  
    
    A KNIH008 mRNA-seq paired end data for adipocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002175 
   
  
    
    A KNIH009 mRNA-seq paired end data for preadipocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002176 
   
  
    
    A KNIH010 mRNA-seq paired end data for podocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002177 
   
  
    
    A KNIH011 mRNA-seq paired end data for podocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002178 
   
  
    
    The study will analyse by exome sequencing 8 Greek family members with an excess of potentially damaging mutations relating to premature MI and no vessel disease, to identify genetic factors underlying this condition. This is a follow on from project GPMI-NVD 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002179 
   
  
    
    Background: A rare subgroup of HIV infected individuals naturally controls infection without
treatment. These "elite controllers" constitute an important model for the natural control of
HIV infection. Indeed, the study of these individuals may provide insights into strategies for
the development of HIV vaccines. Although several HLA and chemokine alleles are known
to be over-represented in elite controllers, only a small portion of HIV phenotypic variation is
explained by known genetic variants. The elite controller phenotype is rare and distinct,
representing the extreme of an infectious disease trait. As such, this phenotype may be partly
explained by variation in host immune control, which may be characterized by differences in
rare functional genetic variants. Genomic regions underlying elite control can be potentially
identified by comparing the presence or frequency of variants in this group to that
representing the opposite extreme. In this context, "rapid progressors" is a group defined by
its rapid immunological and clinical disease progression.
Aim: To extend an existing study, in order to identify DNA sequence variants involved in the
control of HIV infection with greater statistical resolution. Specifically, we aim to sequence up
to 200 exomes from multiple cohort studies within the EuroCoord CASCADE collaboration (a
collaboration of 25 HIV seroconversion cohort studies across Europe). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  183 
 
  
    EGAD00001002180 
   
  
    
    Targeted pulldown of genes known to be recurrently mutated in AML & MDS from patient and normal samples using Agilent Sureselect and for some cases also using Illumina Truseq technology. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  288 
 
  
    EGAD00001002181 
   
  
    
    Barrett's oesophagus is common in the UK, affecting 2% of the population. Family history has been
recorded among the 4000 Barrett's cases collected so far, identifying 241 families. Among them we
have assessed 6 multiplex families with proven Barrett's, defined as having 1 proband and at
least 3 affected first-degree members. We propose to exome sequence the probands of these six
families to assess the presence of pathogenic rare coding variants. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002182 
   
  
    
    The BMP antagonist Grem1 has been shown to be associated with a rare human polyposis
syndrome (HMPS). We have shown that there is a 40 kb duplication on chromosome 15 found in
some patients with HMPS. Traditional serrated adenomas (rare sporadic polyps) share some
morphological features with HMPS polyps and it has long been hypothesised that they are the
sporadic version of HMPS polyps. We have obtained one of these
lesions and in this project we aim to characterise this tumour. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001002183 
   
  
    
    This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  96 
 
  
    EGAD00001002184 
   
  
    
    Sequencing of rare human histiocytic tumour 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002185 
   
  
    
    Exome sequencing of 32 patient samples from Sri Lanka with the condition haemoglobin E beta thalassaemia 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001002186 
   
  
    
    Around 10% of patients who present in melanoma clinics have a first-degree relative with a previous diagnosis of melanoma, while around 3% have three or more relatives who have been diagnosed with the disease. In this project we will whole genome sequence patients from large Dutch familial melanoma pedigrees to identify mutations in genes that drive melanomagenesis. The identification of these genes will facilitate the management of familial melanoma patients and their families. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  38 
 
  
    EGAD00001002187 
   
  
    
    To identify the transcriptome profile in this unique subset of gastric cancers, RNA-seq analyses were performed using frozen cancer tissue. Adjacent normal tissue from the same patients was used for differentially expressed gene selection and fusion gene prediction. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  138 
 
  
    EGAD00001002188 
   
  
    
    Paired-end BAM files of mitochondrial whole genome deep sequencing (mtWGDS) analysis 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  105 
 
  
    EGAD00001002189 
   
  
    
    Paired-end BAM files of the sequencing analysis of the mtDNA polymerase gamma (POLG) gene in the MS-affected co-twins 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  54 
 
  
    EGAD00001002190 
   
  
    
    Single-end BAM files of the targeted deep sequencing analysis of several mtDNA candidate regions in blood and buccal-derived DNA of the corresponding twin pairs. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  140 
 
  
    EGAD00001002191 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001002192 
   
  
    
    Additional sequencing data for 173 donors in EGAS00001000154, a study of Pancreatic Ductal Adenocarcinoma. WGS libraries were used for high-cellularity cases, WXS sequencing to high depth on low-cellularity cases. HiSeq 2xxx platform was used in all cases. The analysis files associated with this dataset are merged, de-duplicated bams aligned against GRCh37, one tumour and one normal bam per donor. 
    
   
  
    
   
  346 
 
  
    EGAD00001002193 
   
  
    
    Single case of T-ALL carrying t(4;6), a novel translocation. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002194 
   
  
    
    This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
We performed exome sequencing on serial samples from a patient with CMML who progressed to AML. The exome sequencing suggests that NPM1, TET2 and DNMT3a mutations were present in the dominant clone in the CMML sample and that NRAS is a new subclonal mutation in the AML sample. Diagnostic data shows the presence of a FLT3-ITD mutation in the AML sample, which is likely to have driven progression. Here we are performing re-sequencing of the putative driver and some passenger mutations which appear to be in the same clone to validate these mutations and to verify the relative quantification of these abnormalities. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  10 
 
  
    EGAD00001002195 
   
  
    
    The aim of this project is to identify rare genetic variants of large effect implicated in complex diseases by focusing on the study of cardiovascular diseases and related quantitative traits in a well characterized isolated population in Cilento area, Italy.
The reference panel has been selected carefully in order to maximize the imputation coverage and quality across all population samples. The selected individuals should meet three criteria: selected individuals should be chip-genotyped and closely related to the maximum number of chip-genotyped individuals so as to maximize imputation coverage; relatedness between selected individuals should be minimal, so as to minimize redundancy in genetic information of the reference panel.
We perform exome sequencing on samples from 250 individuals from the Campora and Gioi-Cardile populations. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  247 
 
  
    EGAD00001002196 
   
  
    
    Our lab is currently using macrophages as a model system for understanding how genetic
variation modulates the response to external environmental stimulus. We want to extend this
beyond regular polyadenylated RNA to small RNAs such as miRNAs. This project would
cover the costs of a pilot to study miRNA response to LPS stimulus, and will be performed as
part of a rotation project in the lab. We will require a small number of miRNA libraries and a
single lane of MiSeq 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001002197 
   
  
    
    Recent GWAS studies have made extensive use of large eQTL data sets to functionally
annotate index SNPs. With a large number of association signals located outside coding
regions there has been an intense search among sequence variants affecting gene
expression at the transcriptional level. However, little progress has been made in mapping
regulatory variants that affect protein levels at the translational or post-translational level. It is
now possible to undertake a protein QTL scan for focused sets of e.g. oxidized proteins by
mass spectrometry. We have established a collaboration with a longitudinal, family-based
study in France, the Stanislas cohort, which comprises circa 1000 nuclear families (4,295
individuals) and has follow up data for 10 years (three visits). We have undertaken a pilot
study in a focus set of 257 subjects from 79 families with the aim to integrate GWAS,
transcriptomic and DNA methylation data with proteomic data on a set of 100 proteins
measured in PBMCs. We have already generated GWAS data using Illumina's core-exome
chip as well as DNA methylation profiles with the 450K array. We propose to use RNA seq to
generate transcriptomic data of the corresponding PBMCs.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  155 
 
  
    EGAD00001002198 
   
  
    
    This set of samples is composed of eight young people (7-16 years old) who have developed melanoma and have first-degree relatives who have also developed cancer, which suggests a genetic component to their disease. Here we want to sequence these samples in order to find the causative mutations. As these samples do not carry any of the high-penetrance mutations known to date, finding the gene(s) responsible will offer new insights into the genetic mechanisms underlying predisposition to melanoma. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002199 
   
  
    
    Sequencing of rare human histiocytic tumour 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002200 
   
  
    
    Whole exome sequencing of families with Congenital Heart Defects (182 trios). Collaboration with David Brook, University of Nottingham. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  541 
 
  
    EGAD00001002201 
   
  
    
    Data for paper: Epigenetic dynamics of monocyte to macrophage differentiation with ChIP-seq, NOMe, mRNA, total RNA, noncoding RNA, whole genome bisulfite seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002202 
   
  
    
    This dataset contains the fastq and BAM files for 64 samples.
The study group consisted of 17 obese women with normal glucose tolerance and 15 obese women with T2DM classified according to WHO standards. The groups were matched for age, BMI and waist circumference. All the women had been morbidly obese (BMI>40 kg/m2) for at least five years. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  64 
 
  
    EGAD00001002203 
   
  
    
    Sequence data is from 4 samples from an adult patient with TCF3-PBX1 t(1;19)-positive acute lymphoblastic leukemia.
Exome sequencing was performed on a skin biopsy (normal tissue control) and leukemic bone marrow biopsies taken at diagnosis and at two relapse time points. 
RNA-sequence data is from leukemic bone marrow from two relapse biopsies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001002204 
   
  
    
    1006 familial early-onset germline CRC patients sequenced by the Molecular and Population Genetics group of the Institute of Cancer Research 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1006 
 
  
    EGAD00001002205 
   
  
    
    The BLUEPRINT project is a large-scale project investigating epigenetic mechanisms involved in blood formation, in health and disease. The  human variation workpackage (WP10, led by NS) of the project seeks to characterize the effect of common sequence variation on the epigenome status of a cell. To do this, the project will use highly purified blood cells to minimise "experimental noise" and therefore enhance the power to discover modest effects.  Two peripheral blood cell types, the CD14+CD16- monocyte (an important central orchestrator of adaptive immunity and a bridge between innate and adaptive immunity) and the CD65+CD9- neutrophilic granulocyte (the frontline cell for innate immunity) have been selected for this purpose.  The two types of cells will be obtained at high purity from adult blood (AB) of 200 healthy males and females, respectively.  Cells will be purified by using already validated and fully operational protocols that are based on density gradient centrifugation of the buffy coat obtained from whole blood, followed by magnetic bead-based purification using monoclonal antibodies against Cluster of Differentiation (CD) lineage-specific cell surface markers.  Units of 475 ml of AB will be obtained from consenting volunteers of the Cambridge BioResource (CBR), a panel of 10,000 healthy volunteers local to Cambridge who have already consented to participate in biomedical research and of whom biological samples (DNA, plasma, serum) and lifestyle data have been deposited in a repository and database, respectively.  We are requesting funding from the Human Diversity project to sequence the genomes of the 200 CBR volunteers at low pass (6x coverage).  Nuclei, DNA and RNA will be recovered from the purified cells and made available for RNA-seq, DNA-seq and ChIP-seq and genomic DNA for entire genome sequencing will be recovered from the DNA repository. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  155 
 
  
    EGAD00001002207 
   
  
    
    Our aim is to identify genes involved in resistance to anti-cancer therapies. In order to do this we have taken advantage of a lentiviral vector (LV)-based insertional mutagen to mutagenize cancer cell lines. LV-transduced cell lines were then treated with anti-cancer therapies and the emergence of resistant clones scored. DNA from pools of resistant clones was collected, subjected to custom capture by baits designed against the LV sequence, and then sequenced to identify the LV-genomic junction. We hope that the identification of recurrently targeted genes in resistant cell populations will allow us to identify genes that mediate drug resistance. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  71 
 
  
    EGAD00001002208 
   
  
    
    Exome sequencing of short SGA children with IGF-I and insulin resistance. Collaboration with Professor David Dunger, University of Cambridge. Funded by NIHR. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001002210 
   
  
    
    Congenital anosmias can be complete (the lack of a sense of smell) or specific (the inability to detect specific smells). To date, only a single recessive gene underlying complete anosmia has been identified. Here we sequenced the exomes of 10 individuals from a single family, including three with complete anosmia, across three generations to identify the genetic basis of congenital anosmia in this family. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001002211 
   
  
    
    Given the central importance of Africa to studies of human origins, genetic diversity and disease susceptibility, large-scale and representative characterisation of genetic diversity in Africa is needed. Analyses of ancient DNA from Africa would complement sequencing of modern African populations and provide unique opportunities to transform our understanding of the pre-history of the region. This approach would greatly refine our understanding of population structure and gene flow in Africa and globally, including genetic signatures of ancient admixture. This low coverage sequencing experiment will allow us to test and refine our pipeline for ancient DNA sequencing.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002212 
   
  
    
    Non-syndromic cases of congenital heart defects (CHD) exhibit variable modes of inheritance (Mendelian and non-Mendelian). Several studies have identified strong candidates in humans by taking a candidate gene approach as well as by using whole exome next generation sequencing (NGS). So far these studies could only explain a minor fraction of the observed phenotype in humans, most of them in syndromic cases, and no single study has focused on the subset of cases with left ventricular outflow tract obstruction (LVOTO). To discover novel disease-causing genes, a large cohort of patients with LVOTO, approximately 100 cases, 25 families and 100 trios, has been exome sequenced. This study based on NGS sequencing data yielded several known compelling candidate genes, such as MYH6, NR2F2 and MYH11, but also novel ones, such as ITGB4. To evaluate the significance of our findings in a replication cohort we assembled another 1614 cases with an LVOTO phenotype from our collaborators in Toronto, Berlin and Amsterdam. Targeted resequencing in this additional cohort will help to find additional cases with mutations in the identified candidate genes to strengthen genotype-phenotype association. We will use control data from the INTERVAL project for case/control analyses. The pulldowns will be performed as 24-plex ISC with 192 or greater indexes, and the sequencing will be performed with 192 samples per lane, requiring 9 lanes of sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1376 
 
  
    EGAD00001002213 
   
  
    
    This study involves exome sequencing of blood/bone marrow DNA from patients with myeloid malignancies. Blood DNA samples have been taken from patients at different timepoints of disease phenotype. We hope to elucidate mechanisms of clonal evolution in these patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001002214 
   
  
    
    Whole transcriptome sequencing generated from patient, neurosphere and xenograft samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  64 
 
  
    EGAD00001002215 
   
  
    
    Low coverage whole genome sequencing of plasma DNA from 50 male and 54 female non-cancer donors. For the analysis of nucleosomal positioning all data from the non-cancer controls were merged. Furthermore, two patients with metastasized breast cancer were sequenced on a NextSeq with higher depth. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  108 
 
  
    EGAD00001002216 
   
  
    
    RNA-Seq on an Ion Torrent Proton of corresponding tumor material of two metastasized breast cancer patients (Breast7, Breast13). 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  2 
 
  
    EGAD00001002217 
   
  
    
    Merged file of low-coverage WGS from 179 plasma DNA samples from non-cancer controls and cancer patients for assessment of size distribution of plasma nuclear DNA fragments. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001002218 
   
  
    
    Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - 129_cohort
EAC whole genomic sequencing data - Publication Secrier & Li et al., 2016, Nature Genetics 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001002219 
   
  
    
    Whole exome sequencing generated from 13 sets of patient, neurosphere and xenograft samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  82 
 
  
    EGAD00001002220 
   
  
    
    Enteropathy-associated T-cell lymphoma (EATL), a rare and aggressive intestinal malignancy of intraepithelial T lymphocytes, comprises two disease variants (EATL-I and EATL-II) differing in clinical characteristics and pathological features. Here we report findings derived from whole exome sequencing of 15 EATL-II tumor-normal tissue pairs. 
    
   
  
    
   
  15 
 
  
    EGAD00001002221 
   
  
    
    Whole exome sequencing of a subset of participants from the INTERVAL study. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4502 
 
  
    EGAD00001002225 
   
  
    
    This study involves targeted sequencing of samples from myeloid malignancies at different timepoints to assess clonal evolution of malignancy.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  147 
 
  
    EGAD00001002226 
   
  
    
    Odors are detected, firstly, by olfactory sensory neurons (OSNs) in the olfactory epithelium of the nose. These neurons then project directly to the olfactory bulb in the brain. Olfaction depends on cellular regeneration of the OE, olfactory bulb and hippocampus, and on their continual re-wiring. The olfactory neural pathway includes regions of the frontal, temporal and limbic brain, which in turn overlap with brain areas involved in brain disorders. OSNs are the only aspect of the human brain exposed to the external environment. This not only makes them vulnerable to environmental changes, but also accessible for biomedical studies.
We have already sequenced and developed a protocol for analyzing the transcriptome of mouse main olfactory epithelium and single OSNs. We propose here to perform a similar study for samples from the human olfactory epithelium. 
We have developed a minimally invasive method for obtaining human OSNs, among other cells from the nasal epithelium. In this experiment, we have obtained cell samples from the olfactory epithelium, including OSNs, from healthy volunteers. We would like to further characterize them by RNA sequencing. This will give us valuable insight into human olfaction. It will also provide a first step into a new avenue to study, and find biomarkers for, brain diseases through the analysis of these easily available neurons.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001002227 
   
  
    
    In collaboration with Dr David Savage, we have identified a patient with a very unusual
phenotype, lacking almost all visceral fat, but showing a massive accumulation of white fat
tissue behind her neck and significantly elevated liver fat.
Whole exome sequencing of the proband and her unaffected parents and brother has been
run previously, however no causative variant has been found and the sequencing coverage
was generally poor. We propose to conduct whole genome sequencing of all 4 family
members at a depth of 30X. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001002228 
   
  
    
    Congenital anosmias can be complete (the lack of a sense of smell) or specific (the inability to detect specific smells). Here we obtained genomic DNA from families with multiple individuals with anosmia, suggesting they are congenital. These include those inherited in a manner consistent with dominant and recessive alleles. We have sequenced the exomes of both affected and unaffected family members on the Illumina platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001002229 
   
  
    
    Detection of BAP1 mutations in DNA from uveal melanoma and mesothelioma samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001002230 
   
  
    
    Patient-derived xenografts (n=96) were derived from metastatic melanoma patients. RNA expression profiling will be performed to study (1) HLA typing and (2) the effect of the tumour microenvironment on tumour growth.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  96 
 
  
    EGAD00001002231 
   
  
    
    Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011), have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. 
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  80 
 
  
    EGAD00001002232 
   
  
    
    Mapping genetic evolution of pancreatic cancer precursor lesions such as IPMNs and PanINs. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001002233 
   
  
    
    RNA sequencing of peripheral immune cells from patients +/- an IBD risk variant. Peripheral immune cells +/- in vitro test compound treatment.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001002234 
   
  
    
    This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  84 
 
  
    EGAD00001002235 
   
  
    
    Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011), have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. 
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001002236 
   
  
    
    The disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription; co-ordinated secondary alterations in transcriptional pathways; and increased transcriptional noise. To catalogue the rules governing how somatic mutation Overall, 59% of 6980 exonic substitutions were expressed. Compared to other classes, nonsense mutations showed lower expression levels than expected with patterns characteristic of nonsense-mediated decay. 14% of 4234 genomic rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusion transcripts and premature poly-adenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes may drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data therefore reveals the rules by which transcriptional machinery interprets somatic mutation. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001002237 
   
  
    
    The disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription; co-ordinated secondary alterations in transcriptional pathways; and increased transcriptional noise. To catalogue the rules governing how somatic mutation Overall, 59% of 6980 exonic substitutions were expressed. Compared to other classes, nonsense mutations showed lower expression levels than expected with patterns characteristic of nonsense-mediated decay. 14% of 4234 genomic rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusion transcripts and premature poly-adenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes may drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data therefore reveals the rules by which transcriptional machinery interprets somatic mutation. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001002238 
   
  
    
    ChIP-Seq (H3K4me3, H3K4me1, H3K9me3, H3K27ac, H3K27me3, H3K36me3, Input) data for HL60 cell line generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002239 
   
  
    
    June 2016 data update (bam/fastq for CEMT0062, CEMT0068, CEMT0072, CEMT0086, CEMT0087 ChIP-Seq and RNA-Seq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada, as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001002240 
   
  
    
    Whole-exome sequencing of a RUNX1-mutated pedigree, including samples from mother, father and four offspring. Recurrent somatic JAK-STAT mutations were found among the diseased individuals. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002241 
   
  
    
    Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - chemo_cohort 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002242 
   
  
    
    This dataset contains RNA-seq and Hi-C data files of induced pluripotent stem (iPS) cells and iPS cell-derived neural progenitors (NPCs) derived from a germline chromothripsis patient and both parents. iPS cells of the patient (cell lines 14 and 15), the father (lines 23 (with two replicates) and 32) and mother (line 30) were differentiated to NPCs and RNA was collected on day 0, day 7 and day 10 of differentiation. In addition, Hi-C data for two iPS cell-derived NPC lines from the patient (14 and 15) and two lines from the father (23 and 32) was generated. 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  22 
 
  
    EGAD00001002243 
   
  
    
    RNA-seq data for clinical samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001002244 
   
  
    
    WGS data for cell lines and clinical samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001002245 
   
  
    
    This data set consists of 82 whole genome low pass sequencing bams used in HF-GBM-Tumor-Neurosphere-Xenograft 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  82 
 
  
    EGAD00001002246 
   
  
    
    The T2D-GENES/GoT2D 13K exome sequencing study includes ~13,000 samples, half T2D cases and half T2D controls, from five ancestries (~5K Europeans, ~2K each of African-American, East-Asian, South-Asian, and Hispanic). Samples underwent deep exome sequencing, with SNVs and INDELs called according to GATK best practices; variant sites were then filtered according to the GATK best practices, and then samples and variants underwent further filtering based on aggregate genotype quality as described in Fuchsberger et al. (e.g. low call rate, excess heterozygosity for samples, low call rate or coverage for variants).
Please note that one of the samples in the T2D-GENES vcf does not have phenotype data. 
    
   
  
    
   
  13007 
 
  
    EGAD00001002247 
   
  
    
    The GoT2D study includes ~2800 samples, half T2D cases and half T2D controls, of Northern European ancestry sequenced using three technologies: deep whole exome sequencing, low-pass (4x) whole genome sequencing, and OMNI 2.5M genotyping. Samples were ascertained to be phenotypically "extreme" (e.g. leaner, younger cases and older, more obese controls). Genotypes (SNVs, INDELs, and SVs) were called separately for each technology and then integrated via genotype refinement into a single phased reference panel; samples and variants were then excluded based on QC procedures described in Fuchsberger et al. 
Please note that 2 of the samples in the GoT2D vcf do not have phenotype data. 
    
   
  
    
   
  2872 
 
  
    EGAD00001002248 
   
  
    
    A total of 49 tumor specimens from 20 patients were subjected to whole-exome and/or whole-transcriptome sequencing, including matched normal/blood samples. Tumor samples were acquired based on 4 categories: 1) locally adjacent tumors, 2) multifocal/multicentric tumors, 3) 5-ALA (+/-) tumors and 4) longitudinal tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  104 
 
  
    EGAD00001002249 
   
  
    
    Single-Cell RNA Sequencing of 355 cells isolated from 7 tissue fragments of 3 patients corresponding to locally adjacent tumor, multifocal with recurrence and sections segregated by a marker of tumor cellularity (5-ALA). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  355 
 
  
    EGAD00001002250 
   
  
    
    mRNA-Seq, HiSeq 2000 dataset of the Cell-line use case 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002251 
   
  
    
    Exome sequencing of families with Congenital Heart Defects of diverse sub-phenotypes. Comprises both parent-offspring trios for sporadic cases and multiplex families. Collaboration with David Brook, University of Nottingham. Funded by the British Heart Foundation. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  646 
 
  
    EGAD00001002252 
   
  
    
    This data set contains next generation sequencing (NGS) data of two serial tumor samples (primary and a metastasis) from a patient with colorectal cancer showing an ERBB2 c.2264T>C (p.Leu755Ser) variant. NGS was performed using the Illumina TruSeq Amplicon Cancer Panel (TSACP, Illumina) covering 212 amplicons in 48 cancer associated genes on the Illumina MiSeq sequencing platform. The dataset contains two BAM files. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  2 
 
  
    EGAD00001002253 
   
  
    
    Thirty cutaneous SCC WES tumour samples with matched normal include 20 samples from South et al. JID and 10 new samples. These 30 samples have been used to support the findings in the TGFb Nature Communications paper (DOI: 10.1038/ncomms12493). They are also part of the ongoing study of the cSCC genomic landscape of 40 cSCC samples in total. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001002254 
   
  
    
    Single-end sequencing data (trimmed to 60bp) of 104 plasma samples from donors without tumors (male=50; female=54) were merged and used to establish coverage profiles around the TSS and to establish a gene expression prediction algorithm. Dataset includes merged alignments of low coverage whole genome sequencing from plasma DNA from 50 male, 54 female non-cancer donors. Furthermore, 2 patients with metastasized breast cancer were sequenced on a NextSeq with higher depth. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  3 
 
  
    EGAD00001002255 
   
  
    
    Sequencing Data for DEEP Paper: "reChIP-seq reveals widespread bivalency of H3K4me3 and H3K27me3 in CD4+ memory T-Cells"
Sample: 51_Hf01_BlCM_Ct (human, female, Blood, CD4+ central memory cell, normal control)
Sequencing types are: total RNA, Whole Genome Bisulfite, ChIP-Seq (H3K27ac, H3K9me3, H3K36me3, H3K4me1, H3K27me3, H3K4me3, Input), reChIP-Seq (H3K27me3, H3K4me3) 
    
   
  
    
   
  1 
 
  
    EGAD00001002256 
   
  
    
    This data set comprises whole-exome sequencing of Korean ER-positive breast cancer in patients under 35. It provides 100 alignment files from normal-tumor paired whole-exome sequencing of 50 patients. This is part of the total project data set. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  100 
 
  
    EGAD00001002257 
   
  
    
    This dataset includes whole genome sequence information for three individuals (Mother, Father and Newborn) used in this study.  Genomes were sequenced using Illumina HiSeq technology. Files included are fastq files in paired read format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002258 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002259 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001002260 
   
  
    
    Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - 129_rnaseq
EAC expression data - Publication Secrier & Li et al., 2016, Nature Genetics 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001002261 
   
  
    
    These files contain indels and structural variants on 769 GoNL samples (SV release 6, 2016-05-25). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001002262 
   
  
    
    26 cell lines derived from human Diffuse Large B Cell lymphomas (DLBCL) or Burkitt Lymphomas (BL) were subjected to whole exome sequencing. Exome capture was carried out using the SeqCap EZ Exome Library 2.0 kit (Roche/Nimblegen) and 100 bp single-read sequencing was performed on a HiSeq2500 (Illumina). 82% of the coding region was covered at least 30x. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001002263 
   
  
    
    This is the first dataset for the Botseq sequencing project 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001002264 
   
  
    
    This data set consists of whole genome SMRT sequencing fastqs generated from 2 xenograft samples. 
    
   
  
    
      
      PacBio RS II 
      
    
   
  2 
 
  
    EGAD00001002265 
   
  
    
    A pulldown experiment with Agilent SureSelect probes designed on regions that were more likely to contain de novo mutations. 266 candidate sites were selected based on whole genome sequencing data. The probes also included the exons of genes that have been identified as neurodevelopmental disorder genes in DDD (the DDG2P genes; 1,336 targets). In addition, the design included the standard iPLEX sites. 
    
   
  
    
   
  4 
 
  
    EGAD00001002266 
   
  
    
    The contribution of genetic predisposing factors to the development of pediatric acute lymphoblastic leukemia (ALL), the most frequently diagnosed cancer in childhood, has not been fully elucidated. Children presenting with multiple de novo leukemias are more likely to suffer from genetic predisposition. Here, we selected five of these patients and analyzed the mutational spectrum of normal and malignant tissues. 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
      AB SOLiD 4 System 
      
    
   
  14 
 
  
    EGAD00001002268 
   
  
    
    PCHiC 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  53 
 
  
    EGAD00001002269 
   
  
    
    We expressed PDGFRAmut, wild-type PDGFRA and a GFP control from lentivirus, in two primary GBM patient-derived cell lines that we had cultured as monolayers. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001002270 
   
  
    
    We collected fresh tissue from an untreated GBM (SF10282) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002271 
   
  
    
    We collected fresh tissue from an untreated GBM (SF10345) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002272 
   
  
    
    We collected fresh tissue from an untreated GBM (SF10360) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002273 
   
  
    
    We performed bulk exome-seq on a primary GBM and a blood sample from SF10345 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002274 
   
  
    
    We performed bulk exome-seq on a primary GBM and a blood sample from SF10360 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002275 
   
  
    
    We performed bulk exome-seq on a primary GBM and a blood sample from SF10282 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002276 
   
  
    
    Exome sequencing reads of two UFM individuals and their family members (11 individuals in total) belonging to two different Fragile X families. Alignment files in BAM format are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001002277 
   
  
    
    Variation in the Glucose Transporter gene SLC2A2 is associated with glycaemic response to metformin 
    
   
  
    
   
  1 
 
  
    EGAD00001002278 
   
  
    
   
  
    
   
  58 
 
  
    EGAD00001002279 
   
  
    
    ChIP-Seq data for 3 monocyte sample(s). 17 run(s), 17 experiment(s), 17 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002280 
   
  
    
    ChIP-Seq data for 1 Acute Lymphocytic Leukemia - CTR sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002281 
   
  
    
    ChIP-Seq data for 5 plasma cell sample(s). 24 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002282 
   
  
    
    ChIP-Seq data for 1 unswitched memory B cell sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002283 
   
  
    
    ChIP-Seq data for 3 effector memory CD8-positive, alpha-beta T cell sample(s). 16 run(s), 15 experiment(s), 15 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002284 
   
  
    
    Bisulfite-Seq data for 3 immature conventional dendritic cell - GM-CSF_IL4_T=6_days sample(s). 61 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002285 
   
  
    
    DNase-Hypersensitivity data for 1 CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002286 
   
  
    
    DNase-Hypersensitivity data for 28 CD14-positive, CD16-negative classical monocyte sample(s). 28 run(s), 28 experiment(s), 28 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001002287 
   
  
    
    RNA-Seq data for 1 T-cell Acute Lymphocytic Leukemia sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002288 
   
  
    
    RNA-Seq data for 3 neutrophilic myelocyte sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002289 
   
  
    
    RNA-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002290 
   
  
    
    DNase-Hypersensitivity data for 4 macrophage - T=6days LPS sample(s). 6 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002291 
   
  
    
    Bisulfite-Seq data for 1 precursor lymphocyte of B lineage sample(s). 11 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002292 
   
  
    
    ChIP-Seq data for 3 Acute Myeloid Leukemia - SAHA sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002293 
   
  
    
    ChIP-Seq data for 2 regulatory T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002294 
   
  
    
    Bisulfite-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 35 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002295 
   
  
    
    RNA-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002296 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_LPS_T=24hrs_RPMI_T=5days_LPS_T=4hrs sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002297 
   
  
    
    ChIP-Seq data for 2 monocyte - RPMI_BG_T=1hr sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  2 
 
  
    EGAD00001002298 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - SAHA sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002299 
   
  
    
    RNA-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002300 
   
  
    
    DNase-Hypersensitivity data for 2 monocyte - T=0days sample(s). 4 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002301 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=4hrs sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002302 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=24hrs sample(s). 22 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002303 
   
  
    
    Bisulfite-Seq data for 3 T-cell Prolymphocytic Leukemia sample(s). 45 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002304 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002305 
   
  
    
    Bisulfite-Seq data for 6 alternatively activated macrophage sample(s). 94 run(s), 7 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002306 
   
  
    
    RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002307 
   
  
    
    ChIP-Seq data for 3 Activated B-Cell-Like Diffuse Large B-Cell Lymphoma sample(s). 12 run(s), 12 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002308 
   
  
    
    RNA-Seq data for 8 CD14-positive, CD16-negative classical monocyte sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002309 
   
  
    
    Bisulfite-Seq data for 2 mature eosinophil sample(s). 23 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002310 
   
  
    
    ChIP-Seq data for 1 conventional dendritic cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002311 
   
  
    
    Bisulfite-Seq data for 2 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002312 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (24h) sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002313 
   
  
    
    Bisulfite-Seq data for 7 Acute Lymphocytic Leukemia sample(s). 132 run(s), 9 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002314 
   
  
    
    ChIP-Seq data for 2 macrophage - T=6days B-glucan sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002315 
   
  
    
    RNA-Seq data for 6 naive B cell sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002316 
   
  
    
    RNA-Seq data for 6 hematopoietic stem cell sample(s). 13 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002317 
   
  
    
    ChIP-Seq data for 7 alternatively activated macrophage sample(s). 50 run(s), 49 experiment(s), 49 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002318 
   
  
    
    ChIP-Seq data for 4 neutrophilic myelocyte sample(s). 28 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002319 
   
  
    
    ChIP-Seq data for 3 Acute Promyelocytic Leukemia - ATRA sample(s). 21 run(s), 20 experiment(s), 20 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002320 
   
  
    
    RNA-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002321 
   
  
    
    RNA-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002322 
   
  
    
    Bisulfite-Seq data for 5 plasma cell sample(s). 77 run(s), 5 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002323 
   
  
    
    RNA-Seq data for 7 plasma cell sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002324 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=24hrs_RPMI_T=5days sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002325 
   
  
    
    Bisulfite-Seq data for 2 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002326 
   
  
    
    RNA-Seq data for 2 mature eosinophil sample(s). 3 run(s), 3 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002327 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=1hr sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002328 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_T=4hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002329 
   
  
    
    ChIP-Seq data for 3 Burkitt Lymphoma sample(s). 13 run(s), 13 experiment(s), 13 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002330 
   
  
    
    Bisulfite-Seq data for 1 precursor B cell sample(s). 6 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002331 
   
  
    
    Bisulfite-Seq data for 3 neutrophilic myelocyte sample(s). 31 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002332 
   
  
    
    RNA-Seq data for 1 Acute Lymphocytic Leukemia - CTR sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002333 
   
  
    
    Bisulfite-Seq data for 2 Acute Myeloid Leukemia - CTR sample(s). 28 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002334 
   
  
    
    RNA-Seq data for 2 adult endothelial progenitor cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002335 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_T=24hrs sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002336 
   
  
    
    RNA-Seq data for 5 Mantle Cell Lymphoma sample(s). 5 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002337 
   
  
    
    RNA-Seq data for 4 macrophage - T=6days LPS sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002338 
   
  
    
    RNA-Seq data for 4 monocyte - None sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002339 
   
  
    
    RNA-Seq data for 6 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 24 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002340 
   
  
    
    ChIP-Seq data for 7 Acute Myeloid Leukemia - CTR sample(s). 45 run(s), 44 experiment(s), 44 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002341 
   
  
    
    RNA-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002342 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MS275 sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002343 
   
  
    
    RNA-Seq data for 2 Acute Myeloid Leukemia - SAHA sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002344 
   
  
    
    ChIP-Seq data for 1 Acute Myeloid Leukemia - MC2884 sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002345 
   
  
    
    RNA-Seq data for 2 conventional dendritic cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002346 
   
  
    
    Bisulfite-Seq data for 2 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 34 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002347 
   
  
    
    RNA-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002348 
   
  
    
    RNA-Seq data for 10 CD4-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001002349 
   
  
    
    RNA-Seq data for 2 central memory CD4-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002350 
   
  
    
    DNase-Hypersensitivity data for 3 erythroblast sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002351 
   
  
    
    RNA-Seq data for 1 regulatory T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002352 
   
  
    
    RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002353 
   
  
    
    RNA-Seq data for 3 Acute Promyelocytic Leukemia - CTR sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002354 
   
  
    
    Bisulfite-Seq data for 3 class switched memory B cell sample(s). 43 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002355 
   
  
    
    DNase-Hypersensitivity data for 37 Acute Myeloid Leukemia sample(s). 38 run(s), 37 experiment(s), 37 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001002356 
   
  
    
    RNA-Seq data for 3 Acute Promyelocytic Leukemia - ATRA sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002357 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002358 
   
  
    
    RNA-Seq data for 8 erythroblast sample(s). 30 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002359 
   
  
    
    RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2392 sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002360 
   
  
    
    RNA-Seq data for 4 monocyte - T=0days sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002361 
   
  
    
    Bisulfite-Seq data for 3 naive B cell sample(s). 39 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002362 
   
  
    
    ChIP-Seq data for 3 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 19 run(s), 18 experiment(s), 18 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002363 
   
  
    
    RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002364 
   
  
    
    Bisulfite-Seq data for 2 central memory CD4-positive, alpha-beta T cell sample(s). 41 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002365 
   
  
    
    RNA-Seq data for 1 blast forming unit erythroid sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002366 
   
  
    
    RNA-Seq data for 3 neutrophilic metamyelocyte sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002367 
   
  
    
    Bisulfite-Seq data for 2 effector memory CD4-positive, alpha-beta T cell sample(s). 34 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002368 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_BG_T=4hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001002369 
   
  
    
    ChIP-Seq data for 2 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 7 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002370 
   
  
    
    Bisulfite-Seq data for 2 CD8-positive, alpha-beta thymocyte sample(s). 28 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002371 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_T=6days sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002372 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (4h) sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002373 
   
  
    
    Bisulfite-Seq data for 1 macrophage - T=6days untreated sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002374 
   
  
    
    DNase-Hypersensitivity data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002375 
   
  
    
    RNA-Seq data for 2 monocyte sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002376 
   
  
    
    ChIP-Seq data for 2 monocyte - RPMI_LPS_T=1hr sample(s). 8 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  2 
 
  
    EGAD00001002377 
   
  
    
    ChIP-Seq data for 2 erythroblast sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002378 
   
  
    
    Bisulfite-Seq data for 3 band form neutrophil sample(s). 34 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002379 
   
  
    
    ChIP-Seq data for 4 Multiple Myeloma sample(s). 34 run(s), 28 experiment(s), 28 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002380 
   
  
    
    RNA-Seq data for 3 segmented neutrophil of bone marrow sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002381 
   
  
    
    ChIP-Seq data for 2 mesenchymal stem cell of the bone marrow sample(s). 16 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002382 
   
  
    
    DNase-Hypersensitivity data for 4 macrophage sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002383 
   
  
    
    Bisulfite-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 30 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002384 
   
  
    
    ChIP-Seq data for 106 Chronic Lymphocytic Leukemia sample(s). 174 run(s), 163 experiment(s), 162 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  107 
 
  
    EGAD00001002385 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=24hrs_RPMI_T=5days sample(s). 18 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002386 
   
  
    
    ChIP-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002387 
   
  
    
    RNA-Seq data for 7 alternatively activated macrophage sample(s). 9 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002388 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_LPS_T=24hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001002389 
   
  
    
    ChIP-Seq data for 3 Lymphoma_Follicular sample(s). 11 run(s), 11 experiment(s), 11 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002390 
   
  
    
    ChIP-Seq data for 4 segmented neutrophil of bone marrow sample(s). 24 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001002391 
   
  
    
    ChIP-Seq data for 2 osteoclast sample(s). 17 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002392 
   
  
    
    Bisulfite-Seq data for 4 Type 1 diabetes mellitus sample(s). 32 run(s), 4 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002393 
   
  
    
    Bisulfite-Seq data for 3 segmented neutrophil of bone marrow sample(s). 34 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002394 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_T=4hrs sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002395 
   
  
    
    Bisulfite-Seq data for 2 monocyte - None sample(s). 70 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002396 
   
  
    
    Bisulfite-Seq data for 6 Chronic Lymphocytic Leukemia sample(s). 84 run(s), 6 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002397 
   
  
    
    ChIP-Seq data for 5 Mantle Cell Lymphoma sample(s). 35 run(s), 35 experiment(s), 35 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002398 
   
  
    
    DNase-Hypersensitivity data for 4 macrophage - T=6days untreated sample(s). 6 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002399 
   
  
    
    ChIP-Seq data for 2 monocyte - RPMI_LPS_T=4hrs sample(s). 5 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002400 
   
  
    
    ChIP-Seq data for 9 T-cell Acute Lymphocytic Leukemia sample(s). 41 run(s), 41 experiment(s), 41 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002401 
   
  
    
    RNA-Seq data for 14 Multiple Myeloma sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001002402 
   
  
    
    RNA-Seq data for 1 late basophilic and polychromatophilic erythroblast sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002403 
   
  
    
    Bisulfite-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 54 run(s), 5 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002404 
   
  
    
    Bisulfite-Seq data for 2 adult endothelial progenitor cell sample(s). 38 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002405 
   
  
    
    Bisulfite-Seq data for 1 monocyte - T=0days sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002406 
   
  
    
    ChIP-Seq data for 2 Acute Promyelocytic Leukemia - MC2392 sample(s). 14 run(s), 12 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002407 
   
  
    
    Bisulfite-Seq data for 2 Acute Promyelocytic Leukemia - CTR sample(s). 27 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002408 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_BG_T=24hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002409 
   
  
    
    RNA-Seq data for 14 mature neutrophil sample(s). 14 run(s), 14 experiment(s), 13 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001002410 
   
  
    
    Bisulfite-Seq data for 1 macrophage - T=6days B-glucan sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002411 
   
  
    
    ChIP-Seq data for 3 T-cell Prolymphocytic Leukemia sample(s). 21 run(s), 21 experiment(s), 21 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002412 
   
  
    
    Bisulfite-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 33 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002413 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_T=6days sample(s). 10 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002414 
   
  
    
    RNA-Seq data for 1 unswitched memory B cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002415 
   
  
    
    ChIP-Seq data for 3 monocyte - Attached_T=1hr sample(s). 8 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001002416 
   
  
    
    Bisulfite-Seq data for 3 memory B cell sample(s). 47 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002417 
   
  
    
    RNA-Seq data for 6 inflammatory macrophage sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002418 
   
  
    
    ChIP-Seq data for 38 Acute Myeloid Leukemia sample(s). 244 run(s), 226 experiment(s), 226 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001002419 
   
  
    
    Bisulfite-Seq data for 19 Acute Myeloid Leukemia sample(s). 338 run(s), 32 experiment(s), 38 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001002420 
   
  
    
    ChIP-Seq data for 1 monocyte - T=10day_RANK_M-CSF sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002421 
   
  
    
    ChIP-Seq data for 15 Acute Lymphocytic Leukemia sample(s). 79 run(s), 78 experiment(s), 78 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001002422 
   
  
    
    RNA-Seq data for 1 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002423 
   
  
    
    Bisulfite-Seq data for 2 erythroblast sample(s). 35 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002424 
   
  
    
    ChIP-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002425 
   
  
    
    DNase-Hypersensitivity data for 4 macrophage - T=6days B-glucan sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002426 
   
  
    
    RNA-Seq data for 3 T-cell Prolymphocytic Leukemia sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002427 
   
  
    
    Bisulfite-Seq data for 2 osteoclast sample(s). 88 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002428 
   
  
    
    Bisulfite-Seq data for 3 mature conventional dendritic cell - GM-CSF_IL4_T=6_days_R848_T=24hrs sample(s). 60 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002429 
   
  
    
    Bisulfite-Seq data for 6 inflammatory macrophage sample(s). 83 run(s), 6 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002430 
   
  
    
    ChIP-Seq data for 3 class switched memory B cell sample(s). 21 run(s), 21 experiment(s), 21 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002431 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC3324 sample(s). 2 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002432 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=4hrs sample(s). 18 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002433 
   
  
    
    RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002434 
   
  
    
    Bisulfite-Seq data for 2 hematopoietic multipotent progenitor cell sample(s). 16 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002435 
   
  
    
    ChIP-Seq data for 4 neutrophilic metamyelocyte sample(s). 32 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002436 
   
  
    
    RNA-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002437 
   
  
    
    Bisulfite-Seq data for 2 conventional dendritic cell sample(s). 30 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002438 
   
  
    
    RNA-Seq data for 3 CD38-negative naive B cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002439 
   
  
    
    ChIP-Seq data for 3 central memory CD4-positive, alpha-beta T cell sample(s). 11 run(s), 9 experiment(s), 9 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002440 
   
  
    
    Bisulfite-Seq data for 1 thymocyte sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002441 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=24hrs sample(s). 21 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002442 
   
  
    
    ChIP-Seq data for 4 germinal center B cell sample(s). 24 run(s), 22 experiment(s), 22 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002443 
   
  
    
    RNA-Seq data for 7 Acute Myeloid Leukemia - CTR sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002444 
   
  
    
    ChIP-Seq data for 9 CD4-positive, alpha-beta T cell sample(s). 68 run(s), 63 experiment(s), 63 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002445 
   
  
    
    ChIP-Seq data for 3 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 11 run(s), 11 experiment(s), 11 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002446 
   
  
    
    RNA-Seq data for 3 band form neutrophil sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002447 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_T=1hr sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002448 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_LPS_T=24hrs_RPMI_T=5days sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002449 
   
  
    
    ChIP-Seq data for 10 mature neutrophil sample(s). 105 run(s), 86 experiment(s), 86 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001002450 
   
  
    
    ChIP-Seq data for 3 central memory CD8-positive, alpha-beta T cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002451 
   
  
    
    Bisulfite-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 36 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002452 
   
  
    
    RNA-Seq data for 3 germinal center B cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002453 
   
  
    
    ChIP-Seq data for 2 monocyte - RPMI_T=1hr sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  2 
 
  
    EGAD00001002454 
   
  
    
    ChIP-Seq data for 4 band form neutrophil sample(s). 26 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001002455 
   
  
    
    ChIP-Seq data for 7 CD8-positive, alpha-beta T cell sample(s). 38 run(s), 38 experiment(s), 38 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002456 
   
  
    
    RNA-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002457 
   
  
    
    RNA-Seq data for 4 macrophage - T=6days untreated sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002458 
   
  
    
    ChIP-Seq data for 2 macrophage - T=6days untreated sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002459 
   
  
    
    DNase-Hypersensitivity data for 2 Chronic Lymphocytic Leukemia sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002460 
   
  
    
    Bisulfite-Seq data for 8 CD4-positive, alpha-beta T cell sample(s). 108 run(s), 8 experiment(s), 16 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002461 
   
  
    
    RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (24h) sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002462 
   
  
    
    ChIP-Seq data for 3 mature conventional dendritic cell - GM-CSF_IL4_T=6_days_R848_T=24hrs sample(s). 20 run(s), 19 experiment(s), 19 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002463 
   
  
    
    ChIP-Seq data for 6 cytotoxic CD56-dim natural killer cell sample(s). 34 run(s), 34 experiment(s), 34 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002464 
   
  
    
    Bisulfite-Seq data for 2 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002465 
   
  
    
    RNA-Seq data for 27 Acute Myeloid Leukemia sample(s). 27 run(s), 27 experiment(s), 27 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001002466 
   
  
    
    ChIP-Seq data for 15 naive B cell sample(s). 67 run(s), 59 experiment(s), 59 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001002467 
   
  
    
    RNA-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002468 
   
  
    
    DNase-Hypersensitivity data for 3 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002469 
   
  
    
    RNA-Seq data for 2 effector memory CD4-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002470 
   
  
    
    ChIP-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 23 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002471 
   
  
    
    RNA-Seq data for 3 mature conventional dendritic cell - GM-CSF_IL4_T=6_days_R848_T=24hrs sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002472 
   
  
    
    Bisulfite-Seq data for 1 Acute Promyelocytic Leukemia - ATRA sample(s). 9 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002473 
   
  
    
    RNA-Seq data for 2 mesenchymal stem cell of the bone marrow sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002474 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_T=24hrs sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001002475 
   
  
    
    Bisulfite-Seq data for 1 monocyte - Attached_T=1hr sample(s). 23 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002476 
   
  
    
    RNA-Seq data for 3 class switched memory B cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002477 
   
  
    
    ChIP-Seq data for 2 mature eosinophil sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002478 
   
  
    
    RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002479 
   
  
    
    RNA-Seq data for 2 osteoclast sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002480 
   
  
    
    DNase-Hypersensitivity data for 1 CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002481 
   
  
    
    DNase-Hypersensitivity data for 2 alternatively activated macrophage sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002482 
   
  
    
    RNA-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002483 
   
  
    
    Bisulfite-Seq data for 2 CD4-positive, alpha-beta thymocyte sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002484 
   
  
    
    ChIP-Seq data for 10 CD14-positive, CD16-negative classical monocyte sample(s). 80 run(s), 76 experiment(s), 76 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001002485 
   
  
    
    ChIP-Seq data for 3 immature conventional dendritic cell - GM-CSF_IL4_T=6_days sample(s). 20 run(s), 20 experiment(s), 20 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002486 
   
  
    
    Bisulfite-Seq data for 2 central memory CD8-positive, alpha-beta T cell sample(s). 36 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002487 
   
  
    
    ChIP-Seq data for 2 adult endothelial progenitor cell sample(s). 16 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002488 
   
  
    
    ChIP-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 13 run(s), 13 experiment(s), 13 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002489 
   
  
    
    RNA-Seq data for 5 common lymphoid progenitor sample(s). 20 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002490 
   
  
    
    ChIP-Seq data for 3 Acute Promyelocytic Leukemia - CTR sample(s). 22 run(s), 21 experiment(s), 21 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002491 
   
  
    
    ChIP-Seq data for 2 monocyte - T=0days sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002492 
   
  
    
    Bisulfite-Seq data for 2 regulatory T cell sample(s). 41 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002493 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MS-275 (20h) sample(s). 5 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002494 
   
  
    
    ChIP-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002495 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_T=6days_LPS_T=4hrs sample(s). 8 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002496 
   
  
    
    Bisulfite-Seq data for 4 CD8-positive, alpha-beta T cell sample(s). 57 run(s), 5 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002497 
   
  
    
    Bisulfite-Seq data for 3 neutrophilic metamyelocyte sample(s). 32 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002498 
   
  
    
    ChIP-Seq data for 2 macrophage - T=6days LPS sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002499 
   
  
    
    DNase-Hypersensitivity data for 2 Acute Lymphocytic Leukemia - CTR sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002500 
   
  
    
    RNA-Seq data for 1 Acute Promyelocytic Leukemia - MS-275 (20h) sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002501 
   
  
    
    Bisulfite-Seq data for 6 macrophage sample(s). 88 run(s), 7 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002502 
   
  
    
    Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=1hr sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002503 
   
  
    
    ChIP-Seq data for 2 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002504 
   
  
    
    ChIP-Seq data for 9 macrophage sample(s). 55 run(s), 55 experiment(s), 55 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002505 
   
  
    
    Bisulfite-Seq data for 5 Mantle Cell Lymphoma sample(s). 65 run(s), 5 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002506 
   
  
    
    ChIP-Seq data for 3 Germinal Center B-Cell-Like Diffuse Large B-Cell Lymphoma sample(s). 10 run(s), 10 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002507 
   
  
    
    RNA-Seq data for 6 macrophage sample(s). 7 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002508 
   
  
    
    Bisulfite-Seq data for 9 mature neutrophil sample(s). 116 run(s), 9 experiment(s), 18 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002509 
   
  
    
    RNA-Seq data for 1 colony forming unit erythroid sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002510 
   
  
    
    ChIP-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002511 
   
  
    
    Bisulfite-Seq data for 3 germinal center B cell sample(s). 37 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002512 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_BG_T=24hrs_RPMI_T=5days_LPS_T=4hrs sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002513 
   
  
    
    RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (4h) sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002514 
   
  
    
    ChIP-Seq data for 2 effector memory CD4-positive, alpha-beta T cell sample(s). 8 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002515 
   
  
    
    ChIP-Seq data for 9 inflammatory macrophage sample(s). 58 run(s), 58 experiment(s), 58 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002516 
   
  
    
    ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2494 sample(s). 2 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002517 
   
  
    
    ChIP-Seq data for 3 monocyte - RPMI_BG_T=24hrs_RPMI_T=5days sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002518 
   
  
    
    RNA-Seq data for 7 Chronic Lymphocytic Leukemia sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001002519 
   
  
    
    Bisulfite-Seq data for 2 mesenchymal stem cell of the bone marrow sample(s). 39 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002520 
   
  
    
    Bisulfite-Seq data for 4 CD38-negative naive B cell sample(s). 51 run(s), 5 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002521 
   
  
    
    Bisulfite-Seq data for 5 Multiple Myeloma sample(s). 63 run(s), 7 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002522 
   
  
    
    RNA-Seq data for 4 macrophage - T=6days B-glucan sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002523 
   
  
    
    Bisulfite-Seq data for 6 CD14-positive, CD16-negative classical monocyte sample(s). 86 run(s), 6 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002524 
   
  
    
    ChIP-Seq data for 9 CD38-negative naive B cell sample(s). 48 run(s), 44 experiment(s), 44 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002525 
   
  
    
    RNA-Seq data for 1 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002526 
   
  
    
    RNA-Seq data for 3 immature conventional dendritic cell - GM-CSF_IL4_T=6_days sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002527 
   
  
    
    DEEP (German Epigenome Project) sequence data of the following samples (Sequencing Types: ChIP-Seq, WGBS-Seq, RNA-Seq, sncRNA-Seq, NOMe-Seq, DNase-Seq):
41_Hf01_LiHe_Ct, 41_Hf02_LiHe_Ct, 41_Hf03_LiHe_Ct, 01_HepG2_LiHG_Ct1, 01_HepG2_LiHG_Ct2, 01_HepaRG_LiHR_D31, 01_HepaRG_LiHR_D32, 01_HepaRG_LiHR_D33, 43_Hm01_BlMo_Ct, 43_Hm03_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm03_BlMa_Ct, 43_Hm05_BlMa_Ct, 43_Hm03_BlMa_TO, 43_Hm05_BlMa_TO, 43_Hm03_BlMa_TE, 43_Hm05_BlMa_TE, 51_Hf01_BlCM_Ct, 51_Hf03_BlCM_Ct, 51_Hf04_BlCM_Ct, 51_Hf02_BlCM_Ct, 51_Hf05_BlCM_Ct, 51_Hf06_BlCM_Ct, 51_Hf06_BlCM_T1, 51_Hf06_BlCM_T2, 51_Hf03_BlEM_Ct, 51_Hf04_BlEM_Ct, 51_Hf02_BlEM_Ct, 51_Hf05_BlEM_Ct, 51_Hf06_BlEM_Ct, 51_Hf06_BlEM_T1, 51_Hf06_BlEM_T2, 51_Hf03_BlTN_Ct, 51_Hf04_BlTN_Ct, 51_Hf02_BlTN_Ct, 51_Hf05_BlTN_Ct, 51_Hf06_BlTN_Ct, 51_Hf06_BlTN_T1, 51_Hf06_BlTN_T2, 51_Hf07_BmTM4_Ct, 51_Hf08_BlTM4_Ct, 51_Hf08_BmTM4_SP1, 51_Hf08_BmTM4_SP2, 51_Hf05_BlTA_Ct, 44_Mm01_WEAd_C2, 44_Mm03_WEAd_C2, 44_Mm02_WEAd_C2, 44_Mm07_WEAd_C2, 44_Mm04_WEAd_C1, 44_Mm05_WEAd_C1 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  46 
 
  
    EGAD00001002528 
   
  
    
    WGS from EGAS00001001857 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001002530 
   
  
    
    Additional files for "The Genomic Landscape of Core-Binding Factor Acute Myeloid Leukemias" (EGAS00001000349). This dataset includes the processed RNASeq data referenced in this paper. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  36 
 
  
    EGAD00001002531 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002532 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002533 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002534 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002535 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002536 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002537 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002538 
   
  
    
    Genome and transcriptome sequence data from a uterine sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002539 
   
  
    
    Genome and transcriptome sequence data from an oligodendroglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002540 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002541 
   
  
    
    Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002542 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002543 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002544 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002545 
   
  
    
    Genome and transcriptome sequence data from a duodenal malignancy patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002546 
   
  
    
    Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002547 
   
  
    
    Exome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  4 
 
  
    EGAD00001002548 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002549 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002550 
   
  
    
    Genome and transcriptome sequence data from a primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002551 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002552 
   
  
    
    Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002553 
   
  
    
    Genome and transcriptome sequence data from an unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002554 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002555 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002556 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002557 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002558 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002559 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002560 
   
  
    
    Genome and transcriptome sequence data from a cervical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002561 
   
  
    
    Genome and transcriptome sequence data from a metastatic cervical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002562 
   
  
    
    Genome and transcriptome sequence data from an osteogenic sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002563 
   
  
    
    Genome and transcriptome sequence data from a follicular lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002564 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002565 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  5 
 
  
    EGAD00001002566 
   
  
    
    Genome and transcriptome sequence data from a uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002567 
   
  
    
    Genome and transcriptome sequence data from a rectosigmoid adenocarcinoma (colorectal cancer) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002568 
   
  
    
    Genome and transcriptome sequence data from a metastatic endometrial cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002569 
   
  
    
    Genome and transcriptome sequence data from a primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002570 
   
  
    
    Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002571 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002572 
   
  
    
    Genome and transcriptome sequence data from an infiltrating ductal carcinoma of right breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002573 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002574 
   
  
    
    Genome and transcriptome sequence data from a ductal carcinoma of left breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002575 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002576 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002577 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002578 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002579 
   
  
    
    Genome and transcriptome sequence data from a carcinoma of left lower outer quadrant patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002580 
   
  
    
    Genome and transcriptome sequence data from a right breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002581 
   
  
    
    Genome and transcriptome sequence data from a metastatic myxofibrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002582 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of anal canal patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002583 
   
  
    
    Genome and transcriptome sequence data from a retroperitoneal leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002584 
   
  
    
    Genome and transcriptome sequence data from a vulvar metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002585 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002586 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of vulva patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002587 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002588 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002589 
   
  
    
    Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002590 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of vulva patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002591 
   
  
    
    Genome and transcriptome sequence data from a neuroendocrine tumor likely pancreatic origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      MinION 
      
    
   
  2 
 
  
    EGAD00001002592 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002593 
   
  
    
    Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002594 
   
  
    
    Genome and transcriptome sequence data from a peritoneal mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002595 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002596 
   
  
    
    Genome and transcriptome sequence data from a porocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002597 
   
  
    
    Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002598 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002599 
   
  
    
    Genome and transcriptome sequence data from a medullary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002600 
   
  
    
    Genome and transcriptome sequence data from an adnexal tumor probable of Wolffian origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002601 
   
  
    
    Genome and transcriptome sequence data from an invasive ductal carcinoma of left breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002602 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002603 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002604 
   
  
    
    Genome and transcriptome sequence data from a clear cell carcinoma of ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002605 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002606 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002607 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer (likely PNET) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002608 
   
  
    
    Genome and transcriptome sequence data from a pleomorphic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002609 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002610 
   
  
    
    Genome and transcriptome sequence data from an invasive carcinoma of left breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002611 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002612 
   
  
    
    Genome and transcriptome sequence data from an esophageal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002613 
   
  
    
    Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002614 
   
  
    
    Genome and transcriptome sequence data from a thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002615 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002616 
   
  
    
    Genome and transcriptome sequence data from a superficial pleomorphic liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002617 
   
  
    
    Genome and transcriptome sequence data from a small cell/neuroendocrine carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002618 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002619 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002620 
   
  
    
    Genome and transcriptome sequence data from a myxoid liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002621 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002622 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002623 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002624 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002625 
   
  
    
    Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002626 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002627 
   
  
    
    Genome and transcriptome sequence data from an adenoid cystic carcinoma of the trachea patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002628 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002629 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002630 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002631 
   
  
    
    Genome and transcriptome sequence data from a serous endometrial cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002632 
   
  
    
    Genome and transcriptome sequence data from a testicular cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002633 
   
  
    
    Genome and transcriptome sequence data from an endometrial carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002634 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002635 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002636 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002637 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002638 
   
  
    
    Genome and transcriptome sequence data from a metastatic prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002639 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002640 
   
  
    
    Genome and transcriptome sequence data from a clival chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002641 
   
  
    
    Genome and transcriptome sequence data from a metastatic small cell carcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002642 
   
  
    
    Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002643 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002644 
   
  
    
    Genome and transcriptome sequence data from a multifocal hepatocellular carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002645 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002646 
   
  
    
    Genome and transcriptome sequence data from an epithelioid mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002647 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002648 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002649 
   
  
    
    Variants called from RNA-seq data of meningioma tumors. 
    
   
  
    
   
  25 
 
  
    EGAD00001002650 
   
  
    
    Somatic variants called from whole-exome sequencing of meningioma-blood pairs 
    
   
  
    
   
  87 
 
  
    EGAD00001002651 
   
  
    
    Presurgical studies allow study of the relationship between mutations and response of estrogen receptor positive (ER+) breast cancer to aromatase inhibitors (AIs) but have been limited to small biopsies. Here in Phase I of this study, we perform exome sequencing on baseline, surgical core-cuts and blood from 60 patients (40 AI treated, 20 Controls). In poor responders (based on Ki67 change) we find significantly more somatic mutations than good responders. Subclones exclusive to baseline or surgical cores occur in approximately 30% of tumours. In Phase II we combine targeted sequencing on another 28 treated patients with Phase I. We find six genes frequently mutated: PIK3CA, TP53, CDH1, MLL3, ABCA13 and FLG with 71% concordance between paired cores. TP53 mutations are associated with poor response. We conclude that multiple biopsies are essential for confident mutational profiling of ER+ breast cancer and TP53 mutations are associated with resistance to oestrogen deprivation therapy. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  443 
 
  
    EGAD00001002652 
   
  
    
    50 ng of genomic double-stranded DNA was enzymatically sheared to an average size of 200 bp. Further processing was performed using the Illumina Nextera Rapid Capture Custom Kit (Illumina), and 100 bp paired-end sequencing was performed with 24 samples per lane on an Illumina HiSeq 2000 (Illumina) to reach a coverage of 100-1000x. 
    
   
  
    
   
  284 
 
  
    EGAD00001002653 
   
  
    
    Genomic DNA from leukemic and remission bone marrow mononuclear cells was isolated with the QIAamp DNA Blood Extraction Kit (Qiagen, Venlo, The Netherlands). Libraries were prepared with the Illumina TruSeq DNA Sample Prep and TruSeq Exome Enrichment Kits (Illumina, San Diego, CA, USA) according to the manufacturer's recommendations. 100 bp paired-end sequencing was performed on a HiSeq 2000 (Illumina) to about 80x coverage. 
    
   
  
    
   
  57 
 
  
    EGAD00001002654 
   
  
    
    This dataset contains RNA-seq, ATAC-seq, and ChIP-seq samples from the SJERG cohort. We applied ChIP-seq for Dux4 on two B-cell ALL cell lines (REH, Nalm6) along with INPUT, and ATAC-seq on two B-cell ALL cell lines (REH, Nalm6) and a xenograft of a B-cell ALL patient (ERG000016). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001002655 
   
  
    
    BLUEPRINT ChIP-Seq from two mantle cell lymphoma patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002656 
   
  
    
    Whole exome sequencing BAM files and whole genome sequencing CRAM files for 722 individuals from the NIHR-BioResource Rare Diseases Consortium (SPEED project) with inherited retinal disease. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  707 
 
  
    EGAD00001002657 
   
  
    
    Reverse Capture Hi-C 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002658 
   
  
    
    Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 45 low-risk myelodysplastic syndrome (LRMDS) cases. Gene expression profiles (GEPs) of the 45 LRMDS have been compared to GEPs derived from likewise highly purified mesenchymal cells obtained from bone marrow specimens of healthy donors for the identification of inflammatory signatures. Additionally, an overlap in inflammatory signatures has been determined by comparing the GEPs of these 45 LRMDS cases to the GEPs of 4 Shwachman-Diamond syndrome and 3 Diamond-Blackfan anemia cases, both representing different subclasses of congenital pre-leukemia syndromes with a tendency of leukemic progression and perturbed niche compartment. Finally, the GEPs and gene expression signatures have been utilized for prognostication and the prediction of leukemic progression. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD00001002659 
   
  
    
    Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 10 healthy donors (HDs). This data set is used as a baseline control to observe the differences between gene expression profiles (GEPs) of pre-leukemia cases (45 low-risk myelodysplastic syndrome, 4 Shwachman-Diamond and 3 Diamond-Blackfan anemia patients) and gene expression patterns observed in a normal, healthy context. Through differential expression and gene set enrichment analysis, we determined that inflammatory signaling pathways are significantly more active in mesenchymal cells of pre-leukemia cases compared to their healthy counterparts. Finally, we determined through statistical modelling of healthy donors' GEPs which pre-leukemia cases have significantly more active inflammatory signaling and demonstrated a strong relation to survival statistics. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001002660 
   
  
    
    Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 4 Shwachman-Diamond syndrome (SDS) cases. This data set, comprising 4 SDS cases, is used as a complement to 45 low-risk myelodysplastic syndrome (LRMDS) and 3 Diamond-Blackfan anemia (DBA) cases to demonstrate aberrant inflammatory signaling as a common mechanism in pre-leukemia syndromes to induce genotoxic stress in hematopoietic stem cells. In addition, this data set is used to determine different overlapping gene expression signatures in pre-leukemia syndromes compared to gene expression profiles of highly purified mesenchymal cells of healthy donors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001002661 
   
  
    
    Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 3 Diamond-Blackfan anemia (DBA) cases. This data set, comprising 3 DBA cases, is used as a complement to 45 low-risk myelodysplastic syndrome (LRMDS) and 4 Shwachman-Diamond syndrome (SDS) cases to demonstrate aberrant inflammatory signaling as a common mechanism in pre-leukemia syndromes to induce genotoxic stress in hematopoietic stem cells. In addition, this data set is used to determine different overlapping gene expression signatures in pre-leukemia syndromes compared to gene expression profiles of highly purified mesenchymal cells of healthy donors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001002662 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LINC-JP. 
    
   
  
    
   
  62 
 
  
    EGAD00001002663 
   
  
    
    BLUEPRINT: A human variation panel of genetic influences on epigenomes and transcriptomes in three immune cells (WGS) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  197 
 
  
    EGAD00001002664 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: CMDI-UK. 
    
   
  
    
   
  98 
 
  
    EGAD00001002665 
   
  
    
    Mapped sequence reads in BAM format for 64 individuals reporting Kanak ancestry recruited in New Caledonia sequenced at four times target coverage using the Illumina HiSeq 4000 platform. 
    
   
  
    
   
  64 
 
  
    EGAD00001002666 
   
  
    
    Genomic DNA from leukemic and remission bone marrow mononuclear cells was isolated with the QIAamp DNA Blood Extraction Kit (Qiagen, Venlo, The Netherlands). Libraries were prepared using the Nextera Rapid Capture Exome Kit (Illumina, San Diego, USA). Paired-end sequencing of 100 bp reads was performed on a HiSeq 2000 (Illumina) to obtain at least 50x coverage. 
    
   
  
    
   
  105 
 
  
    EGAD00001002667 
   
  
    
    Additional files for "The Genomic Landscape of Core-Binding Factor Acute Myeloid Leukemias" (EGAS00001000349). This dataset includes the processed Excap data referenced in this paper. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  327 
 
  
    EGAD00001002668 
   
  
    
    Metagenomic shotgun sequencing of Irritable bowel syndrome patients and matched controls 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  336 
 
  
    EGAD00001002669 
   
  
    
    Part of WGS data for Prostate (ICGC) 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001002670 
   
  
    
    ChIP-Seq data for 182 mature neutrophil sample(s). 2847 run(s), 366 experiment(s), 355 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  186 
 
  
    EGAD00001002671 
   
  
    
    RNA-Seq data for 212 CD4-positive, alpha-beta T cell sample(s). 212 run(s), 212 experiment(s), 212 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_rnaseq_analysis_sanger_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  212 
 
  
    EGAD00001002672 
   
  
    
    ChIP-Seq data for 172 CD14-positive, CD16-negative classical monocyte sample(s). 572 run(s), 345 experiment(s), 340 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  174 
 
  
    EGAD00001002673 
   
  
    
    ChIP-Seq data for 154 CD4-positive, alpha-beta T cell sample(s). 355 run(s), 265 experiment(s), 250 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  158 
 
  
    EGAD00001002674 
   
  
    
    RNA-Seq data for 197 CD14-positive, CD16-negative classical monocyte sample(s). 197 run(s), 197 experiment(s), 197 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_rnaseq_analysis_sanger_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  197 
 
  
    EGAD00001002675 
   
  
    
    RNA-Seq data for 205 mature neutrophil sample(s). 205 run(s), 205 experiment(s), 205 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_rnaseq_analysis_sanger_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  205 
 
  
    EGAD00001002676 
   
  
    
    DATA FILES FOR PCGP SJERG (WGS) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001002677 
   
  
    
    DATA FILES FOR PCGP SJERG (WXS) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001002678 
   
  
    
    The data set consists of low-pass whole genome sequence data of single CTCs, pools of CTCs and germline controls for a cohort of 31 SCLC patients at baseline, and for 5 patients at relapse. In addition, 9 CDX models and associated germline controls (where available) are included. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  319 
 
  
    EGAD00001002679 
   
  
    
    This dataset contains WES files for the SJACT cohort associated with the paper "Genetic landscape of pediatric Adrenocortical Tumor". In this paper, we analyse 37 adrenocortical tumours (ACTs) by whole-genome, whole-exome and/or transcriptome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001002680 
   
  
    
    This dataset contains RNA-Seq files for the SJACT cohort associated with the paper "Genetic landscape of pediatric Adrenocortical Tumor". In this paper, we analyse 37 adrenocortical tumours (ACTs) by whole-genome, whole-exome and/or transcriptome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  26 
 
  
    EGAD00001002681 
   
  
    
    RNA-seq, ChIP-seq, and ATAC-seq files for PCGP SJERG paper titled "Deregulation of DUX4 and ERG in acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  53 
 
  
    EGAD00001002682 
   
  
    
    BLUEPRINT DNA methylation profiles of monocytes, T cells and B cells in type 1 diabetes-discordant monozygotic twins (Bisulfite-Seq data). 
    
   
  
    
   
  8 
 
  
    EGAD00001002684 
   
  
    
    Whole genome sequencing of 98 tumour-normal pairs for the PAEN-AU pancreatic neuroendocrine cancer project. 
    
   
  
    
   
  196 
 
  
    EGAD00001002685 
   
  
    
    Breast cancer PDTX sequencing data from Bruna et al, Cell 2016
- Exome Sequencing
- Shallow Whole Genome Sequencing
- RRBS Methylation Sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  393 
 
  
    EGAD00001002686 
   
  
    
    CD4 T-Cell ChIP-Seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001002687 
   
  
    
    CD4 T-Cell RNA-Seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002689 
   
  
    
    ICGC Oesophageal Adenocarcinoma tissue samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001002690 
   
  
    
    Exome sequencing for 10 patients: 10 tumors, 10 cell lines and 7 blood samples (for 3 patients blood was not available) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001002691 
   
  
    
    RNAseq data for 10 patients: 10 tumors and 10 cell lines 
    
   
  
    
      
      NextSeq 500 
      
    
   
  20 
 
  
    EGAD00001002692 
   
  
    
    DATA FILES FOR MULLIGHAN MEF2D RNASEQ STRANDED 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  200 
 
  
    EGAD00001002693 
   
  
    
    Innate immune memory is the phenomenon whereby innate immune cells such as monocytes or macrophages undergo functional reprogramming after exposure to microbial components such as LPS. We apply an integrated epigenomic approach to characterize the molecular events involved in LPS-induced tolerance in a time-dependent manner. ChIP-seq, RNA-seq, WGBS and ATAC-seq data were generated. This analysis identified epigenetic programs in tolerance and trained macrophages, and the potential transcription factors involved.
Experimental set-up:
Time-course in vitro culture of human monocytes. Two innate immune memory states can be induced in culture through an initial exposure of primary human monocytes to either LPS or BG for 24 hours, followed by removal of the stimulus and differentiation to macrophages for an additional 5 days. Cells were collected at baseline (day 0), 1 hour, 4 hours, 24 hours and 6 days. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
      unspecified 
      
    
   
  71 
 
  
    EGAD00001002695 
   
  
    
    48 samples from the TRACK-HD cohort. All samples carry the Huntington’s disease expansion. The subjects were selected on the basis of rate of disease progression. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001002696 
   
  
    
    Recurrent breast cancer is almost universally fatal. We characterize locally relapsed or distant metastatic cancers from 170 patients using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse, and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001002697 
   
  
    
    Recurrent breast cancer is almost universally fatal. We characterize locally relapsed or distant metastatic cancers from 170 patients using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse, and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001002698 
   
  
    
    Recurrent breast cancer is almost universally fatal. We characterize locally relapsed or distant metastatic cancers from 170 patients using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse, and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  387 
 
  
    EGAD00001002699 
   
  
    
    This data set includes RNAseq data from 136 samples from the TRACK-HD cohort including premanifest, manifest and control subjects. 
Data can only be used for Huntington's disease related research. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  136 
 
  
    EGAD00001002704 
   
  
    
    DATA FILES FOR MULLIGHAN MEF2D RNASEQ UNSTRANDED 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  217 
 
  
    EGAD00001002705 
   
  
    
    McGill EMC Release 6 data 
    
   
  
    
      
      unspecified 
      
    
   
  59 
 
  
    EGAD00001002707 
   
  
    
    Whole exome sequencing of a normal sample, primary tumor sample, and relapse tumor sample of a transformed non-Hodgkin follicular lymphoma patient with extraordinary response to treatment. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001002708 
   
  
    
    ATAC-seq data for 7 sample(s) from tonsil, on Genome GRCh38. 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001002709 
   
  
    
    ATAC-seq data for 136 sample(s) from venous blood, on Genome GRCh38. 141 run(s), 139 experiment(s), 139 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  136 
 
  
    EGAD00001002710 
   
  
    
    ATAC-seq data for 4 sample(s) from bone marrow, on Genome GRCh38. 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001002711 
   
  
    
    ChIP-Seq_H3K4me3 data for 133 mature neutrophil sample(s). 208 run(s), 136 experiment(s), 136 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  133 
 
  
    EGAD00001002712 
   
  
    
    ChIP-Seq_H3K27me3 data for 131 mature neutrophil sample(s). 321 run(s), 134 experiment(s), 134 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  131 
 
  
    EGAD00001002713 
   
  
    
    DNase accessibility data for BLUEPRINT consortium immune cells included in eFORGE software tool 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001002714 
   
  
    
    We recruited 100 healthy, male donors of self-reported European descent (EUB) and 100 of self-reported African descent (AFB) (Ghent, Belgium). For each participant, peripheral blood mononuclear cells (PBMCs) were isolated from whole blood on Ficoll-Paque density gradients. Monocytes were then positively selected with magnetic CD14 microbeads and exposed for 6 hours to different ligands activating TLR4 (LPS), TLR1/2 (Pam3CSK4), TLR7/8 (R848) and to a human seasonal influenza A virus (IAV). High-quality RNA was obtained from unstimulated and stimulated monocytes for 970 of the 1000 samples (200 x 5 conditions), and was sequenced on an Illumina HiSeq 2000. On average, 34 million 101-bp single-end reads were obtained per sample. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  970 
 
  
    EGAD00001002715 
   
  
    
    Exome sequencing of isolate populations and Generation Scotland 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1027 
 
  
    EGAD00001002716 
   
  
    
    In this study we characterized genomic alterations in two to five metachronous bladder tumors from 29 patients initially diagnosed with early-stage disease. Fourteen patients (32 tumors) had non-progressive disease (NPD) and 15 patients (34 tumors) had progressive disease (PD). Whole exome sequencing (WES, ~50x mean read depth) and whole transcriptome RNA-seq were performed (RNA was not available for 4 tumors).
Data provided here consist of 122 BAM files for WES (83 tumors and 39 blood samples). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  122 
 
  
    EGAD00001002717 
   
  
    
    In this study we characterized genomic alterations in two to five metachronous bladder tumors from 29 patients initially diagnosed with early-stage disease. Fourteen patients (32 tumors) had non-progressive disease (NPD) and 15 patients (34 tumors) had progressive disease (PD). Whole exome sequencing (WES, ~50x mean read depth) and whole transcriptome RNA-seq were performed (RNA was not available for 4 tumors).
Data provided here consist of 71 unmapped BAM files from whole transcriptome RNA-seq. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  71 
 
  
    EGAD00001002718 
   
  
    
    In this study we characterized genomic alterations in two to five metachronous bladder tumors from 29 patients initially diagnosed with early-stage disease. Fourteen patients (32 tumors) had non-progressive disease (NPD) and 15 patients (34 tumors) had progressive disease (PD). Whole exome sequencing (WES, ~50x mean read depth) and whole transcriptome RNA-seq were performed (RNA was not available for 4 tumors).
Data provided here consist of 71 mapped BAM files from whole transcriptome RNA-seq. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  71 
 
  
    EGAD00001002719 
   
  
    
    This dataset contains whole-genome sequencing data files from colon organoid cultures, which were mutated using CRISPR-Cas9 for specific genes (APC, KRAS, TP53 and SMAD4) to generate in vitro transformed cancer cells. After introducing each mutation, the resulting cultures were subjected to whole-genome sequencing. In addition, some cultures were xenotransplanted in recipient mice. The resulting primary tumors and corresponding metastases were subjected to whole-genome sequencing. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  30 
 
  
    EGAD00001002721 
   
  
    
    Whole genome sequencing of 300 individuals from 142 diverse populations 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001002722 
   
  
    
    Exome sequencing for 26 patients with matched blood
RNA-seq for 41 patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  93 
 
  
    EGAD00001002724 
   
  
    
    September 2016 data update (bam/fastq/vcf) for reference epigenomes generated at the Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada, as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001002725 
   
  
    
    Autism spectrum disorder (ASD) is a collection of neuro-developmental disorders characterized by deficits in social interaction and social communication, along with restricted and repetitive behaviour patterns. We globally interrogated the histone acetylomes of enhancers in a large cohort of ASD and control samples by analyzing postmortem tissue from three brain regions: prefrontal cortex (PFC), temporal cortex (TC) and cerebellum (CB). H3K27ac was selected as the representative acetylation mark and 288 ChIP-seq experiments were performed on these postmortem samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  291 
 
  
    EGAD00001002726 
   
  
    
    Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  60 
 
  
    EGAD00001002727 
   
  
    
    1,591 single cells from 11 colorectal cancer patients were profiled using a Fluidigm-based single cell RNA-seq protocol to characterize the cellular heterogeneity of colorectal cancer. 630 single cells from 7 cell lines were profiled similarly to benchmark de novo cell type identification algorithms. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  2221 
 
  
    EGAD00001002728 
   
  
    
    This dataset contains exome sequencing of bone marrow samples taken at multiple timepoints of disease progression from 13 AML patients. These samples were taken either before/after treatment, at diagnosis or at relapse. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 1500 
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001002729 
   
  
    
    Haplotype Reference Consortium Release 1.1 - subset for release via the EGA 
    
   
  
    
   
  11227 
 
  
    EGAD00001002730 
   
  
    
    SPEED - childhood dystonia KMT2B dataset 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002731 
   
  
    
    Whole exome sequencing of tumor- as well as PBMC-derived DNA of five melanoma patients for identification of naturally presented patient-specific neoepitopes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001002732 
   
  
    
    DNA methylation was analyzed for stem/progenitor cell types and terminally differentiated cell types of the human blood lineage (HSC, MPP, CMP, MEP, GMP, CLP, MLP0, MLP1, MLP2, MLP3, MK, CD4+ Tcell, CD8+ Tcell, Bcell, NK, Neut, Mono). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  63 
 
  
    EGAD00001002733 
   
  
    
    Gene expression was analyzed for stem/progenitor cell types and terminally differentiated cell types of the human blood lineage (HSC, MPP, CMP, GMP, CLP, MLP0, MLP1, MLP2, MLP3). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  13 
 
  
    EGAD00001002734 
   
  
    
    Whole Genome Sequencing data set for the study "Premalignant SOX2 in ovarian cancer patients" 
    
   
  
    
      
      Complete Genomics 
      
    
   
  39 
 
  
    EGAD00001002735 
   
  
    
    mRNA, total RNA, small noncoding RNA, NOMe-Seq and DNase-Seq data from the following samples (not every sequencing type is available for every sample):
01_HepG2_LiHG_Ct1
41_Hf01_LiHe_Ct
41_Hf02_LiHe_Ct
41_Hf03_LiHe_Ct
51_Hf03_BlCM_Ct
51_Hf04_BlCM_Ct
51_Hf03_BlEM_Ct
51_Hf04_BlEM_Ct
51_Hf03_BlTN_Ct
51_Hf04_BlTN_Ct
Metadata available at deep.dkfz.de 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001002736 
   
  
    
    WES of human:
A mutation in VPS15 (PIK3R4) causes a ciliopathy and affects IFT20 release from the cis-Golgi
WES (Agilent SureSelect All Exon XT2 50 Mb kit) was performed on three affected siblings (II.1, II.3, II.5) and one healthy sister (II.4).
Raw data (BAM files) are provided:
- II.1.aligned.sorted.dedup.realign.recal.bam
- II.3.aligned.sorted.dedup.realign.recal.bam
- II.5.aligned.sorted.dedup.realign.recal.bam
- II.4.aligned.sorted.dedup.realign.recal.bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001002738 
   
  
    
    Background: In follicular lymphoma (FL), studies addressing the prognostic value of microenvironment-related immunohistochemical (IHC) markers and tumor cell-related genetic markers have yielded conflicting results, precluding implementation in practice. Therefore, the Lunenburg Lymphoma Biomarker Consortium (LLBC) performed a validation study for published markers. Methods: To maximize sensitivity, an end-of-spectrum design was applied for 122 uniformly immunochemotherapy-treated FL patients retrieved from international trials and registries; early failure (EF): progression or lymphoma-related death <2 years versus long remission: response duration of >5 years. IHC staining for T-cells and macrophages was performed on tissue microarrays from initial biopsy and scored with a validated computer-assisted protocol. Shallow whole-genome and deep targeted sequencing was performed on the same samples.  Results: 96/122 cases with complete molecular and immunohistochemical data were included in the analysis. EZH2 wild-type (p=0.006), gain of chromosome 18 (p=0.002), low percentages of CD8+ cells (p=0.011) and CD163+ areas (p=0.038) were associated with EF. No significant differences in other markers were observed, thereby refuting previous claims on their prognostic significance.  Conclusion: Using an optimized study design, this LLBC study validates wild-type EZH2 status, gain of chromosome 18, low percentages of CD8+ cells and CD163+ area as predictors of EF to immunochemotherapy in FL. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  96 
 
  
    EGAD00001002739 
   
  
    
    Aligned sequence data from 14 Prostate cancer samples with BRCA2 mutations 
    
   
  
    
   
  49 
 
  
    EGAD00001002740 
   
  
    
    We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  164 
 
  
    EGAD00001002741 
   
  
    
    Additional Xenograft files for PCGP SJERG 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001002742 
   
  
    
    Whole-genome sequencing data from Chad and Lebanon. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001002743 
   
  
    
    These samples comprise both melanoma cases and controls sequenced for a selection of loci linked to disease susceptibility. These BAM files are a subset of the sequencing data, restricted specifically to the GRCh37 coding regions of the BAP1 gene. 
    
   
  
    
   
  3186 
 
  
    EGAD00001002744 
   
  
    
    RNA sequencing data of human small intestinal macrophage subtypes 
    
   
  
    
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD00001002745 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001002746 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001002747 
   
  
    
    Whole-exome sequencing (WES) of 216 breast cancer metastasis-normal pairs from patients who underwent a biopsy in the context of the SAFIR01, SAFIR02, SHIVA or MOSCATO prospective trials (France). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  432 
 
  
    EGAD00001002748 
   
  
    
    DDD DATAFREEZE 2014-11-04: 4293 trios - exome sequence CRAM files 
    
   
  
    
   
  - 
 
  
    EGAD00001002749 
   
  
    
    A KNIH001 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002750 
   
  
    
    A KNIH002 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002751 
   
  
    
    A KNIH003 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002752 
   
  
    
    A KNIH004 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002753 
   
  
    
    A KNIH005 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002754 
   
  
    
    A KNIH006 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for beta cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002755 
   
  
    
    A KNIH007 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for adipocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002756 
   
  
    
    A KNIH008 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for adipocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002757 
   
  
    
    A KNIH009 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for preadipocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002758 
   
  
    
    A KNIH010 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for podocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002759 
   
  
    
    A KNIH011 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for podocytes 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002760 
   
  
    
    A KNIH001 miRNA-seq single end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002761 
   
  
    
    A KNIH002 miRNA-seq single end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002762 
   
  
    
    A KNIH003 miRNA-seq single end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002763 
   
  
    
    A KNIH004 miRNA-seq single end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002764 
   
  
    
    A KNIH005 miRNA-seq single end data for islet cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002765 
   
  
    
    A KNIH006 miRNA-seq single end data for beta cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002766 
   
  
    
    A KNIH007 miRNA-seq single end data for adipocytes 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002767 
   
  
    
    A KNIH008 miRNA-seq single end data for adipocytes 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002768 
   
  
    
    A KNIH009 miRNA-seq single end data for preadipocytes 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002769 
   
  
    
    A KNIH010 miRNA-seq single end data for podocytes 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002770 
   
  
    
    A KNIH011 miRNA-seq single end data for podocytes 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001002772 
   
  
    
    In this study we characterized genomic alterations in three bladder cancer patients with metastatic disease courses. Multiple regions were procured by laser microdissection or punctures from primary tumor, lymph node metastases and distant metastases. The data provided here consist of 35 BAM files for WES (32 tumor, 2 blood, and 1 adjacent normal). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  35 
 
  
    EGAD00001002883 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of patient-derived xenografts (PDX) derived from colorectal cancer samples, from a validation cohort of 60 PDX 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001002884 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample at early/late passages 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002885 
   
  
    
    Raw sequence data, fastq format 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  26 
 
  
    EGAD00001002886 
   
  
    
    Exome sequencing of North American Brain Expression Consortium (NABEC) subjects. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  298 
 
  
    EGAD00001002890 
   
  
    
    Exome sequencing of 102 French-Canadians 
    
   
  
    
   
  102 
 
  
    EGAD00001002891 
   
  
    
    Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002892 
   
  
    
    The data contains genome sequencing of clear cell renal cell carcinomas and normal kidney tissues. The samples were collected from patients from different European countries. 
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  21 
 
  
    EGAD00001002893 
   
  
    
    This dataset contains all RNA-seq runs for the BLN panel of cell lines and matched parental tumors. Tumor/cell line pairs have been authenticated using SNP profiles and all pairs were confirmed. Please note: The dataset also contains raw data from an early primary culture (BLN-1) where no stable cell line could be generated. Please also note different reference genomes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001002896 
   
  
    
    Amplicon sequencing libraries from the study "Histological Transformation and Progression in Follicular Lymphoma: a Clonal Evolution Study". These are Illumina amplicon deep sequencing libraries (n = 118) to validate somatic predictions made in the whole genome sequencing libraries. Specifically, there are 72 tumor libraries and 46 normal libraries. Some patients may have multiple amplicon libraries sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  118 
 
  
    EGAD00001002897 
   
  
    
    Whole genome sequencing libraries from the study "Histological Transformation and Progression in Follicular Lymphoma: a Clonal Evolution Study". These are libraries from 41 patients. Specifically: 15 transformed follicular lymphoma (TFL), 6 early progressers (PFL), and 20 non-early progressers (NPFL). For TFL and PFL patients, trios consisting of diagnostic (T1), transformed/progressed (T2) and a matching normal are available (n = 63 libraries in total). For NPFL patients, a tumor-normal pair is available (n = 40 libraries). 
    
   
  
    
   
  103 
 
  
    EGAD00001002898 
   
  
    
    Oligocapture sequencing libraries from the study "Histological Transformation and Progression in Follicular Lymphoma: a Clonal Evolution Study". These are sequencing libraries from the extension cohort of 277 patients. Specifically, there are 402 tumor libraries and 82 normal libraries. 
    
   
  
    
   
  484 
 
  
    EGAD00001002899 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002900 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_LPS_T=24hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002901 
   
  
    
    ATAC-seq data for 2 sample(s) for monocyte RPMI_LPS_T=24hrs_RPMI_T=5days from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002902 
   
  
    
    ATAC-seq data for 3 sample(s) for naive B cell from venous blood, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001002903 
   
  
    
    ATAC-seq data for 3 sample(s) for naive B cell from tonsil, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001002904 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=24hrs_RPMI_T=5days from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002905 
   
  
    
    ATAC-seq data for 3 sample(s) for unswitched memory B cell from venous blood, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001002906 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=1hr from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002907 
   
  
    
    ATAC-seq data for 2 sample(s) for osteoclast from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002908 
   
  
    
    ATAC-seq data for 2 sample(s) for class switched memory B cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002909 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=24hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002910 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002911 
   
  
    
    ATAC-seq data for 1 sample(s) for germinal center B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002912 
   
  
    
    ATAC-seq data for 2 sample(s) for plasma cell from tonsil, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002913 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_LPS_T=1hr from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002914 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_T=6days from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002915 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=24hrs_RPMI_T=5days_LPS_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002916 
   
  
    
    ATAC-seq data for 106 sample(s) Chronic Lymphocytic Leukemia from venous blood, on Genome GRCh38. 111 run(s), 109 experiment(s), 109 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  106 
 
  
    EGAD00001002917 
   
  
    
    ATAC-seq data for 2 sample(s) for germinal center B cell from tonsil, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002918 
   
  
    
    ATAC-seq data for 5 sample(s) Mantle Cell Lymphoma from venous blood, on Genome GRCh38. 5 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001002919 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_T=1hr from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002920 
   
  
    
    ATAC-seq data for 4 sample(s) Multiple Myeloma for plasma cell from bone marrow, on Genome GRCh38. 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001002921 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_T=24hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002922 
   
  
    
    ATAC-seq data for 1 sample(s) for monocyte RPMI_LPS_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002923 
   
  
    
    ChIPmentation data for 2 sample(s) for memory B cell from venous blood, on Genome GRCh38. 6 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002924 
   
  
    
    ChIPmentation data for 2 sample(s) for central memory CD8-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 11 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002925 
   
  
    
    ChIPmentation data for 1 sample(s) for immature conventional dendritic cell GM-CSF_IL4_T=6_days from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002926 
   
  
    
    ChIPmentation data for 1 sample(s) for effector memory CD4-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002927 
   
  
    
    ChIPmentation data for 1 sample(s) for central memory CD4-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 5 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002928 
   
  
    
    ChIPmentation data for 7 sample(s) Acute Lymphocytic Leukemia for precursor B cell from bone marrow, on Genome GRCh38. 13 run(s), 13 experiment(s), 13 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001002929 
   
  
    
    ChIPmentation data for 1 sample(s) for CD38-negative naive B cell from cord blood, on Genome GRCh38. 5 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002930 
   
  
    
    ChIPmentation data for 1 sample(s) Acute Lymphocytic Leukemia from bone marrow, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002931 
   
  
    
    ChIPmentation data for 3 sample(s) Lymphoma_Follicular from lymph node, on Genome GRCh38. 7 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001002932 
   
  
    
    ChIPmentation data for 1 sample(s) for germinal center B cell from tonsil, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002933 
   
  
    
    ChIPmentation data for 1 sample(s) for class switched memory B cell from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002934 
   
  
    
    ChIPmentation data for 1 sample(s) for cytotoxic CD56-dim natural killer cell from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002935 
   
  
    
    ChIPmentation data for 2 sample(s) Acute Myeloid Leukemia for blast cell from bone marrow, on Genome GRCh38. 12 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002936 
   
  
    
    ChIPmentation data for 5 sample(s) Acute Lymphocytic Leukemia for precursor B cell from venous blood, on Genome GRCh38. 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001002937 
   
  
    
    ChIPmentation data for 1 sample(s) for naive B cell from tonsil, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002938 
   
  
    
    ChIPmentation data for 2 sample(s) T-cell Acute Lymphocytic Leukemia from capillary blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002939 
   
  
    
    ChIPmentation data for 3 sample(s) Burkitt Lymphoma from lymph node, on Genome GRCh38. 10 run(s), 8 experiment(s), 8 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001002940 
   
  
    
    ChIPmentation data for 1 sample(s) for conventional dendritic cell from cord blood, on Genome GRCh38. 4 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002941 
   
  
    
    ChIPmentation data for 1 sample(s) for mature conventional dendritic cell GM-CSF_IL4_T=6_days_R848_T=24hrs from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002942 
   
  
    
    ChIPmentation data for 2 sample(s) for regulatory T cell from venous blood, on Genome GRCh38. 14 run(s), 9 experiment(s), 9 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002943 
   
  
    
    ChIPmentation data for 1 sample(s) for effector memory CD8-positive, alpha-beta T cell, terminally differentiated from venous blood, on Genome GRCh38. 5 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002944 
   
  
    
    ChIPmentation data for 2 sample(s) Activated B-Cell-Like Diffuse Large B-Cell Lymphoma from lymph node, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002945 
   
  
    
    ChIPmentation data for 1 sample(s) for effector memory CD8-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001002946 
   
  
    
    ChIPmentation data for 2 sample(s) Germinal Center B-Cell-Like Diffuse Large B-Cell Lymphoma from lymph node, on Genome GRCh38. 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT (September 2016). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001002947 
   
  
    
    ChIP-Seq data for 5 sample(s) for thymocyte from thymus, on Genome GRCh38. 17 run(s), 17 experiment(s), 17 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001002948 
   
  
    
    ChIP-Seq data for 1 sample(s) for conventional dendritic cell from cord blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002949 
   
  
    
    ChIP-Seq data for 1 sample(s) Acute Lymphocytic Leukemia for precursor B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002950 
   
  
    
    ChIP-Seq data for 1 sample(s) for memory B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002951 
   
  
    
    ChIP-Seq data for 1 sample(s) for class switched memory B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002952 
   
  
    
    ChIP-Seq data for 4 sample(s) T-cell Acute Lymphocytic Leukemia from capillary blood, on Genome GRCh38. 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001002953 
   
  
    
    RNA-Seq data for 8 sample(s) Acute Lymphocytic Leukemia for precursor B cell from bone marrow, on Genome GRCh38. 8 run(s), 8 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001002954 
   
  
    
    RNA-Seq data for 1 sample(s) Acute Lymphocytic Leukemia from bone marrow, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002955 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=0day from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002956 
   
  
    
    RNA-Seq data for 2 sample(s) T-cell lymphoma for helper T cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002957 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=2day_RANK_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002958 
   
  
    
    RNA-Seq data for 1 sample(s) Acute Myeloid Leukemia from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002959 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=1day_M-CSF_S100A9_4hr_RANL from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002960 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=6day_S100A9_RANKL_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002961 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=10day_S100A9_RANKL_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002962 
   
  
    
    RNA-Seq data for 1 sample(s) Acute Myeloid Leukemia for blast cell from bone marrow, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002963 
   
  
    
    RNA-Seq data for 6 sample(s) Acute Lymphocytic Leukemia for precursor B cell from venous blood, on Genome GRCh38. 6 run(s), 6 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001002964 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=1day_4hr_RANK from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002965 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=2day_S100A9_RANKL_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002966 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=10day_RANK_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002967 
   
  
    
    RNA-Seq data for 1 sample(s) for monocyte T=6day_RANK_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002968 
   
  
    
    RNA-Seq data for 2 sample(s) Acute Myeloid Leukemia for myeloid cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001002969 
   
  
    
    Bisulfite-Seq data for 1 sample(s) Acute Lymphocytic Leukemia for precursor B cell from bone marrow, on Genome GRCh38. 3 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001002972 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002973 
   
  
    
    Genome and transcriptome sequence data from a rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002974 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002975 
   
  
    
    Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002976 
   
  
    
    Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002977 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002978 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002979 
   
  
    
    Genome and transcriptome sequence data from a GI primary (prev breast cancer) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002980 
   
  
    
    Genome and transcriptome sequence data from a metastatic fibrolamellar hepatocellular carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002981 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002982 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectosigmoid adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002983 
   
  
    
    Genome and transcriptome sequence data from a metastatic carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002984 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002985 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002986 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002987 
   
  
    
    Genome and transcriptome sequence data from a metastatic endocervical adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002988 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002989 
   
  
    
    Genome and transcriptome sequence data from a medullary thyroid cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002990 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002991 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002992 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002993 
   
  
    
    Genome and transcriptome sequence data from a metastatic carcinoma of primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002994 
   
  
    
    Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002995 
   
  
    
    Genome and transcriptome sequence data from a carcinosarcoma of the uterus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002996 
   
  
    
    Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002997 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001002998 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001002999 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003000 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003001 
   
  
    
    Genome and transcriptome sequence data from a serous carcinoma of fallopian tube patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003002 
   
  
    
    Genome and transcriptome sequence data from a metastatic adult granulosa cell tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003003 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003004 
   
  
    
    Genome and transcriptome sequence data from a glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003005 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003006 
   
  
    
    Genome and transcriptome sequence data from a metastatic medullary thyroid cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003007 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003008 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma presumably of ovarian origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003009 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003010 
   
  
    
    Genome and transcriptome sequence data from a metastatic uterine leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003011 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003012 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003013 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003014 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003015 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003016 
   
  
    
    Genome and transcriptome sequence data from a metastatic ductal carcinoma of the breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003017 
   
  
    
    Genome and transcriptome sequence data from a metastatic large cell neuroendocrine tumour of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003018 
   
  
    
    Genome and transcriptome sequence data from a metastatic clear cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003019 
   
  
    
    Genome and transcriptome sequence data from a metastatic uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003020 
   
  
    
    Genome and transcriptome sequence data from a low grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003021 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003022 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003023 
   
  
    
    Genome and transcriptome sequence data from a metastatic renal cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003024 
   
  
    
    Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003025 
   
  
    
    Genome and transcriptome sequence data from an endometrial adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003026 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003027 
   
  
    
    Genome and transcriptome sequence data from an anaplastic myxopapillary ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003028 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003029 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003030 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003031 
   
  
    
    Genome and transcriptome sequence data from a metastatic collecting duct kidney cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003032 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003033 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003034 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003035 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003036 
   
  
    
    Genome and transcriptome sequence data from an ovarian adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003037 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003038 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003039 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003040 
   
  
    
    Genome and transcriptome sequence data from a chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003041 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003042 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003043 
   
  
    
    Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003044 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003045 
   
  
    
    Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003046 
   
  
    
    Genome and transcriptome sequence data from a sigmoid cancer and an ampullary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003047 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003048 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003049 
   
  
    
    Genome and transcriptome sequence data from a prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003050 
   
  
    
    Genome and transcriptome sequence data from a serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003051 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003052 
   
  
    
    Genome and transcriptome sequence data from a metastatic malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003053 
   
  
    
    Genome and transcriptome sequence data from an adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003054 
   
  
    
    Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003055 
   
  
    
    Genome and transcriptome sequence data from a small bowel carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003056 
   
  
    
    Genome and transcriptome sequence data from a solitary fibrous tumor (sarcoma) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001003057 
   
  
    
    Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003058 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003059 
   
  
    
    Genome and transcriptome sequence data from a metastatic mullerian tumor of endometrium patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003060 
   
  
    
    Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003061 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of the distal esophagus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003062 
   
  
    
    Genome and transcriptome sequence data from an extraosseous osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003063 
   
  
    
    Genome and transcriptome sequence data from an atypical bronchial carcinoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003064 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003065 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma of the palate patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003066 
   
  
    
    Genome and transcriptome sequence data from an appendiceal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003067 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastroesophageal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003068 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003069 
   
  
    
    Genome and transcriptome sequence data from a pancreatic neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003070 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003071 
   
  
    
    Genome and transcriptome sequence data from a pleural mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003072 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003073 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003074 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003075 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003076 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003077 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003078 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003079 
   
  
    
    Genome and transcriptome sequence data from a presumed metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003080 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003081 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003082 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003083 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003084 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003085 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003086 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003087 
   
  
    
    Genome and transcriptome sequence data from a pancreatic neuroendocrine cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003088 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003089 
   
  
    
    Genome and transcriptome sequence data from a pancreatic neuroendocrine patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003090 
   
  
    
    Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003091 
   
  
    
    Genome and transcriptome sequence data from a clear cell carcinoma of ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003092 
   
  
    
    Using sequencing and gene expression analyses, we identified a subgroup of HCA characterized by fusion of the INHBE and GLI1 genes and activation of sonic hedgehog pathway. Molecular subtypes of HCAs associated with different patients’ risk factors for HCA, disease progression, and pathology features of tumors. This classification system might be used to select treatment strategies for patients with HCA.
Related Publication:
Molecular Classification of Hepatocellular Adenoma Associates With Risk Factors, Bleeding, and Malignant Transformation
Nault, Jean-Charles; Laurent, Christophe et al.
Gastroenterology, Volume 152, Issue 4, 880-894.e6
http://dx.doi.org/10.1053/j.gastro.2016.11.042 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001003096 
   
  
    
    As part of the International Parkinson's Disease Genomics Consortium, exomes of Parkinson's disease (PD) patients and healthy controls were sequenced to study the genetic etiology of PD. This UK cohort consists of 70 PD patients. Researchers can apply for access to fastq files for this cohort. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  77 
 
  
    EGAD00001003097 
   
  
    
    High-coverage sequencing data from 47 Yemeni samples 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  47 
 
  
    EGAD00001003098 
   
  
    
    Low-coverage sequencing data from 99 Lebanese samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  99 
 
  
    EGAD00001003099 
   
  
    
    RNAseq data set (Mollaoglu et al., MYC drives progression of small cell lung cancer to a variant neuroendocrine subtype with vulnerability to Aurora kinase inhibition)
RNA isolation from primary tumors and healthy lungs was performed using the RNeasy Mini Kit (Qiagen) with the standard protocol. RNA was subjected to library construction with the Illumina TruSeq Stranded mRNA Sample Preparation Kit (cat# RS-122-2101, RS-122-2102) according to the manufacturer’s protocol. Chemically denatured sequencing libraries (25 pM) were applied to an Illumina HiSeq v4 single-read flow cell using an Illumina cBot. Hybridized molecules were clonally amplified and annealed to sequencing primers with reagents from an Illumina HiSeq SR Cluster Kit v4-cBot (GD-401-4001). Following transfer of the flow cell to an Illumina HiSeq 2500 instrument (HCSv2.2.38 and RTA v1.18.61), a 50-cycle single-read sequencing run was performed using HiSeq SBS Kit v4 sequencing reagents (FC-401-4002). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001003100 
   
  
    
    UKBEC 1st release of Exome data for 65 neuropathologically confirmed control individuals of European descent. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  65 
 
  
    EGAD00001003101 
   
  
    
    The need for a detailed catalogue of local variability for the study of rare diseases within the context of the Medical Genome Project motivated the whole exome sequencing of 267 unrelated individuals, representative of the healthy Spanish population. 
    
   
  
    
   
  267 
 
  
    EGAD00001003102 
   
  
    
    We sequenced the polyA+ fraction of the RNA of the leukocytes from 624 Sardinian individuals with RNAseq. Prior to library preparation we added either ERCC ExFold RNA Spike-In. An average of 60M 51 bp paired-end reads per sample was generated on a HiSeq 2000 (Illumina). Sequencing reads were then aligned using STAR-2.2.0c2 to the h37d5 reference genome supplemented with the ERCC spike-in sequences. We further provided an exon-exon junction database that we generated from the GENCODE v14 annotation. To remove contamination from a parallel experiment, we discarded any reads that mapped to the genomic regions of CBLB (chr3:105370773-105592330) and BCL11A (chr2:60672555-60784156). Filtered aligned reads (BAM format) are shared. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  624 
 
  
    EGAD00001003103 
   
  
    
    Cohort of 19 ADPKD patients characterized using long-read sequencing. The variant identification provided high sensitivity in identifying PKD1 pathogenic variants, with a diagnostic yield of 94.7%. This dataset includes all sequencing data (BAM files) of the 19 patients, in addition to their raw variants (unfiltered) obtained from the long-read sequencing as well as Sanger sequencing (VCF file). 
    
   
  
    
      
      PacBio RS II 
      
    
   
  19 
 
  
    EGAD00001003106 
   
  
    
    Human HiC 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001003107 
   
  
    
    We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003108 
   
  
    
    We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003109 
   
  
    
    We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003110 
   
  
    
    We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003111 
   
  
    
    We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003112 
   
  
    
    We collected fresh tissue from an untreated GBM (SF10592) directly from the operating
room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003113 
   
  
    
    We collected fresh tissue from an untreated GBM (SF10679) directly from the operating
room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003114 
   
  
    
    We collected fresh tissue from an untreated GBM (SF10281) directly from the operating
room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003115 
   
  
    
    Whole genome sequencing data of 15 French Caucasian and 10 African-Caribbean men with prostate cancer. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001003116 
   
  
    
    Benchmark data set containing five tumor/normal pairs of non-small cell lung cancer (NSCLC) patients. Tissue pairs were screened with bisulfite (BS) sequencing, MeDIP methylation enrichment sequencing and RNA sequencing in order to identify differentially methylated and expressed spots in the genomes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001003117 
   
  
    
    In this study, we sequenced three NUT midline carcinoma genomes and their transcriptomes (NMC1, NMC2 and Ty-82), and two paired normal blood samples (for NMC1 and NMC2). Whole-genome sequencing libraries were generated by PCR-free methods and sequenced on HiSeq X machines. Transcriptome (mRNA) sequencing was performed on HiSeq 2500 machines. PCR duplicate-marked, indel-realigned, and base-recalibrated BAM files are provided in our dataset. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003118 
   
  
    
    Targeted capture sequencing for cases with MDS who underwent unrelated bone marrow transplantation via the Japan Marrow Donor Program. 
    
   
  
    
   
  797 
 
  
    EGAD00001003119 
   
  
    
    TP53 targeted panel aligned reads, consisting of paired-end BAM reads from ovarian cancer tumor samples. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  76 
 
  
    EGAD00001003120 
   
  
    
    We used WGS (Complete Genomics) to characterise five metastatic tumours from a BRAF mutant melanoma patient who presented intrinsic resistance. 
    
   
  
    
      
      Complete Genomics 
      
    
   
  6 
 
  
    EGAD00001003121 
   
  
    
    Dataset is composed of FASTQ files from 165 samples of small round cell sarcomas which were RNA-sequenced (whole transcriptome) with either Illumina HiSeq 2500 (120 million reads per sample, paired-end 100 bp) or Illumina NextSeq 500 (110 million reads per sample, paired-end 150 bp). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  165 
 
  
    EGAD00001003122 
   
  
    
    December 2016 data update (bam/fastq for WGBS on samples CEMT0062, CEMT0068, CEMT0072, CEMT0086, CEMT0087) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada, as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001003125 
   
  
    
    WGS data of medulloblastoma tumor/control pairs. 
    
   
  
    
   
  224 
 
  
    EGAD00001003126 
   
  
    
    WGS data of medulloblastoma tumor/control pairs. 
    
   
  
    
   
  74 
 
  
    EGAD00001003127 
   
  
    
    WGS data of medulloblastoma tumor/control pairs. 
    
   
  
    
   
  482 
 
  
    EGAD00001003128 
   
  
    
    Exome sequencing data for medulloblastoma tumor/control pairs 
    
   
  
    
   
  35 
 
  
    EGAD00001003130 
   
  
    
    Whole exome sequencing data for patients with Bosma arhinia microphthalmia syndrome (BAMS). The dataset includes 21 samples from 7 families with BAMS; see Gordon et al, Nature Genetics, 2017. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  21 
 
  
    EGAD00001003131 
   
  
    
    The dataset consists of two main sample groups. 1) The inter-tumour sample group contains a total of 97 samples from 27 patients. Each patient has a single normal and primary sample as well as one or more metastases. All samples were sequenced using IonTorrent PGM and a custom colorectal cancer (CRC) panel. 2) The intra-tumour sample group contains a total of 68 samples from a single tumour as well as a normal tissue sample. All 68 samples were sequenced using IonTorrent PGM and a custom CRC panel. Shallow whole genome sequencing was additionally applied to 10 of the samples using Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Ion Torrent PGM 
      
    
   
  193 
 
  
    EGAD00001003132 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: GACA-CN. 
    
   
  
    
   
  84 
 
  
    EGAD00001003133 
   
  
    
    RRBS data of 86 Ewing patients (French). Illumina HiSeq 2000/2500 (Fastq files available). Sheffield et al. Nat Med. 2017 Jan 30 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001003134 
   
  
    
    DATA FILES FOR GRUBER SJAMLM7 EXOME 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001003135 
   
  
    
    DATA FILES FOR GRUBER SJAMLM7 RNASEQ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001003136 
   
  
    
    We carried out whole-genome oxidative bisulfite sequencing (WGoxBS) in the placentas of two healthy female and two healthy male pregnancies, generating an average genome depth of coverage of 25x. The sex-specific differential methylation pattern observed in this region was validated in an additional 8 healthy placentas (including 2 from the WGoxBS) using SureSelect in-solution target capture. For WGoxBS, placental genomic DNA (4 µg) from 4 healthy pregnancies was processed to achieve 10 kb fragments with the g-Tube (Covaris), according to the manufacturer's instructions. To increase the number of uniquely sequenced reads, two independent libraries were generated for each individual. Multiplexed sequencing was carried out on the Illumina MiSeq, HiSeq 2000, and HiSeq 2500 instruments with 2 × 100, 2 × 50 and 2 × 125 cycles using MiSeq Reagent Kit v3, HiSeq SBS Kit v3 and HiSeq SBS Kit v4, respectively. For SureSelect in-solution capture, placental genomic DNA (3.5 µg) from 8 healthy pregnancies (including 2 from the WGoxBS) was fragmented by the Covaris S220 system according to the SureSelect Methyl-Seq target enrichment protocol (Agilent). All 8 libraries were pooled and sequenced on the Illumina HiSeq 2500 instrument with 2 × 125 cycles using HiSeq SBS Kit v4 and a single lane of the Illumina HiSeq 4000 instrument with 2 × 150 cycles using HiSeq 3000/4000 SBS Kit following Illumina's guidelines (Illumina Application Note: Epigenetics February 2016). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001003137 
   
  
    
    Metastatic and primary tumour samples were collected from 4 patients with advanced breast cancer. Samples were collected at autopsy and also from biopsies taken during life. Tumour and germline samples are available. Whole exome sequencing was performed on all samples. 
    
   
  
    
   
  52 
 
  
    EGAD00001003138 
   
  
    
    A dataset consisting of Multi-regional Whole Exome Sequencing (WES) and Whole Genome Sequencing (WGS) data for 54 samples from 9 patients with hepatocellular carcinoma. The dataset includes 45 tumor samples and 9 normal blood samples. Selected somatic variants were validated by Sequenom. Patients covered are: Patient 1, Patient 2, Patient 3, Patient 4, Patient 5, Patient 6, Patient 7, Patient 8, Patient 9 and Patient 10. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001003139 
   
  
    
    200PG : WGS Aligned Sequence (fastq) : Aligned WG sequence data (bam) in this dataset are from the 124 CPCGene Tumour/Normal Pairs used in the 200PG Study.  https://www.ncbi.nlm.nih.gov/pubmed/28068672 
    
   
  
    
   
  262 
 
  
    EGAD00001003140 
   
  
    
    We analyzed the spectrum and clinical significance of MYC and BCL2 mutations in 347 DLBCL cases from a population-based cohort in BC, Canada. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  347 
 
  
    EGAD00001003141 
   
  
    
    List of SNPs, and their frequencies, extracted from a low pass whole genome sequencing of 3,514 individuals. 
    
   
  
    
   
  1 
 
  
    EGAD00001003142 
   
  
    
    RNA sequencing of 31 patient-derived fibroblast cell lines from patients with inborn errors of cobalamin (vitamin B12) metabolism, and 7 control samples. The RNA-seq library was prepared using the TruSeq Stranded Total RNA Sample Preparation Kit (Illumina RS-122–2301) including Ribo-Zero Gold depletion to remove ribosomal RNA. Sequencing was done on an Illumina HiSeq 2000 sequencer, using 100 bp paired-end reads. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001003143 
   
  
    
    Total stranded TruSeq RNA sequencing by Illumina of six tumor samples from six cases of pediatric Pilocytic astrocytoma. The data is published in the following paper: Tomic TT, Olausson J, Wilzen A, Sabel M, Truve K, Sjogren H, Dosa S, Tisell M, Lannering B, Enlund F, Martinsson T, Aman P, Abel F. A new GTF2I-BRAF fusion mediating MAPK pathway activation in pilocytic astrocytoma. PLoS One. 2017 Apr 27;12(4):e0175638. 
    
   
  
    
      
      ILLUMINA 
      
      Illumina HiScanSQ 
      
    
   
  6 
 
  
    EGAD00001003145 
   
  
    
    Sensory neurons are nerve cells that are activated by sensory input, such as heat and light, and convey information to the brain. Although a key cell type in complex organisms, human sensory neurons are challenging to study because they are impossible to obtain from living donors. We have collaborated with the Neucentis Pharmaceutical Research Unit to differentiate sensory-neuron-like cells from human induced pluripotent stem cells derived as part of the Human Induced Pluripotent Stem Cells Initiative. We will sequence RNA from 100 IPS lines derived from healthy individuals and perform RNA-seq on the differentiated cells to identify noncoding variants that alter gene expression in human sensory neurons. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  123 
 
  
    EGAD00001003146 
   
  
    
    We performed whole genome sequencing of nine OC patient-derived cell lines and one normal cell line (HOSEpiC) to analyze if the cell lines harbor OC-typical genomic aberrations absent in normal cells and to relate genomic features to drug sensitivities. 
    
   
  
    
   
  10 
 
  
    EGAD00001003148 
   
  
    
    Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for near-diploid immortalized lymphoblastoid cell line GM18507. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  192 
 
  
    EGAD00001003149 
   
  
    
    Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for third-passage patient-derived primary triple-negative breast cancer xenograft SA501X3F. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  384 
 
  
    EGAD00001003150 
   
  
    
    Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for fourth-passage patient-derived primary triple-negative breast cancer xenograft SA501X4F. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  384 
 
  
    EGAD00001003151 
   
  
    
    Bulk whole-genome BAM files for 184-hTERT-L2, SA501X3F, and SA501X4F. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001003152 
   
  
    
    Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for near-diploid immortalized breast epithelial cell line 184-hTERT-L2. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  192 
 
  
    EGAD00001003153 
   
  
    
    Sequencing of untreated pancreatic cancer metastases and primary tumor sections. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  49 
 
  
    EGAD00001003154 
   
  
    
    RNA-Seq files for SJOS study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001003155 
   
  
    
    WES files for SJMDS paper titled 'Genomic Landscape of Pediatric Myelodysplastic Syndromes' 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001003156 
   
  
    
    WGS files for SJMDS paper titled 'Genomic Landscape of Pediatric Myelodysplastic Syndromes' 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001003157 
   
  
    
    Alignment of Genome Denmark Phase II dataset to GRCh38. The dataset consists of 150 Danish individuals (50 trios) sequenced to 80X. The BAM-file contains data from multiple libraries created from one individual with libraries of 180, 500, 800, 2000, 5000, 10000 and 20000 bp. The libraries were created using standard Illumina protocols for paired end reads (180-800bp libraries) and mate pair libraries (2kb-20kb). 
    
   
  
    
   
  150 
 
  
    EGAD00001003158 
   
  
    
    Bam files consisting of aligned MeDIP-seq reads from cord blood cells and cord blood mononuclear cells of twins conceived through in vitro fertilisation 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  75 
 
  
    EGAD00001003159 
   
  
    
    Bam files consisting of aligned MeDIP-seq reads from cord blood cells and cord blood mononuclear cells of twins not conceived through in vitro fertilisation 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  105 
 
  
    EGAD00001003160 
   
  
    
    Exome data from patients and parents with DONSON mutations 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001003161 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Exome Sequencing - October 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001003162 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PACA-CA. 
    
   
  
    
   
  298 
 
  
    EGAD00001003163 
   
  
    
    Whole genome sequencing data of 20 carcinosarcomas. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001003164 
   
  
    
    Variant call set (vcf) for three (primary and two recurrent) tumors 
    
   
  
    
   
  3 
 
  
    EGAD00001003165 
   
  
    
    Whole genome sequencing was performed for 81 liver cancer cell lines. Additional whole exome sequencing was performed for a subset of 11 liver cancer cell lines. SK_HEP_1 is also provided, though it is not considered to be of hepatic origin. These sequencing data provide a detailed genomic characterization of liver cancer models. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  82 
 
  
    EGAD00001003168 
   
  
    
    Blood samples from eight lung cancer patients and one benign lung tumor patient were collected for this dataset. Blood samples were centrifuged first at 1,600 × g for 10 minutes, and then the plasma was transferred into new micro tubes and centrifuged at 16,000 × g for another 10 minutes. The plasma was collected and stored at -80 °C. cfDNA was extracted from 5 ml of plasma using the Qiagen QIAamp Circulating Nucleic Acids Kit and quantified with a Qubit 3.0 Fluorometer (Thermo Fisher Scientific). Bisulfite conversion of cfDNA was performed using the EZ-DNA-Methylation-GOLD kit (Zymo Research). After that, the Accel-NGS Methyl-Seq DNA library kit (Swift Biosciences) was used to prepare the sequencing libraries. The DNA libraries were then sequenced with 150 bp paired-end reads. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001003174 
   
  
    
    This study comprises 116 liver cancer cases belonging to the LICA-CN project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  232 
 
  
    EGAD00001003176 
   
  
    
    For each subject, genomic DNA from whole blood, circulating cell-free DNA and tumor tissue (whenever possible) was subjected to targeted next-generation sequencing on Illumina MiSeq or HiSeq 4000 platforms. The sequencing results from whole blood were used to distinguish germline from somatic mutations. Specimens were collected from patients with different kinds of solid tumors, but most are lung cancer patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  1845 
 
  
    EGAD00001003180 
   
  
    
    HipSci - Monogenic Diabetes - RNA Sequencing - October 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003181 
   
  
    
    HipSci - Bardet-Biedl Syndrome - RNA Sequencing - October 2016 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001003186 
   
  
    
    Variants on the Y chromosome for 62 Danish males in VCF format from the GenomeDenmark Phase 2 cohort. Variants were called using reference-based approaches, such as the HaplotypeCaller module from GATK, and by aligning de novo assemblies to the reference using ASMvar. 
    
   
  
    
   
  68 
 
  
    EGAD00001003187 
   
  
    
    TBD 
    
   
  
    
      
      Complete Genomics 
      
    
   
  9 
 
  
    EGAD00001003188 
   
  
    
    Variants and genotypes called in 50 Danish parent-offspring trios from 80x Illumina sequencing data using BayesTyper. Data was produced using different insert size libraries of the sizes 180, 500, 800, 2000, 5000, 10000 and 20000 bp. The sample IDs for the fathers and mothers are TrioID-01 and TrioID-02, respectively, and the IDs for the children are TrioID-0x, where x is a number between 3 and 7. 
    
   
  
    
   
  150 
 
  
    EGAD00001003189 
   
  
    
    Whole genome sequencing of 8 HER2-Positive Breast Cancer (in complement to EGAD00001001844) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001003190 
   
  
    
    WGS blood data (fastq raw read sequences) for French ICGC leiomyosarcoma cancer sequencing project, 67 samples representing 67 donors. Sequencing was performed on Illumina HiSeq. The libraries were then sequenced with a 2 x 100bp paired-end protocol to a minimum mean coverage of 30x. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  67 
 
  
    EGAD00001003191 
   
  
    
    WGS cancer data (fastq raw read sequences) for French ICGC leiomyosarcoma cancer sequencing project, 78 samples representing 67 donors. Sequencing was performed on Illumina HiSeq. The libraries were then sequenced with a 2 x 100bp paired-end protocol to a minimum mean coverage of 50x. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001003192 
   
  
    
    RNA-Seq data (fastq raw read sequences) for French ICGC leiomyosarcoma cancer sequencing project, 78 samples representing 67 donors. Sequencing was performed on Illumina HiSeq. The libraries were then sequenced with a 2 x 75bp paired-end protocol to a minimum mean of 50 million reads. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001003193 
   
  
    
    Exome sequencing for 2 infertile brothers 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001003194 
   
  
    
    This dataset contains whole exome sequences of six HCC patients from Qidong, China, who were very likely exposed to aflatoxin. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001003196 
   
  
    
    Amplicon-based fungal metagenomic sequencing for the identification of fungal species in brain tissue from Alzheimer's disease. The study consists of 14 samples, sequenced using Illumina's paired-end technology. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  14 
 
  
    EGAD00001003200 
   
  
    
    Files from whole exome sequencing of 26 tumors and two matched normals from one melanoma patient. The 26 tumors include the untreated primary, cutaneous metastases and distant metastases to internal organs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001003203 
   
  
    
    Aligned (hg19) sequencing data from 16 participants with FL/DLBCL. 
    
   
  
    
   
  37 
 
  
    EGAD00001003204 
   
  
    
    Understanding how cells sense and respond to their environment, and how these responses are modulated by genetic variation, are fundamental biological problems, particularly for understanding how pathogenic organisms invade and manipulate the cells of the human immune system. Macrophages recognize and respond to many important human pathogens including HIV-1, Mycobacterium tuberculosis and Salmonella. This study will focus on the cellular response of human macrophages to Salmonella infection and how this response is modulated by the genetic background of the individual as well as by an additional pro-inflammatory stimulus (interferon-gamma priming). We will acquire 100 human induced pluripotent stem cell lines from the HipSci project, differentiate the cells in vitro into macrophages and expose them to four environmental conditions: (i) no stimulation, (ii) interferon-gamma (18h), (iii) Salmonella typhimurium SL1344 (5h), (iv) interferon-gamma (18h) + Salmonella (5h). Subsequently, we will isolate RNA from the samples for sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  236 
 
  
    EGAD00001003205 
   
  
    
    160 WES and 25 WGS for HBV-related HCC, and 15 WES for ICC; belongs to the LICA-CN project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  402 
 
  
    EGAD00001003206 
   
  
    
    BACKGROUND
TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early stage non-small cell lung cancer (NSCLC). 
METHODS
Multiregion high-depth whole-exome sequencing (M-seq) was performed on 100 early stage NSCLC tumors resected prior to systemic therapy. A total of 327 tumor regions were sequenced and analyzed to define evolutionary histories, obtain a census of clonal and subclonal events, and assess the relationship between ITH and recurrence-free survival (RFS). 
RESULTS
Widespread ITH was observed for both somatic copy number alterations (median 48% [0.03-88%]) and mutations (median 30% [0.5-93%]). Driver mutations in EGFR, MET, BRAF and TP53 were almost always clonal. However, heterogeneous driver alterations occurring later in evolution were found in over 75% of tumors and were common in PIK3CA, NF1 and genes involved in chromatin modification and DNA response and repair. Genome doubling and ongoing dynamic chromosomal instability (CIN), illustrated by mirrored subclonal allelic imbalance, were identified as causes of ITH resulting in parallel evolution of driver copy number events, including amplifications of CDK4, FOXA1, and BCL11A. Elevated copy number heterogeneity was associated with shorter RFS (HR=4.9, P=0.00044), which remained significant in a multivariate analysis.
CONCLUSIONS
ITH mediated through CIN, rather than point mutational heterogeneity, was associated with increased risk of relapse, supporting its value as a prognostic predictor, and the need to target this high-risk phenotype. 
    
   
  
    
   
  427 
 
  
    EGAD00001003207 
   
  
    
    Whole genome sequencing data for MMML (28 tumor/control pairs) 
    
   
  
    
   
  56 
 
  
    EGAD00001003208 
   
  
    
    Whole genome sequencing data for MMML (12 tumor/control pairs) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001003210 
   
  
    
    Whole genome sequencing data for MMML (cell_line) 
    
   
  
    
   
  8 
 
  
    EGAD00001003211 
   
  
    
    Deep (>25x mean coverage) whole genome sequencing on 5-10 families drawn from the Scottish Family Health Study with four or more children. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  57 
 
  
    EGAD00001003213 
   
  
    
    The olfactory gene repertoire is largely species-specific, shaped by the nature and necessity
of chemosensory information for survival in each species' niche. We are intrigued by this
interspecific variation and started to investigate the olfactory transcriptome in primates for
evidence of selection at the level of receptor gene choice. Having collected this data from
two primates, we now wish to extend the analysis to humans.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001003215 
   
  
    
    This data set contains whole exome sequences of individuals with self-stated parental
relatedness from the East London Genes & Health cohort. Rare frequency functional variants
in these healthy individuals will be studied with respect to the genetic health of the
participants and loss-of-function analysis of human genes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001003216 
   
  
    
    Whole genome sequencing of tumour normal pairs of human undifferentiated sarcomas. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  98 
 
  
    EGAD00001003217 
   
  
    
    Targeted resequencing at high depth (21 genes, 9 chromosomal regions): at least 4 FFPE samples per case and matched germline DNA: * 100 cases with detailed outcome data, including 15 cases with tumour relapse (515 samples) * 40 cases with matched pre-chemotherapy biopsies (240 samples) * 50 nephrogenic rests matched to above cases (50 samples)
We expect a proportion (possibly 10%) of cases to be mutationally silent on the above studies, and propose to subsequently carry out integrated whole-genome, methylome and transcriptome studies on matched frozen tissue from these cases 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001003218 
   
  
    
    This study comprises 80 brain cancer cases (160 samples) belonging to the GBM-CN project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001003220 
   
  
    
    Whole genome, whole exome, and custom panel sequencing of high-grade meningioma cohort 
    
   
  
    
   
  188 
 
  
    EGAD00001003221 
   
  
    
    Aligned, merged and deduplicated BAM files from BGISeq-500 sequencing of six samples: matched tumour-normal pairs from three melanoma patients. 
    
   
  
    
   
  6 
 
  
    EGAD00001003222 
   
  
    
    Aligned, merged and deduplicated BAM files from HiSeqXTen sequencing of six samples: matched tumour-normal pairs from three melanoma patients. 
    
   
  
    
   
  6 
 
  
    EGAD00001003223 
   
  
    
    We collected tumor samples and adjacent normal mucosae from 5 patients with colorectal cancer during surgery from 2014 to 2016 at the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003224 
   
  
    
    We collected tumor samples and adjacent normal mucosae from 17 patients with colorectal cancer during surgery from 2014 to 2016 at the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001003225 
   
  
    
    Whole Genome Sequencing Illumina HiSeq data from 111 men with prostate cancer. Samples were taken from primary tissue obtained at prostatectomy (target sequencing depth 50X) with matched blood control (target sequencing depth 30X). This data is from batches 4 to 6. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  221 
 
  
    EGAD00001003227 
   
  
    
    ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: OV-AU. 
    
   
  
    
   
  146 
 
  
    EGAD00001003230 
   
  
    
    Small RNA expression profiles of the blood plasma-derived exosomes from B-cell chronic lymphocytic leukemia patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003231 
   
  
    
    Poly A transcriptome sequence of multifocal hepatocellular carcinoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003234 
   
  
    
    Aligned whole genome sequence from AML relapse project 
    
   
  
    
   
  33 
 
  
    EGAD00001003235 
   
  
    
    Raw exome sequence data (fastq) for the GATCI project 
    
   
  
    
      
      unspecified 
      
    
   
  172 
 
  
    EGAD00001003236 
   
  
    
    Raw whole genome sequence data (fastq) for the GATCI project 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001003237 
   
  
    
    Primary mucosal melanomas (MMs) arise from melanocytes located in mucosal membranes lining the respiratory, gastrointestinal and urogenital tracts. MMs frequently present late and have a poor prognosis; the 5-year survival rate is only 14%. MM makes up only ~1.4% of all melanomas and it is this rarity that makes knowledge of the genetic changes that contribute to its pathogenesis limited to a small number of exome/genome studies and other targeted studies. Thus to investigate the somatic alterations and mutation spectra in MM genomes, we have extracted genomic DNA from formalin-fixed, paraffin-embedded (FFPE) human MMs, and subjected them to whole exome sequencing. Given the propensity of MM to metastasize, we will also be sequencing metastatic MM lesions; primary and metastatic lesions from the same individual represent an excellent opportunity to identify potential drivers of metastasis in MM. Finally we will sequence 'normal' DNA from the same individual, where possible, to exclude germline variations. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  110 
 
  
    EGAD00001003239 
   
  
    
    This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001003240 
   
  
    
    Study of cell lineage and embryogenesis using biopsy samples from sites across the whole body (post mortem). Sample donors are recruited sensitively through the Phoenix study and consent to samples being taken after their death for both the Phoenix study and this WTSI study. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001003241 
   
  
    
    Toxoplasmosis is a zoonotic disease caused by a ubiquitous protozoan parasite called Toxoplasma gondii, which can infect all mammal and bird species throughout the world. Seroprevalence varies widely between countries. Studies have estimated that between 7% and 34% of people in the UK have been infected with T. gondii. The vast majority of these people will not have noticed any symptoms; however, about 10% of people develop a mild to moderate self-limiting flu-like illness. Following the acute active stage of the infection, the parasite persists in the body in the form of cysts, particularly in heart and skeletal muscle and nervous system tissues, for many years, and usually for life. In immunocompetent persons these cysts do not pose a health risk. We will use RNA-seq to quantify the transcriptional response of macrophages to T. gondii infection.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001003242 
   
  
    
    This study comprises three different datasets: 1) 57 samples from the 1243 canapps cell line study, 2) 91 FFPE normal samples and 3) 87 samples from the SCORT WS2 dataset. The aim is to sequence these 235 samples in order to test the new V2 Colorectal bait design. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  92 
 
  
    EGAD00001003243 
   
  
    
    This data set is composed of RNA sequencing of ER-positive breast cancers from Korean patients under 35 years old. It provides 50 alignment files from 50 tumor samples and is part of the total project data set. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001003244 
   
  
    
    We aim to sequence the mRNA transcriptome of 22 human melanoma cell lines in biological triplicate in order to define the gene expression profile of each cell line. The data will be correlated to the mutation status and the sensitivity to a panel of drugs in order to identify genes whose deregulation is associated with drug resistance.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD00001003245 
   
  
    
    We aim to sequence the small RNAs of 22 human melanoma cell lines in biological triplicate in order to define the microRNA expression profile of each cell line. The data will be correlated to the mutation status and the sensitivity to a panel of drugs in order to identify genes whose deregulation is associated with drug resistance.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  66 
 
  
    EGAD00001003246 
   
  
    
    Whole exome sequencing of hepatosplenic T cell lymphoma (HSTL) tumors, paired normals, and cell lines, including (1) 68 exome capture, paired-end Illumina Hiseq sequencing, BAM files from HSTL tumor samples, (2) 20 exome capture, paired-end Illumina Hiseq sequencing, BAM files from HSTL paired normal samples, and (3) 2 exome capture, paired-end Illumina Hiseq sequencing, BAM files from HSTL cell lines. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  90 
 
  
    EGAD00001003247 
   
  
    
    Liberal variant calls generated with VarScan 
    
   
  
    
   
  37 
 
  
    EGAD00001003248 
   
  
    
    A BRAF V600E colorectal organoid which is sensitive to MAP kinase inhibition was mutagenised with the chemical mutagen ENU and then drug selected using a combination of Trametinib, Dabrafenib and Cetuximab. Single cell derived organoids were then manually picked and expanded in drug. Resistance was confirmed in a 14 day assay and DNA was collected. These then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors from a screen in two 2D BRAF V600E colorectal cell lines. Pools of resistant clones were also sequenced. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  36 
 
  
    EGAD00001003250 
   
  
    
    1 cm biopsies from patients undergoing bladder cystectomy will be collected. The underlying muscle and stroma will be removed and the remaining epithelia dissected into small sequential areas which will be sent for ultra-deep exome sequencing using a panel of known cancer and viral genes. Sequence analysis using similar methods to Martincorena I et al (Science 2015, 348:880) will provide an idea of the somatic mutational landscape in these patient samples. Individual patient muscle samples will also be sequenced as a reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  55 
 
  
    EGAD00001003252 
   
  
    
    Sequencing of drug resistant organoids 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  36 
 
  
    EGAD00001003253 
   
  
    
    Targeted gene screen of cell line tumour samples for testing the new V2 Colorectal gene panel. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  57 
 
  
    EGAD00001003254 
   
  
    
    R&D project to develop low input library construction methods. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001003255 
   
  
    
    Transcriptome of anaplastic meningiomas 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001003256 
   
  
    
    Whole genome sequencing for 131 early onset prostate tumor/control pairs (ICGC) 
    
   
  
    
   
  262 
 
  
    EGAD00001003257 
   
  
    
    Hi-C and promoter capture Hi-C data for HT29 and LoVo.
2 replicates per cell line for the Hi-C.
3 replicates per cell line for the CHi-C. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001003258 
   
  
    
    ChIPseq data for H3K4me1 and H3K9me3 in HT29; H3K4me1, H3K27me3, H3K9me3, H3K36me3 in LoVo. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001003259 
   
  
    
    Regions of common inter-individual DNA methylation differences in human monocytes – potential function and genetic basis
WGBS Data of Samples:
43_Hm03_BlMo_Ct, 43_Hm02_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm01_BlMo_Ct
For details about sequencing or sample metadata check http://deep.dkfz.de/ 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001003260 
   
  
    
    The cell lines in this study are a combination of internally sequenced (cosmic) and externally sequenced cell lines known to be “double-wild-type” (lacking BRAF and NRAS somatic mutations). These sequences were realigned in this data set for consistency. 
    
   
  
    
   
  22 
 
  
    EGAD00001003261 
   
  
    
    These are seven sequencing files from whole exome and whole genome sequencing of five tissue samples collected from one pancreatic cancer patient 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001003262 
   
  
    
    High-coverage WES sequencing of DNA samples from 50 PTCs was performed on the Illumina HiSeq 2500 or 4000 System 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001003263 
   
  
    
    ICGC DCC Release 24, PACA-CA Deep KRAS sequencing 
    
   
  
    
   
  82 
 
  
    EGAD00001003264 
   
  
    
    ICGC DCC Release 24, PACA-CA Exome sequence 
    
   
  
    
   
  190 
 
  
    EGAD00001003265 
   
  
    
    For CCOC cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists, who confirmed the final selected cohort, prior to inclusion in the sequencing cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine.
All CCOC tumours are primary tumour samples. Library construction and sequencing: Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  70 
 
  
    EGAD00001003266 
   
  
    
    For ENOC cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists, who confirmed the final selected cohort, prior to inclusion in the sequencing cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. For ENOC, DAH985 and DG1288 are recurrent and both were treated with chemotherapy after their first surgery. DAH123 is an untreated sample, a metastasis from a primary endometrial tumour.
All HGSC, GCT, CCOC and the remaining ENOC tumours are primary tumour samples. Library construction and sequencing: Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001003267 
   
  
    
    For GCT cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists, who confirmed the final selected cohort, prior to inclusion in the sequencing cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine.
All GCT tumours are primary tumour samples. Library construction and sequencing: Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001003268 
   
  
    
    HGSC cases in the OvCaRe and CRCHUM Tumour Banks were selected according to the following criteria: (i) were administered platinum-taxane-based therapy; (ii) relapsed within 12 months (365 days) or had longer than 4.5 years (1642.5 days) of follow-up data; (iii) had at least 50% tumour content by H&E staining and expert pathology review. All cases were re-reviewed by expert pathologists to confirm the diagnosis of HGSC. Germline BRCA1 and BRCA2 status was determined for all patients through hereditary cancer screening programs. The case selection for the discovery cohort was designed to amplify biological differences by selecting cases from the extremes of the outcome distribution.
All HGSC tumours are primary tumour samples. Library construction and sequencing: Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  118 
 
  
    EGAD00001003269 
   
  
    
    High-coverage WGS sequencing of DNA samples from 90 pairs of GCs was performed on the Illumina HiSeq X Ten System. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1332 
 
  
    EGAD00001003270 
   
  
    
    ICGC DCC Release 24, PACA-CA Whole Genome sequence merged alignments 
    
   
  
    
   
  95 
 
  
    EGAD00001003271 
   
  
    
    WGS of T-cell and NK-cell lymphoma
The tumor samples were sequenced with the Illumina HiSeq 2500 platform and the resulting FASTQ files have been uploaded. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  102 
 
  
    EGAD00001003272 
   
  
    
    March 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003273 
   
  
    
    Low-coverage whole genome sequencing for the establishment of genome-wide copy number alterations in pleural effusions and respective primary tumors 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  20 
 
  
    EGAD00001003274 
   
  
    
    Whole genome sequencing data for MMML (tumor/control pairs and one cell_line) 
    
   
  
    
   
  315 
 
  
    EGAD00001003275 
   
  
    
    Targeted resequencing of samples was done with the TruSeq custom amplicon low input kit (TSCA-LI, Illumina). The oligo capture probes were designed to include a prefix of 8 random nucleotides at the 5' end of each probe. The assay is designed such that each targeted locus is annealed with two probes, resulting in amplicons tagged with unique molecular identifiers (UMI) (22) of 16 bases.
Raw FASTQ sequencing files were processed as follows: (a) The first 8 bases were trimmed from each read and recorded with the corresponding base quality scores (BQ) in the attribute field. (b) Reads were aligned with BWA. (c) A first round of PCR duplicate cleaning was performed with Picard tools MarkDuplicates using the parameters BARCODE_TAG=BC TAGGING_POLICY=All REMOVE_DUPLICATES=true. (d) Since in the previous step only duplicate reads with identical UMIs were removed, a second pass of filtering was done. Reads with identical mapping were considered unique only if their corresponding UMIs were different in at least 3 positions (i.e., UMI edit distance > 2). (e) Paired-end read pairs overlapping genomic positions were clipped to avoid overestimation of the sequencing coverage using bamUtils clipOverlap. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  74 
 
  
    EGAD00001003276 
   
  
    
    Whole genome sequencing data for MMML (24 tumor/control pairs), fastq-files 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001003278 
   
  
    
    Whole Exome and Target Sequencing Data in 75 Samples from 5 Hepatocellular Carcinoma Patients. The sequencing was performed by Illumina HiSeq 4000.
Background and aims: Intratumoral heterogeneity (ITH) challenges identifying mutations with target therapy potential whereas circulating cell-free DNAs (cfDNAs) could reflect nearly the entire mutation spectrum in given tumors. We investigated how to minimize the limit of ITH for profiling hepatocellular carcinoma (HCC). Methods: Thirty-two multi-regional HCC samples from five patients were subjected to whole exome sequencing (WES) and targeted deep sequencing (TDS). ITH extent was measured by the average percentage of non-ubiquitous mutations (present in parts of tumor regions). Matched cfDNAs were also analyzed by WES and TDS. Profiling efficiencies of single tumor specimen and cfDNA were compared and the one that better depicted the mutational landscape was selected to screen therapeutic targets. Results: We found variable extents of ITH in HCCs and observed branched and parallel evolution patterns. ITH level decreased at higher sequencing depth of TDS than that measured by WES (28.1% vs 34.9%, P < 0.01) but it remained unchanged when additional samples were analyzed. TDS of single tumor specimen detected an average of 70% of the total mutations in HCC. Although more mutations were detected in cfDNA under TDS than WES, an average of 47.2% of total HCC mutations uncovered by cfDNA suggested that tissue outperforms cfDNA and that the latter may serve as an alternative in profiling the HCC genome. Consequently, TDS of single tumor tissue in 66 patients and cfDNAs in four unresectable HCCs identified 38.6% (26/66 and 1/4) patients bearing therapeutic targets. Conclusions: TDS of single tumor specimen could largely circumvent ITH to uncover mutations indicative of target therapy in HCC. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  124 
 
  
    EGAD00001003279 
   
  
    
    RNA sequencing data for 170 medulloblastoma tumor samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  171 
 
  
    EGAD00001003280 
   
  
    
   
  
    
      
      NextSeq 550 
      
    
   
  16 
 
  
    EGAD00001003281 
   
  
    
    Genomic alterations driving tumorigenesis result from the interaction of environmental exposures and endogenous cellular processes. With a diversity of risk factors including viral infection, carcinogenic exposures and metabolic diseases, liver cancer is an ideal model to study these interactions. Whole genome sequencing of liver tumors identified 10 mutational signatures showing distinct relationships with environmental exposures, replication and transcription. Transcription-coupled damage was specifically associated with the liver-specific signature 16 and alcohol intake. A flood of indels was identified in very highly expressed hepato-specific genes, likely resulting from replication-transcription collisions. Reconstruction of sub-clonal architecture revealed mutational signature evolution during tumor development exemplified by the vanishing of aflatoxin-B1 signature in African migrants. These findings shed new light on the natural history of liver cancers. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001003282 
   
  
    
    Analysis scripts and output 
    
   
  
    
   
  37 
 
  
    EGAD00001003283 
   
  
    
    Whole genome sequencing data for MMML (healthy cell_line) 
    
   
  
    
   
  24 
 
  
    EGAD00001003284 
   
  
    
    Whole exome sequencing of enteropathy-associated T cell lymphoma (EATL) tumors and paired normals, as well as RNA-sequencing of EATL tumors: including (1) 69 exome capture, paired-end Illumina Hiseq sequencing, BAM files from EATL tumor samples, (2) 36 exome capture, paired-end Illumina Hiseq sequencing, BAM files from EATL paired normal samples, and (3) 32 RNAseq, paired-end Illumina Hiseq sequencing, BAM files from EATL tumor samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  137 
 
  
    EGAD00001003285 
   
  
    
    RNA sequencing data for MMML (3 tumor samples and 1 gcbcell) 
    
   
  
    
   
  5 
 
  
    EGAD00001003286 
   
  
    
    Whole genome sequencing data for MMML (7 tumors and 8 controls) 
    
   
  
    
   
  15 
 
  
    EGAD00001003290 
   
  
    
    Whole genome sequencing for 12 late onset prostate cancer tumor/control pairs (ICGC) 
    
   
  
    
   
  24 
 
  
    EGAD00001003291 
   
  
    
    This dataset represents RNA-sequencing data from 278 primary colon cancers obtained from fresh-frozen tumor sections. RNA-sequencing was performed using TruSeq library preparation and samples were sequenced on Illumina NextSeq and HiSeq. The data are available as Illumina NextSeq and HiSeq fastq files (_R1.fastq and _R2.fastq for each tumor sample, 556 files in total). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  278 
 
  
    EGAD00001003292 
   
  
    
    WGS sequencing for cases from the ICGC ESAD-UK project
Tumours 50x Normals 30x 
HiSeq X
BAM files
These samples are all available in ICGC release 24 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001003293 
   
  
    
    RNA-Seq and WXS from 6 glioblastoma patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001003294 
   
  
    
    Integrated callset of high coverage Ethiopian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 
    
   
  
    
   
  5 
 
  
    EGAD00001003295 
   
  
    
    Integrated callset of high coverage Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 
    
   
  
    
   
  3 
 
  
    EGAD00001003296 
   
  
    
    Integrated callset of low coverage Ethiopian and Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 
    
   
  
    
   
  220 
 
  
    EGAD00001003297 
   
  
    
   
  
    
   
  9 
 
  
    EGAD00001003298 
   
  
    
    BAM outputs from RSEM (https://deweylab.github.io/RSEM/) analysis of RNASeq sequencing on HiSeq platform of tumour samples from 95 pancreatic adenocarcinoma cases. 
    
   
  
    
   
  96 
 
  
    EGAD00001003301 
   
  
    
    Whole exome sequencing of 10 metastatic biopsies from four TRACERx100 patients (see EGA dataset EGAS00001002247), collected either after relapse or death. The data from these samples are initially published with Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early stage lung cancer evolution. Nature, http://dx.doi.org/10.1038/nature22364 (2017). 
Abstract: 
Earlier detection of relapse following primary surgery for non-small cell lung cancer and the characterization of emerging subclones seeding metastatic sites might offer new therapeutic approaches to limit tumor recurrence. The potential to non-invasively track tumor evolutionary dynamics in ctDNA of early-stage lung cancer is not established. Here we conduct a patient-specific approach to ctDNA profiling in the first 100 lung TRACERx (TRAcking Cancer Evolution through therapy (Rx)) study participants, including one patient co-recruited to the PEACE (Posthumous Evaluation of Advanced Cancer Environment) post-mortem study. We identify independent predictors of ctDNA release in early-stage non-small cell lung cancer and perform tumor volume limit of detection analyses. Through blinded profiling of post-operative plasma, we observe evidence of adjuvant chemotherapy resistance and identify patients destined to experience recurrence of their lung cancer. Finally, we show that phylogenetic ctDNA profiling tracks the subclonal nature of lung cancer relapse and metastases, providing a new approach for ctDNA driven therapeutic studies. 
    
   
  
    
   
  10 
 
  
    EGAD00001003302 
   
  
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  21 
 
  
    EGAD00001003303 
   
  
    
    The evolution of four breast cancers was analyzed using longitudinal samples collected over 2-15 years. Whole-genome sequencing and single-cell RNA-Seq were used to analyze evolution. We have deposited VCF files for SNV, indel, and structural variant calls from WGS data, and a text file showing transcripts per million (TPM) expression for the single-cell RNA-Seq data. 
    
   
  
    
   
  16 
 
  
    EGAD00001003304 
   
  
    
    We collected tumor samples and adjacent normal mucosae from 46 patients with colorectal cancer during surgery from 2014 to 2016 in the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing. 
    
   
  
    
      
      Complete Genomics 
      
    
   
  38 
 
  
    EGAD00001003305 
   
  
    
    Diffuse Intrinsic Pontine Glioma (DIPG) is a fatal brain cancer that arises in the brainstem of children with no effective treatment. To understand what drives DIPGs we integrated whole-genome-sequencing with methylation, expression and copy-number profiling. 
    
   
  
    
      
      AB SOLiD System 
      
      Illumina HiSeq 2500 
      
    
   
  23 
 
  
    EGAD00001003306 
   
  
    
    Exome sequencing data of 15 French Caucasian and 10 African-Caribbean men with prostate cancer. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001003307 
   
  
    
    In this project we will use exome sequencing to identify somatic mutations in lesions from a patient with a germline mutation in the protection of telomeres 1 gene (POT1). This dataset contains all the data available for this study on 2017-04-27. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  40 
 
  
    EGAD00001003308 
   
  
    
    This is an in vitro genome-wide CRISPR/cas9 screen in human glioblastoma stem cells, screening for genes essential for survival of these cells. These cells express cas9 and have been transfected with a guide RNA library causing gene knockouts. We will analyse the sequencing data for depletion of guide RNAs. This dataset contains all the data available for this study on 2017-04-27. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003309 
   
  
    
    The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be sequenced by whole genome along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2017-04-27. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  139 
 
  
    EGAD00001003310 
   
  
    
    There are 66 pairs of LAML cases (Complete Genomics) in this project, which belongs to LAML-CN. The library is constructed by the Complete Genomics protocol. 
    
   
  
    
      
      Complete Genomics 
      
    
   
  66 
 
  
    EGAD00001003311 
   
  
    
    Dataset contains one sample derived from gDNA of human fibroblasts. Files are in FASTQ format and were generated using the Agilent SureSelect Human All Exon 50Mb Kit followed by Next Generation Sequencing on a HiSeq 2000 instrument (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003315 
   
  
    
    This dataset includes the high-throughput sequencing data from a study entitled "Clonal History and Genetic Predictors of Transformation into Small Cell Carcinomas from Lung Adenocarcinomas". Whole-genome sequencing libraries were generated by PCR-free methods, and sequencing was run on HiSeq X or HiSeq 2500 machines. PCR duplicate-marked, indel-realigned, and base-recalibrated BAM files are provided in our dataset. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001003316 
   
  
    
    RNAseq of LC2AD with AD80 or DMSO
Plenker et al., Mechanistic insight into RET kinase inhibitors targeting the DFG-out conformation in RET-rearranged cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003317 
   
  
    
    There are 22 pairs of LAML cases in this project, which belongs to LAML-CN. The library is constructed by the Illumina protocol. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  63 
 
  
    EGAD00001003318 
   
  
    
    RNA-sequencing alignment for SYSCOL colorectal adenoma-carcinoma samples 
    
   
  
    
   
  314 
 
  
    EGAD00001003320 
   
  
    
    Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer
This dataset contains all the data available for this study on 2017-05-04. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  106 
 
  
    EGAD00001003321 
   
  
    
    This dataset contains all the data available for this study on 2017-05-04. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  523 
 
  
    EGAD00001003323 
   
  
    
    Runs that contain data for the sensitivity and specificity experiments for BiSeqS. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  2 
 
  
    EGAD00001003324 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  21 
 
  
    EGAD00001003325 
   
  
    
    Exome from EGAS00001002441 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001003326 
   
  
    
    Azoospermia, characterized by the absence of spermatozoa in the ejaculate, is a common cause of male infertility with a poorly characterized etiology. Exome sequencing analysis of two azoospermic brothers allowed the identification of a homozygous splice mutation in SPINK2, encoding a serine protease inhibitor believed to target acrosin, the main sperm acrosomal protease. In accord with these findings, we observed that homozygous Spink2 KO male mice had azoospermia. Moreover, despite normal fertility, heterozygous male mice had a high rate of morphologically abnormal spermatozoa and reduced sperm motility. Further analysis demonstrated that in the absence of Spink2, protease-induced stress initiates Golgi fragmentation and prevents acrosome biogenesis, leading to spermatid differentiation arrest. We also observed a deleterious effect of acrosin overexpression in HEK cells, an effect that was alleviated by SPINK2 coexpression, confirming its role as an acrosin inhibitor. These results demonstrate that SPINK2 is necessary to neutralize proteases during their cellular transit towards the acrosome and that its deficiency induces a pathological continuum ranging from oligoasthenoteratozoospermia in heterozygotes to azoospermia in homozygotes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001003328 
   
  
    
    Clinical and genetic information of an individual with RVOT-VT and a KCNK2 (TREK1) gene mutation obtained after whole exome sequencing. 
    
   
  
    
   
  1 
 
  
    EGAD00001003329 
   
  
    
    The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry.  In the UK there are large populations with very high first cousin marriage rates of 20-50%.  Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations.  This pilot study based on existing cohort samples from the Born In Bradford study will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed.  The data deposited in the EGA consist of  low coverage whole exome sequencing on these samples.
This dataset contains all the data available for this study on 2017-05-11. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3188 
 
  
    EGAD00001003330 
   
  
    
    The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations.
This dataset contains all the data available for this study on 2017-05-11. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  416 
 
  
    EGAD00001003331 
   
  
    
    Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH). 
This dataset contains all the data available for this study on 2017-05-11. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  78 
 
  
    EGAD00001003332 
   
  
    
    PCR and MiSeq validation for early embryonic substitution candidates from 400 Breast cancer patients
This dataset contains all the data available for this study on 2017-05-11. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001003334 
   
  
    
    Targeted exome sequencing of patient-derived xenografts from primary colorectal tumours and liver metastases. 
This dataset contains all the data available for this study on 2017-05-11. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  573 
 
  
    EGAD00001003335 
   
  
    
    As a resource for assessing exon CNV calling methods in targeted NGS data, we present the ICR96 exon CNV validation series. The dataset includes high-quality sequencing data from a targeted NGS assay (the TruSight Cancer Panel) together with Multiplex Ligation-dependent Probe Amplification (MLPA) results for 96 independent samples. 66 samples contain at least one validated exon CNV and 30 samples have validated negative results for exon CNVs in 26 genes. The dataset includes 46 exon CNVs in BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, EPCAM and PTEN, giving excellent representation of the cancer predisposition genes most frequently tested in clinical practice. Moreover, the validated exon CNVs include 25 single-exon CNVs, the most difficult type of exon CNV to detect. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  96 
 
  
    EGAD00001003336 
   
  
    
    BAM outputs from RSEM (https://deweylab.github.io/RSEM/) analysis of RNASeq sequencing on HiSeq platform of tumour samples from 29 pancreatic neuroendocrine cases. 
    
   
  
    
   
  29 
 
  
    EGAD00001003337 
   
  
    
    T cells isolated from peripheral blood, tumors and adjacent normal tissues from six hepatocellular carcinoma patients. The SmartSeq2 and Tang2009 protocols were used to amplify RNA from single T cells. High sequencing depth enables simultaneous expression profiling and TCR assembly. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  5063 
 
  
    EGAD00001003338 
   
  
    
    This is a test dataset derived from public data of the 1000 Genomes Project. Its purpose is not to allow for any inference about cohort data or results, but to aid bioinformaticians in the technical development and testing of tools, as well as data consumers in learning how to access information.  
This dataset consists of 2508 samples from the 1000 Genomes Project (https://www.nature.com/articles/nature15393). Samples' (e.g. NA18534) data can be accessed through the IGSR portal (e.g. https://www.internationalgenome.org/data-portal/sample/NA18534) or their corresponding folder at the 1000 Genomes' FTP site (e.g. http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/data/CHB/NA18534/exome_alignment/). 
This dataset encompasses several different types of data: Variant Call Format (VCF, or its binary counterpart BCF) files, both joint (e.g. ALL_chr22_20130502_2504Individuals.vcf.gz) and split (e.g. HG01775.chrY.vcf.gz); exome sequencing CRAM files (e.g. NA18534.GRCh38DH.exome.cram); whole genome sequencing CRAM/BAM files (e.g. NA19239.cram). Additionally, there are multiple files that were sliced to create shorter files, which allows for a quick download, formatted as "{FILE-INFO}__{NUMBER-OF-READS}r__{CHR}.{START-COORDINATE}-{END-COORDINATE}.{FILETYPE}" (e.g. "HG01500.GRCh38DH__90r__3.10000-10500__4.10000-10500.cram"). These files can be downloaded directly through the EGA-download-client PyEGA3 (https://github.com/EGA-archive/ega-download-client). 
    
   
  
    
      
      AB SOLiD 4 System 
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001003339 
   
  
    
    Whole-exome library preparation will be performed on genomic DNA derived from radiotherapy-induced sarcoma samples and matched normal DNA from the same patients. Next-generation sequencing will be performed on the resulting libraries, and the reads will be mapped to build 37 of the human reference genome to facilitate the identification of mutations.
This dataset contains all the data available for this study on 2017-05-17. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003340 
   
  
    
    DDD DATAFREEZE 2016-10-03: 7831 trios - VCF files 
    
   
  
    
   
  - 
 
  
    EGAD00001003341 
   
  
    
    Sequence data from fungal infection isolated from neural tissue in ALS patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  34 
 
  
    EGAD00001003342 
   
  
    
    Identification of fusion transcripts by RNA-sequencing and whole-genome sequencing of a breast cancer patient sample (METABRIC ID MB-0152) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003344 
   
  
    
    Transcriptome profiling of 25 prostate tumor samples by RNA-Seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001003345 
   
  
    
    Exome sequence data for 57 HIV elite long-term non-progressors and rapid progressors. Complete dataset of improved BAMs mapped to hs37d5, including phenotype information. 
    
   
  
    
   
  57 
 
  
    EGAD00001003347 
   
  
    
    This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-05-24. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  75 
 
  
    EGAD00001003348 
   
  
    
    Differentiating the two forms of multifocal hepatocellular carcinoma (HCC), multicentric disease vs. intrahepatic metastases, for which management and prognosis vary substantially, remains problematic. We aim to stratify multifocal HCC and identify novel diagnostic and prognostic biomarkers by performing whole genome and transcriptome sequencing, as part of a multi-omics strategy. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001003349 
   
  
    
    ChIP-seq data (H3K4Me1, H3K4Me3, H3K27Ac histone modifications) in experimental triplicates on multiple myeloma cell line KMS11 and plasma cell leukaemia cell lines L363 and JJN3. ChIP reactions were performed on a Diagenode SX-8G IP-Star Compact using Diagenode automated Ideal Kit. ChIP libraries were generated using HTP Illumina library preparation kit, and sequenced using Illumina HiSeq 2000 with 100 bp single-ended reads. ChIP-seq files are in BED format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001003350 
   
  
    
    DDD DATAFREEZE 2016-10-03: 7831 trios - phenotypic and family descriptions 
    
   
  
    
   
  - 
 
  
    EGAD00001003351 
   
  
    
    In order to comprehensively investigate the genetic relationship between PTC tumors and benign nodules, we collected a total of 127 fresh-frozen biopsy samples from 28 patients with concurrent thyroid benign nodule and PTC (n=20) or simple benign nodule (n=8). We carried out whole-exome sequencing on all 127 biopsy samples and RNA-sequencing on a total of 40 samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  127 
 
  
    EGAD00001003353 
   
  
    
    BAM outputs from STAR (https://github.com/alexdobin/STAR) analysis of RNASeq sequencing on HiSeq platform of 56 tumour samples from 46 melanoma cases.
Gene model = Ensembl version 70 
    
   
  
    
   
  - 
 
  
    EGAD00001003354 
   
  
    
    From 9 patients undergoing hip joint replacement surgery for osteoarthritis, we collected 3 cartilage samples each: a low-grade sample (no obvious evidence of damage or fibrillation); a high-grade sample (damaged and fibrillated cartilage); an osteophytic sample (overlaid bony protrusions mainly around the margins of the articular surface). Multiplexed libraries were sequenced on Illumina HiSeq 2000 (75bp paired-end read length) and a cram file was produced for each sample.
This dataset contains all the data available for this study on 2017-06-09. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001003355 
   
  
    
    From 17 patients undergoing knee joint replacement surgery for osteoarthritis, we collected 4 samples each: intact cartilage, degraded cartilage, synovium, and meniscus. We also collected blood for DNA analysis. Multiplexed libraries were sequenced on Illumina HiSeq 2000 (75bp paired-end read length) and a cram file was produced for each sample.
This dataset contains all the data available for this study on 2017-06-09. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  72 
 
  
    EGAD00001003356 
   
  
    
    To date, there are two hypotheses about the pathogenesis of intravenous leiomyomatosis (IVL) and its relationship to uterine myoma. One theory suggests that IVL arises from smooth muscle cells in the vessel wall. The other indicates that IVL derives from the uterine myometrium. However, owing to technological limitations, few studies have explored the underlying relationship in depth. In this study, we conducted transcriptome sequencing and bioinformatic analysis to explore the molecular relationship between IVL and uterine myoma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001003357 
   
  
    
    Aligned, merged and deduplicated BAM files from HiSeq whole exome sequencing of 106 samples: matched tumour-normal pairs from 53 melanoma patients. 
    
   
  
    
   
  - 
 
  
    EGAD00001003358 
   
  
    
    The dataset consists of samples from papillary thyroid cancer patients. A total of 181 DNA samples from blood/normal and cancer tissue were subjected to whole exome sequencing using Illumina. The fastq files generated were aligned to the reference genome ‘hg19’, duplicates were marked, and realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset. 
    
   
  
    
   
  189 
 
  
    EGAD00001003359 
   
  
    
    In this study, we present the results of a custom “pan-cardiomyopathy panel” in a molecular screening of 38 unrelated patients, 16 affected by DCM, 14 by HCM, and 8 by ARVC. 
The panel was designed using the Design Studio Tool (Illumina, San Diego, CA, USA). Coding regions and intron–exon boundaries of 115 genes, known to be associated with DCM, HCM, and ARVC as well as channelopathies, were selected for targeted gene enrichment. For genes with multiple transcripts, all exons included in transcripts expressed in cardiac muscle were considered in the gene panel design. 
Total DNA was extracted from peripheral blood samples using the Wizard Genomic DNA Purification Kit (Promega, Mannheim, Germany) according to the manufacturer’s instructions, quantified, and qualitatively checked using NanoDrop 2000c (Thermo Fisher Scientific, Waltham, MA, USA).
Custom targeted gene enrichment and DNA library preparation were performed using the Nextera Capture Custom Enrichment kit (Illumina) according to the manufacturer’s instructions. Targeted regions were sequenced using the Illumina MiSeq platform, generating approximately two million 150-bp paired-end reads for each sample (Q30 ≥90%). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  38 
 
  
    EGAD00001003360 
   
  
    
    BAM files containing mitochondrial alignments, extracted from CPCGene Whole Genome Alignments 
    
   
  
    
   
  432 
 
  
    EGAD00001003361 
   
  
    
    VCF files containing mitochondrial variant calls using MToolbox 
    
   
  
    
   
  432 
 
  
    EGAD00001003362 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (EPO2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  49 
 
  
    EGAD00001003363 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (EPO2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001003364 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001003365 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003366 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003367 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003368 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003369 
   
  
    
    RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001003370 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Blood EDTA (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003371 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003372 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001003373 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003374 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001003375 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003376 
   
  
    
    Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001003377 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Blood EDTA (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003378 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003379 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001003380 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003381 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003382 
   
  
    
   
  
    
      
      MinION 
      
    
   
  26 
 
  
    EGAD00001003383 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003384 
   
  
    
    Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001003385 
   
  
    
    Whole-exome sequencing on AB 5500xl Genetic Analyzer of Blood EDTA (OT2_cohort) 
    
   
  
    
      
      AB 5500 Genetic Analyzer 
      
    
   
  1 
 
  
    EGAD00001003386 
   
  
    
    Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer primary tumor sample (OT2_cohort) 
    
   
  
    
      
      AB 5500 Genetic Analyzer 
      
    
   
  1 
 
  
    EGAD00001003387 
   
  
    
   
  
    
      
      MinION 
      
    
   
  19 
 
  
    EGAD00001003388 
   
  
    
    Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 366 samples: matched tumour-normal pairs from 183 melanoma cases comprising 48 primary melanomas, 15 cell lines, and 120 metastases. Sequencing was performed on the Illumina HiSeq 2000 and Xten platforms at Australian and Korean sequencing centres. Data was aligned to the human genome (GRCh37) using BWA-MEM. 
    
   
  
    
   
  - 
 
  
    EGAD00001003389 
   
  
    
    WGS and WXS files for Dyer ATRX study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001003390 
   
  
    
    DCM-cases (149 human DCM samples)
Human heart biopsies from 149 patients with dilated cardiomyopathy (DCM) were subjected to RNA sequencing in order to assess transcriptome variation. We used Illumina HiSeq2000 technology. Each sample-dataset contains the output from tophat-1.4.1 (one *.bam file with the aligned reads and two *.fq files, one with the unaligned forward reads and one with the unaligned reverse reads). We reveal extensive differences of gene expression and splicing between dilated cardiomyopathy patients and controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  149 
 
  
    EGAD00001003391 
   
  
    
    DCM-controls (113 human non-DCM samples)
Human heart biopsies from 113 non-diseased controls were subjected to RNA sequencing in order to assess transcriptome variation. We used Illumina HiSeq2000 technology. Each sample-dataset contains the output from tophat-1.4.1 (one *.bam file with the aligned reads and two *.fq files, one with the unaligned forward reads and one with the unaligned reverse reads). We reveal extensive differences of gene expression and splicing between dilated cardiomyopathy patients and controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  113 
 
  
    EGAD00001003392 
   
  
    
    High-coverage WGS of DNA samples from 51 pairs of GCs was performed on the Illumina HiSeq X Ten System. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  102 
 
  
    EGAD00001003393 
   
  
    
    This dataset contains bam files for RNA-seq experiments for 6 neuroblastoma PDXs (Patient Derived Xenograft) and 3 pairs of neuroblastoma tumors at diagnosis and at relapse. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001003394 
   
  
    
    This dataset contains bam files for ChIP-seq experiments for 6 neuroblastoma PDXs (Patient Derived Xenograft). It includes the bam files for the H3K27ac mark as well as the bam files of the corresponding input DNA for each sample. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001003395 
   
  
    
    This dataset consists of the exome sequencing data for 30 tumour and germline DNA pairs derived from relapsed/refractory DLBCL. 
    
   
  
    
   
  60 
 
  
    EGAD00001003396 
   
  
    
    WGS minibam files for SJLIFE 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3036 
 
  
    EGAD00001003397 
   
  
    
    Twenty samples were collected in pairs, i.e., HCC tissue and adjacent non-cancerous tissue. The collected tissue samples were stored in liquid nitrogen. First, 50 mg of tissue was lysed in TRIzol (Invitrogen) to extract RNA following the manufacturer’s instructions. Next, ribosomal RNA was depleted using a RiboZero Gold kit (Epicentre Bio-technologies). RNA integrity was assessed with an Agilent Bioanalyzer 2100. An RNA-Seq library was generated with the rRNA-depleted samples using an Illumina standard RNA Sample Prep kit according to the manufacturer’s instructions. The library was subsequently sequenced on an Illumina HiSeq2500 as 125-bp paired-ends with approximately 300-bp size selection. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001003399 
   
  
    
    RNAseq dataset of 34 samples (6 normals, 7 stroma-enriched, 21 malignant cells-enriched) from patients with resected pancreatic ductal carcinoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  34 
 
  
    EGAD00001003400 
   
  
    
    We present targeted NGS panel data from 170 samples that were processed using the TruSight™ Cancer (TSC) panel (Illumina, San Diego, CA, USA), which targets 94 genes and 284 SNPs associated with a predisposition towards cancer. The samples are enriched for CNVs in the genes of interest. All CNVs have previously been assessed with MLPA and can therefore be considered as confirmed. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  170 
 
  
    EGAD00001003404 
   
  
    
    RRBS sequencing of 7 tumour regions and a normal sample from a single TRACERx patient. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003405 
   
  
    
    High-coverage WGS of DNA samples from 23 pairs of GCs was performed on the Illumina HiSeq X Ten System. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001003406 
   
  
    
    DDD DATAFREEZE 2016-10-03: 7831 trios - exome sequence CRAM files 
    
   
  
    
   
  - 
 
  
    EGAD00001003407 
   
  
    
    Whole-genome sequencing and phasing of admixed Aboriginal Australian genomes and Papua New Guinean genomes using 10x Genomics Chromium technology.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-06-27. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001003408 
   
  
    
    Chip-Seq sequencing data of Atypical teratoid/rhabdoid tumors (ATRT) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001003409 
   
  
    
    Amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are part of a clinical, pathological and genetic continuum. The purpose of the present study was to assess the mutation burden present in known ALS and/or FTD disease-causing genes in 54 patients (16 with available postmortem neuropathological diagnosis) with concurrent ALS and FTD (ALS/FTD) not carrying the C9orf72 hexanucleotide repeat expansion, the most important genetic cause in both diseases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001003410 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: PACA-AU. 
    
   
  
    
   
  81 
 
  
    EGAD00001003411 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: PACA-AU. 
    
   
  
    
   
  81 
 
  
    EGAD00001003412 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  152 
 
  
    EGAD00001003413 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  145 
 
  
    EGAD00001003414 
   
  
    
    June 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  40 
 
  
    EGAD00001003415 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: OV-AU. 
    
   
  
    
   
  93 
 
  
    EGAD00001003416 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: OV-AU. 
    
   
  
    
   
  93 
 
  
    EGAD00001003417 
   
  
    
    This dataset includes genomic information of 19 adult cerebellar glioblastomas (C-GBMs). Whole-exome sequencing data are available for 9 C-GBMs and their 8 corresponding matched blood samples, and glioma-specific targeted-DNA sequencing (GliomaSCAN) data from an additional 10 C-GBMs are also available. Among them, whole-transcriptome sequencing was conducted for 6 C-GBM tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001003419 
   
  
    
    Whole exome sequencing from primary human JMML samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001003421 
   
  
    
    Sequence data of 28 samples (19 chronic lymphocytic leukemia, 9 control)
Including RNA-Seq and ChIP-Seq of the following histone modifications: H3, H3K4me1, H3K4me3, H3K9ac, H3K9me3, H3K27ac, H3K27me3, H3K36me3
For project details, see: http://www.cancerepisys.org/ 
    
   
  
    
   
  28 
 
  
    EGAD00001003422 
   
  
    
    WXS from barcoded cells that are FACS sorted from GBM-719 xenografts, and the germline reference from patient GBM-719. The 4 xenografts are named according to passage (secondary or tertiary) and treatment (vehicle control or temozolomide). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001003423 
   
  
    
    Pulmonary arterial hypertension (PAH) is a rare disorder with a poor prognosis. Deleterious variation within genes encoding components of the transforming growth factor-β pathway underlie the majority of heritable forms of PAH. Identifying the missing genetic contribution is challenging, even with genes of large effect size, since it likely involves mutations in genes confined to small numbers of PAH cases. In this study, we performed whole genome sequencing, comparing 1038 PAH index cases to 6385 subjects with other rare diseases. Rare variant analysis identified mutations in novel causal genes, namely ATP13A3, AQP1 and SOX17, and provided independent validation of a critical role for GDF2 in PAH. We detected mutations predicted to be disruptive of function in most, but not all, previously reported PAH genes. Taken together these findings provide new insights into the molecular basis of PAH, and support a central role for endothelial dysregulation in disease pathogenesis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  149 
 
  
    EGAD00001003425 
   
  
    
    An EGFR-mutant NSCLC cell line that is sensitive to AZD9291 inhibition was mutagenised with the chemical mutagen ENU and then drug-selected using AZD9291. Single-cell-derived colonies were then manually picked and expanded in drug. Resistance was confirmed in a 14-day assay and DNA was collected. These samples then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors hypothesised from currently available literature.
This dataset contains all the data available for this study on 2017-07-05. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  177 
 
  
    EGAD00001003426 
   
  
    
    High depth whole genome sequencing from GemCode (10x Genomics) DNA libraries containing long range linkage information for one Baganda trio and one Baganda child (parent already sequenced at high depth).
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-07-05. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001003427 
   
  
    
    Genome-wide profiling of DNA methylation levels by RRBS in 349 samples, derived from 112 glioblastoma (IDH wildtype) patients, 13 IDH mutated brain tumor patients, and 5 normal brain controls. For each patient, samples from at least two and up to six tumor resections are available. For 6 patients, multiple regions of each tumor were sampled. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 3000 
      
      Illumina HiSeq 4000 
      
    
   
  349 
 
  
    EGAD00001003428 
   
  
    
    RNAseq data from the study: "Widespread DNA hypomethylation and differential gene expression in Turner syndrome". 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  37 
 
  
    EGAD00001003429 
   
  
    
    RNA analysis of two patients (11 and 15) with WGS done on Illumina HiSeq 2000. For research purposes and authorised users only. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001003430 
   
  
    
    RNA analysis of six patients (34, 35, 36, 37, 38 and 39) with WGS done on Illumina HiSeq 2500. For research purposes and authorised users only. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001003431 
   
  
    
    High-coverage WGS of DNA samples from 45 pairs of GCs was performed on the Illumina HiSeq X Ten System. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  88 
 
  
    EGAD00001003432 
   
  
    
    ChIP-Seq data for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001003433 
   
  
    
    RNA-Seq data for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  98 
 
  
    EGAD00001003434 
   
  
    
    Whole Exome Sequencing for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  149 
 
  
    EGAD00001003435 
   
  
    
    Whole Genome Sequencing for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  150 
 
  
    EGAD00001003436 
   
  
    
    Seven files of patients 3, 21, 29, 30, 31, 32 and 33 with WGS done on Illumina MiSeq with high coverage. For research purposes and authorised users only. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  7 
 
  
    EGAD00001003437 
   
  
    
    Fourteen files of patients 1, 2, 4, 6, 7, 8, 9, 12, 14, 16, 17, 18, 19 and 27 with WGS done on Illumina MiSeq with low coverage. For research purposes and authorised users only. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  14 
 
  
    EGAD00001003438 
   
  
    
    Three files of patients 20, 23 and 25 with WGS done on Illumina HiSeq 2000. For research purposes and authorised users only. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003439 
   
  
    
    Three files of patients 10, 11 and 13 with WGS done on Illumina HiSeq X Ten. For research purposes and authorised users only. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001003440 
   
  
    
    One file of patient 16 with WGS done on Illumina HiSeq X Ten. For research purposes and authorised users only. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003441 
   
  
    
    A total of 584 tumor specimens and/or patient-derived cells across 14 cancer types were subjected to whole-exome/targeted-exome and/or whole-transcriptome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  584 
 
  
    EGAD00001003443 
   
  
    
    Massively parallel nanowell-based single-cell gene expression profiling 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD00001003444 
   
  
    
    This dataset contains both standard RNA-Seq and small RNA-Seq of TSC-related cortical tubers and age-matched cortical controls. For the standard RNA-Seq, paired-end sequencing was carried out. Each sample was split across multiple lanes. For the files available here, the multiple lanes have been merged together, resulting in one forward and one reverse .fastq file for each sample. Small RNA-Seq was carried out on the same samples that underwent standard RNA-Seq. Again, paired-end sequencing was carried out. The files here are raw and will need to undergo quality control and trimming. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  44 
 
  
    EGAD00001003445 
   
  
    
    Clear cell renal cancer is characterized by near-universal loss of the short arm of chromosome 3 (3p). This event arises through unknown mechanisms, but critically results in the loss of several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cancer (ccRCC) recruited into the Renal TRACERx study. We find novel hotspots of point mutations in the 5'-UTR of TERT, targeting a MYC-MAX repressor, that result in telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. Using molecular clocks, we estimate this occurs in childhood or adolescence, generally preceding emergence of the most recent common ancestor by years to decades. Similar genomic changes are seen in inherited kidney cancers. Modeling differences in age-incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Targeting essential genes in deleted regions of chromosome 3p could represent a potential preventative strategy for renal cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  164 
 
  
    EGAD00001003446 
   
  
    
    This dataset includes deep coverage (>60x) whole exomes of 15 human embryonic stem cell lines. Genomic DNA was purified and fragmented using the Illumina Nextera system for library preparation and sequenced using 150bp paired-end reads. Sequencing reads were aligned to the hg19 reference genome using the BWA MEM alignment program. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  15 
 
  
    EGAD00001003448 
   
  
    
    Strand-specific RNA-seq data from 19 gastric tumors and their adjacent normal tissues, plus 16 gastric cancer cell lines, one normal gastric cell line, and 3 normal stomach RNAs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  58 
 
  
    EGAD00001003452 
   
  
    
    The samples include paired tumor and normal tissues from 205 patients (201 for normal and primary tumor tissues; 4 for normal, primary tumor and liver metastatic tissues). 
High-coverage WES or whole genome sequencing of DNA samples was performed on the Illumina HiSeq 2000 system. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001003453 
   
  
    
    16S sequencing of stool samples of LifeLines-DEEP, domain V4 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1010 
 
  
    EGAD00001003454 
   
  
    
    Validation of HLA variation in 8 individuals from the GenomeDenmark Phase 2 study. Validation is performed by Sanger sequencing of selected amplicons (5-10 amplicons per sample). 
    
   
  
    
      
      AB 3730xL Genetic Analyzer 
      
    
   
  8 
 
  
    EGAD00001003455 
   
  
    
    The MHC VCF call set was generated using a modified AsmVar and BayesTyper pipeline. In contrast to the original pipeline, where variant calling is performed using alignment of collapsed assemblies to a reference genome, the MHC call set was produced using alignment of phased MHC haplotypes. Two iterations of BayesTyper were run: a first iteration for each haplotype separately and a second iteration performing joint variant calling on all haplotypes. The sample IDs for the fathers and mothers are TrioID-01 and TrioID-02, respectively, and the IDs for the children are TrioID-0x, where x is a number between 3 and 7. 
    
   
  
    
   
  25 
 
  
    EGAD00001003456 
   
  
    
    There are 5 WGS and 35 WES sample pairs from the First Affiliated Hospital of Kunming Medical University, which belong to the ICGC project COCA-CN. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  80 
 
  
    EGAD00001003457 
   
  
    
    Placental biopsies (n = 64 female placentas, n = 67 male placentas) were selected from healthy pregnancies from the POPs cohort. These patients had no evidence of hypertension at booking and during pregnancy, and did not experience pre-eclampsia, Hemolysis, Elevated Liver enzymes, and Low Platelets (HELLP) syndrome, gestational diabetes, diabetes mellitus type I or type II, or other obstetric complications. They delivered live babies with a birth weight percentile in the normal range (20-80th percentile), with no evidence of slowing in fetal growth trajectory. Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNAlater (Applied Biosystems) and stored at -80°C. Total placental RNA was extracted using the mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue was homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4°C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using the DNA-free DNA Removal Kit (Ambion), aliquoted, and stored at -80°C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer.
Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN ≥7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. The kit contains 96 uniquely indexed adapter combinations in order to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries have been pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  147 
 
  
    EGAD00001003458 
   
  
    
    FASTQ data on the genomic heterogeneity of multiple synchronous lung cancers (MSLCs). Whole-genome sequencing (WGS) was performed on 3 tumour samples, one regional lymph node metastasis sample and a peripheral blood sample from the same patient with MSLCs. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001003459 
   
  
    
    Single cell transcriptomics of PBMCs of 47 donors from the Lifelines Deep cohort (general population, Northern part of the Netherlands). Cells of five or six different donors were pooled together in one sample pool, resulting in eight different sample pools. In total, 28,855 cells were captured and their transcriptomes were sequenced to an average depth of 74k. Genotype data was available for each donor, which allowed us to use the Demuxlet method that uses variable SNPs between the pooled individuals to determine which cell belongs to which individual. Since genotype information is lacking for 2 individuals, the transcriptomes of only 45 individuals could be retrieved. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001003460 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  13 
 
  
    EGAD00001003461 
   
  
    
    H3K27ac ChIP-seq and input genome sequencing were performed in 19 primary prostate tumours classified as intermediate risk. Sequencing of ChIP DNA was performed on an Illumina HiSeq 2000 as either single-end 50 bp reads (for 7 samples) or paired-end 100 bp reads (for 12 samples). Input DNA from all samples was sequenced using single-end 50 bp reads. The files provided are in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001003462 
   
  
    
    Placental biopsies (n = 64 female placentas, n = 67 male placentas) were selected from healthy pregnancies from the POPs cohort. A quality control process was also applied to the RNA-Seq datasets: reads were trimmed with Trim Galore!, which uses cutadapt internally, and were mapped to the same version of the human genome reference (hg19). TopHat2, a splice-aware mapper built on top of the Bowtie2 short-read aligner, was used for mapping, with a so-called two-pass (or two-scan) alignment protocol applied to rescue unmapped reads from the initial mapping step. In the second mapping, previously unmapped reads were re-aligned to the exon-intron junctions that were detected by TopHat2 in the first mapping and combined across all 131 placenta samples. The initial and second mapped reads were merged with samtools. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  147 
 
  
    EGAD00001003463 
   
  
    
    These are the vcf files of exome sequencing of the two probands who were found to harbor mutations in KLB.
Sample EGAN00001564799 is proband 1; sample EGAN00001564800 is proband 11 in the KLB paper. 
Exome capture was performed using the SureSelect All Exon capture (Agilent Technologies, Santa Clara, CA USA) and sequenced on the HiSeq2500 (Illumina, San Diego CA USA). 
    
   
  
    
   
  2 
 
  
    EGAD00001003464 
   
  
    
    For RNA-Seq, total RNA was isolated following LDC67 or JQ1 treatment. 3' RNA-seq libraries were prepared with the QuantSeq FWD 3' mRNA-Seq Kit (Lexogen, Austria) and sequenced on an Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003466 
   
  
    
    This dataset contains exome sequencing data for 21 tumor-normal pairs from HCC patients at Chang Gung Memorial Hospital, Taiwan. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001003467 
   
  
    
    This dataset contains exome sequencing data for 77 tumor-normal pairs from HCC patients at National Taiwan University, Taiwan. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  154 
 
  
    EGAD00001003468 
   
  
    
    A CKD23_C_Mesan_WGBS paired end data for Mesangial cells(kidney) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003469 
   
  
    
    A CKD24_C_Podo_WGBS paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003470 
   
  
    
    A CKD25_C_Podo_WGBS paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003471 
   
  
    
    A CKD27_C_Mesan_WGBS paired end data for Mesangial cells(kidney) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003472 
   
  
    
    A DB31_N_Alpha_WGBS paired end data for alpha cells(PSA-NCAM(-), pancreas) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003473 
   
  
    
    A IPS01_N_Fibroblast_WGBS paired end data for iPSC(Oct4) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003474 
   
  
    
    A IPS02_N_NPC_WGBS paired end data for Neural progenitor cells(Nestin) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003475 
   
  
    
    A IPS03_N_ENeuron_WGBS paired end data for Early neuron cells(Tuj1) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003476 
   
  
    
    A IPS04_X_Fibroblast_WGBS paired end data for iPSC(Oct4) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003477 
   
  
    
    A IPS05_X_NPC_WGBS paired end data for Neural progenitor cells(Nestin) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003478 
   
  
    
    A IPS06_X_ENeuron_WGBS paired end data for Early neuron cells(Tuj1) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003479 
   
  
    
    A OB56_N_PreA_WGBS paired end data for Preadipocytes(fat) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003480 
   
  
    
    A OB57_D_PreA_WGBS paired end data for Preadipocyte(fat) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003481 
   
  
    
    A CKD23_C_Mesan_mRNA-Seq paired end data for Mesangial cells(kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003482 
   
  
    
    A CKD24_C_Podo_mRNA-Seq paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003483 
   
  
    
    A CKD25_C_Podo_mRNA-Seq paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003484 
   
  
    
    A CKD27_C_Mesan_mRNA-Seq paired end data for Mesangial cells(kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003485 
   
  
    
    A DB31_N_Alpha_mRNA-Seq paired end data for alpha cells(PSA-NCAM(-), pancreas) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003486 
   
  
    
    A OB56_N_PreA_mRNA-Seq paired end data for Preadipocytes(fat) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003487 
   
  
    
    A OB57_D_PreA_mRNA-Seq paired end data for Preadipocyte(fat) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003488 
   
  
    
    A IPS01_N_Fibroblast_mRNA-Seq paired end data for iPSC(Oct4) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003489 
   
  
    
    A IPS02_N_NPC_mRNA-Seq paired end data for Neural progenitor cells(Nestin) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003490 
   
  
    
    A IPS03_N_ENeuron_mRNA-Seq paired end data for Early neuron cells(Tuj1) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003491 
   
  
    
    A IPS04_X_Fibroblast_mRNA-Seq paired end data for iPSC(Oct4) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003492 
   
  
    
    A IPS05_X_NPC_mRNA-Seq paired end data for Neural progenitor cells(Nestin) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003493 
   
  
    
    A IPS06_X_ENeuron_mRNA-Seq paired end data for Early neuron cells(Tuj1) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003494 
   
  
    
    A DB31_N_Alpha_smRNA-Seq single end data for alpha cells(PSA-NCAM(-), pancreas) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003495 
   
  
    
    A OB56_N_PreA_smRNA-Seq single end data for Preadipocytes(fat) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003496 
   
  
    
    A OB57_D_PreA_smRNA-Seq single end data for Preadipocyte(fat) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003497 
   
  
    
    A CKD23_C_Mesan_smRNA-Seq single end data for Mesangial cells(kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003498 
   
  
    
    A CKD24_C_Podo_smRNA-Seq single end data for Podocytes(CD90(-) Podocalyxin(+), kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003499 
   
  
    
    A CKD25_C_Podo_smRNA-Seq single end data for Podocytes(CD90(-) Podocalyxin(+), kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003500 
   
  
    
    A CKD27_C_Mesan_smRNA-Seq single end data for Mesangial cells(kidney) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003501 
   
  
    
    A IPS01_N_Fibroblast_smRNA-Seq single end data for iPSC(Oct4) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003502 
   
  
    
    A IPS02_N_NPC_smRNA-Seq single end data for Neural progenitor cells(Nestin) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003503 
   
  
    
    A IPS03_N_ENeuron_smRNA-Seq single end data for Early neuron cells(Tuj1) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003504 
   
  
    
    A IPS04_X_Fibroblast_smRNA-Seq single end data for iPSC(Oct4) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003505 
   
  
    
    A IPS05_X_NPC_smRNA-Seq single end data for Neural progenitor cells(Nestin) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003506 
   
  
    
    A IPS06_X_ENeuron_smRNA-Seq single end data for Early neuron cells(Tuj1) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003507 
   
  
    
    All the samples were obtained from the Pregnancy Outcome Prediction study, a prospective cohort study of nulliparous women attending the Rosie Hospital, Cambridge (UK) for their dating ultrasound scan between January 14, 2008, and July 31, 2012. Ethical approval for the study was given by the Cambridgeshire 2 Research Ethics Committee (reference number 07/H0308/163) and all participants provided written informed consent. Cases of preeclampsia (PET) were defined on the basis of the 2013 ACOG criteria and cases of small for gestational age (SGA) infants were confined to severe SGA, i.e. a customized birth weight <5th percentile.
Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNAlater (Applied Biosystems) and stored at -80°C. Total placental RNA was extracted using the mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue was homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4°C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using the DNA-free DNA Removal Kit (Ambion), aliquoted, and stored at -80°C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer.
Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN ≥7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. The kit contains 96 uniquely indexed adapter combinations in order to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries have been pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  52 
 
  
    EGAD00001003508 
   
  
    
    All the samples were obtained from the Pregnancy Outcome Prediction study, a prospective cohort study of nulliparous women attending the Rosie Hospital, Cambridge (UK) for their dating ultrasound scan between January 14, 2008, and July 31, 2012. Ethical approval for the study was given by the Cambridgeshire 2 Research Ethics Committee (reference number 07/H0308/163) and all participants provided written informed consent. Cases of preeclampsia (PET) were defined on the basis of the 2013 ACOG criteria and cases of small for gestational age (SGA) infants were confined to severe SGA, i.e. a customized birth weight <5th percentile.
Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNAlater (Applied Biosystems) and stored at -80°C. Total placental RNA was extracted using the mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue was homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4°C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using the DNA-free DNA Removal Kit (Ambion), aliquoted, and stored at -80°C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer.
Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN ≥7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. The kit contains 96 uniquely indexed adapter combinations in order to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries have been pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  91 
 
  
    EGAD00001003509 
   
  
    
    Whole Exome Sequencing reads consisting of BAM paired end reads from Follicular Lymphoma samples. 
    
   
  
    
   
  11 
 
  
    EGAD00001003510 
   
  
    
    BAM files with sequencing reads derived from Illumina whole genome sequencing of two DNA samples from lymphoblastoid cell lines from two patients with congenital disease.
Whole genome sequencing was performed using Illumina HiSeq X Ten and samples were prepared using TruSeq library prep. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001003511 
   
  
    
    BAM files with sequencing reads derived from Oxford Nanopore MinION whole genome sequencing of two DNA samples from lymphoblastoid cell lines from two patients with congenital disease.
Samples were prepared using 1D and 2D library preps. 
    
   
  
    
      
      MinION 
      
    
   
  2 
 
  
    EGAD00001003512 
   
  
    
    This dataset includes bam files from 58 samples. These bam files include all read pairs where at least one of the reads aligns within 1kb of the HTT repeat expansion. These samples were sequenced using 2x150bp reads on an Illumina HiSeqX sequencer and aligned using bwa. Twelve of the samples used TruSeq Nano library preparation and 46 samples used TruSeq DNA PCR-free sample preparation. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  58 
 
  
    EGAD00001003513 
   
  
    
    This dataset includes bam files from 3,001 samples. These bam files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, these bam files also contain reads that are aligned to any of 29 pre-determined off target locations where the aligners are known to mis-align reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  3001 
 
  
    EGAD00001003514 
   
  
    
    HipSci - Healthy Normals - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  123 
 
  
    EGAD00001003515 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001003516 
   
  
    
    HipSci - Monogenic Diabetes - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003517 
   
  
    
    HipSci - Alport Syndrome - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001003518 
   
  
    
    HipSci - Battens Disease - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001003519 
   
  
    
    HipSci - Bleeding and Platelet Disorders - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001003520 
   
  
    
    HipSci - Congenital Hyperinsulinia - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001003521 
   
  
    
    HipSci - Hereditary Cerebellar Ataxias - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001003522 
   
  
    
    HipSci - Hereditary Spastic Paraplegia - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003523 
   
  
    
    HipSci - Hypertrophic Cardiomyopathy - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001003524 
   
  
    
    HipSci - Kabuki Syndrome - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001003525 
   
  
    
    HipSci - Macular Dystrophy - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001003526 
   
  
    
    HipSci - Primary Immune Deficiency - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003527 
   
  
    
    HipSci - Retinitis Pigmentosa - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001003528 
   
  
    
    HipSci - Usher Syndrome - Exome Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001003529 
   
  
    
    HipSci - Healthy Normals - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  118 
 
  
    EGAD00001003530 
   
  
    
    HipSci - Monogenic Diabetes - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003531 
   
  
    
    HipSci - Bardet-Biedl Syndrome - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001003532 
   
  
    
    HipSci - Alport Syndrome - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001003533 
   
  
    
    HipSci - Battens Disease - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001003534 
   
  
    
    HipSci - Congenital Hyperinsulinia - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001003535 
   
  
    
    HipSci - Kabuki Syndrome - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001003536 
   
  
    
    HipSci - Primary Immune Deficiency - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003537 
   
  
    
    HipSci - Hereditary Spastic Paraplegia - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001003538 
   
  
    
    HipSci - Hereditary Cerebellar Ataxias - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001003539 
   
  
    
    HipSci - Bleeding and Platelet Disorders - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001003540 
   
  
    
    HipSci - Hypertrophic Cardiomyopathy - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001003541 
   
  
    
    HipSci - Macular Dystrophy - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003542 
   
  
    
    HipSci - Retinitis Pigmentosa - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001003543 
   
  
    
    HipSci - Usher Syndrome - RNA Sequencing - July 2017 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001003544 
   
  
    
    Whole exome sequencing data for 30 PDOX models (28 early passages, 3 late passages (1 overlap)), 3 cell lines, and 20 matching human tumors 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  53 
 
  
    EGAD00001003545 
   
  
    
    Low-coverage whole genome sequencing data for 30 PDOX models (28 early passages, 4 late passages (2 overlaps)), 3 cell lines, and 21 matching human tumors 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001003546 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: LIRI-JP. 
    
   
  
    
   
  130 
 
  
    EGAD00001003547 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: LIRI-JP. 
    
   
  
    
   
  130 
 
  
    EGAD00001003548 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: CLLE-ES. 
    
   
  
    
   
  74 
 
  
    EGAD00001003549 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: CLLE-ES. 
    
   
  
    
   
  74 
 
  
    EGAD00001003550 
   
  
    
    Cell line exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  176 
 
  
    EGAD00001003551 
   
  
    
    The samples include paired tumor and normal tissues from 106 patients.
High-coverage whole exome sequencing (WES) or whole genome sequencing of DNA samples was performed on the Illumina HiSeq 2000 system. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  212 
 
  
    EGAD00001003553 
   
  
    
    Follicular lymphoma (FL) is an incurable B cell malignancy characterized by advanced stage disease and a heterogeneous clinical course. Recent genomic studies have focused on profiling “single” FL biopsies over several time-points; however, multi-site sampling in solid cancers has demonstrated profound spatial intra-tumor heterogeneity (ITH) with implications for precision-medicine-based initiatives. This study examined the extent of spatial heterogeneity in FL by whole exome sequencing 22 synchronously removed spatially separated biopsies from 9 patients. We observed significant differences in the extent of ITH across cases, with two distinct patterns of high and low spatial heterogeneity emerging. Site-specific alterations in genes with biological, prognostic or therapeutic relevance included TNFRSF14, PIK3CD, TNFAIP3, PTEN, EP300 and XBP1. In-depth characterization of these variants using deep-sequencing techniques confirmed their discordant nature, suggesting ongoing genetic diversification driving evolution after widespread tumor dissemination. There was evidence of tumors comprising multiple competing subclones, with distinct clusters of mutations demonstrating differential expansions within spatially separated sites. For cases where spatial tumors were examined at two time-points (FL and transformation to diffuse large B cell lymphoma (DLBCL)), the degree of heterogeneity increased with transformation. Collectively, our results demonstrate that spatial ITH is prevalent in FL. The existence of site-specific aberrations suggests that a single biopsy may not be sufficient in all patients to capture the full genomic complexity present, and these spatial variations need to be considered in biomarker-led clinical studies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001003555 
   
  
    
    40 paired normal and tumour whole-exome sequencing samples were used to investigate the genomic landscape of cutaneous squamous cell carcinoma 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  80 
 
  
    EGAD00001003556 
   
  
    
    We will perform RNAseq to evaluate the effects of the loss of a list of TSGs on the transcriptome.
This dataset contains all the data available for this study on 2017-08-10. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  25 
 
  
    EGAD00001003557 
   
  
    
    This dataset belongs to the 2014 whole-genome-sequenced AML data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 67 paired CR samples from Chunnam University.
All samples have passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  134 
 
  
    EGAD00001003558 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: RECA-EU. 
    
   
  
    
   
  100 
 
  
    EGAD00001003559 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: RECA-EU. 
    
   
  
    
   
  100 
 
  
    EGAD00001003560 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: MALY-DE. 
    
   
  
    
   
  99 
 
  
    EGAD00001003561 
   
  
    
    ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: MALY-DE. 
    
   
  
    
   
  99 
 
  
    EGAD00001003562 
   
  
    
    This dataset includes bam files from 120 samples. These samples were sequenced using 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner. All samples were processed with TruSeq DNA PCR-free sample preparation. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  118 
 
  
    EGAD00001003563 
   
  
    
    Whole exome sequencing of diffuse intrinsic pontine glioma (DIPG) cells isolated from the pons and from a sub-ventricular zone site of spread within the frontal lobe from the same individual (SU-DIPG-XIII) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003564 
   
  
    
    The aim of the project is to define the molecular defect in a cohort of Rett-like patients negative for mutations in known disease genes. To this end, a number of unrelated trios (patients plus parents) will be analysed by exome sequencing.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-08-16. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  46 
 
  
    EGAD00001003565 
   
  
    
    The project is focused on the axonal forms of Charcot-Marie-Tooth (CMT) disease. We have selected 13 families (7 from Spain and 6 from the Czech Republic) that have been clinically assessed in depth and previously tested for mutations in known CMT genes, with no causal variants characterised. In these patients we expect to discover several CMT2 genes. We therefore requested exome sequencing of 45 DNAs: 27 exomes in the families from Spain and 18 exomes in the families from the Czech Republic.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-08-16. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD00001003567 
   
  
    
    Reduced Representation Bisulfite Sequencing for WEHI-AML-1 and WEHI-AML-2. RRBS libraries were made with the NuGEN Ovation RRBS Methyl-Seq System. Bisulfite conversion was performed with the Qiagen Epitect kit. Sequencing was performed on an Illumina HiSeq2500. 
    
   
  
    
   
  7 
 
  
    EGAD00001003568 
   
  
    
    Genome sequencing at diagnosis and post induction for WEHI-AML-1 and WEHI-AML-2. Whole genome sequencing was performed on an Illumina HiSeq X Ten. 
    
   
  
    
   
  4 
 
  
    EGAD00001003569 
   
  
    
    Transcriptome sequencing for WEHI-AML-1 and WEHI-AML-2. RNA libraries were generated using the Illumina TruSeq RNA Sample Preparation Kit v2 and sequenced on an Illumina HiSeq2500. 
    
   
  
    
   
  9 
 
  
    EGAD00001003570 
   
  
    
    Exome sequencing for WEHI-AML-1 and WEHI-AML-2. Exome capture was performed with the Human All Exon v5_UTR Capture Library and the Agilent Technologies SureSelectXT2 Target Enrichment System, with sequencing on an Illumina HiSeq2500. 
    
   
  
    
   
  9 
 
  
    EGAD00001003571 
   
  
    
    The data consist of 678189 genome-wide polymorphic variants of 3658 individuals from the ERF/GRIP region in a variant call format (VCF) file. ERF has been genotyped with different genotyping platforms: Illumina 318 k, 350 k, 610 k and Affymetrix 200 k. 
    
   
  
    
   
  3658 
 
  
    EGAD00001003573 
   
  
    
    RNA sequencing data for the PDOX model EPD-613FH 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003574 
   
  
    
    Clonal evolution study of Intrahepatic cholangiocarcinoma: 69 PDPCs and 6 tissues. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  81 
 
  
    EGAD00001003579 
   
  
    
    Samples were prepared using Safe-SeqS technology. All samples were run on an Illumina MiSeq instrument. FASTQ files for read 1 and the index read are present (R and I, respectively). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  49 
 
  
    EGAD00001003580 
   
  
    
    WGS sequencing for 303 cases (620 samples) from the ICGC ESAD-UK project
Tumours 50x Normals 30x 
HiSeq X
BAM files
These samples are all available in ICGC release 26 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001003581 
   
  
    
    Using low input SMART-seq protocol, the whole transcriptome of human small intestine macrophage subtypes is characterized. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  33 
 
  
    EGAD00001003582 
   
  
    
    Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - RNA-Seq unmapped reads 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001003583 
   
  
    
    516 DNA samples were collected from individuals upon enrollment into the European Prospective Investigation into Cancer and Nutrition study between 1993 and 1998 across 17 different centers. 126 bp paired-end sequencing data from the Illumina platform were converted to FASTQ format; the 2 bp molecular barcode at each read of the pair was trimmed and written into the read name. The thymine nucleotide required for ligation was removed from the sequences. The Burrows-Wheeler Aligner (BWA-MEM) was used to align the processed FASTQ files to the hg19 reference genome, followed by indel realignment using GATK. An in-house algorithm was written to collapse read families that share the same molecular barcode sequence. 
    
   
  
    
   
  516 
 
  
    EGAD00001003584 
   
  
    
    Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - RNA-Seq mapped reads 
    
   
  
    
   
  - 
 
  
    EGAD00001003585 
   
  
    
    Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - WGS mapped reads 
    
   
  
    
   
  - 
 
  
    EGAD00001003586 
   
  
    
    Whole Genomes Define Concordance in Matched Primary, Xenograft, and Organoid Models of Pancreas Cancer - WGS mapped reads 
    
   
  
    
   
  54 
 
  
    EGAD00001003587 
   
  
    
    This dataset belongs to the 2015 whole-exome-sequenced AML data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 40 paired NR samples from Chunnam University.
All samples have passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  80 
 
  
    EGAD00001003589 
   
  
    
    Ultra-Fast Patient-Derived Xenografts Identify Functional and Spatial Tumour Heterogeneities that Drive Therapeutic Resistance - WXS mapped reads 
    
   
  
    
   
  27 
 
  
    EGAD00001003590 
   
  
    
    Ultra-Fast Patient-Derived Xenografts Identify Functional and Spatial Tumour Heterogeneities that Drive Therapeutic Resistance - WXS unaligned reads 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001003591 
   
  
    
    Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 25 
    
   
  
    
   
  211 
 
  
    EGAD00001003592 
   
  
    
    Merged bam files for PACA-CA Whole Exome Sequencing, for DCC release 25 
    
   
  
    
   
  216 
 
  
    EGAD00001003593 
   
  
    
   
  
    
      
      Complete Genomics 
      
    
   
  24 
 
  
    EGAD00001003596 
   
  
    
    The MITOEXME project aims to improve protocols for molecular diagnosis of patients with OXPHOS disorders, with a focus on next-generation sequencing methods, and to increase knowledge of pathophysiological mechanisms through identification of new targets and cellular studies. In this project we will sequence the exomes of 120 patients. 
This dataset contains all the data available for this study on 2017-08-29. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  125 
 
  
    EGAD00001003597 
   
  
    
    Promoter capture HiC on KMS11 (multiple myeloma) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003598 
   
  
    
    This dataset belongs to the 2017 AML prospective data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 10 paired tumor/normal samples from SNUH.
All samples have passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001003599 
   
  
    
    This dataset belongs to the 2017 AML genome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 10 paired tumor/normal samples from SNUH.
All samples have passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001003600 
   
  
    
    Exome sequencing data for 1001 DLBCL patients and RNA sequencing data for 775 DLBCL patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1776 
 
  
    EGAD00001003601 
   
  
    
    The dataset for Direct Detection of Early-Stage Cancers using Circulating Tumor DNA includes 602 bam files from next-generation sequencing on the Illumina HiSeq2500 or MiSeq.  The samples analyzed include cancer cell lines as well as plasma and tissue specimens from healthy individuals and patients with cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  550 
 
  
    EGAD00001003602 
   
  
    
    Dataset consisting of: 
(1) N=234 genome-wide chromatin accessibility (ATAC-seq) profiles for distinct N=21 healthy old and N=28 healthy young subjects. ATAC-seq biological samples provided for the following tissues: PBMC (N=24), CD14+ monocytes (N=18), CD8+ memory T cells (N=7), CD8+ naive T cells (N=7), CD4+ memory T cells (N=7), CD4+ naive T cells (N=7), and naive B cells  (N=7).
(2) N=39 genome-wide transcription (RNA-seq) data for distinct N=15 healthy old and N=24 healthy young subjects' PBMCs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  273 
 
  
    EGAD00001003603 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003604 
   
  
    
    Genome and transcriptome sequence data from a metastatic gallbladder cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003605 
   
  
    
    Genome and transcriptome sequence data from a metastatic colonic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003606 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003607 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003608 
   
  
    
    Genome and transcriptome sequence data from a metastatic small cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003609 
   
  
    
    Genome and transcriptome sequence data from a metastatic serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003610 
   
  
    
    Genome and transcriptome sequence data from a mullerian mixed tumor with carcinosarcoma of the ovaries patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003611 
   
  
    
    Genome and transcriptome sequence data from a metastatic cecal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003612 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003613 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003614 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003615 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003616 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003617 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003618 
   
  
    
    Genome and transcriptome sequence data from a mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003619 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003620 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003621 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003622 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003623 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003624 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003625 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003626 
   
  
    
    Genome and transcriptome sequence data from a retroperitoneal mucinous cystic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003627 
   
  
    
    Genome and transcriptome sequence data from a salivary duct carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003628 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of appendiceal origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003629 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003630 
   
  
    
    Genome and transcriptome sequence data from a radiation-induced pleomorphic sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003631 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003632 
   
  
    
    Genome and transcriptome sequence data from a chronic lymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003633 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003634 
   
  
    
    Genome and transcriptome sequence data from a solitary fibrous tumor (sarcoma) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003635 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003636 
   
  
    
    Genome and transcriptome sequence data from a metastatic paraganglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003637 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003638 
   
  
    
    Genome and transcriptome sequence data from a metastatic prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003639 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003640 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003641 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003642 
   
  
    
    Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003643 
   
  
    
    Genome and transcriptome sequence data from a metastatic cecal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003644 
   
  
    
    Genome and transcriptome sequence data from a metastatic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003645 
   
  
    
    Genome and transcriptome sequence data from an anaplastic ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003646 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003647 
   
  
    
    Genome and transcriptome sequence data from an anal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003648 
   
  
    
    Genome and transcriptome sequence data from a glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003649 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003650 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003651 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003652 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003653 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003654 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003655 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003656 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003657 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003658 
   
  
    
    Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001003659 
   
  
    
    Genome and transcriptome sequence data from an ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003660 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003661 
   
  
    
    Genome and transcriptome sequence data from an advanced adenocarcinoma of lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003662 
   
  
    
    Genome and transcriptome sequence data from a left cavernous sinus invasive skull meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003663 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003664 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003665 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003666 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003667 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003668 
   
  
    
    Genome and transcriptome sequence data from a metastatic rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003669 
   
  
    
    Genome and transcriptome sequence data from a metastatic mucinous adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003670 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003671 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003672 
   
  
    
    Genome and transcriptome sequence data from a metastatic clear cell ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003673 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001003674 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003675 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003676 
   
  
    
    Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003677 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003678 
   
  
    
    Genome and transcriptome sequence data from a thymoma carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003679 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003680 
   
  
    
    Genome and transcriptome sequence data from a low grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003681 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003682 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003683 
   
  
    
    Genome and transcriptome sequence data from a metastatic high grade sarcomatous neoplasm NOS patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003684 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003685 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003686 
   
  
    
    Genome and transcriptome sequence data from a metastatic neuroendocrine tumor arising from small bowel patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003687 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003688 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003689 
   
  
    
    Genome and transcriptome sequence data from a metastatic epithelioid angiomyolipoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003690 
   
  
    
    Transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  1 
 
  
    EGAD00001003691 
   
  
    
    Genome sequence data from a metastatic squamous cell carcinoma of the oropharynx patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003692 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003693 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003694 
   
  
    
    Genome and transcriptome sequence data from a pleomorphic sarcomatoid epithelioid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003695 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003696 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003697 
   
  
    
    Genome and transcriptome sequence data from a metastatic meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003698 
   
  
    
    Genome and transcriptome sequence data from a locally advanced right breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003699 
   
  
    
    Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003700 
   
  
    
    Genome and transcriptome sequence data from a thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003701 
   
  
    
    Genome and transcriptome sequence data from a metastatic myoepithelial carcinoma of parotid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003702 
   
  
    
    Genome and transcriptome sequence data from a high grade serous carcinoma of the fallopian tube/ovary/peritoneum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003703 
   
  
    
    The incidence of acute myeloid leukemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 60. Only 10-15% of cases evolve from a pre-existing myeloproliferative or myelodysplastic disorder; the remaining cases arise de novo without a detectable prodrome and are diagnosed upon development of bone marrow failure. Analysis of diagnostic blood samples has demonstrated that de novo AML is preceded by the accumulation of somatic mutations in pre-leukemic hematopoietic stem and progenitor cells (preL-HSPCs) that subsequently undergo clonal expansion. If individuals in this pre-leukemic phase could be identified, methods for determination of risk and monitoring for progression to overt AML could be developed. However, recurrent AML mutations also accumulate during aging in healthy individuals who never develop AML, a phenomenon referred to as age-related clonal hematopoiesis (ARCH). To distinguish individuals with preL-HSPCs at high risk of developing AML from those with ARCH, we undertook deep targeted sequencing of genes recurrently mutated in AML in blood samples from 133 individuals in the European Prospective Investigation into Cancer and Nutrition (EPIC) study taken on average 6 years before they developed AML (pre-AML group), together with 683 matched healthy individuals (Control group). Pre-AML cases displayed accelerated age-correlated accumulation of somatic mutations. The identity, number and variant allele frequency (VAF) of mutations differed between the two groups, and were incorporated into a computational model of AML risk prediction that accurately distinguished pre-AML cases from controls on average 7 years prior to AML development. Our findings provide proof of concept that early prediction of AML development is feasible in high-risk populations, paving the way for early disease detection, monitoring, and potentially prevention. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  628 
 
  
    EGAD00001003704 
   
  
    
    RNA sequencing of purified human group 3 innate lymphoid cells from non-reactive lymph nodes and spleen, inflamed tonsils and peripheral blood. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001003705 
   
  
    
    Ten single-cell placental RNA libraries were generated using the Chromium Single Cell 3′ Reagent Kit (10X Genomics). All single-cell libraries were sequenced in a customized paired-end, dual-indexing (98/14/8/10-bp) format according to the recommendation by 10X Genomics. The data were aligned using the Cell Ranger Single-Cell Software Suite (version 1.0). In addition, plasma RNA from 22 samples was extracted using the RNeasy Mini Kit (Qiagen). cDNA reverse transcription, second-strand synthesis, and RNA-sequencing (RNA-seq) library construction were performed using the Ovation RNA-seq System V2 (NuGEN) kit according to the manufacturer’s protocol. For alignment of the plasma RNA library, adaptor sequences and low-quality bases on the fragment ends (i.e., quality score < 5) were trimmed, and reads were aligned to the human reference genome (hg19) using the TopHat (v2.0.4) software. All aligned reads were deposited in BAM file format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  32 
 
  
    EGAD00001003706 
   
  
    
    PRAD-CA, DCC Release 26 : This dataset contains fastq files with Whole genome sequencing data for the CPC-Gene Project. Data from each sample was generated using multiple whole genome libraries and sequenced across multiple runs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  616 
 
  
    EGAD00001003708 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003709 
   
  
    
    Genome and transcriptome sequence data from a high-grade serous fallopian tube carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003710 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003711 
   
  
    
    Genome and transcriptome sequence data from a bilateral breast lobular cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003712 
   
  
    
    Genome and transcriptome sequence data from a primary of unknown origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003713 
   
  
    
    Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003714 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003715 
   
  
    
    Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003716 
   
  
    
    Genome and transcriptome sequence data from a melanoma of the right buccal mucosa patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003717 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003718 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003719 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003720 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003721 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003722 
   
  
    
    Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003723 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of the anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003724 
   
  
    
    Genome and transcriptome sequence data from a T-cell rich B cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003725 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectosigmoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001003726 
   
  
    
    Genome and transcriptome sequence data from a large-cell neuroendocrine lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003727 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003728 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003729 
   
  
    
    Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003730 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003731 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003732 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003733 
   
  
    
    Genome and transcriptome sequence data from a metastatic uterine leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003734 
   
  
    
    Genome and transcriptome sequence data from a spindle cell carcinoma of the left parotid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003735 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003736 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003737 
   
  
    
    Genome and transcriptome sequence data from a sinus adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003738 
   
  
    
    Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003739 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003740 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003741 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003742 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003743 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003744 
   
  
    
    Genome and transcriptome sequence data from a pleomorphic xanthoastrocytoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001003745 
   
  
    
    Exome sequencing fastq files from 6 mutation carriers and 5 non-carriers from 2 families. One µg DNA was used for library preparation using the TruSeq DNA LT Sample Prep Kit v2 according to the manufacturer’s instructions (Illumina). Hybridization was performed using NimbleGen SeqCap EZ Exome v3 (Roche), followed by paired-end sequencing (2x100 bp) on the Illumina HiSeq 2000 with TruSeq v3 chemistry (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001003746 
   
  
    
    Sequencing was performed using OncoPanel v.2 (OPv2), an Agilent SureSelect custom-designed bait set consisting of the coding regions of 504 genes previously linked to human cancer. Sequencing was performed on an Illumina HiSeq 2500. 14 highly differentiated, fusion-negative rhabdomyosarcoma tumor samples and 8 non-matched normal skeletal muscle samples were sequenced. BAM files are available for download. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  22 
 
  
    EGAD00001003747 
   
  
    
    Optimisation of ex vivo Memory B cell Expansion/Differentiation for Interrogation of Rare Peripheral Memory B Cell Subset Responses
1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-09-13. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  38 
 
  
    EGAD00001003748 
   
  
    
    Sequencing of B-cell receptor repertoires in healthy individuals and patients with chronic lymphocytic leukemia.
1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/
 
This dataset contains all the data available for this study on 2017-09-13. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  387 
 
  
    EGAD00001003749 
   
  
    
    Isotype-resolved sequencing of B cell receptor in measles virus infection
1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/
This dataset contains all the data available for this study on 2017-09-13. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  182 
 
  
    EGAD00001003750 
   
  
    
    This is the first whole exome sequencing analysis of a primary meningeal melanocytic tumour (MMT) alongside the patient's germline. Here we report the CRAM files from the tumour and germline. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001003751 
   
  
    
    Whole genome sequencing data for primary tumors, matching control material from blood and their corresponding organoid.
Whole transcriptome data for organoids. 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  102 
 
  
    EGAD00001003752 
   
  
    
    Single nucleotide variant calls from SomaticSniper, VCF format 
    
   
  
    
   
  34 
 
  
    EGAD00001003753 
   
  
    
    Single nucleotide variant calls from SomaticSniper, VCF format. Input for subclonal reconstruction 
    
   
  
    
   
  20 
 
  
    EGAD00001003754 
   
  
    
    Structural variant calls from Delly, VCF format 
    
   
  
    
   
  37 
 
  
    EGAD00001003755 
   
  
    
    This dataset provides whole genome sequencing data of normal/tumor pairs from 9 patients with uterine or ovarian carcinosarcoma using the HiSeq 2000 sequencing system. It includes 27 samples (9 normals, 16 uterine tumors and 2 ovarian tumors). Through separate whole genome sequencing of carcinomatous and sarcomatoid components, we analyse and compare the genomic alterations of these components. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001003756 
   
  
    
    Prostate Cancer - RNA-Seq unmapped reads 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  98 
 
  
    EGAD00001003757 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Fastq files 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3686 
 
  
    EGAD00001003758 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Bam files 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3686 
 
  
    EGAD00001003759 
   
  
    
    ATAC-seq data for 5 non-diabetic human pancreatic islet samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001003760 
   
  
    
    This dataset contains 88 paired samples from HCC patients, comprising tumors and matched adjacent normal tissues, which were sequenced on the Illumina HiSeq 2000 platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  176 
 
  
    EGAD00001003761 
   
  
    
    This dataset contains fastq files with Whole genome sequencing data for the CPC-Gene Project. Data from each sample was generated using multiple whole genome libraries and sequenced across multiple runs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  617 
 
  
    EGAD00001003762 
   
  
    
    Whole Exome sequencing of paediatric High Grade Gliomas 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  99 
 
  
    EGAD00001003763 
   
  
    
    15 whole exome sequencing datasets from five patients. Data is provided as bam files. Libraries were generated using the SeqCap EZ Exome v3.0 kit and sequenced on an Illumina sequencer 
    
   
  
    
   
  15 
 
  
    EGAD00001003764 
   
  
    
    Four RNA-sequencing datasets from two patients with initial low-grade glioma and copy number alteration at IDH1 upon recurrence. Data is provided as bam files. 
    
   
  
    
   
  4 
 
  
    EGAD00001003765 
   
  
    
    Whole-exome sequencing of 20 samples of actinic keratosis (10) and cutaneous squamous cell carcinoma (10) was performed to investigate a potential relationship between DNA methylation-based subtypes and genetic mutation patterns. Seven samples were shown to belong to the stem cell-like subclass (4 AK and 2 SCC), 12 to the keratinocyte-like subtype (6 AK and 6 SCC), and one SCC sample is unclassified (it was not included in the methylation analysis). Exome regions were captured using the Agilent Low Input Exome-Seq Human v5 kit and sequenced on an Illumina HiSeq 4000 with paired-end 100-nucleotide reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  20 
 
  
    EGAD00001003769 
   
  
    
    This dataset is a time-series of EGFR-mutant NSCLC clinical specimens from an individual patient profiled using tumor-based whole exome sequencing and the data is in BAM format. 
DNA was extracted from FFPE for primary tumor and frozen tumor tissue samples and matched non-tumor tissue using the Qiagen Allprep DNA/RNA Mini Kit. The library preparation protocol was based on the Agilent SureSelect Library Prep and Capture System. DNA was resuspended in a low TE buffer and sheared (Duty Cycle 5%; Intensity 175; Cycles/Burst: 200; Time: 300s, Covaris S2 Ultrasonicator). Bar-coded exome libraries were prepared using the Agilent SureSelect V5 library kit per manufacturer’s specifications. The libraries were run on the HiSeq 2500.
Raw paired-end reads (100bp) in FastQ format generated by the Illumina pipeline were aligned to the full hg19 genomic assembly obtained from UCSC, GENCODE 14, using bwa version 0.7.12. Picard tools version 1.117 was used to sort, remove duplicate reads and generate QC statistics. Tumor DNA was sequenced to median depth of 303X (range 114.39-383.41) and the matched germline DNA to average depth of 231.65. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003770 
   
  
    
    We performed RNA-seq on polyA-enriched mRNA isolated from the original liver biopsy tissue (Liver tissue), primary liver cells (PLC), hepatocyte-like cells (HLCs) differentiated from induced pluripotent stem cells (iPSCs), and iPSCs. RNA libraries were prepared using the Illumina TruSeq Stranded mRNA Sample Preparation protocol (ref. RS-122-2101, Illumina, San Diego CA, US) and sequenced using the Illumina HiSeq2500 platform following the manufacturer’s protocol. Samples were sequenced in paired-end mode to a length of 2x76 base pairs. Images from the instrument were processed using the manufacturer’s software to generate FASTQ sequence files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001003776 
   
  
    
    186 tumor/normal matched samples from whole exome sequencing and 178 samples (168 tumors, 10 normals) from whole transcriptome sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  550 
 
  
    EGAD00001003778 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001003779 
   
  
    
    Whole genome sequencing (WGS) data of human small intestinal organoid cultures, which were deleted for the XPC gene using CRISPR-Cas9. Contains WGS data of 1 clone and 1 subclone. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001003780 
   
  
    
    RNA-seq data obtained from directed differentiation of a subset of FiPSC and BiPSC cell lines towards islet-like cells. RNA was collected at two key developmental stages: definitive endoderm (DE) and pancreatic progenitors (PP). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001003781 
   
  
    
    Paired whole exome sequencing for 32 primary MDS, 14 MDS/MPN, and 8 AML-MRC cases (total = 54).  Normal comparator genomic DNA was extracted from lymphocytes purified by flow cytometry.  Bulk myeloid cells were used as a source of tumor gDNA.  Files uploaded are mapped BAM files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  94 
 
  
    EGAD00001003782 
   
  
    
    When available (25 primary MDS, 12 MDS/MPN, and 6 AML-MRC cases), high quality RNA (stranded-total) was submitted for RNA-seq.  RNA was extracted from bulk myeloid cells, which were used as the tumor population.  Files uploaded are mapped BAM files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  43 
 
  
    EGAD00001003783 
   
  
    
    Recent studies using next-generation sequencing strategies have described the landscape of genetic alterations in diffuse large B-cell lymphoma (DLBCL). However, little is known about the clinical relevance of recurrent mutations and copy number alterations and their transcriptional footprints. This study examines the frequency, interaction and clinical impact of recurrent genetic aberrations in DLBCL using high-resolution technologies in a large population-based cohort. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  376 
 
  
    EGAD00001003784 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Bam files - unrelated samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3559 
 
  
    EGAD00001003785 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Fastq files - unrelated samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3559 
 
  
    EGAD00001003786 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Bam files - GoNL samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  420 
 
  
    EGAD00001003787 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Fastq files - GoNL samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  420 
 
  
    EGAD00001003788 
   
  
    
    Whole Exome Sequencing of 9 Colorectal Cancer (CRC) samples performed on Illumina HiSeq4000 consisting of aligned paired reads. RNAseq data sequenced on Illumina NextSeq500 consisting of FASTQ single reads from 3 CRC colon samples. A total of 12 samples from five patients (matched normal tissue or PBMC, and tumors) were sequenced on Illumina NextSeq500. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  24 
 
  
    EGAD00001003789 
   
  
    
    Exome reads consisting of FASTQ paired end reads from 5 FHD/FHDL patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001003790 
   
  
    
    RNA seq reads consisting of FASTQ paired end reads from 5 FHD/FHDL patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001003791 
   
  
    
    The SAHGP characterises the genomes of 24 individuals (8 Coloured  and 16 black southeastern Bantu-speakers) using deep whole genome sequencing (WGS). 
    
   
  
    
   
  24 
 
  
    EGAD00001003792 
   
  
    
    The dataset for High Grade Serous Ovarian Carcinomas Originate in the Fallopian Tube includes 46 bam files from next-generation sequencing on the Illumina HiSeq2500.  The samples analyzed include multiple lesions from nine patients, five with high grade serous ovarian carcinoma and four who are BRCA-carriers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  46 
 
  
    EGAD00001003793 
   
  
    
    By differential gene expression analysis followed by protein expression and functional studies, we define that the naive T cells having divided the least since thymic emigration express complement receptors (CR1 and CR2) known to bind complement C3b- and C3d-decorated microbial products and, following activation, produce IL-8 (CXCL8), a major chemoattractant for neutrophils in bacterial defense. We also observed an IL-8–producing memory T cell subpopulation coexpressing CR1 and CR2 and with a gene expression signature resembling that of RTEs.
JCI Insight. 2017;2(16):e93739.
https://doi.org/10.1172/jci.insight.93739 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001003794 
   
  
    
   
  
    
   
  8 
 
  
    EGAD00001003795 
   
  
    
    This dataset includes Nimblegen SeqCap EZ Exome v3 data for each lesion of three patients with multicentric glioma. For two patients, each lesion was sequenced along with whole blood. For a third patient, 3 pieces from the right lesion and 4 pieces from the left were sequenced along with whole blood. In each case BAM files that have been aligned with BWA mem alignment are available. 
    
   
  
    
   
  15 
 
  
    EGAD00001003797 
   
  
    
    This dataset contains WES data (.bam files) and associated phenotype information from 10 patients included in our microbiome study who went on to anti PD-1 immunotherapy for the treatment of metastatic melanoma at the University of Texas MD Anderson Cancer Center. Both tumor and matching germ line normal were sequenced on each patient using Illumina HiSeq 2500. The average coverage was 283X in tumors and 135X in germline (tumor+germline overall:209, Range: 0-1552). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001003799 
   
  
    
    We performed whole-exome sequencing and whole epigenome sequencing (RRBS) of samples collected at different time points during radiotherapy from thirty-four ESCC patients. We compared the genetic and epigenetic features of the biopsy samples from the different time points to reveal the changes in ESCC following radiotherapy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  180 
 
  
    EGAD00001003800 
   
  
    
    Whole Exome Sequencing was performed in a dilution series containing known amounts of human and mouse DNA, 3x 100% human 0% mouse, 2x 90/10, 3x 50/50, 2x 25/75 and 3x 0/100. A set of breast cancer clinical samples, matched normal tissue and matched PDTXs (total number = 14) were also analysed. Paired-end 75bp sequences for the dilution series and paired-end 125bp for the clinical samples were obtained on Illumina HiSeq2500; fastq files are provided.
A triplicate analysis of the transcriptome using RNA-seq was also performed for the Universal Human RNA Reference and the Universal Mouse RNA Reference samples. Paired-end 150bp fastq files obtained on Illumina HiSeq4000 are provided. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001003801 
   
  
    
    RNAseq Data set 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001003802 
   
  
    
    106 FFPE tumor samples from small bowel were sequenced with Illumina HiSeq 4000. Exome capture was performed with NimbleGen SeqCap EZ Exome Library v3 Kit. Reads were aligned with BWA–MEM v.0.7.12 to GRCh37 reference genome. Variant calls were produced with GATK HaplotypeCaller. Variant calls were filtered against all data from gnomAD database using allele frequency threshold 0.0001 in order to remove germline variation. 
    
   
  
    
   
  106 
 
  
    EGAD00001003803 
   
  
    
    This dataset contains VCF files from a variant calling analysis of 19 neuroblastoma patients. 
WES or WGS data of the primary tumor were compared to WES cfDNA analysis at the time of diagnosis and at a 2nd timepoint (complete remission, partial remission, disease progression or relapse). For 4 patients, WGS of germline, tumor at diagnosis and tumor at relapse DNA was performed on Illumina HiSeq2500, with 100-bp paired-end reads. For the other patients, WES was performed using either an Agilent SureSelect Human All Exon v5 or a Roche Nimblegen SeqCap EZ Exome V3 kit on Illumina HiSeq2000, with 100-bp paired-end reads.
SNVs observed in any of the primary tumors or cfDNA samples studied by WES were targeted using a capture sequencing panel at all intermediate time points. 
    
   
  
    
   
  146 
 
  
    EGAD00001003804 
   
  
    
    Exome fastq files of 98 hepatocellular carcinoma and matched normal (BCM, HCC-JP) 
    
   
  
    
      
      ILLUMINA 
      
      Illumina HiSeq 2000 
      
    
   
  196 
 
  
    EGAD00001003805 
   
  
    
    A whole genome mutation analysis of cortical kidney tissue, an early passage kidney organoid culture derived from the kidney tissue sample, and a late passage of the same organoid culture. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001003806 
   
  
    
    cDNA depleted RNA (500ng total RNA input) was fragmented to 150-200 nucleotides in first strand buffer for 3 minutes at 94°C. Random hexamer primed first strand was generated in presence of dATP, dGTP, dCTP and dTTP. Second strand was generated using dUTP instead of dTTP to tag the second strand. Subsequent steps to generate the sequencing libraries were performed with the KAPA HTP Library Preparation Kit for Illumina sequencing with minor modifications, i.e., after indexed adapter ligation to the dsDNA fragments, the library was treated with USER enzyme (NEB_M5505L) in order to digest the second strand derived fragments. After amplification of the libraries, samples with unique sample indexes were pooled and sequenced paired-end 2x50bp on a HiSeq2500 system following standard Illumina guidelines. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001003807 
   
  
    
    Whole transcriptome RNA sequencing (RNA-seq) of human induced pluripotent stem cell lines from three independent donors at seven islet developmental stages: definitive endoderm (DE), primitive gut tube (GT), posterior foregut (PF), pancreatic endoderm (PE), endocrine progenitors (EP), endocrine-like cells (EN), and beta-like cells (BLC). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001003808 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  47 
 
  
    EGAD00001003809 
   
  
    
    This dataset includes 186 whole genome sequencing samples which combine to create 93 pairs. Each pair is comprised of two sequencing experiments carried out on the same donor to the NIHR BioResource Rare Disease cohort. These samples have been used to validate the Telomerecat method (a method for estimating telomere length from whole genome sequencing). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001003810 
   
  
    
    An RNA Seq study of the effects of HDAC inhibitor Quisinostat on six different synovial sarcoma cell lines 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001003811 
   
  
    
    Our project will examine the role of PIK3CA mutations and their sensitivity to endocrine therapies and its role, with the addition of complete ovarian suppression. We plan to test our hypotheses using tumour samples collected from patients enrolled in the SOFT/IBCSG24-02 clinical study (Suppression of Ovarian Function Trial, NCT00066690). SOFT is a phase III trial that randomised 3066 premenopausal women to evaluate if adding ovarian suppression to adjuvant endocrine therapy will improve clinical outcomes. 
This dataset contains all the data available for this study on 2017-11-22. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  81 
 
  
    EGAD00001003812 
   
  
    
    Whole genome sequencing of samples from isolated populations from Croatia. The samples are sequenced using the Illumina HiSeq X Ten system.  
This dataset contains all the data available for this study on 2017-11-22. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001003813 
   
  
    
    The data contain whole exome sequencing of 27 Greenlanders in nine trios. Data were produced by Agilent SureSelect capture followed by paired-end Illumina HiSeq 2000 sequencing to a depth of 90.1X. More details on processing and analysis can be found in Moltke et al, Nature 2014 (PMID 25043022). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001003814 
   
  
    
    The data contain whole deep RNA sequencing of leukocytes from 17 Greenlanders. RNA was purified from peripheral blood with the PAXGene Blood miRNA Kit (Qiagen). The RNA sequencing library was prepared following the instructions of the TruSeq RNA Sample Prep Kit v2 (Illumina). For mRNA isolation and fragmentation 200 ng of total RNA was purified by oligo-dT beads. The qualified libraries were amplified on cBot to generate the cluster on the flowcell (TruSeq PE Cluster Kit V3–cBot–HS, Illumina). The amplified flow cell was sequenced paired-end on the HiSeq 4000 System (TruSeq SBS KIT-HS V3, Illumina). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001003815 
   
  
    
    Whole exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001003816 
   
  
    
    HALT AML mRNA - RNASeq mapped reads 
    
   
  
    
   
  22 
 
  
    EGAD00001003818 
   
  
    
    BAM files of targeted next-generation DNA sequencing data of 13 chordoid gliomas of the third ventricle (2 paired tumor-normal samples and 11 tumor-only samples).  Genomic DNA was extracted from formalin-fixed, paraffin-embedded blocks of tumor tissue from 13 patients with chordoid glioma of the third ventricle using the QIAamp DNA FFPE Tissue Kit (Qiagen).  Genomic DNA was also extracted from leukocytes in a peripheral blood sample from one of the patients and a non-neoplastic gastric biopsy specimen from one of the patients.  Capture-based next-generation DNA sequencing was performed at the University of California, San Francisco Clinical Cancer Genomics Laboratory, using an assay that targets all coding exons of approximately 500 cancer-related genes, select introns of 47 genes, and TERT promoter with a total sequencing footprint of 2.8 Mb (UCSF500 Cancer Panel).  Sequencing libraries were prepared from genomic DNA, and target enrichment was performed by hybrid capture using a custom oligonucleotide library (Nimblegen SeqCap EZ Choice).  Captured libraries were sequenced as paired-end 100 bp reads on an Illumina HiSeq 2500 instrument.  Duplicate sequencing reads were removed computationally to allow for accurate allele frequency determination and copy number calling. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001003819 
   
  
    
    The dataset includes a subset of 762 individuals that were found to be closely related (≤3rd degree), including 263 Chinese and 499 Malays from 
the Singapore Living Biobank. These samples were whole-exome sequenced on the Illumina HiSeq2000 platform (125bp paired end), with the exonic regions captured using the Nimblegen SeqCap EZ Exome v3 kits. All the files are in the BAM format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  762 
 
  
    EGAD00001003820 
   
  
    
    Whole transcriptome, strand-specific RNA-seq libraries were prepared from total RNA purified using RNeasy mini kit (Qiagen) using Ribo-Zero technology (Epicentre, an Illumina company) for depletion of rRNA followed by library preparation using ScriptSeq RNA-Seq Library preparation Kit from Illumina. The paired raw sequence reads were processed using TopHat2 and mapped to the human reference genome HG19. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001003821 
   
  
    
    WES was performed using the KAPA-Hyper prep kit for Illumina (Roche, Basel, Switzerland) for library construction, followed by exome capture using Nimblegen SeqCap EZ Human Exome Library v3.0 (Roche). Reads were mapped using BWA MEM against the human reference genome HG19. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  42 
 
  
    EGAD00001003822 
   
  
    
    The dataset comprises 8 breast cancer, 11 ovarian cancer, 1 benign tumour, 18 normal tissue, 2 endometrium, and 23 white blood cell samples. Genome wide methylation analysis was performed by Reduced Representation Bisulfite Sequencing (RRBS) on Illumina HiSeq 2500. Data is provided as FASTQ files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  63 
 
  
    EGAD00001003823 
   
  
    
    Somatic mutations were called using whole exome sequencing (WES) data from colorectal cancer samples (dataset EGAD00001003821) using MuTect2, with matched constitutional WES data obtained from leukocyte samples as reference. 
    
   
  
    
   
  37 
 
  
    EGAD00001003824 
   
  
    
    Whole genome sequencing data on 10 human cancer cell lines 
    
   
  
    
      
      Complete Genomics 
      
      Illumina Genome Analyzer IIx 
      
    
   
  14 
 
  
    EGAD00001003825 
   
  
    
    Patients with T-cell prolymphocytic leukemia (T-PLL) were profiled with 
multiple OMICS approaches based on Next-Generation Sequencing (NGS). In 
total, data from RNA-Seq, Whole-Exome Sequencing, Whole-Genome 
Sequencing and amplicon panel analyses in 134 samples are available. All 
samples were processed as paired-end libraries on Illumina sequencing 
machines. The data are available as paired FastQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  134 
 
  
    EGAD00001003827 
   
  
    
    The data set contains bam files aligned using bwa-0.7.8 mem -t 8 -R. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001003828 
   
  
    
    This dataset contains paired fastq files for LMS tumor samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  37 
 
  
    EGAD00001003829 
   
  
    
    The data set contains paired end fastq files for whole exome sequencing data for Leiomyosarcoma tumor and control samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  96 
 
  
    EGAD00001003831 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001003832 
   
  
    
    Patient information
SSc patients were recruited at the Department of Rheumatology of the Leiden University Medical Center (Leiden, The Netherlands). All patients met the American Rheumatism Association classification criteria for SSc (Subcommittee for scleroderma criteria 1980), and were classified according to LeRoy and Medsger criteria as either limited or diffuse cutaneous disease (LeRoy EC, Black C, Fleischmajer R, Jablonska S, Krieg T, Medsger TA Jr, Rowell N 1988). Institutional review board approval and written informed consent was obtained before patients entered this study. Two 4 mm skin biopsies were taken from a standardized location on the most proximal part of the lower arm, distal from the elbow. In 10 patients the skin biopsy came from a clinically affected area and in 4 patients the skin was locally unaffected. One sample was used for RNA sequencing and one sample was used for immunohistochemistry. Skin biopsies from healthy individuals were commercially sourced (Tissue Solutions, UK) and collected from donors undergoing skin resection surgery and after informed consent. To match the healthy skin with patients as much as possible, skin biopsies from healthy controls were also taken from a similar position (the under-arm (for 4 controls) and leg (for 2 controls)). Healthy skin donors were selected to match the age and sex of the SSc patient cohort. Biopsies from patients and controls were equally treated and were both stored at -80°C until RNA isolation was performed. RNA from frozen skin biopsies was isolated using RNeasy kit from fibrous tissue (Qiagen, the Netherlands). RNA quantity was determined by using SimplyNano 2000 and quality was assessed on Tapestation (Agilent, the Netherlands). All samples included in the study had a RIN score above 7.0.
Transcriptome characterisation and analysis
RNA sequencing was performed using polyA selection and a stranded protocol using Ion Torrent next generation sequencing technology (Service XS, The Netherlands). The Ion PI Template OT2 200 Kit v3 and Ion PI Sequencing 200 Kit v3 were used according to the manufacturer’s instructions. 20 samples were run on 11 PI chips. PI chip analyses, base calling and quality checks were performed using the Torrent Server Suite. An average of 42 million 100 bp reads was generated per sample. Following quality control, reads were aligned to the human genome (Homo sapiens GRCh38.78) using Bowtie2 and STAR (Dobin et al. 2013; Langmead and Salzberg 2012). Reads were first aligned with STAR. For the unmapped reads from STAR, a second alignment step was performed using bowtie2 (local very sensitive options). 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  20 
 
  
    EGAD00001003834 
   
  
    
    This dataset contains whole genome sequencing FASTQ data for 12 cholangiocarcinoma tumor samples, and their matched normal samples. These 12 samples are in addition to 59 samples available in dataset EGAD00001001988, and consist of patients from Thailand, Romania, and Singapore. Paired-end sequencing data was generated by Illumina HiSeq 2000 and 2500, with insert sizes of 170 and 350. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001003835 
   
  
    
    Whole genome sequencing data of 25 prostate tumor and corresponding normal samples, aligned with the CGP BWA-mem workflow. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  50 
 
  
    EGAD00001003837 
   
  
    
    This dataset, named Stockholm tumor progression cohort, contains exome-sequencing samples of matched primary and metastasis samples from 20 metastatic breast cancer patients. All patients have one or more sequenced normal samples as well. The total number of samples is 125. The dataset has been used, apart from other studies, to explore tumor evolution patterns in metastatic breast cancer at Karolinska Institute Stockholm. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  125 
 
  
    EGAD00001003838 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001003839 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  26 
 
  
    EGAD00001003840 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003841 
   
  
    
    One sample of human genomic DNA.
DNA extracted from whole blood.
Reads obtained  using an exome enrichment kit (Truseq, Illumina) and sequencing of 100bp paired-end reads on a HiSeq 2500 sequencing system (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003845 
   
  
    
    A SMC01_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003846 
   
  
    
    A SMC02_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003847 
   
  
    
    A SMC03_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003848 
   
  
    
    A SMC04_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003849 
   
  
    
    A SMC05_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003850 
   
  
    
    A SMC06_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003851 
   
  
    
    A SMC07_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003852 
   
  
    
    A SMC08_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003853 
   
  
    
    A SMC09_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003854 
   
  
    
    A ADMSC01_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003855 
   
  
    
    A ADMSC02_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003856 
   
  
    
    A ADMSC03_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003857 
   
  
    
    A ADMSC04_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003858 
   
  
    
    A SMC01_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003859 
   
  
    
    A SMC02_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003860 
   
  
    
    A SMC03_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003861 
   
  
    
    A SMC04_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003862 
   
  
    
    A SMC05_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003863 
   
  
    
    A SMC06_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003864 
   
  
    
    A SMC07_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003865 
   
  
    
    A SMC08_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003866 
   
  
    
    A SMC09_smRNA-Seq single end data for skeletal muscle cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003867 
   
  
    
    A ADMSC01_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003868 
   
  
    
    A ADMSC02_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003869 
   
  
    
    A ADMSC03_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003870 
   
  
    
    A ADMSC04_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001003871 
   
  
    
    A SMC01_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003872 
   
  
    
    A SMC02_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003873 
   
  
    
    A SMC05_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003874 
   
  
    
    A SMC06_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003875 
   
  
    
    A SMC07_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003876 
   
  
    
    A SMC08_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003877 
   
  
    
    A SMC09_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003878 
   
  
    
    A ADMSC01_WGBS paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003879 
   
  
    
    A ADMSC02_WGBS paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003880 
   
  
    
    A ADMSC03_WGBS paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003881 
   
  
    
    A ADMSC04_WGBS paired end data for adipose-derived mesenchymal stromal cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003882 
   
  
    
    EBiSC Whole Genome Sequencing raw FASTQ 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  70 
 
  
    EGAD00001003883 
   
  
    
    Background: Lung carcinoma-in-situ (CIS) lesions are the pre-invasive precursor to lung squamous cell carcinoma. However, only half progress to invasive cancer in three years, while a third spontaneously regress. Whether modern molecular profiling techniques can identify those pre-invasive lesions that will subsequently progress and distinguish them from those that will regress is unknown. Methods: Progressive and regressive CIS lesions were laser-captured and their genome, epigenome and transcriptome interrogated. We analysed 83 progressive lesions, 41 regressive and 33 normal epithelial control samples. DNA methylation and gene expression profiles were further validated using publicly available lung cancer data. Results: Somatic mutation burden was higher in progressive lesions than regressive CIS lesions, across base substitutions, rearrangements, and copy number changes. Driver mutations were present in both progressive and regressive CIS lesions, but were more numerous in progressive cases. Progressive and regressive CIS lesions had distinct epigenomic and transcriptional profiles, with a strong chromosomal instability signature. Gene expression, methylation and copy number profiles can all predict accurately which CIS lesions will progress to lung cancer. Conclusion: Pre-invasive CIS lesions that will subsequently progress to invasive lung cancer can be distinguished from those that will regress using molecular profiling. Progression is associated with a strong chromosomal instability signature. These findings inform the development of novel therapeutic targets. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  69 
 
  
    EGAD00001003884 
   
  
    
    The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  37 
 
  
    EGAD00001003885 
   
  
    
    The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  19 
 
  
    EGAD00001003886 
   
  
    
    In the present study, we have examined fungal and bacterial infection in brain tissue from 10 AD patients and 16 control subjects by next-generation sequencing (NGS) using the MiSeq sequencing platform (Illumina). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  41 
 
  
    EGAD00001003887 
   
  
    
    Sequencing was performed using OncoPanel v.2 (OPv2), an Agilent SureSelect custom designed bait set consisting of the coding regions of 504 genes previously linked to human cancer. Sequencing was performed on an Illumina HiSeq 2500. 8 highly differentiated, fusion-negative rhabdomyosarcoma tumor samples were sequenced. BAM files are available for download. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001003888 
   
  
    
    A SMC03_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003889 
   
  
    
    A SMC04_WGBS paired end data for skeletal muscle cells 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001003890 
   
  
    
    We conducted whole genome sequencing (WGS) to characterize the genomic alterations of 36 never-smoker Chinese patients with lung adenocarcinomas (LUADs). This dataset contains the clean fastq files for these 36 patients. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  72 
 
  
    EGAD00001003891 
   
  
    
    Transcriptome sequencing was performed on 214 patients with myelodysplasia in this study. RNA was obtained from bone marrow CD34+ cells (n=100) and/or bone marrow mononuclear cells (n=165). Transcriptome sequencing was performed for both cell fractions in 51 patients. A total of 211 patients were genotyped by targeted deep sequencing. We also studied bone marrow CD34+ cells and bone marrow mononuclear cells obtained from three healthy adults each. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  266 
 
  
    EGAD00001003892 
   
  
    
    Hepatocellular carcinoma specimens, intrahepatic cholangiocarcinoma specimens and liver normal tissues collected from 7 samples, including 44 fastq files from whole exome sequencing. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  21 
 
  
    EGAD00001003894 
   
  
    
    The dataset (vcf files) consists of rare germline variants of 68 Finnish acute myeloid leukemia patients. We performed exome sequencing and filtered the germline variants against ExAC total MAF<0.01 in two gene panels. The 35 genes in the panels studied here have previously been associated with hematological malignancies and/or solid tumors. The dataset contains only variants of the two gene panels. 
    
   
  
    
   
  68 
 
  
    EGAD00001003895 
   
  
    
    The dataset contains RNA bam files of Renal Cell Carcinoma patients and belongs to "An Empirical Approach Leveraging Tumorgrafts to Dissect the Tumor Microenvironment in Renal Cell Carcinoma Identifies Missing Link to Prognostic Inflammatory Factors" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001003898 
   
  
    
    This dataset provides whole genome sequencing data of normal/tumor pairs from 4 patients with uterine or ovarian carcinosarcoma using the HiSeq 2000 sequencing system. It includes 10 samples (4 normals, 4 uterine tumors and 2 ovarian tumors). Through separate whole genome sequencing of carcinomatous and sarcomatoid components, we analyse and compare the genomic alterations of these components. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003900 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  83 
 
  
    EGAD00001003901 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001003902 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003903 
   
  
    
    Targeted sequencing of 284 patients with AV nodal reentry tachycardia (AVNRT). Sixty-seven genes, plausibly involved in AVNRT pathophysiology, were targeted using the HaloPlex target enrichment system.
Raw paired end fastq files are provided in this dataset. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  284 
 
  
    EGAD00001003904 
   
  
    
    Comprehensive transcriptional characterization of bone marrow endothelial cells by RNA sequencing was performed to determine the molecular properties/signatures of endothelium during bone marrow recovery and niche formation.
Regenerative bone marrow endothelium was FACS-isolated from bone marrow aspirates of Acute Myeloid Leukemia patients 17 days after receiving chemotherapy (n=3). Niche-forming endothelial cells were FACS-isolated from fetal bones (gestational age 15-20 weeks) (n=3). Healthy adult bone marrow endothelial cells (n=7) were used as steady-state controls. 
cDNA was prepared using the SMARTer procedure (SMARTer Ultra Low RNA Kit, Clontech). The provided file type is FASTQ. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  13 
 
  
    EGAD00001003905 
   
  
    
    RNA-Seq files accompanying the paper titled "Somatic Histone H3 Mutations in Diffuse Intrinsic Pontine Gliomas and Non-Brainstem Paediatric Glioblastomas". 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD00001003906 
   
  
    
    October 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001003907 
   
  
    
    A Hematogenous Route for Medulloblastoma Leptomeningeal Metastases 
    
   
  
    
   
  79 
 
  
    EGAD00001003908 
   
  
    
    - Six samples from the DEV cell line: 2 controls, 2 transduced with IL4R WT and 2 transduced with IL4R mutant (I242N)
- This DEV cell line is not commercially available and was acquired from a colleague in the Netherlands 
    
   
  
    
   
  6 
 
  
    EGAD00001003909 
   
  
    
    Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  610 
 
  
    EGAD00001003910 
   
  
    
    Deep single-cell RNA sequencing data for 11,138 T cells from tumour, adjacent normal tissue and peripheral blood of treatment-naive CRC patients. The DATA ACCESS AGREEMENT is provided at https://github.com/zhangyybio/single-T-cell-data-access. Applicants can request access to the data by directly downloading it or by sending an email to cancerpku@pku.edu.cn. The process that is used to approve an application includes verifying the institution, participants and research purposes of the application. In general this process will take about two weeks. In principle, any academic research institution complying with the laws and bioethics regulation policies of China will be approved. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  11138 
 
  
    EGAD00001003911 
   
  
    
    We generated human induced pluripotent stem cell (iPSC) lines with a GFP reporter inserted in the endogenous NKX6.1 locus. Characterisation of the reporter lines demonstrated faithful GFP labelling of NKX6.1 expression during pancreas and motor neuron differentiation. We performed three independent in vitro differentiations towards the pancreatic endocrine lineage. We FACS-purified GFP positive and negative cells from stage 7 cultures, and generated Smart-Seq2 RNA-sequencing libraries for the pre-sorted cells, as well as the two GFP-sorted cell populations. Gene expression profiling by RNA-sequencing reveals that the NKX6.1-positive population closely resembles mature human beta cells and the functional evaluation of purified populations shows that the glucose-responsive beta-like cells are enriched within the NKX6.1-positive population. These reporter lines provide a valuable resource to the scientific community for the derivation of functionally relevant pancreas and neuronal cell subtypes. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  15 
 
  
    EGAD00001003912 
   
  
    
    This data belongs to the 2018 AML-ETO patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 12 paired tumor/normal samples from SNUH.
All samples have passed QC and recalibration steps while aligning to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001003913 
   
  
    
    74 CD49f single-cell methylomes are from cord blood of donor1, and 84 from cord blood of donor2. Samples from donor1 have one sequencing lane, and samples from donor2 have five sequencing lanes. This dataset was generated using Post-Bisulfite Adapter Ligation (PBAL), a bisulfite based whole genome protocol. In total this dataset consists of 494 runs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  158 
 
  
    EGAD00001003914 
   
  
    
    This dataset provides whole genome sequencing data of tumor/normal pairs from 20 patients with hepatoblastoma using the Illumina NovaSeq sequencing system. It includes 40 samples (20 normals and 20 hepatoblastoma tumors). Our comprehensive analysis identified somatic mutations, structural variations, copy number variations and non-coding variants in hepatoblastoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  40 
 
  
    EGAD00001003915 
   
  
    
    The dataset contains raw sequences (FASTQ files) from the Illumina 2x150bp paired-end RNA sequencing profiles of 11 fetal human brain samples at 7, 9, 12, 15 and 21 gestational weeks 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001003916 
   
  
    
    Cancer exomes consisting of FASTQ paired-end reads from ovary samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  19 
 
  
    EGAD00001003917 
   
  
    
    Germline exomes consisting of FASTQ paired-end reads from blood samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  19 
 
  
    EGAD00001003918 
   
  
    
    Cancer RNA-seq consisting of FASTQ paired-end reads from ovary samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001003919 
   
  
    
    We performed whole genome, whole or targeted exome sequencing for 289 individuals from India. This included 152 clinically diagnosed MODY and 137 control samples. Whole genome libraries were constructed using TruSeqNano DNA Library Preparation Kit (Illumina, CA) and sequenced on Illumina HiSeq2500 (Illumina, CA). The whole exome analysis was performed using Agilent SureSelect (Santa Clara, CA) Human All Exome kit v5 (50 Mb). Exome capture libraries were sequenced on HiSeq 2500 (Illumina, CA). Targeted exome sequencing was performed using custom probes corresponding to 1965 genes implicated in pancreatic cell biology and/or diabetes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  289 
 
  
    EGAD00001003920 
   
  
    
    WGS sequence data from cell lines BT-54/BT-88/BT-92/BT-142 
    
   
  
    
   
  7 
 
  
    EGAD00001003923 
   
  
    
    The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001003924 
   
  
    
    The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001003925 
   
  
    
    This data belongs to the 2014 AML-WGS patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 10 paired tumor/normal samples from SNUH.
All samples have passed QC and recalibration steps while aligning to the reference. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001003926 
   
  
    
    Patient-derived organoids model treatment response of metastatic gastrointestinal cancers (80 targeted exome capture samples and 2 whole-genome sequencing samples) 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  82 
 
  
    EGAD00001003927 
   
  
    
    Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 27 
    
   
  
    
   
  246 
 
  
    EGAD00001003928 
   
  
    
    This data belongs to the 2016 AML prospective_v1 patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 5 paired tumor/normal samples from SNUH.
All samples have passed QC and recalibration steps while aligning to the reference. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001003929 
   
  
    
    Exome sequencing data from homologous recombination deficient primary breast cancers as assessed by the functional RAD51 based HR test. There are 12 tumour samples, of which 10 also have matching normal. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  22 
 
  
    EGAD00001003931 
   
  
    
    Sequencing data from 1,005 cancer patients and 812 healthy controls. All samples were prepared using Safe-SeqS technology and sequenced on an Illumina MiSeq and/or HiSeq instrument. Paired FASTQ files correspond to read 1 and the index read (R and I respectively). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  212 
 
  
    EGAD00001003932 
   
  
    
    This data belongs to the 2014 AML patients' exome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 51 paired tumor/normal samples from SNUH.
All samples have passed QC and recalibration steps while aligning to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  102 
 
  
    EGAD00001003933 
   
  
    
    Whole exome sequencing (WES), shallow whole genome sequencing (sWGS), ultra-deep targeted sequencing (TS), RNA whole transcriptome sequencing (RNAseq) bam files. 
Targeted TCR sequencing in RNA (RNA-TCRseq) cram files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  267 
 
  
    EGAD00001003934 
   
  
    
    EBiSC Whole Genome Sequencing processed VCF including VEP consequences 
    
   
  
    
   
  70 
 
  
    EGAD00001003935 
   
  
    
    Sequencing of V4 hypervariable region of 16S gene of microbiota present in feces of IBD patients 
    
   
  
    
   
  315 
 
  
    EGAD00001003936 
   
  
    
    Sequencing of V4 hypervariable region of 16S gene from microbiota present in intestinal biopsies of IBD patients 
    
   
  
    
   
  107 
 
  
    EGAD00001003937 
   
  
    
    BBMRI - BIOS project - Freeze 2 - Bam files - Imprinting analysis 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  131 
 
  
    EGAD00001003940 
   
  
    
    This dataset contains whole genome sequencing data from 24 patients. For each patient a tumour and control sample has been sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to three lanes per sample have been sequenced, resulting in 112 Fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001003941 
   
  
    
    Whole-Genome Sequencing of a Healthy Aging Cohort. 
    
   
  
    
      
      Complete Genomics 
      
    
   
  511 
 
  
    EGAD00001003942 
   
  
    
    EBiSC Whole Genome Sequencing processed CRAM 
    
   
  
    
   
  70 
 
  
    EGAD00001003943 
   
  
    
    The oral and gut microbiomes of melanoma patients were characterized before the initiation of anti-PD1 immunotherapy, and compared to treatment response. Validation studies were performed in germ-free mice using stool from patients who responded/did not respond to anti-PD1 immunotherapy.
All baseline oral (n=86) and gut (n=43) microbiome samples were subject to 16S sequencing - V4 region (merged fastq files have been made available through this portal). Whole genome shotgun sequencing (WGS) was performed on a subset of fecal samples (n=25) - these files are also available (paired end reads). Also available are 16S sequencing results of stool samples from donors (n=2) used in fecal microbiota transplant and murine samples (n=12) from germ-free mice transplanted with stool from responder/non-responder patients.
The fastq files associated with this dataset are stored at ENA under the following links:
Fecal 16S – PRJEB22894
https://www.ebi.ac.uk/ena/browser/view/PRJEB22894
Oral 16S – PRJEB22874
https://www.ebi.ac.uk/ena/browser/view/PRJEB22874
Murine 16S – PRJEB22895
https://www.ebi.ac.uk/ena/browser/view/PRJEB22895
Fecal WGS – PRJEB22893
https://www.ebi.ac.uk/ena/browser/view/PRJEB22893 
    
   
  
    
   
  167 
 
  
    EGAD00001003944 
   
  
    
    Data set of 22 tumor/normal pairs of non-small cell lung cancer (NSCLC) patients. All tissue pairs were screened with MeDIP methylation enrichment sequencing and validations were performed with targeted bisulfite re-sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001003945 
   
  
    
    Bam files for PACA-CA RNA Seq analysis, for DCC release 27 
    
   
  
    
   
  219 
 
  
    EGAD00001003946 
   
  
    
    DNA from 10 human pancreatic islet samples was processed for Whole-genome Bisulphite Sequencing. The resulting libraries were sequenced on an Illumina HiSeq 2000 to generate 100bp paired-end read data. The resulting fastq.gz and mapped bam files were deposited. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001003947 
   
  
    
    18 human pancreatic islet preparations derived from 17 donors were processed for ATAC-seq. The data was generated on an Illumina HiSeq 2500 sequencing machine, producing 50bp paired end read data. The resulting fastq.gz and mapped bam files were deposited. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001003948 
   
  
    
    Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 27 
    
   
  
    
   
  39 
 
  
    EGAD00001003950 
   
  
    
    The dataset consists of samples from papillary thyroid cancer patients. A total of 292 DNA samples from blood/normal and cancer tissue were subjected to whole exome sequencing using Illumina. The fastq files generated were aligned with reference genome ‘hg19’, duplicates were marked, and realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset. 
    
   
  
    
   
  290 
 
  
    EGAD00001003951 
   
  
    
    Whole genome sequencing of 4 childhood T-ALL patients, which was further used in single-cell analysis in the paper "Single cell sequencing reveals the origin and the order of mutation acquisition in T-cell acute lymphoblastic leukemia". 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001003953 
   
  
    
    Fastq files for the whole genome sequencing data (Illumina HiSeq 2500; 32.6-fold) for two diffuse gastric cancers revealing the fusion breakpoints.
2102T: CTNND1-ARHGAP26 gene fusion (g.chr11:57,578,103-g.chr5:142,358,707)
354T: ANXA2-MYO9A gene fusion (g.chr15:60,656,550-g.chr15:72,157,966) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001003955 
   
  
    
    This dataset comprises single-cell RNA sequencing of the human Lin-CD34+38-45RA-90+49f+ phenotype isolated from 2 normal cord donors. Library preparation was performed following a modified CEL-Seq2 protocol. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001003956 
   
  
    
    Illumina platform sequencing data of SureSelect exome libraries prepared from 3 samples from one donor: a normal, primary breast cancer, and cell line derived from metastasis 
    
   
  
    
   
  3 
 
  
    EGAD00001003957 
   
  
    
    Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  23 
 
  
    EGAD00001003958 
   
  
    
    Whole exome sequencing data for 18 mucoepidermoid carcinoma samples. The samples were used for Illumina TruSeq library construction and captured using Agilent V4 exome panel. The PE fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001003959 
   
  
    
    Whole genome sequencing data for 25 adenoid cystic carcinoma samples. The samples were used for Illumina TruSeq library construction and were sequenced on an Illumina HiSeq 2000. The PE fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001003960 
   
  
    
    This data belongs to the 2014 Lung squamous patients' exome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 104 paired tumor/normal samples from SMC.
All samples have passed QC and recalibration steps while aligning to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  208 
 
  
    EGAD00001003961 
   
  
    
    Whole exome data from tumor/normal pairs for adult type ovarian granulosa cell tumor sequencing project. This data set contains 24 tumor whole exomes and 20 matched normal whole exomes generated using the Agilent V4 exome hybrid capture platform, with sequencing performed on an Illumina HiSeq 2000. This dataset contains BAM files generated by aligning paired-end reads to the hg19 reference genome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001003962 
   
  
    
    January 2018 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001003963 
   
  
    
    March 2018 cumulative data release (bams,fastqs) for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency as part of the International Human Epigenome Consortium 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  193 
 
  
    EGAD00001003964 
   
  
    
    CD8+CD69+CD103+ and CD8+CD69+CD103- T cells were flow sorted from 1 primary triple negative breast cancer, 1 primary HER2 amplified breast cancer, and 1 triple negative liver metastasis. Prior to flow sorting, fresh tumor samples were digested to produce single cell homogenates on the day of surgery. Following RNA extraction, RNASeq was performed using polyA selection. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001003965 
   
  
    
    Whole genome sequencing of Control (blood), Tumor and metastasis triplets for 12 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  36 
 
  
    EGAD00001003966 
   
  
    
    This dataset contains RNA sequencing data from 24 patients. Up to two lanes per tumour sample have been sequenced on an Illumina HiSeq 2000 instrument in paired-end mode, resulting in 58 Fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001003967 
   
  
    
    Targeted Gene Panel for 171 PTCLs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  171 
 
  
    EGAD00001003968 
   
  
    
    The Janus Serum Bank (JSB) is a population-based cancer research biobank. This dataset contains small RNA sequencing (RNA-seq) data of 520 JSB samples from cancer-free individuals. Sequencing libraries were indexed and 12 samples were sequenced per lane on a HiSeq 2500 (Illumina) to an average depth of 18 million reads per sample. The dataset files are raw FASTQ files from the sequencing machine (50bp, single-end sequencing) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  520 
 
  
    EGAD00001003969 
   
  
    
    RNA-seq analyses were performed on cDNA libraries prepared from PolyA+ RNA using the Illumina TruSeq protocol for mRNA. The final libraries were sequenced with a paired-end 2×75 bp protocol aiming at 8.5 Gb per sample for a 30x mean coverage of the annotated transcriptome. All sequencing reactions were conducted on an Illumina HiSeq instrument (Illumina, San Diego, CA, USA). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003970 
   
  
    
    For whole-exome sequencing 1 µg of DNA from fresh-frozen tumors was fragmented by sonication technology (for DNA from fresh-frozen tumors: Bioruptor, Diagenode, Liège, Belgium; for DNA from FFPE material: Covaris). The fragments were end-repaired and adaptor-ligated, including incorporation of sample index barcodes. After size selection, libraries were subjected to an enrichment process with SureSelect XT (Agilent). The final libraries were sequenced with a paired-end 2×75 bp protocol for an average coverage of 100-120x 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001003971 
   
  
    
    ICGC-TCGA DREAM Somatic Mutation Calling - Tumour Heterogeneity Challenge - WGS mapped reads 
    
   
  
    
   
  59 
 
  
    EGAD00001003972 
   
  
    
    Fastq files for PACA-CA RNA Seq analysis, for DCC release 27 
    
   
  
    
      
      ILLUMINA 
      
      Illumina HiSeq 2500 
      
    
   
  219 
 
  
    EGAD00001003973 
   
  
    
    This dataset contains whole exome sequencing data from 24 patients. The Agilent SureSelect Human All Exon 50-Mb target enrichment kit was used to capture all human exons for deep sequencing. For each patient a tumour and control sample has been sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to three lanes per sample have been sequenced, resulting in 118 Fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001003974 
   
  
    
    Raw data files for the German Epigenome Project (DEEP), IHEC/EpiRR submission of 2017.
metadata available at: http://deep.dkfz.de/#/experiments 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001003975 
   
  
    
    Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  128 
 
  
    EGAD00001003976 
   
  
    
    Well-differentiated, dedifferentiated, and matched normal tissues from liposarcoma (51 specimens) from 17 patients were obtained for whole exome sequencing. Tumors from 9 patients were used for RNA sequencing. The bam files are made available in this dataset. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  51 
 
  
    EGAD00001003977 
   
  
    
    RNA was extracted from formalin-fixed and paraffin embedded tumors of a large cohort of bladder cancer patients before treatment with anti-PD-L1. RNA was sequenced using a capture based approach (exome capture, RNA access). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  348 
 
  
    EGAD00001003978 
   
  
    
    This data belongs to the WGS-Lung Cancer patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta).
There are 30 paired tumor/normal samples from Samsung Hospital.
All samples have passed QC and recalibration steps while aligning to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001003979 
   
  
    
    This dataset contains ChIP sequencing data from 24 patients. ChIP of 5–10 mg flash-frozen primary ependymoma tumour was performed using 5 mg H3K27ac antibody per ChIP experiment. The enriched DNA has been sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to two lanes per sample have been sequenced, resulting in 70 Fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001003980 
   
  
    
    Desmoplastic small round cell tumor (DSRCT) RNAseq data. 14 tumor samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001003981 
   
  
    
    This dataset pertains to transcriptome sequencing of paired RNA samples. RNA was isolated from the tumor and adjacent normal tissues of 12 patients (24 samples). We performed rRNA removal followed by total RNA sequencing on the Illumina HiSeq platform. We have uploaded TopHat2 aligned BAM files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001003982 
   
  
    
    This dataset contains whole-genome sequencing data of tumors from 9 patients with mycosis fungoides. The data was generated using the Illumina HiSeq X-Ten platform. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001003983 
   
  
    
    This dataset contains RNA-sequencing data of tumors from 8 patients with mycosis fungoides. The data was generated using the Illumina HiSeq 4000 platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001003984 
   
  
    
    Each tumor sample was cut into three pieces, yielding two end-pieces for cryovials and a middle portion placed in 10% buffered formalin. End pieces were homogenized manually and with a paddle blender (Stomacher). All paraffin-embedded blocks, including formalin-fixed tumor samples and molecular-fixed fallopian tubes, were sectioned and stained with hematoxylin and eosin prior to expert histopathological review to confirm the presence of high grade serous carcinoma. Homogenized end pieces were then flash frozen and later used for WGS. For all tumor and matched normal (peripheral blood) samples, DNA was extracted with the Qiagen AllPrep DNA/RNA kit (tumor samples from patients 25,26,28-32) or the Qiagen Blood and Tissue Extraction Kit (tumor samples from patients 1-4,7,9-17, and all blood samples). For all tumor and normal samples, DNA extraction was followed by library construction and sequencing using Illumina HiSeq2500 whole genome shotgun v4 chemistry with paired-end 125bp reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  89 
 
  
    EGAD00001003985 
   
  
    
    Each tumor sample was cut into three pieces, yielding two end-pieces for cryovials and a middle portion placed in 10% buffered formalin. End pieces were homogenized manually and with a paddle blender (Stomacher). All paraffin-embedded blocks, including formalin-fixed tumor samples and molecular-fixed fallopian tubes, were sectioned and stained with hematoxylin and eosin prior to expert histopathological review to confirm the presence of high grade serous carcinoma. Homogenized end pieces were then flash frozen, and RNA was extracted using the miRNeasy Mini kit. Nanodrop was used to assess quality (260/280) and quantity. Total RNA samples were also QC checked using the Caliper HT RNA HiSens assay. Samples ranging from 60-255ng RNA were re-arrayed into a 96-well plate. 5'-RACE PCR was carried out as described in "The interface of malignant and immunologic clonal dynamics in high-grade serous ovarian cancer" (Zhang et al.). Briefly, this involved first round and nested PCR with TRB (TCR beta chain) and IGH (immunoglobulin heavy chain) gene-specific primers. The indexed libraries were sequenced on the Illumina HiSeq platform with paired-end 250bp reads using v2 chemistry reagents. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  442 
 
  
    EGAD00001003986 
   
  
    
    A total of 192 positions per patient were deeply sequenced in each corresponding tumor sample (including 4 experimental controls and SNVs predicted to originate at each node of the sample phylogeny, see Zhang et al. for details). Genomic DNA templates were used as starting material to generate PCR products. PCR was set up using Phusion DNA polymerase according to the manufacturer’s specifications. The standard PCR conditions used were an initial denaturation at 98C for 30 seconds, followed by 35 cycles of 98C for 10 seconds, 60C for 15 seconds and 72C for 8 seconds, and a final extension at 72C for 10 minutes. PCR products were cleaned up using PCRClean DX beads. Amplicons were pooled by template for sequencing sample preparation. Sample preparation involved a second round of amplification using Phusion DNA polymerase with 6 PCR cycles, with primers specified in Zhang et al. DNA quality was assessed using the Caliper LabChip GX HighSensitivity Assay and DNA quantity was measured using a Qubit dsDNA HS assay kit on a Qubit fluorometer. The indexed libraries were pooled together and sequenced on the Illumina NextSeq500 platform with paired-end 150bp reads using v2 chemistry reagents. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  180 
 
  
    EGAD00001003987 
   
  
    
    This dataset pertains to whole exome sequencing of paired DNA samples from gingivo-buccal oral cancer patients. DNA was isolated from the tumor and blood tissues of 47 patients (94 samples). We performed Nextera exome capture and sequenced the exome libraries on the Illumina HiSeq platform. We have uploaded BWA-ALN aligned BAM files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  94 
 
  
    EGAD00001003988 
   
  
    
    Paired-end whole exome sequencing of fine-needle aspirates from 51 multiple myeloma patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  176 
 
  
    EGAD00001003989 
   
  
    
    Longitudinal biopsies from a melanoma patient who initially responded to MEK plus CDK4/6 inhibitor therapy were whole exome sequenced to identify potential resistance mutations. The biopsies included normal tissue, pre-treatment, on-treatment, and several post-resistance timepoints. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001003990 
   
  
    
    Shallow sequencing of metastatic colorectal cancer samples for the Angiopredict and Nobev cohorts described in:
van Dijk et al., JCO, in revision 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  186 
 
  
    EGAD00001003991 
   
  
    
    Complete clinical phenotypic description of all patients; the number listed represents all the samples linked to the 609 patients present in the dataset. Please consult the key file to visualise the sample-patient relationship 
    
   
  
    
   
  1094 
 
  
    EGAD00001003992 
   
  
    
    Paired-end RNA-seq of whole human islets from 64 human pancreatic donors. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2500 
      
    
   
  64 
 
  
    EGAD00001003993 
   
  
    
    The present series corresponds to 161 RNA-seq samples from tumors with matched WES or WGS. Hepatocellular carcinoma (HCC) accounts for more than 90% of liver cancers, and is a major health problem. It is the 3rd cause of cancer-related mortality. Advances in genomic analyses have formed a comprehensive understanding of different underlying pathobiological layers resulting in hepatocarcinogenesis. Thus, the development of next-generation sequencing technologies has made it possible to generate more comprehensive catalogues of somatic alteration events (single nucleotide substitutions, structural variations, and epigenetic changes) in liver cancer genome than ever before. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  161 
 
  
    EGAD00001003994 
   
  
    
    The present series corresponds to 24 whole genome sequencing (12 Tumoral/Non-tumoral pairs). Hepatocellular carcinoma (HCC) accounts for more than 90% of liver cancers, and is a major health problem. It is the 3rd cause of cancer-related mortality. Advances in genomic analyses have formed a comprehensive understanding of different underlying pathobiological layers resulting in hepatocarcinogenesis. Thus, the development of next-generation sequencing technologies has made it possible to generate more comprehensive catalogues of somatic alteration events (single nucleotide substitutions, structural variations, and epigenetic changes) in liver cancer genome than ever before. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001003995 
   
  
    
    Fifteen pleomorphic invasive lobular carcinoma samples and their matched normal controls were subjected to targeted exome sequencing using the Beijing Genomics Institute TumorCare gene panel. Genomic DNA samples were randomly fragmented and captured libraries of each exome were sequenced on an Illumina HiSeq 2000 system. CRAM files are provided for each tumor and normal pair. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001003996 
   
  
    
    Illumina platform sequencing of whole genome libraries prepared from normal, Barrett's oesophagus and oesophageal cancer samples from 44 donors 
    
   
  
    
   
  - 
 
  
    EGAD00001003997 
   
  
    
    From 2nd trimester human foetuses we derived liver and intestinal stem cells. These were clonally expanded until enough material was available for whole genome sequencing. For each foetus, reference tissue (skin or bulk liver) was also sequenced to determine all germline variants. These were subtracted from the clones to determine all somatic mutations that had been acquired during embryonic and fetal development. 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  50 
 
  
    EGAD00001003999 
   
  
    
    Deep single-cell RNA sequencing data for 12346 T cells from tumour, adjacent normal tissue and peripheral blood of treatment-naïve NSCLC patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  12346 
 
  
    EGAD00001004000 
   
  
    
    Targeted gene screen of cell line tumours for testing the new V4 Colorectal gene panel. 
This dataset contains all the data available for this study on 2018-03-07. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  53 
 
  
    EGAD00001004001 
   
  
    
    Targeted gene screen of FFPEs, cell lines and primary CRC tumours for testing the new V4 Colorectal gene panel. 
This dataset contains all the data available for this study on 2018-03-07. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  92 
 
  
    EGAD00001004007 
   
  
    
    Data supporting: "Esophageal adenocarcinoma organoid cultures recapitulate human disease heterogeneity and provide a model for clonality studies and precision therapeutics." Li et al.
WGS (BAM files)
RNAseq (BAM files)
Tumours, organoids, normals 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  53 
 
  
    EGAD00001004008 
   
  
    
    This dataset includes BAM files from sequencing of NPC blood/tumor pairs: 21 pairs, 42 BAM files in total. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001004011 
   
  
    
    This dataset contains genome data from the 2015 AML-ETO patients, aligned to the human reference (human_g1k_v37.fasta).
There are 10 paired tumor/normal samples from SNUH.
All samples passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001004012 
   
  
    
    This dataset contains genome data from additional 2015 AML-ETO patients, aligned to the human reference (human_g1k_v37.fasta).
There are 2 paired tumor/normal samples from SNUH.
All samples passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001004013 
   
  
    
    Organoids are self-organizing 3D structures grown from stem cells that recapitulate essential aspects of organ structure and function. Here we describe a method to establish long-term culture conditions of human airway epithelial organoids that contain all major cell populations and allow personalized human disease modelling. We collected macroscopically inconspicuous lung tissue from non-small-cell lung cancer (NSCLC) patients undergoing medically indicated surgery and isolated epithelial cells to engineer 3D organoids. We exploit the potential to derive sub-clones from AOs to demonstrate the feasibility of CRISPR gene editing. Finally, we show that AOs readily allow modelling of viral infections such as RSV and for the first time demonstrate the possibility to study neutrophil-epithelium interaction in an organoid model. Taken together, we anticipate that human AOs will find broad applications in the study of adult human airway epithelium in health and disease. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001004014 
   
  
    
    Whole Exome Sequencing Data from paediatric solid tumors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001004016 
   
  
    
    Sebaceous carcinomas (SeC) are cutaneous malignancies that, in rare
cases, metastasize and prove fatal. Here we report whole exome
sequencing on 32 SeC, revealing distinct mutational classes that
explain both cancer ontogeny and clinical course. A UV-damage
signature predominated 10/32 samples, while 9 were instead defined by
microsatellite instability (MSI) mutations. UV-damage SeC exhibited
poorly differentiated, infiltrative histopathology compared to MSI
signature SeC (p = 0.003), features previously associated with
dissemination. Strikingly, UV-damage SeC transcriptomes and anatomic
distribution closely resemble those of cutaneous squamous cell
carcinomas (SCC), implicating sun-exposed keratinocytes as a cell of
origin. Like SCC, this UV-damage subclass harbors a high somatic
mutation burden with >50 mutations/Mb, predicting immunotherapeutic
response. In contrast, ocular SeC acquire far fewer mutations without
a dominant signature, but show frequent truncating mutations in the
ZNF750 epidermal differentiation regulator. Our data exemplify how
different mutational processes convergently drive histopathologically
related but clinically distinct cancers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  79 
 
  
    EGAD00001004018 
   
  
    
    The aim of CAGEKID is to carry out comprehensive detection of DNA markers for conventional (clear cell) renal carcinoma. The project includes complete analysis of somatic and constitutional DNA variation, methylation patterns and expression in a large number of constitutional/tumor pairs. CAGEKID is a part of the International Cancer Genome Consortium, ICGC. 
    
   
  
    
   
  708 
 
  
    EGAD00001004020 
   
  
    
    Amplicon data of tumor samples generated for validation of WES findings and further sub clonal mapping 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  78 
 
  
    EGAD00001004021 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001004022 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001004023 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001004027 
   
  
    
    This dataset contains WES data from lung cancer patients, aligned to the human reference (human_g1k_v37.fasta).
There are 36 paired tumor/normal samples from Samsung Hospital.
All samples passed the QC and recalibration steps during alignment to the reference. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  72 
 
  
    EGAD00001004028 
   
  
    
    WGS sequencing for 63 cases (126 samples) from the ICGC ESAD-UK project
Tumours 50x Normals 30x 
HiSeq X
BAM files
These samples are earmarked for inclusion in ICGC release 27 (deferred to release 28) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004029 
   
  
    
    WGS sequencing for 43 cases (86 samples) from the ICGC ESAD-UK project
Tumours 50x Normals 30x 
HiSeq X
BAM files
These samples are earmarked for inclusion in ICGC release 28 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001004031 
   
  
    
    AngioPredict CNV and Exome data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  527 
 
  
    EGAD00001004032 
   
  
    
    Fastq files of whole-genome bisulfite sequence of non-cancerous tissue of HBV-associated hepatocellular carcinoma 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001004033 
   
  
    
    Fastq files of whole-genome bisulfite sequence of tumor tissue of HBV-associated hepatocellular carcinoma 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001004034 
   
  
    
    RNA-seq data (bam files) from the hypothalamus of 4 individuals with Prader-Willi syndrome and 4 age-matched control individuals. Detailed information about the study design, case-control matching and RNA-seq data processing is provided in the accompanying publication [Bochukova et al (2018) Cell Reports]. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001004035 
   
  
    
    Exome sequencing was performed on 15 unrelated female patients suffering from primary infertility due to Ovarian Meiotic Defects (OMD). Each reference number corresponds to one of the tested subjects. DNA was extracted from saliva using the Oragene saliva DNA collection kit (DNAgenotek Inc., Ottawa, Canada). Exome capture was performed with the Agilent V5 kit and sequencing was performed on an Illumina HiSeq 2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001004036 
   
  
    
    Whole exome sequencing of non-brainstem paediatric high grade glioma from the HERBY phase II randomised trial.
DNA from 86 cases was subjected to Illumina paired end whole exome sequencing using a customised SureSelect Human All Exon V6 capture set. Germline DNA from whole blood was sequenced for 83 cases. 26 cases were sequenced from both fresh frozen tissue and FFPE material, 10 were sequenced from only fresh frozen material and 50 from only FFPE. Data is provided as bwa aligned BAM files 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  195 
 
  
    EGAD00001004037 
   
  
    
    The aim is to characterise the cancer gene landscape in CLL, particularly in cases with a mutated POT1 gene. Treatment-naive CLL cases will be interrogated by targeted exome sequencing using a cancer gene panel. 
This dataset contains all the data available for this study on 2018-03-14. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  123 
 
  
    EGAD00001004038 
   
  
    
    Identification of genes involved in congenital disorders of glycosylation and 3-methylglutaconic aciduria.
There are more than 100 genes known for congenital disorders of glycosylation and new disorders are discovered each year.  We included patients with a so far unsolved glycosylation disease.
The diagnostic group 3-methylglutaconic aciduria is a heterogeneous group of disorders mostly caused by abnormal phospholipid synthesis or in association with mitochondrial dysfunction.  We included patients with a so far unsolved disease and 3-methylglutaconic aciduria.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-03-14. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001004039 
   
  
    
    Albinism is a genetically heterogeneous, rare condition affecting 1:17000 in the Western world (but more frequent in Africa) whose main feature is a profound visual impairment, characterised by foveal hypoplasia, abnormal chiasmatic connections, nystagmus and photophobia.  All these features result in severely altered visual acuity (<0.1), absent depth perception and poor night vision.  People with albinism are primarily visually handicapped.  In addition, for some types of albinism, the visual phenotype can present with partial or total hypopigmentation, resulting in a secondary phenotype which can lead to skin cancer if skin is not adequately protected.  Recently a new syndrome has been described, FHONDA, with the same visual abnormalities as albinism but without pigment alteration.  The traditional classification differentiates Oculocutaneous Albinism (OCA), where hypopigmentation involves hair, skin and eyes, from Ocular Albinism (OA), where hypopigmentation only affects the eyes.  These are non-syndromic types of albinism.  Some syndromic forms (Hermansky-Pudlak=HPS, Chediak-Higashi=CHS) affect cells beyond pigment cells, present in the lungs, immune system, platelets and intestines, resulting in more severe phenotypes that can be fatal.  Mutations in at least 19 genes are associated with the corresponding types of albinism.  Most hospitals will only diagnose the most frequent cases using traditional Sanger and MLPA approaches.  Some will use CGH arrays.  We aim to diagnose all cases of albinism through the Albinochip proposal, which combines a first Sequenom step for known mutations with subsequent NGS approaches.  In some cases we fail to find a second mutation; these are good candidates for further full exome analyses.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-03-14. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004040 
   
  
    
    Whole Exome Sequencing of trios (proband + parents) or probands only with Neonatal Diabetes Mellitus (NDM) or Congenital Hyperinsulinism of Infancy (CHI) of unknown genetic origin.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-03-14. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  57 
 
  
    EGAD00001004041 
   
  
    
    As a contribution to the International Cancer Genome Consortium, exome sequencing of 142 Japanese gastric cancers with various histological subtypes has been conducted. This study aims to identify unique and common driver genes and molecular subtypes in Japanese gastric cancer. Please refer to the ICGC website for details: http://icgc.org/icgc/cgp/69/420/1012357 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  142 
 
  
    EGAD00001004042 
   
  
    
    As a contribution to the International Cancer Genome Consortium, exome sequencing of 102 Japanese gastric cancers with various histological subtypes has been conducted. This study aims to identify unique and common driver genes and molecular subtypes in Japanese gastric cancer. Please refer to the ICGC website for details: http://icgc.org/icgc/cgp/69/420/1012357 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  102 
 
  
    EGAD00001004043 
   
  
    
    The dataset consists of 64 fastq files from 23 patients with acute promyelocytic leukemia. Exome sequencing was conducted at several stages (Diagnosis, Remission, Relapse) for each patient. For 5 patients, only Diagnosis and Relapse samples are available. 
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  64 
 
  
    EGAD00001004044 
   
  
    
    Files from whole exome sequencing of eight tumors from eight pancreatic cancer patients along with matched PanIN precursor lesion(s) and a matched normal tissue. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001004045 
   
  
    
    Whole Genome Sequencing was applied to 32 SRCC patients and the raw data were subjected to standard procedures. Files with genomic variant calls were obtained at the last step. 
    
   
  
    
      
      HiSeq X Ten;ILLUMINA 
      
    
   
  64 
 
  
    EGAD00001004046 
   
  
    
    Analysis of the reference epigenomes and regulatory landscape of CLL as a whole and its major clinico-biological subtypes (with mutated and unmutated IGHV) in the light of the normal B-cell differentiation.
We have extensively characterized the reference epigenomes of seven primary chronic lymphocytic leukemia samples (CLLs) with mutated (n=5) and unmutated IGHV (n=2) as well as several mature B-cell subpopulations (naive B cells from blood and tonsil, germinal center B cells, memory B cells and plasma cells from tonsil) using genome-wide maps of six histone marks (H3K4me3, H3K4me1, H3K27ac, H3K36me3, H3K9me3 and H3K27me3), DNA accessibility (ATAC-seq), DNA methylation (whole-genome bisulfite sequencing) and gene expression (RNA-seq). Furthermore, we have mapped the regulatory chromatin landscape of 100 additional CLL cases using ChIP-seq of H3K27ac and ATAC-seq and linked these data to additional layers of information (whole-genome and/or whole-exome sequencing (WGS/WES), RNA-seq and DNA methylation microarrays) studied in the context of the International Cancer Genome Consortium (ICGC). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  303 
 
  
    EGAD00001004047 
   
  
    
    Peripheral blood mononuclear cells (PBMC) of CLL patients were isolated by density-gradient centrifugation over Linfosep (Biomedics, Madrid, Spain). B cells were purified with a CD19+ magnetic-bead system (MidiMACS, Miltenyi Biotec, Bergisch Gladbach, Germany) according to the manufacturers’ instructions. Mean B-cell purity was >99% and the mean percentage of CD5+/CD19+ cells after purification was >98%, as measured by flow cytometry. Total RNA was extracted from purified cells in a single step using TriPure Isolation Reagent (Roche Applied Science, Vilvoorde, Belgium). Whole transcriptome sequencing libraries were prepared using the TruSeq Stranded Total RNA Sample Preparation Kit (Illumina). Libraries underwent 2 × 76 bp paired-end sequencing on a HiSeq 2500 instrument (Illumina). The median number of paired-end reads was 60.5 million (range, 49.7-79.7 million). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001004048 
   
  
    
    This dataset contains raw sequences (BAM files) of P1 trio: mother, father and affected child (P1). Whole exome sequencing (WES). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001004051 
   
  
    
    Fastq files of 345 Japanese gastric cancers 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  345 
 
  
    EGAD00001004052 
   
  
    
    Ultra low coverage sequencing results from the project 'Rapid multiplex small DNA sequencing on the MinION nanopore sequencing platform'. Sequencing data of sample NA12877 and NA12878 generated from 3 nanopore sequencing runs are included in this dataset. 
    
   
  
    
      
      MinION 
      
    
   
  6 
 
  
    EGAD00001004055 
   
  
    
    The dataset "RNA-seq colorectal adenomas NKI-AvL TGO series NGS-ProToCol" includes 2 x 30 fastq files from paired-end total RNA sequencing on Illumina HiSeq2500 for 30 snap-frozen colorectal adenomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004056 
   
  
    
    The dataset "RNA-seq colorectal carcinomas NKI-AvL TGO series NGS-ProToCol" includes 2 x 30 fastq files from paired-end total RNA sequencing on Illumina HiSeq2500 for 30 snap-frozen colorectal carcinomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004057 
   
  
    
    The dataset "RNA-seq normal adjacent colon NKI-AvL TGO series NGS-ProToCol" includes 2 x 18 fastq files from paired-end total RNA sequencing on Illumina HiSeq2500 for 18 snap-frozen normal adjacent colon tissues. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004058 
   
  
    
    The dataset "RNA-seq colorectal adenomas NKI-AvL TGO series Gut2009" includes 2 x 32 fastq files from paired-end mRNA sequencing on Illumina HiSeq2500 for 32 snap-frozen colorectal adenomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001004059 
   
  
    
    The dataset "RNA-seq colorectal carcinomas NKI-AvL TGO series Gut2009" includes 2 x 29 fastq files from paired-end mRNA sequencing on Illumina HiSeq2500 for 29 snap-frozen colorectal carcinomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  29 
 
  
    EGAD00001004061 
   
  
    
    200PT : WG Aligned Sequence (bam)/ Aligned WG sequence data in this dataset are from CPCGene Tumour/Normal Pairs used in the 200PT Study 
    
   
  
    
   
  404 
 
  
    EGAD00001004062 
   
  
    
    This dataset includes whole genome sequencing of 198 epileptic individuals.
Library preparation and whole-genome sequencing: gDNA was cleaned up using the ZR-96 DNA Clean & Concentrator™-5 Kit (Zymo) prior to being quantified using the Quant-iT™ PicoGreen dsDNA Assay Kit (Life Technologies) and its integrity assessed on agarose gels. Libraries were generated using the TruSeq DNA PCR-Free Library Preparation Kit (Illumina) according to the manufacturer’s recommendations. Libraries were quantified using the Quant-iT™ PicoGreen dsDNA Assay Kit (Life Technologies) and the Kapa Illumina GA with Revised Primers-SYBR Fast Universal kit (Kapa Biosystems). Average fragment size was determined using a LabChip GX (PerkinElmer) instrument. The libraries were denatured in 0.05N NaOH and diluted to 8pM using HT1 buffer. The clustering was done on an Illumina cBot and the flowcell was run on a HiSeq 2500 for 2x125 cycles (paired-end mode) using v4 chemistry and following the manufacturer's instructions. A phiX library was used as a control and mixed with libraries at 0.01 level.
Bioinformatics: The Illumina control software was HCS 2.2.58, the real-time analysis program was RTA v. 1.18.64. Program bcl2fastq v1.8.4 was used to demultiplex samples and generate fastq reads. The filtered reads were aligned to reference Homo_sapiens assembly b37. Each readset was aligned to create a Binary Alignment Map file (.bam). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  198 
 
  
    EGAD00001004063 
   
  
    
    EZH2, H3K4me3, H3K27ac and H3K27me3 ChIP-seq data consisting of fastq single-end reads from peripheral blood CLL cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  34 
 
  
    EGAD00001004064 
   
  
    
    This dataset contains high-throughput RNA-sequencing of 12 samples, each sample comprising neural precursor cells derived from human induced pluripotent stem cells from individuals with and without the 16p13.11 microduplication (a copy number variant associated with a range of neurodevelopmental disorders). 4 samples derive from patients carrying the 16p13.11 microduplication, and 8 derive from unaffected family controls.
RNA samples were processed to deplete rRNA using the TruSeq Stranded Total RNA with Ribo-Gold kit. Libraries were then sequenced using the NextSeq 500/550 High-Output v2 Kit on the Illumina NextSeq 550 platform, to produce 75 base pair paired-end sequencing reads at an average depth of around 100 million reads per sample. Raw sequencing reads data are stored in two FASTQ files per sample for these paired-end reads. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  12 
 
  
    EGAD00001004066 
   
  
    
    We generated 42 human whole-exome sequencing data sets from fresh-frozen (FF) and FFPE samples. These samples include normal and tumor tissues from two different organs (liver and colon), which we extracted with three different FFPE extraction kits (QIAamp DNA FFPE Tissue kit and GeneRead DNA FFPE kit from Qiagen, Maxwell™ RSC DNA FFPE Kit from Promega). Variant calling analysis shows a very high rate of concordance between matched FF / FFPE pairs and equivalent performance for the three kits we analyzed. We find a significant variation in the difference of total number of variants called between FF and FFPE samples for the three different FFPE DNA extraction kits. Coverage analysis shows that FFPE samples have poorer indicators than FF samples, yet the coverage quality remains above accepted thresholds. We detect limited but significant variations in coverage indicator values between the three FFPE extraction kits. Globally, the GeneRead and QIAamp kits have better variant calling and coverage indicators than the Maxwell kit on the samples used in this study, although this kit performs better on some indicators and has advantages in terms of practical usage. Taken together, our results confirm the potential of FFPE sample analysis for clinical genomic studies, but also indicate that the choice of an FFPE DNA extraction kit should be made with careful testing and analysis beforehand in order to maximize the accuracy of the results. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001004067 
   
  
    
    Custom panel sequencing data from 1714 clear cell renal cell carcinoma samples 
    
   
  
    
   
  1714 
 
  
    EGAD00001004068 
   
  
    
    Whole-genome, whole-exome and transcriptome sequencing of pancreatic ductal adenocarcinomas from young adults reveals recurrent NRG1-fusions in KRAS wild-type tumors. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  36 
 
  
    EGAD00001004069 
   
  
    
    Identification of tumor-specific effects on gene expression profile of regulatory T cells and conventional T cells in humans, investigation of the clonal origin of regulatory T cells and impact analysis of tumor-specific conversion of conventional T cells into induced regulatory T cells on the peripheral regulatory T cell repertoire in humans. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2304 
 
  
    EGAD00001004070 
   
  
    
    RNA sequencing of non-brainstem paediatric high grade glioma from the HERBY phase II randomised trial.
RNA from fresh frozen surgical tissue in 20 cases was subjected to Illumina whole transcriptome paired-end sequencing. Data are provided as paired-end FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001004071 
   
  
    
    We integrated genomic (whole-genome sequencing, WGS) and transcriptomic (polyA-enriched RNA-Seq) sequencing data from 90 NSCLC cases and comprehensively identified the distinct genomic features of Chinese NSCLC patients. 
    
   
  
    
   
  90 
 
  
    EGAD00001004072 
   
  
    
    200PT : SNV vcf files. SNV calls generated using SomaticSniper and PhyloWGS, from the CPCGene 200PT Subclonality study 
    
   
  
    
   
  293 
 
  
    EGAD00001004073 
   
  
    
    200PT : CNA vcf files. Copy Number Aberration calls generated using TITAN and PhyloWGS, from the CPCGene 200PT Subclonality study 
    
   
  
    
   
  292 
 
  
    EGAD00001004074 
   
  
    
    Genome-wide profiling of DNA methylation levels by RRBS in 150 glioblastoma tumor samples. Patients were selected to represent the general population of glioblastoma patients based on the Austrian Brain Tumor Registry. These DNA methylation profiles were created for the validation of the glioblastoma progression study (GBMatch) and consist of 106 profiles from FFPE samples and 44 profiles from fresh-frozen samples. For the 44 fresh-frozen samples, WGS data (43 genomes) and RNA-seq data (37 transcriptomes) were also produced for validation purposes. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  150 
 
  
    EGAD00001004075 
   
  
    
    For this tissue dataset, we applied low-pass whole genome sequencing to 98 non-advanced and advanced adenomas. As a small number of lesions were sequenced multiple times, this dataset consists of 103 fastq files. These adenomas were classified as lesions with low-risk or high-risk of progression, according to the presence of specific DNA copy number changes (Carvalho et al, CancerPrevRes, 2018). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  103 
 
  
    EGAD00001004076 
   
  
    
    37 transcriptomes derived from fresh-frozen glioblastoma tumor samples. These transcriptomes have been produced for validation purposes and match the corresponding RRBS and WGS profiles in that DNA and RNA were extracted from the same tumor samples. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  37 
 
  
    EGAD00001004077 
   
  
    
    43 low-coverage genomes derived from fresh-frozen glioblastoma tumor samples. These genomes have been produced for validation purposes and match the corresponding RRBS and RNA-seq profiles in that DNA and RNA were extracted from the same tumor samples. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  43 
 
  
    EGAD00001004078 
   
  
    
    For this tissue dataset, we applied low-pass whole genome sequencing to 96 advanced adenomas. Advanced adenomas were classified as lesions with low-risk or high-risk of progression, according to the presence of specific DNA copy number changes (Carvalho et al, CancerPrevRes, 2018). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  96 
 
  
    EGAD00001004079 
   
  
    
    RNA-seq data from sorted populations from 10 CML samples and 4 normal bone marrow samples. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001004080 
   
  
    
    This dataset contains four data files relating to the Cambridge Interval SomaLogic pQTL study to go with the corresponding genetic data: 
(1) Genome-wide pQTL summary associations for each analyte.
(2) Mapping table for the genetic variants analysed - containing rsID, position and allele information.
(3) Normalised quantitative readouts for each analyte, along with covariates used for the pQTL analysis.
(4) Table of SOMAmer analytes mapped to their protein targets.
Please see the readme file in the dataset for more information. 
    
   
  
    
   
  3301 
 
  
    EGAD00001004081 
   
  
    
    Smart-seq2 protocol was used to perform single cell RNA-sequencing on 465 immune cells. The immune cells analysed include 215 HLA-DQ2: gluten-(DQ2.5-glia-α1, -α2, -ω1, and -ω2) tetramer-sorted T cells, 247 transglutaminase 2 (TG2)-positive plasma cells from intestinal biopsy or peripheral blood from celiac disease patients, and 3 unassigned cells in 3 batches. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001004082 
   
  
    
    In this study, we applied an Illumina HiSeq platform-based high-coverage WES technique, which, in addition to the exons, allows the determination of 5′- and 3′-UTRs, promoters to a certain length, along with off-target sequences, such as introns, intergenic regions and infecting viruses.
Brains from suicide victims (n = 23; 15 males and eight females) who had suffered from major depressive disorder and from control participants (n = 21; 14 males and seven females) who had died from other causes were used for whole-exome sequencing.
Alignment files in bam format were uploaded. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001004084 
   
  
    
    ChIP-Seq - CEBPE - REH. The ETV6/RUNX1 translocated acute lymphoblastic leukaemia cell line. 
REH was used to perform ChIP-Seq using a CEBPE antibody. Cells were fixed in 1% formaldehyde for 10 min prior to preparation of chromatin using Active Motif Express ChIP-IT. 2 µg of antibody (anti-CEBPE, Atlas Antibodies HPA002928) was added to 25 µg of chromatin overnight at 4 °C with rotation. Duplicate reactions were pooled and purified. 10 ng of ChIP and input DNA was used for Illumina NGS library preparation (NEBNext ChIP-Seq Library kit; New England Biolabs). CEBPE ChIP and input DNA samples were sequenced on a MiSeq using the 150 bp Kit v3 (paired end) and on a HiSeq 2500 using 2x101 version 4 paired-end chemistry (Illumina), respectively. Reactions were performed in duplicate. 
shCEBPE RNA-Seq - REH. 
REH cells were lentivirally transduced with a pTRIPZ shRNA vector for transcriptional profiling of CEBPE. Two controls (empty and non-targeting) and two CEBPE shRNAs (V3THS_150517(A13), V3THS_404312(G3) Dharmacon, GE) were transduced into REH cells. Cells were treated with 1 µg/ml doxycycline for 144 h and total RNA was purified using Qiagen RNeasy. Knockdown of CEBPE was validated by qRT. RNA integrity was >7.7 for all samples. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit and sequenced on an Illumina HiSeq 2500 using 2x101 version 4 paired-end chemistry. Three biological replicates of each sample were prepared. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001004085 
   
  
    
    In this study, we examined microbial infection in brain tissue from 9 control samples from healthy patients and 10 samples from patients diagnosed with multiple sclerosis, by next-generation sequencing (NGS) on the MiSeq sequencing platform (Illumina). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  19 
 
  
    EGAD00001004086 
   
  
    
    We will take a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and use flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We will grow these up into colonies, then whole genome sequence each colony. Somatic mutations will act as a unique barcode for each clone. We will then design a panel for targeted resequencing of the mutations that we find. It will then be possible to look for these mutations in the peripheral blood over several years, to see the dynamics of how HSCs contribute to the peripheral blood in health.
This dataset contains all the data available for this study on 2018-04-19. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  207 
 
  
    EGAD00001004087 
   
  
    
    We took a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and used flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We grew these up into colonies, then whole genome sequenced each colony. Somatic mutations act as a unique barcode for each clone. We have designed a panel for targeted resequencing of the mutations that we found. We are now looking for these mutations in the peripheral blood, to see the dynamics of how HSCs contribute to the peripheral blood in health. 
This dataset contains all the data available for this study on 2018-04-19. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004088 
   
  
    
    Multiple primary tumors (MPT) affect a substantial proportion of cancer survivors and may result from various causes including inherited predisposition. Currently, germline genetic testing of MPT cases for cancer predisposition gene (CPG) variants is mostly targeted by tumor type. We ascertained pre-assessed MPT cases from genetics centers (defined as ≥2 primaries by age 60 years or ≥3 by 70) and performed whole genome sequencing (WGS) on 460 individuals from 440 families. Despite previous negative genetic assessment/molecular investigations, pathogenic variants in moderate and high-risk CPGs were detected in 67/440 (15.2%) of probands. WGS detected variants that would not be (or were not) detected by targeted resequencing strategies including structural variants at low frequency (6/440 (1.4%) of probands). In most individuals with a germline variant assessed as pathogenic or likely pathogenic (P/LP), at least one of their tumor types was characteristic of variants in the relevant CPG. However, in 29 probands (42.2% of those with a P/LP variant) the tumor phenotype appeared discordant. The frequency of individuals with truncating or splice site CPG variants and at least one discordant tumor type was significantly higher than a control population (χ² = 43.642, P < 0.0001). 2/67 (3%) of probands with P/LP variants had evidence of multiple inherited neoplasia allele syndrome (MINAS) with deleterious variants in two CPGs. Summing together variant detection rates from a similarly ascertained previous MPT case series, the present results suggest that first-line comprehensive CPG analysis in a clinical genetics referral-based MPT cohort would detect a deleterious variant in about a third of cases. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  81 
 
  
    EGAD00001004090 
   
  
    
    This dataset contains the aligned whole genome sequencing data of cell line 380.
This cell line was established from the peripheral blood of a 15-year-old boy with acute lymphoblastic leukemia at relapse, showing an immature phenotype and carrying an IGH-MYC (t(8;14)) as well as an IGH-BCL2 (t(14;18)) chromosomal translocation. The sequencing was performed on an Illumina HiSeq X Ten sequencer. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001004091 
   
  
    
    Cancer gene panel (T200.1) sequencing data from tumor/normal pairs for the adult-type ovarian granulosa cell tumor sequencing project. This dataset contains panel sequencing data for 55 tumors and 44 matched normals, generated using the MD Anderson Cancer Center T200.1 cancer gene hybrid capture platform, with sequencing performed on an Illumina HiSeq 2000. The BAM files were generated by aligning paired-end reads to the hg19 reference genome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  99 
 
  
    EGAD00001004092 
   
  
    
    The dataset "Low-coverage Whole Genome Sequencing, colorectal adenomas NKI-AvL TGO series NGS-ProToCol" includes 30 fastq files from single-end low-coverage WGS on Illumina HiSeq2500 for 30 snap-frozen colorectal adenomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004093 
   
  
    
    The dataset "Low-coverage Whole Genome Sequencing, colorectal carcinomas NKI-AvL TGO series NGS-ProToCol" includes 30 fastq files from single-end low-coverage WGS on Illumina HiSeq2500 for 30 snap-frozen colorectal carcinomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004094 
   
  
    
    The dataset "Low-coverage Whole Genome Sequencing, normal adjacent colon NKI-AvL TGO series NGS-ProToCol" includes 18 fastq files from single-end low-coverage WGS on Illumina HiSeq2500 for 18 snap-frozen normal adjacent colon tissues. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004095 
   
  
    
    Whole exome sequencing (fastq files) of 41 pairs (82 samples) of myxofibrosarcoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  82 
 
  
    EGAD00001004096 
   
  
    
    We sequenced the coding exons of core genes involved in telomere maintenance using peripheral blood DNA of 192 CRC patients. The primary sequencing data were generated by using Ion Torrent Personal Genome Machine® (PGM™) platform. 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  192 
 
  
    EGAD00001004098 
   
  
    
    siRNA knockdown of 43 Allelic Imbalance target TFs followed by mRNA-seq, done in triplicate in three different colorectal adenocarcinoma cell lines (GP5D, LoVo, COLO320DM). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  426 
 
  
    EGAD00001004099 
   
  
    
    ChIP-exo and ChIP-nexus for FOXA1, HNF4A, KLF5, MYC, and TCF7L2 in colorectal cancer cell lines LoVo, GP5D, COLO320DM 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  23 
 
  
    EGAD00001004100 
   
  
    
    Whole genome sequencing of commercial LoVo, GP5D, COLO320DM, CaCo-2 and RPE1 cell lines and three RPE1-TP53 knock-out cell lines separated by 6 months of culture from their most recent common ancestor. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001004101 
   
  
    
    Target sequencing (fastq files) of 99 pairs (198 samples) of myxofibrosarcoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  198 
 
  
    EGAD00001004102 
   
  
    
    RNA sequencing (fastq files) of 29 samples of myxofibrosarcoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001004104 
   
  
    
    Clonally expanded human pluripotent and adult (liver + intestine) stem cell clones were subjected to whole genome sequencing to determine the mutational impact of in vitro culture 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  11 
 
  
    EGAD00001004105 
   
  
    
    Clonally expanded liver adult stem cell clones of healthy liver and cirrhotic liver (due to alcohol abuse, NASH and PSC), as well as biopsies of liver cancers were subjected to whole genome sequencing to determine the mutational impact of precancerous liver disease 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  44 
 
  
    EGAD00001004106 
   
  
    
    The gut microbiota composition is unique to every individual but is shaped by common factors including diet, lifestyle, medication use, early-life determinants, living environment or genetics. Most of these factors may be influenced by ethnicity. This study explored variations in fecal microbiota composition in 6048 individuals with different ethnic backgrounds living in the same geographical area (Amsterdam, the Netherlands).
The HELIUS data are owned by the Amsterdam University Medical Centers, location AMC in Amsterdam, The Netherlands. To allow sharing of microbiome data collected in HELIUS with (inter)national researchers, the 16S rRNA sequencing data have been stored at the European Genome-phenome Archive (EGA; accession code EGAD00001004106). Use of the data requires that access be granted, in part because the HELIUS data are stored with relevant phenotypic variables. Access is granted to all researchers affiliated with an internationally recognized research institution who request to use the HELIUS data within the EGA context, after having signed the data transfer agreement. Any researcher can request the data by submitting a proposal to the HELIUS Executive Board as outlined at http://www.heliusstudy.nl/en/researchers/collaboration, by email: heliuscoordinator at amsterdamumc dot nl. The HELIUS Executive Board will check that proposals do not conflict with the ethical approvals and informed consent forms of the HELIUS study. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6056 
 
  
    EGAD00001004108 
   
  
    
    Whole blood from six female volunteers and sperm from one male volunteer were used to extract genomic DNA using a DNeasy Blood & Tissue Kit (QIAGEN). 500 ng of gDNA was fragmented to 300 bp using a Covaris instrument. The libraries were then constructed using a KAPA Hyper Prep Kit (Kapa Biosystems). In total there are 7 samples, and the uploaded files are paired-end FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  7 
 
  
    EGAD00001004109 
   
  
    
    The dataset includes RNA-seq data (two FASTQ files per sample, as paired-end sequencing was performed) from ribosomal-depleted total RNA of 28 cryopreserved follicular lymphoma (FL) samples, generated to analyze long non-coding RNA and coding transcript expression profiles. Sample metadata refer to the histological groups of FL tumors (FL1-3A versus FL3B/DLBCL), both in purified tumor cell samples (N=12) and in unpurified tumor samples that include normal cells of the lymph node microenvironment (N=16). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001004111 
   
  
    
    Reverse-stranded paired-end 75 base-pair RNA sequencing libraries of 93 metastatic FFPE samples were constructed using Illumina Total RNA Stranded Kits. Ribosomal RNAs (rRNAs) were depleted using the Ribo-Zero rRNA Removal Kit (Illumina). Libraries were sequenced on a HiSeq 2500 machine. Five samples were re-sequenced using paired-end 50 base-pair libraries due to their smaller insert sizes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  93 
 
  
    EGAD00001004112 
   
  
    
    This dataset consists of genomic information for 10 chordoid glioma samples:
- Exome sequencing: 10 tumors and matched normal DNA for four of them (BAM files)
- RNAseq : 10 tumors (fastq files)
- CNV array: 9 tumors (IDAT files) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  10 
 
  
    EGAD00001004113 
   
  
    
    DNA (n=1281) and RNA (n=767) were extracted from bone marrow aspirates where CD138+ selection had been performed to enrich plasma cells from patients with monoclonal gammopathy of undetermined significance (MGUS), smoldering multiple myeloma (SMM), or multiple myeloma (MM).  DNA and/or RNA were sent to Foundation Medicine where targeted sequencing was performed using their Foundation 1 Heme panel.  Resulting BAM files were returned along with annotations for somatic events including single nucleotide mutations, indels and structural rearrangements. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1281 
 
  
    EGAD00001004114 
   
  
    
    The failure to develop effective therapies for paediatric glioblastoma (pGBM) and diffuse intrinsic pontine glioma (DIPG) is in part due to their intrinsic heterogeneity. Analysis of 142 sequenced cases revealed multiple tumour subclones, spatially and temporally co-existing in a stable manner as observed by multiple sampling strategies. 
This dataset provides multi region sequencing of high grade gliomas and diffuse intrinsic pontine gliomas from 15 patients. DNA was extracted from FFPE sections in 2-13 regions of each tumour and sequenced with Agilent SureSelect whole exome sequencing. Germline DNA was also sequenced in 14 cases. Data was aligned to hg19 with bwa and is provided as 79 separate BAM files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  79 
 
  
    EGAD00001004115 
   
  
    
    Whole genome sequencing reads consisting of paired end Fastq and aligned bam files from pediatric medulloblastoma samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  22 
 
  
    EGAD00001004116 
   
  
    
    RNA sequencing of paediatric high grade gliomas and diffuse intrinsic pontine gliomas. RNA was sequenced from fresh frozen surgical material or from primary cells cultured under stem cell conditions.
RNA was subjected to Illumina whole transcriptome paired-end sequencing. Data are provided as paired-end FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001004117 
   
  
    
    Tumor DNA was extracted from 100 bone marrow aspirate samples where CD138+ selection had been performed to enrich plasma cells from patients with multiple myeloma. Patient matched control DNA from either peripheral blood leukocytes or CD34+ stem cell harvests was also isolated. Both tumor and control DNA underwent library preparation using the Hyperplus kit (KAPA Biosystems) and were hybridized to baits for a targeted SeqCap myeloma panel (Nimblegen) encompassing 129 genes, regions for SNPs for copy number determination, and the IGH, IGK, IGL loci, as well as approximately 5 Mb surrounding the MYC locus.  Samples were sequenced on a HiSeq2500 using 100 bp paired end reads. Resulting BAM files were returned along with annotations for somatic events including single nucleotide mutations, indels and structural rearrangements. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  200 
 
  
    EGAD00001004118 
   
  
    
    We have studied a unique case of astroblastoma arising in a 6-year-old girl, with multiple recurrences over a period of 10 years, with the pathognomonic MN1:BEND2 fusion.
Eleven surgical samples from either fresh frozen or paraffin-embedded material and 1 blood sample were subjected to Illumina short read whole exome sequencing using Agilent SureSelect whole exome v4.
Data is provided as 15 BAM files aligned to hg19 with bwa. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001004119 
   
  
    
    Chromatin immunoprecipitation (ChIP) was carried out employing antibodies against H3K36me3 and RNA polymerase II using the HistonePath and TranscriptionPath assays by ActiveMotif. Whole genome sequencing was carried out using an Illumina HiSeq 2000 and data are provided as 6 BAM files: H3K36me3 ChIP-seq, RNA polymerase II ChIP-seq, and input coverage for each cell line. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001004121 
   
  
    
    A total of 14 samples analyzed with the Spatial Transcriptomics method. 
H&E stain can be sent if requested. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  14 
 
  
    EGAD00001004122 
   
  
    
    Set of multi-region sequenced breast cancer primary samples, lymph nodes and ctDNA. We collected samples from 11 breast cancer patients with lymph node involvement but no sign of distant metastasis. We performed a mix of whole-exome sequencing, targeted capture sequencing, and whole-genome sequencing of primary tumour samples and lymph nodes, as well as targeted capture sequencing of circulating tumour DNA. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  183 
 
  
    EGAD00001004123 
   
  
    
    Whole genome sequencing of 5 paediatric glioma cell lines - KNS42, SF188, UW479, RES186 and RES259.
Illumina paired end sequencing is provided as 5 BAM files aligned to hg19 with bwa. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001004124 
   
  
    
    CRISPR-Cas9 genome editing is widely used to study gene function, from basic biology to biomedical research. Structural rearrangements are a ubiquitous feature of cancer cells and their impact on the functional consequences of CRISPR-Cas9 gene-editing has not yet been assessed. Utilizing CRISPR-Cas9 knockout screens for 250 cancer cell lines, we demonstrate that targeting structurally rearranged regions, in particular tandem or interspersed amplifications, is highly detrimental to cellular fitness in a gene independent manner. In contrast, amplifications caused by whole chromosomal duplications have little to no impact on fitness. This effect is cell line specific and dependent on the ploidy status. We devise a copy-number ratio metric that substantially improves the detection of gene-independent cell fitness effects in CRISPR-Cas9 screens. Furthermore, we develop a computational tool, called Crispy, to account for these effects on a single sample basis and provide corrected gene fitness effects. Our analysis demonstrates the importance of structural rearrangements in mediating the effect of CRISPR-Cas9-induced DNA damage, with implications for the use of CRISPR-Cas9 gene-editing in cancer cells. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001004125 
   
  
    
    Data collected as part of the Normal prostatectomy project analysis. Whole genome sequencing (WGS, targeted at 30X for normal tissue and 50X for tumour tissue) was performed on morphologically normal tissue samples from 30 patients with prostate cancer. In addition, seven prostate tissue samples were sequenced from 7 non-cancer patients: two collected after a cystoprostatectomy and five from samples collected at autopsy. Matched blood controls were included for all patients. An extra five samples were sequenced from the stroma of cell cultured fibroblasts.
In addition a few tumour samples obtained at prostatectomy and their blood matched controls are included in this dataset from the main study that are not included elsewhere. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  71 
 
  
    EGAD00001004126 
   
  
    
    Sequence data in fastq format were aligned to the GRCh38 reference genome. Aligned sequence was preprocessed with GATK for Indel Realignment and Base Quality Score Recalibration. Duplicates were marked with Picard MarkDuplicates. Aligned sequence is in bam format. Details of the alignment can be found in the bam header. In total, data generated from 174 tumour samples and 102 matched blood normal controls were aligned. Tumour samples were classified as anaplastic thyroid, poorly-differentiated or well-differentiated cancers. 
    
   
  
    
   
  - 
 
  
    EGAD00001004127 
   
  
    
    Sequence was aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK HaplotypeCaller to generate germline variant calls across the SureSelect All Exon V5+UTR target region. Variant calls are in VCF format. In total there are samples from 173 donors. 101 donors have calls generated from both normal and tumour samples, 94 of which have a matched normal. Details for the call can be found in the vcf headers. 
    
   
  
    
   
  - 
 
  
    EGAD00001004128 
   
  
    
    Sequence was aligned to the GRCh38 reference genome. Aligned sequence was analyzed with SomaticSniper. Somatic variant calls are in VCF format. In total there are 94 tumour samples, each with a matched normal. 
    
   
  
    
   
  - 
 
  
    EGAD00001004129 
   
  
    
    Sequence was aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK/MuTect to generate somatic variant calls across the SureSelect All Exon V5+UTR target region. Somatic variant calls are in VCF format. In total there are 166 tumour samples, 94 of which have a matched normal. Somatic variants for tumours without a matched normal were called against a panel of normals. Details for the MuTect call can be found in the vcf header. 
    
   
  
    
   
  - 
 
  
    EGAD00001004130 
   
  
    
    Whole genome sequencing of cutaneous melanoma skin and brain metastases and matched normal DNA, as well as RNA sequencing of material from the skin and brain metastases. In addition, RNA sequencing was performed for Dabrafenib and Trametinib treated patient-derived xenografts, together with untreated and vehicle treated controls. 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  9 
 
  
    EGAD00001004131 
   
  
    
    Pheno-seq is a new approach that integrates high-throughput imaging and transcriptomic profiling of clonal spheroids/organoids to dissect functional tumor cell heterogeneity in 3D cell culture systems. The method is based on the iCELL8 technology (TakaraBio) that uses barcoded nanowells and a micro-solenoid valve dispenser. The CRC_spheroid dataset contains demultiplexed RNA-sequencing profiles (FASTQ file format, NextSeq 500) of 95 clonal tumor spheroids derived from a patient with colorectal cancer. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001004132 
   
  
    
    This dataset has two variant files in VCF format used in the ABB project (https://github.com/Francesc-Muyas/ABB). 
One has the variants found in a Rare Variant Association Study performed in CLL patients. This has 1217 samples represented. 
The other variant file has 209 SNPs predicted in 10 samples by GATK HaplotypeCaller and selected for Sanger Sequencing Validation.
Raw reads were aligned against the Human Reference genome (Hg19) with BWA mem and variants were obtained using GATK HaplotypeCaller. 
    
   
  
    
   
  1217 
 
  
    EGAD00001004133 
   
  
    
    Epigenetic profiling of colorectal cancer initiating cells (CC-ICs) to identify bivalently marked genes (H3K4me3 and H3K27me3 ChIP-seq), and investigation of changes in transcriptome following EZH2 inhibition using RNA-seq. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001004134 
   
  
    
    The dataset includes sequencing data generated using the TruSight Cancer Panel (TSCP), a targeted NGS assay for analysis of CPGs, and orthogonally generated data supporting at least one pathogenic variant in a CPG, for a total of 645 pathogenic CPG variants.
The set of pathogenic CPG variants includes strong representation of some of the most challenging types of pathogenic variants, with 339 indels, including 16 complex indels and 24 insertions or deletions with length greater than 5bp, and 74 exon CNVs, including 23 single exon CNVs. There are 502 pathogenic variants in BRCA1 or BRCA2, making this an important first-line validation dataset for laboratories performing NGS testing of BRCA1 and BRCA2. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  639 
 
  
    EGAD00001004135 
   
  
    
    Synovial sarcoma (SS) is defined by a recurrent t(X;18) chromosomal translocation, which produces the hallmark SS18-SSX oncogenic fusion. Incorporation of SS18-SSX into BAF complexes renders BAF complexes aberrant in two distinct manners: the addition of 78aa of SSX onto SS18, and concomitant loss of BAF47 assembly. However, the importance and functional contributions of each of these perturbations on BAF complex targeting and gene expression regulation remain unclear. Here we use an integrative set of genomic approaches in human cancer cell lines and primary tumor samples to define the mechanistic consequences of the SS18-SSX fusion oncoprotein. We find that SS18-SSX hijacks BAF complexes to broad polycomb domains to activate bivalent genes, driving a unique gene expression program distinct from other loss-of-function BAF complex malignancies. Importantly, restoration of BAF47 rescues enhancer activation but is dispensable for proliferative arrest in cell lines. These results demonstrate that gain-of-function SS18-SSX-mediated BAF complex targeting and gene activation is the driving event in SS, and present a mechanism by which distinct functions of BAF complexes can be co-opted to drive oncogenesis. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  85 
 
  
    EGAD00001004136 
   
  
    
    The overall goal of the Identification of recurrent mutations in Cushing’s disease project is to study the impact of whole-exome sequencing (WES) on the clinical care of cancer patients and oncology provider practices. 
The aims of the project are to implement and establish the feasibility of WES in patients with USP8 wild-type corticotroph adenomas, and to develop a framework for understanding the molecular mechanism of the pathogenesis of corticotroph adenoma. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  44 
 
  
    EGAD00001004137 
   
  
    
    WGS sequencing for 409 cases (832 samples) from the ICGC ESAD-UK project
Tumours 50x Normals 30x 
HiSeq X
BAM files
These samples are all available in ICGC release 28 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004138 
   
  
    
    The dataset is comprised of seven samples, one blood sample (germline control) of the patient, one neuroblastoma metastasis from the bone marrow and five derived cell models.
The models include the primary culture, the first xenograft passage, a monolayer culture derived from the first xenograft passage and two samples of the fourth xenograft passage, cells and supernatant. For all these samples, whole-exome sequencing data have been generated. The BAM files contain the alignments against the human genome, assembly GRCh37, and also the unaligned reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  7 
 
  
    EGAD00001004139 
   
  
    
    This dataset consists of 44 compressed paired fastq files, 15 of which are generated from whole exome sequencing, and 29 of which are generated from DNA sequencing using a targeted gene panel capturing the exonic regions of 73 prostate cancer driver genes. Targeted DNA sequencing was performed on an Illumina MiSeq (v3 600 cycle kit), and exome sequencing was done using an Illumina HiSeq 2500 (v4 250 cycle kit) machine. The fastq files are named in accordance with the sample aliases provided, which reflect the pathology of interest to this study (small cell prostatic carcinoma--SCPC), whether it was sequenced using an exome or targeted gene panel, whether the FFPE sample was sourced from tumor or benign tissue (labeled T or B, respectively), and whether there exist multiple samples belonging to a single patient. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  44 
 
  
    EGAD00001004140 
   
  
    
    Whole-genome sequencing (WGS) was performed for 13 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000 or HiSeq X Ten as 2x101 bp or 2x151 bp, respectively.
8 NKTL FFPE specimens were screened for somatic mutations using deep targeted capture sequencing (TCS). FFPE rolls or slides were extracted using QIAamp DNA FFPE Tissue kit (QIAGEN). The FFPE genomic DNA was treated with NEBNext FFPE DNA Repair Mix and assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). The library was generated from 10-200 ng DNA with SureSelectXT Low Input Target Enrichment System for Illumina Paired-End Sequencing Library (Agilent Technologies) according to manufacturer’s instructions. An RNA-based probe set was designed with SureDesign (Agilent Technologies) to target-capture 140 genes. Next, the captured libraries were pooled in equimolar concentration and sequenced on the Illumina NovaSeq 6000 platform with SP or S1 chip. Reads aligning to 40 selected genes were isolated post-alignment for this submission.
Prefix used in filenames:
T - Tumor samples
N - Matched-Normal samples 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  34 
 
  
    EGAD00001004141 
   
  
    
    This study contains the WGS and RNA-seq aligned BAM files for this particular inflammatory hepatocellular adenoma sample. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001004142 
   
  
    
    146 DNA samples obtained from 73 DLBCL patients (matching tumor and normal) were sequenced with PCR-free 1.0 genome shotgun sequencing. All files are in BAM format. 
    
   
  
    
   
  146 
 
  
    EGAD00001004143 
   
  
    
    Tumor exome reads consisting of bam files from jaw samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004144 
   
  
    
    This dataset contains FASTQ files obtained through whole exome sequencing of glioma and matched blood samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  117 
 
  
    EGAD00001004145 
   
  
    
    The saliva microbiota of 972 Finnish children, aged 9-14 years, was characterized using 16S rRNA gene (V3-V4) sequencing on the Illumina HiSeq platform. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  972 
 
  
    EGAD00001004146 
   
  
    
    Total RNA-seq of intestinal gluten tetramer+ and tetramer- CD4+ T-cells from celiac disease patients, as well as intestinal CD4+ T-cells from healthy control individuals (paired-end fastq files). 
    
   
  
    
   
  14 
 
  
    EGAD00001004147 
   
  
    
    The dataset contains three BAM files that include SPATC1L variants identified in Italian patients affected by hearing loss (both hereditary and age-related hearing loss). Data have been produced by whole exome sequencing and targeted re-sequencing, using Ion Proton and Ion Torrent PGM platforms respectively. 
    
   
  
    
      
      Ion Torrent PGM 
      
      Ion Torrent Proton 
      
    
   
  3 
 
  
    EGAD00001004148 
   
  
    
    Bulk RNA-seq data from the 5 HCC patients. Single-cell RNA-seq data from these patients are available under accession number EGAD00001003337. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001004149 
   
  
    
    Bulk exome-seq data from the 5 HCC patients. Single-cell RNA-seq data from these patients are available under accession number EGAD00001003337. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001004150 
   
  
    
    This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes.     
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-06-06. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001004151 
   
  
    
    A case-control series of melanoma cases from Leeds, UK has been sequenced on the Fluidigm platform to identify genetic variants associated with sporadic melanoma development. Samples in which potentially contributing variants have been detected are being sequenced on an orthogonal platform for variant confirmation. 
This dataset contains all the data available for this study on 2018-06-06. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  201 
 
  
    EGAD00001004152 
   
  
    
    Targeted pulldown of approximately 60 FFPE normal samples to use as normal controls. 
This dataset contains all the data available for this study on 2018-06-06. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  80 
 
  
    EGAD00001004153 
   
  
    
    Gastric neuroendocrine tumors (gNETs) occur with an estimated frequency of 2 per 100,000 in the general population. Type I gastric neuroendocrine tumors (NETs) represent 75% of gNETs and arise from gastric enterochromaffin-like (ECL) cells. They have a late age of onset and usually a benign course. Classically, hypergastrinemia in patients who have autoimmune atrophic gastritis causes hyperplasia of gastric ECL cells that progresses into type I gastric NETs and parietal cell (PC) destruction. The genetic bases in families with this disease are unknown.
We performed an exome sequencing study of an atypical aggressive familial gNET case (with early age of onset, nodal infiltrations and gastric adenocarcinomas) that followed a recessive model. We identified a deleterious homozygous mutation in the ATP4A gene, which encodes the proton pump responsible for acid secretion by gastric parietal cells. This mutation led first to achlorhydria, and then to hypergastrinemia and gNET development as a consequence (Calvete et al. 2014). Recently, two more families with gNETs, classical clinical traits and a recessive model have been studied by WES, but we did not find any mutation in the ATP4A gene. However, putative mutations affecting genes that contribute to the development and integrity of PCs have been found, suggesting that the genetic alterations associated with this disorder target a single cell type (parietal cells).
In order to confirm this hypothesis, a search for new genes implicated in gNETs is necessary, and more familial cases need to be studied. We have identified four new familial gNET cases. Here, we propose their study by WES. The first family consists of three siblings with gNETs. The other families each include two siblings with gNETs.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-06-06. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001004154 
   
  
    
    This data set is comprised of data from seven distinct high grade serous epithelial ovarian cancer (HGS-EOC) patients, from whom multiple biopsies were taken at the time of surgery, from the ovary and from different locations in the peritoneal cavity.
This data set contains 28 samples, sequenced with a whole exome sequencing approach. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001004155 
   
  
    
    Genotype calls for 83 Aboriginal Australian genomes split by chromosomes. In short, genotypes were called individually with samtools. They were subsequently filtered with thresholds related to sequencing depth, location of variants, sequencing error, and strand bias. Once combined, the genotypes were filtered when not in Hardy-Weinberg equilibrium. The genomes were phased with IMPUTE using the 1000 Genomes reference panel. NB: for the Y chromosomes, only the 44 Aboriginal Australian males are included. 
    
   
  
    
   
  83 
 
  
    EGAD00001004156 
   
  
    
    High-coverage whole genome sequences were collected to study patterns of genomic variation across the broad geography of Indonesia and New Guinea. This region has experienced an extremely complex demographic history, including repeated bouts of admixture with archaic and modern human groups. We have sequenced the genomes of 161 individuals from 14 populations spanning this geographical region, from communities close to mainland Asia through to New Guinea. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  161 
 
  
    EGAD00001004157 
   
  
    
    Five subjects from a pedigree with co-occurrence of neurofibromatosis type 1 and moyamoya were sequenced in duplicate (0 and 1).
Kinship and phenotype:
NF025, NF026 and NF027 were siblings, all affected by neurofibromatosis type 1. NF026 also presented moyamoya.
NF0262 and NF0263 were siblings, both affected by neurofibromatosis type 1. NF0262 also presented moyamoya.
NF026 and NF0262 were first cousins. 
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  10 
 
  
    EGAD00001004158 
   
  
    
    The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004159 
   
  
    
    The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  25 
 
  
    EGAD00001004160 
   
  
    
    We compared bacterial communities in breast milk from teen (≤19 yr, n = 26) vs. adult (>19 yr, n = 56) mothers, normal weight (BMI 18.5-24.9, n = 63) vs. overweight (BMI ≥ 25, n = 19) mothers, primiparous (parity = 1, n = 41) vs. multiparous (parity > 1, n = 44), early (5-46d postpartum, n = 39) vs. established lactation (4-6 mo postpartum, n = 45), breastfeeding (EBF: PBF, n = 72) vs. mixed feeding (n = 11) and mothers with (Na/K ratio < 0.6, n = 75) and without SCM (Na/K ratio ≥ 0.6, n = 10). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  86 
 
  
    EGAD00001004161 
   
  
    
    BAM files from 5 CCND1-negative MCL cases. 4 BAM files corresponded to long insert size Mate Pair-WGS and 3 to WES. In 2 of the cases both technologies were performed. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001004162 
   
  
    
    Undifferentiated sarcomas (USARC) of adults are diverse, rare and aggressive soft tissue cancers. Recent efforts have confirmed that USARC exhibit one of the highest burdens of structural aberrations across human cancer. Here, we sought to unravel the genomic basis of this structural complexity by integrating whole genome sequencing, ploidy analysis and methylation profiling of 53 USARC. We identified whole genome doubling as a prevalent and pernicious force in USARC tumourigenesis. Deconvolution of the complex copy number and rearrangement landscapes shows distinct signatures associated with chromothripsis, early-haploidy, and successive whole-genome-doubling events, suggesting four divergent models of sarcoma development. We show similar distinct evolutionary tumourigenic pathways in different sarcoma subtypes from the Cancer Genome Atlas. Thirteen percent of tumours exhibited a hypermutator phenotype, opening new avenues for clinical management such as immunotherapy, whilst the period prior to and between genome doubling events may represent clinically relevant interventional points in USARC. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  56 
 
  
    EGAD00001004163 
   
  
    
    Cancer genomes are frequently characterized by numerical and structural karyotypic abnormalities. Here we combined an inducible centromere-specific inactivation approach with selection for a conditionally essential gene, a strategy we term ‘CEN-SELECT’, and show that single-chromosome missegregation during cell division can directly drive a broad spectrum of structural rearrangement types. Cytogenetic profiling revealed that missegregated chromosomes are 120-fold more susceptible to developing seven major categories of structural variants, including translocations, insertions, deletions, and reassembly into chromothriptically rearranged chromosomes. Whole-genome sequencing of clones with genetically propagatable derivative chromosomes identified complex rearrangements and copy-number alterations that can result in gene inactivation or extrachromosomal gene amplification. We conclude that chromosome segregation errors are sufficient to drive extensive structural variation that recapitulates those commonly associated with human cancers. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001004164 
   
  
    
    Whole exome and RNA-seq of matched normal gastric mucosa (n=34) and gastric cancer tissues (n=34) from gastric cancer patients (n=34) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  136 
 
  
    EGAD00001004168 
   
  
    
    Illumina exome chip genotyping data for 943 PDAC cases and 3,908 controls in the Chinese population. Genotypes were called by the Illumina GenomeStudio software, and the selected variants were re-called by zCall. Standard quality control was performed. 
    
   
  
    
   
  4856 
 
  
    EGAD00001004169 
   
  
    
    PBMCs were purified from blood samples of 8 HTLV-1 infected individuals, and cryo-preserved in fetal calf serum containing 10% DMSO. DNA from each sample was extracted using the Qiagen Blood & Tissue kit according to the manufacturer's protocol. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  97 
 
  
    EGAD00001004171 
   
  
    
    The dataset contains one BAM file that includes a SLC9A3R1 variant identified in two Italian patients affected by age-related hearing loss. Data have been produced by targeted re-sequencing, using the Ion Torrent PGM platform. 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  1 
 
  
    EGAD00001004172 
   
  
    
    This dataset contains targeted amplicon sequencing of germline DNA extracted from 56 blood samples. They were sequenced on Illumina HiSeq 2500 and aligned to human genome assembly GRCh37 (hg19) to produce 127 BAM files (2-3 technical replicates per sample). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  55 
 
  
    EGAD00001004173 
   
  
    
    This dataset contains targeted amplicon sequencing of DNA extracted from 300 samples of 142 patients (158 methanol-fixed relapse biopsies and 142 FFPE archival diagnostic tissues). Samples were sequenced on Illumina HiSeq 2500 and were aligned to human genome assembly GRCh37 (hg19) to produce 600 BAM files (2 technical replicates per sample). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  300 
 
  
    EGAD00001004174 
   
  
    
    This dataset contains 319 bam files of shallow WGS data (0.1X) aligned to human genome assembly GRCh37 (hg19) from 300 tumor samples sequenced on HiSeq2500 in SE-50bp mode. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  300 
 
  
    EGAD00001004175 
   
  
    
    The dataset contains 438 plasma samples and 418 tissue samples from 102 breast cancer patients and 30 benign breast tumor patients. There are two file types: BAM and FASTQ. Amplicon sequencing and capture sequencing were used in our experiment. 
    
   
  
    
      
      Ion Torrent PGM 
      
      NextSeq 500 
      
    
   
  124 
 
  
    EGAD00001004176 
   
  
    
    RNA-sequencing data of pediatric B-cell precursor acute lymphoblastic leukemia, including 18 high hyperdiploid cases and 9 ETV6/RUNX1-positive cases. Sequencing libraries were constructed using the Human Ribo-Zero rRNA Removal Kit (Illumina, San Diego, CA) and sequenced on an Illumina NextSeq 500. RNA sequencing data were processed using the TCGA mRNA-seq pipeline (https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/#mrna-analysis-pipeline). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001004179 
   
  
    
    This dataset contains WES and RNA-Seq fastq files for 65 CML patient samples at various stages of disease progression. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  183 
 
  
    EGAD00001004180 
   
  
    
    The French ICGC project on liver tumors is coordinated by Pr Jessica Zucman-Rossi and funded by Inca (French Institute for Cancer). The aim of the present project is to identify the catalog of somatic and germline mutations in liver tumors using whole genome (WGS) and whole exome sequencing (WES), integrated with DNA methylation and RNA sequencing (RNA-seq) data. The present series corresponds to 60 whole exome tumor/normal pairs with matched RNA-seq. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  120 
 
  
    EGAD00001004183 
   
  
    
    Tumor transcriptome and whole exome sequencing data (matched tumor/normal for somatic mutation calling) along with key phenotypic information are provided for patients enrolled in the phase 2 IMmotion150 trial, assessing efficacy of atezolizumab monotherapy or combination of atezolizumab and bevacizumab versus standard of care (sunitinib) in 1L renal cell carcinoma. This data set accompanies the respective Nature Medicine publication (PMID: 29867230). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  589 
 
  
    EGAD00001004184 
   
  
    
    Whole exome NGS data of 21 suicide victims and 23 control patients sequenced on the Illumina HiSeq 2000 platform using the Agilent SureSelect Human All Exon + UTRs V5 target enrichment kit. The dataset contains the paired-end unfiltered FASTQ files, the GRCh37 (b37) aligned BAM files mapped by the BWA MEM algorithm, and the variant files in VCF 4.1 format called with the GATK HaplotypeCaller (version 3.3). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  44 
 
  
    EGAD00001004185 
   
  
    
    These files contain the normalized and raw count abundances of miRNA for aSAH patients. These abundances were obtained using Next-Generation Sequencing after selection of the miRNA in the RNA biobank. In total, there are 28 VSP- and 28 VSP+ patients for two time points. Normalized data were obtained by applying size factor and VSN normalizations as described in Pulcrano-Nicolas et al. Stroke 2018. Raw count data correspond to the raw abundances of miRNA in aSAH patients. 
    
   
  
    
   
  56 
 
  
    EGAD00001004186 
   
  
    
    Somatic mutations in epithelial cells from endometriosis and normal uterine endometrium, with a total of 24 samples. Target enrichment was conducted with the Agilent SureSelect Human All Exon V5 + lncRNA kit. Sequencing was conducted on the Illumina HiSeq 2500 platform. Somatic mutation calling was performed with Strelka. 
    
   
  
    
   
  24 
 
  
    EGAD00001004187 
   
  
    
    One hundred cryopreserved bone marrow and peripheral blood samples from patients with acute myeloid leukemia (AML) with 10-90% blasts were selected from the biobank of the Department of Hematology of Leiden University Medical Center (LUMC). The AML cases cover all subtypes, and specifically include known subtype-defining balanced chromosomal translocations according to the WHO classification. The samples were obtained from 96 patients and include three pairs of de novo and relapsed AML and one pair of de novo and presumed therapy-related AML (tAML). Total RNA was isolated from mononuclear cells without prior enrichment for leukemic blasts. The quality and integrity of total RNA was checked and RNA libraries were prepared using the TruSeq RNA library preparation kit v2 (Illumina, San Diego, CA) in an ISO/IEC 17025-accredited protocol. This workflow started with enrichment of messenger RNA by oligo dT magnetic beads. After fragmentation, cDNA synthesis was performed, followed by adaptor ligation and PCR amplification. Paired-end sequencing with a read length of 126 bp was performed on an Illumina HiSeq 2500 v4 sequencer to at least 12.5 Gbp per sample. Image analysis, base calling, and quality check was performed with Illumina data analysis pipeline RTA v1.18.64 and Bcl2fastq v1.8.4. RNAseq reads are provided in compressed Sanger FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  100 
 
  
    EGAD00001004188 
   
  
    
    28 pretreated Ewing sarcoma tumor blood samples were collected from the Hospital for Sick Children (SickKids) and Mount Sinai Hospital in Toronto, Canada in accordance with each institution’s Research Ethical Board (REB) guidelines. Detailed clinical information (age at presentation, gender, tumor site, stage, etc.) was obtained from the corresponding institutional tumor banks.
Transcriptome (RNA-Seq) sequencing was performed using established protocols on Illumina instruments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001004189 
   
  
    
    This dataset includes 111 bam files from WGS sequence data aligned to human genome assembly GRCh37 (hg19) from 56 tumour and matched normal samples.  Libraries were constructed with ~350-bp insert length using the TruSeq Nano DNA Library prep kit (Illumina) and sequenced on an Illumina HiSeq X Ten System in paired-end 150-bp reads mode. The average depth was 60× (range 40-101×) in tumours and 40× (range 24-73×) in matched blood samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  111 
 
  
    EGAD00001004190 
   
  
    
    This dataset contains raw sequencing reads for matched MGUS/SMM to MM patient samples, including normal germline controls. FASTQ files were generated on Illumina NextSeq 500 and HiSeq 4000 machines following exome capture using the Agilent Clinical Research Exome kit. DNA was extracted from CD138+CD38++ cells (representing MGUS/SMM/MM cells) and CD138-CD38- (representing normal cells) isolated from bone marrow. 10 patients are included with 3 samples each representing normal, MGUS/SMM, MM stages. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  301 
 
  
    EGAD00001004192 
   
  
    
    The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic events and consequent clonal expansions leading to cancer. As for most cancer types, however, understanding of the earliest phases of colorectal neoplastic change, which may occur in morphologically normal tissue, is comparatively limited because of the difficulty of detecting somatic mutations in normal cells. Each colorectal crypt is a small clone of cells derived from a single recently-existing stem cell. Here, we sequenced hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed, some ubiquitous and continuous, others only found in some individuals, in some crypts or during some phases of the cell lineage from zygote to adult cell. Likely driver mutations were present in ~1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  578 
 
  
    EGAD00001004193 
   
  
    
    The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic events and consequent clonal expansions leading to cancer. As for most cancer types, however, understanding of the earliest phases of colorectal neoplastic change, which may occur in morphologically normal tissue, is comparatively limited because of the difficulty of detecting somatic mutations in normal cells. Each colorectal crypt is a small clone of cells derived from a single recently-existing stem cell. Here, we sequenced hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed, some ubiquitous and continuous, others only found in some individuals, in some crypts or during some phases of the cell lineage from zygote to adult cell. Likely driver mutations were present in ~1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1632 
 
  
    EGAD00001004194 
   
  
    
    Complete Microbiome Metagenomics from feces of 461 IBD patients; The sequencer used was the Illumina HiSeq 2000 with a paired end reads design, reflected in the 2 FastQ format files per sample. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  355 
 
  
    EGAD00001004195 
   
  
    
    The dataset includes paired-end FASTQ files of whole genome sequencing data on the Illumina platform. Individual samples are multiple annealing and looping based amplified single fibroblasts and multiple displacement amplified single T lymphocytes, including unamplified bulk samples. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001004197 
   
  
    
    We spiked a small number of placental tissue samples with different combinations of Candida albicans, Plasmodium falciparum, Toxoplasma gondii, human cytomegalovirus and Salmonella bongori (various combinations of the equivalents of 1, 10, 100, 1000 and 10000 genome copies). A DNA isolation was performed on these spiked samples and the resulting DNA was subsequently sequenced by MiSeq (18S). These same samples were also analysed by X Ten to allow for a sensitivity comparison of the two methods of the eukaryotic spiked signals (Candida albicans, Plasmodium falciparum and Toxoplasma gondii). In addition, non-spiked placental samples from 50 cases of Fetal Growth Restriction (FGR) (+ matched healthy controls) and 49 cases of Preeclampsia (+ matched healthy controls) and 100 preterm cases were analyzed for their non-human eukaryotic content. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  7 
 
  
    EGAD00001004198 
   
  
    
    Metagenomics data of 80 placental tissue samples analyzed by X Ten for their possible microbial content. These 80 samples from pre-labor C-section deliveries, representing Cohort 1, were spiked with 1100 CFU Salmonella bongori. These same samples were also analyzed by 16S amplicon sequencing (search for ERP109246 in ENA). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  80 
 
  
    EGAD00001004199 
   
  
    
    This dataset was made to verify the computational reconstruction of B cell receptors from single-cell RNA-seq using BraCeR. The dataset contains BCR-derived reads from single-cell RNA-seq from 13 cells using the Smart-seq2 protocol, as well as targeted BCR-sequencing data from the same cells. 
    
   
  
    
   
  26 
 
  
    EGAD00001004200 
   
  
    
    RNA samples of bone marrow and cord blood were sequenced on the 10x Genomics platform. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001004201 
   
  
    
    Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  75 
 
  
    EGAD00001004202 
   
  
    
    Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001004203 
   
  
    
    Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  192 
 
  
    EGAD00001004204 
   
  
    
    We used targeted sequencing to capture and measure the abundance as well as the size profiles of EBV DNA in plasma of subjects with and without NPC. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  337 
 
  
    EGAD00001004205 
   
  
    
    Whole-exome sequencing was performed from organoids derived from 10 liver cancer biopsies (7 hepatocellular carcinoma and 3 cholangiocarcinoma), corresponding liver and non-tumoral biopsies. For 3 of the organoids, both early and late passage organoids were sequenced. Whole-exome sequencing was performed using the Agilent Clinical Research Exome capture kit followed by Illumina sequencing. BAM files are provided in this dataset. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001004206 
   
  
    
    This dataset contains 135 H3K27ac ChIP-seq experiments. Monocytes and granulocytes from TB and non-TB samples were obtained, ChIP-seq was performed, and the reads were aligned to hg19. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  161 
 
  
    EGAD00001004207 
   
  
    
    This dataset includes whole genome sequencing data from 93 Bajau and Saluan individuals that were used in the Ilardo et al 2018 study on adaptation to diving in Sea Nomads. Sequencing libraries were built using the TruSeq Nano DNA Library Preparation Kit on an Illumina NeoPrep instrument. Each pool was sequenced 125 Paired-End over one or two lanes on the Illumina HiSeq2500 (version 4 chemistry). Samples were sequenced to an average depth of 5x. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  93 
 
  
    EGAD00001004208 
   
  
    
    Dataset contains targeted sequencing data of 712 plasma cell free DNA samples and 428 white blood cell samples collected from 428 men with metastatic prostate cancer. Target capture was performed using a hybridization-based custom Roche SeqCap EZ Choice kit, designed to capture all exons of 72 prostate cancer driver genes. Cell free DNA was extracted from 10 mL blood samples. Libraries were sequenced using Illumina HiSeq 2500 or Illumina MiSeq instruments to a median coverage of 750x. 62% of samples had ctDNA fraction above 2% of total cfDNA. 
Note that "Dataset type" is erroneously listed as "Amplicon sequencing", because "Captured-based targeted sequencing" or "Hybridization-based targeted sequencing" were not available options in EGA at the time of submission. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1140 
 
  
    EGAD00001004210 
   
  
    
    The dataset comprises RNA-seq information of 4 subpopulations sorted from human fetal pancreas of 3 different donors. Low input libraries were generated using the Smart-seq2 protocol after Ampure XP cleanup of the total RNA extracted from the sorted cells. Libraries were multiplexed and sequenced paired-end over 2 lanes of HiSeq4000 each. Raw data was aligned to the human genome reference GRCh37 using STAR v2.5.1b with GENCODE v19 as transcriptome reference, and unaligned reads were folded into the final uploaded BAMs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001004211 
   
  
    
    RNA extracted from the middle temporal gyrus (MG) brain region of healthy elderly controls. Three pairs of samples were generated, each pair consisting of one sample that was enriched for circular RNAs using RNase R, and a second sample that was not enriched (total N=6). Remaining samples in the study from other functionally distinct brain regions are currently being processed and will be released soon. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001004212 
   
  
    
    Files from whole exome sequencing of 14 tumors from two cancer patients (endometrial and lung cancer) along with a matched normal tissue per patient. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001004213 
   
  
    
    Sequences from 95 subjects presenting intellectual disability (ID) and 98 subjects presenting intellectual disability and a diagnosis of autism spectrum disorder (ASD). The mtDNA was amplified by long-range PCR with 3 pairs of primers producing overlapping fragments. The three fragments were mixed in equimolar ratios and each sample was sequenced in an Ion Torrent Personal Machine according to manufacturer's user guide (reference genome: NC_012920.1 (rCRS)). 
    
   
  
    
   
  193 
 
  
    EGAD00001004215 
   
  
    
    NGS-ProToCol RNA-seq dataset contains 41x normal adjacent prostate and 51x prostate cancer samples taken from fresh frozen radical prostatectomies, sequenced using random-hexamer priming. RNA-seq was performed on the Illumina HiSeq 2500 platform, 2 x 126 bp stranded paired-end reads at a depth of 70 million reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  92 
 
  
    EGAD00001004216 
   
  
    
    The dataset "NKI-AvL OpACIN RNA-seq of stage III melanoma patients" includes 18 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 for 18 stage III melanoma patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004217 
   
  
    
    The dataset "NKI-AvL OpACIN DNA-seq of stage III melanoma patients" includes 2 x 18 normal and 2 x 18 tumor FASTQ files from paired-end whole exome sequencing on Illumina HiSeq2500 for 18 stage III melanoma patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001004218 
   
  
    
    Tumor DNA was extracted from formalin-fixed and paraffin embedded tumors of a large cohort of bladder cancer patients before treatment with anti-PD-L1. Normal DNA was extracted from matched PBMCs. Whole exome sequencing was performed. This is a subset of patients for which RNA sequencing is also provided (with more detailed phenotypic information). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  488 
 
  
    EGAD00001004220 
   
  
    
    41 samples from Zambia generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  41 
 
  
    EGAD00001004221 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-AB0029 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004222 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-AB6372 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004223 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-AH1410 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004224 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-AK7565 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004225 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-AL4257 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004226 
   
  
    
    Genome sequence data from a GBM patient PT-AR3050 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004227 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-AR5365 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004228 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-BK0248 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004229 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-BM772 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004230 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-CA2271 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004231 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-CM1209 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004232 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-DF5919 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004233 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-DS9789 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004234 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-EV3071 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004235 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-FB6711 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004236 
   
  
    
    Genome sequence data from a GBM patient PT-FR7453 
    
   
  
    
   
  - 
 
  
    EGAD00001004237 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GB9186 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004238 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GB9483 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004239 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GC1519 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004240 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GJ3716 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004241 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GR2309 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004242 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-HN6692 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004243 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-HO0394 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004244 
   
  
    
    WGS data from a GBM patient PT-HS9105 
    
   
  
    
   
  - 
 
  
    EGAD00001004245 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-JB1730 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004246 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-JE6375 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004247 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-JP2405 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004248 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-JW6420 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004249 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-KM5291 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004250 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-LC3356 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004251 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-LR9369 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004252 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-LS4891 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004253 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-MB9777 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001004254 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-MD9088 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004255 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-PD6881 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004256 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-RD1291 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004257 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-RL5404 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004258 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-RL7940 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004259 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-RW9277 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004260 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-SK0976 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004261 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-SO0258 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004262 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-TM5196 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004263 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-VO7089 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004264 
   
  
    
    WGS data from a GBM patient PT-WP9124 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001004265 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  195 
 
  
    EGAD00001004266 
   
  
    
    Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011), have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. 
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  72 
 
  
    EGAD00001004268 
   
  
    
    Samples from 149 trios from the Saguenay-Lac-Saint-Jean asthma familial cohort were all sequenced using a custom capture panel developed by our group, followed by next-generation sequencing. This custom capture panel covers around 3% of the genome, including coding and non-coding immune regulatory regions. We inferred the sequence in the non-sequenced siblings who were part of the same families as the trios and we imputed the sequence using IMPUTE2 in the whole cohort. 
    
   
  
    
   
  1214 
 
  
    EGAD00001004269 
   
  
    
    This dataset includes 112 head and neck tumour samples with matched normal (blood) samples sequenced using a custom hybrid capture panel. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  224 
 
  
    EGAD00001004270 
   
  
    
    Genome-wide copy number profiling was performed using low-pass whole genome sequencing on archival non-dysplastic mucosa (n=9), low-grade dysplasia (LGD; n=30), high-grade dysplasia (HGD; n=13), mixed LGD/HGD (n=7) and CA-CRC (n=19). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  81 
 
  
    EGAD00001004271 
   
  
    
    The dataset comprises seven samples, described below:
1. Muscle samples from three patients with late-onset PEO caused by compound heterozygous POLG variants 
M0305	POLG W748S/R1096C	
M1105	POLG A467T/T251I+P587L	
M1804	POLG A467T/X1240G+35aa
2. Muscle sample from a patient with adPEO with heterozygous TWNK variants 
M0230  TWNK p.Arg357Pro
3. Blood control samples from two patients with late-onset PEO caused by compound heterozygous POLG variants 
DNA2012-1630_S1	POLG W748S/R1096C
DNA2018-0168_S2	POLG A467T/T251I+P587L
4. Muscle samples from healthy control individuals 
DNA2018-0172_S4	Healthy control 2
DNA2018-0173_S5	Healthy control 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001004272 
   
  
    
    We will sequence at 15X coverage the genomes of 960 IBD patients. These samples are currently onsite at Sanger and made available for sequencing via our collaboration with the UK IBD Genetics consortium. During the next quinquennium we intend to sequence the genomes of many thousand IBD patients and these 960 represent the first stage of this effort. Ultimately we will perform association tests comparing these genomes to similar numbers of control genomes to identify rare and low-frequency variants underlying IBD. 
This dataset contains all the data available for this study on 2018-08-03. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1432 
 
  
    EGAD00001004273 
   
  
    
    In this project we have sequenced the exome of skin moles (melanocytic naevi) and also normal skin from young and old people. We are interested in looking at the clonality of these lesions and the burden of UV mutations. 
This dataset contains all the data available for this study on 2018-08-03. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  184 
 
  
    EGAD00001004274 
   
  
    
    Biopsies of 1 cm from patients undergoing bladder cystectomy will be collected. The underlying muscle and stroma will be removed and the remaining epithelia dissected into small sequential areas, which will be sent for ultra-deep exome sequencing using a panel of known cancer and viral genes. Sequence analysis using methods similar to Martincorena I et al (Science 2015, 348:880) will provide an idea of the somatic mutational landscape in these patient samples. Individual patient muscle samples will also be sequenced as a reference. 
This dataset contains all the data available for this study on 2018-08-03. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  71 
 
  
    EGAD00001004275 
   
  
    
    Exome sequencing was performed on fresh-frozen multiple regions of carcinoma, adjacent non-cancerous mucosa and blood from 12 CA-CRC patients (n=55 exomes). 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  64 
 
  
    EGAD00001004276 
   
  
    
    In the present study two large, multiply affected bipolar disorder families from Cuba were investigated using whole exome sequencing (Illumina HiSeq2500 v4). The variant calling files (VCFs) of 15 individuals provided here were generated using the Varbank exome pipeline from the Cologne Center for Genomics (CCG, https://varbank.ccg.uni-koeln.de). 
    
   
  
    
   
  15 
 
  
    EGAD00001004279 
   
  
    
    Genomic DNA of tumours and matched normal gastric tissues was extracted (QIAGEN). Libraries were constructed with 300-400 bp insert length, and 101 bp or 151 bp paired-end sequencing was performed on Illumina HiSeq instruments. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  80 
 
  
    EGAD00001004280 
   
  
    
    This dataset contains whole genome sequencing BAM files for 78 tumor-normal pairs (a total of 156 samples) used in the St. Jude Clinical Pilot.  Mapping was performed using BWA.  This dataset accompanies the paper "Clinical Cancer Genomic Profiling by Three-Platform Sequencing of Whole Genome, Whole Exome and Transcriptome" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001004281 
   
  
    
    The dataset contains the somatic point mutation data from the exome-targeted region of 36 exome or whole genome sequenced microsatellite unstable colorectal cancers and the somatic point mutation data from 93 additional MiSeq sequenced microsatellite unstable colorectal cancers. 
    
   
  
    
   
  129 
 
  
    EGAD00001004286 
   
  
    
    Comprehensive genetic analyses including whole-exome sequencing, targeted sequencing, and whole-genome sequencing of the human genome and the Epstein-Barr virus (EBV) genome were performed to reveal the molecular pathogenesis of EBV-associated hematological malignancy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  453 
 
  
    EGAD00001004287 
   
  
    
    This dataset contains whole exome sequencing BAM files for 78 tumor-normal pairs (a total of 156 samples) used in the St. Jude Clinical Pilot.  Mapping was performed using BWA.  This dataset accompanies the paper "Clinical Cancer Genomic Profiling by Three-Platform Sequencing of Whole Genome, Whole Exome and Transcriptome" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  156 
 
  
    EGAD00001004288 
   
  
    
    To validate the methylation status of the four candidate tumor suppressor genes (ADHFE1, EOMES, SALL1, TFPI2) in Han Chinese ESCC patients, we recruited 103 patients and obtained paired tumors (labelled T) and adjacent normal tissues (labelled N). Targeted bisulfite sequencing was conducted to detect the methylation profiles of these four genes in the 103 paired tissues. The raw sequence data (FASTQ files) were aligned using BSseeker2, and this dataset includes all of the BAM files after alignment. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  205 
 
  
    EGAD00001004289 
   
  
    
    Data supporting: "Low-cost and clinically applicable copy number profiling using repeat DNA." Abujudeh et al.
DNA WGS (BAM files)
DNA fastSeq (fastq files)
Tumours, Barrett's, normals. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  60 
 
  
    EGAD00001004290 
   
  
    
    This dataset contains whole genome sequencing BAM files for 78 tumor-normal pairs (a total of 156 samples) used in the St. Jude Clinical Pilot.  Mapping was performed using BWA.  This dataset accompanies the paper "Clinical Cancer Genomic Profiling by Three-Platform Sequencing of Whole Genome, Whole Exome and Transcriptome" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  156 
 
  
    EGAD00001004291 
   
  
    
    We performed ATAC-seq experiments using 2 placental samples and 2 buffycoat samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004292 
   
  
    
    Targeted capture of cancer gene panel bait set in single cell derived organoids from colon tissue and colorectal cancer from 1 patient. 
This dataset contains all the data available for this study on 2018-08-13. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  112 
 
  
    EGAD00001004293 
   
  
    
    Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH). 
This dataset contains all the data available for this study on 2018-08-13. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  110 
 
  
    EGAD00001004294 
   
  
    
    This study will analyse the guide sequences that were used for making mutations in the Cas9-expressing cells. We used the GeCKO v2 library, which was released by Feng Zhang in 2014. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
This dataset contains all the data available for this study on 2018-08-13. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  92 
 
  
    EGAD00001004295 
   
  
    
    Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. 
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
This dataset contains all the data available for this study on 2018-08-13. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  105 
 
  
    EGAD00001004297 
   
  
    
    Lymphoblastoid cell lines established using either wildtype or BALF5-deficient Epstein-Barr virus were analyzed by RNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004298 
   
  
    
    Capture-based whole-genome sequencing of Epstein-Barr virus (EBV) was performed in hematological malignancies such as EBV-positive diffuse large B-cell lymphoma, extranodal NK/T-cell lymphoma, and chronic active EBV infection. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  264 
 
  
    EGAD00001004299 
   
  
    
    Comprehensive genetic analyses including whole-exome sequencing, targeted sequencing, and whole-genome sequencing were performed to reveal the molecular pathogenesis of chronic active Epstein-Barr virus infection. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  187 
 
  
    EGAD00001004300 
   
  
    
    June 2018 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004301 
   
  
    
    Whole exome sequencing data generated from organoid cultures established from gastric cancers, paired gastric tumor frozen tissues and blood leukocyte DNA. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 1500 
      
      unspecified 
      
    
   
  130 
 
  
    EGAD00001004302 
   
  
    
    RNASeq data generated from organoid cultures established from gastric cancers and normal mucosae, paired tumor frozen tissues, and cultured fibroblast. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 1500 
      
    
   
  131 
 
  
    EGAD00001004303 
   
  
    
    Sequence data (bam files) of two RRBS samples for paper "A comprehensive analysis of 195 DNA methylomes reveals shared and cell specific features of partially methylated domains". Short Description: CD4+ T memory cells (CD3+ CD4+ CD45RA- CD45RO+ CD25-) from donors were sorted by flow-cytometry either as a bulk culture ('ex vivo' sample) or in a single-cell format into 96 well-plates ('clone' sample) in the presence of a TCR stimulus. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001004304 
   
  
    
    We intend to use single cell transcriptome analysis to explore the heterogeneity of different cell types within the kidney. 
This dataset contains all the data available for this study on 2018-08-20. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1290 
 
  
    EGAD00001004305 
   
  
    
    As part of the Human Cell Atlas we will study fetal tissue. 
This dataset contains all the data available for this study on 2018-08-20. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  27 
 
  
    EGAD00001004306 
   
  
    
    We performed whole-exome sequencing on multiple regions (n=2-3) from four primary untreated breast tumors (n=1 HER2+, n=2 ER+/HER2-, n=1 triple-negative), as well as matched normal. We also performed whole-exome sequencing on one region from the pre-treatment diagnostic core biopsy and multiple regions (n=2-6) from the post-treatment surgical specimen for five HER2+ primary breast tumors, as well as matched normal; all were treated with combination chemotherapy and trastuzumab. Analysis of these specimens allows characterization of breast tumor heterogeneity and clonal evolution. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001004307 
   
  
    
    Exome sequencing from cfDNA blood samples. 30 sets of 2x76 Illumina reads in Fastq format. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  30 
 
  
    EGAD00001004308 
   
  
    
    The Central Asian Kyrgyz highland population provides a unique opportunity to address genetic diversity and understand the genetic mechanisms underlying hypoxia-induced high altitude pulmonary hypertension (HAPH). While a significant fraction of the population is unaffected, there are susceptible individuals who display HAPH in the absence of any lung, cardiac or hematologic disease. We report herein the analysis of the whole genome sequencing of healthy individuals compared with HAPH patients and other controls.
In this study, 34 male individuals from the Central Asian Kyrgyz highland population were sequenced on an Illumina HiSeq 2000 with a mean coverage of 30X. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001004309 
   
  
    
    Targeted next-generation-sequencing of 494 cancer-associated genes was done in a series of 14 frozen pairs of matched primary breast cancers and brain metastases (28 samples). 
DNA libraries of all coding exons were prepared using the Haloplex Target Enrichment System.
Sequencing was done using the 2*150bp paired-end technology on the Illumina NextSeq500 platform. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001004310 
   
  
    
    Whole exome sequencing data of 17 SPTCL cases, including 7 matched-normal samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001004311 
   
  
    
    The GoDARTS T2D-GENES exome sequencing study includes 1924 samples, 965 T2D cases and 959 T2D controls, of European ancestry. This cohort is part of a larger exome sequencing effort from the T2D-GENES project and contains the exome sequencing VCF from the GoDARTS samples. The other data generated by the T2D-GENES project can be found in dbGaP. Samples underwent deep exome sequencing, with SNVs and indels called according to GATK best practices. 
    
   
  
    
   
  1924 
 
  
    EGAD00001004312 
   
  
    
    ChIP-Seq files accompanying the paper titled "Identification of Therapeutic Targets in Rhabdomyosarcoma Through Integrated Genomic, Epigenomic, and Proteomic Analyses". 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  158 
 
  
    EGAD00001004313 
   
  
    
    The dataset includes 13 bam files. Each bam file is a different colorectal cancer patient organoid. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  13 
 
  
    EGAD00001004314 
   
  
    
    The dataset contains data from a single patient sample with partial lipodystrophy. The data is supplied in the form of 2 files: a BAM file containing the (raw) sequencing data and a VCF file containing the called variants. The data is limited to a region consisting of the AGPAT2 gene on chromosome 9 and 1 Mb on both sides. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001004315 
   
  
    
    WGBS files accompanying the paper titled "Identification of Therapeutic Targets in Rhabdomyosarcoma Through Integrated Genomic, Epigenomic, and Proteomic Analyses". 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001004316 
   
  
    
    24 samples from Cameroon generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001004317 
   
  
    
    Blood samples from four liver cancer patients and four healthy people, and solid liver tumor samples from two liver cancer patients, were collected for this dataset. Blood samples were centrifuged first at 1,600 × g for 10 minutes, and then the plasma was transferred into new micro tubes and centrifuged at 16,000 × g for another 10 minutes. The plasma was collected and stored at -80 °C. cfDNA was extracted from 5 ml plasma using the Qiagen QIAamp Circulating Nucleic Acids Kit and quantified by Qubit 3.0 Fluorometer (Thermo Fisher Scientific). Bisulfite conversion of cfDNA was performed using the EZ-DNA-Methylation-GOLD kit (Zymo Research). After that, the Accel-NGS Methyl-Seq DNA library kit (Swift Biosciences) was used to prepare the sequencing libraries. The DNA libraries were then sequenced with 150bp paired-end reads. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001004318 
   
  
    
    We used single-cell transcriptomics to study >60,000 cells from the developing murine cerebellum, and show that different molecular subgroups of childhood cerebellar tumors mirror the transcription of cells from distinct, temporally restricted cerebellar lineages. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  47 
 
  
    EGAD00001004319 
   
  
    
    This dataset contains Linked-Read Whole Exome Sequencing (lrWES) from individuals with known disease-causing variants. The dataset comprises 30 samples from 10 donors, where multiple samples from the same donor reflect experimental differences assaying the effect of input DNA length on coverage and phasing. Raw data (i.e. BAM files) and variant analysis (i.e. VCF files) for each sample are included in this dataset. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001004320 
   
  
    
    RNA-Seq data from 6 Giant Cell Lesions of the Jaw (GCLJ) samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001004321 
   
  
    
    In situ promoter capture Hi-C on Hodgkin lymphoma cell line L-428 in experimental triplicates. Hi-C libraries were prepared as previously described (Orlando et al., 2018, https://currentprotocols.onlinelibrary.wiley.com/doi/pdf/10.1002/cphg.63). Promoter capture was based on 32,313 biotinylated 120-mer RNA baits (Agilent). Hi-C libraries were sequenced using Illumina HiSeq 2000 technology. The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004322 
   
  
    
    ChIP-seq data (H3K4Me3, H3K27Ac histone modifications) of Hodgkin lymphoma cell line L-428. Samples were processed as previously described (Sud et al., 2018). The files are in bam format, aligned to build 37 of the human genome. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004323 
   
  
    
    Primary plasma cell leukemia (pPCL) samples were sequenced using the Nimblegen MedExome Plus hybridization capture to detect translocations, copy number changes, and mutations in 20 pPCL samples and patient matched controls. Sequencing was performed on a NextSeq500 using 75 bp paired end reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  40 
 
  
    EGAD00001004324 
   
  
    
    This dataset consists of 70 maternal plasma samples (BAM files) used in the FetalQuantSD. The maternal plasma DNA samples were sequenced using the HiSeq 2000 platform (Illumina) with a 50-cycle paired-end mode. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  70 
 
  
    EGAD00001004325 
   
  
    
    RNA sequencing data from Vγ9Vδ2-T cells from chronic lymphocytic leukemia patients and age-matched healthy controls. Matched Vγ9Vδ2-T cell samples before and after expansion with autologous monocyte-derived dendritic cells for each donor are included. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001004326 
   
  
    
    This dataset includes transcriptome sequencing of 17 paired NAFLD-HCC samples and adjacent normal tissues. All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001004327 
   
  
    
    Paired-end, ribosome depleted, total RNA Sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001004328 
   
  
    
    DNA was obtained from either CD138+ cells from the bone marrow of multiple myeloma patients (tumor) or from stem cell harvests from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using the HyperPlus kit (KAPA Biosystems).  After PCR amplification the libraries were hybridized with probes against either the entire exome (MedExome, Nimblegen) or a targeted panel of 140 genes using SeqCap reagents (Nimblegen). Hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001004329 
   
  
    
    Somatic mutations of 256 whole-genome sequenced colorectal tumors. 234 MSS, 19 MSI and 3 POLE mutants.
See Katainen R. et al. CTCF/cohesin-binding sites are frequently mutated in cancer, Nature Genetics 2015. doi:10.1038/ng.3335 
    
   
  
    
   
  256 
 
  
    EGAD00001004330 
   
  
    
    Targeted sequencing with the TruSight Cardio Sequencing Kit. 395 early onset lone AF cases and 375 controls. Sequencing was performed on Illumina NextSeq and HiSeq 2500 systems. 
    
   
  
    
      
      unspecified 
      
    
   
  1131 
 
  
    EGAD00001004331 
   
  
    
    RNA-seq (Ribodepleted Directional, 75 PE, HiSeq 4000) data of purified and expanded iNKT and T cells from a normal donor, and RNA-seq (poly-A, 100 PE, HiSeq 2500) data from the C1R cell line. The dataset consists of 3 pairs of FASTQ files, one pair per sample. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  3 
 
  
    EGAD00001004332 
   
  
    
    Familial adenomatous polyposis (FAP) and MUTYH‐associated polyposis (MAP) are inherited disorders associated with multiple colorectal adenomas that lead to a very high risk of colorectal cancer. The somatic mutations that drive adenoma development in these conditions have not been investigated comprehensively. In this study we performed analysis of paired colorectal adenoma and normal tissue DNA from individuals with FAP or MAP, sequencing 14 adenoma whole exomes (eight MAP, six FAP), 55 adenoma targeted exomes (33 MAP, 22 FAP) and germline DNA from each patient. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
      Illumina HiSeq 2000 
      
    
   
  121 
 
  
    EGAD00001004333 
   
  
    
    Whole exome sequencing of 76 individuals with familial atrial fibrillation. 
BAM files have been aligned with the BWA-MEM algorithm. Fastq files were filtered and trimmed using cutadapt. Samples have been sequenced on an Illumina HiSeq 2500 machine. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1131 
 
  
    EGAD00001004334 
   
  
    
    50 samples from Mali generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001004335 
   
  
    
    Histone ChIP-seq of 13 human embryonic tissues from weeks 6-8 of gestation. H3K4me3, H3K27me3 and H3K27ac. Biological replicates (n=2) for 11 tissues. Tissues (n): Brain (2); Retinal Pigmented Epithelium (eye)(2); Palate(2); Tongue (1); Left ventricle (heart)(2); Lung(2);  Liver(2); Pancreas (2*); Stomach(1); Upper limb (2); Lower limb (2); Adrenal gland (2); Kidney (2) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  77 
 
  
    EGAD00001004336 
   
  
    
    The dataset for Evolution of neoantigen landscape during immune checkpoint blockade in non-small cell lung cancer includes 17 bam files from next-generation sequencing on the Illumina HiSeq2500. The biospecimens analyzed include matched tumor pre-treatment, post-progression and normal samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001004337 
   
  
    
    Whole Genome Sequencing files accompanying the paper titled "Structure and evolution of double minutes in diagnosis and relapse brain tumors". Please read the paper for more details. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004339 
   
  
    
    Dataset for "Genomic landscape of oral cancers" (CGI WGS) 
    
   
  
    
      
      Complete Genomics 
      
    
   
  59 
 
  
    EGAD00001004340 
   
  
    
    The dataset "NKI-AvL CRC-OVC DNA-seq" includes 4 normal and 4 tumor BAM files from paired-end whole exome sequencing on Illumina HiSeq2500 and Illumina NovaSeq6000 for 2 colorectal cancer and 2 ovarian cancer patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001004341 
   
  
    
    The dataset "NKI-AvL CRC-OVC RNA-seq" includes 4 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 for 2 colorectal cancer and 2 ovarian cancer patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004342 
   
  
    
    The dataset "NKI-AvL CRC-OVC scTCR RNA-seq" includes 368 BAM files from paired-end RNA sequencing on Illumina MiSeq for 2 colorectal cancer and 2 ovarian cancer patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  368 
 
  
    EGAD00001004344 
   
  
    
    Data consists of 4,640 RNA-sequencing sample libraries. These libraries were sequenced from four sites of the upper gastro-intestinal tract (Barrett’s oesophagus, proximal normal oesophagus, proximal normal stomach, and duodenum) in two experiments. 4,587 libraries were produced in the first experiment, in which whole transcriptomes were isolated from single cells dissociated from endoscopic biopsy tissue obtained from the four previously mentioned tissues. The other 53 libraries were produced in the second experiment, in which whole transcriptomes were isolated from whole tissue from endoscopic biopsies of the four previously mentioned tissues. The data found here are stored in the raw fastq file format from paired end sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4640 
 
  
    EGAD00001004345 
   
  
    
    This dataset contains variant call format files generated from whole exome sequencing of germline DNA from individuals diagnosed with testicular germ cell cancer. 
    
   
  
    
   
  960 
 
  
    EGAD00001004346 
   
  
    
    This is a bulk DNA and RNA sequencing study of human renal tumours. 
This dataset contains all the data available for this study on 2018-09-19. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  37 
 
  
    EGAD00001004347 
   
  
    
    We analyzed alternative splicing in Shh medulloblastoma. This dataset contains BAM files of whole genome sequencing from 4 cases. Genomic DNA was isolated from both tumor and matched control specimens. We performed whole genome sequencing on Illumina HiSeq. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001004348 
   
  
    
    This dataset includes microRNA sequencing data from 198 human serum samples, representing a subset of 66 women with no history of cancer who participated in the UKCTOCS study and with serum samples collected at three timepoints over a period of up to 5 years. Small RNA libraries prepared from the serum samples were sequenced with 50-bp single end reads on an Illumina HiSeq 2000 instrument. Data is provided as FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  198 
 
  
    EGAD00001004351 
   
  
    
    ERBB2/HER2 transmembrane and juxtamembrane domain mutations in cancer. Exome sequencing of tumor and matched blood and 2 blood samples from relatives. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004352 
   
  
    
    The Whole Exome Sequencing dataset contains 30 whole exome sequencing files (tumor, germ line DNA) and phenotype metadata for 15 patients on the phase II clinical trial of neoadjuvant immune checkpoint blockade in high-risk resectable melanoma at MD Anderson Cancer Center (NCT02519322). Included are data from baseline samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004353 
   
  
    
    The aim of this study was to compare the mutational landscape of breast cancer diagnosed during pregnancy (BCP) and breast cancer from age/stage non-pregnant patients (controls). We present whole genome sequencing data (Illumina HiSeq X ten platform) of tumor and matched normal tissues from 35 BCP patients and 20 controls. This work provides important novel biological insights and a unique resource to study the biology of breast cancer in young women and how pregnancy could modulate tumor biology. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  106 
 
  
    EGAD00001004355 
   
  
    
    This dataset consists of 22 samples linked to 22 BAM files from whole genome and whole exome sequencing of esthesioneuroblastomas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  22 
 
  
    EGAD00001004356 
   
  
    
    Dataset for "Genomic landscape of oral cancers" (Illumina WGS) 
    
   
  
    
   
  106 
 
  
    EGAD00001004357 
   
  
    
    Whole genome sequencing of sick children in neonatal and paediatric intensive care units. Datasets EGAD00001007780 (GRCh37) and EGAD00001007868 (GRCh38) are extensions of this dataset. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  219 
 
  
    EGAD00001004358 
   
  
    
    EGAS00001002317 - Whole exome sequencing data of 18 RIMs with matched bloods. Median depth of 112x (range of 110-120). Performed on Illumina HiSeq Platform.
EGAS00001002318 - RNA sequencing data of 18 RIMs on the Illumina HiSeq Platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001004359 
   
  
    
    4 WGS bam files for 4 cases with fusion 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001004360 
   
  
    
    10 RNA-Seq bam files including 4 cases with fusion and 6 controls without fusion. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001004361 
   
  
    
    Summary statistics from GWAS meta-analysis of cervical cancer 
    
   
  
    
   
  2 
 
  
    EGAD00001004362 
   
  
    
    Exemplar asymptomatic controls (n=10, 6 males) and exemplar cases with chronic Achilles tendinopathy (n=10, 6 males), representing divergent extremes of the phenotype spectrum, were selected for WES. Individual samples underwent paired-end sequencing on the Illumina HiSeq 2000/2500 platform at 30X coverage using the Agilent V5+UTR (71Mbp) capture kit. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001004363 
   
  
    
    FastQ files with paired-end RNAseq data for human fetal brain homogenate from 120 samples (12-19 post-conception weeks). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  120 
 
  
    EGAD00001004364 
   
  
    
    Whole-exome sequencing (WES) was performed on a total of 34 PC specimens, with 15 cases having matched gDNAs extracted from blood. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  49 
 
  
    EGAD00001004365 
   
  
    
    Whole transcriptome sequencing (RNA-seq) was performed on 39 PC specimens. Among them, 21 specimens also had WES data. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  39 
 
  
    EGAD00001004366 
   
  
    
    Dataset for "Genomic landscape of oral cancers" (Illumina RNA) 
    
   
  
    
   
  110 
 
  
    EGAD00001004367 
   
  
    
    Single cell RNA-seq analysis of human skin. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001004368 
   
  
    
    Targeted gene sequencing of cancer driver genes to determine the driver mutations present in newly-derived cancer organoid models 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001004370 
   
  
    
    Illumina whole genome sequencing to high depth (x50) of four Tanzanian individuals. Genomic DNA derived from peripheral whole blood. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001004371 
   
  
    
    Microbiome analysis was performed on the patient samples collected pre-FMT and on days after FMT, and on samples collected from the FMT donor. Genomic bacterial DNA was extracted from fecal samples using the QIAamp DNA Stool kit (Qiagen, Hilden, Germany), with the addition of a bead-beating lysis step. Genomic 16S ribosomal-RNA V4 variable regions were amplified and sequenced on the Illumina MiSeq platform. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  11 
 
  
    EGAD00001004372 
   
  
    
    This data set consists of DQ2.5-glia-a1a- and DQ2.5-glia-w1- specific T-cell receptor  sequences from single cells isolated from blood or biopsies of celiac disease patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  53 
 
  
    EGAD00001004373 
   
  
    
    DNA was obtained from either CD138+ cells from the bone marrow of multiple myeloma patients (tumor) or from stem cell harvests or peripheral blood cells from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using the HyperPlus kit (KAPA Biosystems).  After PCR amplification the libraries were hybridized with probes against a targeted panel consisting of 140 genes and chromosomal regions (Nimblegen) using SeqCap reagents (Nimblegen). Hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  263 
 
  
    EGAD00001004374 
   
  
    
    The dataset includes 43 matched normal samples from 43 NF1-glioma patients profiled by Whole Exome Sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  43 
 
  
    EGAD00001004375 
   
  
    
    The dataset includes 59 tumor samples from 56 NF1-glioma patients profiled by Whole Exome Sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  59 
 
  
    EGAD00001004376 
   
  
    
    The dataset includes 29 tumor samples from NF1-glioma patients profiled by RNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  29 
 
  
    EGAD00001004378 
   
  
    
    Fastq files from exome sequencing of paired normal/tumor (pre and post-nCRT) samples from 7 patients with rectal tumors.  All samples were sequenced on a 5500xl SOLiD sequencing platform (Thermo Fisher Scientific). 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  22 
 
  
    EGAD00001004379 
   
  
    
    Shallow whole‐genome sequencing dataset on samples from three patients who underwent histological transformation to small‐cell lung cancer. Samples included in this dataset include normal buffy coat samples, plasma samples collected at diagnosis of NSCLC as well as prior to small‐cell transformation and after SCLC transformation and progression on cisplatin and irinotecan. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001004380 
   
  
    
    Glioblastoma patient derived Fast/Slow cycling cancer stem cell RNA sequencing.
Consists of 3 patient cell lines and 6 files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001004384 
   
  
    
    Whole Genome Sequencing of 44 patients with Chronic Lymphocytic Leukemia. This dataset comprises 44 .bam files aligned to the hg19 build of the human genome from sequencing reads generated on an Illumina HiSeq instrument. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  44 
 
  
    EGAD00001004385 
   
  
    
   
  
    
      
      454 GS FLX Titanium 
      
      AB 3730xL Genetic Analyzer 
      
      Illumina MiSeq 
      
    
   
  171 
 
  
    EGAD00001004386 
   
  
    
    Whole Exome Sequencing reads consisting of BAM paired end reads from Follicular Lymphoma samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001004387 
   
  
    
    WGS of ovarian cancer organoids, tumor samples and blood references.
Ovarian cancer (OC) is a heterogeneous disease usually diagnosed at a late stage. Experimental in vitro models that faithfully capture the hallmarks and tumor heterogeneity of OC are limited and hard to establish. We present a novel protocol that enables efficient derivation and long-term expansion of OC organoids. Utilizing this protocol, we have established 56 organoid lines from 32 patients, representing the spectrum of ovarian neoplasms, including non-malignant borderline tumors, as well as mucinous, clear-cell, endometrioid, low- and high-grade serous carcinomas. OC organoids recapitulate histological and genomic features of the pertinent lesion from which they were derived, illustrating intra- and inter-patient heterogeneity, and can be genetically modified.  We show that OC organoids can be used for drug screening assays and capture different tumor subtype responses to the gold standard platinum-based chemotherapy, including acquisition of chemoresistance in recurrent disease. Finally, OC organoids can be xenografted, enabling in vivo drug sensitivity assays. Taken together, this demonstrates their potential application for research and personalized medicine. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  111 
 
  
    EGAD00001004388 
   
  
    
    DDD DATAFREEZE 2017-12-15: 13,462 trios and probands only - phenotypic and family descriptions 
    
   
  
    
   
  - 
 
  
    EGAD00001004389 
   
  
    
    DDD DATAFREEZE 2017-12-15: 13,462 trios and probands only - exome sequence VCF files 
    
   
  
    
   
  - 
 
  
    EGAD00001004390 
   
  
    
    DDD DATAFREEZE 2017-12-15: 13,462 trios and probands only - exome sequence CRAM files 
    
   
  
    
   
  - 
 
  
    EGAD00001004391 
   
  
    
     This is a prospective, single arm phase IIa trial in which patients with early breast cancer will receive pre-operatively two doses of denosumab 120mg subcutaneously one week apart (maximum 12 days) followed by surgery. Tumor, normal breast tissue and blood samples will be collected at baseline and at surgery. Post-operative treatment will be at the discretion of the investigator. Primary objective: to determine if a short course of RANKL inhibition with denosumab can induce a decrease in tumor proliferation rates as determined by Ki67 immunohistochemistry (IHC) in newly diagnosed, early stage breast cancer in pre-menopausal women. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001004393 
   
  
    
    26 samples from Cameroon generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001004394 
   
  
    
    Dataset consists of fastq files of Ribo-seq, polyA-RNA and total RNA sequencing of 80 samples (65 DCM cases and 15 controls) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  80 
 
  
    EGAD00001004396 
   
  
    
    BAM files of individuals from the 1958BC aligned to hg17 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  648 
 
  
    EGAD00001004397 
   
  
    
     We profiled the transcriptomes (RNA-sequencing) of 40 clinically significant invisible and visible tumors, all with ISUP Grade 2 disease and treated by radical prostatectomy. Twenty tumors were mpMRI invisible (PI-RADSv2: 1-2), while 20 tumors were visible (PI-RADSv2: 5). 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  40 
 
  
    EGAD00001004398 
   
  
    
    Hotspot mutations in the spliceosome gene SF3B1 are reported in 20% of uveal melanomas. SF3B1 is involved in 3'-splice site (3'ss) recognition during RNA splicing; however, the molecular mechanisms of its mutation have remained unclear. Here we show, using RNA-Seq analyses of uveal melanoma, that the SF3B1 R625/K666 mutation results in deregulated splicing at a subset of junctions, mostly by the use of alternative 3'ss. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  76 
 
  
    EGAD00001004399 
   
  
    
    ctDNA and protein markers for earlier detection of pancreatic cancers 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  14 
 
  
    EGAD00001004400 
   
  
    
    SNV calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study 
    
   
  
    
   
  538 
 
  
    EGAD00001004401 
   
  
    
    SNV calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study 
    
   
  
    
   
  562 
 
  
    EGAD00001004402 
   
  
    
    SNV calls generated using the SomaticSniper-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study 
    
   
  
    
   
  580 
 
  
    EGAD00001004403 
   
  
    
    CNA calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study 
    
   
  
    
   
  538 
 
  
    EGAD00001004404 
   
  
    
    CNA calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study 
    
   
  
    
   
  562 
 
  
    EGAD00001004405 
   
  
    
    CNA calls generated using the SomaticSniper-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study 
    
   
  
    
   
  580 
 
  
    EGAD00001004406 
   
  
    
    16S rRNA gene sequencing with Illumina MiSeq (V4 hypervariable region) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  188 
 
  
    EGAD00001004408 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  117 
 
  
    EGAD00001004409 
   
  
    
    Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 134 samples: matched tumour-normal pairs from 67 mucosal melanoma cases 
    
   
  
    
   
  - 
 
  
    EGAD00001004410 
   
  
    
    DNA extracted from sorted CD19+ tumor cells (16 patients) was used for exome capture with the SureSelect V5 All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001004411 
   
  
    
     DNA extracted from sorted CD3+ cells (16 patients) was used for exome capture with the SureSelect V5 Mb All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001004412 
   
  
    
    RNA-Seq was performed on 12 samples of sorted CD19+ tumor cells. RNA-Seq libraries were prepared using the SureSelect Automated Strand Specific RNA Library Preparation Kit as per manufacturer’s instructions (Agilent technologies) and subjected to paired-end (2 x 100 bp) sequencing on HiSeq2000 (Illumina). The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001004413 
   
  
    
     There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is a lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publicly available transcriptomic and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients.
The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. 
Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-10-23. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004414 
   
  
    
     There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is a lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publicly available transcriptomic and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients.
The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. 
Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-10-23. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001004415 
   
  
    
     There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is a lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publicly available transcriptomic and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients.
The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. 
Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-10-23. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001004416 
   
  
    
     There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is a lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publicly available transcriptomic and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients.
The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. 
Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2018-10-23. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001004417 
   
  
    
    Data supporting: "The landscape of selection in 551 Esophageal Adenocarcinomas defines genomic biomarkers for the clinic." Frankell et al.
WGS (BAM files)
379 matched tumour-normal pairs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004419 
   
  
    
    Summary statistics of a GWAS meta-analysis for severe acne. A total of 7,441,713 genotyped and imputed variants were used for 5,602 European severe acne cases and 21,120 matched population controls. 
    
   
  
    
   
  1 
 
  
    EGAD00001004420 
   
  
    
     Whole exome and targeted sequencing data from 11 glioblastoma multiforme patients. A total of 70 tumour specimens and 11 blood samples were used for whole exome sequencing (WES) using the Agilent SureSelectXT Human All Exon V5 Kit. Two custom targeted sequencing panels were designed using Agilent’s Haloplex (TES1) or Agilent SureSelect XT2 technology (TES2). Libraries were sequenced on an Illumina HiSeq2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  194 
 
  
    EGAD00001004421 
   
  
    
    FASTQ files of the RNA-Seq data for both the normal and tumor samples for the study "Genomic landscape of lung adenocarcinoma in East Asians". For raw read count data as well as other metadata, please download from https://src.gisapps.org/OncoSG_public/study/summary?id=GIS031 by clicking the download icon next to the dataset title. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  260 
 
  
    EGAD00001004422 
   
  
    
    FASTQ files of the Exome-Seq data for both the normal and tumor samples for the study "Genomic landscape of lung adenocarcinoma in East Asians". For mutations and copy number variants called by this study, please download from https://src.gisapps.org/OncoSG_public/study/summary?id=GIS031 by clicking the download icon next to the dataset title. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  418 
 
  
    EGAD00001004423 
   
  
    
    Data supporting: "The landscape of selection in 551 Esophageal Adenocarcinomas defines genomic biomarkers for the clinic." Frankell et al.
RNAseq (BAM files)
116 tumours 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004424 
   
  
    
    Prostate Cancer - RNA-Seq unmapped reads 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  148 
 
  
    EGAD00001004425 
   
  
    
    FASTQ files from sequencing to < 0.4x depth of coverage of thirteen glioma patients. Indexed sequencing libraries were prepared using a commercially available kit (ThruPLEX-Plasma Seq, Rubicon Genomics). Libraries were pooled in equimolar amounts and sequenced on a HiSeq 4000 (Illumina) generating 150-bp paired-end reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  13 
 
  
    EGAD00001004426 
   
  
    
    Spiradenocarcinoma is a rare cutaneous sweat gland adnexal cancer with potential for aggressive behaviour. They are classified histologically into low- and high-grade tumours, with morphologically low-grade tumours thought to behave more favourably. However, limited information is available, with only 18 published cases. 
We have collected morphologically low-grade spiradenocarcinomas (one with a lung metastasis) and high-grade spiradenocarcinomas, as well as some spiradenomas (benign lesions), cylindromas (another type of malignant cutaneous sweat gland adnexal tumour) and hybrid spiradenoma-cylindromas. H&E-stained sections were reviewed, follow-up was obtained, and immunohistochemistry for Ki-67, p53, and MYB has been performed. The tumours were solitary, measuring 0.8-7 cm (median: 2.7 cm), with a predilection for the head and neck of elderly patients (median age: 72 years; range 53-92) without gender bias. Histologically, the tumours were multinodular and located in deep dermis and subcutis. A pre-existing spiradenoma was present in all cases. The malignant component was characterized by expansile growth with loss of the dual cell population, up to moderate cytological atypia and increased mitotic activity (median: 10/10 HPF; range 1-28). Additional findings included squamoid differentiation (n=9), necrosis (n=7), and ulceration (n=5). P53 expression was variable and no significant differences were noted in the benign compared with the malignant parts of the tumours. In contrast, in the malignant components the Ki-67 proliferative index was slightly increased, and MYB expression was lost. Follow-up (median: 67 months; range: 13-132) available for 16 patients (84%) revealed a local recurrence rate of 19% but no metastases or disease-related mortality. Here we wish to exome sequence these cases to define the first genomic landscape for this malignancy. 
    
This dataset contains all the data available for this study on 2018-10-29. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  164 
 
  
    EGAD00001004427 
   
  
    
    55 single read fastq files of low-pass WGS sequencing used to determine copy number aberrations and 34 paired read fastq files of 48 cancer gene exon sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001004428 
   
  
    
     Peripheral T-cell lymphomas not otherwise specified (PTCL-NOS) represent a heterogeneous group of nodal and extra-nodal mature T-cell lymphomas, with a low prevalence in Western countries. PTCL-NOSs account for about 25% of all PTCLs and are currently diagnosed based on exclusion criteria, as these lymphomas lack unifying morphological, phenotypic and genomic features. Cytogenetic and FISH analysis of PTCL-NOS samples have not revealed recurrent pathogenetic abnormalities, while gene expression profiling has shown only partial ability to segregate cases representing homogeneous clinic-pathological entities. This underscores the need to look at PTCL-NOS with innovative and high-throughput approaches to identify recurrent genetic lesions that could further our understanding of the biology of this heterogeneous group of diseases, provide better diagnostic tools and perhaps new targets for innovative treatments.
Our aim is to study ~15 patients affected by PTCL-NOS. Our study will be funded by a private, non-profit Italian cancer research fund (Associazione Italiana per la Ricerca sul Cancro, www.airc.it) based on a grant owned by Anna Dodero and Cristiana Carniti, hematologists at INT.
Samples will be analysed by whole genome sequencing using Illumina X10 machines, on a 150bp-PE protocol. Data will be analysed using the pipeline available in Team 78, under the supervision of Peter Campbell, the WTSI faculty who will oversee the project, and by Francesco Maura, visiting scientist at the WTSI.
     
This dataset contains all the data available for this study on 2018-10-30. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  27 
 
  
    EGAD00001004429 
   
  
    
    ChIP-Seq files for PCGP ATRX study paper titled "MYCN Amplification and ATRX Mutations are Incompatible in Neuroblastoma" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  121 
 
  
    EGAD00001004430 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  56 
 
  
    EGAD00001004431 
   
  
    
    SNV calls generated using the SomaticSniper-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004432 
   
  
    
    SNV calls generated using the SomaticSniper-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004433 
   
  
    
    SNV calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004434 
   
  
    
    SNV calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004435 
   
  
    
    Childhood cerebellar tumours mirror conserved fetal transcriptional programs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  145 
 
  
    EGAD00001004436 
   
  
    
     Exome sequencing was performed on samples from patients 064, 105, and 8760, including the remission sample for 105. Exomes were captured using the Agilent SureSelect All Exon kit v5 and libraries were sequenced on HiSeq 2000, 2500 or 4000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001004437 
   
  
    
     RNA-seq data from TEX cells transduced with HIST1H3H WT, HIST1H3H K27M, HIST1H3F WT, HIST1H3F K27I and Luc2 control in triplicate. In addition, RNA-Seq was performed on untransduced TEX cells. Libraries [rRNA-depleted stranded (HMR)] were sequenced on an Illumina HiSeq 4000 platform to generate 100 bp paired-end reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  18 
 
  
    EGAD00001004439 
   
  
    
    mRNA sequencing of 50 undifferentiated sarcoma tumour samples and 5 adjacent muscle tissue samples. BAM files are provided, with metadata specifying which samples are tumour/normal. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  53 
 
  
    EGAD00001004440 
   
  
    
    CNA calls generated using the SomaticSniper-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004441 
   
  
    
    CNA calls generated using the SomaticSniper-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004442 
   
  
    
    CNA calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004443 
   
  
    
    CNA calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions 
    
   
  
    
   
  40 
 
  
    EGAD00001004446 
   
  
    
    WGS files for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001004447 
   
  
    
    WES files for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  128 
 
  
    EGAD00001004448 
   
  
    
    60 samples from Burkina Faso and Ghana generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001004449 
   
  
    
    16S sequencing data (dual-index) from 1054 Flemish Gut Flora Project (FGFP) samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  1054 
 
  
    EGAD00001004450 
   
  
    
    Mapped BAM files of 162 tumor/normal WES experiments. 
    
   
  
    
      
      unspecified 
      
    
   
  324 
 
  
    EGAD00001004451 
   
  
    
     Whole genome sequencing data of organoid cultures derived from human bone marrow-derived and cord blood-derived hematopoietic stem and multipotent progenitor cells to study mutation accumulation. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  30 
 
  
    EGAD00001004452 
   
  
    
     Tumor (CD138+ plasma cells) and non-tumor (peripheral blood white cells or stem cell harvest) DNA from patients with plasma cell dyscrasias were sequenced.  Whole genome sequencing using high molecular weight DNA was performed using the 10X Genomics Chromium platform on either a HiSeq4000 or NovaSeq (Illumina) using 100 to 150 bp paired-end reads.  The dataset consists of 111 patients, and 223 samples in total (matched tumor and control per patient; one patient had 2 tumor samples sequenced).  Diseases consisted of 2 MGUS patients, 8 SMM patients, 91 newly diagnosed myeloma patients, 1 previously treated patient, 4 relapsed MM patients, and 5 PCL patients.  Paired RNA-seq data are available for 81 of the samples under study EGAS00001003411. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  231 
 
  
    EGAD00001004453 
   
  
    
    We performed targeted DNA sequencing of primary uveal melanomas and their matched metastases from 35 patients, analyzing a total of 124 tissues. Sequencing was performed on an Illumina HiSeq 2500 instrument using a panel of 538 genes commonly involved in cancer. 124 BAM files were generated. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  124 
 
  
    EGAD00001004454 
   
  
    
    FGFP (Flemish Gut Flora Project, N=100) and TR-MDD (Treatment-Resistant Major Depression Disorder, N=7) shotgun sequencing samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  157 
 
  
    EGAD00001004455 
   
  
    
     Whole exome and transcriptome sequencing of 12 melanoma patients (including technical replicates). ClinicalTrials.gov Identifier: NCT02035956. Paper: Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer - Nature volume 547, pages 222–226 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  72 
 
  
    EGAD00001004456 
   
  
    
    This dataset contains short-read whole-genome sequencing data for individuals with neurodevelopmental disorders and their relatives from the NIHR-BioResource Rare Disease Consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4 
 
  
    EGAD00001004457 
   
  
    
    Datasets Galaxy 929/938 describe the amplified single chromosome sequencing data. 
    
   
  
    
   
  2 
 
  
    EGAD00001004458 
   
  
    
    The study aims to find bacteria in neural tissue from patients with amyotrophic lateral sclerosis 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  34 
 
  
    EGAD00001004459 
   
  
    
     Each dataset consists of WES data from 5 samples (1 patient): original leukemia initial diagnosis T-ALL, original leukemia relapse T-ALL, PDX derived of initial diagnosis T-ALL, PDX derived of relapse T-ALL, remission (normal control) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  164 
 
  
    EGAD00001004461 
   
  
    
    RNAseq files (dataset 1 of 2) for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1083 
 
  
    EGAD00001004462 
   
  
    
    20 whole genome seq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001004463 
   
  
    
    RNAseq files (dataset 2 of 2) for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  204 
 
  
    EGAD00001004464 
   
  
    
    whole exome sequencing data 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  523 
 
  
    EGAD00001004465 
   
  
    
    Gene expression comparison between human colonic epithelial cells cultured with Klebsiella pneumoniae (KP) derived from PSC patients versus KP JCM1662. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004466 
   
  
    
     Low pass WGS: 48 samples (5 blood samples from 6 patient data): 22 Tumour cores and 26 normal/benign cores (NextSeq) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  48 
 
  
    EGAD00001004467 
   
  
    
    WES: 48 samples (5 blood samples from 6 patient data): 22 Tumour cores and 26 normal/benign cores (HiSeq) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004468 
   
  
    
    Total RNA Seq: 15 Samples (2 patients (MF1 and MF3)) (HiSeq) and Poly A RNA Seq: 27 Samples (4 patients Normal and Tumour ) (HiSeq) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001004469 
   
  
    
     829 BAM files from exome sequencing of human tetralogy of Fallot patients 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  829 
 
  
    EGAD00001004470 
   
  
    
    Exome sequence data from microcephalic dwarfism patients with de novo DNMT3A variants 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001004471 
   
  
    
    RNA-seq data generated from cells from control individuals and individuals with de novo DNMT3A variants causing microcephalic dwarfism. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004472 
   
  
    
    RRBS sequence data from one control and one patient with de novo DNMT3A mutations resulting in microcephalic primordial dwarfism. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  2 
 
  
    EGAD00001004473 
   
  
    
    ChIP-seq data from controls and patients with de novo DNMT3A mutations resulting in microcephalic primordial dwarfism. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  28 
 
  
    EGAD00001004474 
   
  
    
    Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 28 samples: matched tumour-normal pairs from 14 melanocytic nevi cases 
    
   
  
    
   
  - 
 
  
    EGAD00001004475 
   
  
    
    PacBio sequencing data of HKCI-2, HKCI-C1, HKCI-C2, HKCI-C3, HKCI-4, HKCI-9, HKCI-11, HKCI-5A and MIHA. All the data (9 samples) were saved in PacBio HDF5 format. 
    
   
  
    
      
      PacBio RS 
      
    
   
  9 
 
  
    EGAD00001004476 
   
  
    
    RNA sequence data for HKCI-2, HKCI-C1, HKCI-C2, HKCI-C3 and MIHA. RNA-seq data of all the 5 samples were stored in compressed fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001004478 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001004479 
   
  
    
    A total of 4 samples (2 WGS and 2 RNAseq) belonging to 2 patients are deposited in this study. 
    
   
  
    
   
  4 
 
  
    EGAD00001004480 
   
  
    
    In this work, we established and characterized a low-passage cervix cancer cell line from a Brazilian patient with squamous cell carcinoma. The dataset contains three samples from the same patient (blood, tumor tissue and the primary cell line). The technology used was exome sequencing, and the data are available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001004481 
   
  
    
    Transcriptomic sequences of small intestinal plasma cells (PCs) from Celiac disease patients. RNAseq data were produced using Illumina paired-end (75bp) reads. Includes only raw sequence data (fastq format). Samples are from seven Celiac disease patients and four healthy controls. Samples from Celiac disease patients contain sub-groups of PCs that are either specific or not specific to the autoantigen of the disease (TG2). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  18 
 
  
    EGAD00001004482 
   
  
    
    Whole genome sequencing and whole exome sequencing of 13 pediatric osteosarcoma patients including 13 primary, 10 metastatic, and 3 relapsed tumors. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  78 
 
  
    EGAD00001004483 
   
  
    
    Microarray analysis of mtDNA 
    
   
  
    
   
  5800 
 
  
    EGAD00001004484 
   
  
    
    Adeno-associated virus (AAV) is a defective single-stranded DNA virus, endemic in the human population (35-80%). Recurrent clonal AAV2 insertions are associated with the pathogenesis of rare human hepatocellular carcinoma (HCC) developed on normal liver. This study aimed to characterize the natural history of AAV infection in the liver and its consequence in tumor development.
In silico analyses using viral capture data explored viral variants and new clonal insertions. Clonal AAV insertions were positively selected during HCC development on non-cirrhotic liver, challenging the notion of AAV as a non-pathogenic virus. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001004485 
   
  
    
    This project focused on identifying rare coding variation that substantially increases risk of VEOIBD by exome sequencing of VEOIBD patients and some of their family members. Here you can find BAM files from an affected proband (P2) and his unaffected parents. In this study ALPI mutations were identified as a likely cause of the disease. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001004486 
   
  
    
    This dataset consists of aligned DNA sequencing data in BAM file format from cell-free DNA and white blood cells from 24 men with metastatic prostate cancer. One cell-free DNA sample and one white blood cell sample are available for each patient, resulting in 48 total BAM files in this dataset. The sequencing was performed using a hybrid capture-based targeted panel of 73 prostate cancer driver genes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004487 
   
  
    
    Whole transcriptome sequencing (WTS) of a longitudinal breast cancer (BC) cohort consisting of 146 cases (281 tumors, 109 pairs), including 52 (38%) that achieved pathologic complete responses (pCR) and 85 (62%) that harbored residual diseases at time of surgery. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  235 
 
  
    EGAD00001004488 
   
  
    
    Dataset of BAM files from patients with proven bacterial meningitis in Malawi. Samples consist of BAM files from PolyA RNA seq runs from patients classified as admission whole blood or CSF on admission (pre-antibiotics) in the Emergency Department. Subsequent BAM files are identical runs from whole blood from the same patients, taken at either day 10 or day 40 post admission to hospital. All patients have disease; they are divided into survivors and non-survivors at the day 40 time point. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  45 
 
  
    EGAD00001004489 
   
  
    
    This dataset contains matched RNA-Seq and miRNA-Seq fastq files from 109 matched samples of 34 human Papillomavirus-negative Head and Neck cancer patients, including 72 lymph nodes, 29 tumor and 8 normal samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  218 
 
  
    EGAD00001004490 
   
  
    
    The dataset consists of samples from papillary thyroid cancer patients. A total of 11 DNA samples from blood/normal and cancer tissue were subjected to whole exome sequencing using Illumina. The fastq files generated were aligned with reference genome ‘hg19’, duplicates were marked, and realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset. 
    
   
  
    
   
  11 
 
  
    EGAD00001004491 
   
  
    
    RNA-seq of seven small intestinal neuroendocrine tumors, sequenced with Illumina NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001004492 
   
  
    
    BAM files for high-throughput whole genome sequence data of 17 modern Aboriginal Australians 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  17 
 
  
    EGAD00001004493 
   
  
    
    Transcriptome of Ewing sarcoma tumors (ICGC project). Fastq files of 57 RNA-seq are available (2x101bp). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  57 
 
  
    EGAD00001004494 
   
  
    
    Illumina HiSeqXTen platform sequencing data of whole genome libraries prepared from 156 matched tumour-normal samples from 78 donors 
    
   
  
    
   
  156 
 
  
    EGAD00001004495 
   
  
    
    Whole genome sequencing data of tumor tissues, adjacent normal tissues, and peripheral blood from CRC patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  107 
 
  
    EGAD00001004496 
   
  
    
    Tumour and control from a patient with uveal melanoma with a MBD4 germline mutation. Samples were whole exome sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004497 
   
  
    
    Single-cell RNA-seq profiling of immune cells sorted from human Melanoma tumors (and several matching PBMC samples). Contains de-multiplexed FASTQ files per plate (MARS-seq amplification batch, total 204 samples) and also de-multiplexed FASTQ files of single-cell TCRb-seq. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  204 
 
  
    EGAD00001004498 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001004499 
   
  
    
    In this study, we aimed to identify somatic structural variation of T-cell acute lymphoblastic leukemias (T-ALLs) from patient-derived xenografts (PDX) at the single-cell level. For this purpose, we performed strand-specific single-cell sequencing of PDX-derived T-ALL relapse samples from two juvenile patients (P1, P33). To validate structural variation detected via scTRIP, we profiled whole exome sequencing (WES) data from P33 (samples taken during initial disease, remission, relapse), and mate-pair sequencing data from P1 (relapse). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  124 
 
  
    EGAD00001004500 
   
  
    
    Mapped data for 10 Colon MSI cancer samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001004501 
   
  
    
    Genomic and transcriptomic data from a cohort of 35 RAS wild-type colorectal cancers. All 35 cases were DNA sequenced at baseline (BL) before treatment with single agent cetuximab. Progressive disease (PD)-biopsies were taken shortly after radiological progression and successfully exome sequenced from 24/35 cases. mRNA sequencing is available for 25 Baseline and 15 PD samples. ctDNA from 9 cases that progressed after prolonged cetuximab benefit were also deep sequenced. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  155 
 
  
    EGAD00001004503 
   
  
    
    Whole exome sequencing data of 19 snap-frozen peritoneal mesothelioma (tumor) samples and 16 matched normal samples. Sequencing library was prepared using Ion AmpliSeq Exome RDY Library Preparation. Samples were sequenced on the Ion Proton System using the Ion PI Hi-Q Sequencing 200 Kit and Ion PI v3 chip. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  35 
 
  
    EGAD00001004504 
   
  
    
    RNA-seq data of 15 snap-frozen tissue samples of peritoneal mesothelioma. The strand-specific RNA library was prepared using TruSeq (Illumina) and paired-end sequencing was performed on an Illumina HiSeq 4000. The dataset contains paired fastq files for each of the 15 tumor samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  15 
 
  
    EGAD00001004505 
   
  
    
    49 samples from Nigeria generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  49 
 
  
    EGAD00001004506 
   
  
    
    WES files for CHEN WTPDX paper titled "Forty-Five patient-derived xenografts capture the clinical and biological heterogeneity of Wilms tumor" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  107 
 
  
    EGAD00001004507 
   
  
    
    RNAseq files for CHEN WTPDX RNASEQ paper titled "Forty-Five patient-derived xenografts capture the clinical and biological heterogeneity of Wilms tumor" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  88 
 
  
    EGAD00001004509 
   
  
    
    Ovarian cancer (OC) is a heterogeneous disease usually diagnosed at a late stage. Experimental in vitro models that faithfully capture the hallmarks and tumor heterogeneity of OC are limited and hard to establish. We present a novel protocol that enables efficient derivation and long-term expansion of OC organoids. Utilizing this protocol, we have established 56 organoid lines from 32 patients, representing the spectrum of ovarian neoplasms, including non-malignant borderline tumors, as well as mucinous, clear-cell, endometrioid, low- and high-grade serous carcinomas. OC organoids recapitulate histological and genomic features of the pertinent lesion from which they were derived, illustrating intra- and inter-patient heterogeneity, and can be genetically modified.  We show that OC organoids can be used for drug screening assays and capture different tumor subtype responses to the gold standard platinum-based chemotherapy, including acquisition of chemoresistance in recurrent disease. Finally, OC organoids can be xenografted, enabling in vivo drug sensitivity assays. Taken together, this demonstrates their potential application for research and personalized medicine. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  50 
 
  
    EGAD00001004512 
   
  
    
    High throughput sequencing dataset of antibody repertoires from naive B-cells, taken from blood samples of 100 individuals from Norway. 48 healthy controls and 52 patients with celiac disease. Sequencing was performed using a 300*2 paired-end kit by Illumina MiSeq. The sequences were processed using pRESTO. Each fastq file in the dataset is the repertoire of a single individual. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  100 
 
  
    EGAD00001004513 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Cerebral Small Vessel Disease (CSVD) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004514 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Hypertrophic Cardiomyopathy (HCM) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004515 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Intrahepatic Cholestasis of Pregnancy (ICP) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004516 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Neuropathic Pain Disorders (NPD) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004517 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Primary Membranoproliferative Glomerulonephritis (PMG) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004518 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Steroid Resistant Nephrotic Syndrome (SRNS) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004519 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Bleeding, Thrombotic and Platelet Disorders (BPD) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004520 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Inherited Retinal Disorders (IRD) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004521 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Multiple Primary Malignant Tumours (MPMT) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004522 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Neurological and Developmental Disorders (NDD) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004523 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Primary Immune Disorders (PID) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004524 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Stem cell and Myeloid Disorders (SMD) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004525 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Pulmonary Arterial Hypertension (PAH) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004526 
   
  
    
    We set out to determine ctDNA abundance at de novo mCSPC diagnosis and whether ctDNA provides complementary clinically relevant information to a prostate biopsy. We collected and sequenced 77 plasma cell-free DNA samples from 53 newly diagnosed patients with mCSPC. Targeted sequencing was also performed on DNA from 48 diagnostic prostate tissue samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  178 
 
  
    EGAD00001004528 
   
  
    
    This dataset contains 31 pancreatic organoid samples used in the 'organoid data of pancreatic cancers ' study 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  31 
 
  
    EGAD00001004529 
   
  
    
    This dataset contains 53 blood samples used as controls in study EGAXXXXX and EGAXXXXX 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  53 
 
  
    EGAD00001004530 
   
  
    
    This dataset includes somatic small variant calling files derived from fifteen metastatic samples from cutaneous squamous cell carcinoma matched to normal blood samples. These samples were whole-genome sequenced by HiSeq X Ten and the resulting reads were mapped against the human genome (hg37) using BWA-MEM 0.7.10-r789. Somatic variant calling was then performed using strelka 1 (version 2.0.17). 
    
   
  
    
   
  13 
 
  
    EGAD00001004532 
   
  
    
    In this dataset there are 55 whole genome sequencing samples of epithelial ovarian carcinoma (bam files). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  55 
 
  
    EGAD00001004533 
   
  
    
    48 samples from Botswana generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004534 
   
  
    
    Whole exome sequencing of 17 tumors from 12 different individuals with biallelic germline NTHL1 mutations from 9 different tissue types. Provided are 17 bam files which are mapped to human genome version GRCh37. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001004535 
   
  
    
    Dataset of 27 whole genome sequencing files (BAM), which cover 12 individuals, four of whom suffer from COPD. For 9 individuals, sequencing data are available from blood and from lung brushings at a single site; for the other 3, a lung brushing sequencing file from a second site is additionally available. Comparison of blood with lung brushings allows the calling of somatic mutations within a tissue. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  27 
 
  
    EGAD00001004537 
   
  
    
    Primary pediatric osteosarcoma samples were collected and profiled using WGS. When possible, germline and tumor samples were collected. For some patients, multiple tumor tissues were collected and sequenced. Some of the samples were used to derive PDTX models, which were also profiled with WGS. Paired end sequencing was performed on Illumina HiSeq instruments and FASTQ files reported. 
    
   
  
    
      
      unspecified 
      
    
   
  75 
 
  
    EGAD00001004538 
   
  
    
    Primary pediatric osteosarcoma samples were collected and profiled using RNAseq. When possible, germline and tumor samples were collected. For some patients, multiple tumor tissues were collected and sequenced. Some of the samples were used to derive PDTX models, which were also profiled with RNAseq. Paired end sequencing was performed on Illumina HiSeq instruments and FASTQ files reported. 
    
   
  
    
      
      unspecified 
      
    
   
  30 
 
  
    EGAD00001004539 
   
  
    
    Whole exome sequencing (WES) libraries were prepared from 200ng of genomic DNA using the Agilent SureSelect XT Target Enrichment System for Illumina Paired-End Multiplexed Sequencing Library coupled with the Agilent SureSelect XT Human all exon v6 capture reagent. Libraries were sequenced on a NextSeq 550 sequencer using the High output 300 cycles kit generating 150bp paired end single-indexed reads. Alignment against b37 using Novoalign (version 3.02.08). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  48 
 
  
    EGAD00001004541 
   
  
    
    RNA-sequencing of human hepatocellular carcinoma biopsies (n=14), 44 HCC xenografts derived from 11 HCC biopsies and 3 lymphoma xenografts derived from 3 HCC biopsies. RNA-sequencing was performed using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Gold (Illumina). SR126 sequencing was performed on an Illumina HiSeq 2500 using v4 SBS chemistry according to the manufacturer’s guidelines. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  61 
 
  
    EGAD00001004542 
   
  
    
    The dataset includes whole exome sequencing (WES) data on 57 matched esophageal tumor-normal pairs. The Agilent Sure-Select Human All Exon V4 plus UTRs reagent was used to capture the target exons and UTRs, and an Illumina HiSeq 2000 instrument was used to sequence the target region with approximately 72-fold coverage. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  114 
 
  
    EGAD00001004543 
   
  
    
    RNA from CD138+ plasma cells from patients with a plasma cell dyscrasia was sequenced using the TruSeq Stranded Total RNA Ribo-zero Gold kit (Illumina). Paired-end 75bp reads were generated on a NextSeq500 or HiSeq4000 (Illumina). The dataset consists of 1 MGUS sample, 5 SMM samples, 69 newly diagnosed myeloma samples, 1 relapsed myeloma sample, 1 previously treated myeloma sample, and 4 PCL samples. Matching whole genome sequencing data are available for these samples under study EGAS00001003164. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  83 
 
  
    EGAD00001004544 
   
  
    
    This Dataset is currently hosted by the European Nucleotide Archive. To access the data contained within the Dataset please follow the link below:
        https://www.ebi.ac.uk/ena/browser/view/PRJEB39323
Dataset consists of 20 snRNA-seq bam files from 10X v2: 5 samples from postmortem white matter tissue from non-neurological controls and 15 samples from different MS lesions from the white matter tissue of 4 postmortem progressive MS patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001004545 
   
  
    
    65 paired tumor and normal whole-genome sequencing samples from urothelial bladder carcinomas (UBC, the most common type of bladder cancer) are used to uncover the whole-genome mutational landscape of UBC. Recurrent mutations in noncoding regions affecting gene regulatory elements and structural variations leading to gene disruptions are prevalent in this type of cancer. 
    
   
  
    
   
  65 
 
  
    EGAD00001004547 
   
  
    
    All normal somatic cells are thought to acquire mutations. However, characterisation of the patterns and consequences of somatic mutation in normal tissues is limited. Uterine endometrium is a dynamic tissue undergoing cyclical shedding and reconstitution lined by a gland-forming epithelium. Whole genome sequencing of normal endometrial glands showed that most are clonal cell populations derived from a recent common ancestor, with mutation burdens differing from other normal cell types and many fold lower than endometrial cancers. Mutational signatures found ubiquitously account for most mutations. Many, in some women all, endometrial glands are colonised by cell clones carrying driver mutations in cancer genes, often with multiple drivers. Total and driver mutation burdens increase with age, but are also influenced by other factors, including body mass index and parity, and clones with drivers often originate during early decades of life. The somatic mutational landscapes of normal cells differ between cell types and are revealing the procession of neoplastic change leading to cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001004548 
   
  
    
    Integration of Genomic and Transcriptional Features in Pancreatic Cancer Reveals Increased Cell Cycle Progression in Metastases - RNA-Seq mapped and unmapped reads 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  75 
 
  
    EGAD00001004550 
   
  
    
    This dataset contains Whole Exome Sequencing of 47 MSI colorectal cancers (CRCs) and paired adjacent normal mucosa 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  94 
 
  
    EGAD00001004551 
   
  
    
    Integration of Genomic and Transcriptional Features in Pancreatic Cancer Reveals Increased Cell Cycle Progression in Metastases - WGS mapped reads 
    
   
  
    
   
  - 
 
  
    EGAD00001004552 
   
  
    
    10X genomics chromium single-cell RNA-sequencing of (i) patient derived triple negative breast cancer xenograft (ii) primary tumour and ascites ovarian cancer cell lines at tumour recurrence. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  3 
 
  
    EGAD00001004553 
   
  
    
    Direct library preparation+ single-cell DNA-sequencing of (i) patient derived triple negative breast cancer xenograft (ii) primary tumour and ascites ovarian cancer cell lines at tumour recurrence. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  980 
 
  
    EGAD00001004554 
   
  
    
    Uveal melanoma (UM) is the most common primary intraocular malignancy in adults. Despite improvement of diagnosis and treatment of the primary tumor, there is no effective treatment of metastatic disease and approximately half of patients will die within one year or less following metastases detection. Tumor heterogeneity has been proposed as a key factor of drug resistance. However, it has been scarcely studied in UM. The present project aims to search for specific drivers of metastatic progression, describe the genomic and transcriptomic landscape of metastatic UM, explore tumor heterogeneity and investigate its role in drug resistance. To this end, whole exome sequencing and transcriptomics have been performed on constitutional, primary tumor and metastatic samples from 28 UM patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  110 
 
  
    EGAD00001004555 
   
  
    
    The aim of our project is to decipher the genomics of advanced hepatocellular carcinoma using whole exome sequencing. To this end, we aim to compare the genetic landscape of advanced hepatocellular carcinoma with that of early tumors in order to understand the mechanisms of tumor progression. This work will also help to identify new therapeutic targets potentially useful for treating patients at an advanced stage. This dataset contains whole exome sequencing aligned reads for 41 tumors with matched normal samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  39 
 
  
    EGAD00001004556 
   
  
    
    In total, 186 FH+ ESCC cases were sequenced using whole-exome sequencing, then 1935 ESCC cases and 1186 geographically-matched healthy controls were sequenced using a 7 Mb custom-designed Roche SeqCap kit which targeted about 600 genes. The libraries were constructed and then sequenced on an Illumina platform. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3289 
 
  
    EGAD00001004557 
   
  
    
    The dataset includes BAM, FASTQ and decompressed gVCF files for 50 samples from Benin generated for the H3Africa Chip Design Study. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001004558 
   
  
    
    Genotyping of 43 cases of invasive GAS infection by Illumina HumanCore-24 array and Illumina Global Screening Array. 
    
   
  
    
   
  43 
 
  
    EGAD00001004559 
   
  
    
    WGBS files for PCGP NBL_MYCN_ATRX paper titled "MYCN Amplification and ATRX Mutations are Incompatible in Neuroblastoma" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001004561 
   
  
    
    Plasma DNA libraries were constructed from 4 mL of plasma without library enrichment, namely without PCR amplification. Paired-end massively parallel sequencing was performed 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  169 
 
  
    EGAD00001004563 
   
  
    
    This dataset contains whole genome sequencing data from 21 primary and relapsed IDH-wt glioblastomas and matched blood controls. Tumors were sequenced at a target coverage of 150x, blood controls at 80x. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  63 
 
  
    EGAD00001004564 
   
  
    
    This dataset contains strand-specific RNA sequencing data from 16 primary/relapsed sample pairs of IDH-wt glioblastomas 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001004565 
   
  
    
    This dataset contains gene panel sequencing data from 43 sample pairs of primary and relapsed IDH-wt glioblastomas. The gene panel covers 50 glioma-associated genes. 14 of the sequenced sample pairs were sequenced with whole genome sequencing also and are accessible under EGAD00001004563. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  86 
 
  
    EGAD00001004566 
   
  
    
    WES files for Newman MAP3K8 melanoma paper titled "Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004567 
   
  
    
    RNASeq files for MAP3K8 melanoma paper titled "Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004568 
   
  
    
    We performed whole genome bisulfite sequencing of plasma DNA in 11 colorectal cancer patients to study the colonic DNA in plasma. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001004569 
   
  
    
    Exome sequencing of 317 rainforest hunter-gatherers (RHG) and neighbouring farmers (AGR) from Central Africa was performed based on the Nextera Rapid Capture Expanded Exome Kit (62-Mb content) with the Illumina HiSeq 2500. The population sample includes the Baka of south-eastern Cameroon and northern Gabon (wRHG), the Bantu-speaking Nzebi and Bapunu sedentary agriculturalists of Gabon (wAGR), and the BaTwa (eRHG) and BaKiga (eAGR) from Uganda. After QC filters, exomes of 300 unrelated individuals were obtained at high coverage (mean depth 68x), including 406,270 variants. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  317 
 
  
    EGAD00001004570 
   
  
    
    Repeated clinical malaria episodes are associated with modification of the immune system in children. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. 
This dataset contains all the data available for this study on 2019-01-17. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  113 
 
  
    EGAD00001004571 
   
  
    
    The BLUEPRINT project is a large-scale project investigating epigenetic mechanisms involved in blood formation, in health and disease. The human variation workpackage (WP10) of the project seeks to characterize the effect of common sequence variation on the epigenome status of a cell. To do this, the project will use highly purified blood cells to minimise "experimental noise" and therefore enhance the power to discover modest effects.  Two peripheral blood cell types, the CD14+CD16- monocyte (an important central orchestrator of adaptive immunity and a bridge between innate and adaptive immunity) and the CD65+CD9- neutrophilic granulocyte (the frontline cell for innate immunity) have been selected for this purpose.  The two types of cells will be obtained at high purity from adult blood (AB) of 200 healthy males and females, respectively.  Cells will be purified by using already validated and fully operational protocols that are based on density gradient centrifugation of the buffy coat obtained from whole blood, followed by magnetic bead-based purification using monoclonal antibodies against Cluster of Differentiation (CD) lineage-specific cell surface markers. This data set contains functional genomics data for gene expression and chromatin state. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  172 
 
  
    EGAD00001004572 
   
  
    
   
  
    
   
  29 
 
  
    EGAD00001004573 
   
  
    
    Patients with germline mutations in CYLD can develop hundreds of benign skin tumours called cylindromas. The development of multiple tumours within a single patient at sun-protected and sun-exposed sites, varying tumour histological patterns and grades of malignancy allow for the testing of several genetic hypotheses in relation to cutaneous carcinogenesis. By adopting the unprecedented approach of whole genome sequencing of multiple benign skin tumours within individuals in multigenerational families, we set out to study the impact of mutational diversity on models such as multistep carcinogenesis as well as non-sequential carcinogenesis. Using non-negative matrix factorisation (NMF) to discover mutational signatures, we found distinct mutational signatures in identical benign tumours (N=2 patients; n=11 tumours) at sun-exposed and sun-protected sites within a mother and her daughter. We found recurrent mutations in epigenetic modifying genes in CCS tumours which are known to have an oncogenic dependency on Wnt signalling. We also demonstrate that cutaneous tumours that metastasize to the lung carry a UV signature, supporting the origin from the skin. Distinct malignant tumours, such as BCC and malignant spiradenocarcinoma carried unique driver mutations. These findings add new dimensions to the existing paradigms of UV-induced skin cancer and highlight the utility of studying rare disease to gain novel insights into genetic mechanisms of tumour formation. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  13 
 
  
    EGAD00001004574 
   
  
    
    The dataset consists of sequenced cell free DNA (cfDNA) samples from colorectal cancer patients. The samples were sequenced on an Illumina MiSeq machine using a custom amplicon sequencing approach. These amplicons were designed to cover the most common mutation hotspots in colorectal cancer. The data include 138 cfDNA samples from 34 different patients. For each patient several samples are available derived from blood drawn at different time points during treatment. In addition the data include samples from 22 histology slides and 30 samples derived from HT29/HCT116 cell lines that were used as controls. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  189 
 
  
    EGAD00001004575 
   
  
    
    Whole genome sequencing data of original/SHANK2 modified/SHANK2 knockout. Note that the SHANK2 knockout sample is a different sample from 1_0441_003. Please refer to other paper for the data. 
    
   
  
    
   
  3 
 
  
    EGAD00001004576 
   
  
    
    ATRT whole exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  32 
 
  
    EGAD00001004577 
   
  
    
    Metabolic reprogramming is linked to cancer cell growth and proliferation, metastasis, and therapeutic resistance in a multitude of cancers. Targeting dysregulated metabolic pathways to overcome resistance, an urgent clinical need in all relapsed/refractory cancers, remains difficult. Through genomics analysis of clinical specimens, we show that metabolic reprogramming towards oxidative phosphorylation (OXPHOS) and glutaminolysis is associated with therapeutic resistance to the Bruton’s tyrosine kinase inhibitor ibrutinib in mantle cell lymphoma (MCL), an incurable B-cell lymphoma with poor clinical outcomes. Inhibition of OXPHOS with a novel, clinically applicable small molecule, IACS-010759, which targets complex I of the mitochondrial electron transport chain, results in significant growth inhibition in vitro and in vivo in ibrutinib-resistant patient-derived cancer models. This work suggests that targeting metabolic pathways to subvert therapeutic resistance is a clinically viable approach to treat highly refractory malignancies. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001004578 
   
  
    
    Chronic liver injury predisposes to cirrhosis and hepatocellular carcinoma, but how somatic mutations accumulate in liver disease is unexplored. We sequenced whole genomes of 400 microdissections of 100-500 hepatocytes from 5 normal and 6 cirrhotic livers. Compared to normal liver, cirrhotic liver had higher mutation burden, especially structural variants, including chromothripsis. Cirrhotic nodules were oligoclonal; sometimes entirely derived from a single, recent common ancestor. Clonal expansions millimeters in diameter occurred in cirrhosis in the absence of known driver mutations. Endogenous mutational processes predominated, although signatures of polycyclic aromatic hydrocarbon and aristolochic acid exposure occurred in some samples. Up to 10-fold within-patient variation in activity of exogenous signatures existed between adjacent cirrhotic nodules, with both clone-specific and microenvironmental forces shaping this heterogeneity. Synchronous hepatocellular carcinomas drew from the same repertoire of mutational signatures as background cirrhotic liver, but with higher burden. Somatic mutations chronicle the exposures, toxicity, regeneration and clonal structure of liver tissue as it progresses from health to disease. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  577 
 
  
    EGAD00001004579 
   
  
    
    WGS files for Newman MAP3K8 melanoma paper titled "Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004580 
   
  
    
    Whole Exome Sequencing on PDAC PDX1 parental sample and 12 clones. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001004581 
   
  
    
    This includes variant calls (single nucleotide variants and small insertions/deletions) from 8086 (mostly British Pakistani/British Bangladeshi) individuals from the following studies:
1. 3781 British Pakistani/British Bangladeshi adults from East London Genes and Health
2. 2791 British South Asian mothers from Born in Bradford
3. 1428 British South Asian adults from Birmingham
4. 86 individuals (mixed ancestries) from families with rare diseases, from Queen Mary University London
All of the Birmingham and most of the Born in Bradford samples were previously sequenced as part of PMID: 26940866. 
Mapping was done with bwa-mem and variant calling was carried out with GATK HaplotypeCaller. We removed variant sites for which the following was true:
SNPs: "QD < 2.0 || FS > 30 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0"
Indels: "QD < 2.0 || FS > 30 || ReadPosRankSum < -20.0" 
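
The two filter expressions above can be read as simple predicates over site-level annotations. The following is a minimal sketch only, not the submitters' pipeline: the function name `fails_site_filter` and the dictionary of annotations are hypothetical, and treating a missing annotation as "does not trigger the clause" is an assumption made for illustration.

```python
def fails_site_filter(ann, is_snp):
    """Return True if a site matches the removal expression quoted above.

    `ann` is a hypothetical dict of site-level annotations keyed by
    QD, FS, MQ, MQRankSum and ReadPosRankSum. A missing annotation is
    assumed (for this sketch) not to trigger its clause.
    """
    def below(key, threshold):
        value = ann.get(key)
        return value is not None and value < threshold

    def above(key, threshold):
        value = ann.get(key)
        return value is not None and value > threshold

    if is_snp:
        # SNPs: QD < 2.0 || FS > 30 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0
        return (below("QD", 2.0) or above("FS", 30) or below("MQ", 40.0)
                or below("MQRankSum", -12.5) or below("ReadPosRankSum", -8.0))
    # Indels: QD < 2.0 || FS > 30 || ReadPosRankSum < -20.0
    return below("QD", 2.0) or above("FS", 30) or below("ReadPosRankSum", -20.0)
```

Under these assumptions, a site with ReadPosRankSum of -10 would be removed if it is a SNP but kept if it is an indel, because the indel threshold is -20.0.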
    
   
  
    
   
  - 
 
  
    EGAD00001004582 
   
  
    
    This dataset contains DNA sequencing data from 95 colorectal cancer and matched-normal samples. The dataset contains targeted deep sequencing of selected regulatory elements in 95 cancer and matched-normal samples, and data for one sample that was additionally whole-genome sequenced (cancer and matched-normal). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  192 
 
  
    EGAD00001004583 
   
  
    
    Whole-genome-sequencing (WGS) of human tumours has revealed distinct mutation patterns that hint at the causative origins of cancer. We examined mutational-signatures in 324 WGS of human induced pluripotent stem cells (iPSCs) following exposure to known or suspected environmental carcinogens. 79 agents were tested with or without metabolic activation at concentrations that produced measurable cytotoxicity; 41 yielded characteristic substitution mutational signatures. Some exhibit similarity with signatures found in human tumours. Additionally, 6 agents produced double-substitution signatures and 8 produced indel signatures. Investigating mutation asymmetries across genome topography reveals fully functional mismatch and transcription-coupled repair pathways in iPSCs. Primary adducts induced by environmental carcinogens can be resolved by disparate repair/replicative pathways, resulting in an assortment of signature outcomes even for a single mutagen. This compendium of experimentally-induced mutational-signatures permits further exploration of roles of environmental agents in cancer aetiology, and underscores how human stem cell DNA is directly vulnerable to environmental agents. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  324 
 
  
    EGAD00001004584 
   
  
    
    RNA-seq of 24 M-CSF differentiated human peripheral monocyte-derived macrophages (MDMs) activated with short exposure (3 hours) to LPS, or long exposure (24 hours) to LPS, LPS with IFNγ, IFNγ, IL-4, IL-10, and dexamethasone. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001004585 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  40 
 
  
    EGAD00001004586 
   
  
    
    Variants and WGS data for Gardner et al. 2018 (biorxiv 471375). One VCF each for Alu, L1, and SVA. Flat text file and WGS for processed pseudogenes. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001004588 
   
  
    
    Whole genome sequencing data of ccRCCs were used for somatic variant calling. 
    
   
  
    
   
  82 
 
  
    EGAD00001004589 
   
  
    
    54 WGS Ewing's sarcoma samples sequenced at The Hospital for Sick Children, Toronto (Adam Shlien's lab) and published in Science (2018). Reference: Anderson et al., "Rearrangement bursts generate canonical gene fusions in bone and soft tissue tumors" 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001004590 
   
  
    
    DNA-seq from plasma of 14 liver transplantation patients 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  14 
 
  
    EGAD00001004591 
   
  
    
    TRACERx 100: RNAseq data from the first 100 TRACERx tumours (164 tumor regions from 64 patients) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  164 
 
  
    EGAD00001004592 
   
  
    
    Raw data used in the analysis of chromosomally integrated HHV6 genomes in parent-infant pairs, generated by means of full viral genome sequencing by SureSelect target enrichment, in the context of a larger study investigating the relationship between HHV6 and adverse pregnancy outcome. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  24 
 
  
    EGAD00001004593 
   
  
    
    Precision medicine trials in glioblastoma should be conducted at tumor recurrence. However, second surgery for recurrent GBMs is not routinely performed and therefore molecular data is predominantly derived from primary samples. This study aims to establish the frequency of driver changes at tumor recurrence. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  377 
 
  
    EGAD00001004594 
   
  
    
    The dataset includes multi-region exome sequencing (MSeq) of four resected treatment-naïve mismatch repair deficient gastro-esophageal cancers. Paired-end sequencing was performed on the Illumina HiSeq 2500 or NovaSeq 6000 with a target depth of 200X. Seven primary tumor regions along with tumor-adjacent non-malignant tissue were subjected to MSeq. An additional two lymph node metastases were also included from each of two cases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  35 
 
  
    EGAD00001004595 
   
  
    
    VALCAP files for Ma et al. (2019) Genome Biology (accepted) titled "Analysis of error profiles in deep next-generation sequencing data" 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  47 
 
  
    EGAD00001004596 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004597 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004598 
   
  
    
    Genome and transcriptome sequence data from a mucinous colloid carcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004599 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004600 
   
  
    
    Genome and transcriptome sequence data from a non-small cell adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004601 
   
  
    
    Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004602 
   
  
    
    Genome and transcriptome sequence data from a metastatic synovial sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004603 
   
  
    
    Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the cheek patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004604 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004605 
   
  
    
    Genome and transcriptome sequence data from a metastatic choroidal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004606 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      MinION 
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001004607 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004608 
   
  
    
    Genome and transcriptome sequence data from a clear cell carcinoma of the left ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004609 
   
  
    
    Genome and transcriptome sequence data from a low grade chondrosarcoma (bronchus) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004610 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004611 
   
  
    
    Genome and transcriptome sequence data from a follicular lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004612 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004613 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004614 
   
  
    
    Genome and transcriptome sequence data from a metastatic follicular thyroid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004615 
   
  
    
    Genome and transcriptome sequence data from a ganglioglioma of the left temporal lobe patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004616 
   
  
    
    Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the cervix patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004617 
   
  
    
    Genome and transcriptome sequence data from an anorectal gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004618 
   
  
    
    Genome and transcriptome sequence data from a metastatic MMMT of the endometrium patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004619 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the stomach patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004620 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004621 
   
  
    
    Genome and transcriptome sequence data from a metastatic basal cell carcinoma of frontal scalp patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004622 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectosigmoid cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001004623 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004624 
   
  
    
    Genome and transcriptome sequence data from a double-hit lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004625 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004626 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004627 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectosigmoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001004628 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004629 
   
  
    
    Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004630 
   
  
    
    Genome and transcriptome sequence data from a metastatic basal cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004631 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer ER+ patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004632 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004633 
   
  
    
    Genome and transcriptome sequence data from a metastatic renal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004634 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004635 
   
  
    
    Genome and transcriptome sequence data from an extramedullary spinal ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004636 
   
  
    
    Genome and transcriptome sequence data from a metastatic endometrioid/mucinous ovarian carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004637 
   
  
    
    Genome and transcriptome sequence data from a metastatic clear cell carcinoma of gynecological origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004638 
   
  
    
    Genome and transcriptome sequence data from a high grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004639 
   
  
    
    Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004640 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004641 
   
  
    
    Genome and transcriptome sequence data from a metastatic alveolar soft part sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004642 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004643 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004644 
   
  
    
    Genome and transcriptome sequence data from a metastatic lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004645 
   
  
    
    Genome and transcriptome sequence data from a metastatic fibrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004646 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001004647 
   
  
    
    Genome and transcriptome sequence data from a metastatic uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004648 
   
  
    
    Genome and transcriptome sequence data from a high grade sarcoma of the epithelioid/spindle cell patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004649 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004650 
   
  
    
    Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004651 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004652 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the stomach patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004653 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004654 
   
  
    
    Genome and transcriptome sequence data from a breast invasive ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004655 
   
  
    
    Genome and transcriptome sequence data from a locally advanced oropharyngeal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004656 
   
  
    
    Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004657 
   
  
    
    Genome and transcriptome sequence data from a metastatic ampullar carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004658 
   
  
    
    Genome and transcriptome sequence data from a metastatic osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004659 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004660 
   
  
    
    Genome and transcriptome sequence data from a metastatic primitive neuro-ectodermal tumor of the testicle patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004661 
   
  
    
    Genome and transcriptome sequence data from a metastatic clear cell carcinoma of the ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004662 
   
  
    
    Genome and transcriptome sequence data from a metastatic leiomyosarcoma of pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004663 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004664 
   
  
    
    Genome and transcriptome sequence data from a neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004665 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004666 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic neck adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004667 
   
  
    
    Genome and transcriptome sequence data from a metastatic uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004668 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the esophagus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004669 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004670 
   
  
    
    Genome and transcriptome sequence data from a metastatic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004671 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004672 
   
  
    
    Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004673 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004674 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004675 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004676 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004677 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004678 
   
  
    
    Genome and transcriptome sequence data from a metastatic thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004679 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004680 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004681 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004682 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004683 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004684 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004685 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004686 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004687 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004688 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004689 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004690 
   
  
    
    Genome and transcriptome sequence data from an angiosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004691 
   
  
    
    Genome and transcriptome sequence data from a metastatic transverse colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004692 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004693 
   
  
    
    Genome and transcriptome sequence data from a metastatic carcinoma to paraspinal mass with primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004694 
   
  
    
    Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004695 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      MinION 
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001004696 
   
  
    
    Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004697 
   
  
    
    Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004698 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004699 
   
  
    
    Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004700 
   
  
    
    Genome and transcriptome sequence data from a metastatic lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004701 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004702 
   
  
    
    Genome and transcriptome sequence data from a high grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004703 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004704 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004705 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004706 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma of the liver patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004707 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004708 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004709 
   
  
    
    Genome and transcriptome sequence data from a metastatic high-grade adenocarcinoma of the fallopian tubes patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004710 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004711 
   
  
    
    Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004713 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004714 
   
  
    
    Genome and transcriptome sequence data from a metastatic non-small cell lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004715 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004716 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004717 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004718 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004719 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1340 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1340 
 
  
    EGAD00001004720 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 2057 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2057 
 
  
    EGAD00001004721 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1970 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1970 
 
  
    EGAD00001004722 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 2091 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2091 
 
  
    EGAD00001004723 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1267 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1267 
 
  
    EGAD00001004724 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 230 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  230 
 
  
    EGAD00001004725 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 232 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  232 
 
  
    EGAD00001004726 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 239 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  239 
 
  
    EGAD00001004727 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 692 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  692 
 
  
    EGAD00001004728 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 612 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  612 
 
  
    EGAD00001004729 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1700 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1700 
 
  
    EGAD00001004730 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 628 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  628 
 
  
    EGAD00001004731 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 596 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  596 
 
  
    EGAD00001004732 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1735 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1735 
 
  
    EGAD00001004733 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 585 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  585 
 
  
    EGAD00001004734 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 766 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  766 
 
  
    EGAD00001004735 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 2055 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2055 
 
  
    EGAD00001004736 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 620 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  620 
 
  
    EGAD00001004737 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 624 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  624 
 
  
    EGAD00001004738 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 481 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  481 
 
  
    EGAD00001004739 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 378 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  378 
 
  
    EGAD00001004740 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 735 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  735 
 
  
    EGAD00001004741 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 718 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  718 
 
  
    EGAD00001004742 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 493 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  493 
 
  
    EGAD00001004743 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1222 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1222 
 
  
    EGAD00001004744 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 522 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  522 
 
  
    EGAD00001004745 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 488 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  488 
 
  
    EGAD00001004746 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 509 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  509 
 
  
    EGAD00001004747 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 604 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  604 
 
  
    EGAD00001004748 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 626 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  626 
 
  
    EGAD00001004749 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 635 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  635 
 
  
    EGAD00001004750 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1522 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  1522 
 
  
    EGAD00001004751 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 465 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  465 
 
  
    EGAD00001004752 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 606 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  606 
 
  
    EGAD00001004753 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 615 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  615 
 
  
    EGAD00001004754 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 636 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  636 
 
  
    EGAD00001004755 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 968 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  968 
 
  
    EGAD00001004756 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 480 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  480 
 
  
    EGAD00001004757 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 561 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  561 
 
  
    EGAD00001004758 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 844 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  844 
 
  
    EGAD00001004759 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 928 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  928 
 
  
    EGAD00001004760 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 635 samples 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  635 
 
  
    EGAD00001004761 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1072 samples 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  1072 
 
  
    EGAD00001004762 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1436 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  1436 
 
  
    EGAD00001004763 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 589 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  589 
 
  
    EGAD00001004764 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 656 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  656 
 
  
    EGAD00001004765 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 648 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  648 
 
  
    EGAD00001004766 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 375 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  375 
 
  
    EGAD00001004767 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 755 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  755 
 
  
    EGAD00001004768 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 492 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  492 
 
  
    EGAD00001004769 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 531 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  531 
 
  
    EGAD00001004770 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1222 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  1063 
 
  
    EGAD00001004771 
   
  
    
    Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 522 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2500 
      
    
   
  742 
 
  
    EGAD00001004772 
   
  
    
    This dataset contains all the .bam files used for the study. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1238 
 
  
    EGAD00001004773 
   
  
    
    10 bams of WGS data from HiSeqXTen platform; 10 bams of RNA-seq data from HiSeq2500 platform; 7 bams of TruSeq Methyl Capture EPIC sequencing data from HiSeq4000 platform 
    
   
  
    
   
  20 
 
  
    EGAD00001004774 
   
  
    
    We investigated the somatic genetic basis of Wilms’ tumour and found complex phylogenetic relations between tumours. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  203 
 
  
    EGAD00001004775 
   
  
    
    Data supporting: "Patient-specific detection of cancer genes reveals recurrently perturbed processes in esophageal adenocarcinoma." Mourikis et al.
WGS (BAM files)
521 samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004776 
   
  
    
    Data supporting: "Patient-specific detection of cancer genes reveals recurrently perturbed processes in esophageal adenocarcinoma." Mourikis et al.
RNAseq (BAM files)
137 samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001004777 
   
  
    
    RNA sequencing of peripheral immune cells from patients +/- an IBD risk variant. Peripheral immune cells +/- in vitro test compound treatment.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-02-15. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  71 
 
  
    EGAD00001004778 
   
  
    
    Raw reads from single-cell RNA-sequencing of peripheral blood of five TET2 mutation carriers as well as three non-carrier family members. Single cells were captured into 10x barcoded gel beads and RNA-sequencing library preparation was done using Chromium Single Cell 3' v2 chemistry (10x Genomics, Pleasanton, CA, USA). Sequencing was performed as recommended, with a read 2 length of 98 bp, on a HiSeq 4000 sequencer. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001004779 
   
  
    
    Raw reads from whole-genome bisulfite sequencing. Whole-genome bisulfite sequencing library preparations and Illumina sequencing of DNA samples from TET2 mutation carriers (Ly9, Ly11, Ly14, Id1) and their age-matched controls (Ly8, Ly10, Ly13, Id2, Id3) were done as a service at BGI (BGI Tech Solutions Co., Ltd., China). Bisulfite treatment was done with the EZ DNA Methylation-Gold Kit (Zymo Research, CA, USA) for 300-400 bp size-range fragments with methylated adapters at the 5' and 3' ends. Sequencing was done on the HiSeq X Ten platform using paired-end 150 base-pair read length. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001004780 
   
  
    
    Bam files from deep exome sequencing of blood DNA samples from five TET2 mutation carriers (Ly1, Ly2, Ly9, Ly11, Ly14) and three wild-type family members (Ly8, Ly10, Ly13) extracted at multiple time points. Library preparations were performed with SeqCap EZ Exome v3 (Roche, Switzerland) using six different index primers per sample for which paired-end Illumina sequencing was done with 75bp read length and HiSeq4000 sequencer. After alignment (bwa version 0.7.12), base recalibration (GATK 3.5), realignment around indels (GATK 3.5) and duplicate removal (MarkDuplicates; Picard Tools version 1.79), data from libraries with six different indexes were merged. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  15 
 
  
    EGAD00001004781 
   
  
    
    Raw reads from ChIP- [Anti-Histone H3 (acetyl K27) (Abcam, ab4729)] and input sequencing of EBV transformed lymphoblastoid cells from three carriers of TET2 mutation (Ly9, Ly11 and Ly14) and two wild-type (Ly8 and Ly10) family members using Illumina HiSeq Rapid paired-end 60 bp sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001004782 
   
  
    
    Bam file from exome sequencing of FFPE sample from Ly3 using SeqCap EZ Human Exome Library (Roche Nimblegen, Inc., WI, USA) and Illumina HiSeq2000 sequencer. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004783 
   
  
    
    Variant calls from whole-genome sequencing of Ly1-07 using Complete Genomics paired-end sequencing service. 
    
   
  
    
      
      Complete Genomics 
      
    
   
  1 
 
  
    EGAD00001004784 
   
  
    
    Raw reads from RNA-sequencing of monocyte-derived macrophages from three individuals with heterozygous TET2 loss (Ly9, Ly11, Ly14) and two wild-type controls (Ly8 and an unrelated control). Libraries were prepared using ScriptSeq RNA-Seq Library Preparation Kit and Illumina sequenced with paired-end 75bp reads. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001004785 
   
  
    
    Raw reads from targeted bisulfite sequencing. The SureSelect Methyl-Seq target enrichment system (Agilent Technologies, Inc., CA, USA) was used to prepare bisulfite sequencing libraries from blood DNA samples of lymphoma patients (Ly1, Ly2), healthy family members (Ly8, Ly9, Ly10, Ly11, Ly12, Ly13 and Ly14), baseline controls (Control1-5), DNMT3A mutation carriers (Id5, Id7, Id9, Id11) and their age-matched controls (Id6, Id8, Id10, Id12). In addition, blood DNA sample of a patient (HLRCC_N7) with germline fumarate hydratase (FH) mutation is included. Illumina paired-end sequencing for targeted libraries from Ly1, Ly2, Ly8, Ly9, Ly10 and Ly11 was done at Karolinska Institutet using 100 base-pair read length and the HiSeq2000 platform. Illumina paired-end sequencing for targeted libraries from Ly12, Ly13, Ly14, and DNMT3A mutation carriers and their age-matched controls was done as a service at BGI (BGI Tech Solutions Co., Ltd., China) using 126 base-pair read length and the HiSeq2500 platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  25 
 
  
    EGAD00001004786 
   
  
    
    Whole genome sequencing (WGS) detects all mutations in a cancer. “Mutational signatures” are patterns of mutations that report the DNA damage and subsequent DNA repair processes that have occurred in cancers. We present a patient with Xeroderma Pigmentosum who developed metastatic angiosarcoma, unresponsive to all lines of sarcoma therapy. Primary tumour WGS revealed a hypermutated tumour, including clonal ultraviolet light-induced mutational patterns (Signature 7) and subclonal signatures of activating mutations of DNA Polymerase-epsilon (POLE) (Signature 10). These signatures are associated with response to immune-checkpoint blockade. Immunohistochemistry confirmed high PD-L1 expression in metastatic deposits. The patient was commenced on anti-PD-L1 therapy and has responded. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001004787 
   
  
    
    The study includes NGS-based methylC-capture sequencing (MCC-Seq) on 199 visceral adipose tissue and 206 whole-blood DNA samples derived from obese individuals (BMI >40 kg/m²) in the IUCPQ cohort. We generated 100 bp paired-end reads using the Illumina HiSeq 2000 or 2500 systems. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  345 
 
  
    EGAD00001004788 
   
  
    
    This dataset contains whole-transcriptome sequencing data of 113 myeloproliferative neoplasms (MPN) patients and 15 controls. Patients were diagnosed with essential thrombocythemia, polycythemia vera, primary myelofibrosis, and secondary acute myeloid leukemia. The data were pooled from 5 different sequencing experiments as indicated using an Illumina HiSeq2000 machine. All samples were sequenced paired-end. Sequenced samples were processed with custom workflows for discovery of fusion genes, SNVs and Indels calling, and identification of splicing abnormalities. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  145 
 
  
    EGAD00001004790 
   
  
    
    This dataset contains the imputed genotypes for the Gencord samples.
Genotyping was done using Illumina OMNI2.5M.
Imputation was done using SHAPEIT2/IMPUTE2 with the 1000 Genomes Project phase 3 reference panel. 
    
   
  
    
   
  251 
 
  
    EGAD00001004791 
   
  
    
    The dataset contains RNA-seq data of 96 EOPC patients and 9 controls. For some patients multiple tissue samples were sequenced ("multi-area" samples). The RNA extraction and sequencing protocol was earlier described in Weischenfeldt et al, Cancer Cell, 2013. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004792 
   
  
    
    1. Ultra-deep exome sequencing data, Illumina paired-end reads, fastq, 190 samples
2. WES data, Illumina paired-end reads, fastq, 120 samples
3. RNA-seq data, Illumina paired-end reads, fastq, 20 samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  330 
 
  
    EGAD00001004793 
   
  
    
    This dataset includes the whole-genome sequencing data from a study entitled "Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma". Whole-genome sequencing libraries were generated by PCR-free methods, and sequencing was performed on HiSeq X Ten machines. The BAM files provided in this dataset are PCR duplicate-marked, indel-realigned, and base-recalibrated. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  98 
 
  
    EGAD00001004794 
   
  
    
    This dataset includes the RNA-seq data from a study entitled "Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma". PolyA tails were captured by Oligo-dT beads, and sequencing was performed on HiSeq 2500 machines. Paired-end FASTQ files are provided in this dataset. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001004795 
   
  
    
    Subcutaneous panniculitis-like T-cell lymphoma (SPTCL) is a rare subtype of peripheral T-cell lymphoma affecting younger patients and associated with hemophagocytic lymphohistiocytosis. To clarify the molecular pathogenesis of SPTCL, we analyzed paired tumor and germline DNAs from 13 patients by whole exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001004796 
   
  
    
    The TARGET Study consists of 200 BAM files, for the 100 patients discussed in the publication. Each patient has a normal control BAM file and a ctDNA BAM file. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  200 
 
  
    EGAD00001004797 
   
  
    
    Twenty-seven Tibetan samples from China were whole-genome sequenced to investigate high-altitude adaptation, population genetics and demographic history. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  43 
 
  
    EGAD00001004798 
   
  
    
    TRACERx 100: RRBS data from a subset of the first 100 TRACERx tumours 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  98 
 
  
    EGAD00001004799 
   
  
    
    WES data from 15 samples and RNA-seq data from 19 samples. All data are raw paired-end reads in fastq format, sequenced on an Illumina platform. 
    
   
  
    
      
      unspecified 
      
    
   
  34 
 
  
    EGAD00001004800 
   
  
    
    Stage-1 meta-analysis with GC correction 
    
   
  
    
   
  5 
 
  
    EGAD00001004802 
   
  
    
    ETMR RIPSeq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001004803 
   
  
    
    RNA was prepared using the Illumina TruSeq RNA sample preparation kit for poly-adenylated mRNA, with an average of 97.64 million reads per sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001004805 
   
  
    
    ATAC-seq library amplification was performed on 5 ETMR tumour samples using the NEBNext High Fidelity 2x PCR Master Mix (New England Biolabs, Cat#M0541S) according to the manufacturer’s protocol. ATAC-seq libraries were sequenced using single-end 50 bp reads on the Illumina HiSeq 2000 platform. ATAC-seq peak analysis was conducted as published previously (Torchia et al. 2016). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001004806 
   
  
    
    Exome sequencing data for two sibs with juvenile idiopathic arthritis and one unaffected sib 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  3 
 
  
    EGAD00001004808 
   
  
    
    Exome sequencing performed on DNA from 94 anterior ischemic stroke cases 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  94 
 
  
    EGAD00001004809 
   
  
    
    H3K27Ac ChIP-seq DNA libraries for 5 ETMR samples were prepared using the NEBNext ChIP-seq Illumina Sequencing library preparation kit. ChIP-seq libraries were sequenced using single-end 50 bp reads on the Illumina HiSeq 2000 platform. ChIP-seq peak analysis was conducted as published previously (Torchia et al. 2016). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001004810 
   
  
    
    This dataset is related to the publications Costa et al. Cancer Cell 2018 and Givel et al. Nat. Commun. 2018, which describe the identification of 4 Cancer-Associated Fibroblast (CAF) subsets in breast and ovarian cancer. This dataset contains transcriptomic profiles obtained by RNA-Seq of 34 CAF-S3 samples from breast and ovarian tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001004811 
   
  
    
    Whole-genome sequencing of 168 GCs identifies hot-spot tandem duplications 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001004812 
   
  
    
    40 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  40 
 
  
    EGAD00001004813 
   
  
    
    48 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004814 
   
  
    
    50 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001004815 
   
  
    
    45 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD00001004816 
   
  
    
    61 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  61 
 
  
    EGAD00001004817 
   
  
    
    68 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  68 
 
  
    EGAD00001004818 
   
  
    
    71 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  71 
 
  
    EGAD00001004819 
   
  
    
    48 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001004820 
   
  
    
    52 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  52 
 
  
    EGAD00001004821 
   
  
    
    60 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001004822 
   
  
    
    55 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  55 
 
  
    EGAD00001004823 
   
  
    
    This dataset contains 26 mapped bam files. The samples were generated with 3 different protocols for deriving pancreatic progenitors from hPSC. Three parallel differentiations were performed, all done in a hPSC NKX6.1-GFP reporter line. For each protocol there are three cellular populations: total (presort), GFP+ and GFP-. In summary: 3 differentiations x 3 protocols x 3 cellular populations. We prepared Smart-Seq2 RNA-seq libraries for the 27 samples; 1 sample failed library preparation and is therefore not included in this dataset. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001004824 
   
  
    
    This dataset contains 27 mapped bam files. The samples were generated with 3 different protocols for deriving pancreatic progenitors from hPSC. Three parallel differentiations were performed, all done in a hPSC NKX6.1-GFP reporter line. For each protocol there are three cellular populations: total (presort), GFP+ and GFP-. In summary: 3 differentiations x 3 protocols x 3 cellular populations. We prepared ATAC-seq libraries for the 27 samples, and sequenced them on Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  27 
 
  
    EGAD00001004825 
   
  
    
    Case series of the rare tumor entity chordoma. 9 cases sequenced with Whole Exome Sequencing (WES) and 2 cases sequenced with Whole Genome Sequencing (WGS) were recruited from the personalized oncology program NCT-MASTER/DKTK-MASTER at the German Cancer Research Center. One of the WES patients was re-sequenced by WGS at a later time point, when he relapsed. There are therefore 11 patients, one of whom has two samples; all samples were sequenced with matched normal controls, for a total of 24 NGS samples. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001004826 
   
  
    
    This dataset consists of Illumina HiSeq 2000 Whole Exome Sequencing of 84 colorectal samples: 42 Tumor tissue samples and 42 Normal tissue samples (adjacent to tumor sites). 2x75bp paired-end sequencing reads: 2 fastq files per sample 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  84 
 
  
    EGAD00001004827 
   
  
    
    This dataset consists of SOLiD small RNA-seq of 250 colorectal samples: 100 tumor tissue samples, 100 normal tissue samples (adjacent to tumor sites) and 50 matched control samples of healthy individuals. CSfasta and qual files were converted to single fastq files prior to uploading. 
    
   
  
    
      
      AB SOLiD System 
      
    
   
  250 
 
  
    EGAD00001004828 
   
  
    
    This dataset maps gene expression regulation in human primary regulatory CD4+ T cells (Tregs). It includes whole genome sequence data for ChM-seq (118 H3K4me3, 118 H3K27ac and 6 inputs). The final quality filtered set included 91 individuals with H3K27ac ChM-seq and 88 with H3K4me3 ChM-seq. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  242 
 
  
    EGAD00001004829 
   
  
    
    This dataset includes whole genome sequence data for ATAC-seq (42 samples) of human stimulated and cultured CD4+ Treg cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  49 
 
  
    EGAD00001004830 
   
  
    
    This dataset maps gene expression regulation in human primary regulatory CD4+ T cells (Tregs). It includes whole transcriptome data for 141 samples. The final quality filtered set included 123 individuals with RNA-seq data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  141 
 
  
    EGAD00001004831 
   
  
    
    We isolated T cells and monocytes from healthy platelet donors and cultured them in resting and stimulated conditions with addition of a range of cytokines. We performed ATAC sequencing to assess the chromatin accessibility in cells treated with different cytokines. These cellular profiles were used to map risk variants to the cytokine-induced cell states relevant for autoimmune diseases. 
This dataset contains all the data available for this study on 2019-03-11. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  183 
 
  
    EGAD00001004832 
   
  
    
    The dataset includes whole genome sequencing (WGS) data on ten matched esophageal tumor-normal pairs. WGS was performed by the cancer sequencing service of CG with an average read coverage of approximately 50-fold. An Illumina HiSeq 2000 instrument was used to perform the sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001004833 
   
  
    
    Dataset consisting of three WGS BAM files, representing the two dual lung metastases with matched germline control, for a 37-year-old female patient with primary adrenocortical carcinoma. Work conducted at Garvan Institute of Medical Research, Sydney, Australia. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001004834 
   
  
    
    This dataset includes cram files from 3,001 samples. These cram files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, these cram files also contain reads that are aligned to any of 29 pre-determined off target locations where the aligners are known to mis-align reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  3001 
 
  
    EGAD00001004836 
   
  
    
    ATRT ChIPSeq H3K27ac 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001004837 
   
  
    
    ATRT RNASeq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004838 
   
  
    
    We have sequenced four samples to identify variants in the KMT2A gene. Three samples were sequenced by WGS and one by WES (Patient2). 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001004839 
   
  
    
    PAGE Dataset Apr 2018 (Ref: PAGE Lancet 2019) 
    
   
  
    
   
  1812 
 
  
    EGAD00001004840 
   
  
    
    PAGE Dataset Apr 2018 (Ref: PAGE Lancet 2019) 
    
   
  
    
   
  1813 
 
  
    EGAD00001004841 
   
  
    
    PAGE Dataset Apr 2018 (Ref: PAGE Lancet 2019) 
    
   
  
    
   
  610 
 
  
    EGAD00001004842 
   
  
    
    PAGE2 Dataset Nov 2017 (Ref: PAGE2 GIM 2018) 
    
   
  
    
   
  81 
 
  
    EGAD00001004843 
   
  
    
    PAGE2 Dataset Nov 2017 (Ref: PAGE2 GIM 2018) 
    
   
  
    
   
  81 
 
  
    EGAD00001004844 
   
  
    
    PAGE2 Dataset Nov 2017 (Ref: PAGE2 GIM 2018) 
    
   
  
    
   
  27 
 
  
    EGAD00001004845 
   
  
    
    Massively-parallel DNA sequencing of 113 advanced thyroid cancers and Massively-parallel RNA sequencing of 25 advanced thyroid cancers 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  187 
 
  
    EGAD00001004846 
   
  
    
    Series of 56 paired presentation, relapse and control samples from newly diagnosed, uniformly treated myeloma patients. Depth of treatment response and maintenance allocation (active observation or lenalidomide) was determined for all. All samples underwent whole exome sequencing with additional baits to cover the MYC and immunoglobulin loci. There are 168 (56 presentation, 56 relapse and 56 control) samples in this study. 131 are available as part of this dataset. The remaining 37 are available with dataset accession id EGAD00001001358. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  131 
 
  
    EGAD00001004847 
   
  
    
    ATRT ATACSeq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001004848 
   
  
    
    Colorectal cancer panel sequencing of 100 adenoma and carcinoma samples. Cancer Hot Spot panel sequencing of 7 samples from one poly. 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  100 
 
  
    EGAD00001004849 
   
  
    
    Shotgun sequencing data from subjects at baseline and after 4 weeks of daily doses of low, medium or high CFU Eubacterium hallii L2-7. The strain contained in the drink "Ela" was also sequenced. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  53 
 
  
    EGAD00001004850 
   
  
    
    The dataset contains RNAseq data from 5 subsets of NK cells isolated from human lung: 
1) CD69+CD49a+CD103+CD16-CD56bright NK cells
2) CD69+CD49a+CD103-CD16-CD56bright NK cells
3) CD69+CD49a-CD103-CD16-CD56bright NK cells
4) CD69-CD49a-CD103-CD16-CD56bright NK cells
5) CD56dimCD16+NKG2A+CD57- NK cells
The dataset contains paired data for the subsets from 2 donors with 2 biological replicates/donor and subset. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  19 
 
  
    EGAD00001004851 
   
  
    
    Sequencing data from primary mucinous ovarian carcinomas, benign and borderline mucinous tumours and extra-ovarian mucinous metastases. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  130 
 
  
    EGAD00001004852 
   
  
    
    We isolated T cells and monocytes from healthy platelet donors and cultured them in resting and stimulated conditions with addition of a range of cytokines. We performed K27Ac ChM sequencing to assess the chromatin activity in cells treated with different cytokines. These cellular profiles were used to map risk variants to the cytokine-induced cell states relevant for autoimmune diseases. 
This dataset contains all the data available for this study on 2019-03-19. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  192 
 
  
    EGAD00001004853 
   
  
    
    Genentech gallbladder cancer study - exome 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  392 
 
  
    EGAD00001004854 
   
  
    
    Genentech gallbladder cancer study - RNA-seq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  120 
 
  
    EGAD00001004855 
   
  
    
    Genentech gallbladder cancer study - whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  361 
 
  
    EGAD00001004856 
   
  
    
    Paired end Illumina whole exome sequencing of 9 GBM trios (blood, primary and recurrent tumour). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001004857 
   
  
    
    Paired-end whole exome sequencing of 50 TNBC breast cancer metastasis samples and matched normal samples obtained from 50 unique patients assayed at study baseline (directly after patient randomization). The included raw sequencing data (fastq-format) were generated using Illumina HiSeq2500 instruments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  65 
 
  
    EGAD00001004858 
   
  
    
    Transcriptome sequencing of 97 matched TNBC breast cancer metastasis samples obtained from 50 unique patients assayed at two timepoints: at baseline (directly after patient randomization), and post-induction treatment or control waiting period. The included raw transcriptome sequencing data (fastq-format) were generated using Illumina HiSeq2500 instruments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  97 
 
  
    EGAD00001004859 
   
  
    
    Set of 133 bam files from patients affected with Lupus. BAM alignments for exonic variants present in 76 Lupus-related genes. VCF file describing the variants. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  133 
 
  
    EGAD00001004860 
   
  
    
    Placental biopsies were collected within 30 minutes of birth and flash frozen in RNAlater (ThermoFisher). For each biopsy, total placental RNA was extracted from approximately 5 mg of tissue using the “mirVana miRNA Isolation Kit” (Ambion) followed by DNase treatment (“DNA-free DNA Removal Kit”, Ambion). RNA quality was assessed with the Agilent Bioanalyzer and all the samples with RIN values ≥ 7.0 were used in the downstream experiments. Total RNA-libraries were prepared from 300-500 ng of total placental RNA with the “TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat” (Illumina). Small RNA-libraries were prepared from 150 ng of total placental RNA with the “NEBNext Multiplex Small RNA Library Prep Kit for Illumina” (New England Biolabs) and concentrated using the “QIAquick PCR purification kit” (Qiagen). Paired libraries were combined and size selected using the Pippin Prep and 3% Agarose Gel Cassette with marker F (Sage Science), pooled and sequenced (single-end, 50 bp) using a Single End V4 cluster kit and HiSeq 4000 instrument. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  288 
 
  
    EGAD00001004861 
   
  
    
    Somatic mutation frequencies in patients with therapy-related myeloid neoplasms (129 patients, 181 samples including bone marrow, mesenchymal stromal cells and hair DNA) and primary myelodysplastic syndrome (108 patients, 215 samples including bone marrow, mesenchymal stromal cells and hair DNA) are assessed by deep sequencing of selected genes, using a Fluidigm Access Array, a Nimblegen capture panel and an Ion AmpliSeq panel. The dataset consists of paired fastq files obtained by either HiSeq (2x101 bp) or NextSeq (2x150 bp) Illumina sequencing. The mutational burden is found to be similar in both groups; however, the distribution of variants is different. Correlation of the mutational spectrum with prognosis is also observed. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  396 
 
  
    EGAD00001004862 
   
  
    
    Glioblastoma multiforme (GBM) is clinically highly aggressive as a result of evolutionary dynamics induced by cross-talk between cancer cells and a heterogeneous group of immune cells in the tumor microenvironment. The brain harbors limited numbers of immune cells with few lymphocytes and macrophages; thus, innate‐like lymphocytes, such as γδ T cells, have important roles in antitumor immunity. Here, we characterized GBM‐infiltrating γδ T cells, which may have roles in regulating the GBM tumor microenvironment and cancer cell gene expression. V(D)J repertoires of tumor‐infiltrating and blood‐circulating γδ T cells from four patients were analyzed by next-generation sequencing-based T-cell receptor (TCR) sequencing in addition to mutation and immune profiles in four GBM cases. In all tumor tissues, abundant innate and effector/memory lymphocytes were detected, accompanied by large numbers of tumor‐associated macrophages and closely located tumor‐infiltrating γδ T cells, which appear to have anti-tumor activity. The immune-related gene expression analysis using the TCGA database showed that the signature gene expression level of γδ T cells was more associated with those of cytotoxic T and Th1 cells and M1 macrophages than those of Th2 cells and M2 macrophages. Although the most abundant γδ T cells were Vγ9Vδ2 T cells in both tumor tissues and blood, the repertoire of intratumoral Vγ9Vδ2 T cells was distinct from that of peripheral blood Vγ9Vδ2 T cells and was dominated by Vγ9Jγ2 sequences, not by canonical Vγ9JγP sequences that are most commonly found in blood γδ T cells. Collectively, unique GBM‐specific TCR clonotypes were identified by comparing TCR repertoires of peripheral blood and intra‐tumoral γδ T cells. These findings will be helpful for the elucidation of tumor-specific antigens and development of anticancer immunotherapies using tumor-infiltrating γδ T cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004863 
   
  
    
    This dataset contains Whole Genome Sequencing, RNA-sequencing and ATAC-sequencing data obtained from PBMCs derived from blood samples of one patient with complex genomic rearrangements and the biological parents. The patient has multiple congenital anomalies and delayed development. Data access is closed. 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001004864 
   
  
    
    This dataset contains Whole Genome Sequencing and, if available, RNA-sequencing and/or ATAC-sequencing data obtained from PBMCs derived from blood samples of two patients with intellectual disability and/or multiple congenital anomalies and eight parents included in the University Medical Center Utrecht (The Netherlands). Data access is closed. 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD00001004865 
   
  
    
    This dataset contains Whole Genome Sequencing and, if available, RNA-sequencing and/or ATAC-sequencing data obtained from PBMCs derived from blood samples of 34 patients with intellectual disability and/or multiple congenital anomalies and their biological parents (58) included in the University Medical Center Utrecht (The Netherlands). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  15 
 
  
    EGAD00001004866 
   
  
    
    This dataset contains Whole Genome Sequencing and, if available, RNA-sequencing and/or ATAC-sequencing data of 17 lymphoblastoid cell lines derived from patients with complex genomic rearrangements. Patients have phenotypes in the category of intellectual disability and/or multiple congenital anomalies. Data access is closed. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  15 
 
  
    EGAD00001004867 
   
  
    
    This dataset contains all the data available for this study on 2019-03-26. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  60 
 
  
    EGAD00001004869 
   
  
    
    Illumina platform sequencing data for matched tumour-normal DNA samples from 77 melanoma patients participating in a study investigating response to immunotherapy. Selected cases also have RNA sequencing of the tumour. 
    
   
  
    
   
  - 
 
  
    EGAD00001004871 
   
  
    
    ChIP-seq data for Lymphoblastoid Cell Lines (LCL) and Fibroblasts (FIB) from the Gencord Cohort:
- 160 LCLs assayed for H3K27ac, H3K4me1 and H3K4me3,
- 78 FIB assayed for H3K4me3 and 79 FIB assayed for H3K27ac and H3K4me1
This dataset was generated as part of the following study:
Delaneau et al (2019). Chromatin 3D interactions mediate genetic effects on gene expression. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  239 
 
  
    EGAD00001004872 
   
  
    
    RNA-seq data for 168 Lymphoblastoid Cell Lines (LCL) and 78 Fibroblasts (FIB) from the Gencord Cohort.
This dataset was generated as part of the following study:
Delaneau et al (2019). Chromatin 3D interactions mediate genetic effects on gene expression. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  246 
 
  
    EGAD00001004873 
   
  
    
    Whole exome sequencing BAM files of 50 metastatic solid tumour and matched blood germline DNA prior to pembrolizumab treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  100 
 
  
    EGAD00001004874 
   
  
    
    This dataset consists of amplicon sequencing of fibrocystic breast tissues, subsequent cancer tissue and germline control of 17 patients. The target genes include all exons of 27 protein-coding genes and 2 non-coding genes, as well as mutation hotspots in three cancer genes, frequently mutated in breast cancer. Ion AmpliSeq libraries were generated and sequenced on the Ion S5 XL system. 
    
   
  
    
      
      Ion Torrent S5 XL 
      
    
   
  51 
 
  
    EGAD00001004875 
   
  
    
    Aligned RNA-seq sequences in this dataset are from the Proteogenomic Landscape of Curable Prostate Cancer study 
    
   
  
    
   
  56 
 
  
    EGAD00001004876 
   
  
    
    In this project we have sequenced the exome of skin moles (melanocytic naevi) and also normal skin from young and old people. We are interested in looking at the clonality of these lesions and the burden of UV mutations. 
This dataset contains all the data available for this study on 2019-04-01. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001004877 
   
  
    
    Targeted analysis of chondrosarcoma cancer genes. 
This dataset contains all the data available for this study on 2019-04-01. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  445 
 
  
    EGAD00001004878 
   
  
    
    R&D project to develop low input library construction methods. 
This dataset contains all the data available for this study on 2019-04-01. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001004879 
   
  
    
    Evolution of the cancer epigenome in myeloproliferative neoplasms. 
This dataset contains all the data available for this study on 2019-04-01. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  17 
 
  
    EGAD00001004880 
   
  
    
    We will sequence at 15X coverage the genomes of 1536 IBD patients. These samples are currently onsite at Sanger and made available for sequencing via our collaboration with the UK IBD Genetics consortium. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-04-01. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3124 
 
  
    EGAD00001004881 
   
  
    
    A genome-wide CRISPR/Cas9 library screen was performed in isogenic cell lines. Three biological replicates were used. Cells were harvested at initiation of the screen and at specific time points following cell culture. Using this approach we aim to identify genes specifically important in cell survival in engineered cell lines. Please perform 72, instead of 36, PCR reactions as the screen is performed at 200x coverage. 
This dataset contains all the data available for this study on 2019-04-01. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004882 
   
  
    
    This dataset comprises over 850 individuals recruited in Uttar Pradesh, India, including cases of rheumatic heart disease based on echocardiographic diagnosis and controls recruited on the basis of normal echocardiograms. For this analysis all available samples were genotyped using the Illumina HumanCore-24 BeadChip platform. 
    
   
  
    
   
  940 
 
  
    EGAD00001004884 
   
  
    
    The data consists of 47 exome-sequenced synchronous colorectal cancers from 23 patients. The exomes of corresponding normal samples were used to remove germline variants. All patients are Finnish (white Caucasian). All except one patient (sync_11 who belongs to a LS family) were assumed sporadic. The sequence data was produced with Illumina HiSeq 4000. 
    
   
  
    
   
  47 
 
  
    EGAD00001004885 
   
  
    
    Whole exome sequencing of human and mouse sarcoma samples for creation of personalized therapy options. Tissues were sequenced directly; no interventions or alterations were made to the tissue samples 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001004886 
   
  
    
    Whole Genome Sequencing of Normal Singaporean Volunteers 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  175 
 
  
    EGAD00001004887 
   
  
    
    We performed multiregion whole exome sequencing of a total of 37 samples from five consecutive patients (normal tissue, n=5; primary tumors, n=16; tumor thrombi, n=16) to >35-fold target coverage. Matching primary tumor and venous tumor thrombus samples were analyzed. Four patients had a clear cell RCC, one patient had a poorly differentiated type II papillary RCC (RCC-VTT-04). The latter patient had a friable thrombus, the others were of solid consistency. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  37 
 
  
    EGAD00001004888 
   
  
    
    We will use targeted exome sequencing to examine normal appearing epithelium and whole exome and whole genome sequencing of microdissected clones identified by immunostaining.
Some of the samples will be of low DNA concentration and therefore may require extra rounds of amplification during library prep. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001004889 
   
  
    
    Mutational signatures have been shown to be attributable to specific genetic contexts, such as mutations in DNA repair genes. DNMT3A is a DNA methyltransferase that helps maintain the DNA methylation pattern in a site-specific manner and may participate in DNA repair or the stress response. We have identified an adult individual who is a germline mosaic for a DNMT3A mutation. We have obtained clonal lymphoblastoid cells (LCLs) from the subject representing both WT and mutant lines grown in the same individual for >50 years. These clones represent a unique opportunity to examine the mutational impact of the DNMT3A mutation in a well-controlled setting. Our goal is to perform WGS on whole blood, representing the pool, as well as several WT and several mutant clones, in order to investigate the contribution of DNMT3A to mutation rates and signatures. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001004890 
   
  
    
    The aim of this study is to investigate the somatic mutations in twins with BRCA1/2 negative breast cancer with no strong family history. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001004891 
   
  
    
    Drug-resistant populations of PC9 (human non-small cell lung cancer) or A375 (human melanoma) cell lines were used for this study. By exome sequencing, we will analyse mutations of cells in the drug-tolerant state and after a drug holiday. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001004892 
   
  
    
    High-coverage whole genome sequences using HiSeq X for 4 individuals to investigate their Y chromosomes' relationship to the known phylogeny. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001004893 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. 
Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (Illumina HiSeqX, 40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. 
Through an enhanced understanding of cancer aetiology, the Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  36 
 
  
    EGAD00001004894 
   
  
    
    In this study, samples from a window study of the PARP inhibitor rucaparib in patients with primary triple negative or BRCA1/2 related breast cancer (RIO trial) will be investigated. Samples will undergo whole genome sequencing and analysis, including use of HR Predict. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  60 
 
  
    EGAD00001004895 
   
  
    
    Recent advances in genomics have demonstrated that clonal haemopoiesis driven by leukaemia associated somatic mutations is a relatively common phenomenon that increases in frequency with advancing age. Whilst individuals with clonal haemopoiesis have an increased risk of developing haematological malignancies, they also have an increased mortality from other causes. Additionally, certain mutations are almost exclusively seen in individuals aged 70 years or older, whilst others are seen in individuals with non-haematological cancers including breast and ovarian. Recently, clonal haemopoiesis was found to be associated with a significantly increased risk of atherosclerotic cardiovascular disease. This association is thought to be causative with clonally-derived macrophages showing elevated expression of several chemokine and cytokine genes that contribute to atherosclerosis. 
Another vascular pathology, abdominal aortic aneurysm (AAA), increases with age and shares risk factors with atherosclerosis (including smoking, male sex, high cholesterol). However, the impact of these risk factors and the overlap between AAA and atherosclerosis are poorly understood. To investigate a possible link between clonal haemopoiesis and AAA, we will study DNA samples from 300 patients with AAA and up to 200 controls for evidence of clonal haemopoiesis. This will be done using target DNA enrichment with biotinylated RNA baits followed by high throughput sequencing. 
This dataset contains all the data available for this study on 2019-04-03. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  472 
 
  
    EGAD00001004896 
   
  
    
    In order to reconstruct the evolutionary history of metastatic colorectal cancer, we performed whole-exome sequencing of 12 metastatic colorectal cancer patients for whom the primary tumor and matched distant metastases to the brain (n=10) and liver (n=2) were available. For 8 of the 12 patients, multiple regions (n=3-7) of the primary tumor and distant metastases were sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  163 
 
  
    EGAD00001004897 
   
  
    
    Genome and transcriptome sequence data from a locally advanced breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004898 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004899 
   
  
    
    Genome and transcriptome sequence data from an invasive ductal carcinoma of right breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004900 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004901 
   
  
    
    Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004902 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004903 
   
  
    
    Genome and transcriptome sequence data from a squamous cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004904 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001004905 
   
  
    
    Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004906 
   
  
    
    Genome and transcriptome sequence data from a diffuse large B-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004907 
   
  
    
    Genome and transcriptome sequence data from a T-cell prolymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004908 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004909 
   
  
    
    Genome and transcriptome sequence data from a metastatic serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004910 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004911 
   
  
    
    Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004912 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004913 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004914 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004915 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004916 
   
  
    
    Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004917 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004918 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004919 
   
  
    
    Genome and transcriptome sequence data from an abdominal sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004920 
   
  
    
    Genome and transcriptome sequence data from a meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004921 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma of the breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004922 
   
  
    
    Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004923 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001004924 
   
  
    
    Genome and transcriptome sequence data from a metastatic neuroendocrine tumor, lung primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004925 
   
  
    
    Genome and transcriptome sequence data from a metastatic choroidal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004926 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004927 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004928 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004929 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004930 
   
  
    
    Genome and transcriptome sequence data from a metastatic ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004931 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004932 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004933 
   
  
    
    Genome and transcriptome sequence data from a metastatic sarcomatoid carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001004934 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma of the alveolar ridge patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001004935 
   
  
    
    Genome and transcriptome sequence data from an oligometastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001004936 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001004937 
   
  
    
    This dataset includes RNA-sequencing (RNA-seq) data for 10 samples (9 primary tumors and 1 cell line) from adult T-cell leukemia/lymphoma (ATL). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001004938 
   
  
    
    Hepatocellular carcinoma (HCC) is a heterogeneous aggressive malignancy with low efficacy of current therapies at advanced stages. We integrated molecular and pharmacological profiling of a large panel of liver cancer cell lines (LCCL) to assess their clinical relevance as HCC preclinical models and identify new effective therapies and biomarkers of response. Here, we performed multi-omic analysis including whole-exome, RNA and microRNA sequencing in a series of 34 LCCL. Molecular profiles of LCCL and primary HCC were compared and we searched for molecular features associated with drug response. Our panel of LCCL faithfully recapitulated the most aggressive molecular “proliferation class” of HCC. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  34 
 
  
    EGAD00001004939 
   
  
    
    Sequencing data from patients with Ovarian cancer. Data utilised in the 'Enhanced detection of circulating tumor DNA by fragment size analysis' manuscript (Mouliere et al, 2018) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  118 
 
  
    EGAD00001004940 
   
  
    
    This dataset comprises 2570 whole genome sequenced samples from the Medical Genome Reference Bank.
https://sgc.garvan.org.au/initiatives/mgrb
The files are provided in CRAM format, aligned to hs37d5 with decoys, with no further processing applied.
The dataset also contains phenotype information for each sample. 
    
   
  
    
   
  2570 
 
  
    EGAD00001004941 
   
  
    
    Recent work in the Campbell group has revealed somatic mutations present in normal, non-cancerous human skin. A subset of the mutations conferred selective advantages to the host cells, leading to clonal expansions and raising the risk for future cancer development. Capturing such somatic mutations in normal tissue is important to advance our understanding about carcinogenesis and could provide prospective medical insights. 
In this project, our goal is to detect somatic mutations in normal (pre-cancerous) liver tissue. Using laser microdissection technology, we will dissect individual liver lobules from patient samples and submit these to sequencing. For each patient sample, we aim to sequence multiple lobules to characterise the mutagenic burden. Samples will be taken from patients with different liver disease aetiologies, including alcoholism and obesity, with a view to distinguishing the prevalent mutation types occurring in each disease context.
We will perform targeted sequencing, initially using the WTSI cancer panel. Later we aim to use a novel bait set that captures both cancer genes and genes relevant to the non-cancerous samples (i.e. genes implicated in hereditary disorders, immune sequences). 
This dataset contains all the data available for this study on 2019-04-08. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  63 
 
  
    EGAD00001004942 
   
  
    
    Gastroschisis (MIM 230750) is a herniation of the intestines through a defect of the abdominal wall lateral to the umbilicus (usually on the right side), and it is not covered by a membrane [Ledbetter, 2012]. Gastroschisis is a congenital anomaly with increasing incidence, easy prenatal diagnosis and extremely variable postnatal outcomes. On the basis of clinical manifestations, epidemiologic characteristics, and the presence and type of additional malformations, gastroschisis could be considered a heterogeneous condition with no causative gene(s) discovered yet.
This congenital anomaly affects approximately 1-3 infants per 10,000 live births [Calzolari et al., 1995; Parker et al., 2010]. Current knowledge about causative mutations/variants: to date, no single gene has been linked to gastroschisis. Some publications have tried to link this malformation to variants in genes such as the AEBP1 (adipocyte enhancer binding protein) gene [Feldkamp et al., 2012] or to the VEGF-NOS3 pathway [Lammer et al., 2008].
Previously, a Scribble mutant mouse model (circletail) was reported to exhibit gastroschisis, however recent studies demonstrated that the Scribble knockout fetus exhibits exomphalos phenotype of gastroschisis [Carnagham et al., 2013]. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-04-08. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004943 
   
  
    
    Organoids are self-organizing 3D structures grown from stem cells that recapitulate essential aspects of organ structure and function. Here we describe a method to establish long-term culture conditions of human airway epithelial organoids (AOs) that contain all major cell populations and allow personalized human disease modelling. We collected macroscopically inconspicuous lung tissue from non-small-cell lung cancer (NSCLC) patients undergoing medically indicated surgery and isolated epithelial cells to engineer 3D organoids. We exploit the potential to derive sub-clones from AOs to demonstrate the feasibility of CRISPR gene editing. Finally, we show that AOs readily allow modelling of viral infections such as RSV and for the first time demonstrate the possibility to study neutrophil-epithelium interaction in an organoid model. Taken together, we anticipate that human AOs will find broad applications in the study of adult human airway epithelium in health and disease. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001004944 
   
  
    
    The dataset is composed of 62 samples (31 subjects before and after probiotic-like bacteria treatment). Sequencing was performed using Illumina HiSeq 2500. Fastq files are provided. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  61 
 
  
    EGAD00001004945 
   
  
    
    This dataset contains 70 human LV H3K27ac ChIP-seq paired-end FASTQ files.
The sequencing was performed using Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  70 
 
  
    EGAD00001004946 
   
  
    
    Whole exome sequencing data for 18 mucoepidermoid carcinoma samples. The samples were used for Illumina TruSeq library construction and captured using Agilent V4 exome panel. The PE fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001004948 
   
  
    
    The collection and use of tissue for this study had Melbourne Health institutional review board approval and patients provided written informed consent (Melbourne Health Local Project Number: 2016.087). Following the prostatectomy of 13 patients, ranging from 52 to 78 years of age and from CAPRA-S risk score of 0 (attributed to benign tissue samples, harvested from a site far from a low grade, low volume cancer) to 7 (Supplementary file 2), a four millimeter tissue core was collected from the prostate tumour site, conditional on histopathological verification [66,67]. If not otherwise specified, all procedures were carried out at 4 °C. Tissue blocks were washed in phosphate-buffered saline (PBS) solution for 2 minutes and minced for 2 minutes with a scalpel. Homogenised tissue was added to a solution (total volume of 7 ml) composed of 1 mg/ml collagenase IV (Worthington Biochemical Corp, USA), 0.02 mg/ml DNase 1 (New England Biolabs, USA) and 0.2 mg/ml dispase (Merck, USA). The homogenised tissue was serially digested at 37 °C at 180 rpm, through three steps of 5, 10 and 10 minutes of duration, with the final 3 minutes dedicated to sedimentation at 0 rpm. After each digestion step, the supernatant was aspirated and filtered through a 70 μm strainer into a pre-chilled tube, diluting the solution with 15 ml of 2% bovine serum PBS to quench the enzymatic reaction. The resulting cumulative solution was then centrifuged at 1500 rpm for five minutes, with the supernatant collected and the cell pellet resuspended into 1 ml 2% PBS-serum prior to labelling (Fig. S1). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  52 
 
  
    EGAD00001004949 
   
  
    
    The dataset contains exome sequencing data of seven healthy family members, all in FASTQ format. The samples were taken from peripheral blood mononuclear cells. Furthermore, corresponding proteomics data are available as well. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001004950 
   
  
    
    March 2019 data update for cord blood CD34+CD38-, CMP, GMP, MEP, monocyte, erythroid precursor, B cell and primary AML total blasts reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency as part of the International Human Epigenome Consortium. This dataset contains data for samples: CEMT0158 CEMT0159 CEMT0160 CEMT0161 CEMT0162 CEMT0163 CEMT0164 CEMT0165 CEMT0166 CEMT0167 CEMT0168 CEMT0169 CEMT0170 CEMT0171 CEMT0172 CEMT0189 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001004951 
   
  
    
    Whole-genome sequencing of human individuals from Polynesian and Native American populations, as well as 10x Genomics Chromium data from Polynesian, Native American and Aboriginal Australian populations, allowing for experimental phasing of haplotypes.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-04-11. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  68 
 
  
    EGAD00001004952 
   
  
    
    Small-molecule inhibitors targeting the most commonly activated pathway in melanoma, the MAPK pathway (either alone or in combination), have been given to melanoma patients for a few years. Although they initially reduce tumour burden dramatically, melanomas eventually become resistant and tumours progress while on treatment. Resistance to this treatment occurs by acquisition of additional mutations or other alterations that affect the mitogen-activated protein kinase (MAPK) pathway by either direct or indirect signalling. Many resistance mechanisms lead to reactivation of extracellular signal-regulated kinase (ERK), thereby restoring signalling of the oncogenic BRAF/MEK/ERK pathway. In addition, PI3K pathway activation contributes to resistance to BRAF inhibition. Less frequent but equally important to the phenomenon of targeted drug resistance is the observation that approximately 15-20% of BRAF mutant melanoma patients fail to respond to BRAF inhibition early in treatment, owing to intrinsic resistance. These patients have few therapeutic options, unless immunotherapy can be given. To better understand the resistance mechanisms in MAPK inhibitor-treated melanoma patients and melanoma biology, our lab generated a large panel of MAPK inhibitor resistant melanoma cell lines by continuous drug exposure. Understanding the genetic landscape and gene expression, as well as cross resistance to other treatment regimens and other aspects of melanoma biology such as phenotype switch, will allow us to better exploit new therapeutic strategies for melanoma patients. 
This dataset contains all the data available for this study on 2019-04-11. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  34 
 
  
    EGAD00001004953 
   
  
    
    Single cell + bulk genomics study for immune and hematopoietic organs during human fetal development. 
This dataset contains all the data available for this study on 2019-04-11. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001004954 
   
  
    
    We aim to describe the transcriptomic landscape of infant spindle cell tumours. 
This dataset contains all the data available for this study on 2019-04-11. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  38 
 
  
    EGAD00001004955 
   
  
    
    The aim of this project is to test whether HPV oncogenes and/or interferon induce the APOBEC mutational signature in vitro, and to test the role of APOBEC3A in this process. 
This dataset contains all the data available for this study on 2019-04-11. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001004956 
   
  
    
    16S rDNA amplicon sequencing of 196 human fecal samples of an Inulin cross-over trial in healthy, mildly constipated individuals. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  196 
 
  
    EGAD00001004958 
   
  
    
    Highly recurrent U1 snRNA mutations drive alternative splicing in HH medulloblastoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  234 
 
  
    EGAD00001004959 
   
  
    
    This dataset contains all samples used in study 'Whole genome characterisation of 5-FU treated organoids' 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001004960 
   
  
    
    Multi-region exome sequencing of 116 pulmonary nodules, including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=22), adenocarcinoma in situ (AIS, N=27), minimally invasive adenocarcinoma (MIA, N=54) and invasive lung adenocarcinoma (ADC, N=13). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  320 
 
  
    EGAD00001004961 
   
  
    
    We sequenced a total of 2 H3.3K27WT (pcGBM2, G477; 3 replicates total) and 2 H3.3K27M (DIPGVI, DIPGXIII; 6 replicates total) patient-derived cell lines as well as CRISPR/Cas9 H3.3K27M-KO clones for one of the cell lines (3 replicates total; DIPGXIII-KO) using ATAC-Seq. P-XX designates passages of replicates. These samples can be found at GEO under accession number GSE128744. This repository contains 1 replicate of G477 to be released under controlled access. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001004962 
   
  
    
    Amplicon sequencing of 10 patients 
    
   
  
    
   
  95 
 
  
    EGAD00001004963 
   
  
    
    Whole exome paired-end sequencing was performed on a trio (patient and parents), in which the patient has primary immunodeficiency, to identify the genetic cause of the immunodeficiency. Analysis revealed a novel homozygous mutation in IL2RB. 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001004964 
   
  
    
    Whole exome sequencing data for matched normal and endometriosis samples. Samples were prepared using the Agilent SureSelect capture kit, and sequenced on an Illumina HiSeq 2500. The submitted files are in BAM format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001004965 
   
  
    
    We performed RNA sequencing in whole blood from the same 65 individuals from the PIVUS study at ages 70 and 80 (130 samples) to quantify how gene expression, alternative splicing, and their genetic regulation are altered during this 10-year period of advanced aging. Each individual has four fastq files, two for each age. Consecutive sample IDs refer to the same individual, e.g. PIVUS003 and PIVUS004 are the age 70 and age 80 samples of the first individual. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  130 
 
  
    EGAD00001004966 
   
  
    
    Dataset of adenoma and colon cancer multi-region sequencing. Publication: Nat Ecol Evol. 2018 Oct;2(10):1661-1672. doi: 10.1038/s41559-018-0642-z. Epub 2018 Aug 31. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  139 
 
  
    EGAD00001004967 
   
  
    
    The dataset is referenced by EGA Study ID EGAS00001003605, which includes the short-reads data for 59 samples. All short-reads data files are in fastq format. 
    
   
  
    
      
      unspecified 
      
    
   
  59 
 
  
    EGAD00001004968 
   
  
    
    High-resolution (Sub-5 kbp resolution) Hi-C datasets generated using glioblastoma primary cultures from 3 different adult patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001004969 
   
  
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  172 
 
  
    EGAD00001004971 
   
  
    
    Androgen deprivation therapy treated patients (n=11) were recruited from an open label neoadjuvant phase II study in which patients with high-risk disease received a ‘supercastration’ regimen consisting of degarelix 240/80 mg subcutaneously every four weeks; abiraterone acetate 500 mg orally daily titrating upwards every two weeks by 250 mg to a final dose of 1000 mg daily; bicalutamide 50 mg orally daily; and prednisolone 5 mg orally twice daily for a total of 6 months (Australian New Zealand Clinical Trials Registry 12612000772842).  Untreated patients with similar pre-treatment characteristics were obtained from a prospective prostatectomy biorepository22,23. Prior to ligation of the dorsal venous complex and prostate pedicles, the anterior prostate was defatted and the specimen was removed immediately, placed in a sterile container and transferred on ice for long-term storage in the vapour phase of liquid nitrogen. A total of 50–100 µg of adipose tissue was separated from fresh frozen samples stored at −160°C. RNA was isolated using the Qiagen RNeasy Lipid Tissue Mini Kit and eluted in 35 µL nuclease-free water. 0.5–1 µg of total RNA was used as the input for cDNA library synthesis using TruSeq RNA Sample Prep Kit v2 (Illumina), and libraries were constructed according to manufacturer’s instructions. Samples were sequenced on a HiSeq 2500 (Illumina) using 101 base paired-end chemistry, aiming for 50 million mapped paired-end reads per sample. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001004977 
   
  
    
    Bone marrow mononuclear cells from patients diagnosed with B cell precursor acute lymphoblastic leukemia were obtained at three sequential time points: first diagnosis, remission after chemotherapy and relapse. Genomic DNA was isolated and targeted gene panel sequencing was performed using a customized biotinylated RNA oligo pool (SureSelect, Agilent, Santa Clara, California) to hybridize the target regions comprising 362 kbp on a HiSeq2000. Target regions were selected to validate mutations previously identified in these samples using whole exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  150 
 
  
    EGAD00001004978 
   
  
    
    The study includes NGS-based WGBS on one sperm DNA sample pooled from 30 participants, and methylC-capture sequencing (MCC-Seq) on the same pooled sperm sample as well as 45 sperm DNA samples derived from both fertile and infertile individuals in two cohorts (Toronto, a fertile cohort; Montreal, an idiopathic infertility cohort).  All the data were generated with 100bp paired-end reads using the Illumina HiSeq2000 or 4000 systems. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  47 
 
  
    EGAD00001004979 
   
  
    
    The dataset reports the 16S rRNA gene sequencing of the fecal microbiota of donors from the Milieu Intérieur Cohort. The Milieu Intérieur cohort includes a total of 1,000 healthy individuals of western European ancestry, recruited in France as part of the Milieu Intérieur project. To assess their fecal microbiota composition, 16S rRNA profiles were generated from stool samples of 863 of the 1,000 donors. Human stool samples were produced at home no more than 24 hours before the scheduled medical visit and collected in a double-lined sealable bag maintaining strict anaerobic conditions. Upon reception at the clinical site, the fresh stool samples were aliquoted and stored immediately at -80°C. DNA was extracted from stool and barcoding PCR was carried out using indexed primers targeting the V3-V5 region of the 16S rRNA gene. Equal volumes of normalized PCR reaction were pooled and thoroughly mixed. The amplicon libraries were sequenced on Illumina MiSeq. 
    
   
  
    
   
  1311 
 
  
    EGAD00001004981 
   
  
    
    BAM files (Illumina HiSeq 2000) with whole genome sequencing data of 49 individuals of European/Romanian descent, and 50 individuals of Roma (Romani/Rroma) ethnic background from Romania. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  99 
 
  
    EGAD00001004982 
   
  
    
    VCF files with whole genome sequencing data of 49 individuals of European/Romanian descent, and 50 individuals of Roma (Romani/Rroma) ethnic background from Romania. 
    
   
  
    
   
  99 
 
  
    EGAD00001004984 
   
  
    
    Fastq files resulting from whole exome sequencing of trios of samples from 6 breast cancer patients: normal breast, pre-NAC biopsy and post-NAC surgical resection. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  17 
 
  
    EGAD00001004985 
   
  
    
    In this study we use expression data from breast cancer tumors to define immune clusters in breast cancer. Immune clusters have gradual levels of immune infiltration. In the intermediate immune infiltration cluster, we found a worse prognosis which is independent of known clinicopathological features. We also found that the immune clusters are associated with treatment response. We further use gene expression data and deconvolution algorithms to dissect the immune contexture of the clusters. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  97 
 
  
    EGAD00001004987 
   
  
    
    This dataset pertains to mitochondrial DNA amplicon sequencing of paired DNA samples from gingivo-buccal oral cancer patients. DNA was isolated from the tumor and blood tissues of 89 patients (178 samples). The sequencing libraries were prepared from whole mitochondrial amplicons using the Nextera XT DNA Library Preparation Kit (Illumina) and sequenced on an Illumina HiSeq platform. The uploaded BAM files were generated by aligning paired-end reads to the mitochondrial reference sequence (rCRS) using BWA-MEM. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Ion Torrent PGM 
      
    
   
  178 
 
  
    EGAD00001004988 
   
  
    
    Collection of RNA-seq, Illumina, paired-end fastq files for 370 archival tissues from a subset of patients with high grade serous ovarian carcinoma enrolled in the phase 3 ICON7 trial. Clinical data and digital pathology information for CD8 is also available. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  370 
 
  
    EGAD00001004989 
   
  
    
    This dataset contains a total of 10 families and 51 samples presented in the study 'Non-invasive prenatal diagnosis by genome-wide haplotyping of cell-free plasma DNA'; 9 of the samples are cfDNA samples and the rest are gDNA samples. All samples are targeted sequencing data generated with a custom 45 Mb capture library. Raw paired-end fastq files are available for each sample. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  51 
 
  
    EGAD00001004990 
   
  
    
    Raw RNA-seq data for WT and 2 Gorlin NES cells, tumors derived from MYCN mis-expressed WT NES cells, tumors derived from Gorlin NES cells, and tumors derived from Gorlin NES cells transduced with mutant DDX3X (R351W and R534S) and CRISPR/Cas9 targeting GSE1 (each sample has 3 replicates/tumors except Gorlin NES cell tumors have 4). Raw whole exome sequencing data for  WT and Gorlin 1 NES cells, tumors derived from MYCN mis-expressed WT NES cells, and tumors derived from Gorlin NES cells (each sample has 3 replicates/tumors except Gorlin NES cell tumors have 4). Raw data for amplicon sequencing of GSE1 and KDM3B at regions targeted by CRISPR/Cas9 in Gorlin NES cells. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  44 
 
  
    EGAD00001004991 
   
  
    
    Paired-end RNA-seq FASTQ files from 21 newborn screening dried blood spot (DBS) samples. These DBS samples were obtained from extremely low gestational age newborns, of whom 10 were affected by a fetal inflammatory response (FIR) before birth and 11 were unaffected. Total RNA was sequenced using an Illumina NextSeq-500 instrument. The sample preparation protocol included the depletion of rRNA and globin mRNA using the Globin Zero Gold rRNA Removal Kit from Illumina. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs). Each sample was sequenced in 4 lanes, leading to 8 FASTQ files per sample and a total of 21x8=168 FASTQ files. There are an additional 8 FASTQ files corresponding to sample BS13, which was downsampled to 1/4 of its original depth (see BS13_README file for details). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001004992 
   
  
    
    H3K27me3 ChIP-Seq of 6 samples and H3K27ac ChIP-Seq of 4 samples with respective input controls 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001004993 
   
  
    
    Ribodepletion RNA-Seq of 8 samples and polyA RNA-Seq of 22 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001004994 
   
  
    
    WXS of 3 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001004996 
   
  
    
    Total RNA was extracted using RNAble (Eurobio), cleaned up with RNeasy columns (Qiagen) and sequenced. The libraries were prepared at the Genomics Platform of the Cochin Institute, following the TruSeq Stranded mRNA protocol (Illumina), starting from 1 µg of high quality total RNA. Paired-end (2 × 75 bp) sequencing was performed on a NextSeq 500 platform (Illumina).
FASTQ sequences were aligned to the hg19 (GRCh37) human reference genome with STAR (v.2.5.2a). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  134 
 
  
    EGAD00001004997 
   
  
    
    Whole‐exome sequencing was performed using NimbleGen MedExome capture (Roche NimbleGen, Madison, WI, USA) from 1 μg of high quality genomic DNA, followed by sequencing of libraries using paired-end mode (2 × 75 bp) on a NextSeq 500 platform (Illumina, San Diego, CA, USA), at the Genomics Platform of the Cochin Institute. 
Reads were aligned to hg19 (GRCh37) using BWA v0.7.17. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  86 
 
  
    EGAD00001004998 
   
  
    
    Small RNAs (<100 bases in length) were purified from total RNA using the miRNeasy kit (Qiagen), then sequenced. 
Libraries were prepared at the Genomics Platform of the Cochin Institute, following the TruSeq small RNA protocol (Illumina), starting from 1 µg of high quality total RNA. 
Single-read (1 × 75 bp) sequencing was performed on a NextSeq 500 platform (Illumina). 
FASTQ sequences were aligned on miRBase v.2052, then counted with STAR (v.2.5.2a). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  111 
 
  
    EGAD00001004999 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  41 
 
  
    EGAD00001005000 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  42 
 
  
    EGAD00001005001 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  40 
 
  
    EGAD00001005002 
   
  
    
    This dataset maps gene expression regulation in human primary regulatory CD4+ T cells (Tregs). It includes whole genome sequence data for ATAC-seq (114 samples). The final quality-filtered set included 73 individuals with ATAC-seq. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  114 
 
  
    EGAD00001005003 
   
  
    
    Isolation of bacteria in infected brains in patients with Parkinson's disease. Here we used next generation sequencing of 16S ribosomal RNA gene PCR amplicons (NGS 16S amplicon analysis). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  22 
 
  
    EGAD00001005006 
   
  
    
    PAGE Dataset Mar 2019 
    
   
  
    
   
  2595 
 
  
    EGAD00001005007 
   
  
    
    PAGE Dataset Mar 2019 
    
   
  
    
   
  2595 
 
  
    EGAD00001005008 
   
  
    
    PAGE Dataset Mar 2019 
    
   
  
    
   
  875 
 
  
    EGAD00001005009 
   
  
    
    Paired-end RNA-seq BAM files from 21 newborn screening dried blood spot (DBS) samples. These DBS samples were obtained from extremely low gestational age newborns, of whom 10 were affected by a fetal inflammatory response (FIR) before birth and 11 were unaffected. Total RNA was sequenced using an Illumina NextSeq-500 instrument. The sample preparation protocol included the depletion of rRNA and globin mRNA using the Globin Zero Gold rRNA Removal Kit from Illumina. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs). There is one BAM file per sample, plus an additional BAM file corresponding to sample BS13, which was downsampled to 1/4 of its original depth (see BS13_README file for details). 
    
   
  
    
   
  21 
 
  
    EGAD00001005010 
   
  
    
    This dataset was conceived to characterize the epigenomic landscape of representative iBCP-ALL subtypes. To do so, we performed Whole Genome DNA bisulfite-sequencing on 2 MLL-AF4, 2 MLL-AF9 and 2 non-MLL rearranged leukemias, and also 2 pools of BCPs obtained from fetal liver. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  8 
 
  
    EGAD00001005011 
   
  
    
    Whole genome sequencing and genotyping of samples from BFMOS (Mossi from Burkina Faso) 
    
   
  
    
   
  111 
 
  
    EGAD00001005012 
   
  
    
    Whole genome sequencing and genotyping of samples from CMBAN (Bantu from Cameroon) 
    
   
  
    
   
  76 
 
  
    EGAD00001005013 
   
  
    
    Whole genome sequencing and genotyping of samples from CMSBA (Semibantu from Cameroon) 
    
   
  
    
   
  63 
 
  
    EGAD00001005014 
   
  
    
    Whole genome sequencing and genotyping of samples from TZWAS (Wasaamba from Tanzania) 
    
   
  
    
   
  174 
 
  
    EGAD00001005015 
   
  
    
    Whole genome sequencing and genotyping of samples from TZCHA (Chagga from Tanzania) 
    
   
  
    
   
  156 
 
  
    EGAD00001005016 
   
  
    
    Whole genome sequencing and genotyping of samples from TZPAR (Pare from Tanzania) 
    
   
  
    
   
  148 
 
  
    EGAD00001005017 
   
  
    
    B-cell acute lymphoblastic leukemia (B-cell ALL) is the most common cancer in childhood. Studying identical twins with B-cell ALL provides a unique and tractable model for deciphering the developmental timing of pre- and post-natal mutations contributing to clonal evolution. To date, this has mainly focused on major cytogenetic subgroups of childhood B-cell ALL, including MLL fusions, ETV6-RUNX1, hyperdiploidy, and BCR-ABL1. However, formal demonstration of the prenatal origin and “backtracking” the natural history of the leukemia remains understudied in “B-other”/Normal Karyotype (NK) B-cell ALL. To characterize the genetic landscape of this particular leukemia subtype, we performed whole genome DNA- and B-cell receptor (BCR)-sequencing on a pair of 8-month-old monozygotic twins diagnosed with concordant “B-other”/NK B-cell ALL. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001005018 
   
  
    
    B-cell acute lymphoblastic leukemia (B-cell ALL) is the most common cancer in childhood. Studying identical twins with B-cell ALL provides a unique and tractable model for deciphering the developmental timing of pre- and post-natal mutations contributing to clonal evolution. To date, this has mainly focused on major cytogenetic subgroups of childhood B-cell ALL, including MLL fusions, ETV6-RUNX1, hyperdiploidy, and BCR-ABL1. However, formal demonstration of the prenatal origin and “backtracking” the natural history of the leukemia remains understudied in “B-other”/Normal Karyotype (NK) B-cell ALL. To characterize the epigenetic landscape of this particular leukemia subtype, we performed DNA bisulfite-sequencing on a pair of 8-month-old monozygotic twins diagnosed with concordant “B-other”/NK B-cell ALL. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001005019 
   
  
    
    Whole genome sequencing of 92 individuals from 44 African indigenous populations. Sequences were generated with the Illumina HiSeq 2000 sequencing system; data uploaded in BAM format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  43 
 
  
    EGAD00001005020 
   
  
    
    The incidence of brain metastases in breast cancer (BCBM) patients is increasing. These patients have a very poor prognosis and therefore identification of blood-based biomarkers, such as circulating tumor cells (CTCs) and understanding the genomic heterogeneity could help to personalize treatment options. In this study, DNA from individual CTCs as well as corresponding primary tumors and brain metastases were analyzed by next generation sequencing (NGS) in order to evaluate copy number aberrations and single nucleotide variations (SNVs). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001005021 
   
  
    
    BAM files for 16 meningioma tumor samples; ChIP-seq performed on Illumina HiSeq 2000 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001005022 
   
  
    
    ONT MinION reads to provide 30x coverage for a patient with ataxia-pancytopenia syndrome. 
    
   
  
    
      
      MinION 
      
    
   
  1 
 
  
    EGAD00001005023 
   
  
    
    The COMPARE study enrolled 29,066 British blood donors between February 2016 and March 2017; the aim of the study is to find the optimum technology for haemoglobin screening (ISRCTN 90871183). All participants were active blood donors at the time of recruitment. The 4,796 participants in this dataset have consented to join the NIHR BioResource. Genotyping data was produced using the Thermo Fisher Scientific Axiom Genotyping platform. The UK Biobank version 2 array design was used; content has been added to this array to allow for accurate DNA-based identification of human blood group antigens. 
    
   
  
    
   
  - 
 
  
    EGAD00001005025 
   
  
    
    Isolation of fungi in infected neural tissues in patients with Parkinson's disease. Here we used next generation sequencing of Internal Transcribed Spacer (ITS) regions, by PCR amplicons (NGS ITS amplicon analysis). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  22 
 
  
    EGAD00001005026 
   
  
    
    The Donor InSight III study, undertaken by Sanquin Research, recruited 3,046 Dutch blood donors between 2015 and 2016. The purpose of the study was to gain more insight into the characteristics of donors, their motivations and health. All participants were active blood donors at the time of recruitment. Genotyping data was produced using the Thermo Fisher Scientific Axiom Genotyping platform. The UK Biobank version 2 array design was used; content has been added to this array to allow for accurate DNA-based identification of human blood group antigens. 
    
   
  
    
   
  - 
 
  
    EGAD00001005027 
   
  
    
    Amplicon sequencing from 45 samples - Amplicons of 16S V3-V4 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  45 
 
  
    EGAD00001005028 
   
  
    
    Analysis of mutational signatures is becoming routine in cancer genomics, with implications for pathogenesis, classification, prognosis, and even treatment decisions. However, the field lacks a consensus on analysis and result interpretation. Using whole-genome sequencing of multiple myeloma (MM), chronic lymphocytic leukemia (CLL) and acute myeloid leukemia, we compare the performance of public signature analysis tools. We describe caveats and pitfalls of de novo signature extraction and fitting approaches, reporting on common inaccuracies: erroneous signature assignment, identification of localized hyper-mutational processes, overcalling of signatures. We provide reproducible solutions to solve these issues and use orthogonal approaches to validate our results. We show how a comprehensive mutational signature analysis may provide relevant biological insights, reporting evidence of c-AID activity among unmutated CLL cases or the absence of BRCA1/BRCA2-mediated homologous recombination deficiency in a MM cohort. Finally, we propose a general analysis framework to ensure production of accurate and reproducible mutational signature data. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001005029 
   
  
    
    Paired immunoglobulin heavy and light chain sequences were obtained from 803 single IgA plasma cells isolated from duodenal biopsies of five celiac disease patients. The cells were specific to discrete antigenic regions of the enzyme TG2, which is the main autoantigen in celiac disease. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  10 
 
  
    EGAD00001005030 
   
  
    
    Brain metastasis (BM) of colorectal cancer (CRC) is rare but lethal and lacks effective therapies or a good understanding of its genomic landscapes. We conduct an analysis of whole-exome sequencing (WES, Illumina HiSeq 2500 sequencing platform) on 11 patient-matched BMs, primary CRC tumours, and adjacent normal tissues; and whole-genome sequencing (WGS, Illumina HiSeq X Ten platform) on 8 patient-matched BMs, primary CRC tumors, and adjacent normal tissues to uncover the whole-genome mutational landscape of colorectal cancer with brain metastasis. 
    
   
  
    
   
  38 
 
  
    EGAD00001005031 
   
  
    
    RNA sequencing of frozen tumor biopsies from patients with blastic plasmacytoid dendritic cell neoplasm. 4 samples. Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001005032 
   
  
    
    Whole-genome sequencing of frozen tumor biopsies from patients with blastic plasmacytoid dendritic cell neoplasm. 10 samples. Illumina HiSeq X-Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001005033 
   
  
    
    The reference human genome is still incomplete, and several non-reference sequences have been derived from diverse populations. With the availability of whole genome sequencing data from multiple individuals, we could construct the pan-genome sequence. Here we provide high quality genome sequencing (~30x coverage) from 185 Han Chinese individuals. All samples were sequenced using an Illumina HiSeq X10 sequencer and paired-end 150-bp reads were produced. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  185 
 
  
    EGAD00001005034 
   
  
    
    Illumina BAMs for a patient with ataxia-pancytopenia syndrome and both of their parents. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001005035 
   
  
    
    Tumor mutational burden (TMB) has emerged as a predictive biomarker of response to immune checkpoint inhibitors. Standardization of TMB measurement is essential for implementing diagnostic tools to guide treatment. Here we evaluate bioinformatic TMB analysis by whole exome sequencing (WES) in formalin-fixed, paraffin-embedded samples. In CheckMate 026, TMB was retrospectively assessed in 312 patients with non-small cell lung cancer (58% of the intent-to-treat population) who received first-line nivolumab treatment or chemotherapy. We examined the sensitivity of TMB assessment to bioinformatic filtering methods and assessed concordance between TMB data derived by WES and the FoundationOne CDx™ assay. TMB scores comprising synonymous, indel, frameshift, and nonsense mutations (all mutations) were 3.1-fold higher than data including missense mutations only, but values were highly correlated (Spearman’s r = 0.99). Scores from CheckMate 026 samples including missense mutations only were similar to those generated from data in The Cancer Genome Atlas, but those including all mutations were generally higher. Using databases for germline subtraction (instead of matched controls) showed a trend for race-dependent increases in TMB scores. Parameter variation can therefore impact TMB calculations, highlighting the need for standardization. Encouragingly, WES and FoundationOne CDx outputs were highly correlated (Spearman’s r = 0.90) and differences could be accounted for by empirical calibration, suggesting that reliable TMB assessment across assays, platforms and centers is achievable. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  368 
 
  
    EGAD00001005037 
   
  
    
    This dataset contains 200 RNA-seq BAM files (142 SLE, 58 healthy individuals). RNA libraries were prepared with the Illumina TruSeq sample preparation kit and were sequenced on Illumina HiSeq2000. 49 bp paired-end reads were mapped to the GRCh37 reference human genome using the GEM mapper. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  200 
 
  
    EGAD00001005038 
   
  
    
    This dataset contains the imputed genotypes for 197 individuals. All individuals were genotyped with the Illumina HumanCoreExome-24 array. The individuals were phased with SHAPEIT and imputed to the 1000 Genomes Project Phase III using IMPUTE2. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity 
    
   
  
    
   
  197 
 
  
    EGAD00001005039 
   
  
    
    This dataset contains the RPKM and raw read counts of expression for all the individuals. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity. 
    
   
  
    
   
  200 
 
  
    EGAD00001005040 
   
  
    
    This dataset contains the clinical phenotypes/covariates information for all the individuals. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity. 
    
   
  
    
   
  200 
 
  
    EGAD00001005041 
   
  
    
    This dataset contains the eQTL summary statistics (nominal pass, significant eQTLs, best associated variant per gene). eQTL mapping was performed with fastQTL. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity. 
    
   
  
    
   
  142 
 
  
    EGAD00001005042 
   
  
    
    This dataset contains the sQTL summary statistics (nominal pass, significant sQTLs). sQTL mapping was performed with QTLtools. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity. 
    
   
  
    
   
  142 
 
  
    EGAD00001005044 
   
  
    
    The majority of embryos that are created through IVF do not implant. It seems plausible that rates of implantation would improve if we had a better understanding of molecular factors affecting embryo competence. Currently, the process of selecting an embryo for uterine transfer utilizes an ad-hoc combination of morphological criteria, the kinetics of development, and genetic testing for aneuploidy. However, no single criterion can ensure selection of a viable embryo. In contrast, RNA-sequencing of embryos could yield highly dimensional data, which may provide additional insight and illuminate the discrepancies among current selection criteria. Indeed, recent advances enabling the production of RNA-sequencing (RNA-seq) libraries from single cells have facilitated the application of this technique to the study of some transcriptional events in early human development. However, these studies have not assessed the quality of their constituent embryos relative to commonly used embryological criteria. Here, we perform proof-of-principle advancement to clinical selection procedures by generating high quality RNA-seq libraries from a trophectoderm biopsy as well as the remaining whole embryo. We combine state-of-the-art embryological methods with low-input RNA-seq to develop the first transcriptome-wide approach for use in future predictive embryology studies. Specifically, we demonstrate the capacity of RNA-seq as a promising tool in preimplantation screening by showing that biopsies of an embryo can capture valuable information content available in the whole embryo from which they are derived. Furthermore, we show that this technique can be used to generate an RNA-based digital karyotype, and to identify candidate competence-associated genes. Together, these data establish the foundation for a future RNA-based diagnostic in IVF. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001005046 
   
  
    
    The BAM files for WES and RNA-seq used in the article "Molecular Profiling Reveals Unique Immune and Metabolic Features of Melanoma Brain Metastases", Cancer Discovery, 2019. PMID: 30787016; PMCID: PMC6497554.
Authors: Grant M Fischer, ..., Michael A Davies. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  199 
 
  
    EGAD00001005047 
   
  
    
    In order to characterize the T cell receptor (TCR) repertoire of DQ2.2-glut-L1-specific T cells, we performed high-throughput DNA sequencing of rearranged TCR-α and TCR-β genes of the single HLA-DQ2.2:DQ2.2-glut-L1 tetramer binding CD4+ T cells isolated from six T-cell lines (TCLs) of four celiac disease patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001005048 
   
  
    
    In order to characterize the T cell receptor (TCR) repertoire of DQ2.5-hor-3-specific T cells, we performed high-throughput DNA sequencing of rearranged TCR-α and TCR-β genes of the single HLA-DQ2.5:DQ2.5-hor-3-tetramer binding CD4+ T cells isolated from biopsies of celiac disease patients. We also sequenced the TCR of the T-cell clones (TCCs) that were generated by cloning by limited dilution and antigen-free expansion of HLA-DQ2.5:DQ2.5-hor-3-tetramer binding CD4+ T cells from biopsies of celiac disease patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  14 
 
  
    EGAD00001005049 
   
  
    
    5000 cells of each subset of CD8 T cells (CD103-KLRG1+, CD103-KLRG1- and CD103+ from LP and CD103+ IELs) were sorted into tubes. A modified SMART protocol was used in first-strand cDNA synthesis, and TCRalpha / TCRbeta genes were amplified in two rounds of semi-nested PCR reaction, following the method described in detail in Risnes et al., 2018. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  40 
 
  
    EGAD00001005050 
   
  
    
    Single-cell TCRalpha-beta sequencing of LP CD103+ CD8 T cells from the grafted/native duodenum of two donors (Ptx#1 and Ptx#2) before and 1 year after transplantation.
Single cells were sorted into 96-well plates. Paired TCRalpha and TCRbeta sequences were obtained after three nested PCRs with multiplexed primers covering all TCRalpha and TCRbeta V genes, as described previously (Risnes et al., 2018) and following the original protocol of Han et al., 2014. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001005052 
   
  
    
    This dataset contains RNA-sequencing of bone marrow-derived CD34+ cells from healthy controls (n=2) and SLE patients (n=8). 
SLE patients are divided into two categories based on severity: patients with moderate/mild disease (n=3) and patients with severe disease (n=5). 
Libraries were generated using the Illumina TruSeq Sample Preparation kit v2. Single-end 75-bp mRNA sequencing was performed on Illumina NextSeq 500. The raw fastq files are uploaded. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  10 
 
  
    EGAD00001005053 
   
  
    
    This dataset contains 4 batches of Indonesian RNA-seq data from Mentawai, New Guinea and Sumba islands. One RNA-seq batch was prepared without Globin depletion, and three batches were Globin depleted using the Illumina Globin-Zero Gold Kit. There are 179 runs in total, including 119 unique samples. Dataset includes multiple batch control samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  179 
 
  
    EGAD00001005054 
   
  
    
    We performed single cell RNA sequencing (scRNA-seq) for 208,506 cells derived from 58 lung adenocarcinomas from 44 patients, which covers primary tumour, lymph node and brain metastases, and pleural effusion in addition to normal lung tissues and lymph nodes. The extensive single cell profiles depicted a complex cellular atlas and dynamics during lung adenocarcinoma progression which includes cancer, stromal, and immune cells in the surrounding tumor microenvironments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  80 
 
  
    EGAD00001005055 
   
  
    
    The goal of this study is to investigate the prevalence and heritability of age-related clonal haemopoiesis (ARCH) in healthy elderly individuals. We will use a bespoke bait set to pull down DNA regions of interest in whole blood samples, combined with deep HiSeq sequencing. By correlating findings from each individual to their respective twin, we hope to elucidate whether heritable traits influence the development of ARCH.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-05-31. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001005056 
   
  
    
    To better understand the pattern of genetic changes over time, we performed whole exome sequencing of sequential bone marrow samples from 9 patients taken over time, including some paired SMM/newly diagnosed MM/Relapse MM samples. 
Samples from 9 patients (9 controls and 53 tumors) underwent whole exome sequencing with an additional capture for the IGH, IGK, IGL, and MYC loci. DNA was obtained from either CD138+ cells from the bone marrow of smoldering myeloma patients through time (tumor) or from stem cell harvests or peripheral blood cells from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using NimbleGen's MedExome. After PCR amplification, hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired-end reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  62 
 
  
    EGAD00001005057 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001005058 
   
  
    
    Identification and tracking of clonal evolution in consecutive human chronic lymphocytic leukemia samples by whole exome sequencing. 
    
   
  
    
      
      Illumina Genome Analyzer II 
      
    
   
  79 
 
  
    EGAD00001005059 
   
  
    
    This dataset reports whole genome sequences for 82 individuals from different populations from Mentawai, New Guinea, Sumatra and Sumba islands. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  82 
 
  
    EGAD00001005060 
   
  
    
    May 2019 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001005061 
   
  
    
    BAM files for 124 samples (62 tumor vs blood pairs); whole genome sequencing performed on Illumina HiSeq X Ten 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  124 
 
  
    EGAD00001005062 
   
  
    
    We performed shallow coverage whole genome sequencing on 147 glioma samples and analyzed their copy number profiles. The coverage of each sample is approximately 2×. Data are presented as VCF files which describe the copy number segments and their log2 ratios. 
    
   
  
    
   
  147 
 
  
    EGAD00001005063 
   
  
    
    Tumor and matching normal tissues were collected from 8 patients and organoids were derived from each tissue. Whole exome libraries were prepared for tumor tissue, normal tissue, tumor organoids and normal organoids, and paired-end sequencing was performed using the Illumina NovaSeq 6000 system. Only tumor and matching normal tissues were sequenced for the patients without available organoids. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001005064 
   
  
    
    Bronchial biopsies were collected by bronchoscopy from healthy and asthmatic volunteers. Cohort inclusion criteria for all subjects were: age between 40 and 65 years and a smoking history of <10 pack-years. For the asthmatics, inclusion criteria were: age of onset of asthmatic symptoms ≤12 years, documented history of asthma, use of inhaled corticosteroids with(out) β2-agonists due to respiratory symptoms and a positive provocation test (i.e. PC20 methacholine ≤8 mg/ml with 2-minute protocol). For the non-asthmatic controls, the following criteria were essential for inclusion: absent history of asthma, no use of asthma-related medication, a negative provocation test (i.e. PC20 methacholine >8 mg/ml and adenosine 5'-monophosphate >320 mg/ml with 2-minute protocol), no pulmonary obstruction (i.e. FEV1/FVC ≥70%) and absence of lung function impairment (i.e. FEV1 ≥80% predicted). Asthmatics stopped inhaled corticosteroid use 6 weeks prior to all tests. All subjects were clinically characterised with pulmonary function and provocation tests, blood samples were drawn, and finally subjects underwent a bronchoscopy under sedation. If a subject developed upper respiratory symptoms, bronchoscopy was postponed for ≥6 weeks.
Fibreoptic bronchoscopy was performed using a standardised protocol during conscious sedation. Six macroscopically adequate endobronchial biopsies were collected for this study, located between the 3rd and 6th generation of the right lower and middle lobe. Extracted biopsies were processed directly thereafter, with a maximum of one hour delay.
The medical ethics committee of the University Medical Center Groningen approved the study, and all subjects gave their written informed consent. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  4087 
 
  
    EGAD00001005065 
   
  
    
    Human lung tissue was obtained from deceased organ donors from whom organs were being retrieved for transplantation. Informed consent for the use of tissue was obtained from the donors’ families (REC reference: 15/EE/0152 NRES Committee East of England - Cambridge South).
Fresh tissue from the peripheral parenchyma of the left lower lobe or lower right lobe of the lung was excised within 60 minutes of circulatory arrest and preserved in University of Wisconsin (UW) organ preservation solution (Belzer UW® Cold Storage Solution, Bridge to Life, USA) until processing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  11 
 
  
    EGAD00001005066 
   
  
    
    Four iPSC lines were sequenced by WGS. One of them carries a modified MYBPC3 gene. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  4 
 
  
    EGAD00001005069 
   
  
    
    Whole genome and transcriptome sequencing of a pancreatic tumor harboring a RASGRP1 gene fusion 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001005070 
   
  
    
    Non-deduplicated bam files comprising Illumina HiSeq2500 SE100 low coverage whole genome data for 30 pre-treatment (BL) cfDNA samples and 20 matched post-treatment (PD) cfDNA samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001005071 
   
  
    
    We showed that mice in which Dnase1l3 had been deleted showed aberrations in the fragmentation of plasma DNA. We also observed a change in the ranked frequencies of end motifs of plasma DNA caused by the Dnase1l3 deletion. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  41 
 
  
    EGAD00001005072 
   
  
    
    We have been applying whole genome and transcriptome sequencing across metastases collected during post mortem. Herein we show the findings for the first such patient. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001005073 
   
  
    
    Despite multiple large-scale sequencing studies offering substantial insight into the genomic landscape of cutaneous melanoma, the molecular events surrounding disease progression and the resulting molecular heterogeneity between metastases have not been fully elucidated. We have been applying whole genome and transcriptome sequencing across metastases collected during post mortem. Herein we show the findings for the first such patient. This is a targeted pulldown validation in support of the whole-genome sequencing analysis of the metastatic tumours and targeted pulldown of the primary tumour. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001005074 
   
  
    
    This dataset includes 48 BAM files, comprising 24 tumors and 24 paired normal samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001005075 
   
  
    
    Deep sequencing of viral samples (average ~9,000x coverage) from patients chronically infected with Hepatitis B (HBV).  
Whole HBV genome sequencing of 1467 patients (1102 in discovery and 365 in validation cohort) chronically infected with HBV at baseline. The patient population contained HBV genotypes A (98), B (285), C (716), D (356), E (7), and F (5) with 977 HBeAg-positive and 490 HBeAg-negative patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1467 
 
  
    EGAD00001005076 
   
  
    
    This dataset is for TrypanoGEN Phase 1: Variant discovery, and includes 233 samples sequenced to approximately 10X coverage. Samples are from Guinea, Côte d’Ivoire, DRC and Uganda and were sequenced using the Illumina HiSeq 2500. 
    
   
  
    
   
  233 
 
  
    EGAD00001005077 
   
  
    
    This dataset contains 3 GBM stem cell samples profiled by RNA-seq. Two of those samples have been profiled with WGS as well (with matched blood WGS data available) 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001005078 
   
  
    
    This dataset includes 139 BAM files of mRNA sequencing. All samples are tumor samples of pediatric acute myeloid leukemia. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  139 
 
  
    EGAD00001005079 
   
  
    
    We want to investigate mosaic mutations as a cause of childhood IBD. 
This dataset contains all the data available for this study on 2019-06-10. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  28 
 
  
    EGAD00001005080 
   
  
    
    This study involves mutagenizing a range of different cell lines with ENU to identify those mutations which engender resistance to targeted treatment. 
This dataset contains all the data available for this study on 2019-06-10. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001005081 
   
  
    
    This study involves mutagenizing 11-18 with ENU to identify those mutations which engender resistance to targeted treatment. 
This dataset contains all the data available for this study on 2019-06-10. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  120 
 
  
    EGAD00001005082 
   
  
    
    Exome sequencing in a set of Asian head and neck cancer cell lines, to identify mutations that can be used to genomically classify the cell lines. 
This dataset contains all the data available for this study on 2019-06-10. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  21 
 
  
    EGAD00001005083 
   
  
    
    300-Obese cohort, Nijmegen, the Netherlands. Dataset contains gut microbiome data generated by metagenomic sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  297 
 
  
    EGAD00001005084 
   
  
    
    Whole genome sequencing of participants from the INTERVAL study. 
This dataset contains all the data available for this study on 2019-06-12. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5112 
 
  
    EGAD00001005085 
   
  
    
    15x Whole Genome Sequencing of 15,000 individuals from the INTERVAL study cohort, phase II.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-06-12. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5592 
 
  
    EGAD00001005086 
   
  
    
    15x Whole Genome Sequencing of 15,000 individuals from the INTERVAL study cohort, phase III.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-06-12. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001005087 
   
  
    
    Multi-omic data for lung neuroendocrine neoplasms, including the first multi-omic sequencing data for the understudied lung atypical carcinoids. The data includes Whole-exomes, whole-genomes, RNA-seq, and EPIC 850K methylation array data. 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
    
   
  23 
 
  
    EGAD00001005088 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  236 
 
  
    EGAD00001005089 
   
  
    
    RNAseq of U251 and two ID1 gene knockouts 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001005092 
   
  
    
    Whole transcriptome (n=4) and targeted RNA sequencing (n=15) .bam files of infantile glioma samples reported on at the Hospital for Sick Children 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  19 
 
  
    EGAD00001005093 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  118 
 
  
    EGAD00001005095 
   
  
    
    This dataset comprises 1440 whole genome sequenced samples from the Medical Genome Reference Bank.
https://sgc.garvan.org.au/initiatives/mgrb
The files are provided in cram format, aligned to hs37d5 with decoys, with no further processing applied.
The dataset also contains phenotype information for each sample. 
    
   
  
    
   
  1440 
 
  
    EGAD00001005097 
   
  
    
    Cancer and germline exomes, and cancer RNA-seq, consisting of FASTQ paired-end reads from melanoma and lung cancer samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  22 
 
  
    EGAD00001005098 
   
  
    
    The dataset consists of forty VCF files, the outcome of variant calling with the CaVEMan algorithm on matched BAM files (tumor and normal) of RRMM patients. BAM files were obtained from whole exome sequencing. 
    
   
  
    
   
  40 
 
  
    EGAD00001005099 
   
  
    
    The dataset consists of two patient-derived xenograft models of myxoid liposarcoma, one sensitive and one resistant to trabectedin. Both models underwent the same treatment schedule with trabectedin (three time points and one treatment with doxorubicin). We performed genomic profiling using the Agilent OneSeq assay. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  41 
 
  
    EGAD00001005100 
   
  
    
    Biopsies from visceral adipose tissue from the omental depot (OAT) were obtained from five obese individuals and one lean donor with participant informed consent obtained after the nature and possible consequences of the studies were explained under protocols approved by the Institutional Review Boards of the Perelman School of Medicine at the University of Pennsylvania, the Children’s Hospital of Philadelphia, or the Tel Aviv Sourasky Medical Center. The obese donors underwent bariatric surgery, the lean donor underwent cholecystectomy. OAT samples were placed in 1 mL of DMEM, and finely minced under sterile conditions before digestion in 50 mL of DMEM with 3 mg/1 mL collagenase IV (Gibco). Samples were incubated at 37°C in a rotating oven for 20-60 min. Adipocyte and stromal vascular fractions (SVF) were separated by centrifugation, and red blood cells (RBCs) were removed from the SVF by histopaque gradient (Sigma). Single-cell RNA-sequencing libraries were prepared using the MARS-seq pipeline, and sequenced on the MiSeq 500 or HiSeq 2500 Sequencing System (Illumina). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  23 
 
  
    EGAD00001005101 
   
  
    
    Biopsies from visceral adipose tissue from the omental depot (OAT) were obtained from an obese individual with participant informed consent obtained after the nature and possible consequences of the studies were explained under protocols approved by the Institutional Review Boards of the Perelman School of Medicine at the University of Pennsylvania, the Children’s Hospital of Philadelphia, or the Tel Aviv Sourasky Medical Center. The obese donor underwent bariatric surgery, the lean donor underwent cholecystectomy. OAT samples were placed in 1 mL of DMEM, and finely minced under sterile conditions before digestion in 50 mL of DMEM with 3 mg/1 mL collagenase IV (Gibco). Samples were incubated at 37°C in a rotating oven for 20-60 min. Adipocyte and stromal vascular fractions (SVF) were separated by centrifugation, and red blood cells (RBCs) were removed from the SVF by histopaque gradient (Sigma). Single-cell RNA-sequencing libraries were prepared using the Chromium platform (10x genomics), and sequenced on the MiSeq 500 or HiSeq 2500 Sequencing System (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001005103 
   
  
    
    RNA sequencing was performed on 54 bone marrow samples at diagnosis of paediatric patients with B lymphoblastic leukemia. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  54 
 
  
    EGAD00001005105 
   
  
    
    Sample set of 74 whole-exome sequencing samples from sporadic Burkitt lymphoma patients from the UK. 33 of these samples have matched constitutional data, giving a total number of 107 samples 
    
   
  
    
      
      unspecified 
      
    
   
  107 
 
  
    EGAD00001005107 
   
  
    
    To identify novel causes of hereditary thrombocytopenia, we performed a genetic
association analysis of whole-genome sequencing (WGS) data from 13,037 individuals
enrolled in the NIHR BioResource, including 233 cases with isolated thrombocytopenia.
We found an association between rare variants in the transcription factor (TF)-encoding
gene IKZF5 and thrombocytopenia. We report five causal missense variants in or near
IKZF5 zinc fingers (Znfs), of which two occurred de novo and three co-segregated in three
pedigrees. A canonical DNA-Znf binding model predicts that three of the variants alter
DNA recognition. Expression studies showed that chromatin binding was disrupted in
mutant compared to wild-type (WT) IKZF5 and electron microscopy (EM) revealed a
reduced quantity of alpha granules in normally sized platelets. Proplatelet formation (PPF)
was reduced in megakaryocytes (MKs) from seven cases relative to six controls.
Comparison of RNA-seq data from platelets, monocytes, neutrophils and CD4+ T-cells
from three cases and 14 healthy controls showed 1,194 differentially expressed genes
(DEGs) in platelets but only four DEGs in each of the other blood cell types. In conclusion,
IKZF5 is a novel transcriptional regulator of megakaryopoiesis and the eighth transcription
factor associated with dominant thrombocytopenia in humans. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  51 
 
  
    EGAD00001005109 
   
  
    
    This dataset contains primary raw data of whole-genome sequencing, whole-genome bisulfite sequencing, ATAC-seq, ChIP-seq (ChIP-mentation) of histone variants and modifications, as well as RNA-seq of giant cell tumor of bone tissue and primary cell line samples. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001005111 
   
  
    
    Most patients with late stage high-grade serous ovarian cancer (HGSOC) initially respond to chemotherapy but inevitably relapse and develop resistance, highlighting the need for novel therapies to improve patient outcomes. The MEK/ERK pathway is activated in a large subset of HGSOC, thus making it an attractive therapeutic target. Here, we systematically evaluated the extent of MEK/ERK pathway activation and efficacy of pathway inhibition in a large panel of well-annotated HGSOC patient-derived xenograft (PDX) models. The vast majority of models were nonresponsive to the MEK inhibitor cobimetinib (GDC-0973) despite effective pathway inhibition.  Proteomic analyses of adaptive responses to GDC-0973 revealed that GDC-0973 upregulated the pro-apoptotic protein BIM, thus priming the cells for apoptosis regulated by BCL2-family proteins. Indeed, combination of both MEK inhibitor and dual BCL-2/XL inhibitor (ABT-263) significantly reduced cell number, increased cell death and displayed synergy in vitro in most models. In vivo, the GDC-0973 and ABT-263 combination was well tolerated and resulted in greater tumor growth inhibition than single agents. Detailed proteomic and correlation analyses identified two subsets of responsive models – those with high BIM at baseline that was increased with MEK inhibition and those with low basal Bim and high pERK levels.  Models with low BIM and low pERK were non-responsive. Our findings demonstrate that combined MEK and BCL-2/XL inhibition has therapeutic activity in HGSOC models and provide a mechanistic rationale for clinical evaluation  of this drug combination as well as the assessment of the extent to which BIM and/or pERK levels predict drug combination effectiveness in chemoresistant HGSOC. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  14 
 
  
    EGAD00001005112 
   
  
    
    The study focus was differential expression in bronchial biopsies between persistent asthma, asthma in remission and healthy controls using RNAseq. There were 184 samples that passed QC. RNA samples were processed using the TruSeq Stranded Total RNA Sample Preparation Kit (Illumina, San Diego, CA), using an automated procedure in a Caliper Sciclone NGS Workstation (PerkinElmer, Waltham, MA). In this procedure, all cytoplasmic and mitochondrial rRNA was removed (RiboZero Gold kit). The obtained cDNA fragment libraries were loaded in pools of multiple samples onto an Illumina HiSeq2500 sequencer using default parameters for paired-end sequencing (2 × 100 bp). Data are available as 221 pairs of FASTQ-files. Note that several samples are associated with multiple sequence runs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  184 
 
  
    EGAD00001005113 
   
  
    
    50 samples of 16 individuals with Gastrointestinal Tumor. Patients were sequenced in various combinations of WGS, Exome and RNA sequencing. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  50 
 
  
    EGAD00001005114 
   
  
    
    A targeted gene panel that covers coding, noncoding, and short tandem repeat regions improves the diagnosis of patients with neurodegenerative diseases 
    
   
  
    
      
      NextSeq 500 
      
    
   
  136 
 
  
    EGAD00001005115 
   
  
    
    Additional unpublished RNA-seq data generated in conjunction with our multi-tissue ChIP-seq project. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001005116 
   
  
    
    Panel-based next-generation sequencing data of 150 human surgical liver samples from Caucasian donors. The panel was designed for 340 ADME (absorption, distribution, metabolism and excretion) and ADME-related genes. NGS was carried out on the Illumina HiSeq2500 system (Illumina Inc., San Diego, CA, United States) at high depth with 2 × 100 bp paired-end reads. Variants were called using samtools and varscan (2.3.5). Data on n=15,727 filtered variants for the 150 patients are contained in one VCF file. 
    
   
  
    
   
  150 
 
  
    EGAD00001005118 
   
  
    
    The dataset contains BAM files from DigiPico runs as well as bulk sequencing data used in the "A highly accurate platform for clone-specific mutation discovery enables the study of active mutational processes" publication. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
      NextSeq 550 
      
    
   
  22 
 
  
    EGAD00001005120 
   
  
    
    Whole genome sequencing of AML blood or bone marrow at presentation and remission for 5 patients. Relapse samples are included for 2 patients, totaling 12 WGS BAM files. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001005121 
   
  
    
    MASQ targeted amplicon sequencing of AML blood or bone marrow at presentation, and relapse, when available, for 5 patients. Remission samples of both blood and bone marrow are included for 5 patients. Multiple batches (b1,b2) are used for 2 patients. There are 25 assays of AML data. MASQ data demonstrating sensitivity, input range, and batch size are also included, as 12 assays. All data is provided in paired FASTQ files. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  37 
 
  
    EGAD00001005122 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Leber Hereditary Optic Neuropathy (LHON) Rare Disease domain 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005123 
   
  
    
    Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Ehlers-Danlos (ED) and ED-like Syndromes (EDS) Rare Disease domain. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005124 
   
  
    
    ChIP-seq and RNA-seq of glioblastoma initiating cells and their differentiated counterparts with and without inhibition/knockdown of KDMs. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  90 
 
  
    EGAD00001005125 
   
  
    
    PARN sequences of Patients 1 and 2 carrying mutations as described in Benyelles et al. (EMBO Molecular Medicine, 2019) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001005126 
   
  
    
    All sequencing was performed at DNAlink (Korea) using Solexa sequencing technology (Illumina, San Diego, CA). 1 µg of genomic DNA was sheared to an average size of 150 bp using the Covaris System. The libraries were prepared using the TruSeq DNA Sample Prep Kit (Illumina). The purified DNA library was hybridized with the SureSelect Human All Exon V3 probe set (Agilent Technologies) to capture 50 Mb of targeted exons following the manufacturer’s instructions. Exome capture was carried out using the Agilent SureSelect Human All Exon 50Mb Kit. The captured exome libraries were sequenced on the Illumina HiSeq2000 using the manufacturer’s recommended protocols. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  224 
 
  
    EGAD00001005127 
   
  
    
    Single cell transcriptome atlas of immune cells in human small intestine and in Celiac disease 
    
   
  
    
      
      NextSeq 500 
      
    
   
  45 
 
  
    EGAD00001005128 
   
  
    
    We profiled two human fetal brainstem specimens at 17 and 19 post-conception weeks by single-cell RNA-seq using 10X Chromium Single Cell 3'. The BAM files are provided. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001005129 
   
  
    
    We profiled 11 patient tumor samples by single-cell and single-nuclei RNA-seq using 10X Chromium 3'. These include samples from the following entities: WNT-subtype medulloblastoma (N=3), embryonal tumors with multilayered rosettes (N=3), and atypical teratoid-rhabdoid tumors (N=5). The BAM files are provided. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      unspecified 
      
    
   
  11 
 
  
    EGAD00001005130 
   
  
    
    We profiled 43 normal human adult brain and 11 normal human fetal brain specimens by bulk RNA-seq. The raw fastqs are provided. 
    
   
  
    
      
      unspecified 
      
    
   
  54 
 
  
    EGAD00001005131 
   
  
    
    We profiled 186 patient tumor samples by bulk RNA-seq. These include 38 embryonal brain tumors, 101 high-grade gliomas, 24 low-grade gliomas, 10 medulloblastomas and 13 matched normals. The raw fastqs are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      unspecified 
      
    
   
  186 
 
  
    EGAD00001005133 
   
  
    
    RNA-sequencing was carried out on ascitic fluid-isolated mesothelial cells from low-grade serous ovarian cancer patients, high-grade serous ovarian cancer patients, chemotherapy-treated high-grade serous ovarian cancer patients and control mesothelial cells obtained from non-oncologic patients to identify differentially expressed genes associated with the mesothelial-to-mesenchymal transition process. The dataset contains 18 samples: 
- Control mesothelial cells: 4 samples
- Group 1, high-grade serous ovarian cancer patients: 3 samples
- Group 2, chemotherapy-treated high-grade serous ovarian cancer patients: 5 samples
- Group 3, low-grade serous ovarian cancer patients: 6 samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  18 
 
  
    EGAD00001005134 
   
  
    
    We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001005135 
   
  
    
    We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  59 
 
  
    EGAD00001005136 
   
  
    
    We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  15 
 
  
    EGAD00001005137 
   
  
    
    Whole genome sequencing data of parent blood samples. Single cell full-length RNA-seq and PBAT-Seq data of in vitro cultured D6 to D14 human embryos. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  638 
 
  
    EGAD00001005138 
   
  
    
    Exome sequencing of 277 rainforest hunter-gatherers (RHG) and neighbouring farmers (AGR) from Central Africa was performed based on the Nextera Rapid Capture Expanded Exome Kit (62-Mb content) with the Illumina HiSeq 2500. After QC filters, exomes of 266 unrelated individuals were obtained at high coverage (Lopez et al., Curr Biol 2019). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  277 
 
  
    EGAD00001005139 
   
  
    
    Exome sequencing of 20 rainforest hunter-gatherers (RHG) and 20 neighbouring farmers (AGR) from western central Africa was performed using 101-bp paired-end reads on Illumina HiSeq 2000. All individuals presented very low rates of missing values ranging from 0.5% to 4%, and a mean depth of coverage of 6.5× (ranging from 4× to 13×) (Lopez et al., Curr Biol 2019). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001005140 
   
  
    
    To elucidate the biological pathways altered by sphingolipid modulation with N-(4-hydroxyphenyl) retinamide (4HPR) treatment in human HSPC that may contribute to the restraint in proliferation while promoting persistence of HSC self-renewal, as well as to determine the mechanism of synergy in enhancement of HSC self-renewal with CB CD34+ agonists UM171 and StemRegenin 1 (SR1), we performed RNA-sequencing (RNA-Seq) of 3 pools of lin-CB cells following 2 or 4 days with DMSO, 4HPR, UM171+SR1 or 3-Factor (4HPR+UM171+SR1). We found that modulation of sphingolipid metabolism regulates self-renewal by activating coordinated stress pathways that coalesce on endoplasmic reticulum stress and autophagy programs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  25 
 
  
    EGAD00001005141 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005142 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005143 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005144 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005145 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005146 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005147 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005148 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005149 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005150 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 36 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005151 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 36 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005152 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005153 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005154 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005155 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005156 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005157 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005158 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005159 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005160 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005161 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005162 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005163 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005164 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005165 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005166 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005167 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005168 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005169 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005170 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005171 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005172 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005173 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005174 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005175 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005176 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005177 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005178 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005179 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005180 
   
  
    
    Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005181 
   
  
    
    Whole genome sequencing dataset of 54 tumor/normal samples of 25 cHCC-ICC cases. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  54 
 
  
    EGAD00001005182 
   
  
    
    RNA-seq data of 97 tumor samples from 77 cHCC-ICC cases; data from two other HCC tumor samples are also included. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  99 
 
  
    EGAD00001005183 
   
  
    
    The whole exome sequencing data of 291 tumor/normal samples from 121 cHCC-ICC cases. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  291 
 
  
    EGAD00001005184 
   
  
    
    Long-read WGS of three cancer cell lines 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001005185 
   
  
    
    RNA sequencing on cetuximab treated, untreated and release samples of one metastatic colorectal xenograft.
3 cetuximab treated samples, 3 placebo and 3 release - paired fastq. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001005186 
   
  
    
    WES on cetuximab treated, untreated and release samples of two metastatic colorectal xenografts.
First case: 3 cetuximab treated samples, 3 placebo and 3 release.
Second case: 2 cetuximab treated samples, 2 placebo and 3 release.  
Paired fastq. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001005187 
   
  
    
    RNA was isolated from purified human CD8 cells that were incubated with anti-HER2/CD3 TDB in the presence of SK-BR-3 cells. Sequencing libraries were generated and submitted for transcriptome profiling by high-throughput sequencing.  Experiments were performed in triplicates for anti-HER2/CD3 TDB treatment and control.
This Dataset is associated with the following ArrayExpress Experiment:
E-MTAB-8211 - The effect of anti-HER2/CD3 TDB on transcription in human CD8 T cells (bulk) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001005188 
   
  
    
    Single-cell RNA-seq libraries were generated from human PBMCs that were incubated with anti-HER2/CD3 TDB in the presence of KPL-4 cells.
This dataset is linked with the following ArrayExpress Experiment:
E-MTAB-8212 - The effect of anti-HER2/CD3 TDB on transcription in human PBMCs (single-cell) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001005189 
   
  
    
    Here we provide a catalogue of variants called after sequencing the exomes of 50 Aboriginal individuals from the Northern Territory (NT) of Australia and compare these to 72 previously published exomes from a Western Australian (WA) population of Martu origin. Sequence data for both NT and WA samples were processed using an ‘intersect-then-combine’ (ITC) approach, using GATK and SAMtools to call variants. The data is provided as 2 VCF files, one for the WA population and one for the NT population. 
    
   
  
    
   
  122 
 
  
    EGAD00001005190 
   
  
    
    RNASeq for Genomic Analysis of Mucinous Tumours (GAMuT) 
    
   
  
    
      
      NextSeq 550 
      
    
   
  109 
 
  
    EGAD00001005191 
   
  
    
    To evaluate 3 different tissue dissociation protocols or fresh vs. frozen cell preparations, we performed single-cell RNA sequencing on cancer or distant normal tissue dissociates from 2 colorectal cancer patients. A total of 18,409 cells from 10 sample preparations were analyzed (5 primary colorectal cancer and 5 matched normal mucosa). The results suggest that highly consistent cellular proportions were recovered with different sample preparation methods. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001005192 
   
  
    
    BLUEPRINT EpiVar Whole Genome Sequencing Phase 2 genotypes 
    
   
  
    
   
  197 
 
  
    EGAD00001005193 
   
  
    
    That tobacco smoking causes lung cancer is well-established, but we lack quantitative understanding of its effects on genomes of normal bronchial epithelium. We sequenced whole genomes of 632 colonies derived from single bronchial epithelial cells across 16 subjects. Tobacco smoking is the major influence on mutation burden, adding 1000-10,000+ mutations/cell, massively increasing both between-subject and within-subject variance, and generating several distinct signatures of substitutions and indels. A population of cells in subjects with smoking history had mutation burdens equivalent to that expected for never-smokers: these cells lacked tobacco-specific mutational signatures, were four-fold more frequent in ex-smokers than current smokers, and had significantly longer telomeres than their more mutated counterparts. Driver mutations increased in frequency with age, affecting 4-14% of cells in middle-aged never-smokers. In current smokers, ≥25% of cells carried driver mutations and 0-6% cells had 2 or even 3 drivers. Thus, tobacco smoking increases mutation burden, cell-to-cell heterogeneity and driver mutations, but quitting promotes replenishment of bronchial epithelium from mitotically quiescent cells that have avoided tobacco mutagenesis. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  644 
 
  
    EGAD00001005194 
   
  
    
    15x whole genome sequencing in samples from the isolated population of Orkney. 
This dataset contains all the data available for this study on 2019-07-23. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1360 
 
  
    EGAD00001005195 
   
  
    
    Whole exome sequencing of CD19- relapses in CARPALL study 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  10 
 
  
    EGAD00001005196 
   
  
    
    This data set concerns DNA copy number alterations and mutation data from 30 IBD-associated dysplastic lesions and 13 IBD-associated cancers. DNA was isolated from formalin-fixed, paraffin-embedded material. Whole-genome shallow seq and Truseq amplicon cancer panel (Illumina) were used for detection of DNA copy number alterations and gene mutations, respectively. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  43 
 
  
    EGAD00001005197 
   
  
    
    Improving the understanding of cardiometabolic syndrome pathophysiology and its relationship with thrombosis are ongoing healthcare challenges. Using plasma biomarkers analysis coupled with the transcriptional and epigenetic characterisation of cell types involved in thrombosis, obtained from two extreme phenotype groups (obese and lipodystrophy) and comparing these to lean individuals and blood donors, the present study identifies the molecular mechanisms at play, highlighting patterns of abnormal activation in innate immune phagocytic cells and shows that extreme phenotype groups could be distinguished from lean individuals, and from each other, across all data layers. The characterisation of the same obese group, six months after bariatric surgery shows the loss of the patterns of abnormal activation of innate immune cells previously observed. However, rather than reverting to the gene expression landscape of lean individuals, this occurs via the establishment of novel gene expression landscapes. Netosis and its control mechanisms emerge amongst the pathways that show an improvement after surgical intervention. Taken together, by integrating across data layers, the observed molecular and metabolic differences form a disease signature that is able to discriminate, amongst the blood donors, those individuals with a higher likelihood of having cardiometabolic syndrome, even when not presenting with the classic features. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001005198 
   
  
    
    To understand intrinsic cancer cell signatures, the surrounding microenvironment and their interactions, we performed single-cell RNA sequencing on 63,689 cells from 23 patients with 23 primary colorectal cancer and 10 matched normal mucosa samples. Analysis of primary colorectal cancer and normal mucosa samples shows a comprehensive cellular landscape of colon cancer, which is a valuable resource for the development of therapeutic strategies. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  33 
 
  
    EGAD00001005199 
   
  
    
    BLUEPRINT WP10 Quantitative Trait Loci (QTLs) Phase 2 full summary statistics data include five molecular traits (eQTL, hQTL(H3K27ac), hQTL(H3K4me1), mQTL, and psiQTL) for three primary blood cells (Monocytes, Neutrophils, and T-cells). Each full summary statistics file contains the associations for all tested variants for each phenotype ID. 
    
   
  
    
   
  197 
 
  
    EGAD00001005200 
   
  
    
    BLUEPRINT WP10 Quantitative Trait Loci (QTLs) Phase 2 summary statistics of most significant association data include five molecular traits (eQTL, hQTL(H3K27ac), hQTL(H3K4me1), mQTL, and psiQTL) for three primary blood cells (Monocytes, Neutrophils, and T-cells). Each summary statistics file contains the most significant association for each phenotype ID. 
    
   
  
    
   
  197 
 
  
    EGAD00001005201 
   
  
    
    13 ATAC-Seq datasets of human pancreatic islets from 13 donors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  13 
 
  
    EGAD00001005202 
   
  
    
    6 ChIP-Seq datasets of Mediator in human pancreatic islets from 6 donors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001005203 
   
  
    
    3 ChIP-Seq datasets of Cohesin in human pancreatic islets from 3 donors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001005204 
   
  
    
    14 ChIP-Seq datasets of H3K27ac in human pancreatic islets from 14 donors, where islets were treated in high (11mM) glucose conditions. Sample IDs HI-129, HI-130, HI-131, HI-132, HI-135, HI-137 and HI-152 were also cultured in low glucose conditions. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD00001005205 
   
  
    
    7 H3K27ac ChIP-Seq datasets from 7 donors where islets were treated in low (4mM) glucose conditions. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001005206 
   
  
    
    4 Promoter-Capture Hi-C datasets of human pancreatic islets from 4 human islet donors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001005207 
   
  
    
    7 RNA-Seq datasets in human pancreatic islets from 7 donors, where islets were treated in high (11mM) glucose conditions 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001005208 
   
  
    
    7 RNA-Seq datasets in human pancreatic islets from 7 donors, where islets were treated in low (4mM) glucose conditions. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001005209 
   
  
    
    Input dataset of human islets. To be used in conjunction with Cohesin and Mediator ChIP-Seq datasets. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001005210 
   
  
    
    2 Circular chromosome conformation capture (4C-Seq) datasets of the human beta cell line EndoC-bh1. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001005211 
   
  
    
    This data includes whole exome sequencing of matched normal-tumor samples of patients who have received immunotherapy. '-1' refers to matched normal sample and '-2' refers to matched tumor sample. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  294 
 
  
    EGAD00001005212 
   
  
    
    WGS with linked reads of pediatric glioblastoma. For each patient, blood and tumor tissue were sequenced. For two patients, we also provide sequencing data for the blood of their parents. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  26 
 
  
    EGAD00001005214 
   
  
    
    All normal somatic cells are thought to acquire mutations but understanding of the rates, patterns, causes and consequences of somatic mutation in normal cells is limited. Uterine endometrium adopts multiple physiological states over a lifetime and is lined by a gland-forming epithelium. Whole genome sequencing of normal endometrial glands from women aged 19 to 81 years showed them to be clonal cell populations derived from recent common ancestors, with total mutation burdens that increase with age at ~29 base substitutions/year and which are many-fold lower than endometrial cancers. Normal endometrial glands frequently carry driver mutations in cancer genes. Driver mutation burdens increase with age and correlate negatively with parity. Phylogenetic trees of normal endometrial glands constructed using whole genome sequences indicated that clones with drivers often originate during the first decades and spread to colonise the endometrial epithelial lining. The results show that driver mutation landscapes differ between normal cell types, perhaps shaped by differences in normal tissue physiology, and suggest that the procession of neoplastic changes leading to endometrial cancer is initiated early in life. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001005215 
   
  
    
    The aim of this work is to apply an integrated systems approach to understand the biological underpinnings of large joint (hip and knee) osteoarthritis which culminates in the need for total joint replacement (TJR). We will obtain diseased and non-diseased cartilage as well as other disease-relevant tissue following TJR, coupled with a blood sample. We will generate genotype data and will characterise the pairs of diseased and non-diseased tissue samples in terms of methylation, transcription (RNASeq) and expression (quantitative proteomics).  We will apply integrative approaches to combine information across the omics levels to characterise genes, pathways, and networks that underlie osteoarthritis progression.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-08-01. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  210 
 
  
    EGAD00001005216 
   
  
    
    Gene expression from human iPSC derived motor neurons. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001005217 
   
  
    
    RNA sequencing of 31 pancreatic cancer organoids 
    
   
  
    
      
      NextSeq 500 
      
    
   
  31 
 
  
    EGAD00001005221 
   
  
    
    Dataset contains Exome sequencing data (aligned and base quality score recalibrated BAM files) for 236 tumor samples + matched normal from blood, collected from 21 patients with adult diffuse glioma. The majority of these samples were spatially-mapped during sample collection, enabling the genomic information derived from them to be mapped in 3D space. 
    
   
  
    
   
  236 
 
  
    EGAD00001005222 
   
  
    
    Dataset includes 160 double-stranded RNAseq libraries collected from 16 patients with adult diffuse glioma. The majority of these samples were spatially-mapped during sample collection, enabling the genomic information derived from them to be mapped in 3D space. 
    
   
  
    
   
  160 
 
  
    EGAD00001005223 
   
  
    
    Trichoblastoma (TB): Exome
Merkel cell carcinoma (MCC): Genome and Exome
Healthy tissue (HT): Exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001005224 
   
  
    
    Whole transcriptome, strand-directional RNAseq data for paired primary and recurrent samples from patients with glioblastoma 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  22 
 
  
    EGAD00001005225 
   
  
    
    This dataset contains RNA sequencing data for 20 intra/extra hepatic bile duct organoids. Data is in BAM format and was processed by STAR. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  20 
 
  
    EGAD00001005226 
   
  
    
    The cohort of 15 patients included ten patients with available tissue from the primary tumor and ≥1 metastatic site, four patients with pairs of metastases only, and one patient with an anastomotic recurrence five years after initial resection of the primary tumor 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  36 
 
  
    EGAD00001005227 
   
  
    
    This data includes matched exome data of patients who received immunotherapy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  120 
 
  
    EGAD00001005228 
   
  
    
    This dataset comprises a 2572 sample joint called vcf from the Medical Genome Reference Bank. 
    
   
  
    
   
  2572 
 
  
    EGAD00001005229 
   
  
    
    WGS and WTS data of a single patient diagnosed with HSTCL.
Whole-genome sequencing (WGS) was performed for the tumor-normal sample. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 4000 2x151 bp read length.
Whole-transcriptome sequencing (WTS): Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on Illumina HiSeq 2500 with 2x101 bp read length. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001005230 
   
  
    
    Whole-transcriptome sequencing (WTS) of 36 samples from patients diagnosed with NKTL. Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on HiSeq 2500 and HiSeq 3000 (Illumina) with 2x101 bp and 2x151 bp read length, respectively. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001005231 
   
  
    
    Whole-genome sequencing (WGS) was performed for 50 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000 or HiSeq X Ten as 2x101 bp or 2x151 bp, respectively. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  120 
 
  
    EGAD00001005232 
   
  
    
    Whole genome sequencing of immune cells from patients diagnosed with psoriatic arthritis. 
This dataset contains all the data available for this study on 2019-08-07. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  8 
 
  
    EGAD00001005233 
   
  
    
    This study is a benchmarking exercise to explore potential sources of variation between different CRISPR drop out libraries. 
This dataset contains all the data available for this study on 2019-08-07. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  37 
 
  
    EGAD00001005234 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. 
Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will be subsequently extracted by non-negative matrix factorisation methods and correlated with exposure data. 
Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
This dataset contains all the data available for this study on 2019-08-07. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001005235 
   
  
    
    DNA and RNA isolated from FFPE blocks of patients who received immune checkpoint inhibition (ICI) are sequenced with the aim to identify therapeutically tractable vulnerabilities in tumors to prevent ICI resistance. 
This dataset contains all the data available for this study on 2019-08-07. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  141 
 
  
    EGAD00001005236 
   
  
    
    Most adults with intellectual disabilities (ID) do not undergo genetic diagnostic investigation as part of their clinical care and have 'missed the boat' with regard to the WES and WGS genetic testing that is now being provided for children with ID. There is a dramatically increased risk of psychiatric disorders in adults with ID, e.g. the risk of psychoses is 10X higher than in the general population. It remains an open question as to how much of adult ID is genetic in origin and how similar the genetic forms of adult ID are to those being diagnosed in children, in part due to survivor bias. There is also the opportunity to identify adults with treatable forms of ID, of which over 80 have been described, thus improving their clinical management. Furthermore, analysis of medical records of adults with genetic forms of ID can help to characterise the 'natural history' of individual disorders, resulting in more accurate prognoses for diagnosed children and identifying opportunities for improved management and possibly therapeutic intervention (e.g. optimal anti-epileptic therapy).
Here we propose to exome sequence (to ~50X coverage) 200 adults with ID and co-morbid psychiatric disorders. This cohort has previously been assayed with chromosomal microarrays (Wolfe et al 2017 EJHG, 25, 66-72) identifying a diagnostic yield of ~11% which is comparable to the CNV diagnostic yield in various child ID cohorts (10-15%). The authors observed no substantive biases in diagnostic yield between different psychiatric diagnostic classes. The WES data will be analysed using the diagnostic workflows developed in the DDD study to ensure comparability between child and adult ID datasets. This study is intended as a pilot study to demonstrate the value of WES in adults with ID.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-08-07. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  200 
 
  
    EGAD00001005237 
   
  
    
    Whole exome sequencing and RNAseq data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  827 
 
  
    EGAD00001005238 
   
  
    
    RNA-seq of primary and metastatic sites from highly clinically annotated HGSC samples. Samples were obtained pre-treatment based on a laparoscopic triage algorithm from patients who underwent R0 tumor debulking, or received neoadjuvant chemotherapy (NACT) with excellent (ER) or poor response (PR). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  74 
 
  
    EGAD00001005239 
   
  
    
    T200 cancer panel sequencing on primary and metastatic sites from highly clinically annotated HGSC samples. Samples were obtained pre-treatment based on a laparoscopic triage algorithm from patients who underwent R0 tumor debulking, or received neoadjuvant chemotherapy (NACT) with excellent (ER) or poor response (PR). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  75 
 
  
    EGAD00001005240 
   
  
    
    The diversity and heterogeneity within high-grade serous ovarian cancer (HGSC) is not well understood. We performed whole genome sequencing on primary and metastatic sites from highly clinically annotated HGSC samples. Samples were obtained pre-treatment based on a laparoscopic triage algorithm from patients who underwent R0 tumor debulking, or received neoadjuvant chemotherapy (NACT) with excellent or poor response. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  103 
 
  
    EGAD00001005246 
   
  
    
    Whole genome sequencing of infant high grade gliomas. BAM files of paired end reads aligned to GRCh37 with bwa 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001005247 
   
  
    
    Whole exome sequencing of infant high grade gliomas. BAM files of paired end reads aligned to GRCh37 with bwa 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001005248 
   
  
    
    Targeted sequencing of infant high grade gliomas. BAM files of paired end reads aligned to GRCh37 with bwa. This targeted panel covers the exons of 435 genes commonly mutated in high grade gliomas in children. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  16 
 
  
    EGAD00001005249 
   
  
    
    Exome and RNA sequencing data for EGAS00001003776 - one female patient with neurofibroma/schwannoma hybrid nerve sheath tumor (N/S HNST) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001005250 
   
  
    
    This dataset includes bam files of WES of three fibroblast samples derived from patients with aplastic anemia. 
    
   
  
    
   
  3 
 
  
    EGAD00001005251 
   
  
    
    Retinoblastoma (RB), the commonest eye cancer in children was the first cancer for which a genetic cause was identified: the Rb1 gene is a tumour suppressor gene that is mutated in RB. 
The Rb1 gene defect alone does not predict the clinical outcome. We propose to study other possible mechanisms:
1. Stepwise further mutations occur in RB, increasing its carcinogenesis. We will sequence the whole genome in RB tissue, and relate the different genes expressed to the treatments used.
2. Extracellular matrix proteins contribute to a tumour permissive environment for RB to continue to grow. This includes Small Leucine Rich Proteoglycans (SLRP), a family of 15 secreted extracellular matrix proteins involved in eye development.
3. Cancer stem cells (CSC), a subpopulation of treatment resistant cells, drive RB tumours, and whether these stem cells can be manipulated for new therapies.
The aim of this study is to assist finding targeted diagnostic techniques and treatments for RB. . 
This dataset contains all the data available for this study on 2019-08-14. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  47 
 
  
    EGAD00001005252 
   
  
    
    Immortalised HaCaT keratinocytes were transduced with Cas9 and the CRISPR-KO v1.1 genome-wide gRNA library. The gRNA library was prepared from genomic DNA isolated 14 days post library transduction. gRNA representation will be compared to the original CRISPR-KO v1.1 library to reveal genes essential for HaCaT survival and growth. . 
This dataset contains all the data available for this study on 2019-08-14. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  19 
 
  
    EGAD00001005253 
   
  
    
    This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-08-14. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1574 
 
  
    EGAD00001005254 
   
  
    
    Whole genome sequences at 15X depth of patients with Inflammatory Bowel Disease.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-08-14. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2546 
 
  
    EGAD00001005255 
   
  
    
    In this study a collection of core biopsies from breast cancer patients receiving neoadjuvant chemotherapy as part of the ChemoNEAR trial will be investigated. The study is intended to detect early acquired resistance in women with diagnosed breast carcinoma. Samples will undergo whole genome sequencing and analysis, including use of HR Predict. . 
This dataset contains all the data available for this study on 2019-08-14. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001005261 
   
  
    
    CB: Aligned sequences used in the error modeling study 
    
   
  
    
   
  10 
 
  
    EGAD00001005262 
   
  
    
    This dataset contains whole genome sequencing data on 25 individuals with myasthenia gravis. The data was generated using Illumina sequencing technology and is presented as BAM files for each sample. 
    
   
  
    
   
  25 
 
  
    EGAD00001005263 
   
  
    
    Primary T cell immunodeficiency disorders have a heterogeneous genetic basis.  This study will focus on one case characterised by severe T cell lymphopenia in the index case.  We aim to sequence the complete exomes of this individual, her three unaffected siblings and parents in an effort to identify the causative genetic mutation responsible for this disorder.  We will perform exome capture using Agilent SureSelect system, followed by sequencing on the HiSeq platform.  Our study has the potential to uncover genes important for T cell development and novel therapeutic strategies to treat T cell immunodeficiencies. . 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005264 
   
  
    
    The objective of this study is to identify the causative genes in two unrelated congenital neutropenia families. We aim to whole exome sequence the affected individuals, unaffected siblings and parents in both cases in an effort to identify the causative genetic mutation. Exome capture will be performed using Agilent SureSelect system. Subsequently, exome libraries will be sequenced using the Illumina HiSeq platform. Sequence variant calling will be done in house and common variants excluded using public databases and data from unaffected family members. 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001005265 
   
  
    
    We plan to sequence the exomes of 4 AML cases (tumour and germline) in an effort to discover new mutations in this disease that could improve our understanding of leukaemogenesis and guide the development of new targeted therapies. The Sanger Institute will sequence the exomes of 4 Acute Myeloid Leukaemia cases including tumour and germline DNA so that somatically-acquired, AML-specific mutations can be accurately designated.
 . 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001005266 
   
  
    
    Deep whole genome sequencing of samples from the Cilento isolates. The samples are sequenced using the Illumina HiSeq X Ten system. 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001005267 
   
  
    
    Whole genome sequencing of samples from the NSPHS cohort. The samples are sequenced using the Illumina HiSeq X Ten system. 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001005268 
   
  
    
    Whole genome sequencing of samples from an isolated population from the Val Borbera valley in Italy. The samples are sequenced using the Illumina HiSeq X Ten system. 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001005269 
   
  
    
    Deep whole genome sequencing of samples from the Orkney Complex Disease Study (ORCADES), each with data on up to 300 quantitative traits and other risk factors associated with cardiovascular, metabolic and other complex diseases. The samples are sequenced using the Illumina HiSeq X Ten system. 
This dataset contains all the data available for this study on 2019-08-19. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001005270 
   
  
    
    AML-MRD: Aligned sequences used in the error modeling study 
    
   
  
    
   
  96 
 
  
    EGAD00001005271 
   
  
    
    This dataset contains DNA methylation data from PBMCs of adolescents with type 2 diabetes and controls. There are 21 diabetic samples and 10 controls. This dataset also contains metabolic data obtained from the serum of 155 samples. There are 113 diabetic and 42 control samples. 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  21 
 
  
    EGAD00001005272 
   
  
    
    This dataset consists of 6 BAM files. These are whole exome sequencing data of pediatric patients with myelodysplastic syndrome. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6 
 
  
    EGAD00001005273 
   
  
    
    50 lymphoblastoid cell lines of adult female Twins which are part of the MuTHER study, processed with the FAIRE assay in order to generate maps of open chromatin.  . 
This dataset contains all the data available for this study on 2019-08-21. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  78 
 
  
    EGAD00001005274 
   
  
    
    50 lymphoblastoid cell lines of adult female Twins which are part of the MuTHER study, subjected to  Chromatin Immunoprecipitation  using an antibody for H3K4me1. H3K4me1 peaks mark distal regulatory  elements. . 
This dataset contains all the data available for this study on 2019-08-21. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  174 
 
  
    EGAD00001005275 
   
  
    
    The objective is to identify new disease genes involved in calcific aortic valve stenosis (CAVS) by screening newly identified candidate genes. The recruitment of patients with CAVS has been achieved by l’institut du thorax (Nantes, France). DNA from the selected patients has been analysed by targeted capture (Agilent SureSelect) and massively parallel sequencing (Illumina).  . 
This dataset contains all the data available for this study on 2019-08-21. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  485 
 
  
    EGAD00001005276 
   
  
    
    These samples include exome sequences of samples from patients who suffered Sudden Unexplained Death in Epilepsy. They all are of European descent. . 
This dataset contains all the data available for this study on 2019-08-21. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001005277 
   
  
    
    This project is a pilot study, in collaboration with Maria Grazia Spillantini and Mariangela Iovino (Cambridge Centre for Brain Repair), to investigate the utility of IPS-derived neurons for the study of neurodegenerative disorders. Our aim is to characterise the transcriptional consequences of tauopathies using neurons derived from differentiated IPSCs as a model system. We will use IPS cells derived from six individuals, four with known mutations in the tau protein, 2 without. RNA will be extracted at Day 0 and Day 65 of differentiation by which time the neuronal tauopathy is apparent. RNA will be extracted and the transcriptome of each line characterised using RNAseq. We will then search for genes that are differentially expressed between the transcriptomes of individuals with tau mutations versus those in controls. My lab will analyse the RNAseq data, comparing both affected and controls and both time-points, to establish candidate genes. Darren Logan’s lab, along with our collaborators, will experimentally verify and further investigate these genes in additional lines and animal models.
 From this analysis we will generate a list of candidate genes that are differentially expressed between cases and controls. This study will not only help us understand the molecular basis of tauopathies, but also identify gene candidates for biomarkers of neurodegenerative disease. It will serve as a proof of principle for future planned studies into generating and transcriptomically analysing an allelic series of Tau mutations in IPSCs with a controlled genetic background.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
 . 
This dataset contains all the data available for this study on 2019-08-21. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001005278 
   
  
    
    We have collected material from a patient who had BrafV600E mutant melanoma that was
treated with PLX4032. We have germline DNA from the patient and DNA and RNA from
distinct lesions before and after treatment with PLX4032. We would like to exome sequence
these samples to gain a snap shot of the mechanisms of resistance that are operative.
 . 
This dataset contains all the data available for this study on 2019-08-21. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001005279 
   
  
    
    High-grade serous ovarian cancer (HGSOC) likely originates from the fallopian tube (FT) epithelium, but advanced stages are mostly found outside the FT. We used ex-vivo cultures of HGSOC  and knock-out of tumor suppressors in FT organoids to study changes in epithelial cells and niche requirements for normal and transformed FT cells. We found that transformed cells require BMP signaling and are growth arrested in Wnt rich  medium.
A SureSelectXT Automation Custom Capture Library (Agilent) target enrichment panel was designed. The enrichment panel comprised all coding exons of 121 genes associated with ovarian cancer. Capture was performed according to the manufacturer’s instructions using an NGS Workstation Option B (Agilent) for automated library preparation starting with 3 μg DNA per sample. Sequencing was performed on an Illumina HiSeq 2500 system generating 2x100bp paired end reads and a target coverage of >200 per sample. Sequence reads were mapped to the haploid human reference genome (hg19) using BWA. Variants were called with FreeBayes v1.1. 
    
   
  
    
   
  20 
 
  
    EGAD00001005280 
   
  
    
    Epigenomic and transcriptomic analysis of Langerhans cell histiocytosis (LCH) biopsies.
Single-cell RNA-seq (10x Genomics) of seven LCH biopsies. ATAC-seq data from four sorted cell populations (in duplicate) from one LCH biopsy. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  13 
 
  
    EGAD00001005281 
   
  
    
    RNA-seq sequencing data of human germinal centre B-cells (42 samples) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  42 
 
  
    EGAD00001005282 
   
  
    
    This dataset includes bam files of tumor and paired normal samples derived from 15 patients with myelofibrosis. Tumor samples include those before and after treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD00001005283 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001005284 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  22 
 
  
    EGAD00001005285 
   
  
    
    Identifying high-risk smoldering myeloma patients and progression mechanisms is a prerequisite to implementing effective inception strategies and curbing myeloma-related morbidity and mortality. We hypothesize that genomics may help identify determinants of progression that may help predict outcome and offer effective chemopreventive targets.
Eighty-two patients underwent a custom targeted panel sequencing with an additional capture for the translocation loci. These results were compared to 223 newly diagnosed patients (EGAS00001003223 and EGAD00001004117) and 17 MGUS and 10 early myeloma patients. DNA was obtained from either CD138+ cells from the bone marrow of multiple myeloma patients (tumor) or from stem cell harvests or peripheral blood cells from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using the HyperPlus kit (KAPA Biosystems). After PCR amplification the libraries were hybridized with probes against a targeted panel consisting of 140 genes and chromosomal regions (Nimblegen) using SeqCap reagents (Nimblegen). Hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads.
Overall, these data highlight the importance of dysregulation of the MAPK pathway in the progression to MM. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  241 
 
  
    EGAD00001005286 
   
  
    
    We developed a new bioinformatics method for detecting eccDNA in plasma. We found that the biological properties of eccDNA and linear DNA differ. eccDNA could potentially serve as a new class of circulating biomarker. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001005287 
   
  
    
    Deep WGS (germline and 2-5 tumour regions) was performed on 20 patients: 13 lung adenocarcinoma, 5 squamous cell carcinoma, 2 small-cell lung cancers. A total of 68 BAM files are provided, where tumours were sequenced to 60x or 150x depth. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  68 
 
  
    EGAD00001005288 
   
  
    
    Exome sequencing of 87 Fibromyalgia patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  87 
 
  
    EGAD00001005289 
   
  
    
    Exome sequencing data captured with Agilent v4 (71k) and sequenced on Illumina technology. 
Data from a total of 40 samples, of which 14 are vitiligo cases with a familial history of vitiligo or immune disease, and the remaining are alopecia areata cases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  19 
 
  
    EGAD00001005290 
   
  
    
    Cytokines affect T cell responses by polarising them to different phenotypes. We isolated T cells from healthy platelet donors and cultured them in resting and stimulated conditions, as well as in the presence of Th2, iTreg and Th17 polarizing cocktails. To characterize the efficacy of cytokine-induced polarization and subpopulation-specific responses, we profiled single-cell transcriptomes five days following polarization using the 10x platform (3' v2). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001005291 
   
  
    
    Cytokines affect T cell responses by polarising them to different phenotypes. We isolated T cells from healthy platelet donors and cultured them in resting and stimulated conditions, as well as in the presence of Th1, Th2, Th17 and iTreg cocktail. In addition, T cells were stimulated in the presence of IL-10, IL-21, IL-27, IFNb and TNFa. We performed bulk RNA sequencing to assess impact of different disease-relevant cytokines upon T cell response. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  141 
 
  
    EGAD00001005296 
   
  
    
    A genome-wide CRISPR screen was performed to find resistance to targeted drugs for melanoma and lung. 
This dataset contains all the data available for this study on 2019-08-28. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  237 
 
  
    EGAD00001005297 
   
  
    
    A targeted gene screen of 365 known cancer genes in luminal breast cancer samples pre-chemotherapy and at resection post-chemotherapy to evaluate clonal expansion of chemotherapy cancer cells. 
This dataset contains all the data available for this study on 2019-08-28. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  133 
 
  
    EGAD00001005298 
   
  
    
    This project aims to evaluate the transcriptional response to disease measured in whole blood of participants who developed enteric fever after challenge and, importantly, those who were challenged but stayed well throughout the challenge period. This data will provide unique coverage of the transcriptome and will yield invaluable insight after integration with a wealth of clinical data collected during this trial. 
     
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-08-28. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  195 
 
  
    EGAD00001005299 
   
  
    
    This study involves exome sequencing of blood/bone marrow DNA from patients with myeloid malignancies. Blood DNA samples have been taken from patients at different timepoints of disease phenotype. We hope to elucidate mechanisms of clonal evolution in these patients. 
This dataset contains all the data available for this study on 2019-08-28. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  46 
 
  
    EGAD00001005300 
   
  
    
    Study to stimulate WT and IL-10RB mutant macrophages with LPS in presence or absence of recombinant IL-10 and compare their gene expression profiles by RNASeq
These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-08-28. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001005302 
   
  
    
    Metadata summarizes participants (n=198), samples (n=396), basic clinical information, and analysis.
analysis1: raw sequencing reference alignment files (bam/bai)
analysis2: error-corrected sequencing reference alignment files (bam/bai)
analysis3: variant calling using error-corrected sequencing reference alignment (vcf) 
    
   
  
    
   
  396 
 
  
    EGAD00001005303 
   
  
    
    211 NKTL FFPE specimens were screened for somatic mutations using deep targeted capture sequencing. FFPE rolls or slides were extracted using QIAamp DNA FFPE Tissue kit (QIAGEN). The FFPE genomic DNA was treated with NEBNext FFPE DNA Repair Mix and assessed by Quant-it PicoGreen dsDNA Assay Kit (Invitrogen). The library was generated from 10-200 ng DNA with SureSelectXT Low Input Target Enrichment System for Illumina Paired-End Sequencing Library (Agilent Technologies) according to manufacturer’s instructions. RNA based probe was designed with SureDesign (Agilent Technologies) to target-capture 140 genes. Next, the captured libraries were pooled in equimolar concentration and sequenced on Illumina Novaseq 6000 platform with SP or S1 chip. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  214 
 
  
    EGAD00001005305 
   
  
    
    This dataset contains genomic and transcriptomic profiling of skin samples (74) from patients with CYLD cutaneous syndrome 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  69 
 
  
    EGAD00001005306 
   
  
    
    Primary plasma cell leukemia (pPCL) samples were sequenced using the Nimblegen MedExome Plus hybridization capture to detect translocations, copy number changes, and mutations in 3 pPCL samples and patient matched controls. Sequencing was performed on a NextSeq500 using 75 bp paired end reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001005307 
   
  
    
    This data set includes 72 mate pair sequenced osteosarcomas (36 as part of a discovery cohort and 36 as part of a validation cohort). It also includes RNA-sequencing data on 67 osteosarcomas (mostly overlapping with the above mate pair sequenced cases) and 13 osteoblastomas used as controls for gene expression levels. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  101 
 
  
    EGAD00001005308 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  112 
 
  
    EGAD00001005310 
   
  
    
    Basic phenotypic data (country, ethnicity and sex)  for 348 samples of the H3Africa Chip Design Study. Divided into 8 datasets of 41 samples from Zambia, 24 samples from Cameroon, 50 samples from Mali, 26 samples from Cameroon, 49 samples from Nigeria, 48 samples from Botswana, 50 samples from Benin, 60 samples from Burkina Faso and Ghana. 
    
   
  
    
   
  348 
 
  
    EGAD00001005311 
   
  
    
    This study entails whole genome sequencing of an interleukin (IL)-12 b-1 receptor-deficient individual who presented with a chronic systemic Salmonella Enteritidis infection that did not resolve with standard IFNg and antibiotic treatment. Whole genome sequencing of the patient's parents is also included.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2019-09-05. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001005312 
   
  
    
    This study investigates the genomic and transcriptomic characteristics of Wilms tumour organoids. 
This dataset contains all the data available for this study on 2019-09-05. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001005313 
   
  
    
    Swift kit whole genome bisulphite sequencing of MPN colonies. 
This dataset contains all the data available for this study on 2019-09-05. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  16 
 
  
    EGAD00001005314 
   
  
    
    Single-nucleus ATAC-seq data from GBM tumor samples. A NovaSeq 6000 was used for ATAC-seq. The uploaded files are BAM files created with the GRCh38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001005315 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001005316 
   
  
    
    Next generation RNA-Sequencing (RNA-seq) is a flexible approach that can be applied to e.g. global quantification of transcript expression, the characterization of RNA structure such as splicing patterns and profiling of expressed mutations. Many RNA-seq protocols require up to microgram levels of total RNA input amounts to generate high quality data, and thus remain impractical for the limited starting material amounts typically obtained from rare cell populations, such as those from early developmental stages or from laser micro-dissected clinical samples. Here, we present an assessment of the contemporary ribosomal RNA depletion-based protocols, and identify those that are suitable for inputs as low as 1-10 ng of intact total RNA and 100-500 ng of partially degraded RNA from formalin-fixed paraffin-embedded tissues. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  3 
 
  
    EGAD00001005317 
   
  
    
    Patient-derived lung cancer organoids cram files : targeted seq 13 samples, whole exome seq 12 samples
mutation profiles of PDO and matched tissue : aggregated vcf 1 file
details : https://www.nature.com/articles/s41467-019-11867-6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  44 
 
  
    EGAD00001005318 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  51 
 
  
    EGAD00001005319 
   
  
    
   
  
    
      
      BGISEQ-500 
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  59 
 
  
    EGAD00001005320 
   
  
    
    This dataset includes "clinical exome" profiling (approximately 4000 genes related to diseases) on individuals (n=7) from a family with a familial history of Alzheimer's disease. Two affected cases and five cases without dementia are included. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  7 
 
  
    EGAD00001005321 
   
  
    
    The dataset includes Fastq files from WES experiments performed on a proband presenting with syndromic optic atrophy and his healthy parents. Exons were captured by hybridization and sequenced on an Illumina platform 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001005322 
   
  
    
    DNA extracted from sorted CD19+ tumor cells (18 samples - 16 patients) was used for exome capture with the SureSelect All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001005323 
   
  
    
    DNA extracted from sorted CD3+ cells (16 patients) was used for exome capture with the SureSelect All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001005324 
   
  
    
    RNA was extracted from flow-sorted CD19+ cells. RNA-Seq was performed on 32 samples of 30 patients (2 replicates per sample). RNA-Seq libraries were subjected to non-stranded paired-end (2 x 75 bp) sequencing on HiSeq 2500 (Illumina). The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  64 
 
  
    EGAD00001005331 
   
  
    
    A total of 180 gynecologic tumor specimens were subjected to targeted-exome and/or whole-transcriptome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  180 
 
  
    EGAD00001005335 
   
  
    
    August 2019 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001005337 
   
  
    
    Genomic data obtained from the joint processing and variant calling of 4,810 individuals from Singapore. VCF files are by Chromosome (chr. 1-22 plus X) for all 4,810 individuals. 
Self-reported ethnicity is found in the "Region" column of metadata file. 
    
   
  
    
   
  4810 
 
  
    EGAD00001005338 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96146A 1195 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  3 
 
  
    EGAD00001005339 
   
  
    
    The dataset for Genome-wide cell-free DNA fragmentation in patients with cancer includes 538 bam files from whole genome next-generation sequencing on the Illumina HiSeq2500.  The samples analyzed include plasma samples from healthy individuals and patients with cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  537 
 
  
    EGAD00001005340 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96172B 1694 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  3 
 
  
    EGAD00001005341 
   
  
    
    WGS from 4 patients, WTS from only 3 patients (insufficient tissue from 4th patient for WTS).
Whole-genome sequencing (WGS) was performed for 60 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000.
Whole-transcriptome sequencing (WTS): Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on Illumina HiSeq 2500 with 2x101 bp read length.
Description of prefix used in filenames:
T: Tumor samples
N: Normal samples (Blood)
P: PDX samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001005343 
   
  
    
    Random whole-genome shotgun sequencing of cfDNA in control samples (NPH*) and late-stage cancer samples. The first letter denotes the primary cancer tissue (C: Colon, B: Breast, P: Prostate). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  41 
 
  
    EGAD00001005344 
   
  
    
    The dataset includes the BAM files from WES experiments performed on a proband presenting with syndromic optic atrophy and his healthy parents - Family 2 in our study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001005345 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96226B 1274 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  4 
 
  
    EGAD00001005346 
   
  
    
    A family trio from Uganda (Baganda ethno-linguistic group) has been sequenced to high depth (ca. 30x) on the Illumina HiSeq 2500 platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001005347 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96193B 2410 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  3 
 
  
    EGAD00001005348 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96199A 843 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  3 
 
  
    EGAD00001005351 
   
  
    
    Analysis of mutational signatures caused by exposure to known mutagens in human induced pluripotent stem (iPS) cells. A reference human iPS cell-line will be exposed to 100 chemicals known or proposed to be mutagenic. Following exposure to mutagen, cells will undergo a period of recovery before sub clones are generated and sequenced. The progenitor "parental" IPS cell-line will be used to generate reference sequence data, in order to determine the mutational signatures acquired as a result of exposure to different mutagens. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001005353 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96199B 1170 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  3 
 
  
    EGAD00001005354 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96211C 1397 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  3 
 
  
    EGAD00001005355 
   
  
    
    Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication states, and clonal repertoires for library A96225C 1034 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  2 
 
  
    EGAD00001005356 
   
  
    
    We used 200 ccRCC samples from 51 tumors to simultaneously isolate DNA, RNA, and protein according to an established protocol. RNA quality was assessed using an Agilent Bioanalyzer, and total RNA with RIN>7 was used for further RNA sequencing. 184 ccRCC samples from 49 tumors passing initial quality control underwent RNA sequencing at Admera Health Inc. (Genohub Inc., Austin, TX). RNA sequencing libraries were prepared using the Illumina TruSeq Stranded mRNA high throughput (HT) sample preparation kit following the manufacturer’s protocol. Paired-end RNA-seq data were deposited for this cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  173 
 
  
    EGAD00001005357 
   
  
    
    Whole exome sequencing data from a series of 5 patient derived organoids (PDOs) established from metastatic colorectal cancers (CRCs). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001005358 
   
  
    
    All sequencing was performed at DNAlink (Korea) using Solexa sequencing technology (Illumina, San Diego, CA). mRNA was isolated from total RNA using poly-T oligo-attached magnetic beads and was fragmented with fragmentation buffer to an average size of 300 bp. The libraries were prepared using the TruSeq RNA Library Prep Kit v2 (Illumina) and were sequenced on the Illumina HiSeq2000 using the manufacturer’s recommended protocols. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  220 
 
  
    EGAD00001005359 
   
  
    
    This is the 2nd part of the data for the original Control iPSC lines with clinically annotated genetic variants for versatile multi-lineage differentiation. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  1 
 
  
    EGAD00001005361 
   
  
    
    231 HCC exome sequencing with Sureselect 50Mb 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  452 
 
  
    EGAD00001005362 
   
  
    
    Paired single-cell sequencing dataset of T-cell receptors from IELs, from both treated and untreated celiac patients and from controls. (Amplicon sequencing, paired-end fastq files). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  20 
 
  
    EGAD00001005363 
   
  
    
    Tumor biopsies from LAM disease were retrospectively analyzed by multiple techniques to characterize the alterations in patients and to elucidate the landscape of genetic/genomic alterations. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  61 
 
  
    EGAD00001005364 
   
  
    
    Exome sequencing data of two siblings with a neurodegenerative phenotype due to SMVT deficiency. Exonic sequences were enriched using the SeqCap EZ Human Exome Library v3.0 kit (Roche NimbleGen) and libraries were sequenced as 100 bp paired-end reads on the HiSeq 2000 platform (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001005365 
   
  
    
    Single-cell sequencing of human pancreatic cells on 10X 5' platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001005366 
   
  
    
    Whole-genome sequencing (WGS) was performed for 13 pairs of tumor-normal and 5 tumor-only samples from patients diagnosed with angiosarcoma. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq X Ten as 2x151 bp. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  31 
 
  
    EGAD00001005367 
   
  
    
    Whole-transcriptome sequencing (WTS) of 6 tumor-normal and 6 tumor-only samples from patients diagnosed with angiosarcoma. Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on HiSeq 2500 and HiSeq 3000 (Illumina) with 2x101 bp and 2x151 bp read length, respectively. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001005368 
   
  
    
    Single-cell RNA sequencing for 5 low-grade glioma samples. A NovaSeq 6000 was used for scRNA-seq. The uploaded files are BAM files created with the GRCh38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001005369 
   
  
    
    Single-cell RNA sequencing for 4 high-grade glioma samples. A NovaSeq 6000 was used for snRNA-seq. The uploaded files are BAM files created with the GRCh38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001005370 
   
  
    
    WES from two human osteosarcoma with two samples each from the corresponding cell line, BAM files 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001005371 
   
  
    
    The biology of cell-free DNA fragmentation and the roles of DNASE1, DNASE1L3 and DFFB 
    
   
  
    
      
      NextSeq 500 
      
    
   
  40 
 
  
    EGAD00001005372 
   
  
    
    12 tissues from the warm autopsy are selected for this project. Using 10X Chromium technology, we will generate ~1000 single-cell/nuclei genomic libraries per tissue. Each tissue will be whole genome sequenced (~2 lanes per 1000 cells) on the HiSeq X Ten. For each single cell, we will generate a CNV profile, and we will investigate the level of genomic heterogeneity within tissues and across different tissues. 
This dataset contains all the data available for this study on 2019-10-02. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001005373 
   
  
    
    In this study, we performed systematic comparative analysis of seven widely-used SNV-calling methods, including SAMtools, the GATK Best Practices pipeline, CTAT, FreeBayes, MuTect2, Strelka2 and VarScan2, on both simulated and real single-cell RNA-seq datasets.
We generated SMART-seq2 data for 70 CD45- single cells, which were derived from two colorectal cancer patients (P0411 and P0413). The average sequencing depths of these cells were 1.4 million reads per cell. We also generated tumor and adjacent normal bulk WES data, as well as tumor bulk RNA-seq data for these patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  75 
 
  
    EGAD00001005374 
   
  
    
    ChIP-seq for AR, FOXA1 and HOXB13 on 8 prostatectomy samples, covering regions both with and without tumor cells. FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  50 
 
  
    EGAD00001005375 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  73 
 
  
    EGAD00001005376 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001005377 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001005378 
   
  
    
    30x whole genome sequencing of samples from the VIKING Health Study - Shetland. 500 DNA samples were sequenced using the Illumina HiSeq X system. FASTQ files are deposited 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  500 
 
  
    EGAD00001005379 
   
  
    
    This study is the first to interrogate the whole mtDNA in BP patients and controls and to implicate multiple novel mtDNA variants in disease susceptibility. Whole mtDNA of German BP patients (n=180) and age- and sex-matched healthy controls (n=188) were sequenced using next generation sequencing (NGS) technology, followed by the replication study using Sanger sequencing of an additional independent BP (n=89) and control cohort (n=104). While the BP and control groups showed comparable mitochondrial haplogroup distributions, the haplogroup T exhibited a tendency of higher frequency in BP patients suffering from neurodegenerative diseases (ND) compared to BP patients without ND (p= 0.1448, Fisher’s exact test) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  368 
 
  
    EGAD00001005380 
   
  
    
    Contains RNAseq data for 14 transduced/non-transduced organoids 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001005381 
   
  
    
    Woodcock et al TenMenDeep EGA Dataset A. These are Illumina based deep sequencing data based on bait capture sequencing. See Woodcock et al methods for more detail. Note: the Amplicon sequencing data type is selected because the EGA Website currently has no option to select Bait Capture Sequencing or similar. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  117 
 
  
    EGAD00001005382 
   
  
    
    Woodcock et al TenMenDeep EGA Dataset B. These are Illumina based deep sequencing data based on bait capture sequencing. See Woodcock et al methods for more detail. Note: the Amplicon sequencing data type is selected because the EGA Website currently has no option to select Bait Capture Sequencing or similar. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  33 
 
  
    EGAD00001005383 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  20 
 
  
    EGAD00001005384 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001005385 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  64 
 
  
    EGAD00001005386 
   
  
    
    113 DNA samples were derived from the tumors of the low grade glioma patients and sequenced using Illumina WES (exome seq) paired-end technology.
Dataset contains 113 BAM files aligned to hg19 using BWA v.0.5.9. After mapping, duplicate reads were removed, reads were realigned around indels, and read base quality scores were recalibrated. 
    
   
  
    
   
  - 
 
  
    EGAD00001005387 
   
  
    
    44 DNA samples were derived from the tumors of the low grade glioma patients and sequenced using Illumina RNAseq paired-end technology.
Dataset contains 44 BAM files aligned to hg19 using Tophat 2. 
    
   
  
    
   
  - 
 
  
    EGAD00001005388 
   
  
    
    Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al.
RNAseq (BAM files)
241 tumour samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005389 
   
  
    
    Whole genome sequencing of 35 osteosarcoma patients (primary, relapsed, and metastatic) with matched normals. Tumors were sequenced at target 60X and matched normals at target 30X. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  72 
 
  
    EGAD00001005390 
   
  
    
    Single Cell-RNA Seq IDHR132H Wild-type Primary GBM Female, 76. 
Single-cell RNA-seq from a high-grade primary glioma sample. A NovaSeq 6000 was used for RNA-seq. The uploaded files are BAM files created with the GRCh38 reference through cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005391 
   
  
    
    SF11977 single cell RNA-seq IDHR132H Wild-type GBM Female, 61
Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005392 
   
  
    
    SF11956 IDHR132H WT GBM. Male, 63.
Single Cell RNA seq from high grade glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005393 
   
  
    
    SF11644 Primary GBM Gender Male age 57. Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005394 
   
  
    
    Single cell RNA-Seq Primary diffuse astrocytoma G2. IDH mutant, ATRX mutant. Gender Male Age 34. Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005395 
   
  
    
    Single cell Primary astrocytoma G2. IDH mutant, ATRX negative. Male, 44.
Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005396 
   
  
    
    Single cell RNA-Seq Low Grade Astrocytoma IDHR132H mutant. Male, 64.
Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005397 
   
  
    
    SF11949 Primary oligodendroglioma G3 IDH1 Mutant. Male, 40
Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005398 
   
  
    
    IDH1 mutant oligodendroglioma male, 40. 
Single Nuclei ATAC seq data from low grade human glioma sample. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005399 
   
  
    
    Single Nuclei RNA-Seq Primary High-grade Glioma. Gender Female Age 51. 
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005401 
   
  
    
    Single Nuclei RNA-Seq Primary High-grade Glioma. Gender Male Age 73. 
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005402 
   
  
    
    Single Nuclei RNA-Seq Primary High-grade Glioma. Gender Female age 40. 
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005403 
   
  
    
    Single Nuclei RNA Seq of primary GBM. Gender Female Age 44.
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005405 
   
  
    
    IDH1 mutant GBM 55, Male.
Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005406 
   
  
    
    IDHR132H Wildtype GBM. Male, 63
Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005407 
   
  
    
    High grade glioma sample, Gender Male Age 46. 
Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005408 
   
  
    
    SF11331 Primary GBM Male, 55
Single Nuclei ATAC seq data from high grade human glioma sample. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005409 
   
  
    
    SF10022 single nuclei RNA-Seq Primary High-grade glioma. gender Male Age 65.
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005410 
   
  
    
    Single Nuclei RNA-Seq Primary GBM. Gender Female Age 51. 
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005411 
   
  
    
    Single Cell RNA seq from Recurrent oligodendroglioma sample. Gender Male Age 67. 
NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005412 
   
  
    
    Single Nuclei RNA-Seq Primary IDHR132H Wild-type GBM. Male, 61.
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005413 
   
  
    
    SF11612 Recurrent oligodendroglioma. Gender Male Age 67. 
Single Nuclei ATAC seq data from low grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005414 
   
  
    
    Single Nuclei ATAC Seq IDHR132H mutant Astrocytoma. Male, 64.
Single Nuclei ATAC seq data from low grade human glioma sample. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005415 
   
  
    
    Single Nuclei RNA-Seq Primary High-grade Glioma. Gender male age 39. 
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005416 
   
  
    
    Organoid cultures were exposed to two different E. coli strains and a dye control with three biological duplicates. Their original culture was harvested as a control. In total, 10 organoid cultures were whole-genome sequenced using the NovaSeq 6000 platform. The data are deposited in BAM format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001005417 
   
  
    
    Disease: Severe congenital deafness, early onset cataracts and various neurological features
Family:  3 affected individuals originated from the same small village (Amarat) in the Kayseri region of Turkey and belonging to the same large extended consanguineous family.
Dataset: 5 BAM files. Whole-genome sequencing (WGS) was applied to the three affected individuals (II.2, II.4 and II.7) and two healthy individuals (II.1 and II.3). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001005418 
   
  
    
    SF11979 snATAC, IDHR132H WT GBM Female, 76
Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005419 
   
  
    
    Diffuse large B-cell lymphoma (DLBCL) is the most common histologic subtype of non-Hodgkin lymphoma and is notorious for its clinical heterogeneity. Patient outcomes can be predicted by cell-of-origin (COO) classification, demonstrating that the underlying transcriptional signature of malignant B-cells informs biological behavior in the context of standard combination chemotherapy regimens. In the current study, we used mass cytometry (CyTOF) to examine tumor phenotypes at the protein level with single cell resolution in a collection of 27 diagnostic DLBCL biopsy specimens from treatment naïve patients. We found that malignant B-cells from each patient occupied unique regions in 37-dimensional phenotypic space with no apparent clustering of samples into discrete subtypes. Interestingly, variable MHC class II expression was found to be the greatest contributor to phenotypic diversity. Within individual tumors, a subset of cases showed multiple phenotypic subpopulations, and in one case we were able to demonstrate direct correspondence between protein-level phenotypic subsets and DNA mutation-defined subclones. In summary, CyTOF analysis can resolve both inter- and intra-tumoral heterogeneity among primary samples, and reveals that each case of DLBCL is unique and may be comprised of multiple, genetically distinct subclones. 
    
   
  
    
   
  17 
 
  
    EGAD00001005420 
   
  
    
    Here we performed single-cell RNA sequencing to address repertoire stability and subset plasticity during IL-15 driven homeostatic proliferation. Sorted NK cell subsets representing discrete stages of NK cell differentiation are compared with the corresponding subsets after proliferation and further sorted into two subsets depending on the rate of proliferation. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001005421 
   
  
    
    These are 21 metastatic melanoma exomes matched with 7 germlines from 7 multisite metastatic melanoma cases. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005422 
   
  
    
    Dataset contains 854 single cell sequenced colorectal cancer organoids. 
    
   
  
    
   
  854 
 
  
    EGAD00001005423 
   
  
    
    Whole Genome sequencing. 1 ug of genomic DNA from each lymph node sample was used for the construction of a TruSeq DNA PCR Free (350) library before sequencing on an Illumina HiSeq X Ten (2 × 151 bp). Mean coverage 30x. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001005424 
   
  
    
    Exome Sequencing. 3 ug of genomic DNA from each lymph node sample were sheared and used for the construction of a paired-end sequencing library as described in the paired-end sequencing sample preparation protocol provided by Illumina. Enrichment of exonic sequences was then performed for each library using either the Sure Select Human All Exon 50 Mb or All Exon+UTRs v4 kits following the manufacturer’s instructions (Agilent Technologies). Exon-enriched DNA was pulled down by magnetic beads coated with streptavidin (Invitrogen), followed by washing, elution and 18 additional cycles of amplification of the captured library. Enriched libraries were sequenced in one lane of an Illumina GAIIx sequencer or in two lanes of a HiSeq2000 when using pools of eight samples. 
    
   
  
    
      
      unspecified 
      
    
   
  13 
 
  
    EGAD00001005425 
   
  
    
    Whole Exome Sequencing for a cohort of 20 B-ALL samples: 5 Down syndrome (DS), 7 Hyperdiploid (HeH), 3 iAMP21 and 5 others. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001005426 
   
  
    
    RNA-sequencing for a cohort of B-ALL samples: 5 Down Syndrome (DS), 16 Hyperdiploid (HeH), 6 iAMP21, 9 other. 
RNA-sequencing for B-cell progenitors from 3 healthy individuals. 
It also contains RNA-sequencing datasets of Patient-Derived Xenografts (X) developed from the B-ALL samples: 4 Down Syndrome (DS), 4 Hyperdiploid (HeH), 1 iAMP21, 3 other. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  51 
 
  
    EGAD00001005427 
   
  
    
    Three SpCas9-ABE (R785X/R785X) and three xCas9-ABE-repaired organoid clones (F508del/R553X) and their respective unrepaired control organoids were paired-end whole genome sequenced using the Illumina NovaSeq 6000 system. The reads were mapped to the hg19 genome assembly and the data are provided as BAM files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001005428 
   
  
    
    Single Nuclei Primary GBM 73 Male.
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005429 
   
  
    
    Single Cell Primary high grade glioma IDHR132H Wild-type Female 76
Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001005430 
   
  
    
    Single Nuclei Primary GBM IDHR132H Wildtype. Female 76.
Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001005431 
   
  
    
    Our understanding of the BCR repertoire in the context of immune-mediated diseases is incomplete, and defining this could provide new insights into pathogenesis and therapy. Here, we compared the BCR repertoire in systemic lupus erythematosus, anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitis, Crohn’s disease, Behçet’s disease, eosinophilic granulomatosis with polyangiitis, and immunoglobulin A (IgA) vasculitis by analysing BCR clonality, use of immunoglobulin heavy-chain variable region (IGHV) genes and, in particular, isotype use. An increase in clonality in systemic lupus erythematosus and Crohn’s disease that was dominated by the IgA isotype, together with skewed use of the IGHV genes in these and other diseases, suggested a microbial contribution to pathogenesis. Different immunosuppressive treatments had specific and distinct effects on the repertoire; B cells that persisted after treatment with rituximab were predominately isotype-switched and clonally expanded, whereas the inverse was true for B cells that persisted after treatment with mycophenolate mofetil. Our comparative analysis of the BCR repertoire in immune-mediated disease reveals a complex B cell architecture, providing a platform for understanding pathological mechanisms and designing treatment strategies. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  167 
 
  
    EGAD00001005432 
   
  
    
    To be added... 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  27 
 
  
    EGAD00001005433 
   
  
    
    The dataset contains plasma DNA methylation data derived from metastatic prostate cancer patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  115 
 
  
    EGAD00001005434 
   
  
    
    Data supporting: "Genomic evidence supports a clonal diaspora model for metastases of esophageal adenocarcinoma." Noorani et al.
WGS (BAM files)
134 samples for 18 cases
Includes primary, lymph-node, distant metastatic, Barrett's and normal samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005435 
   
  
    
    This dataset contains 9 RNA-seq BAM files. RNA was derived from TERT promoter mutant GBM cell lines and sequenced on an Illumina HiSeq4000 sequencer with paired-end reads and an average read length of 50 base pairs. Reads were aligned with TopHat (v2.0.14) using a GENCODE V19 transcriptome-guided alignment. 
    
   
  
    
   
  9 
 
  
    EGAD00001005438 
   
  
    
    Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al.
scRNAseq (BAM files)
38 Barrett's and normal samples 
    
   
  
    
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001005439 
   
  
    
    This dataset contains small RNA sequencing data and mRNA capture sequencing data from 20 different human biofluids (amniotic fluid, aqueous humor, ascites, bile, bronchial lavage fluid, breast milk, cerebrospinal fluid, colostrum, gastric fluid, pancreatic cyst fluid, plasma, saliva, seminal fluid, serum, sputum, stool, synovial fluid, sweat, tear fluid and urine). In total, 180 samples were sequenced. Files are provided in fastQ format. Samples were sequenced on a NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  180 
 
  
    EGAD00001005442 
   
  
    
    Exome sequencing of ID trios and sibpairs 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  123 
 
  
    EGAD00001005443 
   
  
    
    Metagenomes of stool samples from 46 Lifelines control subjects (no antimicrobial use in the three months before sampling, no occupational livestock contact). Samples were sequenced and analysed as part of the EFFORT project and derived from the LifeLines cohort from the northern parts of the Netherlands. http://www.lifelines.nl. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  46 
 
  
    EGAD00001005444 
   
  
    
    Metagenomes of stool samples from 54 pig farmers, 24 broiler farmers and 70 slaughter line workers. Note: Access to the data will only be granted for antibiotic resistance studies in accordance with the EFFORT consents issued by the participants. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  148 
 
  
    EGAD00001005445 
   
  
    
    This dataset includes 44 bam files derived from 21 patients with IVL. Tumor samples are derived from cfDNA (n = 18), PDX (n = 4) and bone marrow (n = 2). Normal samples are derived from peripheral blood. 
    
   
  
    
   
  44 
 
  
    EGAD00001005446 
   
  
    
    Tregs were sorted as CD4+CD25+CD127- cells from peripheral blood of 14 healthy individuals, 8 patients with mild/severe rheumatoid arthritis, 1 patient with systemic lupus erythematosus/rheumatoid arthritis, 2 patients with ulcerative colitis and 2 patients with Crohn's disease. RNA was extracted and polyA libraries were prepared using the Illumina Truseq sample preparation kit v.2. Single-end 75bp sequencing was performed on NextSeq500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001005448 
   
  
    
    Single cell RNA sequencing (scRNA-seq) is widely used for profiling transcriptomes of individual cells. The droplet-based 10X Genomics Chromium (10X) approach and the plate-based Smart-seq2 full-length method are two frequently-used scRNA-seq platforms, yet there are only a few thorough and systematic comparisons of their advantages and limitations. Here, by directly comparing the scRNA-seq data from the two platforms on the same samples of CD45- cells, we systematically evaluated their features using a wide spectrum of analyses. Smart-seq2 detected more genes in a cell, especially low abundance transcripts as well as alternatively spliced transcripts, but captured a higher proportion of mitochondrial genes. The composite of Smart-seq2 data also resembled bulk RNA-seq data better. For 10X-based data, we observed higher noise for mRNA at low expression levels. Despite the poly(A) enrichment, approximately 10-30% of all detected transcripts by both platforms were from non-coding genes, with lncRNA accounting for a higher proportion in 10X. 10X-based data displayed a more severe dropout problem, especially for genes with lower expression levels. However, 10X-data can better detect rare cell types given its ability to cover a large number of cells. In addition, each platform detected different sets of differentially expressed genes between cell clusters, indicating the complementary nature of these technologies. Our comprehensive benchmark analysis offers the basis for selecting the optimal scRNA-seq strategy based on the objectives of each study. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  78 
 
  
    EGAD00001005449 
   
  
    
    This dataset includes ChIP-seq data from two cell lines (HKCI-11 (GOFp53) and MIHA(WT p53)).  All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001005450 
   
  
    
    This dataset contains target capture sequence data from 255 samples, including 154 tumors and 101 normal samples.  All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  255 
 
  
    EGAD00001005451 
   
  
    
    This dataset contains whole genome sequence data from 24 samples, including 16 tumors and 8 normal samples.  All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD00001005452 
   
  
    
    This dataset contains whole genome sequence data from 12 samples from 1 patient, including 8 tumor sectors and 4 normal samples.  All the experiments were performed on Illumina HiSeq platform with raw reads stored in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001005453 
   
  
    
    This dataset contains whole exome sequence data from 86 samples from 6 patients. All the experiments were performed on Illumina HiSeq platform with raw reads stored in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001005454 
   
  
    
    Illumina platform sequencing of whole genome libraries prepared from paired tumour/normal samples from 103 cases of melanoma Uveal subtype 
    
   
  
    
   
  - 
 
  
    EGAD00001005455 
   
  
    
    Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al.
Single cell metadata and analysis
38 Barrett's and normals 
    
   
  
    
   
  - 
 
  
    EGAD00001005456 
   
  
    
    There are two samples, 42 (control) and 49
To test the role of activated CRLF2/IL7RA in leukemia initiation we expressed CRLF2 together with IL7RA in human CB hematopoietic progenitors. Human CRLF2 and wild type and/or activated mutant form of human IL7RA (IL7RAwt/ins) were cloned into a lentiviral vector with a bi-cistronic cassette under the expression control of an Eμ-B29 promoter/enhancer to augment expression in B-cell precursors. Backbone vector expressing GFP (BB).
Whole genome sequencing
Leukemic (49) and BB transduced (42) corresponding CB cells were collected from transplanted mice. Sequencing libraries were prepared from these samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001005457 
   
  
    
    Whole-exome sequencing (WES) and whole-genome sequencing (WGS) were performed on matched adjacent normal tissues, multiregionally sampled adenomas at different stages and carcinomas from 5 patients with FAP and 1 patient with MUTYH-associated polyposis (MAP) (n=56 exomes; n=56 genomes; n=8,757 single cells). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  165 
 
  
    EGAD00001005458 
   
  
    
    Whole exome sequencing of 15 DNA samples, and whole genome sequencing of 2 matched DNA samples. Whole exome sequencing is of RMS samples (both alveolar and embryonal) and from cell lines as well as patient samples. Patient samples are of pediatric RMS patients. 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001005459 
   
  
    
    Whole genome sequencing of HSPC and SI clones from samples of 2 disomy 21 fetuses and 1 trisomy 21 fetus (HiSeq X Ten samples). 5 disomy clones and 5 trisomy clones were included in this experiment. Three bulk samples were also included. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  13 
 
  
    EGAD00001005460 
   
  
    
    BAM files corresponding to sequencing of 18 circulating tumor DNA and matched tumor samples from SCLC patients. Each ctDNA sample was sequenced twice. 
    
   
  
    
      
      Ion Torrent PGM 
      
      Ion Torrent Proton 
      
    
   
  18 
 
  
    EGAD00001005461 
   
  
    
    This dataset contains two experiments. 1) Single cell RNA-seq of diagnostic samples from patients with MLL-rearranged infant ALL that underwent relapse or not (samples ending in R relapsed, samples ending in N did not). For some of the patients, multiple independent plates were produced (each plate is a sample). 2) In vitro prednisolone exposure experiment. Diagnostic bone marrow samples from patient 4662R were cultured for three days with and without prednisolone.
Single cell experiments were conducted according to Muraro et al. (Cell Systems, 2016, doi:10.1016/j.cels.2016.09.002). Cell barcodes and UMI sequences are embedded in the header of each fastq entry. Cell barcodes irrelevant to this experiment were removed before submission. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  11 
 
  
    EGAD00001005462 
   
  
    
    BAM files corresponding to sequencing of 28 circulating tumor DNA and matched tumor samples from SCC patients. Each ctDNA sample was sequenced twice. 
    
   
  
    
      
      Ion Torrent PGM 
      
      Ion Torrent Proton 
      
    
   
  28 
 
  
    EGAD00001005463 
   
  
    
    BAM files corresponding to the sequencing of 125 circulating cell-free DNA samples from 125 healthy patients. Each sample was sequenced twice. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  125 
 
  
    EGAD00001005464 
   
  
    
    Single-cell RNA-seq of tumor-infiltrating lymphocytes from 14 cancer patients before treatment, taken from tumor, normal adjacent tissue, and peripheral blood.  Dataset consists of paired-end FASTQ files, including replicate libraries and runs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  45 
 
  
    EGAD00001005465 
   
  
    
    Single-cell TCR-seq of tumor-infiltrating lymphocytes from 14 cancer patients before treatment, taken from tumor, normal adjacent tissue, and peripheral blood.  Dataset consists of paired-end FASTQ files, including replicate libraries and runs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  88 
 
  
    EGAD00001005466 
   
  
    
    Whole genome sequencing of 100 unrelated Uzbeks in order to impute genotypes into PE cases and controls from Uzbekistan and to provide genetic data and infrastructure for future genetic studies in Uzbekistan and Central Asia more generally and to fill a gap in worldwide information as Central Asia is not adequately represented in available genomic data. This dataset is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001005467 
   
  
    
    Whole genome sequencing of 100 unrelated Kazakhs in order to impute genotypes into PE cases and controls from Kazakhstan and to provide genetic data and infrastructure for future genetic studies in Kazakhstan and Central Asia more generally and to fill a gap in worldwide information as Central Asia is not adequately represented in available genomic data. This dataset is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001005468 
   
  
    
    Dataset includes 2 scRNA-seq samples from a 6.5-7 post-conception weeks human embryonic heart and 19 samples from 4.5-9 post-conception weeks human embryonic hearts analyzed with the Spatial Transcriptomics method. H&E stains can be sent if requested. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001005469 
   
  
    
    This includes variant calls (single nucleotide variants and small insertions/deletions) from 8086 (mostly British Pakistani/British Bangladeshi) individuals from the following studies:
1. 5236 British Pakistani/British Bangladeshi adults from East London Genes and Health (ELGH)
2. 2624 British South Asian mothers from Born in Bradford (mostly Pakistani) (BiB)
3. 1061 British South Asian adults from Birmingham (mostly Pakistani) (Birm)
All of the Birmingham and most of the Born in Bradford samples were previously sequenced as part of PMID: 26940866.
In the sample list file, the columns of interest to most people will be:
    vcf.id - sample ID from the vcf
    cohort - which cohort they're in
    sex.assigned - sex inferred from coverage on the X and Y chromosomes. Individuals for whom this did not match their reported sex have been discarded
    total, chrX and chrY - coverage within bait regions across all chromosomes, chrX and chrY respectively 
Mapping was done with bwa-mem and variant calling was carried out with GATK HaplotypeCaller. We removed variant sites for which the following was true: SNPs: "QD < 2.0 || FS > 30 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" Indels: "QD < 2.0 || FS > 30 || ReadPosRankSum < -20.0" 
    
   
  
    
   
  - 
 
  
    EGAD00001005470 
   
  
    
    Whole-exome sequencing data from Illumina NextSeq 500. It consists of 88 paired-end FASTQ files from 44 primary, residual, relapsed tumors and normal samples from the blood. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  44 
 
  
    EGAD00001005472 
   
  
    
    Low coverage nanopore sequencing of ovarian cancer tumors 
    
   
  
    
      
      GridION 
      
      MinION 
      
    
   
  4 
 
  
    EGAD00001005473 
   
  
    
    Low coverage nanopore sequencing of prostate cancer tumors 
    
   
  
    
      
      GridION 
      
    
   
  5 
 
  
    EGAD00001005474 
   
  
    
    This dataset contains all available targeted and exome sequencing paired fastq files from our study, "Identification of hypermutation and defective mismatch repair in ctDNA from metastatic prostate cancer". Patient identifiers are denoted by the first three characters of the sample aliases (e.g. "P01"), and additional information is appended to reflect the panel used (targeted 73 gene panel: "PC", or whole-exome panel: "WXS"), and whether the sample represents cell-free DNA ("cfdna") or paired white-blood cell control ("wbc"). Several patients have multiple serial collections available, and these are denoted by the characters "C1, C2, C3," etc. All samples were sequenced using Illumina technology. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  154 
 
  
    EGAD00001005475 
   
  
    
    The dataset includes exome sequencing results for a patient with SSBP1 mutations that cause a complex optic atrophy spectrum disorder 
    
   
  
    
   
  1 
 
  
    EGAD00001005476 
   
  
    
    WGS Nanopore sequencing of organoid line HGS-3.1 and matching blood reference HGS-3 
    
   
  
    
      
      MinION 
      
    
   
  2 
 
  
    EGAD00001005477 
   
  
    
    LBC1921 and LBC1936 GVCFs called with GATK's HaplotypeCaller were combined and subjected to variant quality score recalibration. This VCF contains the subset of samples (n = 296) from the LBC1921 cohort. 
    
   
  
    
      
      Illumina HiSeq X Ten 
      
    
   
  296 
 
  
    EGAD00001005478 
   
  
    
    LBC1921 and LBC1936 GVCFs called with GATK's HaplotypeCaller were combined and subjected to variant quality score recalibration. This VCF contains the subset of samples (n = 1068) from the LBC1936 cohort. 
    
   
  
    
      
      Illumina HiSeq X Ten 
      
    
   
  1068 
 
  
    EGAD00001005479 
   
  
    
    Whole-exome sequencing coupled with RNA-seq of preinvasive (n=98) and invasive (n=99) lung adenocarcinoma samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  394 
 
  
    EGAD00001005480 
   
  
    
    Whole-genome sequencing (WGS) data for 546 Singaporean volunteers used to estimate WGS-LTL in the study. Samples were sequenced using Illumina HiSeq X to a mean coverage of 30X. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  546 
 
  
    EGAD00001005481 
   
  
    
    This dataset contains single cell RNA sequencing data of PBMC samples from 10 bladder cancer patients. cDNAs and single cell RNA libraries were prepared following the manufacturer’s user guide (10x Genomics). Each library was sequenced on a HiSeq 4000 (Illumina) to achieve ~300 million reads following the manufacturer’s sequencing specification. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001005482 
   
  
    
    Bacterial 16S V4 rDNA was amplified using two differently barcoded V4 fusion primers. Pooled PCR samples were purified and paired-end sequenced on a MiSeq instrument for 250 cycles. The steps from DNA quantification to sequencing were conducted at Second Genome Inc. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  109 
 
  
    EGAD00001005483 
   
  
    
    These are caveman, pindel, battenberg and brass calls for index patients' metastatic melanoma genomes within this study. 
    
   
  
    
   
  - 
 
  
    EGAD00001005484 
   
  
    
    WXS files for Zhang PanNBL paper titled "Pan-neuroblastoma analysis reveals age- and signature-associated driver alterations" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  634 
 
  
    EGAD00001005486 
   
  
    
    This dataset contains WGBS sequencing results for HEMa_LP. The cells were cultured in Medium 254 supplemented with PMA-Free Human Melanocyte Growth Supplement-2 (HMGS-2) at 37°C, 5% CO2. 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001005487 
   
  
    
    These are caveman, pindel and sequenza calls for the metastatic melanoma exomes within this study. 
    
   
  
    
   
  - 
 
  
    EGAD00001005488 
   
  
    
    Paired tumor/normal WGS and RNA-seq of primary neuroblastoma. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  117 
 
  
    EGAD00001005489 
   
  
    
    Nimblegen SeqCap (sequence capture) deep targeted DNA sequencing of pNET 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  98 
 
  
    EGAD00001005491 
   
  
    
    The dataset contains WGS and RNA-seq from Myeloma XI trial 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  246 
 
  
    EGAD00001005492 
   
  
    
    Content: 60 GB patient tumours and 4 normal brain samples combined in pairs by region (x2=8 total input samples). 
RNAseq: 1 lane per sample, total strand-specific rRNA-depleted (normal samples were combined = 2 lanes/samples per brain region).
WGBS: 2 lanes per sample (normal samples were combined = 2 lanes/samples per brain region).
ChIPseq (histone mark): a subset of 20 GB samples were profiled. Samples for the same modification were multiplexed and sequenced on 4 lanes each (H3K27ac, H3K4me1) or a single lane (all others).
WGS: used as matching input control for the 20 ChIPseq samples.
Data type and technology: 
RNA-seq: PE 100bp sequenced on HiSeq2000.
WGBS: PE 100bp sequenced on HiSeq2000/4000.
ChIPseq: SE 50bp sequenced on HiSeq2000/4000.
WGS: PE 150bp sequenced on HiSeq X. 
    
   
  
    
   
  172 
 
  
    EGAD00001005493 
   
  
    
    Content: 2 GB RTK I cell lines (LN229, ZH487) in two conditions (NT control and shSOX10).
RNAseq: single replicates per condition, polyA+ RNA sequencing, SE.
ATACseq: biological replicates per condition, SE.
ChIPseq (histone H3 modifications, LN229 only): all marks for each condition were pooled and sequenced on two lanes for each pool.
ChIPseq (BRD4 and SOX10): SOX10 libraries were sequenced on single lanes. BRD4 samples were multiplexed and sequenced in two lanes.
ChIPseq input samples are also included.
Data type and technology: 
RNAseq: SE 50bp sequenced on HiSeq2000/4000.
ATACseq: SE 50bp sequenced on HiSeq2000/4000.
ChIPseq: SE 50bp sequenced on HiSeq2000/4000. 
    
   
  
    
   
  6 
 
  
    EGAD00001005494 
   
  
    
    Nimblegen SeqCap Custom Panel Sequencing for pNet 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  96 
 
  
    EGAD00001005495 
   
  
    
    The genomic hallmark of clear cell renal cell carcinoma is the loss of the short arm of chromosome three. This appears to be the earliest genomic event in the formation of these cancers. Often chromosome 3 is lost at the same time as part of chromosome 5 is duplicated via an unbalanced translocation, often with features consistent with focal chromothripsis. In this study, we sought to reconstruct the chromothriptic event that underlies the initiation of kidney cancer. We used long read sequencing (PromethION, Oxford Nanopore Technologies) of patient tumour-derived DNA to elucidate how a single cell division error can generate cancer genome complexity. 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001005497 
   
  
    
    TTN gene targeted sequencing for AMC cohort (n=24) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  24 
 
  
    EGAD00001005498 
   
  
    
    Whole Exome sequencing of a set of Spanish patients suffering from rare genetic diseases. The set consists of 3 patients: two diagnosed with Aniridia (ANI-0006 and ANI-0023) and one diagnosed with Retinitis Pigmentosa (RP-0247). 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001005499 
   
  
    
    Targeted next-generation sequencing of 13 pediatric bithalamic diffuse gliomas. BAM files of targeted next-generation DNA sequencing data of 13 pediatric gliomas, with multi-region sequencing data from 2 of these cases (17 total tumor samples). Genomic DNA was extracted from formalin-fixed, paraffin-embedded blocks of tumor tissue using the QIAamp DNA FFPE Tissue Kit (Qiagen). Capture-based next-generation DNA sequencing was performed at the University of California, San Francisco Clinical Cancer Genomics Laboratory, using an assay that targets all coding exons of 480 cancer-related genes, select introns of 47 genes, and TERT promoter with a total sequencing footprint of 2.8 Mb (UCSF500 Cancer Panel). Sequencing libraries were prepared from genomic DNA, and target enrichment was performed by hybrid capture using a custom oligonucleotide library (Nimblegen SeqCap EZ Choice). Captured libraries were sequenced as paired-end 100 bp reads on an Illumina HiSeq 2500 instrument. Duplicate sequencing reads were removed computationally to allow for accurate allele frequency determination and copy number calling. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  17 
 
  
    EGAD00001005500 
   
  
    
    Illumina platform sequencing of whole genome libraries prepared from paired tumour/normal samples from 87 cases of acral subtype melanoma. 63 cases also have RNA-Seq data from the tumour sample. 
    
   
  
    
   
  - 
 
  
    EGAD00001005501 
   
  
    
    Illumina RNASeq sequencing of tumour samples from 41 cases of melanoma 
    
   
  
    
   
  - 
 
  
    EGAD00001005502 
   
  
    
    TST170 DNA FASTQ files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001005503 
   
  
    
    DNA BAM files 
    
   
  
    
   
  16 
 
  
    EGAD00001005504 
   
  
    
    18 WGBS lanes for 9 samples of pilocytic astrocytoma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001005506 
   
  
    
    WGS files for Mullighan_GL_reALL paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  99 
 
  
    EGAD00001005507 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD00001005508 
   
  
    
    3' mRNA-Seq obtained from distinct isolated cell types (epithelial cells, immune cells, fibroblasts) of endoscopically obtained esophageal adenocarcinoma tissue as well as normal esophageal mucosa. Libraries for RNA-sequencing were prepared using the QuantSeq 3' mRNA-Seq Library Prep Kit FWD for Illumina according to the low input protocol. Libraries were sequenced on a HiSeq 4000 (Illumina) with 1x50 base reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00001005509 
   
  
    
    WXS files for Mullighan_GL_reALL paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  276 
 
  
    EGAD00001005510 
   
  
    
    RNAseq files for Mullighan_GL_reALL RNASEQ2 paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  34 
 
  
    EGAD00001005511 
   
  
    
    RNASeq files for Mullighan_GL_reALL RNASEQ1 paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  81 
 
  
    EGAD00001005512 
   
  
    
    RNA sequencing data from human pancreatic islets from 191 donors, Lund University. Processed for the Inspire consortium. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  191 
 
  
    EGAD00001005519 
   
  
    
    Files from DNA and RNA sequencing from primary tumors and metastases from pancreatic cancer patients along with matched normal tissues. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  252 
 
  
    EGAD00001005520 
   
  
    
    This dataset consists of three bam files (two cell-free DNA and one germline DNA) from a metastatic bladder cancer patient with BAP1 variants. Bam files were generated from targeted Illumina sequencing data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  3 
 
  
    EGAD00001005521 
   
  
    
    Genotyping data (Imputed) from human pancreatic islets from 191 donors from Lund that were analysed as part of the Inspire consortium. 
    
   
  
    
   
  191 
 
  
    EGAD00001005523 
   
  
    
    Phenotype data from human pancreatic islets from 191 donors, Lund University. Processed for the Inspire consortium. 
    
   
  
    
   
  191 
 
  
    EGAD00001005524 
   
  
    
    Colorectal cancer (CRC) is characterized by functional intratumor heterogeneity that shares many similarities with the hierarchical organization of the normal intestinal epithelium. In order to relate transcriptional subtypes to functional tumor cell heterogeneity we applied scRNA-seq to 12 patient-derived CRC spheroid cultures. We identified shared expression programs that relate to intestinal lineages and revealed metabolic signatures that are linked to cancer cell differentiation. In addition, we validated and complemented sequencing results by quantitative microscopy using live-dyes and multiplexed RNA fluorescence in situ hybridization, thereby revealing metabolic compartmentalization and potential cell-cell interactions. Finally, we demonstrate functional differences between metabolically distinct lineage subtypes that might have strong implications for future treatment strategies of CRC. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8714 
 
  
    EGAD00001005525 
   
  
    
    Validation data containing sequencing data of 13 samples. A hybrid capture approach was used to validate findings of both Manta and GRIDSS for the samples in the validation set. The dataset also contains the reference sequences used. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001005526 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  180 
 
  
    EGAD00001005707 
   
  
    
    WGS of more samples in the ovarian cancer organoid biobank dataset. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  23 
 
  
    EGAD00001005709 
   
  
    
    To identify what factors cause a different reactivity to MLN4924, 15 cells were categorized into high, intermediate, and low MLN4924 resistance groups based on the half-maximal inhibitory concentration (IC50) of MLN4924. 
PDC1, PDC2, PDC3, PDC4, and PDC5 showed high MLN4924 sensitivity, whereas PDC12, PDC13, PDC14, and PDC15 showed low MLN4924 sensitivity. 
Whole-transcriptome sequencing of these 9 patient-derived glioblastoma stem cells was performed. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001005710 
   
  
    
    TST170 Pilot RNA VCF 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001005711 
   
  
    
    TST170 Pilot RNA FASTQ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001005712 
   
  
    
    46 BAM files from 23 urothelial bladder cancer patients on an immunotherapy clinical trial. PBMC normal samples and solid tumor samples are paired. Alignment was done by BWA with reference genome hg19. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001005713 
   
  
    
    Whole exome sequencing and RNA sequencing data from 30 patients with prostate cancer. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  87 
 
  
    EGAD00001005714 
   
  
    
    Single cell atlas of human airways from 10 healthy volunteers by 10X Genomics 3’ RNA-seq profiling. 77,969 cells were collected by bronchoscopy at 35 distinct locations, from the nose to the 12th division of the airway tree, either by forceps (46,791 cells), or brush biopsies (31,178 cells). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  35 
 
  
    EGAD00001005715 
   
  
    
    RNA-Seq data of 36 HPV-negative HNSCC specimens from patients treated at The Netherlands Cancer Institute, Amsterdam. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  36 
 
  
    EGAD00001005716 
   
  
    
    RNA-Seq data of 55 HPV-negative HNSCC specimens from patients treated at the VU Medical Centre, Amsterdam, The Netherlands. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  46 
 
  
    EGAD00001005717 
   
  
    
    RNA-Seq data of 17 HPV-negative HNSCC specimens from patients treated at the MAASTRO clinic, Maastricht, The Netherlands. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  17 
 
  
    EGAD00001005718 
   
  
    
    Low-coverage whole genome sequencing data of 37 HPV-negative HNSCC specimens from patients treated at The Netherlands Cancer Institute. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for WGS to a depth of approx. 0.5X. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001005719 
   
  
    
    Low-coverage whole genome sequencing data of 37 HPV-negative HNSCC specimens from patients treated at the VU Medical Centre, Amsterdam. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for WGS to a depth of approx. 0.5X. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001005720 
   
  
    
    RNA-Seq data of 8 HNSCC specimens from patients diagnosed with metastatic disease at The Netherlands Cancer Institute, Amsterdam. Primary HNSCC biopsy samples obtained prior to treatment were used for polyA mRNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001005721 
   
  
    
    RNA-Seq data of 25 HNSCC specimens from patients treated at The Netherlands Cancer Institute, Amsterdam and enrolled in the ARTFORCE trial. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001005722 
   
  
    
    RNA-Seq data of 28 HPV-negative HNSCC specimens from patients treated at the Netherlands Cancer Institute (Amsterdam), VU Medical Centre (Amsterdam) or MAASTRO Clinic (Maastricht) in The Netherlands. HNSCC biopsy samples were obtained prior to treatment (chemo-radiotherapy) for the prospective study within the DESIGN project and used for polyA mRNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001005723 
   
  
    
    This dataset contains sequencing data from a large-scale study of mtDNA variation, measured using a sensitive mtDNA-targeted sequencing method called STAMP, in lymphoblast and blood samples of Huntington’s Disease patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2602 
 
  
    EGAD00001005724 
   
  
    
    This dataset contains whole genome sequencing data from Illumina short-read sequencing (2x150 bp) and 10X Genomics linked-read sequencing. Both sequencing technologies were used to sequence the MCF7 cell line and a primary triple-negative breast cancer sample. FASTQ files of paired-end reads are available for both samples from both technologies. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001005728 
   
  
    
    aCGH CNV detection by CNsolidate for 6,827 DDD probands 
    
   
  
    
   
  - 
 
  
    EGAD00001005729 
   
  
    
    WGS files for Mullighan BiTE WGS paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001005730 
   
  
    
    WXS files for Mullighan BiTE WXS paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001005731 
   
  
    
    RNAseq files for Mullighan BiTE RNASEQ1 paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  41 
 
  
    EGAD00001005732 
   
  
    
    lowinput RNASEQ files for Mullighan BiTE RNASEQ2 paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001005733 
   
  
    
    single cell RNASEQ files for Mullighan BiTE RNASEQ3 paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001005734 
   
  
    
    Exome Sequencing and RNA Sequencing Data for PDX Samples 
    
   
  
    
      
      Illumina Genome Analyzer 
      
    
   
  30 
 
  
    EGAD00001005735 
   
  
    
    This data set contains the raw .fastq files from two RNA-sequencing experiments and two small RNA-sequencing experiments. Both control brain tissue and tissue from sufferers of mesial temporal lobe epilepsy were sequenced. Two different brain regions were sequenced: the cortex and the hippocampus. For more details please see: Mills, James D., et al. "Coding and non-coding transcriptome of mesial temporal lobe epilepsy: Critical role of small non-coding RNAs." Neurobiology of disease 134 (2020): 104612. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  33 
 
  
    EGAD00001005736 
   
  
    
    In the brain, the cells that control inflammation are a type of white blood cell called microglia. Microglia are located throughout the brain and spinal cord and account for 10–15% of all cells found within the brain. As the resident white blood cells, they are the main active immune defence in the central nervous system (CNS). Microglia are part of an important class of cells known as macrophages that have two main states: M1 and M2. M1 cells are pro-inflammatory, leading to more inflammation, while M2 cells are anti-inflammatory and drive wound healing. In this study, we will collect primary microglia from surgical biopsies of 100 individuals. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
   
  - 
 
  
    EGAD00001005737 
   
  
    
    WES using IDT xGen Research Exome on Illumina NovaSeq 2x150bp: Normal sample (buffy coat), cecum tumor biopsy at diagnosis, ileocecal valve region tumor sample at week 19, pericolonic metastasis at week 19, lymph node metastasis at week 19, peritoneal metastasis at week 19. Week 19 samples from hemicolectomy. Deep coverage cfDNA NGS using PanCeq pan-cancer panel on NextSeq 2x150bp: week 2 and week 10. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001005738 
   
  
    
    79 RNAseq samples from 56 patients with melanoma who have undergone immune checkpoint blockade immunotherapy. 
    
   
  
    
   
  - 
 
  
    EGAD00001005739 
   
  
    
    Single-cell gene expression was profiled for 22 Hodgkin lymphoma tumors and 5 reactive lymph nodes (2 replicates were performed for RLN-1). Library preparation was performed with the 10x Chromium platform (3' version 2 assay). Sequencing was performed on an Illumina NextSeq. The BAM files were generated from the raw sequencing data using Cell Ranger (v2.1.0) mkfastq and count commands. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001005740 
   
  
    
    To define the cellular characteristics of malignant ascites of advanced gastric cancer patients and search for therapeutic strategies, we obtained 5 malignant ascites and 1 cerebrospinal fluid from five patients with gastric cancer. We analyzed single-cell RNA-seq data of 180 cells from 4 malignant ascites and 1 cerebrospinal fluid metastasis using Fluidigm® C1™ System. Whole exome sequencing data was also generated from blood or tumor tissue. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001005741 
   
  
    
    When comparing the differentiation capacities of pluripotent stem cell lines that have different
genetic backgrounds, batch-to-batch experimental variability poses a significant challenge,
especially when trying to identify smaller effects. One way to address this issue is to
differentiate several different lines in the same culture dish, thereby eliminating experimental
variation. In addition, it allows researchers to analyze many more lines with fewer experiments.
Parallel single cell RNA-Seq exploits the fact that individual cells are tagged and hence each cell can
be reliably assigned to the donor of origin based on the genetic variants it contains. In
addition, analyzing the genetic signature of single cells within a differentiating population can
reveal differentiation stages that are not easily detected in bulk RNAseq data.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  13433 
 
  
    EGAD00001005743 
   
  
    
    Fastq files for 80 Multiple Myeloma Patients and 12 cell-lines 
    
   
  
    
      
      454 GS Junior 
      
    
   
  12 
 
  
    EGAD00001005744 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001005745 
   
  
    
    This dataset contains whole genome sequencing data aligned to the b37 reference genome for 4 spatially and temporally distinct tumors from one patient with a matched normal blood sample. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001005746 
   
  
    
    Whole Exome sequencing of a set of Spanish patients suffering from rare genetic diseases. The set consists of 4 patients: one diagnosed with Retinitis Pigmentosa (RP-1629), one diagnosed with Macular Dystrophy (MD-0235) and two diagnosed with Leber's Congenital Amaurosis (LCA-0081 and LCA-0103). 
    
   
  
    
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001005747 
   
  
    
    RNAseq sample used in study titled "Immune-awakening revealed by peripheral T cell dynamics after one cycle of immunotherapy". 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001005748 
   
  
    
    Exome sequencing data from patient with Chronic Lymphocytic Leukemia. DNA was extracted from sorted B-CLL and T cells or granulocytes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001005749 
   
  
    
    This study reports the whole-genome sequencing data for 20 inflammatory breast cancer patients, each of whom has one normal blood sample and one breast tumor sample. Overall, there are 40 files included in this study, in the format of BAM. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  40 
 
  
    EGAD00001005750 
   
  
    
    The dataset for white blood cell and cell-free DNA analyses for detection of residual disease in gastric cancer includes 169 bam files from targeted deep sequencing on the Illumina HiSeq2500.  The samples analyzed include genomic DNA from white blood cells and cell-free DNA from longitudinal blood collections of patients with gastric cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  167 
 
  
    EGAD00001005751 
   
  
    
    In this study we aim to characterise the landscape of mutation and clonal selection in the human pancreas. The study combines targeted sequencing and whole-genome sequencing of microbiopsies from the pancreas. The range of patients studied will include healthy individuals, both smokers and non-smokers, and patients with pancreatic ductal adenocarcinoma.
This dataset contains all the data available for this study on 2019-12-17. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  136 
 
  
    EGAD00001005753 
   
  
    
    Four micrograms of total RNA was used for cDNA library construction using the KAPA Stranded mRNA-Seq Kit (KR0960-v3.15), following the manufacturer's protocol. The adaptor-ligated libraries were enriched by 6 cycles of polymerase chain reaction (PCR). Libraries were sequenced using the NovaSeq 6000 with paired-end 151 bp reads. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
      Illumina NovaSeq 6000 
      
    
   
  55 
 
  
    EGAD00001005754 
   
  
    
    Five hundred fifty nanograms of genomic DNA were input for library preparation after fragmentation by Covaris S2, following the KAPA Hyper Prep Kit (KR0961-V1.14) protocols, with selection for a library size range of 250-450 bp. Three hundred nanograms of library DNA from each of 12 samples were normalized and combined into a single pool for exome capture using the xGen Lockdown Probes and Reagents based on their standard protocols. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 1500 
      
      Illumina NovaSeq 6000 
      
    
   
  117 
 
  
    EGAD00001005756 
   
  
    
    Paired-end DNA-seq FASTQ files from 16 carriers of the BMPR2 p.Arg491Gln mutation in a family affected by hereditary pulmonary arterial hypertension (HPAH). Whole genome sequencing of these samples was performed on an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). Each sample was multiplexed across flowcells and lanes, leading to a total of 86 FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001005757 
   
  
    
    Paired-end DNA-seq BAM files from 16 carriers of the BMPR2 p.Arg491Gln mutation in a family affected by hereditary pulmonary arterial hypertension (HPAH). Whole genome sequencing of these samples was performed on an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). FASTQ files were processed at the CNAG (Barcelona) using the GEM short-read aligner on the human genome version hs37d5, producing a total of 16 BAM files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001005758 
   
  
    
    VCF file from 16 carriers of the BMPR2 p.Arg491Gln mutation in a family affected by hereditary pulmonary arterial hypertension (HPAH). Whole genome sequencing of these samples was performed on an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). BAM files were processed at the CNAG (Barcelona) with their pipeline, including GATK v3.6 for genotyping and other tools such as snpEff for annotating variants, to produce this VCF file with a total of 9,643,070 variants, of which 7,891,370 are SNVs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001005759 
   
  
    
    Five hundred nanograms of genomic DNA were fragmented by Covaris S2. The fragmented DNA underwent end-repair, A-tailing at the 3 prime end, and adaptor ligation at the terminal ends with an IDT dual-indexed UMI adaptor system. The adapter-ligated library with a size range of 300-750 bp was selected by the dual-SPRI method. Twenty percent of the size-selected PCR-free libraries were enriched by 5 PCR cycles prior to library size assessment by Bioanalyzer Fragment Analyzer. The PCR-free libraries were quantified by qPCR, then denatured and diluted to optimal concentration. An Illumina NovaSeq 6000 was used for paired-end 151 bp sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001005760 
   
  
    
    Transcriptomics for samples obtained from six patients (MBR01, MBR03, MBR05, MBR07, MBR10, MBR11) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD00001005761 
   
  
    
    Bevacizumab is an approved anti-angiogenic drug for patients with metastasized colorectal cancer (mCRC) targeting VEGF. The survival benefit of anti-VEGF therapy in mCRC patients is limited to a few months and acquired resistance mechanisms are largely unknown. Using plasma DNA, we studied the evolution of tumor genomes in a cohort of patients with mCRC (n=150) and observed a recurrent focal amplification (8.7% of cases) on chromosome 13q12.2. Analysis of TCGA data (n=619) suggested an association with later stages, which we confirmed by longitudinal plasma analyses. We defined the minimally amplified region and studied the mechanistic consequences of copy number gain of the involved genes. The amplification of one gene, POLR1D, impacted cell proliferation, resulting in upregulation of VEGFA, an important regulator of angiogenesis which has been implicated in the resistance to bevacizumab. In several patients, we observed the emergence of this 13q12.2 amplicon under bevacizumab treatment, which was invariably associated with evolution of therapy resistance. Hence, we describe a novel resistance mechanism against a widely applied treatment in mCRC patients which will impact clinical management. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  38 
 
  
    EGAD00001005763 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001005764 
   
  
    
    Whole exome sequencing study for 8 pairs of primary NSCLCs and distant metastases. This dataset includes a total of 30 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001005765 
   
  
    
    RNA sequencing study for 8 pairs of primary NSCLCs and distant metastases. This dataset includes a total of 29 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001005766 
   
  
    
    Fifty-two newly generated gastric tumor specimens were subjected to targeted-exome and/or whole-transcriptome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  52 
 
  
    EGAD00001005767 
   
  
    
    The appearance of type 1 diabetes (T1D)-associated autoantibodies is the first and only measurable parameter to predict progression toward T1D in genetically susceptible individuals. However, autoantibodies indicate an active autoimmune reaction, wherein the immune tolerance is already broken. Therefore, there is a clear and urgent need for new biomarkers that predict the onset of the autoimmune reaction preceding autoantibody positivity or reflect progressive beta-cell destruction. Here we report the mRNA sequencing-based analysis of 306 samples including fractionated samples of CD4+ and CD8+ T cells as well as CD4, CD8 cell fractions and unfractionated PBMC samples longitudinally collected from seven children who developed beta-cell autoimmunity (case subjects) at a young age and matched control subjects. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  306 
 
  
    EGAD00001005768 
   
  
    
    This is a single-cell RNA sequencing study of PBMC samples from four Finnish children at risk of developing type 1 diabetes and their gender-, age- and HLA-matched control children. All four case children were positive for multiple islet-specific autoantibodies and two of them also progressed to clinical disease during follow-up, whereas the control children remained negative for all autoantibodies. Single-cell analysis confirmed some of the signatures obtained from the bulk data. It identified that the high IL32 seen in case samples in the bulk RNA-seq was contributed mainly by activated T cells and NK cells. Trajectory analysis of the scRNA-seq data suggested that IL32 expression increased as the T cells moved towards an activated state. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  8 
 
  
    EGAD00001005769 
   
  
    
    Interstitial deletion of the long arm of chromosome 5 (del(5q)) is the commonest structural genomic variant in myelodysplastic syndromes (MDS). Lenalidomide (LEN) is the treatment of choice for patients with del(5q) MDS, but half of the responding patients become resistant within two years. TP53 mutations are detected in ~20% of patients who become resistant to LEN. Our data show that patients who become resistant to LEN harbor either TP53 or RUNX1 mutations or loss of RUNX1 expression. Here we show that LEN-induced degradation of IKZF1 permits a RUNX1/GATA2 complex to drive megakaryocytic differentiation and consequent del(5q) MDS progenitor cell death via CRBN-mediated CSNK1A1 degradation. Overexpression of GATA2 is able to restore LEN sensitivity in the context of RUNX1 or TP53 mutations by enhancing LEN-induced megakaryocytic differentiation. Screening for TP53 and RUNX1 mutations or downregulation should identify patients resistant to LEN, and strategies to activate GATA2 may resensitize del(5q) MDS cells to LEN. 
    
   
  
    
   
  16 
 
  
    EGAD00001005770 
   
  
    
    The aim of this study is to reconstruct the phylogenetic development of childhood tumours 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  8 
 
  
    EGAD00001005772 
   
  
    
    Paired blood and saliva samples from five unrelated individuals were directly compared for quality of whole genome sequencing. Two (Sample Pairs 1 and 2) were female probands diagnosed with tetralogy of Fallot, a type of congenital heart disease, and three (Sample Pairs 3, 4 and 5) were male probands diagnosed with hypertrophic cardiomyopathy. WGS was performed using Illumina HiSeq X to a target average coverage depth of 30x and a read length of 150 bp. The resulting reads were not filtered for minimum quality in order to avoid losing possible contaminant reads. Sequencing read alignment was done using Isaac Aligner to human genome build hg19. Short variant calling, i.e. calling of single-nucleotide variants (SNVs) and small insertion-deletions (indels), was performed using Isaac Variant Caller with default parameters. 
    
   
  
    
   
  10 
 
  
    EGAD00001005773 
   
  
    
    We have sequenced whole genomes of 10 melanoma samples (1 cell line, A375, and 9 patient-derived short-term cultures). Libraries were prepared with 10X linked-read technology in order to obtain phase information and subsequently sequenced on Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001005774 
   
  
    
    Fastq files from amplicon sequencing in 106 Multiple Sclerosis patients and 105 healthy volunteers in CD4 T cells, CD8 T cells and genomic DNA using PE300 Illumina MiSeq. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  633 
 
  
    EGAD00001005775 
   
  
    
    This dataset contains 7 paired-end fastq files obtained with Illumina HiSeq and NextSeq sequencing of whole exomes relevant to a study of pseudodiastrophic dysplasia (PDD). It includes 3 patients from 2 unrelated families diagnosed with PDD, together with the four parents. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001005776 
   
  
    
    Deep WGS sequencing (160x) of 2 different sites of disease of a patient with a RET fusion positive cancer.
Amplicon sequencing of 19 other sites of the same patient for the RET fusion. 
    
   
  
    
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001005777 
   
  
    
    Whole genome sequencing data from four affected and one unaffected individuals from two families with familial adult myoclonic epilepsy, one of Sri Lankan origin and one of Indian origin. BAM files aligned to hg19 reference genome. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001005778 
   
  
    
    Aligned BAM files from NextSeq 500 targeted panel sequencing of 84 samples from matched tumour-normal pairs of 42 melanoma patients. The dataset consists of 30 non-responders and 12 responders to ICB. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  84 
 
  
    EGAD00001005779 
   
  
    
    Files from DNA sequencing from primary tumors and metastases from pancreatic cancer patients along with matched normal tissues. Sequencing files include those derived from whole exome sequencing as well as MSK-IMPACT sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  81 
 
  
    EGAD00001005780 
   
  
    
    Whole-exome sequencing for 95 PMBCL cases (including 21 with matching normal DNA) was performed using a targeted capture approach with the SureSelect Human All Exon V6+UTR bait (Agilent Technologies) followed by massively parallel sequencing of enriched fragments on the HiSeq2500 platform (Illumina). Five libraries were pooled per lane and a 125bp paired-end mode was used. Tumor and normal DNA samples were sequenced to an average of 115X (SD 24X). All reads were aligned to the human reference genome (hg19) using bwa-mem version 0.7.5a29 with optical and PCR duplicates removed using the Picard tool. 
    
   
  
    
   
  116 
 
  
    EGAD00001005781 
   
  
    
    Whole genome sequencing data for 101 BL patients and transcriptome sequencing for 82 (out of 101) BL patients. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  183 
 
  
    EGAD00001005782 
   
  
    
    This dataset contains 9 bam files of exome sequencing for an experiment of evolved resistance. Here a barcoded cell line (HCC827 - POT) has been treated under high concentrations of gefitinib (GEF) and trametinib (TRM) until resistance has evolved, as well as under control conditions (DMSO). The dataset contains exome sequencing of confluent cells for three replicates for each anti-cancer drug as well as two replicates of growth under DMSO conditions. The original barcoded cell line (POT) was also exome sequenced and is included in the cohort. Sequencing was performed on the Illumina NovaSeq platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001005784 
   
  
    
    CRISPR/Cas9 lethality screens in a set of Asian head and neck cancer cell lines to identify novel targets. 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  100 
 
  
    EGAD00001005785 
   
  
    
    The aim of this study is to describe the transcriptome of single arthritic cells. 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  510 
 
  
    EGAD00001005786 
   
  
    
    Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from the lung tissue of individuals with a variety of lung diseases (COPD, UIP, IPF, emphysema, pulmonary hypertension). This will allow us to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. Smoking is a large risk factor for developing many of these lung diseases, so we are particularly keen to determine whether there is evidence of a smoking signature in these patients. 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  190 
 
  
    EGAD00001005787 
   
  
    
    Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from breast tissue to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. We will sample from a wide age range of individuals (<20 to >70 years old) to determine whether these processes differ in pre- and post-menopausal women. We will also be comparing the tissue from healthy individuals (samples from breast reduction surgery) to those at elevated risk of breast cancer (mastectomy from BRCA1/2 patients) and those who have breast cancer (adjacent normal, distal normal, and tumour tissue from mastectomy). This will allow us to determine how these processes are different between these groups of individuals, and gain insight into the earliest stages of tumour development. 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  689 
 
  
    EGAD00001005788 
   
  
    
    We will be testing the hypothesis that MBD4 PTV germline carriers also show an increased number of C to T germline mutations in their offspring. 
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  39 
 
  
    EGAD00001005789 
   
  
    
    Samples prepared by LCM - 5 cases for pilot study. Bulk DNA not available. 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  16 
 
  
    EGAD00001005790 
   
  
    
    Falciparum malaria is clinically heterogeneous and yet in most cases the risk of life-threatening disease dramatically declines after the first few infections of life because children rapidly acquire disease tolerance (resistance to severe malaria without improved control of parasite burden). Identifying the factors that determine clinical outcome in a malaria-naive host is therefore paramount to reduce malaria mortality. However, the relative contribution of disease-causing variants of the Plasmodium var gene family versus pathogenic inflammatory cytokine cascades remains fiercely debated - we sought to reconcile these conflicting arguments by studying their interaction in vivo. To this end, two human challenge models were used to reveal the parasite-host interactions that underpin variation in falciparum malaria. To capture the diversity of human immune responses, each individual was analysed independently by tracking dynamic changes in their whole blood transcriptome through time. And to uncover evidence of preferential expansion of disease-causing variants, var gene expression was tracked in vivo from the start to end of infection. In this way, we could show that group A var genes are always expressed upon liver egress but in a minority population that does not increase over 10-days of blood cycling; there is no selection of disease-causing variants in the naive host. In fact, parasites do not respond in any way to differences or changes in host environment. On the other hand, host-intrinsic variation determines the intensity of inflammation and progression to clinical malaria. And furthermore, regulation of the interferon signaling network controls host fate. These data emphasise the role of human immune decision-making in shaping course & outcome of infection. This data is part of a pre-publication release. 
For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2020-01-15. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001005791 
   
  
    
    Whole exome sequence data in fastq format was aligned to the GRCh38 reference genome. Aligned sequence was preprocessed with GATK for indel realignment and base quality score recalibration. Duplicates were marked with Picard MarkDuplicates. Aligned sequence is in bam format. Details of the alignment can be found in the bam header. Tumour samples were classified as anaplastic, poorly differentiated or well-differentiated thyroid cancers. 
    
   
  
    
   
  - 
 
  
    EGAD00001005795 
   
  
    
    The study includes methylC-capture sequencing (MCC-Seq) on 94 sperm DNA samples derived from both fertile and infertile individuals who were recruited from the Men’s Health Clinic at the Royal Victoria Hospital, Montreal, Quebec. All the data were generated with 100bp paired-end reads using the Illumina NovaSeq 6000 systems. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  94 
 
  
    EGAD00001005796 
   
  
    
    The dataset for Multimodal Genomic Features Predict Outcome of Immune Checkpoint Blockade in Non-small Cell Lung Cancer includes 106 bam files from whole exome next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include matched tumor/normal samples from non-small cell lung cancer patients treated with immunotherapy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  106 
 
  
    EGAD00001005797 
   
  
    
    Fastq files of chromatin run-on (14 fibrolamellar carcinoma, 3 non-malignant liver; single-end) and transcriptome (23 fibrolamellar carcinoma, 2 non-malignant liver; paired-end) sequencing of fibrolamellar carcinoma 
    
   
  
    
      
      NextSeq 500 
      
    
   
  30 
 
  
    EGAD00001005798 
   
  
    
    The sequencing data provided in this study were enriched through liquid-phase hybridization capture. The dataset comprises 35 clinical cfDNA samples showing a dominant peak at 166 bp and 35 clinical cfDNA samples showing a dominant peak at 134/144 bp. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  70 
 
  
    EGAD00001005799 
   
  
    
    Bam and fastq files from RNA-seq of PDAC samples described in "Transcription phenotypes of pancreatic cancer are driven by genomic events during tumour evolution". 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  34 
 
  
    EGAD00001005800 
   
  
    
    RNA-seq of SMARCA2/4 knock-down prostate cancer cell lines (LNCaP and 22Rv1, 15 samples altogether). Dataset contains BAM files from RNA-seq performed using Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001005801 
   
  
    
    JAK and STAT alterations in CD30 positive LPD, panel sequencing, 12 cutaneous lymphoma patients, 40 samples 
    
   
  
    
      
      Illumina MiSeq 
      
      Ion Torrent PGM 
      
    
   
  40 
 
  
    EGAD00001005802 
   
  
    
    This dataset contains targeted sequencing of breast tumors with germline BRCA1/2 mutations (n = 30) and those without. Breast cancer related genes (n = 115) have been captured and sequenced. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001005803 
   
  
    
    RNA sequencing 
    
   
  
    
      
      unspecified 
      
    
   
  39 
 
  
    EGAD00001005804 
   
  
    
    Paired end shallow whole genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fractions in plasma DNA of renal cell carcinoma (RCC) patients (MonRec Cohort) 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  117 
 
  
    EGAD00001005805 
   
  
    
    Paired end shallow whole genome sequencing (sWGS) data of cell-free DNA from plasma from self-reporting healthy individuals (MonRec Cohort) 
    
   
  
    
      
      NextSeq 550 
      
    
   
  22 
 
  
    EGAD00001005806 
   
  
    
    Mutation analysis of 10 frequently mutated genes in renal cell carcinoma (BAP1, KDM5C, MET, MTOR, PBRM1, PIK3CA, PTEN, SETD2, TP53, VHL) in plasma DNA of RCC patients using a custom QIASeq panel (MonRec Cohort) 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  276 
 
  
    EGAD00001005807 
   
  
    
    Bulk RNA-sequencing was performed on CD4+ T cells isolated from the blood of visceral leishmaniasis patients (n = 12) and endemic controls (EC; n = 12). CD4+ T cells were obtained by magnetic-activated cell sorting (MACS). Alterations in the transcripts of T helper (Th) cells during infection were identified. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001005808 
   
  
    
    Raw whole exome sequencing data (fastq) for the GATCI project 
    
   
  
    
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001005809 
   
  
    
    Whole exome sequencing data for 381 TGA probands 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  381 
 
  
    EGAD00001005810 
   
  
    
    Raw RNA sequence data (fastq) for the GATCI project 
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD00001005812 
   
  
    
    Whole exome sequencing (WES) of tumor tissues from RCC patients (DIAMOND cohort) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  74 
 
  
    EGAD00001005813 
   
  
    
    A 2.077Mb (57306 probes) personalised capture panel [Tailored Panel Sequencing (TAPAS)] was designed based upon the somatic SNVs identified by WES of RCC patient FF and FFPE tissue samples and applied to cfDNA in plasma and urine. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  62 
 
  
    EGAD00001005814 
   
  
    
    Paired-end shallow whole genome sequencing (sWGS) data of cell-free DNA from plasma and urine from RCC patients (DIAMOND cohort) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  106 
 
  
    EGAD00001005815 
   
  
    
    Paired-end shallow whole genome sequencing (sWGS) data of tumor tissue from RCC patients (DIAMOND cohort) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  45 
 
  
    EGAD00001005816 
   
  
    
    This dataset includes whole genome sequence data for ChIPmentation assays (18 H3K4me3, 20 H3K27ac and 3 input samples) of human stimulated and cultured CD4+ Treg cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001005817 
   
  
    
    Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM and preprocessed with GATK for indel realignment and base quality score recalibration. Aligned sequence was analyzed with GATK HaplotypeCaller to generate germline variant calls. Variant calls are in VCF format. In total, there are 60 tumour samples from 38 patients, all with matched normal. Further details can be found in the vcf headers 
    
   
  
    
   
  - 
 
  
    EGAD00001005818 
   
  
    
    Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM and preprocessed with GATK for indel realignment and base quality score recalibration. Aligned sequence was analyzed with SomaticSniper to generate somatic variant calls. Variant calls are in VCF format. In total, there are 60 tumour samples from 38 patients, all with matched normal. Further details can be found in the vcf headers 
    
   
  
    
   
  - 
 
  
    EGAD00001005819 
   
  
    
    We performed whole exome sequencing of 8 samples derived from a patient with metastatic melanoma. These represent six different regions of a metastatic melanoma biopsy that was treated with anti-PD-1 inhibitor, one pre-treatment biopsy that was treatment naive and one post-PD-1 inhibitor treated lesion. Exome sequencing data was generated using methods as previously described, including library preparation using the Agilent SureSelect XT Target Enrichment protocol (#5190-8646) prior to sequencing on an Illumina HiSeq 2000/2500 v3 system using 76bp paired-end reads. Raw sequencing data was then processed using Saturn V, the next generation sequencing data processing and analysis pipeline developed by the Department of Genomic Medicine at the UT MD Anderson Cancer Center. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001005820 
   
  
    
    We performed RNA sequencing of 48 different regions sub-sampled from a metastatic melanoma biopsy that was treated with anti-PD-1 inhibitor. RNAseq was performed on samples with a minimum RNA integrity number (RIN) of 5.5 except for two cases (6A10 and 8A3) with RINs greater than 3. A minimum of 700ng of RNA were required for all samples undergoing RNAseq. Paired-end transcriptome reads were aligned using TopHat2, to the UCSC hg19 reference genome. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001005821 
   
  
    
    We performed deep targeted DNA sequencing for a panel of 265 cancer-related genes. This included subsampling 35 different regions of a metastatic melanoma biopsy that was treated with anti-PD-1 inhibitor. Samples with cancer cell purity greater than 80% based on pathologic assessment were used for cancer gene panel DNA sequencing. Mean sequencing coverage was 861x and paired-end reads in FASTQ format were generated by the Illumina pipeline and aligned to the reference human genome hg19 build using the Burrows-Wheeler Alignment Tool (BWA, v0.7.5) with default settings. Aligned reads were further processed using GATK with best practices for removing duplicates, indel removal and recalibration. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001005822 
   
  
    
    Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM and preprocessed with GATK for indel realignment and base quality score recalibration. Aligned sequence was analyzed with MuTect to generate somatic variant calls. Variant calls are in VCF format. In total, there are 60 tumour samples from 38 patients, all with matched normal. Further details can be found in the vcf headers. 
    
   
  
    
   
  - 
 
  
    EGAD00001005823 
   
  
    
    Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005824 
   
  
    
    Genome and transcriptome sequence data from a uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005825 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005826 
   
  
    
    Genome and transcriptome sequence data from a colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005827 
   
  
    
    Genome and transcriptome sequence data from a patient with cancer of unknown primary (upper GI or pulmonary), generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005828 
   
  
    
    Genome and transcriptome sequence data from a metastatic choroidal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005829 
   
  
    
    Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005830 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005831 
   
  
    
    Genome and transcriptome sequence data from a poorly differentiated adenocarcinoma more consistent with metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005832 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005833 
   
  
    
    Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005834 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005835 
   
  
    
    Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the esophagus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005836 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005837 
   
  
    
    Genome and transcriptome sequence data from a metastatic adenocarcinoma of unknown primary (upper GI?) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005838 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005839 
   
  
    
    Genome and transcriptome sequence data from a metastatic choroid melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005840 
   
  
    
    Genome and transcriptome sequence data from a carcinoma primary unknown origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005841 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005842 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005843 
   
  
    
    Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005844 
   
  
    
    Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005845 
   
  
    
    Genome and transcriptome sequence data from a metastatic unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005846 
   
  
    
    Genome and transcriptome sequence data from a cavernous sinus meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005847 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005848 
   
  
    
    Genome and transcriptome sequence data from a chondrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005849 
   
  
    
    Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005850 
   
  
    
    Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  3 
 
  
    EGAD00001005851 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005852 
   
  
    
    Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005853 
   
  
    
    Genome and transcriptome sequence data from a lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005854 
   
  
    
    Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005855 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005856 
   
  
    
    Genome and transcriptome sequence data from a cervical adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005857 
   
  
    
    Genome and transcriptome sequence data from a sex-cord stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005858 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005859 
   
  
    
    Genome and transcriptome sequence data from an endometrioid ovary carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005860 
   
  
    
    Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005861 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005862 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005863 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005864 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005865 
   
  
    
    Genome and transcriptome sequence data from a tongue squamous cell carcinoma (head and neck) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005866 
   
  
    
    Genome and transcriptome sequence data from a neuroendocrine tumor (GIC) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005867 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005868 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005869 
   
  
    
    Genome and transcriptome sequence data from a transformed diffuse large B cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005870 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005871 
   
  
    
    Genome and transcriptome sequence data from a chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005875 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005876 
   
  
    
    Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005877 
   
  
    
    Genome and transcriptome sequence data from a malignant epithelial mesothelioma (THR) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005878 
   
  
    
    Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005879 
   
  
    
    Genome and transcriptome sequence data from an ovarian cystadenocarcinoma low grade patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005880 
   
  
    
    Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005881 
   
  
    
    Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005882 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005883 
   
  
    
    Genome and transcriptome sequence data from a malignant choroid melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005884 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005885 
   
  
    
    Genome and transcriptome sequence data from a large intestine-colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005886 
   
  
    
    Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005887 
   
  
    
    Genome and transcriptome sequence data from a parotid gland adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005888 
   
  
    
    Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005889 
   
  
    
    Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005890 
   
  
    
    Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005891 
   
  
    
    Genome and transcriptome sequence data from a thyroid Hurthle cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005892 
   
  
    
    Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005893 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005894 
   
  
    
    Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005895 
   
  
    
    Genome and transcriptome sequence data from a cecum adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005896 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005897 
   
  
    
    Genome and transcriptome sequence data from a metastatic adrenocortical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005898 
   
  
    
    Genome and transcriptome sequence data from a malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005899 
   
  
    
    Genome and transcriptome sequence data from an endometrial stromal sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005900 
   
  
    
    Genome and transcriptome sequence data from a hemangioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005901 
   
  
    
    Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005902 
   
  
    
    Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005903 
   
  
    
    Genome and transcriptome sequence data from a metastatic gallbladder adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005904 
   
  
    
    Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005905 
   
  
    
    Genome and transcriptome sequence data from an adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005906 
   
  
    
    Genome and transcriptome sequence data from a metastatic clear cell carcinoma of the ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005907 
   
  
    
    Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005908 
   
  
    
    Genome and transcriptome sequence data from a squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005909 
   
  
    
    Genome and transcriptome sequence data from an alveolar soft part sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005910 
   
  
    
    Genome and transcriptome sequence data from a gastroesophageal junction adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005911 
   
  
    
    Genome and transcriptome sequence data from a colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005912 
   
  
    
    Genome and transcriptome sequence data from an unknown tissue unknown histology patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005913 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001005914 
   
  
    
    Sequence data in FASTQ format was aligned to the GRCh38 reference genome with BWA-MEM. Aligned sequence was preprocessed with GATK for indel realignment and base quality score recalibration. Duplicates were marked with Picard MarkDuplicates. Aligned sequence is in BAM format. Details of the alignment can be found in the BAM header. 
    
   
  
    
   
  - 
 
  
    EGAD00001005915 
   
  
    
    Data supporting: "Repurposing of KLF5 activates a cell cycle signature during the progression from Barrett’s Oesophagus to Oesophageal Adenocarcinoma." Rogerson et al.
RNA-seq data
2 samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001005916 
   
  
    
    Exome sequences were aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK HaplotypeCaller to generate germline variant calls. Variant calls are in VCF format. Details for the call can be found in the VCF header. 
    
   
  
    
   
  - 
 
  
    EGAD00001005917 
   
  
    
    Exome sequences were aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK/MuTect to generate somatic variant calls. Somatic variant calls are in VCF format. Details for the MuTect call can be found in the VCF header. 
    
   
  
    
   
  - 
 
  
    EGAD00001005918 
   
  
    
    Exome sequences were aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK/SomaticSniper to generate somatic variant calls. Somatic variant calls are in VCF format. Details for the call can be found in the VCF header. 
    
   
  
    
   
  - 
 
  
    EGAD00001005919 
   
  
    
    We will be using the G&T method to sequence single-cell genomes and transcriptomes derived from the FS13B iPSC line. The cell cycle state of each single cell is known. Hence, we will analyse the genome and transcriptome of single cells from each cell cycle state to generate a copy number profile and a transcriptome profile per cell cycle stage: G1, S, G2, S. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  192 
 
  
    EGAD00001005920 
   
  
    
    Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2 mutations. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Targeted data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  49 
 
  
    EGAD00001005921 
   
  
    
    Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2 mutations. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001005922 
   
  
    
    Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast tissue and, where possible, tumour tissue. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and to compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  46 
 
  
    EGAD00001005923 
   
  
    
    Sequencing of LCM-derived microbiopsies from 30 women who underwent mastectomies due to breast cancer. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Targeted data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with germline BRCA1/2 mutations. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  29 
 
  
    EGAD00001005924 
   
  
    
    Sequencing of LCM-derived microbiopsies from 30 women who underwent mastectomies due to a breast cancer diagnosis. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with germline BRCA1/2 mutations. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001005925 
   
  
    
    Sequencing of LCM-derived microbiopsies from an explanted lung from a pulmonary fibrosis patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Targeted sequencing will be conducted on samples to identify drivers of interest and the clonality of the samples; well-performing samples will be sent for subsequent whole-genome sequencing. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2020-01-29. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  27 
 
  
    EGAD00001005926 
   
  
    
    Raw whole genome sequencing data (fastq) for the GATCI project 
    
   
  
    
      
      HiSeq X Ten 
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001005928 
   
  
    
    This dataset contains cell-free reduced representation bisulfite sequencing data from 60 pediatric cancer samples. Files are provided in fastq format. Samples were sequenced on a NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  60 
 
  
    EGAD00001005929 
   
  
    
    This is the dataset used in the benchmarking of ProSolo, a new probabilistic single nucleotide variant caller for single cell DNA sequencing data that provides control over the false discovery rate of different single cell events at genomic sites (e.g. alternative allele presence or allele dropout). It provides the whole exome sequencing data used in assessing ProSolo's performance that is not available elsewhere, namely bulk whole exome sequencing data of a patient with a constitutional mismatch repair defect (MSH6-) and their parents and siblings, a bulk whole exome sample of granulocytes from that patient and 5 single granulocytes whole exome sequenced after whole genome amplification. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001005931 
   
  
    
    RNASeq data from paired malignant/benign prostate tissues 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001005932 
   
  
    
    This data set contains small RNA-sequencing and RNA-sequencing data from subependymal giant cell astrocytomas (SEGA) resected from tuberous sclerosis complex patients. Small RNA-sequencing and RNA-sequencing were performed on the same set of SEGAs (n=19) and periventricular controls (n=8). For full details on library preparation and patients please refer to the paper "The coding and non-coding transcriptional landscape of subependymal giant cell astrocytomas." (PMID: 31834371 DOI: 10.1093/brain/awz370). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001005933 
   
  
    
    Fastq files of Reduced Representation Bisulfite Sequencing data (HaeIII, covering about 7 million CpGs per sample) of induced pluripotent stem cells (iPSC), definitive endoderm (DE) and hepatocyte-like cells (HLC).
The dataset comprises data generated by the in vitro differentiation protocol Cellartis (Takara Bio, "CEL", n = 4). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001005934 
   
  
    
    Fastq files of ATAC-seq data of induced pluripotent stem cells (iPSC), definitive endoderm (DE), hepatocyte-like cells (HLC) and primary human hepatocytes (PHH).
The dataset comprises data from two different in vitro differentiation protocols: Cellartis (Takara Bio, "CEL", n = 4) and as described by Wang et al. (PMID: 28287600, "HAY", n = 1), as well as from 3 PHH donors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001005935 
   
  
    
    Fastq files of mRNA-seq data of induced pluripotent stem cells (iPSC), definitive endoderm (DE) and hepatocyte-like cells (HLC). The dataset comprises data from the in vitro differentiation protocol Cellartis (Takara Bio, "CEL", n = 4) and several interventions (11x3 replicates). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD00001005936 
   
  
    
    WXS files for Mullighan Leventaki ALCL paper titled "Integrative molecular analysis of pediatric Anaplastic large cell lymphoma reveals subtypes with distinct immune suppression signatures." 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001005937 
   
  
    
    Raw whole exome sequencing data from blood samples drawn from related female participants presenting with severe congenital neutropenia. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001005938 
   
  
    
    This dataset contains 3 pairs of exomes, germline (from whole blood) and patient-derived xenograft (PDX), from human pancreatic ductal adenocarcinoma patients. The data is referred to in the publication: "Pro-immunogenic impact of MEK inhibition synergizes with agonist anti-CD40 immunostimulatory antibodies in tumor therapy" (Nature Communications, 2020). 
Abstract: Cancer types with lower mutational load and a non-permissive tumor microenvironment are intrinsically resistant to immune checkpoint blockade. While the combination of cytostatic drugs and immunostimulatory antibodies constitutes an attractive concept for overcoming this refractoriness, suppression of immune cell function by cytostatic drugs may limit therapeutic efficacy. Here we show that targeted inhibition of mitogen-activated protein kinase (MAPK) kinase (MEK) does not impair dendritic cell-mediated T-cell priming and activation. Accordingly, combining MEK inhibitors (MEKi) with agonist antibodies (Abs) targeting the immunostimulatory CD40 receptor resulted in potent synergistic anti-tumor efficacy. Detailed analysis of the mechanism of action of MEKi GDC-0623 by means of flow cytometric analysis of the tumor immune infiltrate and whole tumor transcriptomics showed that, in addition to its cytostatic impact on tumor cells, this drug exerts multiple pro-immunogenic effects, including the suppression of M2-type macrophages, myeloid derived suppressor cells and CD4+ T-regulatory cells. In addition, MEKi was found to induce tumor-cell intrinsic interferon signaling, which contributed to antigen presentation by tumor cells. Finally, the tumoricidal impact of MEKi involves the activation of multiple pro-inflammatory pathways involved in immune cell effector function in the tumor microenvironment. Our data therefore indicate that the combination of MEK inhibition with agonist anti-CD40 Ab is a promising therapeutic concept, especially for the treatment of mutant Kras-driven tumors such as pancreatic ductal adenocarcinoma. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001005941 
   
  
    
    Paired melanoma tumor and normal (PBMC) WES data from a cohort of 26 patients subsequently treated with combined immune checkpoint blockade. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001005945 
   
  
    
    50 paired benign/cancer samples from prostate tissue generated in 2 different runs - on 3 plates on the IonTorrent Proton.
Total of 200 fastq.gz single end runs. 
Read length ~300 bp. 
%GC  44 
Sequences per file: approx. 1 million. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  100 
 
  
    EGAD00001005946 
   
  
    
    Fastq files of deeply sequenced single cell RNA-seq data (Smartseq2, approx. 2 million reads / sample) of hepatocyte-like cells (HLC) and primary human hepatocytes (PHH).
The dataset comprises data from two different in vitro differentiation protocols: Cellartis (Takara Bio, "CEL", n = 3) and as described by Wang et al. (PMID: 28287600, "HAY", n = 1), as well as PHH from 3 donors. Each replicate comprises 96 single cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001005947 
   
  
    
    The dataset contains exome sequencing fastq files from 5 ovarian cancer patients, with tumor samples paired with normal blood samples. Three tumor samples were sequenced from each patient: a biopsy sample ("-1" suffix in the file name), a local sample (multiple regions around the biopsy pooled together, with the "-2" suffix in the file name), and a global sample (multiple regions from the tumor pooled together, with a "-3" suffix in the file name). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  20 
 
  
    EGAD00001005948 
   
  
    
    Whole genome transcriptome poly-A selected, strand-specific, 100bp paired-end RNA sequencing of post-mortem brain tissue from prefrontal cortex and orbitofrontal cortex was performed. Brain tissue samples were collected from four different biobanks in England and the USA. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  223 
 
  
    EGAD00001005949 
   
  
    
    This study assessed molecular determinants of response in a cohort of patients with AML who were treated with venetoclax in combination with either DNA methyltransferase inhibitors or low dose cytarabine. RNA sequencing was performed on 31 patients from three different response classes [Group A - Durable remission (n=10), Group B - Relapsed (n=10) and Group C - Refractory (n=11)]. Library preparation and sequencing were performed at the Australian Genome Research Facility, using the Truseq Stranded mRNA library kit. Technical and batch replicate samples are included. Gene count data are provided with the original publication. The use of the sequencing data is subject to a data transfer agreement and is restricted to ethically approved research into blood cell malignancies; the data cannot be used to assess germline variants. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001005950 
   
  
    
    Gray Platelet Syndrome (GPS) is a rare recessive bleeding disorder resulting from biallelic variants in NBEAL2. As part of a comprehensive evaluation of the phenotype and genotype in 47 patients with GPS, four different blood cell-types (platelets, neutrophils, monocytes, and CD4-lymphocytes) were evaluated using bulk RNA-seq in five patients and five controls. These data are deposited in this archive in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  40 
 
  
    EGAD00001005951 
   
  
    
    RNASeq files for Mullighan Leventaki ALCL Project paper titled "Integrative molecular analysis of pediatric Anaplastic large cell lymphoma reveals subtypes with distinct immune suppression signatures." 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001005952 
   
  
    
    Familial Multiple Sclerosis study dataset, including variant calling files from 138 samples with three different phenotypes: Multiple Sclerosis (MS), other Autoimmune Diseases (AID) and unaffected individuals. 
    
   
  
    
   
  138 
 
  
    EGAD00001005953 
   
  
    
    Part of the DEEP project results led to the publication of 'Integrative analysis of single-cell expression data reveals distinct regulatory states in bidirectional promoters', Epigenetics & Chromatin (2018), Fatemeh et al., DOI: 10.1186/s13072-018-0236-7, PMID: 30414612, PMCID: PMC6230222. This dataset contains the subset of DEEP data related to that study. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001005954 
   
  
    
    Additional histone modification data, not yet released as part of IHEC, for cell line 01_HepG2_LiHG_Ct1, H3K122ac. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001005955 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  7 
 
  
    EGAD00001005956 
   
  
    
    Whole exome sequencing of tumors and paired adjacent uninvolved tissues from 222 early stage NSCLC patients, in order to identify genomic drivers present in early-stage non-small cell lung cancer and determine the overall tumor mutational burden in early-stage non-small cell lung cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  540 
 
  
    EGAD00001005957 
   
  
    
    The dataset referenced by EGA Study ID EGAS00001004208 includes 20 human exome sequencing datasets and 11 human RNA sequencing datasets from tumor or normal tissues. Each sequencing dataset includes two paired-end short-read files in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001005958 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  11 
 
  
    EGAD00001005959 
   
  
    
    In this experiment we investigated the effect of HDAC3 inhibition on the transcriptome of IFNg-primed macrophages under different tolerization conditions. Peripheral blood mononuclear cells (PBMCs) were isolated from 3 healthy donors. PBMCs were isolated from whole blood of healthy donors using Ficoll gradient (Invitrogen). Monocytes (CD14+ cells) were positively selected from PBMCs using CD14 Microbeads according to the manufacturer’s instructions (Miltenyi Biotec). Monocytes were subsequently treated with or without 500 nM HDAC3i (ITF3100) for 30 minutes prior to overnight IFNg priming (50 ng/mL). Cells were then kept without LPS (non-LPS; N), treated with 10 ng/mL LPS once (non-tolerized; NT), or treated with LPS twice (tolerized; T; second LPS concentration: 100 ng/mL). In total, there were 18 samples included. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  18 
 
  
    EGAD00001005960 
   
  
    
    The immune microenvironment of hepatocellular carcinoma (HCC) is poorly characterized. Combining two single-cell RNA sequencing technologies, we produced transcriptomes of CD45+ immune cells for HCC patients from five immune-relevant sites: tumor, adjacent liver, hepatic lymph node (LN), blood, and ascites.
This dataset contains the Smart-seq2 data of this study. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001005961 
   
  
    
    The immune microenvironment of hepatocellular carcinoma (HCC) is poorly characterized. Combining two single-cell RNA sequencing technologies, we produced transcriptomes of CD45+ immune cells for HCC patients from five immune-relevant sites: tumor, adjacent liver, hepatic lymph node (LN), blood, and ascites.
This is the droplet data of this study 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  19 
 
  
    EGAD00001005963 
   
  
    
    RNA seq analysis of 6 CUP metastases (each in triplicate), analysed by paired sequencing with NextSeq 500.
Whole exome sequencing of 15 CUP metastases, analysed by paired sequencing with NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001005964 
   
  
    
    This dataset includes whole transcriptome data of human stimulated and cultured CD4+ Treg cells (39 samples). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001005965 
   
  
    
    Single-cell ATAC-seq data for 5 CLL samples (2 controls, 3 tumor) of the CancerEpiSys-PRECiSe project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  5 
 
  
    EGAD00001005966 
   
  
    
    Tagged-WGBS for 3 Naive B Cell samples of the CancerEpiSys-PRECiSe project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001005967 
   
  
    
    ATAC-seq data for 26 CLL samples (7 controls, 19 tumor) of the CancerEpiSys-PRECiSe project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001005968 
   
  
    
    Long RNA data for 27 CLL samples (8 controls, 19 tumor) of the CancerEpiSys-PRECiSe project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  27 
 
  
    EGAD00001005969 
   
  
    
    ChIPseq data for 31 CLL samples (12 controls, 19 tumor) of the CancerEpiSys-PRECiSe project; containing histone H3, histone modifications and transcription factor binding sites (CTCF, EBF1). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 550 
      
    
   
  31 
 
  
    EGAD00001005970 
   
  
    
    WGBS data for 75 paired fastq, spread over 31 samples (4 healthy T-cell, 7 healthy B-cell, 20 B-cell CLL tumors) of the CancerEpiSys-PRECiSe project. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  31 
 
  
    EGAD00001005971 
   
  
    
    This dataset contains 14 paired-end FASTQ sequences from mRNA-Seq on single human M-II stage oocytes that were collected from ovarian tissue from unstimulated patients undergoing fertility preservation treatments due to cancer diagnoses, which did not influence ovarian function. Cumulus-oocyte complexes were matured in vitro according to Gruhn et al. (Science 365: 1466-1469) and flash frozen for short-term storage prior to lysis, RNA extraction, full-length cDNA preparation and amplification using the Ultra-low-input SMART-Seq2 v4 kit from Takara Clontech. These cDNAs were then used to prepare libraries for sequencing according to the Nextera XT DNA library preparation kit from Illumina. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  14 
 
  
    EGAD00001005972 
   
  
    
    Dataset contains CYP2D6 sequencing data of 566 individuals who used tamoxifen as adjuvant breast cancer therapy. Phenotype data consists of the ratio between the metabolites endoxifen and desmethyltamoxifen (Metabolic ratio (MR)) as a proxy for CYP2D6 enzyme activity. Each sample is linked to one bam-file containing the CYP2D6 sequence. 
    
   
  
    
      
      PacBio RS II 
      
    
   
  2 
 
  
    EGAD00001005973 
   
  
    
    We are interested in inter-individual variation in transcriptional response to immune checkpoint blockade. We have analysed poly A purified RNA expression from CD8 T cells (297 transcriptomes in total) isolated from metastatic melanoma patients (n=106) prior to and during treatment with either single agent (Pembrolizumab) or combination (Ipilimumab/ Nivolumab) immune checkpoint blockade. We compare expression at different stages of treatment and additionally contrast this with that from healthy controls (n=68). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  297 
 
  
    EGAD00001005974 
   
  
    
    Oxford Nanopore long-read sequencing of the A17-LAxillaryLN2Met-23312 PELICAN sample, identified as D051965 in the Pan-Cancer Analysis of Whole Genomes study and as PD13412a by the prior Gundem et al. whole genome sequencing study (PMID 25830880). Data used to support Figure 6 in PubMed ID 32025007, "Pan-Cancer Analysis of Whole Genomes Consortium." Nature 2020 578:82-93. 
    
   
  
    
      
      PromethION 
      
    
   
  1 
 
  
    EGAD00001005975 
   
  
    
    Genome Asia VCF files 
    
   
  
    
   
  1163 
 
  
    EGAD00001005976 
   
  
    
    This dataset contains NanoString gene expression of PBMC from patients from IMvigor210, IMvigor211 and IMmotion150 cohorts 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001005977 
   
  
    
    Single Cell RNAseq of blood and tumor from renal cancer patients 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001005978 
   
  
    
    To identify the therapeutic targets in a treatment-refractory cancer patient, we performed single-cell RNA sequencing for 3,115 cells from primary bladder cancer (BC159-T#3) and patient-derived xenografts (BC159-T#3-PDX-vehicle and BC159-T#3-PDX-tipifarnib). Matched time-series bulk tumor tissues were also sequenced using whole exome target probe (WES) and whole transcriptome target probe (WTS). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001005981 
   
  
    
    These are the raw sequencing files for the 50 brain tumour, 3 extracranial tumour and 34 matched normals for the patients in the discovery cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  87 
 
  
    EGAD00001005982 
   
  
    
    These are the variant calls for the 50 brain tumour samples in the discovery cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  50 
 
  
    EGAD00001005983 
   
  
    
    These are the Sequenza copy number calls for the 30 brain tumour samples within the discovery cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001005984 
   
  
    
    Raw sequencing files for the 18 (brain tumour-only) samples within the external validation cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  18 
 
  
    EGAD00001005985 
   
  
    
    These are the raw sequencing files for the orthogonal validation brain tumour samples in the discovery cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00001005987 
   
  
    
    Whole transcriptome and targeted DNA sequencing (Ampliseq) of pediatric low-grade glioma samples at the Hospital for Sick Children 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  101 
 
  
    EGAD00001005990 
   
  
    
    Sequencing of LCM-derived microbiopsies from the explanted lung of a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2020-02-20. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001005991 
   
  
    
    Sequencing of LCM-derived microbiopsies from the explanted lung of a pulmonary fibrosis patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2020-02-20. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  18 
 
  
    EGAD00001005992 
   
  
    
    Using whole genome sequencing of lymphocytes excised from human tissue using laser capture microscopy (LCM), we identify the mutations arising in these microenvironments. This work will contribute towards developing a catalogue of mutations present in tissue resident lymphocytes across a range of tissues, and will characterize the mutational signatures that result from each microenvironment. 
This dataset contains all the data available for this study on 2020-02-20. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001005993 
   
  
    
    The aim of this study is to define the mutational landscape of human liver tumours. 
This dataset contains all the data available for this study on 2020-02-20. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001005994 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. 
Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. 
Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
This dataset contains all the data available for this study on 2020-02-20. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001005995 
   
  
    
    The study will use WGS to aid in benchmarking different culture conditions in a set of genetically annotated human organoid lines. The data will be used to assess whether any clonal differences are introduced when culturing these lines in different conditions. 
This dataset contains all the data available for this study on 2020-02-20. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  30 
 
  
    EGAD00001005996 
   
  
    
    Bacterial isolation in infected brains in patients with Huntington's disease. Here we used next generation sequencing of 16S ribosomal RNA gene PCR amplicons (NGS 16S amplicon analysis) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001005998 
   
  
    
    106 bulk RNA-seq samples of primary human keratinocytes 
8 single cell RNA-seq samples of human epidermis 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  112 
 
  
    EGAD00001005999 
   
  
    
    Contains 46 aGCT tumor sample WGS BAMs from 33 patients and corresponding 33 germline reference WGS BAMs from those 33 patients 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  79 
 
  
    EGAD00001006000 
   
  
    
    Single cell RNA-seq profiling of ~62k purified CD8+ T cell transcriptomes, from six healthy older adult donors using 10x Genomics. Cells from each donor were separated based on their IL-7R protein expression (i.e. CD8+ IL7R+ and CD8+ IL7R- T cells). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001006001 
   
  
    
    This dataset includes 52 samples from 19 individuals with pancreatic neuroendocrine tumours. These samples were analyzed using RNA-sequencing, whole-exome sequencing, shallow (~0.3x) whole-genome sequencing, and a 21-gene panel targeted capture sequencing. Everything was sequenced using the Illumina HiSeq and is provided in BAM or FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  51 
 
  
    EGAD00001006003 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  144 
 
  
    EGAD00001006004 
   
  
    
    Whole-exome sequencing dataset for the Australian Ovarian Cancer Study (AOCS) and The Jikei University School of Medicine (JIKEI) ovarian clear cell carcinoma (OCCC) clinical outliers project, representing 10 donors. Consists of 20 fastq files: one normal and one tumour from each patient. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  20 
 
  
    EGAD00001006005 
   
  
    
    liver cancer paired with normal controls, viral and non-viral origin 
    
   
  
    
   
  108 
 
  
    EGAD00001006006 
   
  
    
    Illumina Nextseq 500 whole transcriptome RNAseq from PBMCs - run1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001006007 
   
  
    
    In this study, we evaluated the effect of preservation agents on the methylation pattern of cell-free DNA. The methylation pattern was assessed with cell-free reduced representation bisulfite sequencing (cf-RRBS). All 45 samples were sequenced on a NovaSeq 6000, and samples are provided as raw fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001006008 
   
  
    
    RNAseq was performed on CDX, CDX-derived cell line and LNCaP cell line, with triplicates. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  26 
 
  
    EGAD00001006009 
   
  
    
    WES was performed on 2 TURP, 6 biopsies, 1 CDX, 1 cell line, 6 CTCs, 1 DNA germline and 1 WBC from prostate cancer 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  18 
 
  
    EGAD00001006010 
   
  
    
    This dataset consists of 20 fastq files in total from exome and myeloid gene panel sequencing of 15 carriers of germline RUNX1 mutations from 10 different families. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Ion Torrent PGM 
      
      Ion Torrent Proton 
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001006011 
   
  
    
    8 matched pair melanocytic nevi 
Agilent SureSelect Human All Exon V5 plus UTR 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001006012 
   
  
    
    The dataset is comprised of 3 blood plasma samples (Patient 1-3_plasma_cfDNA) paired with genomic DNA (Patient 1-3_tumor gDNA) from the corresponding primary neuroblastoma from three patients. For all these samples whole-exome sequencing (WES) data have been generated. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001006013 
   
  
    
    Three technical replicates of FACS-sorted T cells (CD45+CD3+) and one replicate of FACS-sorted tumor cells (MCSP+) were loaded at a target of 10,000 cells per lane on the 10X Genomics Chromium Controller with the single cell 5’ Immune Repertoire and Gene Expression profiling kit. In total, we loaded ~30,000 individual tumor infiltrating lymphocytes (TILs) and ~10,000 melanoma cells on the 10X platform (10X Genomics, CA, USA). Reverse transcription, TCR enrichment, and library preparations were performed according to the 10X Genomics 5’ V(D)J protocol revision C. Transcriptome libraries were pooled and sequenced on the Illumina NovaSeq 6000 S2 flow cell with 26 R1, 8 i7, and 91 R2 cycles respectively. The TCR libraries were pooled and sequenced on the Illumina MiSeq V2 150 cycles paired-end. Single cell transcriptomic and TCR data were processed with the 10X Genomics Cell Ranger Pipeline version 2.2.0 with the software-provided GRCh38 reference transcriptomes. After quality control, RNAseq profile data were available from 6267 immune and 4303 melanoma cells. Downstream processing and visualization were carried out with Seurat and tSNE plots. 
    
   
  
    
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001006014 
   
  
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001006016 
   
  
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001006017 
   
  
    
    Single Cell RNA-Seq of Primary GBM. Gender: Female, Age: 57. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006018 
   
  
    
    Mixed sample of scRNA-Seq of primary low-grade glioma. Gender: Male, Ages: 34, 44. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006019 
   
  
    
    Single Cell-RNA Seq of Wildtype Primary GBM for Female, Age 50. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006020 
   
  
    
    Primary diffuse astrocytoma G3 Male, 74 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006022 
   
  
    
    The motor cortex is the earliest affected brain region in ALS. This dataset contains total RNA sequencing (stranded, 2x101bp) data derived from the motor cortex of 11 sporadic ALS patients and 8 healthy controls. 
    
   
  
    
   
  19 
 
  
    EGAD00001006023 
   
  
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001006024 
   
  
    
    Paired-end FASTQ files of whole genome sequence data from 40 Roma individuals. This dataset contains the FASTQ files of reads generated on the Illumina HiSeq X at ~30X coverage.
10 Macedonian Roma - Balkan Roma
10 Spanish Roma - North/Western Roma
10 Hungarian Roma - Vlax and Romungro Roma
 5 Lithuanian Roma - North/Western Roma
 5 Ukrainian Roma - Romungro Roma. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  40 
 
  
    EGAD00001006025 
   
  
    
    We have performed a comprehensive and integrative genomic study of mantle cell lymphoma (MCL) to elucidate the features that may determine the different clinical and biological behavior of the two molecular subtypes of this lymphoma, conventional (cMCL) and leukemic non-nodal MCL (nnMCL). These data, integrated with epigenomics and transcriptomics, have allowed us to uncover novel molecular mechanisms in the origin and development of these tumors and provide relevant information to stratify patients into different risk groups. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  114 
 
  
    EGAD00001006026 
   
  
    
    Immune checkpoint inhibitors targeting the PD-1 pathway have transformed the management of many advanced malignancies, including clear cell renal cell carcinoma (ccRCC), but the drivers and resistors of PD-1 response remain incompletely elucidated. Here, we analyzed 592 tumors collected from advanced ccRCC patients enrolled in prospective clinical trials of treatment with PD-1 blockade (or mTOR inhibition as control arm) by whole-exome and RNA-sequencing, integrated with immunofluorescence analysis, to define the somatic alteration landscape of late-stage ccRCC and to uncover the immunogenomic determinants of therapeutic response. While conventional genomic markers (tumor mutation burden, neoantigen load) and degree of CD8+ T cell infiltration were not associated with clinical response, we discovered numerous chromosomal alterations in advanced ccRCC associated with response or resistance to PD-1 blockade. These advanced tumors were highly CD8+ T cell infiltrated, with only 22% and 5% with an immune desert and immune excluded phenotype, respectively. Our analysis revealed that CD8+ infiltrated tumors are depleted of favorable PBRM1 mutations and are enriched for unfavorable chromosomal losses of 9p21.3 when compared to non-infiltrated tumors. These data demonstrate how the interplay of somatic alterations and immunophenotypes impacts therapeutic efficacy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  53 
 
  
    EGAD00001006027 
   
  
    
    Immune checkpoint inhibitors targeting the PD-1 pathway have transformed the management of many advanced malignancies, including clear cell renal cell carcinoma (ccRCC), but the drivers and resistors of PD-1 response remain incompletely elucidated. Here, we analyzed 592 tumors collected from advanced ccRCC patients enrolled in prospective clinical trials of treatment with PD-1 blockade (or mTOR inhibition as control arm) by whole-exome and RNA-sequencing, integrated with immunofluorescence analysis, to define the somatic alteration landscape of late-stage ccRCC and to uncover the immunogenomic determinants of therapeutic response. While conventional genomic markers (tumor mutation burden, neoantigen load) and degree of CD8+ T cell infiltration were not associated with clinical response, we discovered numerous chromosomal alterations in advanced ccRCC associated with response or resistance to PD-1 blockade. These advanced tumors were highly CD8+ T cell infiltrated, with only 22% and 5% with an immune desert and immune excluded phenotype, respectively. Our analysis revealed that CD8+ infiltrated tumors are depleted of favorable PBRM1 mutations and are enriched for unfavorable chromosomal losses of 9p21.3 when compared to non-infiltrated tumors. These data demonstrate how the interplay of somatic alterations and immunophenotypes impacts therapeutic efficacy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001006028 
   
  
    
    Genomic characterization (through whole-exome sequencing) of circulating tumor cells, bone marrow clonal plasma cells and extramedullary plasmacytomas from multiple myeloma patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  214 
 
  
    EGAD00001006029 
   
  
    
    Immune checkpoint inhibitors targeting the PD-1 pathway have transformed the management of many advanced malignancies, including clear cell renal cell carcinoma (ccRCC), but the drivers and resistors of PD-1 response remain incompletely elucidated. Here, we analyzed 592 tumors collected from advanced ccRCC patients enrolled in prospective clinical trials of treatment with PD-1 blockade (or mTOR inhibition as control arm) by whole-exome and RNA-sequencing, integrated with immunofluorescence analysis, to define the somatic alteration landscape of late-stage ccRCC and to uncover the immunogenomic determinants of therapeutic response. While conventional genomic markers (tumor mutation burden, neoantigen load) and degree of CD8+ T cell infiltration were not associated with clinical response, we discovered numerous chromosomal alterations in advanced ccRCC associated with response or resistance to PD-1 blockade. These advanced tumors were highly CD8+ T cell infiltrated, with only 22% and 5% with an immune desert and immune excluded phenotype, respectively. Our analysis revealed that CD8+ infiltrated tumors are depleted of favorable PBRM1 mutations and are enriched for unfavorable chromosomal losses of 9p21.3 when compared to non-infiltrated tumors. These data demonstrate how the interplay of somatic alterations and immunophenotypes impacts therapeutic efficacy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  278 
 
  
    EGAD00001006030 
   
  
    
    Germline exome sequencing data (paired Fastq files) from 516 BRCA1/2-negative women affected with familial high-grade serous (or similar) ovarian carcinoma, as analysed and described in Subramanian et al (Nature Communications, 2020). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  516 
 
  
    EGAD00001006031 
   
  
    
    Whole genome, exome and RNA sequencing of uveal melanoma metastases, primary tumors and matched normal DNA. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  240 
 
  
    EGAD00001006032 
   
  
    
    Previously unpublished WGS reads mapping within the IG loci used in the benchmark of IgCaller. 
    
   
  
    
      
      unspecified 
      
    
   
  176 
 
  
    EGAD00001006033 
   
  
    
    Data supporting: "Genomic copy number predicts esophageal cancer years before transformation." Killcoyne, Gregson et al.
sWGS data
1000 samples
BAM files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  1000 
 
  
    EGAD00001006034 
   
  
    
    This is the PacBio long read data used for performing de novo assembly of the EGYPT individual (mapped against GRCh38). 
    
   
  
    
      
      Sequel 
      
    
   
  1 
 
  
    EGAD00001006035 
   
  
    
    10x Genomics linked read data used in variant phasing and de novo assembly scaffolding for the EGYPT individual (mapped against GRCh38). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001006036 
   
  
    
    This is the blood RNA-Seq read data used for expression analysis such as haplotypic expression (mapped against GRCh38). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006037 
   
  
    
    High-coverage WGS 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001006038 
   
  
    
    This data set contains the WGS reads mapping to chrM for 10 Egyptian individuals. These were subsequently used for haplogroup assignment. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001006039 
   
  
    
    This data set comprises WGS small variants and structural variants called in a cohort of 110 Egyptian individuals (10 individuals have been sequenced as part of this study and 100 are from EGAD00001001372/EGAD00001001380). 
    
   
  
    
   
  10 
 
  
    EGAD00001006040 
   
  
    
    This data set contains the amplicon sequencing reads mapping to chrM for 217 Egyptian individuals. These were subsequently used for haplogroup assignment. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  217 
 
  
    EGAD00001006041 
   
  
    
    The dataset includes 174 FASTQ files from paired-end WXS sequencing on Illumina HiSeq2500 for 39 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  87 
 
  
    EGAD00001006042 
   
  
    
    The dataset includes 77 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 for 39 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  77 
 
  
    EGAD00001006043 
   
  
    
    TST170 Pilot DNA VCF files 
    
   
  
    
   
  16 
 
  
    EGAD00001006044 
   
  
    
    TST170 Pilot RNA BAM files 
    
   
  
    
   
  16 
 
  
    EGAD00001006045 
   
  
    
    WGS BAM files for the 18 samples used in Michealraj et al. Cell 2020.
The dataset includes PFA ependymoma tissue, derived line and blood samples of 6 patients.
WGS data were aligned with BWA to the hg38 human reference genome (igenome) and further processed according to the GATK best practice pipeline. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  18 
 
  
    EGAD00001006046 
   
  
    
    RNA-seq fastq files for the 16 samples used in Michealraj et al. Cell 2020.
The samples include PFA and ST ependymoma tissues, normal pediatric brain as control and PFA ependymoma lines. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001006047 
   
  
    
    DNA was isolated from aberrant plasma cells (aPCs) and peripheral blood of 12 ALA, 10 ALA+MM and 29 MM individuals. DNA from aPCs was amplified using REPLI-g Mini Kit (Qiagen). In total, we analyzed 102 samples from 51 patients. One batch of exome libraries (paired tumor-normal samples from 12 ALA, 10 ALA+MM and 6 MM) was prepared using SureSelect Human All Exon V5 Kit (Agilent Technologies) and sequenced on Illumina HiSeq 4000 platform, 100 cycles. The second batch (23 MM samples; IDs ARK01-ARK26) was prepared using SureSelect Human All Exon V5 + IGH, IGK, IGL, MYC (Agilent Technologies) library preparation kit and sequenced on Illumina HiSeq 2000 platform in paired-end settings, 75 cycles. The reads were mapped using BWA-MEM on human genome GRCh38 without alternate loci. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  99 
 
  
    EGAD00001006048 
   
  
    
    ChIP-seq was performed for the following histone modifications in both megakaryocytes and granulocytes: H3K27ac, H3K4me2, and H3K36me3. ChIP-seq was performed for H3K27me3 and CTCF in megakaryocytes only. All ChIP experiments consist of n=3 replicates for each of QPD and Control, with the exception of control granulocyte H3K4me2 for which only n=2 replicates were performed.
4C-seq datasets consist of 2 viewpoints (PLAU promoter and an intergenic enhancer), profiled in megakaryocytes from n=2 controls and n=4 QPD. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  77 
 
  
    EGAD00001006049 
   
  
    
    379 tissue samples from various parts of the developing human embryo brain were dissociated and single cells were collected and processed without bias for mRNA-seq using 10X chromium 3' protocol. Libraries were sequenced on Illumina NovaSeq and reads aligned against the human GRCh38 genome. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  379 
 
  
    EGAD00001006050 
   
  
    
    5 trios were whole genome sequenced with PacBio Sequel to a depth of 15X (Trios 1-4) or 40X (Trio 5). For each trio the child was affected with severe ID, and the parents were unaffected. Dataset consists of Trio 2 samples: T2P, T2F and T2M 
    
   
  
    
      
      Sequel 
      
    
   
  15 
 
  
    EGAD00001006051 
   
  
    
    We focused on whole blood from PAXgene tubes from healthy Tanzanians and the impact of urbanization and diet on innate immune responses. We performed RNA-sequencing of whole blood from PAXgene tubes from healthy Tanzanians and investigated how transcriptional changes depend on diet and location. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  316 
 
  
    EGAD00001006053 
   
  
    
    RNA-seq was performed on cultured megakaryocytes and peripheral-blood derived granulocytes from individuals with QPD and unaffected controls. Each group consisted of n=3 biological replicates. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001006054 
   
  
    
    Data were generated by next-generation sequencing (Illumina) in a fastq format. This dataset involved sequencing data from pregnant women and patients with hepatocellular carcinoma (HCC). For HCC samples, the paired buffy coat and tumor DNA tissue samples were also sequenced. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  29 
 
  
    EGAD00001006055 
   
  
    
    Mutational signatures are imprints of cell-intrinsic and extrinsic pathophysiological processes that have occurred through tumorigenesis. Experimental efforts to explore signature etiologies have produced a compendium of signatures of exogenous mutagens previously. Here, we unearth major sources of endogenous DNA damage and the genes that are critical to mitigating this innate stream of DNA damage under normal, physiological circumstances. We performed whole genome sequencing of 173 subclones of CRISPR-Cas9 knockouts of 43 genes in a human induced pluripotent stem cell system, in the absence of any added DNA damage. We reveal substitution and indel signatures that arise from those genes which are essential guardians of the genome. By detailed dissection and comparisons to cancer-derived signatures, we demonstrate interminable sources of constitutive DNA damage, and some mechanistic knowledge into how guardian genes preserve the genome. Based on these experimental insights, we develop and benchmark a tool for clinical classification of tumor samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  173 
 
  
    EGAD00001006056 
   
  
    
    The aim of this project is to differentiate human embryonic stem cells to an extra-embryonic fate, specifically the hypoblast. This is of utmost importance given the current lack of human hypoblast stem cells.
We hypothesized that the pluripotent characteristics of the starting human embryonic stem cell population may dictate the competency for extra-embryonic cell fate specification. Based on this hypothesis and using human embryonic stem cells maintained in different naïve-like culture regimes, we have now developed conditions that allow the differentiation of human embryonic stem cells to a stable GATA6+ SOX2- population. This suggests that these cells may be putative human hypoblast stem cells. To validate this finding here we propose to perform RNA sequencing experiments of the differentiated human embryonic stem cells. By comparing their RNA expression profile to the single cell sequencing data of the human embryo that we are currently generating, we will be able to determine the identity of our GATA6+ SOX2- cells, and establish whether they represent the in vivo human hypoblast. 
This dataset contains all the data available for this study on 2020-04-20. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  7 
 
  
    EGAD00001006057 
   
  
    
    Paired WGS of nodal B-cell lymphoma: one tumor and one control from one patient (H021). Sequencing on HiSeq X Ten with TruSeq Nano library preparation kit. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001006058 
   
  
    
    Paired WGS data of one tumor of one patient with nodal B-cell lymphoma. Tumor cells were sorted according to CD48 expression into a high and a low fraction. Library preparation with TruSeq Nano and sequencing on HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001006059 
   
  
    
    Tumors and control of nodal B-cell lymphoma of one patient. WES sequencing on Illumina HiSeq 4000 with Agilent SureSelect V5+UTRs. Bam files were aligned with bwa mem to hg19. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001006060 
   
  
    
    Paired exome sequencing on Illumina HiSeq 4000 using Agilent SureSelect V6 of one tumor sample of one patient with B-cell lymphoma. The BAM file was mapped to the hg19 genome. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001006061 
   
  
    
    Whole genome and whole exome sequencing data supporting the manuscript 'Somatic evolution in the non-neoplastic IBD affected colon' by Sigurgeir Olafsson et al. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  693 
 
  
    EGAD00001006062 
   
  
    
    This dataset includes BAM files of 16 paired tumor/normal samples of extranodal NK/T cell lymphomas. 
    
   
  
    
   
  32 
 
  
    EGAD00001006063 
   
  
    
    Illumina platform RNA-seq data from 47 Pancreatic neuroendocrine tumour samples 
    
   
  
    
   
  41 
 
  
    EGAD00001006064 
   
  
    
    A 19-sample data set containing data from FFPE high grade serous ovarian cancer biopsies. The library was made with a custom hybridization kit (EZ-Cap, Roche) spanning 7 genes. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  19 
 
  
    EGAD00001006065 
   
  
    
    Most patients with rare diseases do not receive a molecular diagnosis and the aetiological
variants and mediating genes for more than half such disorders remain to be discovered. We
implemented whole-genome sequencing (WGS) in a national healthcare system to streamline
diagnosis and to discover unknown aetiological variants, in the coding and non-coding regions
of the genome. In a pilot study for the 100,000 Genomes Project, we generated WGS data for
13,037 participants, of whom 9,802 had a rare disease, and provided a genetic diagnosis to
1,138 of the 7,065 patients with detailed phenotypic data. We identified 95 Mendelian
associations between genes and rare diseases, of which 11 have been discovered since 2015
and at least 79 are confirmed aetiological. Using WGS of UK Biobank, we showed that rare
alleles can explain the presence of some individuals in the tails of a quantitative red blood cell
(RBC) trait. Finally, we reported 4 novel non-coding variants which cause disease through the
disruption of transcription of ARPC1B, GATA1, LRBA and MPL. Our study demonstrates a
synergy by using WGS for diagnosis and aetiological discovery in routine healthcare. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001006066 
   
  
    
    Maps of H3K27ac from normal 2nd- and 3rd-trimester cytotrophoblasts, preterm severe preeclampsia cytotrophoblasts, and 2nd-trimester amnion. Maps of H3K27me3, H3K27me3, H3K36me3, and H3K4me1 from 2nd- and 3rd-trimester cytotrophoblast. Maps of H3K9me3 from 2nd- and 3rd-trimester cytotrophoblast, smooth chorion, and basal plate. RNA-seq from 2nd- and 3rd-trimester cytotrophoblasts. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001006067 
   
  
    
    We generated global skeletal muscle transcriptomic data from long-term endurance (9 men, 9 women) and strength (7 men) trained individuals. These data were compared with healthy age-matched untrained controls (7 men, 8 women). All 40 samples were then multiplexed in 1 lane and sequenced (2x50bp paired end) on the Illumina NovaSeq 6000. 
    
   
  
    
      
      unspecified 
      
    
   
  40 
 
  
    EGAD00001006069 
   
  
    
    The dataset consists of FASTQ files from Seq-Well and some Chromium (10x Genomics) libraries from 10 control donors and 10 COPD GOLD2 patients. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  37 
 
  
    EGAD00001006070 
   
  
    
    Targeted sequencing analyses were performed on samples of PDX engrafted with breast cancer bone metastases; 2 PDX acquired resistance to palbociclib. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001006071 
   
  
    
    Exome sequencing analyses obtained from 11 samples of PDX engrafted with bone metastases, matched primary tumors and/or metastases, and normal tissues. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001006072 
   
  
    
    Raw sequencing data (PE, fastq.gz) from NovaSeq 6000 sequencing runs of:
1. Blood-derived cell-free DNA from six healthy controls, enzymatic cytosine conversion (between 1 and 3 replicates each)
2. Urine-derived cell-free DNA from three healthy controls, enzymatic cytosine conversion (between 1 and 3 replicates each)
3. Blood-derived cell-free DNA from six patients with acute or chronic kidney disease with/without other relevant organ dysfunction, enzymatic cytosine conversion (between 1 and 3 replicates each)
4. Urine-derived cell-free DNA from three patients with acute kidney disease, enzymatic cytosine conversion (between 1 and 3 replicates each)
5. Blood-derived cell-free DNA from three healthy controls, bisulfite cytosine conversion (single datasets)
6. Conventional whole-genome sequencing on genomic DNA of two healthy kidney donors for two of the studied patients (single datasets). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001006073 
   
  
    
    35 paired samples of resected HCC 
    
   
  
    
   
  70 
 
  
    EGAD00001006074 
   
  
    
    Leukemic bone marrow in primary ETV6-RUNX1 positive acute lymphoblastic leukemia samples (Six diagnostic and two 15 days after treatment) using Chromium 3' single cell RNA-seq. Samples sequenced on 1-3 lanes and raw fastq files provided for each sample. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  8 
 
  
    EGAD00001006075 
   
  
    
    This dataset contains whole-exome sequencing (WES) of human acute erythroid leukemia patient samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  22 
 
  
    EGAD00001006076 
   
  
    
    This dataset contains RNAseq performed on 35 human acute erythroid leukemia patient samples and Xenografts. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  35 
 
  
    EGAD00001006077 
   
  
    
    WES (N=18) and WGS (N=2) of OSCC tumors and normals with the aim of identifying novel mutational signatures in Asian tumors 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 3000 
      
    
   
  40 
 
  
    EGAD00001006079 
   
  
    
    The dataset comprises muscle samples from three patients with mitochondrial disease: Patient 1, age 9; Patient 2, age 16; and Patient 3, age 58. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001006080 
   
  
    
    4 samples of 2000000 HSPCs in 2 ml media each were then prepared and treated for 24 hours:
carboplatin-high: 150 µg/ml carboplatin
carboplatin-low: 18.75 µg/ml carboplatin
gemcitabine: 25 ng/ml gemcitabine
control: no drug
Single-cells were subsequently extracted using the ddSEQ™ Single-Cell Isolator (Bio-Rad) and later sequenced using the SureCell™ Whole Transcriptome Analysis 3' Library Prep Kit (Illumina) on the NextSeq 500 System (Illumina) using the NextSeq 500/550 High Output Kit v2.5 150 Cycles (Illumina) all following the manufacturer's instructions.
The raw FASTQ-files are available as read 1 and read 2 files from 4 lanes for each sample giving a total of 16 FASTQ-files. 
The FASTQ-files were also processed following the ddSeeker (Romagnoli, D., et al., BMC Genomics 2018, doi:10.1186/s12864-018-5249-x) and Drop-seq (Macosko, E.Z., et al., Cell 2015, doi:10.1016/j.cell.2015.05.002) protocols for processing scRNA-seq data to yield the final digital gene expression (dge) for each cell of each sample. This has resulted in one dge text file and one dge summary text file per sample. These 8 text-files with dge data are found in the zip-compressed analysis data-file. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001006081 
   
  
    
    PURPOSE: To determine the impact of basal-like and classical subtypes in advanced PDAC and to explore GATA6 expression as a surrogate biomarker. EXPERIMENTAL DESIGN: Within the COMPASS trial, patients proceeding to chemotherapy for advanced PDAC undergo tumour biopsy for RNA sequencing. Overall response rate (ORR) and overall survival (OS) were stratified by subtypes and according to chemotherapy received. Correlation of GATA6 with the subtypes using gene expression profiling and in situ hybridization (ISH) was explored. RESULTS: Between December 2015 and May 2019, 195 patients (95%) had enough tissue for RNA sequencing; 39 (20%) were classified as basal-like and 156 (80%) as classical. RECIST response data were available for 157 patients; 29 basal-like and 128 classical where the ORR was 10% vs. 33% respectively (p=0.02). In patients with basal-like tumours treated with modified FOLFIRINOX (mFFX) (n=22) the progression rate was 60% compared to 15% in classical PDAC (p=0.0002). Median OS in the intention to treat population (n=195) was 9.3 months for classical vs. 5.9 months for basal-like PDAC (HR 0.47 95% CI 0.32-0.69, p=0.0001). GATA6 expression by RNAseq highly correlated with the classifier (p<0.001) and ISH predicted the subtypes with sensitivity of 89% and specificity of 83%. In a multivariable analysis, GATA6 expression was prognostic (p=0.02). In exploratory analyses, basal-like tumours could be identified by keratin 5, were more hypoxic and were enriched for a T cell inflamed gene expression signature. CONCLUSIONS: The basal-like subtype is chemoresistant and can be distinguished from classical PDAC by GATA6 expression. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  101 
 
  
    EGAD00001006083 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. 
Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factors data. 
Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
  
This dataset contains all the data available for this study on 2021-09-27. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  520 
 
  
    EGAD00001006084 
   
  
    
    The dataset comprises an aggregate-level VCF (version 4.1), containing the somatic point and indel mutations found across the glioblastoma cohort (SweGBM-1, n=38 samples). The VCF file is in accordance with the HTS format specifications (https://samtools.github.io/hts-specs/). 
    
   
  
    
   
  38 
 
  
    EGAD00001006085 
   
  
    
    RNA sequencing of frozen tumor biopsies from patients with primary cutaneous CD8+ aggressive epidermotropic cytotoxic T-cell lymphoma. 6 samples. Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001006086 
   
  
    
    Whole-genome sequencing of frozen tumor biopsies from patients with primary cutaneous CD8+ aggressive epidermotropic cytotoxic T-cell lymphoma. 12 samples. Illumina HiSeq X-Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001006087 
   
  
    
    18 DLBCL genomes 
    
   
  
    
   
  18 
 
  
    EGAD00001006088 
   
  
    
    Somatic mutations accumulate in healthy tissues as we age, giving rise to cancer and potentially contributing to ageing. To study somatic mutations in non-neoplastic tissues, we developed a series of protocols to sequence the genomes of small populations of cells isolated from histological sections. Here, we describe a complete workflow that combines laser-capture microdissection (LCM) with low-input genome sequencing, whilst circumventing the use of whole-genome amplification (WGA). The protocol is subdivided broadly into 4 steps: tissue processing, LCM, low-input library generation and mutation calling and filtering. The tissue processing and LCM steps are provided as general guidelines which may require tailoring based on the specific requirements of the study at hand. Our protocol for low-input library generation utilises enzymatic rather than acoustic fragmentation to generate WGA-free whole-genome libraries. Finally, the mutation calling and filtering strategy has been adapted from previously published protocols to account for artefacts introduced via library creation. To date, we have used this workflow to perform targeted and whole-genome sequencing of small populations of cells (typically 100-1,000 cells) in thousands of microbiopsies from a wide range of human tissues. The low-input DNA protocol is designed to be compatible with liquid handling platforms and make use of equipment and expertise standard to any core sequencing facility. However, obtaining low-input DNA material via LCM requires specialized equipment and expertise. The entire protocol from tissue reception through whole-genome library generation can be accomplished in as little as a week, though 2-3 weeks would be a more typical turnaround time. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001006090 
   
  
    
    Serial samples from one AT-AML patient as described in publication Goldgraben et al Pediatric Blood & Cancer 2020. 
Whole exome sequencing of an AT-'germline' blood sample, one bone marrow sample (at AML diagnosis) and 3 AML blood samples. Libraries were prepared using the Illumina Nextera Rapid Capture Exome Enrichment Kit, and sequenced as PE150 on HiSeq4000.
Provided: 5 BAM files (GRCh37); 2 VCF analyses (germline and somatic) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001006092 
   
  
    
    Ten individuals from three families segregating esophageal atresia. 
    
   
  
    
   
  10 
 
  
    EGAD00001006093 
   
  
    
    Targeted panel sequencing on Illumina HiSeq X Ten of brainstem glioma primary tumor and blood samples 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  42 
 
  
    EGAD00001006094 
   
  
    
    RNAseq on Illumina HiSeq X Ten of brainstem glioma primary tumor sample 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  75 
 
  
    EGAD00001006095 
   
  
    
    RNA-seq data for three Glioblastoma stem cell (GSC) lines exposed to PRMT5 inhibitor and control samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001006096 
   
  
    
    The plasma samples and white blood cell samples were collected from 30 non-small-cell lung cancer patients and 3 healthy individuals. The solid tumor biopsy samples from 14 patients (a subset of the 30 patients) were collected. The cfDNA was extracted from their plasma samples using the QIAamp circulating nucleic acid kit from QIAGEN (Germantown, MD). The cfDNA WES library was constructed with the SureSelect XT HS kit from Agilent Technologies (Santa Clara, CA) according to the manufacturer’s protocol. In brief, 10ng of cfDNA was used as input material.  After end repair/dA-tailing of cfDNA, the adaptor was ligated. The ligation product was purified with Ampure XP beads (Beckman-Coulter, Atlanta, GA) and the adaptor-ligated library was amplified with index primer in 10-cycle PCR. The amplified library was purified again with Ampure XP beads, and the amount of amplified DNA was measured using the Qubit 1xdsDNA HS assay kit (ThermoFisher, Waltham, MA). 700-1000 ng of DNA sample was hybridized to the Agilent SureSelect Human All Exon V6 (Agilent) capture library and pulled down by streptavidin-coated beads. After washing the beads, the DNA library captured on the beads was re-amplified with 10-cycle PCR. The final libraries were purified by Ampure XP beads. The library concentration was measured by Qubit, and the quality was further examined with Agilent Bioanalyzer before the final step of 2x150bp paired-end sequencing on the Illumina HiSeq X10 platform (Illumina) at an average coverage of 200. Whole-exome capture libraries of genomic DNA of the 30 non-small cell lung cancer patients were constructed via Roche SeqCap EZ Exome V3 (Roche); whole-exome capture libraries of genomic DNA of the 3 healthy individuals were constructed via Agilent SureSelect Human All Exon V6 (Agilent). Enriched exome libraries were sequenced on the Illumina HiSeq 3000 platform (Illumina) to generate 2x100bp paired-end reads at an average coverage of 200. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 3000 
      
    
   
  83 
 
  
    EGAD00001006097 
   
  
    
    We profiled 16 high-grade glioma patient tumour samples by single-cell and single-nuclei RNA-seq and 3 normal-matched samples by single-cell RNA-seq using 10X Chromium 3'. The fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001006098 
   
  
    
    We profiled 18 high-grade glioma patient tumor samples by bulk RNA-seq. The raw fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  20 
 
  
    EGAD00001006099 
   
  
    
    We profiled 23 high-grade glioma patient tumor samples and 4 normal-matched patient samples by whole exome sequencing. The raw fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      unspecified 
      
    
   
  27 
 
  
    EGAD00001006100 
   
  
    
    We profiled 16 patient tumour samples by ChIP-seq. H3K27ac and Input are provided for 16 samples and H3K27me3 is provided for 14 samples. 
Among the 16 samples, 9 are G34WT and 7 are G34R/V. The raw fastq or bam files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  78 
 
  
    EGAD00001006101 
   
  
    
    Paired end (47/98) and single end (51/98) shallow whole genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fractions in plasma DNA of colorectal cancer (CRC) patients. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  52 
 
  
    EGAD00001006102 
   
  
    
    The Genomic DNA Clean & Concentrator kit (ZYMO Research) was used to remove EDTA from the DNA samples. Sample libraries were prepared using 100 ng of input according to the KAPA HyperPlus Kit (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Exomes were captured using the SeqCap EZ MedExome (Roche Nimblegen) according to SeqCap EZ HyperCap Library v1.0 Guide (Roche) with the xGen Universal blockers – TS Mix (Integrated DNA Technologies, Inc.). The amplified captured sample libraries were paired-end sequenced (2x100 bp) on the Novaseq 6000 platform (Illumina) and aligned to the hg19 reference genome using the Burrows-Wheeler Aligner (BWA). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001006103 
   
  
    
    Mutation analysis of 17 genes (ALK, APC, BRAF, BRCA1, BRCA2, DPYD, EGFR, ERBB2, KIT, KRAS, MET, NRAS, PDGFRA, RET, ROS1, TP53, UGT1A1) in plasma DNA of CRC patients using the AVENIO ctDNA Targeted Kit. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  26 
 
  
    EGAD00001006104 
   
  
    
    Mutation analysis in plasma samples with low ctDNA levels using a molecular barcoding technology, i.e. the single target approach SiMSen-seq (Simple, multiplexed, PCR-based barcoding of DNA for sensitive mutation detection using sequencing). 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  53 
 
  
    EGAD00001006105 
   
  
    
    Targeted deep sequencing for the KRAS p.Gly12Asp, p.Gly12Val and p.Ala146Thr mutations in plasma samples of CRC patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  27 
 
  
    EGAD00001006106 
   
  
    
    RNA was isolated using phenol-chloroform extraction followed by DNase digestion or using the Qiagen Allprep DNA/RNA kit and protocol (Qiagen, #80204). cDNA synthesis was done using the SuperScript II Reverse Transcriptase kit (Invitrogen). Quantitative real-time PCR was performed by using primers as described previously on the 7500 Fast Real-time PCR System (Applied Biosystems). Relative levels of gene expression were calculated using the ΔΔCt method. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  26 
 
  
    EGAD00001006107 
   
  
    
    Whole Exome Sequencing was performed on radical prostatectomy formalin-fixed paraffin-embedded sample pairs (n = 6). Library prep was done by exon capture using the Illumina Truseq Exome kit and sequenced as 75bp paired end on Illumina NextSeq 500. Sequences were aligned to the human genome (hg38) using BWA. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  10 
 
  
    EGAD00001006108 
   
  
    
    RNA-Seq was performed on radical prostatectomy formalin-fixed paraffin-embedded sample pairs (n = 27). Library prep was done by removal of rRNA and sequenced as 75bp paired end on Illumina NextSeq 500. Sequences were aligned to the human genome (hg38) using STAR-Fusion. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001006109 
   
  
    
    smMIP-Seq was performed on radical prostatectomy formalin-fixed paraffin-embedded sample pairs (n = 18). Library prep was done by capture of targeted sequences with single molecule (unique molecular index) tagged molecular inversion probes (MIP) and sequenced as 75bp paired end on Illumina NextSeq 500. Sequences were aligned to the human genome (hg38) using BWA. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  36 
 
  
    EGAD00001006110 
   
  
    
    Each run contains single cell RNA-seq data from unbiased sampling of single cells from the indicated human tissue. Single cell suspensions were prepared using enzymatic dissociation followed by trituration. The samples were processed using the 10X Chromium 3' v3 sequencing pipeline, sequenced on an Illumina NovaSeq 6000, analyzed using the cellranger software and aligned to the human GRCh38 genome version 93. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001006111 
   
  
    
    Targeted long-read nanopore sequencing.
Abstract: Fusion genes are hallmarks of various cancer types and important determinants for diagnosis, prognosis and treatment. Fusion gene partner choice and breakpoint-position promiscuity restrict diagnostic detection, even for known and recurrent configurations. To accurately and impartially identify fusions, we developed FUDGE: FUsion Detection from Gene Enrichment. FUDGE couples target-selected and strand-specific CRISPR/Cas9 activity for fusion gene driver enrichment - without prior knowledge of fusion partner or breakpoint-location – to long-read Nanopore sequencing with the bioinformatics pipeline NanoFG. FUDGE has flexible target-loci choices and enables multiplexed enrichment for simultaneous analysis of several genes in multiple samples in one sequencing run. We observe on average 665-fold breakpoint-site enrichment and identify nucleotide resolution fusion breakpoints - within two days. The assay identifies cancer cell line and tumor sample fusions irrespective of partner gene or breakpoint-position. FUDGE is a rapid and versatile fusion detection assay, providing unparalleled opportunity for diagnostic pan-cancer fusion detection. 
    
   
  
    
      
      GridION 
      
    
   
  17 
 
  
    EGAD00001006112 
   
  
    
    RNA-Sequencing of 27 functionally validated LSC and blast fractions from 9 AML patients, plus three healthy hematopoietic stem and progenitor cell samples from age-matched controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001006113 
   
  
    
    In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-genome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. 
This dataset contains all the data available for this study on 2020-05-05. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  84 
 
  
    EGAD00001006114 
   
  
    
    In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The study includes targeted sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. 
This dataset contains all the data available for this study on 2020-05-05. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1916 
 
  
    EGAD00001006115 
   
  
    
    In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-exome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. 
This dataset contains all the data available for this study on 2020-05-05. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  103 
 
  
    EGAD00001006116 
   
  
    
    In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-genome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. 
This dataset contains all the data available for this study on 2020-05-05. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001006117 
   
  
    
    In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The study includes targeted sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. 
This dataset contains all the data available for this study on 2020-05-05. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  575 
 
  
    EGAD00001006118 
   
  
    
    In this study we will perform targeted sequencing on the bulk samples of in vitro colonies. 
This dataset contains all the data available for this study on 2020-05-05. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  595 
 
  
    EGAD00001006119 
   
  
    
    Whole genome sequencing data for 25 cases 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001006120 
   
  
    
    Homologous recombination DNA repair deficiency and PARP inhibition activity in primary triple negative breast cancer. RNA-Seq data for paired baseline and end of treatment samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001006121 
   
  
    
    This dataset contains the fastq files used for this study 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  154 
 
  
    EGAD00001006122 
   
  
    
    This dataset contains all available targeted exon sequencing bam files from our study, "Activating AKT1 and PIK3CA mutations in metastatic castration-resistant prostate cancer". Patient identifiers are denoted by the first segment of the sample aliases (e.g. "P1"), and additional information is appended to reflect which serial sample is referenced (1st, 2nd, 3rd, etc.), and whether the sample represents cell-free DNA ("cfdna") or paired white-blood cell control ("WBC"). All samples were sequenced using Illumina technology. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  114 
 
  
    EGAD00001006123 
   
  
    
    3q-capture DNA sequencing was performed as we described previously. In summary, genomic DNA was fragmented using the Covaris shearing device (Covaris), and sample libraries were assembled following the TruSeq DNA Sample Preparation Guide (Illumina). After ligation of adapters and an amplification step, target sequences of chromosomal regions 3q21.1-q26.2 were captured using custom in-solution oligonucleotide baits (Nimblegen SeqCap EZ Choice XL). The design of target sequences was based on the human genome assembly hg19: chr3q21.1:126036241-130672290 - chr3q26.2:157712147-175694147. Amplified captured sample libraries were paired-end sequenced (2x100 bp) on the HiSeq 2500 platform (Illumina) and aligned against the hg19 reference genome using the Burrows-Wheeler Aligner (BWA). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  33 
 
  
    EGAD00001006124 
   
  
    
    The samples are from patients of multiple cancer types. The library preparation protocol was developed by the laboratory. The DNA libraries were then sequenced with 150bp paired-end reads. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  388 
 
  
    EGAD00001006125 
   
  
    
    Whole exome sequencing of an alveolar rhabdomyosarcoma patient with RET germline mutation and subsequent analysis of potential therapeutic mechanisms associated with the patient's rare germline mutation. Patient samples were sequenced from an initial biopsy and from a relapse biopsy, in addition to normal blood as the matched normal DNA. RNA sequencing was performed on the relapse sample as the initial sample was unable to produce usable RNA for analysis. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001006126 
   
  
    
    Contains FASTQs for cells from 20 10x channels across 2 NovaSeq runs, 42 384-well plates across 6 NovaSeq runs, and 12 96-well plates across 4 NextSeq runs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001006127 
   
  
    
    Contains FASTQs for cells from 20 10x channels across 2 NovaSeq runs, 42 384-well plates across 6 NovaSeq runs, and 12 96-well plates across 4 NextSeq runs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001006128 
   
  
    
    Contains FASTQs for cells from 20 10x channels across 2 NovaSeq runs, 42 384-well plates across 6 NovaSeq runs, and 12 96-well plates across 4 NextSeq runs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001006129 
   
  
    
    Exome sequencing of 22 Pheochromocytoma/Paraganglioma (PPGL) primary tumors, both malignant and non-malignant. Tumor material was snap-frozen (SF) or formalin-fixed paraffin-embedded (FFPE). 
    
   
  
    
      
      Illumina HiScanSQ 
      
      NextSeq 500 
      
    
   
  22 
 
  
    EGAD00001006130 
   
  
    
    The COVID-19 pandemic urgently needs therapeutic and prophylactic interventions. Here we report the rapid identification of SARS-CoV-2 neutralizing antibodies by high-throughput single-cell RNA and VDJ sequencing of antigen-enriched B cells from 60 convalescent patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006131 
   
  
    
    Here, we performed a characterisation of 12 tumours and matched normal samples from 3 syCRC patients by whole genome sequencing: Patient A (tumours A1 and A2), Patient B (tumours B1-B5), and Patient C (tumours C1-C5). Somatic SNVs, indels and structural variants were called. 
    
   
  
    
   
  3 
 
  
    EGAD00001006132 
   
  
    
    Fastq files for shallow whole genome sequencing data as described in Mouliere et al., 2018. Files are those not included in STM2, i.e. these are largely non-ovarian cancer samples 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  292 
 
  
    EGAD00001006133 
   
  
    
    Bulk RNA-seq for cALL patient-derived PDX samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  74 
 
  
    EGAD00001006134 
   
  
    
    Single cell RNA-seq for PT1-derived PDX samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  757 
 
  
    EGAD00001006135 
   
  
    
    Single cell RNA-seq for primary samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  285 
 
  
    EGAD00001006136 
   
  
    
    Single cell WGS (low pass) for cord blood samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  24 
 
  
    EGAD00001006137 
   
  
    
    Single cell WGS (low pass) for PT1-derived PDX samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  539 
 
  
    EGAD00001006138 
   
  
    
    Single cell WGS (low pass) for primary samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  444 
 
  
    EGAD00001006139 
   
  
    
    Biopsies from the terminal ileum and rectum of healthy individuals are digested on ice to single cells and processed for single-cell RNA-sequencing (10X Genomics and Illumina).
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2020-05-12. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  49 
 
  
    EGAD00001006141 
   
  
    
    16S sequencing data from 2259 Flemish Gut Flora Project (FGFP) samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2259 
 
  
    EGAD00001006142 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001006143 
   
  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001006144 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD00001006145 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  70 
 
  
    EGAD00001006149 
   
  
    
    This is the data from the eQTLs InsPIRE study. This dataset includes RNAseq and genotypes from pancreatic islets and FAC sorted beta-cells, as well as RPKM values, covariates and cell count estimates. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  255 
 
  
    EGAD00001006150 
   
  
    
    SMARTer Stranded Total RNA-Seq method of human platelet-rich plasma, platelet-free plasma, urine, conditioned medium, and extracellular vesicles (EVs) from these biofluids. Including a titration experiment with spikes. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  30 
 
  
    EGAD00001006151 
   
  
    
    Characterization of somatic variants in patients with OpSCC 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  51 
 
  
    EGAD00001006152 
   
  
    
    Bam files from WGS of PDAC samples described in: Transcription phenotypes of pancreatic cancer are driven by genomic events during tumour evolution 
    
   
  
    
   
  - 
 
  
    EGAD00001006156 
   
  
    
    Dataset consisting of sequence data from 36 glioma patients. Data include:
-Whole exome sequencing of tumour tissue and matched germline
-Shallow whole genome sequencing of urine cell-free DNA
-Targeted capture sequencing of plasma and CSF cell-free DNA
Additional sequencing data are provided from non-glioma and healthy controls 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  213 
 
  
    EGAD00001006157 
   
  
    
    The impact of genetic variants on molecular pathways that give rise to neurodegenerative diseases such as Alzheimer's and Parkinson's is best elucidated in the appropriate cell types and molecular contexts. Existing studies have focused on bulk profiling of mixed cell types, but have ignored assaying genetic effects across development and cell differentiation. At the core of this proposal is the idea to use single-cell assays to study genetic effects during differentiation of dopaminergic and cortical neurons to identify the sequence of molecular events from variants to healthy and diseased cell states in a cell-specific manner.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2020-05-18. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  42 
 
  
    EGAD00001006158 
   
  
    
    The MYOSEQ project focuses on the application of next generation sequencing, in particular whole exome sequencing (WES), in a large cohort of patients with unexplained limb‐girdle weakness (LGW). Focusing on undiagnosed patients with a clearly defined clinical phenotype enables increased diagnostic rates for known genes (in particular Pompe disease, GNE related pathologies and other known LGMD subtypes) in this cohort, while the use of WES provides scope both for new gene discovery and for additional research into disease modifiers and genotype‐phenotype correlation with substantial cost effectiveness. The LGW patient cohort was collated by Newcastle University in collaboration with clinical centers across Europe. The sequencing was performed at the Broad Institute and jointly analyzed with Newcastle University. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina Genome Analyzer IIx 
      
      Illumina HiSeq 2000 
      
    
   
  888 
 
  
    EGAD00001006159 
   
  
    
    Tumor and normal exomes for 51 MCL patients and tumor and normal genomes for 34 MCL patients. 
    
   
  
    
   
  170 
 
  
    EGAD00001006160 
   
  
    
    Inherited cardiac conditions (ICC) panel sequencing data of Egyptian healthy volunteers. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  391 
 
  
    EGAD00001006161 
   
  
    
    This dataset comprises 76 cancer and normal whole genomes obtained from 11 SI-NET patients, in the form of two fastq files (forward and reverse reads) for each genome containing sequences generated by the Illumina NovaSeq 6000 system. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  76 
 
  
    EGAD00001006162 
   
  
    
    In this study we will perform whole genome sequencing on in vitro colonies. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  616 
 
  
    EGAD00001006163 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001006164 
   
  
    
    BAM files of RNA sequencing (RNA-seq) experiment on multi-regional colorectal cancer (CRC) samples. 58 samples corresponding to 16 patients were sequenced and there is one BAM file for each sample. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  88 
 
  
    EGAD00001006165 
   
  
    
    BAM files of Whole Exome Sequencing (WES) experiment on multi-regional colorectal cancer (CRC) samples. 32 tumour and 16 normal samples corresponding to 16 patients were sequenced, which makes 32 tumour BAMs and 16 normal BAMs (48 BAMs in total). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  48 
 
  
    EGAD00001006166 
   
  
    
    Whole exome sequencing (WES) was performed on the matched tumor and organoid pairs from 7 cervical cancer patients. The DNA was sequenced on the NovaSeq 6000 platform with 8Gb sequencing coverage. WES data was mapped against the human reference genome GRCh38 using the BWA (v0.7.5) mapping tool. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001006167 
   
  
    
    This dataset contains the raw fastq-files and the VCF files of single cell targeted DNA sequencing with the MissionBio Tapestri platform. This was performed on 8 male pediatric T-ALL cases: X09-XB37-XB47-XD83-XF91-XF97-XF100-XF121. For some patients we have timepoints during treatment: XF100 and XG121. XD83 is a patient that relapsed twice. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001006170 
   
  
    
    Data supporting: "The mutREAD method detects mutational signatures from low quantities of cancer DNA." Perner et al.
WGS, sWGS, WES, and reduced representation sequencing data
tumour and normal samples
BAM files 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  48 
 
  
    EGAD00001006171 
   
  
    
    Germline blood DNA sequencing data generated in routine diagnostics of hereditary cancer using the I2HCP gene panel (~135 genes). There are 130 samples sequenced in a MiSeq machine and 108 sequenced in a HiSeq machine. There is a partial overlap between those two sets, meaning that some samples were sequenced in both machines. There is a strong enrichment in samples with copy-number variants (CNV), both single- and multi-exon, since this dataset was compiled for a benchmarking effort of CNV calling tools for genetic diagnostics. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  188 
 
  
    EGAD00001006172 
   
  
    
    Single-cell RNA sequencing was performed on a total of 20 PC specimens. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001006173 
   
  
    
    Single-cell RNA sequencing was performed on bone marrow mononuclear cells from 8 acute myeloid leukemia patients at diagnosis. The profiling was performed using 10x Genomics 3' (5 samples) and 5' (3 samples) platforms. The raw data are available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  128 
 
  
    EGAD00001006175 
   
  
    
    Single Cell RNAseq of PBMC from renal cancer patients 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001006176 
   
  
    
    Whole-genome sequencing (10X Genomics) of frozen tumor biopsies from patients with primary cutaneous anaplastic large cell lymphoma. 12 samples. Illumina HiSeq X-Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001006177 
   
  
    
    Whole-exome sequencing of matched frozen tumor biopsies/granulocytes from patients with primary cutaneous anaplastic large cell lymphoma. 7 paired tumor/germline samples. BGISEQ-500. 
    
   
  
    
      
      unspecified 
      
    
   
  14 
 
  
    EGAD00001006178 
   
  
    
    RNA sequencing of frozen tumor biopsies from patients with primary cutaneous anaplastic large cell lymphoma. 12 samples. Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001006179 
   
  
    
    These are some selected exomes from the first year of the Children's Rare Disease Cohorts initiative at Boston Children's Hospital. Patients were drawn from the following cohorts: immunodeficiency, epilepsy, IBD, hearing loss, and orphan diseases. Raw sequencing was performed on Illumina NovaSeq 6000 machines, and reads were aligned to hs37d5. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD00001006180 
   
  
    
    This dataset contains three bam files. One normal blood sample and two matched FF and FFPE samples from the same metastatic prostate tumor. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001006181 
   
  
    
    The dataset includes raw RNA-seq data (fastq files) for miRNAs of monocytes before and after 6-hour exposure to four different immune stimuli, measured in 200 African- and European-descent healthy donors from Belgium. The stimuli include ligands for TLR4 (LPS), TLR1/2 (Pam3CSK4) and TLR7/8 (R848), as well as a human seasonal influenza A virus (IAV). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  977 
 
  
    EGAD00001006182 
   
  
    
    This dataset contains single cell RNA sequencing data of PBMC samples from 20 bladder cancer patients. cDNAs and single cell RNA libraries were prepared following manufacturer’s user guide (10x Genomics). Each library was sequenced in HiSeq4000 (Illumina) to achieve ~300 million reads following manufacturer’s sequencing specification. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  20 
 
  
    EGAD00001006183 
   
  
    
    25 high coverage complete genome sequences of southern African Khoe-San individuals. Bam file format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001006184 
   
  
    
    linking MASTER H021-Cohort to EGAS0001004157 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001006185 
   
  
    
    ALI culture bronchial cells and alveolar lung surgical resection scRNA-Seq 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001006186 
   
  
    
    Transcriptomic (N = 18) and epigenomic (N = 6) characterization of macrophages using RNA-sequencing and ChIP-sequencing (bait = SP140) in the presence or absence of SP140 inhibition. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001006187 
   
  
    
    This dataset contains BAM files for RNA-sequencing of stage I lung adenocarcinomas from Asian patients. In total, there are 107 patients and 107 tumor samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  107 
 
  
    EGAD00001006188 
   
  
    
    This dataset contains BAM files for whole exome-sequencing of stage I lung adenocarcinomas from Asian patients. In total, there are 113 patients and 262 samples, including 113 tumor samples, 113 adjacent normal samples and 36 buffy coat samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  262 
 
  
    EGAD00001006189 
   
  
    
    Paired WGS and RNA-Seq data of patients with multiple myeloma (MM) refractory to immunomodulatory agents (IMiDs) and proteasome inhibitors (PIs). We performed whole genome and transcriptome sequencing of 39 heavily pretreated RRMM patients with at least double refractoriness revealing complex structural changes and a high mutational load. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  116 
 
  
    EGAD00001006190 
   
  
    
    Single-cell transcriptomes for 10 hepatocellular carcinoma (HCC) patients from 21 samples across four relevant sites: primary tumor (T), portal vein tumor thrombus (P), metastatic lymph node (L) and non-tumor liver (N). Single cells were sequenced using Chromium Single Cell 3’ Library (10x Genomics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001006193 
   
  
    
    Tregs were sorted as CD4+CD25+CD127- cells from peripheral blood of 14 healthy individuals, 8 patients with systemic lupus erythematosus, 9 patients with rheumatoid arthritis, and 11 patients with multiple sclerosis. RNA was extracted and polyA libraries were prepared using the Illumina Truseq sample preparation kit v.2. Single-end sequencing was performed on NextSeq500. 
    
   
  
    
   
  42 
 
  
    EGAD00001006194 
   
  
    
    The risk of getting non-melanoma skin cancer varies over 40-fold across the body. Here we map mutations in normal skin in high and low risk sites in normal donors and those with an increased risk of skin cancer. The density of mutations varied widely, with evidence of positive and negative genetic selection. Regional differences in mutational signatures in high and low cancer risk sites and preferential selection of mutants of TP53 in high risk skin and FAT1 in lower risk skin were observed. 10% of clones had copy number changes in cancer associated genes and the largest had multiple driver mutations with loss of heterozygosity. In hair follicles, a proposed site of origin of skin cancers, mutations in the upper follicle resembled adjacent skin, but the lower follicle was sparsely mutated. We conclude cancer risk reflects the efficiency of transformation of oncogenic mutants rather than the density of mutant clones. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  805 
 
  
    EGAD00001006195 
   
  
    
    RNA-seq data for 54 Glioblastoma stem cell (GSC) lines. Fastq files of the strand-specific paired-end RNA-seq data are available. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001006196 
   
  
    
    Additional Neuroblastoma whole genome sequencing data 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001006197 
   
  
    
    This dataset contains RNA-seq (Illumina HiSeq 4000) from macroscopically preserved and lesioned OA subchondral bone of patients that underwent joint replacement surgery due to OA (N=48). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  48 
 
  
    EGAD00001006198 
   
  
    
    Whole genome sequencing data of 84 Nama individuals with KhoeSan ancestry from southern Africa in phased VCF format. Variants were called/phased as part of the African Genome Resource. The data has been aligned to GRCh37. 
    
   
  
    
   
  84 
 
  
    EGAD00001006199 
   
  
    
    ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes the TPM and raw counts tables. 
    
   
  
    
   
  - 
 
  
    EGAD00001006200 
   
  
    
    ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes clinical phenotype data. 
    
   
  
    
   
  - 
 
  
    EGAD00001006201 
   
  
    
    ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes the processed data from FMOne. 
    
   
  
    
   
  - 
 
  
    EGAD00001006202 
   
  
    
    This case represents the genomic findings of a pediatric glioblastoma patient who underwent multiple surgical resections and was treated with standard chemoradiation, as well as a novel recombinant poliovirus vaccine therapy. The results show the preservation of a STAG2 mutated clone, alongside elimination and emergence of other clones with oncogenic mutations through disease progression under different treatment modalities. Although STAG2 deficiency comprises only a small subset of gliomas, this case adds clinical evidence to existing preclinical data supporting a role for STAG2 mutations in gliomagenesis and resistance to standard therapies. 
    
   
  
    
   
  3 
 
  
    EGAD00001006203 
   
  
    
    This data set contains the raw .fastq files from one RNA-sequencing experiment. Endothelial cells of the basilar artery and endothelial cells of the carotid artery were post-mortem derived with laser microdissection and sequenced. For this analysis both the basilar- and the carotid artery endothelial cells of eleven individuals were sequenced. For more details please see: DMA Hermkens et al. "Profiling the Unique Protective Properties of Intracranial Arterial Endothelial Cells" Acta Neuropathol Commun. 2019 Oct 14; PMID: 31610812. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  22 
 
  
    EGAD00001006204 
   
  
    
   
  
    
      
      Illumina HiSeq 1500 
      
      MinION 
      
      PromethION 
      
    
   
  5 
 
  
    EGAD00001006205 
   
  
    
    ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes the raw RNA-seq data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  148 
 
  
    EGAD00001006206 
   
  
    
    10xGenomics single-cell RNA sequencing of glioblastoma patient tumor 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  25 
 
  
    EGAD00001006207 
   
  
    
    This dataset contains Mate Pair Sequencing data from 15 samples from 13 patients. Mate pair DNA library preparation was carried out using the Illumina MP v.2 reagents and protocol. In brief, fragmentation of genomic DNA was performed using a Hydroshear device to an insert size of 4.5 kb followed by sequencing with Illumina HiSeq 2000 instruments resulting in 30 Fastq files (paired end). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD00001006208 
   
  
    
    This dataset contains panel sequencing data from 33 samples. Targeted sequencing was performed by creating libraries using the Agilent SureSelect XT technology. Libraries were sequenced using molecular barcode-indexed ligation-based sequencing using a NextSeq500 (Illumina) instrument. Between three and six lanes per sample have been sequenced resulting in 262 Fastq files (paired end). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  33 
 
  
    EGAD00001006209 
   
  
    
    The dataset contains two samples from one patient. As a representative FFPE tissue sample, ET174 was histologically identified, targeted and microdissected with a puncher for nucleic acid extraction. RNA was extracted using the automated Maxwell system with the Maxwell 16 LEV RNA FFPE Kit (Promega), according to the manufacturer’s instructions. To evaluate FFPE RNA quality, we used the percentage of RNA fragments >200 nt fragment determination value (DV200). Only RNA samples with DV200 > 70% were included for sequencing on a NextSeq 500 (Illumina). Eight lanes have been sequenced resulting in 16 Fastq files (paired end). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001006210 
   
  
    
    This dataset contains exome sequencing data from 21 samples.
Whole-exome sequencing was performed by creating libraries using
the Illumina TruSeq exome enrichment kit following the manufacturer’s
instructions after size selection. Size selection was performed by
fractionation using a Covaris ultrasonicator, and subsequent selection
was performed using a 1.5% gel
Pippin Prep cassette (Sage Science). One lane per sample has been sequenced
resulting in 42 Fastq files (paired end). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001006211 
   
  
    
    This dataset contains whole genome sequencing data from 59 samples.
WGS libraries were prepared using the Illumina TruSeq Nano DNA
LT Library Prep or TruSeq Nano DNA HT Library Prep Kit following
the manufacturer’s instructions. In brief, 100 ng of genomic DNA was
fragmented to approximately 350 bp using a Covaris ultrasonicator
(Covaris). The fragmented DNA was then end-repaired, size-selected
using magnetic beads, extended with an ‘A’ base on the 3′ end and ligated
with TruSeq paired-end indexing adapters. Up to four lanes per sample have
been sequenced resulting in 222 Fastq files (paired end). 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  59 
 
  
    EGAD00001006212 
   
  
    
    Mutation accumulation over time in normal somatic cells contributes to cancer development and is proposed as a cause of ageing. DNA polymerases POLE and POLD1 replicate DNA with high fidelity during normal cell divisions. However, in some cancers defective proofreading due to acquired mutations in the exonuclease domains of POLE or POLD1 causes markedly elevated somatic mutation burdens with distinctive mutational signatures. POLE and POLD1 exonuclease domain mutations also cause familial cancer predisposition when inherited through the germline. Here, we sequenced normal tissue DNA from individuals with germline POLE or POLD1 exonuclease domain mutations. Increased mutation burdens with characteristic mutational signatures were found to varying extents in all normal adult somatic cell types examined, during early embryogenesis and in sperm. Mutation burdens were further markedly elevated in neoplasms from these individuals. Thus human physiology is able to tolerate ubiquitously elevated mutation burdens. Indeed, with the exception of early onset cancer, individuals with germline POLE and POLD1 exonuclease domain mutations are not reported to show abnormal phenotypic features, including those of premature ageing. The results, therefore, do not support a simple model in which all features of ageing are attributable to widespread cell malfunction directly resulting from somatic mutation burdens accrued during life. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  211 
 
  
    EGAD00001006213 
   
  
    
    WES data of paired primary and metastatic tumors 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  179 
 
  
    EGAD00001006215 
   
  
    
    Human dnase1l3 deficiency-Mouse AAV samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001006216 
   
  
    
    Plasma DNA profile in DNASE1L3 deficiency 
    
   
  
    
      
      NextSeq 500 
      
      Sequel 
      
    
   
  37 
 
  
    EGAD00001006217 
   
  
    
    The dataset contains Whole Exome Sequencing data (BAM files) of 22 samples from HER2+ metastatic breast cancer patients.
For 9 of the 13 tumour samples, paired controls from normal tissue are available.
Eight tumour samples are from treatment-responder patients and 5 are from non-responder patients. 
    
   
  
    
      
      unspecified 
      
    
   
  15 
 
  
    EGAD00001006218 
   
  
    
    This dataset contains miRNA-seq data from 10 patients.
Small RNAs were isolated as described previously57,58 from fresh-frozen
tumour material. In brief, total RNA was extracted using guanidinium
isothiocyanate/phenol extraction followed by 3′-adaptor ligation of
barcoded adenylated adaptors. Samples were pooled in two sets of five
samples. Subsequently, gel electrophoresis was used to isolate small
RNAs (19–35 nt) and purified using ethanol precipitation. Fragments
were then amplified using standard PCR, isolated using gel electrophoresis
and purified using ethanol precipitation. Samples were sequenced
on a HiSeq 2000 v.4 machine resulting in 10 Fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001006219 
   
  
    
    This dataset contains DRIP-seq data from 2 patients.
DNA–RNA hybrids were extracted from tissue derived from ETMR
patient-derived xenograft (PDX) models (BT183) that were treated
using topotecan or saline as described previously27. Tumours were
subsequently frozen and pelleted using ultracentrifugation. DNA–RNA
hybrids were extracted as described previously using the same protocol
that is applied for cultured cells21. DNA was extracted using proteinase
K followed by phenol–chloroform extraction and ethanol precipitation.
Subsequently the DNA was fragmented using the restriction enzymes
HindIII, EcoRI, BsrGI, XbaI and SspI (New England Biolabs). Digested
DNA was subsequently incubated with the anti-DNA–RNA hybrid antibody
S9.6 (Merck, MABE1095) and immunoprecipitated using agarose
beads. Bound DNA–RNA hybrids were eluted, incubated with proteinase K
and cleaned with an additional phenol–chloroform–ethanol
extraction. The DNA was subsequently sonicated and sequenced using
a HiSeq 2000 machine with a 50-bp single-read protocol. Each treatment
condition was performed in duplicate, and both RNase H and the
input were included as negative controls, resulting in 10 Fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001006220 
   
  
    
    February 2020 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada, as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001006221 
   
  
    
    This dataset contains merged data from the 22 Hodgkin lymphoma and 5 reactive lymph node samples, including count data and cell cluster assignments. 
    
   
  
    
   
  28 
 
  
    EGAD00001006222 
   
  
    
    SPECTA Lung cancer RNA FASTQ files (Illumina TST170 targeted analysis) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  120 
 
  
    EGAD00001006223 
   
  
    
    SPECTA Lung cancer DNA FASTQ files (Illumina TST170 targeted analysis) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  154 
 
  
    EGAD00001006224 
   
  
    
    Whole Exome Sequencing data from a retrospective paediatric HIV-disease progression cohort defined by the World Health Organization's criteria for paediatric HIV progression. The data comprise BAM and VCF files for 314 participants from 2 countries: Botswana and Uganda. DNA and RNA samples are linked in a bio-repository. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  314 
 
  
    EGAD00001006226 
   
  
    
    Formalin-fixed, paraffin-embedded samples from 19 PSC-IBD-CRCs, 15 adjacent (non-tumour) mucosa samples and 18 non-mucosal DNA samples were collected via the nationwide network and registry of histo- and cytopathology in the Netherlands (PALGA). DNA was extracted for molecular analysis. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  52 
 
  
    EGAD00001006227 
   
  
    
    VCF for 87 Argentinean samples. Only SNPs (no indels) that passed the Affymetrix QC. Data from Luisi et al. 2020, PLoS ONE, "Fine-scale genomic analyses of admixed individuals reveal unrecognized genetic ancestry components in Argentina". Reference Allele column does NOT contain reference allele from genome assembly. 
    
   
  
    
   
  87 
 
  
    EGAD00001006228 
   
  
    
    The following samples were generated from the patient samples used in the study:
- Bulk DNA sequencing specifically targeting mutated sites of interest derived from clonal cell populations (288 monoclonal colonies - 2 replicates).
- Targeted Muta-seq method (Patient P342: 2208 cells, Patient HRK: 1066 cells, Patient LAK: 618 cells, Patient P101: 1080 cells)
- Smart-seq2 method (Patient P342 - 768 individual cells). 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001006229 
   
  
    
    Data from a study of 148 samples from IPMNs, MCNs, and small associated invasive carcinomas from 18 patients using whole exome or targeted sequencing. Sequencing data from 77 samples out of the 148 samples in the complete study are available in this dataset, based on the permissions given by the participating patients and the informed consents. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  77 
 
  
    EGAD00001006230 
   
  
    
    Blood-based assays have shown increasing ability to detect circulating tumour DNA (ctDNA) in patients with early-stage cancer. However, detection of ctDNA in patients with non-small cell lung cancer (NSCLC) has continued to prove challenging. We performed retrospective analysis to quantify ctDNA levels in a cohort of 100 patients with early-stage NSCLC prior to treatment with curative intent. Where tumour tissue was available for whole exome sequencing, mutations identified were used to define patient-specific sequencing assays. For those 90 patients, plasma cell-free DNA was sequenced to high depth across capture panels targeting a median of 328 mutations specific to each patient. Data were analysed using Integration of Variant Reads (INVAR), detecting ctDNA in 66.7% of patients, including 52.7% (29 of 55) of patients with stage I disease and >88% detection for patients with stage II and III disease (16/18 and 15/17). ctDNA was detected in plasma at fractional concentrations as low as 9.1 × 10⁻⁶, and in patients with tumour volumes as low as 0.23 cm³. A 36-gene sequencing panel (InVisionFirst-Lung™) was used to analyse plasma DNA in 27 samples including the 10 cases without tumour exome data, and detected ctDNA in 59% of samples tested (16 of 27). Across the entire cohort, detection rates were higher in squamous cell carcinoma patients compared to adenocarcinoma patients (81% vs. 59%). Detection of ctDNA prior to treatment was associated with significantly shorter time free from relapse, across all patients and in patient subgroups, with Hazard Ratios ranging from 2.25 to >11. Our analysis indicates that for patients with stage I NSCLC, the median ctDNA fraction in plasma is approx. 12 parts per million (0.0012%). This indicates the limits of detection that would be required for ctDNA-based liquid biopsies to detect ctDNA in the majority of patients with early-stage NSCLC. 
    
   
  
    
   
  29 
 
  
    EGAD00001006231 
   
  
    
    Identification of patients with life-threatening diseases including leukemias or infections such as tuberculosis or COVID-19 is an important goal of modern precision medicine. However, there is an increasing divide between what is technically possible and what is allowed because of privacy legislation. We have recently illustrated that classical machine learning can identify leukemia patients based on their blood transcriptomes. To facilitate integration of any omics data from any data owner world-wide without violating privacy laws, we here introduce Swarm Learning (SL), a decentralized machine learning approach uniting edge computing, artificial intelligence (AI), blockchain and privacy protection without the need for a central coordinator thereby going beyond federated learning. To illustrate its feasibility, using more than 12,000 transcriptomes from peripheral blood mononuclear cells and more than 2,000 peripheral blood transcriptomes we demonstrate that SL of omics data distributed across different individual sites leads to disease classifiers that outperform those developed at individual sites. Yet, SL completely protects local privacy regulations by design. We propose this approach to noticeably accelerate the introduction of precision medicine. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  650 
 
  
    EGAD00001006232 
   
  
    
    RNA, T cell receptor and B cell receptor single cell sequencing data generated on T and B cells derived from patients with CNS autoimmune disease 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001006233 
   
  
    
    Whole genome sequencing for individualized cancer interpretation 
    
   
  
    
   
  94 
 
  
    EGAD00001006234 
   
  
    
    PolyA selection transcriptome profiling by high-throughput sequencing for individualized cancer interpretation 
    
   
  
    
   
  22 
 
  
    EGAD00001006235 
   
  
    
    Total RNA transcriptome profiling by high-throughput sequencing for individualized cancer interpretation 
    
   
  
    
   
  19 
 
  
    EGAD00001006236 
   
  
    
    Exome sequencing for individualized cancer interpretation 
    
   
  
    
   
  90 
 
  
    EGAD00001006237 
   
  
    
    Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  167 
 
  
    EGAD00001006238 
   
  
    
    In this study, a total of 300 patients with MIBC receiving chemotherapy were included; 62 received NAC before cystectomy and 245 received first-line chemotherapy upon detection of locally advanced (T4b) or metastatic disease. Treatment response was defined as pathological downstaging (< pTa,CIS,N0) after NAC or complete or partial response after first-line treatment (RECIST criteria). RNA-seq was performed using the QuantSeq FWD HT kit (Lexogen) with 500 ng input RNA from 121 tumor samples. Data provided here consist of 780 fastq files for RNA-seq. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  121 
 
  
    EGAD00001006239 
   
  
    
    In this study, a total of 300 patients with MIBC receiving chemotherapy were included; 62 received NAC before cystectomy and 245 received first-line chemotherapy upon detection of locally advanced (T4b) or metastatic disease. Treatment response was defined as pathological downstaging (< pTa,CIS,N0) after NAC or complete or partial response after first-line treatment (RECIST criteria). WES was performed using DNA from 165 tumors (76x median coverage) and associated germline DNA (46x median coverage). Data provided here consist of 5,828 fastq files for WES. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  330 
 
  
    EGAD00001006241 
   
  
    
    This dataset includes bulk RNA sequencing data for 14 samples (28 fastq files) from the study entitled "Three-dimensional human alveolar stem cell culture models reveal infection response to SARS-CoV-2". RNA sequencing libraries were generated with the TruSeq Stranded Total RNA Gold kit. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD00001006242 
   
  
    
    This dataset includes single cell RNA sequencing (scRNAseq) data for a total of 5 samples of SARS-CoV-2 infected human alveolar stem cell culture models, from the study entitled "Three-dimensional human alveolar stem cell culture models reveal infection response to SARS-CoV-2". Two samples were obtained from SARS-CoV-2 infection at an MOI of 1.0 (1 control and 1 infected case) and the other three were obtained from SARS-CoV-2 infection at an MOI of 0.1. The 10x Chromium Single Cell 3' Reagent kits were used to generate libraries. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001006243 
   
  
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001006244 
   
  
    
    Phenotype data for shotgun metagenomic sequencing data of nasopharyngeal fluid for studying nasopharyngeal colonization dynamics with Streptococcus pneumoniae and associated antimicrobial-resistance in a South African birth cohort.
https://www.ebi.ac.uk/ena/data/view/PRJEB37312 
    
   
  
    
   
  196 
 
  
    EGAD00001006245 
   
  
    
    RNA-seq data of the HCI011 and HCI011R models, GDC032 treated and control (total of 21 samples), from the paper: FOXM1 is a biomarker of resistance to PI3Kα inhibition in ER+ breast cancer that is detectable using metabolic imaging (Ros et al, 2020) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  21 
 
  
    EGAD00001006246 
   
  
    
    RNA-seq of skin from human subjects with and without lymphedema 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001006247 
   
  
    
    Raw untargeted metabolomics profiled by Metabolon Inc. for 540 samples from healthy individuals. Files include sample names and run details which can be matched to their metagenomic sequencing samples from PRJEB11532 and PRJEB17643. Information regarding metabolite metadata is also available, including 
    
   
  
    
   
  3 
 
  
    EGAD00001006248 
   
  
    
    Longitudinal and germline exome sequencing analysis of a mother and son pair who both developed adult-onset diploid AML identified a novel germline missense mutation DNMT3A p.P709S. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001006249 
   
  
    
    This study aims to use RNAseq to identify differentially expressed transcripts in human melanoma cells that over-express the cell surface protein, LRRN4CL, relative to empty-vector control cells, to provide mechanistic insight into how LRRN4CL over-expression confers enhanced pulmonary metastatic colonisation abilities. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001006250 
   
  
    
    NABUCCO cohort 1 sequencing data. The dataset includes:
* Whole exome DNAseq pre-treatment on tumor samples (n=24) matched with blood samples (n=24)
* RNAseq pre-treatment on tumor samples (n=18)
* RNAseq post-treatment on tumor samples (n=18). Not all pre-treatment samples are linked with post-treatment samples 
* High coverage Whole exome DNAseq on pre-treatment tumor samples (n=3) matched with post-treatment metastasized lymph nodes isolated with laser microdissection (n=3)
* All samples are labelled with the response phenotype (Complete Responder or Non-Complete Responder) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  69 
 
  
    EGAD00001006251 
   
  
    
    Multiregional whole-exome sequencing was done using 48 tumor samples (range: 4-10 tumor samples/patient) from 9 patients with adenocarcinomas of the stomach and gastroesophageal junction (GC) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  56 
 
  
    EGAD00001006253 
   
  
    
    WES from 51 cases initially diagnosed as Malignant Peripheral Nerve Sheath Tumours (MPNST) and RNA sequencing data from 10 MPNST cases. Find more information in article: Lyskjær et al, 2020, J Pathol, "H3K27me3 expression and methylation status in histological variants of malignant peripheral nerve sheath tumours". 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  98 
 
  
    EGAD00001006255 
   
  
    
    Chronic liver disease is associated with metabolic dysregulation, liver failure and hepatocellular carcinoma. We analysed somatic mutations from 1202 genomes across 32 liver samples, including normal controls, alcohol-related and non-alcoholic fatty liver disease. Five of 27 patients with liver disease carried hotspot driver mutations in FOXO1, the major transcription factor downstream of insulin signalling. FOXO1 mutations were independently acquired by up to 5 distinct clones within the same patient’s sample, and impaired insulin-mediated nuclear export of FOXO1. GPAM, which produces storage triacylglycerol from dietary calories, also had significant excess of mutations, similarly exhibiting convergent evolution within biopsies. Telomeres were shorter in diseased than normal liver, with attrition more pronounced in larger clones. Multiple independent acquisitions of drivers within one small liver sample imply that such mutations could affect hundreds of grams of tissue across the whole organ, potentially contributing to systemic metabolic dysfunction. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  1111 
 
  
    EGAD00001006257 
   
  
    
    Whole transcriptome RNA-Sequencing was performed on 148 bone marrow or peripheral blood samples of B-ALL patients. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  148 
 
  
    EGAD00001006258 
   
  
    
    RNA paired end sequencing of 59 adrenocortical tumors and 4 controls. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  63 
 
  
    EGAD00001006259 
   
  
    
    Chronic hepatitis C virus (HCV) infection is associated with CD8+ T-cell exhaustion characterized by limited effector functions and thus compromised anti-viral activity. Exhausted HCV-specific CD8+ T cells are comprised of memory-like and terminally exhausted CD8+ T-cell subsets. So far, little is known about the molecular profile and fate of these cells after elimination of chronic antigen stimulation by direct acting antiviral therapy (DAA). Here, we report an antigen-driven molecular core signature underlying exhausted CD8+ T-cell subset heterogeneity in chronic viral infection with a progenitor/progeny relationship of memory-like and terminally exhausted HCV-specific CD8+ T cells via an intermediate stage. Furthermore, transcriptional profiling reveals that the memory-like cells remain after DAA-mediated cure while terminally exhausted HCV-specific CD8+ T-cell subsets are lost. Thus, the memory polarization of the overall HCV-specific CD8+ T-cell response after cure does not result from re-differentiation of exhausted T cells. Consequently, antigen elimination has little impact on the exhausted core signature of memory-like CD8+ T cells that remains clearly different from bona fide T-cell memory. These results identify a molecular signature of T-cell exhaustion that is imprinted like a chronic scar in HCV-specific CD8+ T cells even after HCV cure, highlighting the requirement of re-programming to elicit full effector potential of exhausted T cells. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  19 
 
  
    EGAD00001006260 
   
  
    
    SPECTA Lung cancer VCF files 
    
   
  
    
   
  154 
 
  
    EGAD00001006261 
   
  
    
    Bam and fastq files from RNA-seq of PDAC samples used in the PCSI mismatch repair study 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001006262 
   
  
    
    Bam files from WGS of PDAC samples used in the PCSI mismatch repair study 
    
   
  
    
   
  - 
 
  
    EGAD00001006263 
   
  
    
    linking 3 samples out of EGAD00001002528 to EGAS0001004517 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001006264 
   
  
    
    18 samples of RNA-Seq of serially passaged TIC-enriched spheres of colorectal cancer (CRC), sequenced on HiSeq2000 and HiSeq2500 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001006265 
   
  
    
    WGS data of serially passaged TIC-enriched spheres of colorectal cancer 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001006266 
   
  
    
    WES data of serially passaged TIC-enriched spheres of colorectal cancer (CRC) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001006268 
   
  
    
    RNAseq BAM files for Coding and non-coding mantle cell lymphoma driver mutations 
    
   
  
    
   
  102 
 
  
    EGAD00001006269 
   
  
    
    This dataset includes microRNA profiling of 61 early-passage metastatic melanoma cell lines. The data are provided as single-end small RNA seq fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  61 
 
  
    EGAD00001006270 
   
  
    
    This dataset includes transcriptome profiling of 68 early-passage metastatic melanoma cell lines. The data are provided as paired-end RNA seq fastq files. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  68 
 
  
    EGAD00001006271 
   
  
    
    This dataset includes whole exome profiling of 65 early-passage metastatic melanoma cell lines. The data are provided as BAM files for tumor and normal samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  126 
 
  
    EGAD00001006272 
   
  
    
    We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response. 
    
   
  
    
      
      unspecified 
      
    
   
  32 
 
  
    EGAD00001006273 
   
  
    
    We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response. 
    
   
  
    
      
      unspecified 
      
    
   
  150 
 
  
    EGAD00001006274 
   
  
    
    We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response. 
    
   
  
    
   
  32 
 
  
    EGAD00001006275 
   
  
    
    We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response. 
    
   
  
    
   
  150 
 
  
    EGAD00001006276 
   
  
    
    Whole genome sequencing data on D19-0702 (AUS1), presented in Martin et al. 2020 (AUS1). WGS (Illumina HiSeq) was performed at Kinghorn Centre for Clinical Genetics, Garvan Institute of Medical Research. Data was analyzed using the Seave bioinformatic analysis pipeline (https://www.seave.bio). 
    
   
  
    
   
  1 
 
  
    EGAD00001006278 
   
  
    
    HiChIP experiments with two sequencing libraries each. Illumina HiSeq 4000/2500. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001006279 
   
  
    
    ChIP-seq experiments: fastq files; both ChIP and Input for each sample. Illumina HiSeq 2500. ChIP-seq alignment files for trimmed,  mapping q20 and nonredundant reads; both ChIP and Input for each sample. Software: Trim Galore v0.3.7; Bowtie 2 v2.1.0; samtools v1.7 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  76 
 
  
    EGAD00001006280 
   
  
    
    Endometrial carcinoma, the most common gynecologic cancer, develops from endometrial epithelium which is composed of secretory and ciliated cells. Pathologic classification is unreliable and there is a need for prognostic tools. We used single cell sequencing to study organoid model systems derived from normal endometrium to discover novel markers specific for endometrial ciliated or secretory cells. We performed single cell sequencing on endometrial and ovarian tumours, and on organoids both treated with DBZ and normal and found both secretory-like and ciliated-like tumour cells. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  18 
 
  
    EGAD00001006281 
   
  
    
    Temozolomide (TMZ) is an oral alkylating agent used for the treatment of glioblastoma and is now becoming a chemotherapeutic option in patients diagnosed with high-risk low-grade gliomas. The O-6-methylguanine-DNA methyltransferase (MGMT) is responsible for the direct repair of the main TMZ-induced toxic DNA adduct, the O6-Methylguanine lesion. MGMT promoter hypermethylation is currently the only known biomarker for TMZ response in glioblastoma patients. Here we show that a subset of recurrent gliomas carry MGMT genomic rearrangements that lead to MGMT overexpression, independently from changes in its promoter methylation. By leveraging the CRISPR/Cas9 technology we generated some of these MGMT rearrangements in glioma cells and demonstrated that they lead to TMZ resistance both in vitro and in vivo. Lastly we showed that such fusions can be detected in tumor-derived exosomes and could potentially represent an early detection marker of tumor recurrence in a subset of patients treated with TMZ. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  136 
 
  
    EGAD00001006282 
   
  
    
    We analyzed baseline and on-therapy tumor biopsies from 101 patients with advanced melanoma treated with nivolumab (anti-PD-1) alone or combined with ipilimumab (anti-CTLA-4). Analysis of whole transcriptome data showed that T cell infiltration and interferon-gamma signaling signatures corresponded most highly with clinical response to therapy, with a reciprocal decrease in cell cycle and WNT signaling pathways in responding biopsies. Clinical outcome differences were likely not due to differential melanoma cell responses to interferon-gamma, as 57 human melanoma cell lines exposed in vitro to this cytokine showed a conserved interferon-gamma transcriptome response unless they had mutations that precluded signaling from the interferon-gamma receptor. Therefore, the magnitude of the antitumor T cell response and the corresponding downstream interferon-gamma signaling are the main drivers of clinical response or resistance to immune checkpoint blockade therapy. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  54 
 
  
    EGAD00001006283 
   
  
    
    Whole exome sequencing of eight affected skin biopsies (“lesional”) from five giant CMN patients (age range, 4-58) with matching unaffected skin (not available in one patient) along with germline DNA. Agilent SureSelect V5+UTRs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001006284 
   
  
    
    We analyzed baseline and on-therapy tumor biopsies from 101 patients with advanced melanoma treated with nivolumab (anti-PD-1) alone or combined with ipilimumab (anti-CTLA-4). Analysis of whole transcriptome data showed that T cell infiltration and interferon-gamma signaling signatures corresponded most highly with clinical response to therapy, with a reciprocal decrease in cell cycle and WNT signaling pathways in responding biopsies. Clinical outcome differences were likely not due to differential melanoma cell responses to interferon-gamma, as 57 human melanoma cell lines exposed in vitro to this cytokine showed a conserved interferon-gamma transcriptome response unless they had mutations that precluded signaling from the interferon-gamma receptor. Therefore, the magnitude of the antitumor T cell response and the corresponding downstream interferon-gamma signaling are the main drivers of clinical response or resistance to immune checkpoint blockade therapy. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  70 
 
  
    EGAD00001006285 
   
  
    
    In the absence of recurrent gene mutations, evidence accumulates that epigenetic deregulation plays a prominent role in neuroblastoma biology. Here we provide genome wide H3K27ac profiles in 60 primary neuroblastoma samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  60 
 
  
    EGAD00001006286 
   
  
    
    In the absence of recurrent gene mutations, evidence accumulates that epigenetic deregulation plays a prominent role in neuroblastoma biology. Here we provide RNAseq profiles in 71 primary and relapse neuroblastoma samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  71 
 
  
    EGAD00001006287 
   
  
    
    NGS data of 12 patients enrolled in the Chinese Patient Assistance Program from multiple centers who received pemetrexed alone or combined with platinum as initial chemotherapy and continued pemetrexed maintenance therapy for advanced lung adenocarcinoma from November 2014 to June 2017. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001006288 
   
  
    
    This is the raw data obtained from shallow whole-genome sequencing of plasma DNA (plasma-Seq) for calling of somatic copy number alterations as well as focal amplifications and deletions from patients with breast, colorectal and non-small cell lung cancer. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  48 
 
  
    EGAD00001006289 
   
  
    
    Targeted sequencing (t-NGS) of frozen advanced cancers tissue was established with three panels covering 395 to 560 candidate cancer genes. Sequencing was done using the 2x150-bp paired-end technology on the Illumina MiSeq and NextSeq500 platforms. The DNA libraries of all coding exons were done with the HaloPlex Target Enrichment System (Agilent, Santa Clara, CA, USA). 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  735 
 
  
    EGAD00001006291 
   
  
    
    RNA sequencing data from breast cancers (n=18) and their matched HN tissues (n=36), healthy breast from cosmetic reduction mammoplasty (RM; n=5), and risk reducing mastectomies (RR, n=5), with peritumoral samples excised proximal to (TP, less than 2 cm) and distal from (TD, 5-10 cm) the primary tumor. 
    
   
  
    
      
      unspecified 
      
    
   
  66 
 
  
    EGAD00001006292 
   
  
    
    Single cell sequencing of fathers who have had children with autism and fathers who have had multiple children, but no children with autism. The data were processed on a 10X Chromium and sequenced on a NextSeq. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001006293 
   
  
    
    Circulating tumor-derived DNA (ctDNA) can be used to monitor cancer dynamics noninvasively. Detection of ctDNA can be challenging in patients with low-volume or residual disease, where plasma contains very few tumor-derived DNA fragments. We show that sensitivity for ctDNA detection in plasma can be improved by analyzing hundreds to thousands of mutations that are first identified by tumor genotyping. We describe the INtegration of VAriant Reads (INVAR) pipeline, which combines custom error-suppression methods and signal-enrichment approaches based on biological features of ctDNA. With this approach, the detection limit in each sample can be estimated independently based on the number of informative reads sequenced across multiple patient-specific loci. We applied INVAR to custom hybrid-capture sequencing data from 176 plasma samples from 105 patients with melanoma, lung, renal, glioma, and breast cancer across both early and advanced disease. By integrating signal across a median of >10^5 informative reads, ctDNA was routinely quantified to 1 mutant molecule per 100,000, and in some cases with high tumor mutation burden and/or plasma input material, to individual parts per million. This resulted in median Area Under the Curve (AUC) values of 0.98 in advanced cancers, and 0.80 in early stage and challenging settings for ctDNA detection. We generalized this method to whole-exome and whole-genome sequencing, showing that INVAR may be applied without requiring personalized sequencing panels, so long as a tumor mutation list is available. As tumor sequencing becomes increasingly performed, such methods for personalized cancer monitoring may enhance the sensitivity of cancer liquid biopsies. 
    
   
  
    
   
  65 
 
  
    EGAD00001006294 
   
  
    
    Whole genome sequencing data of isogenic ATRX/TP53 knockout clones of the neuroblastoma cell line SK-N-SH 
    
   
  
    
   
  5 
 
  
    EGAD00001006295 
   
  
    
    SNPs and INDELs of novel hereditary neurological disease genes in Mali using Beckman 8800, ABI 3730/3730xl. 
    
   
  
    
   
  10 
 
  
    EGAD00001006296 
   
  
    
    Like many childhood cancers, malignant rhabdoid tumours (MRT) are thought to arise from aberrant foetal development. Although MRT predominantly exhibit a mesenchymal phenotype, it has been suggested that the foetal root of MRT lies in neural crest development. Here, we combine phylogenetic analyses of MRT, single cell mRNA assays, and functional experiments in patient-derived MRT organoids, to define the embryological origin of MRT and explore therapeutic avenues that may drive MRT differentiation. Phylogenetic analyses from the distribution of somatic mutations revealed that MRT were related to neural crest-derived, but not to mesodermal tissues, providing direct evidence of the neural crest origin of MRT in humans. In MRT organoids, reversal of the principal driver event underpinning MRT, SMARCB1 loss, induced differentiation along mesenchymal pathways. Together, these findings placed MRT cells on a developmental trajectory of neural crest to mesenchyme conversion, and defined the transcriptional changes underpinning MRT differentiation. Searching perturbation databases for agents that mimic these mRNA changes, we identified HDAC and mTOR inhibition as potential differentiation agents. Treatment of MRT organoids with this drug combination induced proliferation arrest with transcriptional changes akin to SMARCB1 re-expression. Our study defines the embryological root of MRT and proposes a differentiation treatment for this often fatal childhood cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  30 
 
  
    EGAD00001006297 
   
  
    
    270 samples with ALK-positive non-small cell lung cancer, targeted sequencing (198 kb panel size) 
    
   
  
    
   
  270 
 
  
    EGAD00001006298 
   
  
    
    268 samples with ALK-positive non-small cell lung cancer, ultra-low coverage whole genome sequencing 
    
   
  
    
   
  268 
 
  
    EGAD00001006299 
   
  
    
    The dataset consists of 207 glioma samples of WHO grades II, III and IV. It includes 182 tumor-derived genomic data sets covering ~700 cancer-related and epigenetic-related genes, with matched blood samples for 48 specimens. It also includes transcriptomic data for 105 specimens. In total, 335 BAM files were deposited. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  335 
 
  
    EGAD00001006301 
   
  
    
    The AVENIO ctDNA Expanded Kit is a next-generation sequencing (NGS) liquid biopsy assay with a 77 gene panel (192 kb) containing genes in U.S. National Comprehensive Cancer Network (NCCN) Guidelines and emerging cancer biomarkers. This pan-cancer assay was applied to 48 plasma samples from patients with breast, colorectal and non-small cell lung cancer. After sequencing 150bp paired-end, reads were aligned to the hg38 genome with the AVENIO Oncology Analysis Software (version 2.0). These files are the deduplicated alignments generated by the analysis software used for subsequent variant, indel and CNV calling. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  48 
 
  
    EGAD00001006302 
   
  
    
    WES data from clonal pathologic bone marrow plasma cells of multiple myeloma patients. For each patient there are three samples: pathologic bone marrow plasma cells at diagnosis, pathologic bone marrow plasma cells after VRD treatment, and T lymphocytes as germline control. There are 14 patients in total. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001006303 
   
  
    
    The valve methylation dataset consists of 12 bam files of human non-diseased valve tissue samples that are free from calcification (6 aortic and 6 mitral valves - matched; 10 males: 2 females; age range 42 – 64 years, mean age 52.2 years, SD 9.9682). Donor hearts are free from cardiovascular and valvular complications. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001006304 
   
  
    
    FASTQ files of the polyA+ (oligo-dT) RNA-Seq dataset from the POPS SGA (Small for Gestational Age) samples and their matched controls. The POP study placental biopsies were collected within 30 minutes of birth and flash frozen in RNAlater (ThermoFisher). For each biopsy, total placental RNA was extracted from approximately 5 mg of tissue using the “mirVana miRNA Isolation Kit” (Ambion) followed by DNase treatment (“DNA-free DNA Removal Kit”, Ambion). RNA quality was assessed with the Agilent Bioanalyzer and all the samples with RIN values ≥ 7.0 were used in the downstream experiments. RNA libraries were prepared from 1 µg of total placental RNA with the TruSeq Stranded mRNA Library Prep Kit (Illumina), which captures polyA-tailed transcripts by oligo-dT beads, then pooled and sequenced (single-end, 50bp) using a Single End V4 cluster kit and Illumina HiSeq2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  59 
 
  
    EGAD00001006305 
   
  
    
    To further understand the biology of Sonic hedgehog medulloblastoma and its molecular subtypes, we studied 250 human Shh-MB using strand-specific RNA sequencing. We identified novel alterations within the cAMP dependent pathway and found that 18% of tumors have genetic events that directly target the abundance and/or stability of MYCN. We also discovered an extensive network of fusions in focally amplified regions, and several loss-of-function fusions in tumor suppressor genes PTCH, SUFU and NCOR1. Molecular convergence on a core of specific genes by nucleotide variants, copy number aberrations, and gene fusions highlights key roles of specific pathways in the pathogenesis of Sonic hedgehog medulloblastoma. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  82 
 
  
    EGAD00001006306 
   
  
    
    Purpose
Exploratory analyses of CheckMate 066 and 067 trials were conducted to investigate associations of tumor mutational burden (TMB), a 4-gene inflammatory gene expression signature, and BRAF mutation status with tumor response, progression-free survival (PFS), and overall survival (OS) in patients with advanced melanoma.
Patients and Methods
Patients with known programmed death ligand 1 (PD-L1) expression and BRAF mutation status received nivolumab (NIVO) or dacarbazine in CheckMate 066 and either NIVO, ipilimumab (IPI), or NIVO+IPI in CheckMate 067. Whole exome sequencing and RNA sequencing were used to determine TMB and inflammatory gene expression signature scores, respectively. These biomarkers were evaluated in terms of their association with PFS and OS.
Results
In the NIVO, NIVO+IPI, and IPI arms of CheckMate 067, longer survival was associated with high (> median) versus low (≤ median) TMB with hazard ratios (HRs) (95% confidence interval [CI]) for PFS of 0.45 (0.30–0.65), 0.55 (0.38–0.81), and 0.60 (0.43–0.82), and for OS of 0.46 (0.30–0.71), 0.53 (0.34–0.82), and 0.52 (0.36–0.74), respectively. For NIVO-treated patients, these results were confirmed in CheckMate 066. A survival benefit was observed with high TMB and absence of BRAF mutation. Survival was associated with high versus low inflammatory signature scores with HRs (95% CI) for PFS of 0.56 (0.34–0.94), 0.40 (0.23–0.72), and 0.43 (0.27–0.70), and for OS of 0.37 (0.20–0.66), 0.38 (0.19–0.74), and 0.46 (0.27–0.79), in the NIVO, NIVO+IPI, and IPI arms, respectively. Weak correlations were observed between PD-L1, TMB, and the inflammatory signature.
Conclusions
Combined assessment of TMB, inflammatory gene expression signature, and BRAF mutation status may be predictive for response to immunotherapy in advanced melanoma. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  38 
 
  
    EGAD00001006307 
   
  
    
    RNA-seq for the Caldas Lab breast cancer PDTX collection. 
This includes both single-end and paired-end runs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  117 
 
  
    EGAD00001006309 
   
  
    
    Purpose
Exploratory analyses of CheckMate 066 and 067 trials were conducted to investigate associations of tumor mutational burden (TMB), a 4-gene inflammatory gene expression signature, and BRAF mutation status with tumor response, progression-free survival (PFS), and overall survival (OS) in patients with advanced melanoma.
Patients and Methods
Patients with known programmed death ligand 1 (PD-L1) expression and BRAF mutation status received nivolumab (NIVO) or dacarbazine in CheckMate 066 and either NIVO, ipilimumab (IPI), or NIVO+IPI in CheckMate 067. Whole exome sequencing and RNA sequencing were used to determine TMB and inflammatory gene expression signature scores, respectively. These biomarkers were evaluated in terms of their association with PFS and OS.
Results
In the NIVO, NIVO+IPI, and IPI arms of CheckMate 067, longer survival was associated with high (> median) versus low (≤ median) TMB with hazard ratios (HRs) (95% confidence interval [CI]) for PFS of 0.45 (0.30–0.65), 0.55 (0.38–0.81), and 0.60 (0.43–0.82), and for OS of 0.46 (0.30–0.71), 0.53 (0.34–0.82), and 0.52 (0.36–0.74), respectively. For NIVO-treated patients, these results were confirmed in CheckMate 066. A survival benefit was observed with high TMB and absence of BRAF mutation. Survival was associated with high versus low inflammatory signature scores with HRs (95% CI) for PFS of 0.56 (0.34–0.94), 0.40 (0.23–0.72), and 0.43 (0.27–0.70), and for OS of 0.37 (0.20–0.66), 0.38 (0.19–0.74), and 0.46 (0.27–0.79), in the NIVO, NIVO+IPI, and IPI arms, respectively. Weak correlations were observed between PD-L1, TMB, and the inflammatory signature.
Conclusions
Combined assessment of TMB, inflammatory gene expression signature, and BRAF mutation status may be predictive for response to immunotherapy in advanced melanoma. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001006311 
   
  
    
    SF11940 snATAC Sequencing. Anaplastic Astrocytoma, IDH-mutant. Tumor Location: Left Frontal. Age: 29. Sex: Male. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006312 
   
  
    
    SF10679 snATAC Seq. Oligodendroglioma, Anaplastic (WHO gr. 3). Tumor Location: Frontal. Age: 43. Sex: Male. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006313 
   
  
    
    SF11310 Oligodendroglioma, IDH-mutant. Tumor Location: Frontal. Age: 22. Sex: Female. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006314 
   
  
    
    SF10320 Unknown. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006315 
   
  
    
    SF12374 snATAC Oligodendroglioma. Tumor Location: Right frontal. Age: 33. Sex: Male. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006316 
   
  
    
    SF10619 snATAC Oligodendroglioma (WHO gr. 2). Tumor Location: Parietal. Age: 52. Sex: Male. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006317 
   
  
    
    SF4007 snATAC Seq. Oligodendroglioma (WHO gr. 2). Tumor Location: Left frontotemporal. Age: 33. Sex: Female. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006318 
   
  
    
    Astrocytoma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001006319 
   
  
    
    SF10207 snATAC Seq. Oligodendroglioma, Anaplastic (WHO gr. 3). Tumor Location: Frontal. Age: 43. Sex: Male. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006321 
   
  
    
    RNA sequencing during time series 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  58 
 
  
    EGAD00001006322 
   
  
    
    In the current study we report for the first time a unique collection of six leukemias and two sarcomas from XP-C. Comprehensive WGS-based mutational analysis provides a genetic explanation for the increased incidence of leukemia in XP-C and describes a unique mutational process in internal tumors associated with NER deficiency. Raw data are provided in FASTQ format and variant analysis as VCF files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  15 
 
  
    EGAD00001006324 
   
  
    
    Exome sequences of primary tumor/metastatic/germline DNA trios.
Tumoral and germline samples were sequenced to an expected depth of respectively 150M and 50M reads in order to obtain differential depth (>100X versus >30X).
Submitted data are paired-end fastq files. 
    
   
  
    
   
  81 
 
  
    EGAD00001006325 
   
  
    
    Single cell full transcriptome sequencing of CD19 CAR T-cell infusion products used for standard of care treatment for relapsed/refractory large B-cell lymphoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001006326 
   
  
    
    The SARS-CoV-2 pandemic has led to increasing numbers of COVID-19 patients all over the world. Aetiopathologies range from no symptoms and mild flu-like illness to severe cases succumbing to respiratory failure. Reports of a dysregulated immune system in severe cases, showing similarities to cytokine release syndrome, call for better characterization and understanding of the changes in the immune system, as well as their variance across COVID-19 patients, in order to design host-directed therapies accordingly. Here, we profiled blood transcriptomes of 39 COVID-19 patients and 10 control donors. Enriched granulocyte signatures in whole blood samples were verified in granulocyte samples from 49 COVID-19 patients in a second cohort. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  79 
 
  
    EGAD00001006327 
   
  
    
    Single cell hybrid-capture targeted sequencing of CD19 CAR T-cell infusion products used for standard of care treatment for relapsed/refractory large B-cell lymphoma. 
    
   
  
    
   
  24 
 
  
    EGAD00001006328 
   
  
    
    Tumor and matching normal exomes for 28 GZL cases (n=56) and targeted capture sequencing for 42 GZL cases with 3 matching normals and 2 pooled normals (n=47) 
    
   
  
    
   
  103 
 
  
    EGAD00001006329 
   
  
    
    RNA-Seq samples from the BELOB clinical trial study to find transcriptome associations with response to Bevacizumab and CCNU in glioblastoma patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  96 
 
  
    EGAD00001006330 
   
  
    
    Whole RNA-sequencing of CD34+ cells and neutrophils derived from MPN patients before hydroxycarbamide treatment and after 9-months of treatment. CD34+ cells= 5 patients, 10 samples; neutrophils= 7 patients, 14 samples. Fastq files provided. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001006331 
   
  
    
    Biopsies from the terminal ileum and rectum of healthy individuals are digested on ice to single cells and processed for single-cell RNA-sequencing (10X Genomics and Illumina). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. 
    
   
  
    
   
  9 
 
  
    EGAD00001006332 
   
  
    
    This dataset contains single cell RNA sequencing data from Organoids grown in high nutrient (H) and low nutrient (L) medium. Organoids were grown for 110 days. Organoids were grown from a patient IPS cell line with a heterozygous mutation in TSC2 (Patient 1 TSC2+/- iPSCs) and an isogenic control cell line (TSC2+/+). 
    
   
  
    
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001006333 
   
  
    
    This dataset contains whole genome sequencing data of a patient IPS cell line with a heterozygous mutation in TSC2 (Patient 1 TSC2+/- iPSCs). This dataset further contains whole genome sequencing data of two tumors showing CNLOH. Tumors were isolated from organoids grown using the patient IPS cell line (Patient 1 TSC2+/- iPSCs). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001006335 
   
  
    
    RNA-seq data for ALL patients as described in "The application of RNA sequencing for the diagnosis and genomic classification of pediatric acute lymphoblastic leukemia" 
    
   
  
    
   
  133 
 
  
    EGAD00001006336 
   
  
    
    Paired whole exome sequencing data of the HIPO head and neck cancer (HNC) (n=83), using Agilent SureSelect V4+UTRs and V6+UTRs with the sequencing platforms HiSeq2000 and HiSeq2500. The reads were aligned to hg19. This is part of project H019. 
    
   
  
    
   
  166 
 
  
    EGAD00001006337 
   
  
    
    The human placenta harbours chromosomal aberrations that are absent from the fetus in one to two percent of pregnancies. This confined mosaicism suggests that embryonic genetic bottlenecks exist, which phylogenetically segregate placental tissue. Here, we studied the somatic genetic landscape of human placentas by whole genome sequencing of 86 placental biopsies and of 106 microdissections. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  278 
 
  
    EGAD00001006338 
   
  
    
    Whole genome sequencing data (Illumina HiSeq and NovaSeq) of clonal cultures derived from pediatric human bone marrow-derived hematopoietic stem and multipotent progenitor cells (in total 44 samples from 10 donors) and bulk pediatric acute myeloid leukemia blasts (in total 6 samples from 6 patients) to study the mutation accumulation. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  118 
 
  
    EGAD00001006339 
   
  
    
    To investigate the immune response and mechanisms associated with severe COVID-19, we performed single-cell RNA-seq on nasopharyngeal and bronchial samples from 19 clinically well-characterized patients with moderate or critical disease and from 5 healthy controls. We identified airway epithelial cell types and states vulnerable to SARS-CoV-2 infection. In COVID-19 patients, epithelial cells showed an average threefold increase in expression of the SARS-CoV-2 entry receptor ACE2, which correlated with interferon signals by immune cells. Compared with moderate cases, critical cases exhibited stronger interactions between epithelial and immune cells, as indicated by ligand–receptor expression profiles, and activated immune cells, including inflammatory macrophages expressing CCL2, CCL3, CCL20, CXCL1, CXCL3, CXCL10, IL8, IL1B and TNF. The transcriptional differences in critical cases compared with moderate cases likely contribute to clinical observations of heightened inflammatory tissue damage, lung injury and respiratory failure. Our data suggest that pharmacologic inhibition of the CCR1 and/or CCR5 pathways may suppress immune hyperactivation in critical COVID-19. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD00001006340 
   
  
    
    Dataset contains paired-end whole exome sequencing data from 2 glioma patients (1 oligodendroglioma and 1 astrocytoma), derived cultured cells, and derived murine xenografts. 
    
   
  
    
   
  28 
 
  
    EGAD00001006341 
   
  
    
    This dataset contains high-throughput RNA-sequencing of 14 samples, each sample comprising oligodendrocytes derived from human induced pluripotent stem cells, from individuals with and without a balanced t(1;11) translocation which substantially increases risk of major mental illness. 5 samples derive from 2 control individuals, and 9 samples from 3 individuals carrying the translocation. Libraries were prepared from each total-RNA sample using the TruSeq Stranded Total RNA with Ribo-Zero kit. Libraries were then sequenced using the NextSeq 500/550 High-Output v2 (150 cycle) Kit on the NextSeq 550 platform. Raw paired-end sequencing data is stored in two FASTQ files per sample. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  14 
 
  
    EGAD00001006342 
   
  
    
    This dataset was used to characterise T cell gene expression and clonality at sites of active inflammation within the joints of psoriatic arthritis (PsA) patients, and to compare these results with T cells from the peripheral blood of those same patients. Freshly sorted CD45RA negative CD3+CD4+ and CD3+CD8+ single cells from four patients were individually flow sorted into 96-well full-skirted plates (Eppendorf) containing 10 µL of a 2% Dithiothreitol (DTT, 2M Sigma-Aldrich), RLT lysis buffer (Qiagen) solution. Cell lysates were sealed, mixed and spun down before storing at -80 °C. Paired-end multiplexed sequencing libraries were prepared following the Smart-seq 2 protocol using the Nextera XT DNA library prep kit (Illumina). A pool of barcoded libraries from four different plates was sequenced across two lanes on the Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4703 
 
  
    EGAD00001006343 
   
  
    
    Whole genome sequencing of HSPC and SI clones from 3 disomy and 3 trisomy 21 fetal samples and 2 TMD samples (NovaSeq 6000 samples). 22 disomy clones and 20 trisomy clones were included in this experiment. 11 bulk samples were also included. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  53 
 
  
    EGAD00001006344 
   
  
    
    Additional Neuroblastoma whole genome sequencing data 
    
   
  
    
   
  30 
 
  
    EGAD00001006345 
   
  
    
    Raw FastQ Files of 69 samples of endometrial tissue from uterus (rudiments) of patients diagnosed with MRKH Type 1/2 or healthy controls. Each sample consists of 2 lanes paired-end RNA sequencing data. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  69 
 
  
    EGAD00001006346 
   
  
    
    The dataset consists of a multisample VCF (version 4.1) and the corresponding annotated MAF file, containing the somatic point mutations found from exome sequencing across the high-grade T1 bladder cancer cohort (HGT1, n=61 samples). The VCF file is in accordance with the HTS format specifications (https://samtools.github.io/hts-specs/). The dataset also comprises a CSV file with clinical data. 
    
   
  
    
   
  61 
 
  
    EGAD00001006349 
   
  
    
    Data supporting: “Multi-omic cross-sectional cohort study of pre-malignant Barrett’s esophagus reveals structural variation and retrotransposon activity occur early in cancer evolution.” Katz-Summercorn, Jammula et al.
WGS (BAM files) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001006350 
   
  
    
    Human organoids recapitulating the cell-type diversity and function of their target organ are valuable for basic and translational research. We developed light-sensitive human retinal organoids with multiple nuclear and synaptic layers, and functional synapses. We sequenced the RNA of 285,441 single cells from these organoids at seven developmental time points and from the periphery, fovea, pigment epithelium and choroid of light-responsive adult human retinas. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  124 
 
  
    EGAD00001006351 
   
  
    
    Here, we profiled the gut microbiota in a discovery (n = 1,011) and validation (n = 484) cohort comprising Swedish subjects naive for diabetes treatment and grouped by glycemic status. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1495 
 
  
    EGAD00001006352 
   
  
    
    We performed whole genome sequencing to detect possible off-target mutations induced by prime editing. Liver organoids, derived from a healthy control, were transfected with either control (GFP) plasmids or prime editing plasmids (GFP+PE2+pegRNA+nickRNA) to induce a 6-bp deletion in CTNNB1. One control and two prime-edited organoid lines were clonally expanded from single cells. High-throughput sequencing was performed on the complete genomic DNA isolated from these clonal lines, as well as the starting culture (bulk). After correction for germline mutations in the starting culture, new mutations in the control and prime-edited lines were compared. The same approach was followed in small intestinal organoids, derived from a patient with disease-causing 3-bp deletion in DGAT1. In these small intestinal organoids, prime editing was used to insert the 3 missing nucleotides. Two corrected clones were compared to one control clone. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001006353 
   
  
    
    Data supporting: “Multi-omic cross-sectional cohort study of pre-malignant Barrett’s esophagus reveals structural variation and retrotransposon activity occur early in cancer evolution.” Katz-Summercorn, Jammula et al.
RNAseq (BAM files) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001006354 
   
  
    
    Phenotypic data for 475 human samples, including:
Demographics
Anthropometrics
Diet data
Clinical data
Time of day
Season in which serum sample was taken 
    
   
  
    
   
  1 
 
  
    EGAD00001006355 
   
  
    
    Whole transcriptome RNA-seq of 25 UPS samples - raw FASTQ sequences, 125x2 nc paired-end reads, min 30M PE, HiSeq technology 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  25 
 
  
    EGAD00001006356 
   
  
    
    RNA-seq of Bone Metastasis from breast and prostate cancer (4 breast and 5 prostate samples). Dataset contains BAM files from RNA-seq performed using Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      Ion Torrent S5 XL 
      
    
   
  288 
 
  
    EGAD00001006357 
   
  
    
    Dataset with 81 whole exome sequences from Iberian Roma samples. 
    
   
  
    
      
      unspecified 
      
    
   
  81 
 
  
    EGAD00001006358 
   
  
    
    VCF file with genome-wide data for 62 Iberian Roma samples. 
    
   
  
    
   
  62 
 
  
    EGAD00001006359 
   
  
    
    A set of 56 EpCAM-positive cells derived from bone marrow aspirates of breast cancer patients or patients without a cancerous disease (30 cells from 21 M0-stage and 11 cells from five M1-stage breast cancer patients, 15 cells from seven non-cancer patients serving as controls). EpCAM-positive cells from breast cancer patients were considered disseminated tumor cells as they harbored copy number alterations and showed high expression of the epithelial marker EpCAM and the mammary luminal progenitor marker KIT in comparison to EpCAM-positive bone marrow cells from non-cancer patients. Paired-end RNA-Sequencing of the samples was performed on Illumina NovaSeq6000, raw data are provided in the Fastq format. 
    
   
  
    
   
  56 
 
  
    EGAD00001006360 
   
  
    
    Single-cell RNA sequencing was performed on bone marrow mononuclear cells of 2 acute myeloid leukemia patients at refractory stage. The profiling was performed using 10x Genomics Chromium Single Cell 3ʹ Gene Expression platform. The raw data are available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001006362 
   
  
    
    Sequencing data from patients with bladder cancer. BAM files from targeted DNA sequencing of bladder cancer driver genes in 344 circulating tumor DNA and tumor tissue samples. BAM files from whole exome sequencing of 49 circulating tumor DNA and tumor tissue samples. Paired FASTQ files from RNA sequencing of 86 tumor tissue samples. 
    
   
  
    
   
  344 
 
  
    EGAD00001006363 
   
  
    
    The hematological malignancy multiple myeloma (MM),
      also called Kahler's disease or plasma cell (PC) myeloma, is
      characterized by a clonal expansion of PCs originating in the
      bone marrow (BM). The expansion of these cells leads to an
      overproduction of antibodies and results in typical symptoms
      such as anemia, renal failure and bone lesions. All cases of MM
      are preceded by the asymptomatic, non-malignant pre-stage
      monoclonal gammopathy of undetermined significance (MGUS). Of
      all MGUS patients, only 1% per year will progress to MM. Despite
      efforts to elucidate the molecular mechanisms underlying the
      MGUS-to-MM progression, its pathogenesis still remains largely
      unknown. Additionally, the genetic profiles of MGUS patients
      have been investigated only to a limited extent, because MGUS is
      usually found incidentally and because of the difficulties in BM
      sampling and in isolating a sufficient number of aberrant PCs
      from the BM aspirates of MGUS patients. Consequently, reliable
      biomarkers to predict which individual MGUS patients will
      progress to MM and which will not are lacking. Therefore, it is
      essential to study the molecular pathogenesis of MGUS and the
      role of genetic events in relation to the malignant
      transformation to MM. 
    
   
  
    
   
  42 
 
  
    EGAD00001006364 
   
  
    
    DNA extraction from human stool samples was performed at the Center for Microbiome Innovation (CMI) at University of California, San Diego. DNA sequencing libraries were prepared using Nextera XT (Illumina). Shotgun DNA sequencing was performed on the Illumina HiSeq4000 platform.  Raw fastq reads were quality-checked. Skewer (version 0.2.2) was utilized with the paired-end mode. Human reads were identified and removed by Bowtie2 mapping against the human genome reference (hg19), followed by bam2fastq with --unaligned --no-aligned --force options. 
    
   
  
    
   
  162 
 
  
    EGAD00001006365 
   
  
    
    Case report of an ER+ Her2- breast cancer patient. Whole exome and transcriptome sequencing at time of diagnosis and relapse, targeted DNA sequencing of a liver met 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001006366 
   
  
    
    PCa-LINES: rRNA-minus RNA-seq of PCa cell-lines (VCaP & PC346c) and 4 additional patient samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001006367 
   
  
    
    To study the evolution of DNA methylation at genome level and methylation intra-tumor heterogeneity (ITH) during early lung carcinogenesis, we performed multiregional reduced representation bisulfite sequencing (RRBS) of 127 resected lung samples from 39 patients using single-end libraries on the HiSeq 3000. 
    
   
  
    
   
  127 
 
  
    EGAD00001006368 
   
  
    
    This dataset consists of 106 BAM files. Samples from 10-20 consecutive patient extractions were combined into one DNA pool, generating a total of 106 DNA pools. We sequenced 11 genes implicated in hereditary breast cancer using the SureSelect Custom kit. 
    
   
  
    
   
  106 
 
  
    EGAD00001006369 
   
  
    
    Whole genome sequencing of tumour-normal pairs in eight patients with clinically localised disease undergoing prostatectomy. A bespoke DNA capture and amplification panel against the highest prevalence, highest confidence aberrations for each individual was designed and used to interrogate ctDNA isolated from plasma prospectively obtained pre- and post- (24 hours and 6 weeks) surgery. Tagged-amplicon deep sequencing (TAm-Seq) across the TP53 gene in ctDNA in a cohort of 189 individuals. 
    
   
  
    
   
  224 
 
  
    EGAD00001006370 
   
  
    
    Exome sequencing files from 25 alopecia areata samples from Spain. 
    
   
  
    
   
  26 
 
  
    EGAD00001006371 
   
  
    
    Exome sequencing data for 14 Vitiligo samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD00001006373 
   
  
    
    Data supporting: “Longitudinal tracking of 97 esophageal adenocarcinomas (EAC) using liquid biopsy sampling.” Ococks, Frankell, Masque Soler et al.
ctDNA (BAM files)
333 samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  48 
 
  
    EGAD00001006374 
   
  
    
    A study looking at Germline and Somatic biomarkers using WES data only 
    
   
  
    
   
  88 
 
  
    EGAD00001006375 
   
  
    
    A radiomics study integrating PET/CT, WES and RNAseq data 
    
   
  
    
   
  99 
 
  
    EGAD00001006376 
   
  
    
    This dataset contains DNA from B-lymphocytes from 2 Coriell families and 4 individuals hybridized to HumanKaryomap BeadChip Array. Single cells from subjects GM12878 and GM7228 were amplified using multiple displacement amplification (SureMDA) according to the Infinium Karyomapping Assay Guide. Bulk DNA was processed and hybridized to an array for subjects GM12878, GM07224 and GM07225. 
    
   
  
    
   
  4 
 
  
    EGAD00001006379 
   
  
    
    Paired-end RNA-seq of follicular T cell lymphoma for the discovery of fusion transcripts 
    
   
  
    
   
  3 
 
  
    EGAD00001006380 
   
  
    
    RNASeq files for paper titled "Molecular classification improves risk assessment in adult B-lineage ALL: Patients on the international UKALLXII-ECOG2993 trial." 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  57 
 
  
    EGAD00001006381 
   
  
    
    Illumina NextSeq total RNA sequencing profiles of skeletal muscle biopsies of 5 affected patients (F3/2M, F4/1F, F5/1M, F2/2F, F2/1M) compared to 6 controls (Control040500, Control3509, Control3934, Control3949, Control4994, Control5106) and comparing 3 patients with additional EARS2 mutations (F2/2F, F2/1M, F5/1M) to 2 patients without (F3/2M, F4/1F).
Illumina MiSeq total RNA sequencing profiles of skeletal muscle biopsies from patient F3/1M during affected disease phase (4 replicates: F3-1M_Affected_1, F3-1M_Affected_2, F3-1M_Affected_3, F3-1M_Affected_4) compared to recovered phase (F3-1M_Recovered_1, F3-1M_Recovered_2, F3-1M_Recovered_3, F3-1M_Recovered_4). 
    
   
  
    
      
      NextSeq 550 
      
    
   
  10 
 
  
    EGAD00001006382 
   
  
    
    Illumina MiSeq total RNA sequencing profiles of skeletal muscle biopsies from patient F3/1M during affected disease phase (4 replicates: F3-1M_Affected_1, F3-1M_Affected_2, F3-1M_Affected_3, F3-1M_Affected_4) compared to recovered phase (F3-1M_Recovered_1, F3-1M_Recovered_2, F3-1M_Recovered_3, F3-1M_Recovered_4). 
    
   
  
    
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD00001006383 
   
  
    
    August 2020 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001006384 
   
  
    
    Shallow whole-genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fractions in plasma DNA of metastatic colorectal cancer patients (mCRC). 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  45 
 
  
    EGAD00001006385 
   
  
    
    Modified Fast Aneuploidy Screening Test-Sequencing System (mFAST-SeqS) was applied to stratify samples based on their overall tumor fraction in cfDNA. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  59 
 
  
    EGAD00001006386 
   
  
    
    All baseline samples and, when available, EOT samples were processed for high-resolution mutation analysis. We designed a SureSelectXT-HS custom panel (Agilent) covering 68 genes with a total size of 260kb using the Agilent SureDesign platform. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  44 
 
  
    EGAD00001006387 
   
  
    
    Whole exome sequencing: 24 samples matched tumor-normal and one matched CSF. Focused exome sequencing: 17 samples matched tumor-normal-2 time point CSF. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  35 
 
  
    EGAD00001006388 
   
  
    
    RNA sequencing of frozen resected specimens of desmoplastic small round cell tumors (DSRCTs). Four patients have specimens from multiple tissue sites included in this dataset. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001006389 
   
  
    
    Whole Exome sequencing data of tumour samples for 112 patients with endometrioid ovarian carcinoma in FASTQ format. Data was derived as summarized below:
Library Preparation: Libraries were prepared from each DNA sample using the Illumina TruSeq Exome Library Prep kit (#FC-150-1002) according to the provided protocol using modifications for working with FFPE sourced material. Libraries were quantified using the Qubit 2.0 Fluorometer and the Qubit DNA HS assay (#Q32854) and the size distribution of fragments was assessed using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626).
Library QC: Exome-captured sequencing library pools were quantified using the Qubit 2.0 Fluorometer and the Qubit DNA HS assay (#Q32854) and the size distribution of fragments was assessed using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Fragment size and quantity measurements were used to calculate molarity for each library pool.
Sequencing: Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (# FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). 
    
   
  
    
      
      NextSeq 550 
      
    
   
  112 
 
  
    EGAD00001006390 
   
  
    
    PitNET white blood cell DNA - tumor DNA exome sequencing samples with Illumina exome sequencing. Fifteen patients and a total of 30 exomes. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  30 
 
  
    EGAD00001006391 
   
  
    
    Whole exome sequencing of neuroendocrine cervical cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  29 
 
  
    EGAD00001006392 
   
  
    
    An investigation of clonal haematopoiesis in patients with neurodegenerative disease. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  181 
 
  
    EGAD00001006393 
   
  
    
    The dataset contains the targeted sequencing (TS) and the whole genome low pass (WGS) BAM files of the study. 
For the TS:
Samples: TS_XXX
There are normal and tumor DNA samples.
Each DNA strand has been sequenced independently (PoolA and PoolB).
A TruSeq Custom Amplicon panel of 20 genes frequently mutated in ILC and/or ER-positive BC in general was designed using DesignStudio from Illumina: AKT1, ARID1A, CDH1, ERBB2, ERBB3, ESR1, FOXA1, GATA3, IGF1R, JAK2, MAP2K4, MAP3K1, NF1, PIK3CA, PTEN, RB1, RUNX1, STAT3, TBX3, and TP53. 
For the WGS:
Samples: WGS_XXX
There are normal and tumor DNA samples.
Samples were sequenced to an average target coverage of 0.5X. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  730 
 
  
    EGAD00001006394 
   
  
    
    Exome sequencing of frozen resected specimens of desmoplastic small round cell tumors (DSRCTs). Four patients have specimens from multiple tissue sites included in this dataset. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  39 
 
  
    EGAD00001006395 
   
  
    
    Whole-exome sequencing (WES) in a well-characterized sample of 14 matched EP tumour/healthy surrounding tissue samples. The sequencing was done with paired EXOME sequencing on Illumina HiSeq 4000  using Agilent SureSelect XT HS + Human All Exon V7. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  28 
 
  
    EGAD00001006396 
   
  
    
    To investigate the cellular composition of the human pancreas, we performed single-nucleus sequencing from snap frozen biopsies of pancreata from adult, neonatal and diseased (chronic pancreatitis) human donors. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001006397 
   
  
    
    Transcriptome analysis of nontumorous human breast tissues. 196 cases were included in the dataset. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  196 
 
  
    EGAD00001006398 
   
  
    
    ChIP-Seq files accompanying the paper titled "Identification of Therapeutic Targets in Rhabdomyosarcoma Through Integrated Genomic, Epigenomic, and Proteomic Analyses". 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  242 
 
  
    EGAD00001006399 
   
  
    
    The dataset contains a full genomics characterization of 527 Asian breast tumours. This includes whole-exome sequencing of tumour tissue at 80X, whole-exome sequencing of matched normal (blood) tissue at 40X, shallow-whole genome sequencing at 0.1X for copy number analyses, and RNA-seq of tumour tissue at 40X coverage (>15 million reads). Whole-exome libraries were prepared using the Nextera Rapid Capture Exome Kit; exome capture was performed in pools of 3 and subjected to paired end 75 sequencing on a HiSEQ4000 platform. RNA libraries were prepared using the TruSeq Stranded Total RNA HT kit with Ribo-Zero Gold as per manufacturer’s instructions and also subjected to paired end 75 sequencing on a HiSEQ4000 platform. Uploaded bam files have been mapped to the hs37d5 human genome and processed using the standard GATK pipelines. Paired clinical, demographic, genotyping, and overall survival data for these patients are available from the associated publications or by request. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2235 
 
  
    EGAD00001006400 
   
  
    
    1 cell line and 123 patient samples including 38 normal (22 paired normal and 16 unpaired),  85 tumor-initial, FASTQ file types, Agilent SureSelect Human All Exon V6 Kit 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  124 
 
  
    EGAD00001006401 
   
  
    
    This dataset contains RNA-Seq data from 204 primary melanomas and 177 regional lymph nodes. More details can be found in the manuscript: "Tumour gene expression signature in primary melanoma predicts long-term outcomes: A prospective multicentre study" 
    
   
  
    
      
      unspecified 
      
    
   
  381 
 
  
    EGAD00001006402 
   
  
    
    Whole genome sequencing of 29 donors of healthy mammary tissue. BAM files of stromal and epithelial DNA are included. 
    
   
  
    
      
      unspecified 
      
    
   
  58 
 
  
    EGAD00001006403 
   
  
    
    Dataset contains genomic sequencing of 87 samples (blood germline, normal prostate tissues, human tumors, PDOs and PDXs). Sequencing was performed by whole-exome sequencing or targeted sequencing of prostate cancer genes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      Ion Torrent S5 XL 
      
    
   
  87 
 
  
    EGAD00001006404 
   
  
    
    Dataset contains RNA-seq of 30 samples (normal prostate tissue, human prostate cancer, PDX and organoids). 
    
   
  
    
      
      unspecified 
      
    
   
  26 
 
  
    EGAD00001006406 
   
  
    
    RNA was extracted from eight diagnostic ETV6-RUNX1 positive acute lymphoblastic leukemia samples collected in PAXgene blood RNA tubes using PAXgene Blood RNA kit (cat #762174, Qiagen GmbH, Hilden, Germany), following the version 2 instructions for manual purification. Samples were processed with Globin-Zero Gold rRNA Removal Kit (Illumina) and directional libraries were prepared using NEBNext Ultra Directional RNA Library Prep kit (New England Biolabs). The library preparation and paired end (150 bp) sequencing were performed by Novogene (HK) Company Limited (Hong Kong, China) using Illumina Novaseq 6000 aiming at 70 million read pairs per sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001006407 
   
  
    
    Here we provide a catalogue of variants called after sequencing the exomes of 45 babies from the State of Rio Grande do Norte in Brazil. Our data set provides a useful reference point for diagnosis of rare diseases in Brazil. 
    
   
  
    
   
  45 
 
  
    EGAD00001006408 
   
  
    
    This dataset includes whole-exome sequencing data for multifocal ileal tumor samples from two patients. Exonic sequences were enriched using the Agilent V2 capture probe set and sequenced by 76-bp paired-end reads using the Illumina Genome Analyzer IIx system with a mean coverage of 80x for each base 
    
   
  
    
      
      Illumina Genome Analyzer IIx 
      
    
   
  31 
 
  
    EGAD00001006409 
   
  
    
    Formalin-fixed, paraffin-embedded samples from 27 FIT interval CRC and 54 screen-detected CRCs collected in a pilot-program of FIT-based CRC screening in the southwest and northwest regions in the Netherlands, were used in this study. DNA was extracted for 1) Shallow Sequencing (copy number analysis)  and 2) TSACP Amplicon Cancer Gene Panel (mutations) of 22 FIT Interval CRCs and 45 screen-detected CRCs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  66 
 
  
    EGAD00001006410 
   
  
    
    Files from whole exome sequencing of matched normals and multiple tumors from 7 melanoma patients. The tumors include primary tumors and distant metastases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  127 
 
  
    EGAD00001006414 
   
  
    
    1075 members of the LBC1936 were sequenced using the Illumina HiSeq X platform. This dataset contains the gvcfs. 
    
   
  
    
      
      Illumina HiSeq X Ten 
      
    
   
  1075 
 
  
    EGAD00001006415 
   
  
    
    Tregs were sorted as CD4+CD25+CD127- cells from peripheral blood of patients with advanced metastatic melanoma, stage III(B-D)-IV, who were receiving treatment with anti-PD1 (n = 26); and patients with kidney, non-small cell lung, liver and bladder cancer who were receiving treatment with anti-PD1. RNA was extracted and polyA libraries were prepared using the Illumina TruSeq sample preparation kit v.2. Single-end sequencing was performed on NextSeq500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  49 
 
  
    EGAD00001006416 
   
  
    
    297 members of the LBC1921 were sequenced using the Illumina HiSeq X platform. This dataset contains the gvcfs. 
    
   
  
    
      
      Illumina HiSeq X Ten 
      
    
   
  297 
 
  
    EGAD00001006417 
   
  
    
    Single-cell sequencing of esophagus, stomach and duodenum:
4 esophagus samples
9 gastric samples
5 duodenum samples 
    
   
  
    
   
  18 
 
  
    EGAD00001006418 
   
  
    
    These samples were sequenced at the Broad Institute on an Illumina HiSeqX at 30x -- PCR Free. The CRAMs and VCFs are as produced by the Broad. The VCFs were generated by the Broad using GATK. 
    
   
  
    
   
  100 
 
  
    EGAD00001006419 
   
  
    
    WGS data of plasma samples from CRC patients (N=12) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001006420 
   
  
    
    WGS data of plasma samples from BRCA patients (N=10) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001006421 
   
  
    
    WGS data of plasma samples from healthy individuals (N=29) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  29 
 
  
    EGAD00001006422 
   
  
    
    This dataset includes 289 samples from 46  high grade serous epithelial ovarian cancer patients. Data are from both tissue samples (either primary tumor, or synchronous metastases) and circulating cell-free DNA (cfDNA) of plasma samples taken during therapy and follow-up. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  289 
 
  
    EGAD00001006423 
   
  
    
    Leukaemia and related blood cancers occur due to genetic changes that typically accumulate over many years. This study will employ targeted next-generation sequencing to retrace the preclinical evolution of several types of haematological malignancy.  Investigating the progression of the earliest pre-malignant ancestral clones promises to offer valuable insights into early leukaemia evolution and therapeutic vulnerabilities of leukaemia stem cells. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  137 
 
  
    EGAD00001006424 
   
  
    
    Leukaemia and related blood cancers occur due to genetic changes that typically accumulate over many years. This study will employ targeted next-generation sequencing to retrace the preclinical evolution of several types of haematological malignancy.  Investigating the progression of the earliest pre-malignant ancestral clones promises to offer valuable insights into early leukaemia evolution and therapeutic vulnerabilities of leukaemia stem cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  48 
 
  
    EGAD00001006425 
   
  
    
    The phenotypic data for ~12500 samples of the AWI-Gen Phase 1 Population cross-sectional study of older adults (mostly between 40 and 60 years), men and women. Six study sites in four sub-Saharan African countries including Ghana, Burkina Faso, Kenya and South Africa. Some groups are missing data for specific variables. Data includes questionnaire data (demography, health history, family health history, behaviour and infection data); anthropometry; and laboratory assays on blood and urine. 
    
   
  
    
   
  12032 
 
  
    EGAD00001006426 
   
  
    
    This study contains the WGS and WEX aligned bam files and RNA-seq fastq files for human liver tumors. 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  183 
 
  
    EGAD00001006427 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factors data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001006428 
   
  
    
   
  
    
      
      NextSeq 550 
      
    
   
  12 
 
  
    EGAD00001006429 
   
  
    
    Whole genome sequencing of EBV Associated Nasopharyngeal Carcinoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  138 
 
  
    EGAD00001006431 
   
  
    
    Background: The development of retinoblastoma is thought to require pathological genetic changes in both alleles of the RB1 gene. However, cases exist where RB1 mutations are undetectable suggesting alternative pathways to malignancy. Methods: We applied comprehensive whole genome sequencing (WGS) and transcriptomics to sporadic retinoblastomas derived from twenty patients attending our clinic, contrasting these results with those obtained through customary clinical testing. We sought RB1 and other driver mutations, investigated mutation burden, mutational signatures and phylogenetic relatedness in one case of bilateral retinoblastoma. Results: At least one RB1 mutation was identified in all retinoblastomas. We confirmed RB1 mutations previously identified by clinical screening, identified three new RB1 mutations and provided clarity to the mechanism behind a further six mutations. Eight tumours carried structural rearrangements involving RB1 ranging from relatively simple to extremely complex rearrangement patterns, including a chromothripsis-like pattern in one tumour. Potential driver mutations included mutations in BCOR (5/20) and amplification of MYCN (2/20) and MDM4 (1/20). We show that RB1 mutations are not mutually exclusive of MYCN amplifications, and further reveal that all tumours demonstrate increased MYCN expression suggesting a universal role in retinoblastoma tumorigenesis. Bilateral tumours obtained from one patient harboured conserved germline but divergent somatic RB1 mutations, indicating independent evolution. In keeping with previous WGS of paediatric cancers, the mutation burden in retinoblastomas was extremely low. Mutational signature analysis showed a predominance of signatures associated with cell division and an absence of ultraviolet-related DNA damage. In a tumour exposed to chemotherapy prior to enucleation, a profound platinum-related mutational signature was observed. 
Conclusions: WGS provides a complete picture of the genomic landscape of retinoblastomas, allowing the discovery of mutations otherwise undetected by conventional clinical screening approaches. The presence of at least one RB1 mutation in all retinoblastomas and the relative paucity of driver mutations in other genes suggests mutations beyond RB1, MYCN and BCOR are rare. Whilst most RB1 mutations are identifiable by clinical screening, the increased resolution and ability to detect otherwise elusive rearrangements of RB1 by WGS, confirming whether they are somatic or germline, has important repercussions on clinical management and advice on recurrence risks. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  41 
 
  
    EGAD00001006433 
   
  
    
    Shallow whole genome sequencing of 29 BIA-ALCL patients for copy number analysis and 24 Alk-negative ALCL samples as control cohort. 7 Whole exome sequencing BIA-ALCL samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  66 
 
  
    EGAD00001006434 
   
  
    
    Whole exome seq (N=21) and RNA-seq (N=36) data of additional T-ALL 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD00001006435 
   
  
    
    The dataset contains metadata for all cells before scRNA-seq quality control and for cells passing quality control. It also contains a count matrix with Salmon gene counts for all cells passing quality control, and reconstructed B-cell receptor sequences using the computational tool BraCeR. The scRNA-seq data was generated using the Smart-seq2 protocol and sequenced on Illumina NextSeq500. 
    
   
  
    
   
  12 
 
  
    EGAD00001006436 
   
  
    
    This dataset contains scRNA-seq fastq files (trimmed for quality and adapters using Trim Galore) for 3739 intestinal plasma cells of known or unknown antigen specificities from in total 12 individuals (4 untreated coeliac disease patients, 3 treated coeliac disease patients, 5 controls). The data was generated using the Smart-seq2 protocol and sequenced on the Illumina NextSeq500 platform with 75 bp paired-end reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001006438 
   
  
    
    Contains data for all cells sequenced for this study. Data is organized as one bam-file per sample. Individual cells can be identified through the CB tag in the bam-files. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001006439 
   
  
    
    Illumina RNASeq sequencing of tumour samples from 53 cases of cutaneous melanoma and 61 cases of acral melanoma 
    
   
  
    
   
  - 
 
  
    EGAD00001006440 
   
  
    
    This dataset includes whole exome sequencing reads from 10 normal and 14 cell lines based on Agilent SureSelect XT Human All Exon v6. They are all 2*100bp reads sequenced using Illumina HiSeq4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001006441 
   
  
    
    This dataset includes whole transcriptome sequencing reads from 8 cell lines based on TruSeq stranded mRNA kit (Illumina). They are all 2*75bp reads sequenced using Illumina NextSeq500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001006442 
   
  
    
    WGS files for paper titled "Integrative Analysis of Pediatric Acute Leukemia Identifies Acute Myeloid/T-Lymphoblastic Leukemia Subtype that Spans a T Lineage and Myeloid Continuum with Distinct Prognoses" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  184 
 
  
    EGAD00001006443 
   
  
    
    WXS files for paper titled "Integrative Analysis of Pediatric Acute Leukemia Identifies Acute Myeloid/T-Lymphoblastic Leukemia Subtype that Spans a T Lineage and Myeloid Continuum with Distinct Prognoses" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  260 
 
  
    EGAD00001006444 
   
  
    
    RNASeq files for paper titled "Integrative Analysis of Pediatric Acute Leukemia Identifies Acute Myeloid/T-Lymphoblastic Leukemia Subtype that Spans a T Lineage and Myeloid Continuum with Distinct Prognoses" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  132 
 
  
    EGAD00001006445 
   
  
    
    Glioma is the most common and aggressive brain cancer in adults. While primary glioma has been widely studied, molecular characterization of recurrent glioma is still rare. The high-quality sequencing data that we generated provide a useful resource for the community. The CGGA project contains over 2,000 samples from Chinese cohorts. In total, it includes whole-exome sequencing (286), DNA methylation (159), mRNA sequencing (1,018), mRNA microarray (301) and microRNA microarray (198) data, along with matched clinical data. CGGA removes barriers for researchers, providing rapid and convenient access to high-quality functional genomic data resources for biological research and clinical applications. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  572 
 
  
    EGAD00001006446 
   
  
    
    Dataset contains fastq files of tumor transcriptomes of 12 pituitary neuroendocrine tumors. Patients with and without somatostatin analogue treatment before tumor surgery can be compared. Sequencing was performed on MGISEQ-2000. 
    
   
  
    
      
      unspecified 
      
    
   
  12 
 
  
    EGAD00001006447 
   
  
    
    To elucidate the epigenetic changes which occur when human long-term hematopoietic stem cells (LT-HSC) become activated, we performed bulk ATAC-Seq on 13 sorted bulk hematopoietic populations from cord blood, as well as single-cell ATAC-Seq upon CD34+CD38-CD45RA- cells enriched for HSC and CD34+/CD38+ progenitor cells, both from cord blood. These studies revealed gains of chromatin accessibility around CTCF binding sites during HSPC activation. We therefore additionally performed Low-C to directly profile the 3D conformation of human cord-blood derived LT-HSC and short-term hematopoietic stem cells (ST-HSC), as well as Hi-C, ATAC-Seq and CTCF ChIP-Seq upon the OCIAML-2 cell line, in which CTCF sites gained during LT-HSC activation are enriched. Finally, we transduced human cord-blood LT-HSC with an shCTCF vector; in-vitro cultured LT-HSC cells harbouring shCTCF were used to perform RNA-Seq, and scATAC-Seq was performed on CD34+/CD38- human CB cells transduced with shCTCF, four weeks post xeno-transplantation into mice. Collectively, these studies have helped us demonstrate the role of 3D chromatin conformation changes during human LT-HSC activation. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
      unspecified 
      
    
   
  62 
 
  
    EGAD00001006448 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD00001006449 
   
  
    
    37 surgical samples were interrogated by WXS, and 50 formalin-fixed, paraffin-embedded samples were interrogated by target-seq. The Agilent SureSelect XT kit and SureSelect Human Exon V6 were used to generate exome libraries. The Agilent SureSelect XT low input kit and a custom capture panel designed on SureDesign were used to generate target-seq libraries. All libraries were sequenced on the Illumina HiSeq 2500 platform. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  124 
 
  
    EGAD00001006450 
   
  
    
    We characterised H3K27M-mutant diffuse intrinsic pontine glioma (DIPG, n=21) and RNA-Seq (n=26 DIPG, 12 normal brain) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Ion Torrent Proton 
      
    
   
  59 
 
  
    EGAD00001006451 
   
  
    
    A total of 9 brain metastases were sequenced. For 6 of the 9, matched cerebrospinal fluid samples were collected prior to surgery and, in two cases, after surgery (+1 month from surgery) and after treatment (+3 months). Single-cell T cell receptor clonotypes were produced using the Chromium Single Cell 5’ Library and sequenced on an Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001006452 
   
  
    
    A total of 9 brain metastases were sequenced. For 6 of the 9, matched cerebrospinal fluid samples were collected prior to surgery and, in two cases, after surgery (+1 month from surgery) and after treatment (+3 months). Single-cell gene expression was produced using the Chromium Single Cell 5’ Library and sequenced on an Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD00001006453 
   
  
    
    Whole exome sequencing of matched tumor (brain metastasis) -normal (blood) from 6 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001006456 
   
  
    
    RNA sequencing data of over 200 HGSOC samples at diagnosis, after chemotherapy and during progression. 
    
   
  
    
      
      unspecified 
      
    
   
  212 
 
  
    EGAD00001006457 
   
  
    
    Genomic comparison of the pre-invasive and invasive components of malignant pulmonary nodules (MPN) facilitates the description of lung adenocarcinoma (LUAD) evolutionary patterns. We conducted an analysis of gene-panel sequencing on 53 T1 stage LUAD cases, which extends the understanding of evolutionary trajectories during invasiveness acquisition in early LUAD. 
    
   
  
    
   
  174 
 
  
    EGAD00001006458 
   
  
    
    Whole genome sequencing was performed on 24 patients (tumor DNA paired to constitutional DNA). WGS libraries were subjected to paired-end (2 x 100 bp) sequencing on NovaSeq (Illumina). The 96 files are in FASTQ format. 
    
   
  
    
   
  48 
 
  
    EGAD00001006459 
   
  
    
    Bottleneck sequencing of human tissue including neurons, cord blood, sperm.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2020-10-20. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  192 
 
  
    EGAD00001006460 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0006_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006461 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0080_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006462 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0080_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006463 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0141_004 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006464 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0142_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006465 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0142_003 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006466 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0146_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006467 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0149_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006468 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0149_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006469 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0150_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006470 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0150_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006471 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0152_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006472 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0152_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006473 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0163_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006474 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0163_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006475 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0172_001 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006476 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0172_002 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006477 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library TENX062 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006478 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0063_000 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006479 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library TENX064 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006480 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library TENX065 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006481 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library TENX066 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006482 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library TENX068 1 samples; filetype=fastq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006483 
   
  
    
    Transcriptome profiling by high-throughput sequencing for single cells for library TENX069 1 samples; filetype=fastq 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001006484 
   
  
    
    Whole-transcriptome characterization of cfRNA in cancer (stage III breast [n=46], lung [n=30]) and non-cancer (n=89) participants from the Circulating Cell-free Genome Atlas (NCT02889978). Dataset includes collapsed BAM files for plasma cfRNA from each patient, as well as collapsed BAM files for RNA from matched tumor tissue (when available). 
    
   
  
    
   
  303 
 
  
    EGAD00001006485 
   
  
    
    Raw FASTQ files obtained from in situ Hi-C of 16 normal B cells (3 naive B cells, 3 germinal center B cells, 3 plasma cells, and 3 memory B cells, together with a merge file for each subpopulation), 7 chronic lymphocytic leukemias (2 unmutated IGHV and 5 mutated IGHV), and 5 mantle cell lymphomas (2 conventional and 3 leukemic non-nodal). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001006486 
   
  
    
    Valid reads obtained after analyzing in situ Hi-C data of 16 normal B cells (3 naive B cells, 3 germinal center B cells, 3 plasma cells, and 3 memory B cells, together with a merge file for each subpopulation), 7 chronic lymphocytic leukemias (2 unmutated IGHV and 5 mutated IGHV), and 5 mantle cell lymphomas (2 conventional and 3 leukemic non-nodal). 
    
   
  
    
   
  28 
 
  
    EGAD00001006487 
   
  
    
    The dataset consists of BAM files of 2 pairs of matched tumor/normal samples from a man with advanced prostate cancer. One pair is whole genome sequencing: WGS_T/WGS_N for tumor and normal samples, respectively; the other pair is whole exome sequencing: WES_T/WES_N for tumor and normal specimens, respectively.
Details can be found in the publication titled: "Molecular Medicine Tumor Board: Whole Genome Sequencing to Inform on Personalized Medicine for a Man with Advanced Prostate Cancer" 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001006488 
   
  
    
    WGS data for 20 Glioblastoma stem cell (GSC) lines and matched blood samples. Fastq files are available.
For 10 GSC samples and the 10 matched blood samples the reads are on 3 fastq files per sample 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  80 
 
  
    EGAD00001006537 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001006538 
   
  
    
    WGBS data for EGAS00001004660, "Aggressive PDACs show hypomethylation of repetitive elements and the execution of an intrinsic IFN program linked to a ductal cell-of-origin" 
    
   
  
    
   
  13 
 
  
    EGAD00001006539 
   
  
    
    RNA data for EGAS00001004660, "Aggressive PDACs show hypomethylation of repetitive elements and the execution of an intrinsic IFN program linked to a ductal cell-of-origin" 
    
   
  
    
   
  23 
 
  
    EGAD00001006540 
   
  
    
    This dataset contains the WGS of 35 samples (high grade osteosarcoma). All cases were reviewed by an expert bone pathologist and have a tumour content of 50% minimum. Paired-end libraries from fresh frozen tumour samples were prepared using the Agilent SureSelectXT HumanV5 kit for whole-genome sequencing (WGS). These were sequenced together with a tumour complementary DNA on an Illumina HiSeq2500 (paired-end 100 bp). Sequencing reads were mapped to the GRCh37 human reference genome using HISAT2 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001006541 
   
  
    
    The data provided here were critical in establishing that human long-term hematopoietic stem cells (LT-HSC), previously described as the most primitive HSC population, are actually composed of distinct subsets that can be prospectively isolated. Via mechanistic studies centering around the Rho-GTPase effector kinase PAK4 and its inhibitor INKA1, we identified the immune checkpoint ligand CD112 as a marker for hematopoietic stem and progenitor cells that is most highly expressed on LT-HSC. More importantly, CD112 can be used to stratify functionally distinct subsets within LT-HSC: in response to regeneration-mediated stress, the CD112low subset exhibits a transient restraint (termed latency) before contributing to hematopoietic reconstitution, while the CD112high subset is primed to respond rapidly. High-resolution RNA-seq of the CD112 surface expression spectrum within rare LT-HSC subsets (human umbilical cord blood) demonstrated that more genes are differentially upregulated in the more deeply quiescent and less metabolically active subset. Genes enriched in this subset centre around cell adhesion and Rho-GTPase signaling. This is in agreement with the scRNAseq data from human G-CSF mobilized peripheral blood (mPB) generated here, which were used as a model of in vivo activation/priming and revealed via RNA-velocity and pseudo-time analysis that INKA1high versus PAK4high, CDK6high and CD112high enrichment are detected either early or late in diffusion pseudotime, indicative of quiescent versus primed cell status, respectively. RNAseq following INKA1 overexpression in LT-HSC and ST-HSC revealed by GSEA an overall stemness-preserving phenotype and, particularly in LT-HSC but not in short-term HSC (ST-HSC), suppression of transcriptional programs linked to activation. 
Collectively, our data decipher the molecular intricacies underlying HSC heterogeneity and self-renewal regulation and point to latency as an orchestrated physiological response that integrates quiescence control with HSC fate choices to preserve a stem cell reservoir. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001006542 
   
  
    
    This dataset contains:
Targeted proximity-ligation assay, enriched using capture probes (1092 samples)
Targeted proximity-ligation assay, enriched using 4C (1230 samples)
Genome-wide proximity-ligation assay, enriched using HiC (6 samples) 
    
   
  
    
      
      Illumina MiniSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  2328 
 
  
    EGAD00001006543 
   
  
    
    WGS BAMs of 19 adult patients with T-acute lymphoblastic leukemia with primary, remission and relapse sample per patient. Total = 58 samples sequenced with HiSeq 4000 or NovaSeq 6000 (Illumina). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  57 
 
  
    EGAD00001006544 
   
  
    
    ATAC-seq data for 2 glioblastoma cell lines (LN229, ZH487), NT and SOX10KD. 
    
   
  
    
   
  2 
 
  
    EGAD00001006545 
   
  
    
    Whole genome sequencing data for 20 human glioblastoma patients. 
    
   
  
    
   
  20 
 
  
    EGAD00001006546 
   
  
    
    Whole Genome Bisulfite data for human glioblastoma patients, EGAS00001003953. 68 human samples 
    
   
  
    
   
  68 
 
  
    EGAD00001006547 
   
  
    
    RNA data for human glioblastoma patients, EGAS00001003953. 64 human samples, 2 cell lines (LN229, ZH487). 
    
   
  
    
   
  66 
 
  
    EGAD00001006548 
   
  
    
    ChIPseq data for human glioblastoma patients, EGAS00001003953. Mix of input, H3K27ac, H3K27me1, H3K27me3, H3K36me3, H3K4me1, H3K4me3, H3K9me3 and BRD, 20 human samples, 2 cell lines (LN229, ZH487). 
    
   
  
    
   
  22 
 
  
    EGAD00001006550 
   
  
    
    In a dual-center, two-cohort study, we performed single-cell RNA-sequencing of whole blood and peripheral blood mononuclear cells to determine changes in immune cell composition and activation in mild vs. severe COVID-19 over time. This study provides detailed insights into the systemic immune response to SARS-CoV-2 infection and reveals profound alterations in the myeloid cell compartment associated with severe COVID-19. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  141 
 
  
    EGAD00001006551 
   
  
    
    RNA-seq profiling of 2 prostate cancer xenograft mouse models, each at the intact state (n=3), castrated state (n=4) and castrated + AR replacement (n=3). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001006552 
   
  
    
    Raw data from cancer panel sequencing of lung adenocarcinomas from admixed Latin American populations. Predominantly samples carrying known oncogene mutations (n=581). 
    
   
  
    
   
  578 
 
  
    EGAD00001006553 
   
  
    
    Off-target amplification can lead to false positive human brain microbiome detection. 16S rRNA amplicon samples from brain tissue of healthy individuals and Parkinson's disease patients. 
    
   
  
    
   
  114 
 
  
    EGAD00001006554 
   
  
    
    This dataset includes 1,359 paired-end shotgun metagenomics samples from 946 healthy donors of the Milieu Intérieur cohort. 413 of the donors provided two samples (V1 and V2). 
    
   
  
    
   
  1359 
 
  
    EGAD00001006555 
   
  
    
    To investigate the mechanism by which GATA1s and STAG2 deficiency contribute to Down Syndrome leukemogenesis, specifically within the propagating CD34/CD117 cell fractions from primary xenografts, we carried out transcriptional and epigenetic profiling by RNAseq and ATACseq. The chromatin accessibility landscape was compared to bulk ATACseq of individually sorted N-FL HSPC subpopulations.
To investigate the mechanism underlying the synergy between T21 and GATA1s in driving preleukemia development, we analyzed the binding occupancy of GATA1. We performed Cut&Run assays to profile genome-wide GATA1 binding sites and also to quantify binding changes upon GATA1s editing in N-FL and T21-FL CD34+ enriched HSPCs. Lastly, we profiled miRNAs from N-FL and T21-FL CD34+ enriched HSPCs by miRNA-Seq. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  91 
 
  
    EGAD00001006556 
   
  
    
    This dataset contains 19 scRNAseq samples from neuroblastoma patient biopsies, processed immediately after surgery. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD00001006557 
   
  
    
    This dataset contains 4 samples (2x input and 2x H3K27ac ChIPseq) in the IC-pPDXC-63 cell line. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001006558 
   
  
    
    This dataset contains 72 RNAseq (BAM and Fastq files are available for each sample). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  114 
 
  
    EGAD00001006559 
   
  
    
    RNASeq gene expression profiles from 3 icas9 human iPSC derived cortical neurons treated with and without doxycycline. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001006560 
   
  
    
    Paired RNA-Seq data of VDH15 cells with and without deletion of NSUN3. The 11 samples were sequenced on HiSeq 4000 and prepared with SmarTer low input RNA and Chip NEBNext kit. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  11 
 
  
    EGAD00001006561 
   
  
    
    We performed whole-genome sequencing, rare variant filtering, segregation analysis and functional validation of PD cosegregating rare genetic variation in two families (6 samples) segregating the PD-associated GBA variants c.115+1G>A (ClinVar ID: 93445) and p.L444P (ClinVar ID: 4288), respectively. The paired WGS sequencing was run on HiSeq X Ten and the library preparation kit was Illumina TruSeq DNA nano. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001006562 
   
  
    
    Updated INSPIRE whole exome sequencing data: PBMC controls 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD00001006563 
   
  
    
    INSPIRE whole exome sequencing of tumors updated 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  46 
 
  
    EGAD00001006564 
   
  
    
    INSPIRE whole transcriptome sequencing of tumors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  65 
 
  
    EGAD00001006565 
   
  
    
    The mutational status of 112 recurrently mutated genes in B-cell lymphoma was examined by targeted next-generation sequencing (NGS). Libraries were prepared from 150 ng of genomic DNA (gDNA) obtained from formalin-fixed paraffin-embedded (FFPE) biopsies using molecular-barcoded library adapters (ThruPLEX Tag-seq kit; Takara) coupled with a custom hybridization capture-based method (SureSelect XT Target Enrichment System Capture strategy, Agilent Technologies Inc.) and sequenced on a MiSeq instrument (Illumina, 2x150bp). 
    
   
  
    
   
  45 
 
  
    EGAD00001006566 
   
  
    
    The mutational status of 112 recurrently mutated genes in B-cell lymphoma was examined by targeted next-generation sequencing (NGS). Libraries were prepared from 15-30 ng of cfDNA obtained from plasma using molecular-barcoded library adapters (ThruPLEX Tag-seq kit; Takara) coupled with a custom hybridization capture-based method (SureSelect XT Target Enrichment System Capture strategy, Agilent Technologies Inc.) and sequenced on a MiSeq instrument (Illumina, 2x150bp). 
    
   
  
    
   
  79 
 
  
    EGAD00001006567 
   
  
    
    This dataset is composed of 86 samples: 15 samples of bronchoalveolar lavage fluid (BAL), 17 samples of non-malignant lung tissue, 14 samples of peritumoural tissue, 16 tumour tissues, 8 negative DNA extraction controls, 16 negative sampling controls for BAL. Samples were obtained from 17 NSCLC patients (average age 68 years). Sequenced region was 16S V3-V4. Fastq files are provided. 
    
   
  
    
   
  86 
 
  
    EGAD00001006568 
   
  
    
    This dataset contains 20 whole genome sequences from 10 tumor-normal pairs from conjunctival melanomas. 
    
   
  
    
   
  20 
 
  
    EGAD00001006569 
   
  
    
    Somatic SNVs and Indels for INSPIRE Tumor WES called using Mutect, Mutect2, Varscan2, Vardict, and Strelka2 
    
   
  
    
   
  - 
 
  
    EGAD00001006570 
   
  
    
    The clinical relevance of immune landscape intratumoural heterogeneity (immune-ITH) and its role in tumour evolution remain largely unexplored. Here, we uncover significant spatial and phenotypic immune-ITH from multiple tumour sectors and decipher its relationship with tumour evolution and disease progression in hepatocellular carcinomas (HCC). Immune-ITH is associated with tumour transcriptomic-ITH, mutational burden, and distinct immune microenvironments. Tumours with low immune-ITH experience higher immunoselective pressure and escape via loss of heterozygosity in human leukocyte antigens and immunoediting. In contrast, tumours with high immune-ITH evolve to a more immunosuppressive/exhausted microenvironment. This gradient of immune pressure along with immune-ITH represents a hallmark of tumour evolution, which is closely linked to the transcriptome-immune networks that contribute to disease progression and immune inactivation. Remarkably, high immune-ITH and its transcriptomic signature are predictive of worse clinical outcome in HCC patients. This in-depth investigation of ITH provides evidence of tumour-immune co-evolution along HCC progression. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  70 
 
  
    EGAD00001006571 
   
  
    
    Raw data from cancer panel sequencing of lung adenocarcinomas from admixed Latin American populations, predominantly samples without known oncogene mutations (n=532). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  532 
 
  
    EGAD00001006572 
   
  
    
    The dataset contains somatic variants in 344 colorectal cancer samples. Variants were called with Mutect2 (GRCh38).
Important: the VCF files also include variants that have been annotated as "str_contraction" and "panel_of_normals".
Please use only "PASS" variants in studies that are not related to microsatellite repeats. Samples were sequenced with NovaSeq 6000, HiSeq 2000, and HiSeq X Ten instruments (average coverage depth ~30+). The dataset consists of 257 MSS, 58 MSI, 25 MSS IBD, and 4 POLE mutant CRCs. 
    
   
  
    
   
  344 
 
  
    EGAD00001006573 
   
  
    
    Here we report successful gene knock-in (KI) in the eggs of Schistosoma mansoni by combining CRISPR/Cas9 with single-stranded oligodeoxynucleotides (ssODNs). We targeted the acetylcholinesterase (AChE) gene of S. mansoni using two synthetic guide RNAs (gRNAs), X5 and X7. Liver eggs of S. mansoni were exposed to a CRISPR vector containing X5 or X7 by electroporation. Simultaneously, eggs were transfected with an ssODN donor encoding a stop codon in all six frames. Next-generation sequencing analysis revealed that CRISPR/Cas9-mediated editing in S. mansoni eggs resulted in Homology-Directed Repair (HDR) when the ssODN template DNA was provided. Furthermore, soluble egg antigen (SEA) from AChE-modified eggs exhibited markedly reduced AChE activity compared with controls, indicating that programmed Cas9 cleavage mutated the AChE gene. Following injection of modified schistosome eggs into the tail veins of mice, a significant decrease in granuloma size was observed in the lungs of these animals. Notably, an enhanced Th2 response induced by eggs in lung, splenocytes and small intestine-draining mesenteric lymph node cells was also generated in mice injected with X5-KI eggs in different methods. These findings further demonstrate the power and utility of CRISPR/Cas9-based genome editing for undertaking functional genomics studies in schistosomes. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  22 
 
  
    EGAD00001006574 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD00001006575 
   
  
    
    To identify dysfunctional neuronal subtypes underlying seizure activity in the human brain, we have performed single-nucleus transcriptomics analysis of >110,000 neuronal transcriptomes derived from temporal cortex samples of multiple temporal lobe epilepsy and non-epileptic subjects. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  19 
 
  
    EGAD00001006576 
   
  
    
    The PMCC AML RNAseq dataset consists of 81 AML patient samples (clinical data in Supplemental Table 11 of manuscript), processed in two batches. These patient samples are able to engraft in the NSG (NOD.Cg PrkdcscidIl2rgtm1Wjl /SzJ) mouse model. Five patients (90543, 598, 90240, 110484, 100500) were included in both batches. Viably frozen material from the Leukemia Tissue Bank at Princess Margaret Cancer Centre/University Health Network was thawed by dropwise addition of X-VIVO + 50% fetal calf serum supplemented with DNase (100μg/mL final concentration, Roche). RNA was extracted from bulk peripheral blood mononuclear cells (PBMC) using the RNeasy Micro Kit (Qiagen Inc.). For cohort 1, a paired-end 76 base-pair flow-cell lane on the Illumina HiSeq 2000 yielded an average of 240 million sequence reads aligning to the genome per sample at the Genome Sciences Centre, BC Cancer Agency. Cohort 2 was subjected to 125 bp, paired-end RNA-sequencing on the Illumina HiSeq 2500 with an average of 50 million reads/sample at the Centre for Applied Genomics, Sick Kids Hospital. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  85 
 
  
    EGAD00001006577 
   
  
    
    RNAseq and WES of liver metastases samples (resections and biopsies) of CM and UM patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  103 
 
  
    EGAD00001006578 
   
  
    
    This dataset contains three cram files for paired end sequencing of a trio, sequenced with Illumina Hiseq 2500 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001006579 
   
  
    
    This dataset comprises Circle-seq data for 12 neuroblastoma cell lines supporting Koche et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma (2020). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001006580 
   
  
    
    Circle-seq data for 21 primary neuroblastoma samples supporting Koche et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma (2020). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      MinION 
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001006581 
   
  
    
    This is human phenotype data for participants in a gut microbiome study. This data was collected at the same time as the stool samples used for the microbiome component.  Participants were also part of the AWI-Gen Phase 1 main study. https://www.ebi.ac.uk/ena/data/view/PRJEB40733 
    
   
  
    
   
  171 
 
  
    EGAD00001006582 
   
  
    
    To investigate the molecular and biological pathways altered by S1PR3OE in human hematopoietic stem cells (HSC), we performed RNA-sequencing (RNA-seq) of LT- and ST-HSC 3 days after transduction with control or S1PR3 overexpression (OE) lentiviral vectors. LT-HSC and ST-HSC from 3 pools of CB lin- cells were FACS-purified, prestimulated for 4 hours and transduced with lentiviral vectors. At day 3, 2000-5300 BFP+ cells were FACS-purified for RNA isolation with a PicoPure kit. We were able to isolate only 1600-1800 BFP+ cells from LT-HSC control samples, as opposed to 4000-5400 BFP+ cells from S1PR3OE samples. We therefore pooled all control BFP+ LT-HSC cells into one sample for RNA-seq analysis. BFP- LT-HSC from control vector transduction were purified from CB1 as an additional LT-HSC control. Nextera libraries generated from 10 ng RNA from 5 LT-HSC samples (2 controls, 3 S1PR3OE) and 6 ST-HSC samples (3 controls, 3 S1PR3OE) were subjected to 125 bp, paired-end RNA-sequencing on the Illumina HiSeq 2500 with an average of 50 million reads/sample at the Center for Applied Genomics, Sick Kids Hospital. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001006583 
   
  
    
    Human medulloblastoma xenografts isolated from mouse brain were frozen, and genomic DNA or RNA was extracted. Bisulfite-converted DNA was processed and hybridised to Illumina EPIC or 450K arrays using standard protocols. RNAseq was performed on total RNA. 
    
   
  
    
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001006584 
   
  
    
    A deeper understanding of the pathological mechanisms of SARS-CoV-2 infection is required to combat COVID-19. Through this dataset, we analyze postmortem lung cells from patients that are infected/uninfected with SARS-CoV-2 with snRNA-seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001006585 
   
  
    
    In our study, we hypothesized that CD34+ progenitors from cases with undetectable minimal residual disease (MRD) by flow cytometry would contain cells with leukemia-initiating potential that could be identified on genetic (rather than phenotypic) grounds by whole exome sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD00001006587 
   
  
    
    94 samples from the multi-omics analysis of ALT-positive neuroblastoma tumors, RNA sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  715 
 
  
    EGAD00001006589 
   
  
    
    The dataset contains transcriptional data obtained by total RNA sequencing on an Illumina machine. 59 samples are from the Hammersmith Hospital (HH) cohort of human primary ovarian tumours and 20 samples are from the ovarian cancer cell lines Kuramochi (3 replicates) and Ovsaho (2 replicates) treated with 1 uM of the DNMTi guadecitabine or vehicle, at an early (day 5) or late (day 8) timepoint. 
    
   
  
    
   
  79 
 
  
    EGAD00001006591 
   
  
    
    Stem cells within prostate epithelium frequently undergo malignant transformation, but there is limited information on their clonal dynamics and mutation burden in healthy human prostates. We sequenced whole genomes from 409 microdissections of prostate epithelium across 8 donors, using phylogenetic reconstruction with spatial mapping in a 59-year-old man’s prostate to provide high-resolution reconstruction of tissue dynamics across the lifespan. Somatic mutation burden increases linearly with age, at ~16 mutations/year/clone, and is higher in peripheral than peri-urethral regions. The 24-30 independent glandular subunits are established as rudimentary ductal structures during fetal development by 5-10 embryonic cells each. Puberty induces formation of further side branches and terminal acini by local stem cells disseminated through the rudimentary ducts during development. During adult tissue maintenance, clonal expansions are small, with limited geographic scope and minimal migration. Driver mutations are rare in normal ageing prostate epithelium, but the one canonical driver we did observe generated a sizable intraepithelial clonal expansion. By resolving unbiased, continuously occurring lineage-marking mutations, we define stem cell dynamics through embryogenesis, puberty and ageing, with relevance for prostate cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  49 
 
  
    EGAD00001006593 
   
  
    
    Whole genome sequencing data used in the manuscript: DNA polymerase and mismatch repair exert distinct microsatellite instability signatures in normal and malignant human cells. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001006594 
   
  
    
    EORTC RP1335 SPECTA Lung cancer data - Oncomine dataset 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  350 
 
  
    EGAD00001006595 
   
  
    
    This dataset contains 160 single-cell derived blood colonies from two neonates and 6 adults. It also contains 18 samples that were used as matched normals to call mutations in NanoSeq data (dataset EGAD00001006459). 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001006596 
   
  
    
    The goal of this project was to perform long-read RNA sequencing (LR-seq, PacBio) in combination with short-read RNA-seq for systematic characterization of the isoform diversity in primary breast tumor samples. We sequenced the full-length transcriptomes of 26 breast tumors and 4 normal breast samples. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD00001006597 
   
  
    
    The goal of this project was to perform long-read RNA sequencing (LR-seq, PacBio) in combination with short-read RNA-seq for systematic characterization of the isoform diversity in primary breast tumor samples. We sequenced the full-length transcriptomes of 26 breast tumors and 4 normal breast samples. 
    
   
  
    
      
      PacBio RS II 
      
      Sequel 
      
    
   
  26 
 
  
    EGAD00001006598 
   
  
    
    The dataset was generated to study the metastatic mechanism of pancreatic ductal adenocarcinoma (PDAC). It consists of paired-end raw RNA sequencing reads from 33 fresh-frozen PDAC specimens, comprising 6 tumor-adjacent normal tissues (N), 13 primary tumors (PT), and 14 hepatic metastases (HM) from 14 PDAC patients (6 N-PT-HM trios, 7 PT-HM pairs, and 1 HM). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  32 
 
  
    EGAD00001006599 
   
  
    
    This dataset contains paired-end raw whole exome sequencing data of matched primary tumors (PT) and hepatic metastases (HM) of pancreatic ductal adenocarcinoma (PDAC). Eight tumor-adjacent normal tissues (N) were also evaluated. In total, there are 30 specimens from 11 PDAC cases, including 8 PT-HM-N trios and 3 PT-HM pairs. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  30 
 
  
    EGAD00001006601 
   
  
    
    This dataset consists of ATAC-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting TET2, IRF4 and EGR2. In total, it includes 26 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  24 
 
  
    EGAD00001006602 
   
  
    
    This dataset consists of ChIP-seq data from human monocytes, monocyte-derived dendritic cells as well as monocyte-derived cells that were subjected to mRNA transfection for PU.1, IRF4, and EGR2. In total, the dataset includes 18 samples. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  18 
 
  
    EGAD00001006603 
   
  
    
    This dataset consists of 5hmC capture-seq data from human monocytes and monocyte-derived dendritic cells. It includes two biological replicates and three time points. Including controls, the dataset comprises 10 samples in total. 
    
   
  
    
      
      Illumina HiSeq 1000 
      
      NextSeq 550 
      
    
   
  10 
 
  
    EGAD00001006604 
   
  
    
    This dataset consists of RNA-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting TET2, IRF4 and EGR2. In total, it includes 43 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  43 
 
  
    EGAD00001006608 
   
  
    
    Single-cell RNA and VDJ sequencing of early breast cancer 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  84 
 
  
    EGAD00001006609 
   
  
    
    RNASeq files for paper titled "Prognostic and therapeutic significance of leukemia subtypes and minimal residual disease measurements in pediatric acute lymphoblastic leukemia treated with contemporary risk-directed trial: a cohort study" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  122 
 
  
    EGAD00001006610 
   
  
    
    We devised an approach to disentangle the TCR and CD28 pathways upon stimulation in naive and memory primary human CD4+ T cells (Tcons) in response to defined stimulatory signals. Sorted Tcons were activated using a titration of anti-CD3 and anti-CD28 in combination as well as individually. As a control we cultured cells in the same conditions but without the stimuli. In total, we defined seven conditions from four individuals for sequencing.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  74 
 
  
    EGAD00001006611 
   
  
    
    We devised an approach to disentangle the TCR and CD28 pathways upon stimulation in naive and memory primary human CD4+ T cells (Tcons) in response to defined stimulatory signals. Isolated memory and naïve T cells were activated using anti-CD3, anti-CD28 or both in combination. As a control we cultured cells in the same conditions but without the stimuli. We carried out ChIPmentation using the H3K27ac antibody on 200,000 cross-linked cells. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001006612 
   
  
    
    We devised an approach to disentangle the TCR and CD28 pathways upon stimulation in naive and memory primary human CD4+ T cells (Tcons) in response to defined stimulatory signals. Sorted Tcons were activated using a titration of anti-CD3 and anti-CD28 in combination as well as individually. As a control we cultured cells in the same conditions but without the stimuli. In total, we defined seven conditions from four individuals for sequencing.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001006613 
   
  
    
    linking 82 samples/82 runs of WES from EGAS00001004338 Umbrella study to EGAS0001004786 study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001006614 
   
  
    
    linking 58 samples/58 runs of WGS from EGAS00001004338 Umbrella study to EGAS0001004786 study 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001006615 
   
  
    
    linking 55 samples/74 RNA-Seq runs -  out of EGAS00001004338 Umbrella study to EGAS0001004786 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001006616 
   
  
    
    This dataset contains somatic alteration calls summarized at the gene level for 715 patients profiled by FoundationOne (Foundation Medicine). 
    
   
  
    
   
  1538 
 
  
    EGAD00001006617 
   
  
    
    This dataset contains demographic, histology, PDL1 IHC, TMB and outcome data (PFS and ORR) for 836 patients, 823 of which had RNAseq, 715 had FMI, with 702 patients having both. The dataset also includes xCell deconvolution scores for patients with RNAseq data. 
    
   
  
    
   
  1538 
 
  
    EGAD00001006618 
   
  
    
    This dataset contains log2(TPM + 1) transformed counts for the 823 tumor samples profiled by RNAseq. 
    
   
  
    
   
  1538 
 
  
    EGAD00001006619 
   
  
    
    This dataset contains FASTQ files for tumors from the 823 patients profiled by RNAseq. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  823 
 
  
    EGAD00001006620 
   
  
    
    RNAseq of tumours derived from NSCLC (n=3 PDX; n=1 CDX) and melanoma (n=1 CDX) xenograft models treated with ADC or controls (64 samples; 189 FASTQ files). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001006621 
   
  
    
    This dataset contains DNA sequencing data for twelve hepatoblastoma tumor samples, four of which have matched normals. Nine of the samples also have RNA sequencing data from the tumor sample.
The dataset comprises six samples from patient tissues and six from cell lines; see metadata for details. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001006622 
   
  
    
    This dataset comprises DNA sequencing for tumor and matched normal from an Alveolar Rhabdomyosarcoma patient. It also includes RNA sequencing data from the tumor sample. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001006623 
   
  
    
    We report a patient with mycobacterial disease due to inherited deficiency of the transcription factor T-bet. PBMCs from the patient and his heterozygous father were analyzed with scRNA-seq.
The dataset comprises 2 single-cell RNA samples generated using 10x Genomics technology and processed with Cell Ranger. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001006624 
   
  
    
    31 single-cell transcriptomes of neuroblastomas and normal human developing adrenal glands at various stages of embryonic and fetal development 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  31 
 
  
    EGAD00001006625 
   
  
    
    144 samples from individuals with ALT-positive neuroblastoma tumors, ChIP-seq sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  144 
 
  
    EGAD00001006626 
   
  
    
    238 samples from individuals with ALT-positive neuroblastoma tumors, high coverage whole genome sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  238 
 
  
    EGAD00001006627 
   
  
    
    Fastq files for single cell RNA sequencing of cells from ovarian cancer tumor and ascites (10X chromium 5' v1.1 libraries). Multiple cases are pooled in each sequencing library. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001006628 
   
  
    
    RNA was extracted from fresh frozen LMS material for 29 untreated tumors (24 primary tumors, 5 metastatic relapses) and 13 tumors treated with radiation (7 primaries, 6 metastatic relapses).
RNA-Seq sequencing was performed using established protocols on Illumina HiSeq 2500 
    
   
  
    
   
  51 
 
  
    EGAD00001006629 
   
  
    
    The dataset contains FASTQ files for the study "Small RNA sequencing from CSF extracellular vesicles - PD/CTR". For this project, RNA was isolated from CSF extracellular vesicles obtained by ultracentrifugation. Libraries were prepared with the TruSeq Small RNA library prep kit (Illumina), and sequencing was conducted on the Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  104 
 
  
    EGAD00001006630 
   
  
    
    Clinical data from IMvigor210, POPLAR, IMmotion150: Clinical data include demographics, tumor type, PD-L1 IHC, tumor mutation burden, objective response rate, overall survival and progression free survival for 611 patients across IMvigor210, POPLAR and IMmotion150.
Clinical data from PCD4989g: Clinical data include tumor type, PD-L1 expression and objective response rate for 206 patients from PCD4989g. 
    
   
  
    
   
  1651 
 
  
    EGAD00001006631 
   
  
    
    RNAseq FASTQ files from 817 bulk pre-treatment tumors from three indications (mUC, NSCLC and RCC) across three phase II clinical trials (IMvigor210, POPLAR, IMmotion150) and one phase I clinical trial (PCD4989g). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  817 
 
  
    EGAD00001006632 
   
  
    
    Whole exome sequencing FASTQ files from 469 pre-treatment tumors from IMvigor210, POPLAR and IMmotion150, with matched PBMC samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  834 
 
  
    EGAD00001006634 
   
  
    
    The dataset contains 16 files in total: technical duplicates of 8 unique pre- and post-BCG samples collected from four non-muscle-invasive bladder cancer patients. These are bulk RNAseq samples generated on a high-throughput sequencing platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001006636 
   
  
    
    RNA sequencing of a total of 41 tumor biopsies taken from 14 patients with colorectal cancer. Ribosomal RNA was removed using the Ribo-Zero Gold rRNA Removal Kit (Illumina, CA, USA), and paired-end sequencing was performed using the ScriptSeq v2 RNA-seq Library Preparation Kit (Illumina). Data processing of the paired raw sequence reads was performed using TopHat2, with mapping to the human reference genome HG19. Forty-one BAM files with reads mapped to the human reference genome (HG19) are enclosed. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  41 
 
  
    EGAD00001006637 
   
  
    
    89 samples of individuals with ALT-positive neuroblastoma tumors, exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  89 
 
  
    EGAD00001006638 
   
  
    
    97 samples of individuals with ALT-positive neuroblastoma tumors, low coverage whole genome sequencing 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  97 
 
  
    EGAD00001006639 
   
  
    
    63 samples of individuals with ALT-positive neuroblastoma tumors, high coverage whole genome sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  63 
 
  
    EGAD00001006640 
   
  
    
    Aligned whole-genome sequencing and RNA-seq of localised prostate cancer for study 'Loss of SNAI2 in prostate cancer correlates with clinical response to androgen deprivation therapy'. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  109 
 
  
    EGAD00001006641 
   
  
    
    During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006642 
   
  
    
    During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001006643 
   
  
    
    During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  85 
 
  
    EGAD00001006644 
   
  
    
    This dataset contains single-cell RNA data from 3 patients with HPV-driven warts, generated on the 10x Genomics platform and aligned to the GRCh38 reference using the Cell Ranger tools. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  4 
 
  
    EGAD00001006645 
   
  
    
    10 samples (one baseline, 9 on-treatment). FASTQ files containing 5' GEx data, prepared using the 10x Genomics pipeline and sequenced on the Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001006646 
   
  
    
    The data set consists of fastq raw files from RNA-seq of seven mucosal biopsies of the colon from seven patients, among them three patients with irritable bowel syndrome with mixed type symptoms. Paired end sequencing on Illumina NovaSeq 6000 was used. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD00001006648 
   
  
    
    Genetic redundancy has evolved as a way for human cells to survive the loss of genes that are single copy and essential in other organisms, but also allows tumours to survive despite having highly rearranged genomes. In this study we CRISPR screen 1,191 gene pairs, including paralogues and known and predicted synthetic lethal interactions to identify 105 gene combinations whose co-disruption results in a loss of cellular fitness. 27 pairs influence fitness across multiple cell lines including the paralogues FAM50A/FAM50B, two genes of unknown function. Silencing of FAM50B occurs across a range of tumour types and in this context disruption of FAM50A reduces cellular fitness whilst promoting micronucleus formation and extensive perturbation of transcriptional programmes. This dataset includes CRISPR screening of cancer cell lines, RNA sequencing studies of cancer cell lines and also data from the sequencing of tumour xenografts collected from mice. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD00001006649 
   
  
    
    Genetic redundancy has evolved as a way for human cells to survive the loss of genes that are single copy and essential in other organisms, but also allows tumours to survive despite having highly rearranged genomes. In this study we CRISPR screen 1,191 gene pairs, including paralogues and known and predicted synthetic lethal interactions to identify 105 gene combinations whose co-disruption results in a loss of cellular fitness. 27 pairs influence fitness across multiple cell lines including the paralogues FAM50A/FAM50B, two genes of unknown function. Silencing of FAM50B occurs across a range of tumour types and in this context disruption of FAM50A reduces cellular fitness whilst promoting micronucleus formation and extensive perturbation of transcriptional programmes. This dataset includes CRISPR screening of cancer cell lines, RNA sequencing studies of cancer cell lines and also data from the sequencing of tumour xenografts collected from mice. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  55 
 
  
    EGAD00001006650 
   
  
    
    This dataset contains raw reduced representation bisulfite sequencing (RRBS) data obtained from human brain samples corresponding to 3 young and 3 old individuals (aging context), and 3 normal and 3 glioblastoma samples (tumor context). RRBS libraries were prepared at Diagenode SA and samples were sequenced using the Illumina NovaSeq 6000 sequencing platform. The accompanying samples from this study (mouse tissues) are located in the ENA database under the accession number PRJEB41460. 
    
   
  
    
   
  12 
 
  
    EGAD00001006653 
   
  
    
    This dataset contains paired-end whole-exome sequencing data (2x50 bp) from the normal sample, three synchronous primary tumors and the recurrence of a head and neck cancer patient. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001006654 
   
  
    
    This dataset contains paired-end RNA sequencing data (2x50 bp) from the three synchronous primary tumors and the recurrence of a head and neck cancer patient. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001006655 
   
  
    
    This dataset contains PBMC genome-wide RNAseq reads from 21 samples and one expression matrix file after alignment and aggregation of the 21 samples. The samples are case-control drawn on day 6 from long-term GFD treated CD patients after 3 day oral gluten challenge, on day 0 from patient controls on long-term GFD treatment, and on day 6 from 4 week GFD treated healthy controls after 3 day oral gluten challenge. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001006656 
   
  
    
    For this project about non-muscle invasive bladder cancer (NMIBC), we analysed total RNA-seq data from 535 patients. Sequencing of total RNA was performed using ScriptSeq-v2 RNA-Seq Library Preparation Kit (Illumina) and KAPA RNA HyperPrep Kit with RiboErase HMR (Roche). RNA input was 500 ng for both kits. The dataset is composed of 1,596 fastq files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  535 
 
  
    EGAD00001006657 
   
  
    
    This dataset comprises bulk RNA sequencing data from 40 patient-derived gastro-intestinal neuroendocrine neoplasms (GEP-NEN). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  40 
 
  
    EGAD00001006658 
   
  
    
    Germline exome tumor/control pairs for 41 medulloblastoma cases, MBG cohort sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      HiSeq X Ten 
      
      unspecified 
      
    
   
  53 
 
  
    EGAD00001006659 
   
  
    
    218 control exomes, CEF cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  218 
 
  
    EGAD00001006660 
   
  
    
    Tumor/Control pairs for 8 medulloblastoma cases, MB cohort, mixed exome and whole genome data, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00001006661 
   
  
    
    Exome controls for 70 individuals, PAN-GATC cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  70 
 
  
    EGAD00001006662 
   
  
    
    3 control whole genomes, SF cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001006663 
   
  
    
    9 tumor/control exomes, SJMB samples, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      unspecified 
      
    
   
  18 
 
  
    EGAD00001006664 
   
  
    
    6 control exomes, TB cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001006665 
   
  
    
    225 clinical cases, control exomes with some paired tumor data, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature). 
    
   
  
    
      
      unspecified 
      
    
   
  264 
 
  
    EGAD00001006666 
   
  
    
    This dataset contains 45 pairs of colorectal tumor and adjacent normal tissue. Four of them are from a previous study, EGAS00001002477.
For each sample, a BAM was generated by aligning to GRCh37.
These colorectal cancers all have the MSI phenotype. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  90 
 
  
    EGAD00001006667 
   
  
    
    This dataset contains 133 pairs of colorectal tumor and adjacent normal tissue.
For each sample, paired RNA-seq fastq files were generated using an Illumina HiSeq 2000.
These colorectal cancers comprise 101 MSI and 32 MSS tumors. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  266 
 
  
    EGAD00001006668 
   
  
    
    WES performed on 15 CUP-derived samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001006669 
   
  
    
    This dataset contains 91 RNAseq paired reads, in fastq format. Samples were collected from fresh bone marrow and peripheral blood samples from AML patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  91 
 
  
    EGAD00001006670 
   
  
    
    This dataset contains 18 ATACseq reads, in fastq format. Samples were collected from fresh bone marrow and peripheral blood samples from AML patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001006671 
   
  
    
    The SARS-CoV-2 pandemic has led to increasing numbers of COVID-19 patients all over the world. Aetiopathologies range from no symptoms and mild flu-like illness to severe cases succumbing to respiratory failure. Reports of a dysregulated immune system in severe cases, showing similarities to cytokine release syndrome, call for better characterization and understanding of the changes in the immune system, as well as their variance across COVID-19 patients, in order to design host-directed therapies accordingly. Here, we profiled blood transcriptomes of 39 COVID-19 patients and 10 control donors. Enriched granulocyte signatures in whole blood samples were verified in granulocyte samples from 49 COVID-19 patients in a second cohort. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  41 
 
  
    EGAD00001006673 
   
  
    
    Please note: This synthetic data set (with cohort “participants” / ”subjects” marked with FAKE) has no identifiable data and cannot be used to make any inference about cohort data or results. The purpose of this dataset is to aid development of technical implementations for cohort data discovery, harmonization, access, and federated analysis. In support of FAIRness in data sharing, this dataset  is made freely available under the Creative Commons Licence (CC-BY). Please ensure this preamble is included with this dataset and that the  CINECA project (funding: EC H2020 grant 825775) is acknowledged. For any questions please contact isuru@ebi.ac.uk or cthomas@ebi.ac.uk
This dataset (CINECA_synthetic_cohort_EUROPE_UK1) consists of 2521 samples which have genetic data based on 1000 Genomes data (https://www.nature.com/articles/nature15393), and synthetic subject attributes and phenotypic data derived from UKBiobank (https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001779). These data were initially derived using the TOFU tool (https://github.com/spiros/tofu), which generates random values based on the UKBiobank data dictionary. Categorical values were randomly generated based on the data dictionary, continuous variables were generated based on the distribution of values reported by the UK Biobank showcase, and date / time values were random. Additionally, we split the phenotypes and attributes into 4 main classes - general, cancer, diabetes mellitus, and cardiac. We assigned the general attributes to all the samples, and the cardiac / diabetes mellitus / cancer attributes to a proportion of the total samples. Once the initial set of phenotypes and attributes was generated, the data were checked for consistency and, where possible, dependent attributes were calculated from the independent variables generated by TOFU. For example, BMI was calculated from height and weight data, and age at death was derived from date of death and date of birth. These data were then loaded to the development instance of Biosamples (https://www.ebi.ac.uk/biosamples/) which accessioned each of the samples. 
The genetic data are derived from the 1000 Genomes Phase 3 release (https://www.internationalgenome.org/category/phase-3/). The genotype data consist of a single joint-call VCF file with called genotypes for all 2504 samples, plus bed, bim, fam, and nosex files generated via plink for these samples and genotypes. The genotype data have had a variety of errors introduced to mimic real data and as a test for quality control pipelines. These include gender mismatches, ethnic background mislabelling and low call rates for a randomly chosen subset of sample data, as well as deviations from Hardy-Weinberg equilibrium and low call rates for a random selection of variants. Additionally, 40 samples have raw genetic data available in the form of both bam and cram files, including unmapped data. The gender of the samples in the 1000 Genomes data has been matched to the synthetic phenotypic data generated for these samples. The genetic data were then linked to the synthetic data in BioSamples, and submitted to EGA. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  448 
 
  
    EGAD00001006674 
   
  
    
    RNASeq files for paper titled "The Acquisition of Molecular Drivers in Pediatric Therapy-Related Myeloid Neoplasms" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001006675 
   
  
    
    WXS files for paper titled "The Acquisition of Molecular Drivers in Pediatric Therapy-Related Myeloid Neoplasms" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  137 
 
  
    EGAD00001006676 
   
  
    
    WGS files for paper titled "The Acquisition of Molecular Drivers in Pediatric Therapy-Related Myeloid Neoplasms" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  35 
 
  
    EGAD00001006677 
   
  
    
    Single cell RNA-sequencing of sternal bone marrow residing Hematopoietic Stem Cells (HSCs) and Megakaryocytes (MKs) from individuals undergoing elective open heart valve replacement. HSCs were defined as Lineage-, CD34+, CD38-, CD45RA-, CD90+, CD49f+ cells. MKs were CD41a+, CD42b+ and ploidy was determined with Hoechst.
A sternal bone marrow scraping was taken directly following median sternotomy using a Volkmann’s spoon. The sample was collected into an EDTA Vacutainer tube containing 1.8mg/ml EDTA. 4mL of Dulbecco’s phosphate buffered saline (PBS, Sigma) containing 10% human serum albumin (HSA, Gemini Bio Products) was added and the whole volume was resuspended by pipetting 2-3 times. The sample was then put on metallic thermal beads (ThermoFisher Scientific) at a temperature between 0-4°C and transported to the University of Cambridge for further processing.
For HSC isolation the cells were stained with the following antibody cocktail: PECy5 conjugated anti-lineage specific antibodies: CD2 (BD), CD3 (BD), CD10 (BD), CD11b (BD), CD11c (BD), CD19 (BD), CD20 (BD), CD56 (BD), biotinylated CD42b (Pab5, NHS Blood and Transplant, International Blood Group Reference Laboratory [IBGRL]), biotinylated GP6 (Pab5, NHS Blood and Transplant, International Blood Group Reference Laboratory [IBGRL]) used in combination with PECy5 conjugated streptavidin (Biolegend). Alexa Fluor 700 conjugated anti-CD34 (BD), PerCP-Cy5.5 conjugated anti-CD38 (BD), Pacific Blue conjugated anti-CD45RA (Invitrogen), PECy7 conjugated anti-CD90 (BD), and PE conjugated anti-CD49f (BD). After staining, cells were kept at 4°C before sorting using a FACS Aria Fusion flow sorter (BD). Single HSCs defined as Lineage-, CD34+, CD38-, CD45RA-, CD90+, CD49f+ cells were sorted by FACS directly into individual wells of a 96-well plate. Index sort data was collected for each single cell. For MK isolation the cells were stained for surface MK markers with mouse anti-human CD41a APC conjugated antibody (BD) and mouse anti-human CD42b PE conjugated antibody (BD) and for ploidy analysis with 1 µg/ml Hoechst 33342 (Invitrogen). After incubation at 37°C for 30 minutes, the cells were kept at 4°C before sorting using a FACS Aria Fusion flow sorter (BD). Single cells and MK pools of 20 cells were sorted by FACS according to ploidy level using a 100 µm nozzle directly into individual wells of a 96-well plate.
cDNA synthesis and poly(A) enrichment were performed following the G&T-seq protocol (Macaulay et al. 2015), a variation of the Smart-seq2 protocol. ERCC spike-in RNA (Ambion) was added to the lysis buffer in a dilution of 1:4,000,000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2383 
 
  
    EGAD00001006678 
   
  
    
    HiC files for GenomePaint paper titled "Exploration of coding and non-coding variants in cancer using GenomePaint." 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001006679 
   
  
    
    WGS files for GenomePaint paper titled "Exploration of coding and non-coding variants in cancer using GenomePaint." 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001006680 
   
  
    
    RNASeq files for GenomePaint paper titled "Exploration of coding and non-coding variants in cancer using GenomePaint." 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001006681 
   
  
    
    DNAs were genotyped on Illumina Infinium HumanCoreExome Beadchips (Illumina Inc., San Diego, CA, USA) with probes for 551,004 single nucleotide variants (SNVs): 282,373 informative across ancestries; 268,631 exome-focused.  Human genome build 37 (hg19) was used. 
    
   
  
    
   
  1 
 
  
    EGAD00001006701 
   
  
    
    As more clinically-relevant genomic features of myeloid malignancies are revealed, it has become clear that targeted clinical genetic testing is inadequate for risk stratification. Here, we developed and validated a clinical transcriptome-based assay for stratification of acute myeloid leukemia (AML). Comparison of RNA-Seq to whole genome and exome sequencing revealed that as a standalone assay, RNA-Seq offered the greatest diagnostic return, enabling identification of expressed gene fusions, single nucleotide and short insertion/deletion variants, and whole-transcriptome expression information. Expression data were used to develop a novel risk score which, when combined with molecular risk guidelines, allowed for the re-stratification of 22.1 to 25.3% of AML patients from three independent cohorts into correct risk groups. Within the adverse-risk subgroup, we identified a subset of patients characterized by dysregulated integrin signaling and RUNX1 or TP53 mutation. We show that these patients may benefit from therapy with inhibitors of focal adhesion kinase (PTK2), demonstrating additional utility of transcriptome-based testing for therapy selection in myeloid malignancy. 
    
   
  
    
   
  275 
 
  
    EGAD00001006730 
   
  
    
    The dataset includes whole exome DNA sequencing on pre-treatment tumor biopsies of lymph node metastases (n=60) matched with blood samples (n=60) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  120 
 
  
    EGAD00001006731 
   
  
    
    The dataset includes RNA sequencing on pre-treatment tumor biopsies of lymph node metastases (n=65) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  65 
 
  
    EGAD00001006732 
   
  
    
    Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – patient metadata (Mutographs) 
    
   
  
    
   
  - 
 
  
    EGAD00001006733 
   
  
    
    This dataset contains all available targeted exon sequencing bam files from our study, "BRCA2, ATM, and CDK12 defects differentially shape prostate tumor driver genomics and clinical aggression". Patient identifiers are denoted by the first segment of the sample aliases (e.g. "P1"), and additional information is appended to reflect which sample is referenced. These include serial cfDNA samples ("-1", "-2", "-3", etc.), paired white blood cell or benign tissue control samples ("-Control"), or primary archival tissue samples derived from a diagnostic biopsy, prostatectomy, or transurethral resection of the prostate ("-Tissue"). All samples were sequenced using Illumina technology. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  368 
 
  
    EGAD00001006734 
   
  
    
    Human fecal WMS data from patients treated with combined anti-CTLA-4 and anti-PD-1 immunotherapy for advanced melanoma. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  46 
 
  
    EGAD00001006735 
   
  
    
    Human fecal 16S rRNA gene sequencing data from patients treated with combined anti-CTLA-4 and anti-PD-1 immunotherapy for advanced melanoma. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  54 
 
  
    EGAD00001006736 
   
  
    
    Genotyping of 244 early RA patients and 44 vaccine recipient controls was performed using the Illumina InfiniumCoreExome-24-v1-1 according to the manufacturer’s SOP. Raw idats from the Illumina iScan instrument were imported into GenomeStudio (v2011.1). Samples with a call rate < 90% were excluded. Data were exported to PLINK PED/MAP format on the forward strand, then converted from PED/MAP to BED/BIM/FAM using PLINK v1.07. 
    
   
  
    
   
  276 
 
  
    EGAD00001006737 
   
  
    
    Proteome data of neuroblastoma patients 
    
   
  
    
   
  34 
 
  
    EGAD00001006738 
   
  
    
    Data supporting: "Evidence that polyploidy in esophageal adenocarcinoma originates from mitotic slippage caused by defective chromosome attachments" Scott et al.
WGS and RNAseq sequencing data
Organoid, tumour and normal samples
BAM files 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001006739 
   
  
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  17 
 
  
    EGAD00001006740 
   
  
    
    40 samples of WES and their normal controls; 33 samples of RNAseq data. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  113 
 
  
    EGAD00001006741 
   
  
    
    Matrices of TPM-normalized counts from RNAseq data for the three phase II clinical trials (IMvigor210, POPLAR, IMmotion150) and the phase I clinical trial PCD4989g. 
    
   
  
    
   
  817 
 
  
    EGAD00001006742 
   
  
    
    Pan Prostate Cancer Group UK BAM files 
    
   
  
    
   
  561 
 
  
    EGAD00001006743 
   
  
    
    We performed whole-exome sequencing on 46 pairs of PSCCE and matched normal samples. Somatic mutations were called using MuTect2. 
    
   
  
    
   
  46 
 
  
    EGAD00001006744 
   
  
    
    RNA exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006745 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006746 
   
  
    
    Non-tumorous breast tissues from BRCA1 or BRCA2 carriers were subject to RNA sequencing. The total number of samples is 130. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  130 
 
  
    EGAD00001006747 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006748 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006749 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006750 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006751 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006752 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006753 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006754 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006755 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006756 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006757 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006758 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006759 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006760 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006761 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006762 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006763 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006764 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006765 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006766 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006767 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006768 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006769 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006770 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006771 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006772 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006773 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006774 
   
  
    
    CSC DDR dataset contains 4 bam files of two pairs of colorectal CSCs sensitive and resistant to ATR or CHK1 inhibitor 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001006777 
   
  
    
    We generated HPRT-only knockout lines as well as lines combining HPRT knockout with MSH2, UNG and XPC. Whole genome sequencing was performed on the generated clones and subclones. By subtracting variants present in the clones from those in the subclones, we determined the somatic mutations that accumulated between the clonal steps. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001006778 
   
  
    
    Single cell RNA and CITE sequencing of newly-diagnosed and recurrent GBM 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001006779 
   
  
    
    T cells were isolated from human blood and tissues: skin and fat. Subsequently, CD4+ T cells and CD25+ T cells were FACS sorted. scATAC libraries were prepared using the 10x Genomics Kit (CG000168_ChromiumSingleCell_ATAC_ReagentKits_UserGuide_RevD.pdf) and sequenced on an Illumina NextSeq550. In total, 15 samples were prepared. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD00001006780 
   
  
    
    Whole genome sequencing BAMs of DNA obtained from blood/saliva of 8 patients with adult granulosa cell tumors. These patients come from four independent families, with each family having 2 affected family members. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001006781 
   
  
    
    This dataset contains RNA-seq raw data in fastq format from 14 tumor samples. The samples are from primary tumors or metastasis and represent various cancer entities. The samples are formalin-fixed paraffin-embedded (FFPE) treated. For target enrichment SureSelect XT Human All Exon V6 was used. The libraries were sequenced in paired-end mode (2 x 50 nt) on a NovaSeq6000 S2 flow cell. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001006782 
   
  
    
    In this study, we aimed to identify somatic structural variation of acute myeloid leukemia (AML) at the single-cell level and investigate its direct consequence on nucleosome occupancy using the scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of primary leukemia samples from a 32-year-old male donor. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  42 
 
  
    EGAD00001006783 
   
  
    
    This dataset contains RNA-seq raw data in fastq format from 9 melanoma samples. The samples are formalin-fixed paraffin-embedded (FFPE) treated. For target enrichment SureSelect XT Human All Exon V6 was used. The libraries were sequenced in paired-end mode (2 x 50 nt) on a NovaSeq6000 S2 flow cell. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001006784 
   
  
    
    This study aims to identify novel candidate variants from human Y-chromosomal genes DAZ, BPY2 and CDY1/2 by resequencing the coding regions of these genes from male patients with spermatogenic impairment. The coding regions of the genes plus a selection of phylogenetically informative Y-chromosomal markers have been amplified by standard PCR; amplicon lengths range from 178 to 486 bp. Amplicons were quantified by gel electrophoresis and pooled in approx. equimolar concentrations per patient. For each of the 480 submitted samples, approx. 1 microgram of amplified DNA pool was provided in a total volume of 120 microlitres. The samples were indexed and libraries prepared for PE250bp Illumina MiSeq runs. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  480 
 
  
    EGAD00001006785 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  9 
 
  
    EGAD00001006786 
   
  
    
    Multi-region samples are collected from patients, with consent, immediately after resection of the tumour. Samples are digested and sorted using FACS as single cells into lysis buffer. Cells are then stored until further processing for G&T-seq. After sequencing, we will explore intra-tumour heterogeneity using computational approaches to integrate RNA and DNA data onto the tumour phylogeny.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  672 
 
  
    EGAD00001006789 
   
  
    
    eQTLsummary&GeneTable from eQTL study in 299 intestinal biopsy samples from IBD 
    
   
  
    
   
  1 
 
  
    EGAD00001006790 
   
  
    
    intrstinal.eQTLsummary 
    
   
  
    
   
  1 
 
  
    EGAD00001006791 
   
  
    
    eQTL.study.release.phenotype 
    
   
  
    
   
  1 
 
  
    EGAD00001006792 
   
  
    
    eQTLsummary&GeneTable 
    
   
  
    
   
  1 
 
  
    EGAD00001006793 
   
  
    
    Whole genome sequencing of tumour (90X) - normal (30X) patient pair and bulk transcriptome sequencing (80M PE reads) of tumour sample. 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001006794 
   
  
    
    Bulk RNA-seq data of tumour samples. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  218 
 
  
    EGAD00001006795 
   
  
    
    Plasmablastic lymphoma (PBL) represents a clinically heterogeneous subtype of aggressive B-cell non-Hodgkin lymphoma. Although targeted sequencing studies and a single-center whole exome sequencing (WES) study in HIV+ patients recently revealed several genes associated with PBL pathogenesis, the global mutational landscape and transcriptional profile of PBL remain elusive. To inform on disease-associated mutational drivers, mutational patterns and perturbed pathways in HIV+ and HIV- PBL, we performed WES and RNA-sequencing (RNA-seq) of 34 PBL tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  73 
 
  
    EGAD00001006796 
   
  
    
    PBMCs isolated from 27 individuals (11 narcolepsy type 1 patients, 16 healthy controls) were stimulated with the peptide Neuraminidase 175-189 or Protein-O-mannosyl transferase 1 (POMT1) 675-689 or media as control. FACS sorted CD4+ and CD8+ lymphocytes from one patient were subjected to the same stimulation. Transcriptome profiling was done with a 3' tagging protocol. T cell receptor repertoires were profiled with amplicon sequencing (Rep-seq). The dataset contains FASTQ files with sequencing reads, transcript count matrices and TCR clonotypes. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  82 
 
  
    EGAD00001006797 
   
  
    
    PBMCs isolated from 27 individuals (11 narcolepsy type 1 patients, 16 healthy controls) were stimulated with the peptide Neuraminidase 175-189 or Protein-O-mannosyl transferase 1 (POMT1) 675-689 or media as control. FACS sorted CD4+ and CD8+ lymphocytes from one patient were subjected to the same stimulation. Transcriptome profiling was done with a 3' tagging protocol. T cell receptor repertoires were profiled with amplicon sequencing (Rep-seq). The dataset contains FASTQ files with sequencing reads, transcript count matrices and TCR clonotypes. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  82 
 
  
    EGAD00001006798 
   
  
    
    eQTL.study.release.inflammation.eQTLsummary 
    
   
  
    
   
  1 
 
  
    EGAD00001006799 
   
  
    
    This dataset includes single cell amplicon based sequencing from 10 samples from SDS patient bone marrow samples, including one patient with serial samples.  There are two fastq files per sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001006800 
   
  
    
    This dataset includes amplicon based sequencing of myeloid malignancy associated genes as well as EIF6 in 99 patients with Shwachman-Diamond syndrome and 11 patients who are "SDS-like".  SDS-like patients are those that have clinical features of the disease but do not have a confirmed disease-causing mutation.  There are 421 samples from serial timepoints, denoted alphabetically or numerically.  For each timepoint, there is a single BAM file. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  421 
 
  
    EGAD00001006801 
   
  
    
    RNA-seq of dermal fibroblasts treated ± TGF-β from control and Shprintzen-Goldberg syndrome patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  36 
 
  
    EGAD00001006802 
   
  
    
    Single nuclei RNA-sequencing of snap frozen glioblastoma tumor tissue with 10x Genomics 3' expression (v2 chemistry). Aligned to the GRCh38 reference genome, including introns, using CellRanger. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001006803 
   
  
    
    Single cell RNA-sequencing of fresh glioblastoma tumor biopsies with 10x Genomics 3' expression (v2 chemistry). Aligned to GRCh38 reference genome using CellRanger. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  23 
 
  
    EGAD00001006804 
   
  
    
    Single cell RNA-sequencing of glioblastoma stem cell (GSC) lines with 10x Genomics 3' expression (v2 chemistry). Aligned to GRCh38 reference genome using CellRanger. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  29 
 
  
    EGAD00001006807 
   
  
    
    Neutrophils at timepoint 0h 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001006808 
   
  
    
    Neutrophils infected with Leishmania donovani at timepoint 6h 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001006809 
   
  
    
    Neutrophils at timepoint 6h 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001006811 
   
  
    
    Dataset includes cell-free ChIP-seq data of 268 samples (from 61 self-declared healthy donors, four patients with acute myocardial infarction, 29 patients suffering from autoimmune, metabolic, or viral liver diseases and 56 metastatic colorectal carcinoma (CRC) patients).
DNA library preparation is documented in the methods section. Libraries were paired-end sequenced on an Illumina NextSeq 500 and aligned to the human genome (hg19) using bowtie2 (2.3.4.3) with the ‘no-mixed’ and ‘no-discordant’ flags.
This dataset includes fastq and BAM files of all samples. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  271 
 
  
    EGAD00001006812 
   
  
    
    This dataset includes the RNA sequencing of 14 samples. Samples are FACS sorted CD8+ T cells that do or do not express the integrin CD103. The paired samples (TRM and non-TRM) were sorted from the tumors of 7 lung cancer patients. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  14 
 
  
    EGAD00001006813 
   
  
    
    RNA was extracted from GSCs using the Qiagen RNeasy Plus kit. RNA sample quality was measured by Qubit (Life Technologies) for concentration and by Agilent Bioanalyzer for RNA integrity. All samples had RIN above 9. Libraries were prepared using the TruSeq Stranded mRNA kit (Illumina). Two hundred nanograms from each sample were purified for polyA tail containing mRNA molecules using poly-T oligo attached magnetic beads, then fragmented post-purification. The cleaved RNA fragments were copied into first strand cDNA using reverse transcriptase and random primers. This was followed by second strand cDNA synthesis using RNase H and DNA Polymerase I. A single “A” base was added and adapters were ligated, followed by purification and enrichment with PCR to create cDNA libraries. Final cDNA libraries were verified by the Agilent Bioanalyzer for size and concentration quantified by qPCR. All libraries were pooled to a final concentration of 1.8nM, clustered and sequenced on the Illumina NextSeq500 as a paired-end 75 cycle sequencing run using v2 reagents to achieve a minimum of ~40 million reads per sample. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  87 
 
  
    EGAD00001006814 
   
  
    
    The dataset consists of Oxford Nanopore targeted RNA-based amplicon data of 12 classical HLA genes (HLA-A, -B, -C, -DRA, -DRB1, -DRB3, -DRB4, -DRB5, -DQA1, -DQB1, -DPA1, and -DPB1) of 50 healthy individuals. The 12 classical genes were sequenced in two separate gene pools on R9.4 flowcells using a MinION sequencer. Per individual, gene pool 1 contains HLA-A, -B, -C, -DRB1, -DRB3, -DRB4, -DRB5, and -DPB1 and gene pool 2 contains HLA-DRA, -DQA1, -DQB1, and -DPA1. The dataset includes 100 fastq files of Oxford Nanopore 2D reads (50 for gene pool 1 and 50 for gene pool 2). 
    
   
  
    
      
      MinION 
      
    
   
  100 
 
  
    EGAD00001006815 
   
  
    
    This dataset includes paired WES from bone marrow samples of patients with SDS and paired bone marrow-derived fibroblasts as a germline reference.  Some patients have multiple samples collected serially over time.  There are 74 BAM files included in this dataset. 
    
   
  
    
      
      AB 5500 Genetic Analyzer 
      
      Illumina HiSeq 2500 
      
    
   
  74 
 
  
    EGAD00001006816 
   
  
    
    The dataset includes the whole-exome sequencing (WES) of an extramedullary tumor anterior to the spinal cord at T4, which was resected and diagnosed as gliosarcoma. The patient was initially diagnosed with a low-grade brain glioma via biopsy, followed by adjuvant radiation and temozolomide treatment. WES was performed using Illumina NovaSeq6000 with 2x100 bp reads. Mean coverage of 152.4x and 230.6x was achieved for normal and tumor, respectively. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006817 
   
  
    
    CTCF ChIP-seq of 14 leukemia patients: 6 AML without 3q rearrangements, 1 AML with 3q26, 1 AML with t(3;8) and 6 T-ALL 
H3K27ac ChIP-seq of AML patients: 4 cases with t(3;8), another with inv(3) and another with normal karyotype.
H3K27ac ChIP-seq of CD34+ cells from one healthy donor
RUNX1 ChIP-seq of one t(3;8) AML patient 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  21 
 
  
    EGAD00001006818 
   
  
    
    The viewpoints used in the 4C-seq data were either the EVI1 promoter or the MYC super-enhancer 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001006819 
   
  
    
    RNA-seq was generated to investigate differences in gene expression between t(3;8) AML and other primary AMLs. Briefly, sample libraries were prepared using 500 ng of input RNA according to the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Amplified sample libraries were paired-end sequenced (2x100 bp) on the NovaSeq 6000 platform (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001006820 
   
  
    
    This dataset contains DNA sequencing of the chromosome 3q region in 28 primary AML cases with 3q26 rearrangements (3q26-rearranged AML).
Genomic DNA was fragmented using the Covaris shearing device (Covaris), and sample libraries were assembled following the TruSeq DNA Sample Preparation Guide (Illumina). After ligation of adapters and an amplification step, target sequences of chromosomal regions 3q21.1-q26.2 were captured using custom in-solution oligonucleotide baits (Nimblegen SeqCap EZ Choice XL). The design of target sequences was based on the human genome assembly hg19: chr3q21.1:126036241-130672290 - chr3q26.2:157712147-175694147. Amplified captured sample libraries were paired-end sequenced (2x100 bp) on the HiSeq 2500 platform (Illumina) and aligned against the hg19 reference genome using the Burrows-Wheeler Aligner (BWA). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001006821 
   
  
    
    ChIP-seq was conducted in blasts from patients with t(3;3) AML to assess differences of the GATA2 super-enhancer between the translocated allele and the non-translocated allele. The dataset includes 2x H3K27ac ChIP-seq and 1x MYB ChIP-seq.
ChIP samples were processed according to the Illumina TruSeq ChIP Sample Preparation Protocol (Illumina) or Diagenode Library V3 preparation protocol (Diagenode) and either sequenced single-end (1x 50 bp) on the HiSeq 2500 platform (Illumina) or paired-end (2x100 bp) on the Novaseq 6000 platform (Illumina). Briefly, reads were aligned to the human reference genome build hg19 with bowtie for single-end runs and bowtie2 for paired-end runs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001006822 
   
  
    
    In this study, we aimed to identify somatic structural variation of chronic lymphocytic leukemia (CLL) at the single-cell level and investigate its direct consequence on nucleosome occupancy using the scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of primary leukemia samples from a 63-year-old female patient. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  86 
 
  
    EGAD00001006823 
   
  
    
    Approximately 1000 trios with varying degrees of cognitive disorders. All samples have been sequenced for the AnkyrinG interactome using MIPS technology. Data is presented as BAM and unfiltered VCF files. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001006824 
   
  
    
    Whole genome sequencing data (Illumina NovaSeq 6000) of clonal cultures derived from pediatric human bone marrow-derived hematopoietic stem and progenitor cells (in total 35 samples from 7 donors), bulk pediatric acute myeloid/lymphoid leukemia blasts (in total 2 samples from 1 patient) and bulk control mesenchymal stem cell cultures (4 samples from 4 patients) to study the mutation accumulation. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  81 
 
  
    EGAD00001006825 
   
  
    
    The dataset consists of the BAM-files of 745 patients and 810 controls (retained after quality control) of a set of 34 candidate genes obtained after targeted enrichment via Molecular Inversion Probes (MIPS) technology. The sequencing was performed on the NextSeq 500 (Illumina, CA, USA) using custom sequencing and index primers in three 2 x 76 bp, dual indexed runs using a 150 cycles High-Output Illumina kit (Illumina, CA, USA). Alignment of the fastq reads to the human genome was performed using BWA (v0.7.4). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1555 
 
  
    EGAD00001006826 
   
  
    
    This dataset comprises RNA-seq expression profiles from 57 subjects, of whom 39 are DMD patients and 18 are healthy controls. The data are described in the following article:
Signorelli, Ebrahimpoor et al. (in review). Peripheral blood transcriptome profiling enables monitoring disease progression in dystrophic mice and patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  57 
 
  
    EGAD00001006827 
   
  
    
    We performed single-cell RNA-sequencing of cells in the bronchoalveolar lavage (BAL) fluid of patients with severe COVID-19. In addition, we performed single-cell RNA-sequencing of SARS-CoV-2 stimulated classical blood monocytes. This study provides detailed insights into the alveolar macrophage response to SARS-CoV-2 infection and reveals a profibrotic macrophage response in severe COVID-19 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001006828 
   
  
    
    In Coronavirus Disease 2019 (COVID-19), hypertension and cardiovascular diseases are major risk factors for critical disease
progression. However, the underlying reasons and the effect of the main anti-hypertensive therapies—angiotensin-converting
enzyme inhibitors (ACEIs) and angiotensin receptor blockers (ARBs)—remain unclear. Combining clinical data (n = 144) and
single-cell sequencing data of airway samples (n = 48) with in vitro experiments, we observed a distinct inflammatory predisposition
of immune cells in patients with hypertension that correlated with critical disease progression. ACEI treatment was
associated with dampened COVID-19-related hyperinflammation and with increased cell intrinsic anti-viral responses, whereas
ARB treatment related to enhanced epithelial–immune cell interactions. Macrophages and neutrophils of patients with hypertension,
in particular under ARB treatment, exhibited higher expression of the pro-inflammatory cytokines CCL3 and CCL4
and the chemokine receptor CCR1. Although the limited size of our cohort does not allow us to establish clinical efficacy, our
data suggest that the clinical benefits of ACEI treatment in patients with COVID-19 who have hypertension warrant further
investigation. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  33 
 
  
    EGAD00001006829 
   
  
    
    We are presenting raw and processed data of our study where we analyze fine-needle aspirate (FNA) samples of  primary cutaneous B-cell lymphoma patients undergoing oncolytic virotherapy (https://clinicaltrials.gov/ct2/show/NCT03458117). We are uploading single cell RNA-sequencing and immune repertoire profiling data of 29 FNA samples from four pCBCL patients. The four patients have three different subtypes of primary cutaneous B-cell lymphoma: pCDLBCL-LT, pCFCL and pCMZL. The samples are taken at different time points following T-VEC injection (from baseline up to 91 days after injection).
We are uploading the sequences in BAM format and the outputs of the cellranger pipeline (count matrices and filtered VDJ contigs) in CSV format. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  34 
 
  
    EGAD00001006830 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006831 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006832 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006833 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006834 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006835 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006836 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006837 
   
  
    
    RNA-exome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001006838 
   
  
    
    The dataset contains 200 fastq files of Illumina 5'end RNA sequencing data of 50 PBMC samples. The paired-end data includes 100 fastq files (R1 and R2) of 50 full-length cDNA sequencing libraries and 100 fastq files (R1 and R2) of 50 HLA amplicon sequencing libraries. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  100 
 
  
    EGAD00001006840 
   
  
    
    RNA-seq Phase Ib of olaparib and capivasertib 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  74 
 
  
    EGAD00001006841 
   
  
    
    T200 sequencing (Phase Ib of olaparib and capivasertib) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  162 
 
  
    EGAD00001006842 
   
  
    
    Microglia were derived from iPSCs and treated with mimics and inhibitors of the miRNAs hsa-miR-150-5p, hsa-miR-193a-3p and hsa-miR-19b-3p. RNA-sequencing was then performed to examine the effects of up- and down-regulation of the respective miRNAs. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  30 
 
  
    EGAD00001006843 
   
  
    
    CAGE-sequencing was performed on frontal post-mortem human brain tissue of patients with FTD caused by mutations in GRN, MAPT or C9orf72 and healthy controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  57 
 
  
    EGAD00001006844 
   
  
    
    iPSC-derived neurons were treated with mimics and inhibitors of the miRNAs hsa-miR-150-5p, hsa-miR-193a-3p and hsa-miR-19b-3p.
RNA-sequencing was then performed to examine the effects of miRNA up-regulation and inhibition. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  15 
 
  
    EGAD00001006845 
   
  
    
    This dataset contains smRNA-seq data from human post-mortem brain tissue of the frontal lobe of patients with FTD and healthy controls. These samples represent the data generated at the DZNE Göttingen and should be used together with the data generated at the DZNE Tübingen. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  33 
 
  
    EGAD00001006846 
   
  
    
    This dataset contains smRNA-seq data from post-mortem human brain tissue of the frontal lobe of patients with FTD and healthy controls. The smRNA-sequencing was done in two parts; this dataset represents the data generated at the DZNE Tübingen. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  9 
 
  
    EGAD00001006847 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001006848 
   
  
    
    WGS of 17 GSC populations derived from patient tumours 
    
   
  
    
      
      HiSeq X Five 
      
      HiSeq X Ten 
      
    
   
  27 
 
  
    EGAD00001006849 
   
  
    
    Cancer cells enter a reversible drug tolerant persister (DTP) state to evade death from both chemotherapies and targeted agents. It is increasingly appreciated that the DTP state is an important driver of therapy failure and tumor relapse. We combined cellular barcoding and mathematical modeling in patient-derived colorectal cancer xenograft models to identify and characterize the cancer cells capable of generating DTPs in response to standard-of-care chemotherapy. Barcode analysis revealed no loss in clonal complexity of tumors that entered the DTP state and recurred following treatment cessation. Our data fits a mathematical model in which all cancer cells, and not a small subpopulation, possess an equipotent capacity to enter the DTP state. Mechanistically, we determined that DTPs display remarkable transcriptional and functional similarities to diapause, a reversible state of suspended embryonic development triggered by unfavorable environmental conditions. Our study provides new insights into how cancer cells use a developmentally conserved mechanism to drive the DTP state pointing to novel therapeutic opportunities to target diapause-like DTPs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001006850 
   
  
    
    Data from NABUCCO cohort 1 (NCT03387761). This dataset includes Whole exome DNA sequencing 
on bladder tumor samples (n=24) matched with blood samples (n=24). The data is pre-treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001006851 
   
  
    
    Data from NABUCCO cohort 1 (NCT03387761). This dataset includes Tumor mutational burden (TMB) calculated on bladder tumor pre-treatment DNA sequencing data (n=24).
Details about the Tumor Mutational Burden calculation can be found in the Methods section of the Nature Medicine paper (https://doi.org/10.1038/s41591-020-1085-z) 
    
   
  
    
   
  - 
 
  
    EGAD00001006852 
   
  
    
    Data from NABUCCO cohort 1 (NCT03387761). This dataset includes High coverage Whole exome DNA sequencing on pre-treatment bladder tumor samples (n=3) matched with post-treatment metastasised adjacent lymph nodes isolated with laser microdissection (n=3) for 3 unique patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001006853 
   
  
    
    Data from NABUCCO cohort 1 (NCT03387761). This dataset includes the Response labels used for the analysis of the data.
Details about the clinical definitions of Response can be found in the paper (https://doi.org/10.1038/s41591-020-1085-z) 
    
   
  
    
   
  - 
 
  
    EGAD00001006854 
   
  
    
    Data from NABUCCO cohort 1 (NCT03387761). This dataset includes the Transcript read counts derived from the RNA sequencing data. The samples are pre-treatment (n=18) and post-treatment (n=18), and not all samples are paired.
The data processing pipeline can be found in the Methods section of the Nature Medicine paper (https://doi.org/10.1038/s41591-020-1085-z) 
    
   
  
    
   
  - 
 
  
    EGAD00001006855 
   
  
    
    NABUCCO cohort 1 sequencing data. The dataset includes RNA sequencing pre-treatment on tumor samples (n=18) and RNA sequencing post-treatment on bladder tumor samples (n=18). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001006856 
   
  
    
    Data from NABUCCO cohort 1 (NCT03387761). This dataset includes the pre-treatment PD-L1 staining on tumor samples (n=24).
Details about the PD-L1 staining can be found in the paper (https://doi.org/10.1038/s41591-020-1085-z) 
    
   
  
    
   
  - 
 
  
    EGAD00001006857 
   
  
    
    A comprehensive RNA repository (both coding and non-coding) from 17 patients diagnosed with esophageal adenocarcinoma, high-grade dysplastic or non-dysplastic Barrett’s esophagus. Per patient, a blood plasma sample, and a healthy esophageal and disease tissue sample were collected. This dataset includes both mRNA and small RNA sequencing data (fastq.gz files) of all tissue and plasma samples. In total, 102 RNA-seq libraries from 51 samples (17 plasma and 34 tissue samples) were sequenced (plasma mRNA libraries were sequenced twice). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  102 
 
  
    EGAD00001006858 
   
  
    
    Whole genome sequencing data of EBV associated DLBCL of 8 matched tumor-normal patients. Additionally, targeted resequencing data of 47 patients is provided. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  63 
 
  
    EGAD00001006859 
   
  
    
    Osteosarcoma, the most common primary malignant tumour of bone, affects children and adults alike. No fundamental biological differences between paediatric and adult osteosarcoma are known. Here, we apply multi-region whole genome sequencing to an index case of a four-year old child whose aggressive tumour harboured high level, focal amplifications of MYC and CCNE1 connected by translocations. We re-analysed copy number readouts of 258 cases of high-grade osteosarcoma from three different cohorts and identified an additional three cases with MYC and CCNE1 co-amplification, confined to children and associated with aggressive disease. Examining the age distribution of MYC and CCNE1 amplicons across all cases revealed a significant enrichment of focal MYC amplification in children, whereas CCNE1 amplification is not strictly restricted to children. Our findings indicate that amplification of the MYC oncogene, known to be associated with a poor outcome, delineates a variant of osteosarcoma specific to childhood. When co-amplified with CCNE1, it may herald an aggressive disease course. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  8 
 
  
    EGAD00001006860 
   
  
    
    RNA-seq data (44 samples) from tumor tissue specimens pre and post fasting-mimicking diet from 22 early-stage breast cancer patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD00001006861 
   
  
    
    Smart-seq2 single cell RNA sequencing reads from kidney glomerular single cells of healthy human individuals. The dataset contains single-end fastq files of 766 single cells. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  766 
 
  
    EGAD00001006862 
   
  
    
    7 RNA-seq samples in total: CD19n_IgAn (x2) , CD19n_IgAp (x2) , CD19p_IgAn (x1), CD19p_IgAp (x2) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001006863 
   
  
    
    This dataset contains 11 paired-end FASTQ sequences from mRNA-Seq on single human M-II stage oocytes that were collected from gonadotropin stimulated women undergoing fertility treatments. M-II stage oocytes were collected and flash frozen prior to lysis followed by RNA extraction, full length cDNA preparation and amplification using the Ultra-low-input SMART-Seq2 v4 kit from Takara Clontech. Further, these cDNA were used to prepare libraries for sequencing according to the Nextera XT DNA library preparation kit from Illumina. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  11 
 
  
    EGAD00001006864 
   
  
    
    16S amplicon data of nasopharyngeal swabs in a COVID-19 cohort recruited at UZ Leuven. The dataset contains a single experiment, comprising 150 runs corresponding to 125 unique samples. Runs comprise paired fastq files (2*250 bases) obtained from an Illumina MiSeq instrument. 
    
   
  
    
   
  125 
 
  
    EGAD00001006865 
   
  
    
    22 RNA-seq samples of ex-vivo (TN and Treg), cultured Treg, TET1 and untreated mCherry-MOCK 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  22 
 
  
    EGAD00001006866 
   
  
    
    RNA sequencing of 6 samples from individuals with multiple myeloma with selective elimination of immunosuppressive T cells 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001006867 
   
  
    
    Sequence data (paired-end FASTQ format) for 209 samples from 73 sample sites, from 7 individuals. Samples include primary melanomas, metastatic tumours and ctDNA 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  209 
 
  
    EGAD00001006868 
   
  
    
    Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – sequence data (Mutographs) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1145 
 
  
    EGAD00001006869 
   
  
    
    Dataset contains plasma DNA whole genome sequencing on 4 breast cancer patients. It also includes matched germline and tumour whole genome sequencing data. Two benign cancer patients were also sequenced and their plasma DNA and matched germline whole genome sequencing data are included in the dataset. Samples were sequenced on Illumina HiSeqX Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  16 
 
  
    EGAD00001006871 
   
  
    
    TAPS data from 21 patients with HCC, 23 patients with PDAC, 30 non-cancer controls, 4 patients with cirrhosis, and 7 patients with pancreatitis. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  255 
 
  
    EGAD00001006873 
   
  
    
    BAM files (aligned against the hg38 genome) from a targeted amplicon sequencing (139 genes) experiment (median depth 1000X) on 218 samples from Stage 1 epithelial ovarian cancer biopsies. Samples labeled "bis" or "tris" with the same ID are relapses; "left" or "right" samples indicate, in the case of bilateral tumor, from which ovary the sample was taken. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  218 
 
  
    EGAD00001006874 
   
  
    
    Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al.
WGS (BAM files)
5 Barrett's samples
5 normal oesophageal samples
5 normal gastric cardia samples
5 normal duodenal samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001006875 
   
  
    
    In this study, we enhanced 5mC detection using SMRT sequencing by holistically analyzing kinetic signals of a DNA polymerase and sequence context for every base within a measurement window. We employed a convolutional neural network to train a methylation classification model. 
    
   
  
    
      
      NextSeq 500 
      
      Sequel II 
      
    
   
  42 
 
  
    EGAD00001006876 
   
  
    
    Bulk RNA-seq data of tumours in EGAS00001004572. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  226 
 
  
    EGAD00001006877 
   
  
    
    RNA-seq dataset for Mutation-specific non-canonical pathway of PTEN as a distinct therapeutic target for glioblastoma 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  42 
 
  
    EGAD00001006878 
   
  
    
    High-depth WGS of 8 sites of a RET fusion tumour 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001006879 
   
  
    
    Three capture (Agilent’s SureSelectXT HS, Illumina’s Nextera Rapid Capture Custom, and New England Biolabs’ Next Direct Custom) and one amplicon-based (Qiagen’s Human Breast Cancer Panel) targeted sequencing methods on 6-8 paired blood and FFPE from the Malaysian Breast Cancer Cohort. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  56 
 
  
    EGAD00001006880 
   
  
    
    This dataset includes high-coverage genomes (~36x) of 317 individuals from 20 populations of the Pacific (Taiwan, Philippines, Solomon Islands, Vanuatu archipelago), described in “Genomic insights into population history and biological adaptation in Oceania”, by Choin, Mendoza-Revilla, Arauna, and colleagues (Nature 2021). The data is made of 331 fastq files. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  317 
 
  
    EGAD00001006881 
   
  
    
    BAM files (aligned against the hg38 genome) from a shallow whole-genome sequencing experiment (median depth 0.5X) on 218 samples from Stage 1 epithelial ovarian cancer biopsies. Samples labeled "bis" or "tris" with the same ID are relapses; "left" or "right" samples indicate, in the case of bilateral tumor, from which ovary the sample was taken. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  218 
 
  
    EGAD00001006882 
   
  
    
    Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al.
RNAseq (BAM files)
12 Barrett's samples
12 normal oesophageal samples
11 normal gastric cardia samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001006883 
   
  
    
    The dataset contains FASTQ files referring to the study "Multi-omics analysis of Parkinson’s disease midbrains". For this project, RNA was isolated from human postmortem midbrain tissue (PD and Control samples). Libraries were prepared with the TruSeq Small RNA library prep (Small RNA Seq) and the TruSeq Stranded Total RNA Kit (for transcriptomics), both from Illumina. Sequencing for both experimental setups was conducted on the Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00001006884 
   
  
    
    We show that lysosomes are antagonistically controlled by TFEB and MYC to balance catabolic and anabolic processes required for activating LT-HSC and guiding their lineage fate. TFEB-mediated induction of the endolysosomal pathway for membrane receptor degradation limits LT-HSC metabolic and mitogenic activation; this promotes quiescence and self-renewal and governs erythroid-myeloid commitment. By contrast, MYC engages biosynthetic processes while repressing lysosomal catabolism to drive LT-HSC activation. Collectively, our study identifies lysosomes as a central regulatory hub for proper and coordinated stem cell fate determination. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  89 
 
  
    EGAD00001006885 
   
  
    
    The data set comprises 48 samples from term and preterm infants. Expression profiles were generated using different stimuli (O2 3%, 21%, 65%; LPS stimulation). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001006886 
   
  
    
    Short RNA sequencing of post-mortem human hippocampi from the Calgary Brain Bank. The dataset includes patients with Alzheimer's disease (AD) and healthy control individuals (Ctrl). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001006887 
   
  
    
    Patient samples were sequenced by Foundation Medicine, Inc. (Cambridge, MA), using FoundationOne CDx, a comprehensive NGS-based in vitro diagnostic device designed to capture cancer genes. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  320 
 
  
    EGAD00001006888 
   
  
    
    Molecular cancer paper (https://doi.org/10.1186/s12943-021-01327-5): This dataset contains shallow whole-genome sequencing (sWGS) of plasma cell-free DNA from cancer patients and healthy subjects, obtained with both Nanopore and Illumina technology.
A total of 6 cancer patients and 5 healthy subjects have been sequenced with Nanopore; 4 of the cancer patients have also been sequenced with Illumina.
In addition, genomic DNA from white blood cells of one healthy subject, and genomic and 160bp DNA from HEK cells, have been sequenced with Nanopore.
Genome Biology paper: 3 additional healthy samples have been sequenced (HU), and two different bioinformatic pipelines were applied.
2019: Fastqs from the molecular cancer paper were re-demultiplexed and adapter-trimmed (using guppy for multiplex samples, and porechop for singleplex) preserving 5' ends to allow fragmentomics analysis.
HAC: All the samples were basecalled with the same updated High Accuracy model (the latest at the time of the analysis) and post-processed as the 2019 dataset.
Raw FAST5 are currently available upon request, but will be uploaded soon. 
    
   
  
    
      
      GridION 
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001006889 
   
  
    
    Genotype of C3 SNPs in 140 LOTx donor and recipient pairs. 
    
   
  
    
   
  290 
 
  
    EGAD00001006893 
   
  
    
    Single-cell RNA sequencing of bronchoalveolar lavages from COVID-19 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  35 
 
  
    EGAD00001006894 
   
  
    
    Actinic keratoses (AK) are lesions of epidermal keratinocyte dysplasia and are precursors for invasive cutaneous squamous cell carcinoma (CSCC). Identifying the specific genomic alterations driving progression from normal skin-AK-invasive CSCC is challenging due to the massive ultraviolet radiation-induced mutational burden characteristic at all stages of this progression. Here, we report the largest AK whole exome sequencing study to date and perform mutational signature and candidate driver gene analysis on these lesions. We demonstrate in 37 AK, from both immunosuppressed and immunocompetent patients, that there are significant similarities to CSCC in terms of mutational burden, copy number alterations, mutational signatures and patterns of driver gene mutations. We identify 44 significantly mutated AK driver genes and confirm that these genes are similarly altered in CSCC. We identify the azathioprine mutational signature in all AK from patients exposed to the drug, providing further evidence for its role in keratinocyte carcinogenesis. CSCC differ from AK in having higher levels of intra-sample heterogeneity. Alterations in signaling pathways also differ, with immune-related signaling and TGF-β signaling significantly more mutated in CSCC. Integrating our findings with independent gene expression datasets confirms that dysregulated TGF-β signaling may represent an important event in AK-CSCC progression. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  74 
 
  
    EGAD00001006895 
   
  
    
    Paired end whole exome sequencing (WES) data of tumor/normal pairs (sorted malignant CD3+/Vb+ T-cells and CD19+ non-malignant B-cells) for the identification of somatic mutation. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  12 
 
  
    EGAD00001006896 
   
  
    
    Initial WGS of plasma cell neoplasms in fire fighters exposed to the WTC attack 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001006897 
   
  
    
    Simple, Multiplexed, PCR-based barcoding of DNA for Sensitive mutation detection using Sequencing (SiMSen-Seq) of 11 PIK3CA hotspot mutations in plasma DNA of breast cancer patients. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  66 
 
  
    EGAD00001006898 
   
  
    
    Single cell sequencing of 12 ovarian cancer biopsies from 7 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001006899 
   
  
    
    Gluten reactive T-cells from blood samples from patients undergoing a 3 day gluten challenge. Samples were collected on day 6.
Both gluten reactive and non-gluten reactive T-cells were sequenced. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001006900 
   
  
    
    Paired T-cell receptor sequences sequenced from single cells, from intraepithelial CD8+ αβ T-cells. Sequences are from untreated and treated (on a gluten-free diet) celiac disease patients and controls. 
    
   
  
    
   
  19 
 
  
    EGAD00001006901 
   
  
    
    Paired-end shallow whole genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fraction and ploidy in sorted malignant CD3+/Vb+ T-cells and corresponding CD19+ non-malignant B-cells. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  11 
 
  
    EGAD00001006902 
   
  
    
    Bulk WGS fastq files for germline and tumours in EGAS00001004572. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  480 
 
  
    EGAD00001006903 
   
  
    
    This dataset includes 87 scRNA-seq samples of bone marrow aspirates from 20 relapsed/refractory patients, generated with the 3' (v2) kit of the 10x Chromium platform. Bone marrow cells were sorted into CD138+/- fractions using magnetic beads for plasma cell enrichment and processed independently.
For 14/20 patients, multiple treatment timepoints are available, including samples before treatment and at relapse during treatment. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  87 
 
  
    EGAD00001006904 
   
  
    
    This dataset accompanies the publication of Sugita M et al. "Targeting the Epichaperome As an Effective Precision Medicine Approach in a Novel PML-SYK Fusion Acute Myeloid Leukemia", npj Precision Oncology 2021. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  13 
 
  
    EGAD00001006905 
   
  
    
    Whole genome sequencing of 29 samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD00001006906 
   
  
    
    Bulk GRIDSS somatic sv vcfs from tumour-normal analysis in EGAS00001004572 
    
   
  
    
   
  500 
 
  
    EGAD00001006907 
   
  
    
    Bulk Strelka somatic snv vcfs from tumour-normal analysis in EGAS00001004572 
    
   
  
    
   
  500 
 
  
    EGAD00001006908 
   
  
    
    Bulk copy number segments from Purple analysis in EGAS00001004572 
    
   
  
    
   
  252 
 
  
    EGAD00001006909 
   
  
    
    Bulk methylation tumour profiles from infinium methylation epic bead kit in EGAS00001004572 
    
   
  
    
   
  76 
 
  
    EGAD00001006910 
   
  
    
    Bulk Germline snv vcfs from haplotypecaller analysis in EGAS00001004572 
    
   
  
    
   
  247 
 
  
    EGAD00001006911 
   
  
    
    Bulk RNAseq from HCV infected liver biopsies. Two fastq files per sample for paired-end sequencing. Some samples were sequenced on multiple plates. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  225 
 
  
    EGAD00001006913 
   
  
    
    Five commercially available parallel sequencing assays were evaluated for their ability to detect gene fusions in eight cell lines and 18 FFPE tissue samples carrying a variety of known gene fusions. Four RNA-based assays and one DNA-based assay were compared; two were hybrid capture-based, TruSight Tumor 170 Assay (Illumina) and SureSelect XT HS Custom Panel (Agilent), and three were amplicon-based, Archer FusionPlex Lung Panel (ArcherDX), QIAseq RNAscan Custom Panel (Qiagen) and Oncomine Focus Assay (Thermo Fisher Scientific). 
    
   
  
    
      
      Illumina MiSeq 
      
      Ion Torrent S5 
      
      NextSeq 500 
      
    
   
  228 
 
  
    EGAD00001006914 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001006915 
   
  
    
    Samples of nucleated cells found in peripheral blood from over 300 patients suffering from resectable pancreatic ductal adenocarcinoma, non-resectable pancreatic cancer, chronic pancreatitis, or none of these.
Please cite this article when using data:
Al-Fatlawi, A.; Malekian, N.; García, S.; Henschel, A.; Kim, I.; Dahl, A.; Jahnke, B.; Bailey, P.; Bolz, S.N.; Poetsch, A.R.; Mahler, S.; Grützmann, R.; Pilarsky, C.; Schroeder, M. Deep Learning Improves Pancreatic Cancer Diagnosis Using RNA-Based Variants. Cancers 2021, 13, 2654. https://doi.org/10.3390/cancers13112654 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  311 
 
  
    EGAD00001006916 
   
  
    
    This dataset contains: 
1) Raw FASTQ and BAM files for short reads. Here, DNA libraries were prepared using Nextera Rapid Capture Custom Enrichment kit (Illumina) and paired-end sequenced on a HiSeq2500 (Illumina).
2) Raw FASTQ and BAM files for long reads. Here, DNA libraries were prepared using 1D DNA ligation Sequencing Kit (SQK-LSK109, Oxford Nanopore) and single-end sequenced on a MinION device (Oxford Nanopore). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      MinION 
      
    
   
  86 
 
  
    EGAD00001006917 
   
  
    
    Project Neurodevelopmental Disorders 245 Samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  245 
 
  
    EGAD00001006918 
   
  
    
    We provide a diverse keratinocyte transcriptome signature between SFN and FMS patients, which may hint towards distinct pathomechanisms of small fiber sensitization and lay the basis for advanced diagnostics in both entities 
    
   
  
    
      
      NextSeq 500 
      
    
   
  29 
 
  
    EGAD00001006919 
   
  
    
    Whole exome sequencing data of 28 matched normal-tumor-relapse patients from Lübeck and Munich (Germany). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  84 
 
  
    EGAD00001006920 
   
  
    
    This dataset contains exome sequencing from replication repair deficient brain tumor samples. 
    
   
  
    
   
  7 
 
  
    EGAD00001006921 
   
  
    
    Hypermutant tumors which harbor many somatic mutations may obscure the interpretation of targetable genomic events. This dataset contains transcriptome sequencing from 21 replication repair deficient brain tumor samples as well as healthy controls. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  30 
 
  
    EGAD00001006922 
   
  
    
    Longitudinal single-cell RNA-seq data of prospectively collected tumor tissue samples before and after chemotherapy from 11 HGSOC patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001006923 
   
  
    
    These data were used in the following publication:
Andradas, C.; Byrne, J.; Kuchibhotla, M.; Ancliffe, M.; Jones, A.C.; Carline, B.; Hii, H.; Truong, A.; Storer, L.C.D.; Ritzmann, T.A.; et al. Assessment of Cannabidiol and delta9-Tetrahydrocannabiol in Mouse Models of Medulloblastoma and Ependymoma. Cancers 2021, 13, 330. https://doi.org/10.3390/cancers13020330
There are 4 paired-end RNA-seq samples from paediatric brain cancer cell lines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001006924 
   
  
    
    Plasmodium vivax offers unique challenges for control and elimination, and may prove a tougher hurdle to overcome than Plasmodium falciparum. And yet compared to P. falciparum we know very little about the innate and adaptive immune responses that need to be harnessed to reduce disease and transmission. We recently generated a blood bank of a new clonal field isolate of P. vivax (PvW1) for human challenge studies and used systems immunology tools to track the host response throughout infection and convalescence. As part of this study, RNA-sequencing was used to resolve changes in whole blood gene expression through time in 6 volunteers (7-9 time-points per volunteer). In summary, these data show that P. vivax induces two distinct transcriptional programmes in whole blood during and after infection. During infection, transcriptional profiling reveals the rapid mobilisation of an emergency myeloid response, which leads to systemic inflammation and the recruitment of all major T cell subsets into lymphoid tissues. Six days after infection, this innate response subsides and a transcriptional signature of proliferation is revealed. This most likely represents widespread activation of lymphocytes, which return to the circulation after parasite clearance - transcriptional profiling of T cells at this time-point could therefore reveal the outcomes of critical cell-cell interactions that take place within the spleen during infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  54 
 
  
    EGAD00001006925 
   
  
    
    This dataset contains raw .fastq files of a paired-end RNA-seq experiment on 15 PTCL-NOS samples. Samples were prepared with the TruSeq stranded mRNA library kit. 
    
   
  
    
   
  15 
 
  
    EGAD00001006926 
   
  
    
    This dataset contains subtype assignments for 271 tumor samples profiled by RNA-seq. 
    
   
  
    
   
  271 
 
  
    EGAD00001006927 
   
  
    
    This dataset contains log2(TPM + 1) for 271 tumor samples profiled by RNA-seq for the entire transcriptome. 
    
   
  
    
   
  271 
 
  
    EGAD00001006928 
   
  
    
    This dataset contains log2(TPM + 1) for 271 tumor samples profiled by RNA-seq for the subset of genes used for validation of the NMF cluster assignments. 
    
   
  
    
   
  271 
 
  
    EGAD00001006929 
   
  
    
    Sequencing of LCM-derived microbiopsies from explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Targeted sequencing will be conducted on samples to identify drivers of interest and clonality of the samples; well-performing samples will be sent for subsequent whole-genome sequencing. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001006930 
   
  
    
    Sequencing of LCM-derived microbiopsies from explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001006931 
   
  
    
    De- and transdifferentiation of melanoma is a rare histopathological phenomenon that has not been characterised genetically. In this project we plan to sequence the genomes of de- and transdifferentiated cases so as to define their genetic make-up. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001006932 
   
  
    
    Sequencing of LCM-derived microbiopsies from explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001006933 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. 
Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will be subsequently extracted by non-negative matrix factorisation methods and correlated with exposure data. 
Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001006934 
   
  
    
    We study lymphocyte somatic evolution through the sequencing of normal healthy lymphocytes. We perform whole-genome sequencing of single-cell derived T and B cell colonies to identify somatic mutations, and perform targeted deep-sequencing of these mutations. The lineages of T and B cells and the frequencies of these mutations reveal the neutral and non-neutral evolutionary processes underlying lymphocyte growth and function. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001006935 
   
  
    
    We study lymphocyte somatic evolution through the sequencing of normal healthy lymphocytes. We perform whole-genome sequencing of single-cell derived T and B cell colonies to identify somatic mutations, and perform targeted deep-sequencing of these mutations. The lineages of T and B cells and the frequencies of these mutations reveal the neutral and non-neutral evolutionary processes underlying lymphocyte growth and function. 
This dataset contains all the data available for this study on 2021-02-02. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  9 
 
  
    EGAD00001006936 
   
  
    
    Single-cell RNA sequencing was performed for cells from five early-stage LUADs and fourteen multi-region normal lung tissues of defined spatial proximities from the tumors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  35 
 
  
    EGAD00001006937 
   
  
    
    Chromium V(D)J and 5' Gene Expression platform (10X Genomics) was used to study patients with aplastic anemia. CD45+ cells from two patients (patient AA-3: 3 longitudinal samples from bone marrow and patient AA-4: 3 longitudinal samples from peripheral blood) were analysed. The raw data was processed using Cell Ranger 3.0.1 pipelines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  192 
 
  
    EGAD00001006938 
   
  
    
    CD14+ monocytes from 4 African and 4 European individuals with varying degrees of ex vivo susceptibility to influenza were either stimulated with influenza A virus or left resting. 
Cells from all 16 samples were collected at 4 time points (0, 2, 4, 6h post infection), and pooled across 13 libraries. Samples were processed on the 10x Chromium with 3' reagent kits (V3 chemistry) and sequenced on a HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  13 
 
  
    EGAD00001006939 
   
  
    
    Whole genome sequencing for single cells for library A95629A 1023 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001006940 
   
  
    
    Whole genome sequencing for single cells for library A95654B 1740 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001006941 
   
  
    
    Whole genome sequencing for single cells for library A95673A 1446 cells; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  9 
 
  
    EGAD00001006942 
   
  
    
    Whole genome sequencing for single cells for library A95703B 1267 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001006943 
   
  
    
    Whole genome sequencing for single cells for library A95728A 876 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001006944 
   
  
    
    Whole genome sequencing for single cells for library A96192B 1304 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001006945 
   
  
    
    Whole genome sequencing for single cells for library A96217B 1616 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  6 
 
  
    EGAD00001006946 
   
  
    
    Whole genome sequencing for single cells for library A96219B 1743 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001006947 
   
  
    
    Whole genome sequencing for single cells for library A98269B 1609 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001006948 
   
  
    
    In this proof of principle study, we performed whole genome sequencing of two cases with multiple relapses in order to investigate whether groups of mutations separated in time show distinct mutational signatures. In patient 1, who experienced two relapses, the analysis unraveled a continuous interplay of aberrant AID/APOBEC-associated activities. Patient 2 had three relapses. We identified episodic mutational processes at diagnosis and first relapse leading to mutations resembling UV light-driven DNA damage, and thiopurine-associated damage at first relapse. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001006950 
   
  
    
    Paired-end DNA-seq FASTQ files from 16 patients affected by acute intermittent porphyria. Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). Each sample was multiplexed across flowcells and lanes, leading to a total number of 83 pairs of FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001006951 
   
  
    
    Paired-end BAM files from 16 patients affected by acute intermittent porphyria. Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). FASTQ files were processed at the CNAG (Barcelona) using the GEM short-read aligner on the human genome version hs37d5, producing a total of 16 BAM files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001006952 
   
  
    
    VCF file from 16 patients affected by acute intermittent porphyria. Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). BAM files were processed at the CNAG (Barcelona) with their pipeline, including GATK v3.6 for genotyping and other tools such as snpEff for annotating variants, to produce this VCF file with a total of 10,630,259 variants, out of which 8,731,523 are SNVs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001006953 
   
  
    
    Lifelines-DEEP plasma un-targeted metabolomics 
    
   
  
    
   
  - 
 
  
    EGAD00001006954 
   
  
    
    iAMP21 WGS, total of 224 samples 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  224 
 
  
    EGAD00001006955 
   
  
    
    This dataset contains single cell DNA amplicon sequencing of 12 B-ALL patients. For all patients a diagnosis sample was processed, while 4 patients were also followed up during treatment, summing up to a total of 23 samples. Mutations were called in the predefined set of amplicons. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD00001006956 
   
  
    
    30 samples of 15 individuals with neuroblastoma tumor, whole genome sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  30 
 
  
    EGAD00001006957 
   
  
    
    We perform whole exome sequencing on 50 pairs of gastric cancer and matched normal samples. 
    
   
  
    
      
      unspecified 
      
    
   
  100 
 
  
    EGAD00001006959 
   
  
    
    Second round of follow-up of population-based LifeLines-DEEP cohort 
    
   
  
    
   
  676 
 
  
    EGAD00001006960 
   
  
    
    RNAseq fastq files from 611 bulk pre-treatment tumors from two indications: metastatic urothelial bladder cancer patients (IMvigor210) and metastatic renal cell carcinoma (IMmotion150) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  611 
 
  
    EGAD00001006961 
   
  
    
    The sequencing data of the CTSC gene after whole genome sequencing of blood samples from two individuals with Papillon-Lefèvre syndrome. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001006962 
   
  
    
    The transcriptome of peripheral blood mononuclear cells (PBMCs) from controls or patients with an activating mutation in the STAT1 gene was analyzed. This analysis aimed to identify the major changes in the circulating immune cells of patients with STAT1 mutation and compare this with the result of the perturbation prediction tool huva (human variation, R). 
    
   
  
    
      
      AB 5500xl Genetic Analyzer 
      
    
   
  6 
 
  
    EGAD00001006963 
   
  
    
    Whole-genome sequencing of 135 tumor samples and 98 normal samples of gastric cancer with peritoneal metastasis 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  232 
 
  
    EGAD00001006964 
   
  
    
    This dataset contains the RNA and ChIP sequencing data from the study "Kalirin-RAC controls nucleokinetic migration in ADRN-type neuroblastoma". The data is organized in 7 experiments, which are divided by sequencing technology and by the application of siRNA or drug interventions (or lack thereof) on neuroblastoma cell lines.
The experiment names and the file names have been chosen in each respective experiment to guide future users of the data to replicate the analyses in the manuscript. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  54 
 
  
    EGAD00001006965 
   
  
    
    The Genomic Diversity in Africa Project (GDAP) started with the plan to develop a genomic resource from African populations, characterise genomic diversity and population history, and facilitate clinical studies in Africa. Currently, 25 individuals from 24 ethnolinguistic groups have been whole-genome sequenced at high depth totalling 585 individuals. An additional 41 individuals have been sequenced with 10X Genomics libraries. At this stage, the initial curation of this dataset has been finished and we are performing the analysis in coordination with our collaborators. The current state of the GDAP represents a very diverse panel of African populations that maximizes geographical and ethnic variation and represents a great starting point to achieve the aforementioned goals. However, southern sub-Saharan countries, Bantu speakers and hunter gatherer groups are currently underrepresented, despite being crucial to understand the evolutionary history of the continent. After extensive effort to collate study documentation, we finally have the opportunity to sequence 600 new individuals from these groups, including countries such as Gabon, Rwanda and Zambia, and address these deficiencies. We aim to proceed with the same strategy: to sequence at high depth 25 individuals with standard PCR free libraries, with 2 additional individuals with 10X Genomics Chromium libraries per ethnolinguistic group. The former allows a good representation of variants down to low frequency in any given population, and the latter allows accurate phasing and the analysis of structural variation. By including these new populations, we want to investigate three crucial questions in African history in addition to the initial objectives: the Bantu expansion, the evolutionary history of hunter gatherers and the transatlantic slave trade. Additionally, the expanded dataset will help us better discover the genetic variation present in Africa and characterize the African pangenome.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2021-02-12. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  184 
 
  
    EGAD00001006966 
   
  
    
    The dataset contains raw miRNA sequencing data of plasma samples from 20 newly diagnosed colorectal cancer cases and 20 controls free of colorectal neoplasms matched by age and sex. It includes files in the FASTQ compressed (.gz) format. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  40 
 
  
    EGAD00001006967 
   
  
    
    For this project about non-muscle invasive bladder cancer (NMIBC), we analysed total RNA-seq data from 47 patients used for validation. Sequencing of total RNA was performed using KAPA RNA HyperPrep Kit with RiboErase HMR (Roche). RNA input was 100 to 500 ng. The dataset is composed of 94 fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  47 
 
  
    EGAD00001006968 
   
  
    
    bam files, mapped to hg19 after dedup, recal, recalibration and clipping of overlapping reads 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  210 
 
  
    EGAD00001006969 
   
  
    
    NOTCH1 mutant clones occupy the majority of normal human esophagus by middle age, but are comparatively rare in esophageal cancers, suggesting NOTCH1 mutations may promote clonal expansion but impede carcinogenesis. Here we test this hypothesis. Visualizing and sequencing NOTCH1 mutant clones in aging normal human esophagus reveals frequent biallelic mutations that block NOTCH1 signaling. In mouse esophagus, heterozygous Notch1 mutation confers a competitive advantage over wild type cells, an effect enhanced by loss of the second allele. Notch1 loss alters transcription but has minimal effects on epithelial structure and cell dynamics. In a carcinogenesis model, Notch1 mutations were less prevalent in tumors than normal epithelium. Deletion of Notch1 reduced tumor growth, an effect recapitulated by anti-NOTCH1 antibody treatment. We conclude that Notch1 mutations in normal epithelium are beneficial as wild type Notch1 promotes tumor expansion. NOTCH1 blockade has therapeutic potential in esophageal squamous tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001006970 
   
  
    
    Globally, human populations show structured genetic diversity as a result of geographical dispersion, selection and drift. Understanding this genetic variation can provide insights into the evolutionary processes that shape both human adaptation and variation in disease. Populations from SSA have the highest levels of genetic diversity. This characteristic, in addition to historical genetic admixture, can lead to complexities in the design of studies assessing the genetic determinants of disease and human variation. However, such studies of African populations are also likely to provide new opportunities to discover novel disease susceptibility loci and variants and refine gene-disease association signals. A systematic assessment of genetic diversity within SSA would facilitate genomic epidemiological studies in the region.
The Genome Diversity in Africa Project (GDAP) aims to produce a comprehensive catalogue of human genetic variation in SSA, including single nucleotide polymorphisms (SNPs), structural variants, and haplotypes. This resource will make a substantial contribution to understanding patterns of genetic diversity within and among populations in SSA, as well as providing a global resource to help design, implement and interpret genomic studies in SSA populations and studies comprising globally diverse populations, complementing existing genomic resources. Specifically, we plan to carry out high depth whole genome sequencing of up to 2000 individuals across Africa (25 individuals from each ethnolinguistic group).
Our scientific objectives are to: 1) develop a resource that provides a comprehensive catalogue of genetic variation in populations from SSA accessible to the global scientific community; 2) characterise population genetic diversity, structure, gene flow and admixture across SSA; 3) develop a cost-efficient, next-generation genotype array for diverse populations across SSA; and 4) facilitate whole genome-sequencing association studies of complex traits and diseases by developing a reference panel for imputation and resource for enhancing fine-mapping disease susceptibility loci. These scientific objectives will be supported by cross-cutting operational activities, including network and management of the consortium, research ethics, and research capacity building in statistical genetics and bioinformatics.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
This dataset contains all the data available for this study on 2021-02-16. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  53 
 
  
    EGAD00001006973 
   
  
    
    Clinical data corresponding to the patients with ovarian cancer studied using scRNA-seq and bulk RNA-seq. Variables include molecular subtype, predicted immune phenotype, reviewing pathologist comment, final immune phenotype, histology characterization, and tumor stage. 
    
   
  
    
   
  59 
 
  
    EGAD00001006974 
   
  
    
    Matrices of counts from single-cell RNA-seq data for 15 samples from patients with ovarian cancer, 5 samples for each of the 3 tumor immune phenotypes (Infiltrated, Excluded and Desert). Dissociated cells from each tumor sample have been sorted to isolate live cells from the 3 compartments: tumor, immune and stromal. Each of the compartments has been analyzed separately by scRNAseq, excluding some desert tumors for which cells from the stromal and immune compartments have been pooled. Sequencing was performed using 10X Genomics Chromium Single Cell platform (v2 Chemistry). 
    
   
  
    
   
  44 
 
  
    EGAD00001006975 
   
  
    
    RNA-seq FASTQ files from 15 samples from patients with ovarian cancer, 5 samples for each of the 3 tumor immune phenotypes (Infiltrated, Excluded and Desert). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD00001006976 
   
  
    
    Globally, human populations show structured genetic diversity as a result of geographical dispersion, selection and drift. Understanding this genetic variation can provide insights into the evolutionary processes that shape both human adaptation and variation in disease. Populations from SSA have the highest levels of genetic diversity. This characteristic, in addition to historical genetic admixture, can lead to complexities in the design of studies assessing the genetic determinants of disease and human variation. However, such studies of African populations are also likely to provide new opportunities to discover novel disease susceptibility loci and variants and refine gene-disease association signals. A systematic assessment of genetic diversity within SSA would facilitate genomic epidemiological studies in the region.
The Genome Diversity in Africa Project (GDAP) aims to produce a comprehensive catalogue of human genetic variation in SSA, including single nucleotide polymorphisms (SNPs), structural variants, and haplotypes. This resource will make a substantial contribution to understanding patterns of genetic diversity within and among populations in SSA, as well as providing a global resource to help design, implement and interpret genomic studies in SSA populations and studies comprising globally diverse populations, complementing existing genomic resources. Specifically, we plan to carry out high depth whole genome sequencing of up to 2000 individuals across Africa (25 individuals from each ethnolinguistic group).
Our scientific objectives are to: 1) develop a resource that provides a comprehensive catalogue of genetic variation in populations from SSA accessible to the global scientific community; 2) characterise population genetic diversity, structure, gene flow and admixture across SSA; 3) develop a cost-efficient, next-generation genotype array for diverse populations across SSA; and 4) facilitate whole-genome sequencing association studies of complex traits and diseases by developing a reference panel for imputation and a resource for enhancing fine-mapping of disease susceptibility loci. These scientific objectives will be supported by cross-cutting operational activities, including network and management of the consortium, research ethics, and research capacity building in statistical genetics and bioinformatics.
This dataset contains all the data available for this study on 2021-02-17. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina MiSeq 
      
    
   
  27 
 
  
    EGAD00001006977 
   
  
    
    In order to characterize the T cell receptor (TCR) repertoire of gluten-specific T cells, we performed high-throughput DNA sequencing of rearranged TCR-α and TCR-β genes of single HLA-DQ2.5:DQ2.5-gluten tetramer-binding CD4+ T cells isolated from blood, biopsies and T cell line from celiac disease patients. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  44 
 
  
    EGAD00001006978 
   
  
    
    Monocyte-derived macrophages from 4 HPS1 patients and 4 controls were RNA-sequenced at baseline and after Salmonella Typhimurium infection. We used paired-end sequencing on an Illumina HiSeq 4000. Each sample was run on 3 lanes for sequencing depth, and the lanes were combined for our analysis. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  48 
 
  
    EGAD00001006979 
   
  
    
    PacBio long-read circular consensus (CCS) sequencing data for individual HV31 generated on a PacBio Sequel II instrument, using size-selected (10-15 kb) DNA from CD14+ monocytes, to a sequencing depth of ~12×. Sequencing was performed at the Wellcome Sanger Institute. 
    
   
  
    
      
      Sequel 
      
    
   
  1 
 
  
    EGAD00001006980 
   
  
    
    Temporal HER2-negative breast cancers WES 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  94 
 
  
    EGAD00001006981 
   
  
    
    Single-cell RNA-Sequencing of five TNBC primary breast cancers from Wu et al. (2020) EMBO J study. Data was generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq 500 platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001006982 
   
  
    
    The dataset is composed of three sequenced tumor samples: (I) Meta-bone-557 (bone metastasis obtained from occipital lesion resection during treatment with Liposomal doxorubicin); (II) Meta-CNS-888 (brain metastasis obtained from surgical resection during treatment with Nivolumab); (III) Primary-liver-463 (primary hepatic tumor obtained from surgical resection during treatment with Nivolumab). Genomic DNA from tumor samples was extracted using GeneRead DNA FFPE kit (Qiagen), containing Uracil-DNA Glycosylase, according to the manufacturer’s instructions. Whole-exome libraries were prepared using SureSelect XT Clinical Research Exome Target Enrichment kit (Agilent Technologies # 5190-7338). Sequences (150bp paired-end) were generated on a NextSeq 500 sequencing platform (Illumina). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001006983 
   
  
    
    We combined samples from 1,469 inflammatory bowel disease (IBD) patients, consisting of 896 Crohn’s disease (CD) and 573 ulcerative colitis (UC) cases, and 4,041 controls used in our previously published GWAS with 1,726 additional IBD patients (725 CD and 1,001 UC) and 378 additional controls genotyped using the Asian Screening Array (ASA). We uploaded summary statistics of three meta-analyses in text files. 
    
   
  
    
   
  7614 
 
  
    EGAD00001006984 
   
  
    
    This study includes treatment-naïve fresh tissue samples from 4 HGSOC patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001006985 
   
  
    
    RNA-seq data from Korean CRC samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  160 
 
  
    EGAD00001006986 
   
  
    
    We created three technical replicates of cell-free DNA from AML patient plasma to assess batch effects and the utility of spike-in controls for the cfMeDIP-seq method. The sets of samples were given to three different technicians with slightly different protocols. Details can be found in Wilson et al. "Sensitive and reproducible cell-free methylome quantification with synthetic spike-in controls". 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  15 
 
  
    EGAD00001006987 
   
  
    
    To analyse genome-wide DNA copy number changes in combination with mutation status of CRC-related genes, CIMP and MSI, in order to explore the biology of PCCRCs.
Formalin-fixed, paraffin-embedded samples from 122 PCCRCs and 98 prevalent CRCs collected in 3 different hospitals in the region of South Limburg, the Netherlands, were used in this study. DNA was extracted for molecular analysis.
Labels have been updated. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  203 
 
  
    EGAD00001006988 
   
  
    
    Whole exome sequencing of samples carrying an MBD4 mutation (n=9) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001006989 
   
  
    
    Targeted sequencing of MBD4 in either tumor or germline DNA from Uveal Melanoma, assembled by pool (germline pool or tumor pool). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  186 
 
  
    EGAD00001006990 
   
  
    
    JAGuaR outputs from RNA-seq of 35 pancreatic neuroendocrine neoplasms. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001006991 
   
  
    
    STAR outputs from RNA-seq of 84 pancreatic neuroendocrine neoplasms, 10 normal islet samples and 4 cell line samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  98 
 
  
    EGAD00001006992 
   
  
    
    BWA outputs from whole-exome sequencing of 35 pancreatic neuroendocrine neoplasms. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001006994 
   
  
    
    We performed sequential scRNA-seq of 21 specimens (discovery cohort) collected at baseline, during treatment, and/or at disease remission/progression from 3 ibrutinib-responsive (R) patients (Pt-V, C and D) and 2 non-responsive (NR) patients (Pt-B and E). In addition, the PBMC samples from two healthy donors (N1 and N2) were included as the normal controls. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00001006995 
   
  
    
    The data contains single-cell gene sequencing data (10x Genomics) from FACS-purified CD8 T lymphocytes from two Austrian patients. The cells were stimulated with one MHC class I peptides obtained from a common (wild type) variant and an emerging mutant variant of the SARS-CoV-2 virus. Then the samples were multiplexed using hashtag oligos. We provide the raw and aligned sequence data for: 
i. The single-cell experiments
ii. The PCR-amplified samples for enrichment of the hashtag oligo multiplexing barcodes
iii. The PCR-amplified samples for enrichment of the T Cell Receptor (TCR) VDJ region for immuno-profiling.
The samples and libraries were processed and obtained in collaboration between St. Anna Children's Cancer Research Institute (CCRI), CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, and the Medical University of Vienna. The cell barcodes and processed data have been submitted to the GEO database with GEO accession GSE166651. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001006996 
   
  
    
    Whole exome sequencing of 52 chronic phase/blast crisis pairs obtained from chronic myeloid leukemia 
    
   
  
    
   
  104 
 
  
    EGAD00001006997 
   
  
    
    Single-cell RNA sequencing of 13 ‘mild-moderate’ and 10 ‘critical’ COVID-19 PBMC samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001006999 
   
  
    
    Malignant peripheral nerve sheath tumor (MPNST)-like melanoma is a rare malignancy with overlapping characteristics of both neural sarcoma and melanoma. The genomics of MPNST-like melanoma have not been previously described. In this study, we performed whole exome sequencing analysis in 8 samples from 6 patients diagnosed with MPNST-like melanoma. Our results demonstrate that, although MPNST-like melanoma shares oncogenic alterations common to both cutaneous melanoma and MPNST, it also presents unique genomic alterations not previously described in either of the malignancies. 
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD00001007000 
   
  
    
    Malignant peripheral nerve sheath tumor (MPNST)-like melanoma is a rare malignancy with overlapping characteristics of both neural sarcoma and melanoma. The genomics of MPNST-like melanoma have not been previously described. In this study, we performed whole transcriptome sequencing analysis in 8 samples from 6 patients diagnosed with MPNST-like melanoma. In correlation with deletion of SERPINB4 in all our samples, there was no SERPINB4 mRNA expression in our cohort, suggesting a potential tumor-suppressor role of SERPINB4 in MPNST-like melanomas. HRAS, a gene uncommonly mutated in cutaneous melanomas, was mutated in 2 patients, but with no increased mRNA expression. BRAF mRNA expression, resultant from an atypical BRAF mutation, was increased in association with an inactivating NF1 mutation. Our data demonstrate the role of alternative mechanisms of RAS pathway activation in MPNST-like melanomas and suggest the potential role of other molecular pathways in its carcinogenesis. 
    
   
  
    
      
      unspecified 
      
    
   
  7 
 
  
    EGAD00001007001 
   
  
    
    Anal SCC cell line and parent tumour comparative whole exome sequencing 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001007002 
   
  
    
    The aligned bam file of next generation sequencing performed on PSCCE. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001007003 
   
  
    
    We collected peripheral blood mononuclear cells (PBMC) from 6 RA patients and 4 healthy controls, as well as synovial fluid (SF) from the same RA patients. We then sorted B cells, CD4+ and CD8+ T cells, regulatory T cells and monocytes using flow cytometry and profiled regions marked with H3K27ac using CUT&Tag. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  59 
 
  
    EGAD00001007004 
   
  
    
    To identify genomic drivers present in limited-stage small cell lung cancer (LS-SCLC); To determine the overall tumor mutational burden in LS-SCLC; To determine genomic intratumor heterogeneity (ITH) in LS-SCLC. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  69 
 
  
    EGAD00001007005 
   
  
    
    Paired-end RNA-sequencing of tumour tissue samples (n=85) from primary urothelial bladder cancer patients. Sequencing was performed using either HiSeq (n=27) or NextSeq (n=58) Illumina platforms. Of the 85 samples, 78 are Non-muscle invasive (NMIBC) and 7 are Muscle invasive (MIBC). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  85 
 
  
    EGAD00001007006 
   
  
    
    Tumor exomes for 15 DLBCL samples with PMBL GE signature, with 6 matching normal exomes and 1 pooled normal exome. 
    
   
  
    
   
  22 
 
  
    EGAD00001007010 
   
  
    
    This dataset contains all sequencing data of the publication "Oncogenic cooperation between the TCF7-SPI1 fusion and NRAS(G12D) requires β-catenin activity to drive T-cell acute lymphoblastic leukemia." It comprises bulk RNA sequencing of 4 T-ALL patients (X09, XB37, XB41 and XB47), of which X09 has a TCF7-SPI1 fusion; single-cell RNA sequencing of these 4 patients together with a PDX model of the X09 patient and two patients from another cohort (SJTALL030263 and SJTALL031201), which also have a TCF7-SPI1 fusion; and nanopore sequencing of all patients with the TCF7-SPI1 fusion. Moreover, the patient samples with the fusion were treated with PKF 118-310, and bulk RNA sequencing was performed in triplicate to determine the differentially expressed genes. 
    
   
  
    
      
      GridION 
      
      Illumina HiSeq 4000 
      
      unspecified 
      
    
   
  38 
 
  
    EGAD00001007011 
   
  
    
    Shallow whole genome sequencing of 77 inflammatory myofibroblastic tumor samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  77 
 
  
    EGAD00001007012 
   
  
    
    Whole exome sequencing of 66 inflammatory myofibroblastic tumor samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  66 
 
  
    EGAD00001007013 
   
  
    
    Single-cell B-cell receptor sequencing (scBCR-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001007014 
   
  
    
    Single-cell T-cell receptor sequencing (scTCR-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001007015 
   
  
    
    Single-cell RNA sequencing (scRNA-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001007016 
   
  
    
    Whole exome sequencing (WES) data of peripheral blood mononuclear cells (PBMCs) obtained from 2 ATL patients (3 samples). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001007017 
   
  
    
    Single-cell antibody-derived tag sequencing (scADT-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001007018 
   
  
    
    Bulk RNA sequencing (RNA-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 7 ATL patients (9 samples) and 9 HTLV-1-infected asymptomatic carriers. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  18 
 
  
    EGAD00001007019 
   
  
    
    This dataset includes fastq files from sWGS and exome sequencing data derived from dsDNA and ssDNA libraries of plasma cfDNA samples extracted by a column- or bead-based DNA extraction method 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  198 
 
  
    EGAD00001007020 
   
  
    
    This submission contains the sequencing data used in the CRISPR iPSC methods paper. Specifically, it comprises 3 FASTQ files that each represent a replicate of an experiment to transduce the Toronto KnockOut CRISPR Library - Version 3 (TKOv3) into induced pluripotent stem cell (iPSC) derived macrophages. The sequencing is of the guide RNAs from the TKOv3 library, extracted from the transduced iPSC-derived macrophages. 
    
   
  
    
   
  3 
 
  
    EGAD00001007022 
   
  
    
    In the context of research, this dataset contains 423 IRD samples; 411 of them analyzed with Clinical Exome Sequencing solutions, and 12 with Whole Exome Sequencing. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  423 
 
  
    EGAD00001007023 
   
  
    
    The dataset includes CRAM files from WGS of 115 tumor samples as well as 43 matched normal tissue or blood samples. The sequencing was done with a HiSeq X Five instrument. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  158 
 
  
    EGAD00001007024 
   
  
    
    The dataset includes FASTQ files from 109 tumor samples as well as RNA-seq gene expression R data, RNA-seq transcript expression R data, RNA-seq gene counts matrix, RNA-seq transcript counts matrix, RNA-seq gene FPKM matrix and RNA-seq transcript FPKM matrix for the 109 samples. The sequencing was done with a HiSeq 2000 instrument. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  110 
 
  
    EGAD00001007025 
   
  
    
    The BEACCON study aimed to address the lack of power of previous studies to identify novel BC predisposition genes by performing extensive sequencing in 12,000 women (11,511 analysed following exclusions) and further enhancing power by using an ‘extreme phenotype’ design with enrichment of familial non-BRCA1 and BRCA2 cases, compared with a control population of older women with ongoing confirmation of cancer-free status at June 2019. Three-quarters of the 1303 candidate genes screened were selected based on empiric evidence from local (69 multi-case BC families) or international whole exome sequencing studies, and the remainder were included to provide detailed coverage of functional pathways with established associations with BC. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  11505 
 
  
    EGAD00001007026 
   
  
    
    Whole Exome Sequencing (WXS) of 106 CMML samples. Paired-end FASTQ files are provided. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  106 
 
  
    EGAD00001007027 
   
  
    
    The Dutch Microbiome Project (DMP) data includes shotgun metagenomic sequencing of faecal samples from 8,208 Dutch individuals. Paired-end sequencing was performed using the Illumina HiSeq 2000 platform. Data is archived in two batches to facilitate easier data access and upload to EGA. Batch 1 of DMP includes 4396 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  4396 
 
  
    EGAD00001007028 
   
  
    
    NanoSeq data from sperm from 2 individuals, including technical replicates from one individual (10 total sequences). 8 additional samples and 2 matched normals to call mutations in NanoSeq data (dataset EGAD00001006459). 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001007029 
   
  
    
    Human Induced Pluripotent Stem Cells (hiPSC) are an established patient-specific model system where opportunities are emerging for cell-based therapies. We compared and contrasted hiPSCs derived from different tissues, skin and blood, in the same individual. We show extensive single-nucleotide mutagenesis in all hiPSC lines, although fibroblast-derived hiPSCs (F-hiPSCs) are particularly heavily mutagenized by ultraviolet (UV)-related damage. We utilized genome sequencing data on 454 F-hiPSCs and 44 blood-derived hiPSCs (B-hiPSCs) to gain further insights. Across 324 whole genome sequenced (WGS) F-hiPSCs derived by the Human Induced Pluripotent Stem Cell Initiative (HipSci), UV-related damage is present in ~72% of cell lines, sometimes causing substantial mutagenesis (range 0.25-15 per Mb). Furthermore, we find remarkable genomic heterogeneity between independent F-hiPSC clones derived from the same reprogramming process in the same donor, due to oligoclonal populations within fibroblasts. Combining WGS and exome-sequencing data of 452 HipSci F-hiPSCs, we identify 272 predicted pathogenic mutations in cancer-related genes, of which 21 genes were hit recurrently three or more times, involving 77 (17%) lines. Notably, 151 of 272 mutations were present in starting fibroblast populations suggesting that more than half of putative driver events in F-hiPSCs were acquired in vivo. In contrast, B-hiPSCs reprogrammed from erythroblasts show lower levels of genome-wide mutations (range 0.28-1.4 per Mb), no UV damage, but a strikingly high prevalence of acquired BCOR mutations in ~57% of lines, indicative of strong selection pressure. All hiPSCs had otherwise stable, diploid genomes on karyotypic pre-screening, highlighting how copy-number-based approaches do not have the required resolution to detect widespread nucleotide mutagenesis. 
This work strongly suggests that models for cell-based therapies require detailed nucleotide-resolution characterization prior to clinical application. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  86 
 
  
    EGAD00001007030 
   
  
    
    Single-cell RNA-Sequencing of three primary breast cancers, two primary prostate cancers, and a metastatic melanoma sample from Wu et al. (2021) Genome Medicine study. Each tumour was sequenced across different cryopreservation conditions including Fresh Tissue (FT), cryopreserved single-cell suspensions (CCS), cryopreserved solid tissue fragments (CT) and cryopreserved after overnight cold storage (CO). Data was generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq platform. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  18 
 
  
    EGAD00001007031 
   
  
    
    Targeted capture sequencing data of peripheral blood mononuclear cells (PBMCs) obtained from 4 ATL patients (6 samples) and 10 HTLV-1-infected asymptomatic carriers. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  15 
 
  
    EGAD00001007032 
   
  
    
    Human single cells were clonally expanded by culture and whole-genome sequenced. This dataset includes 334 clonal samples and 7 blood bulks from seven individuals (DB2, DB3, DB5, DB6, DB8, DB9, DB10). We extracted genomic DNA materials from clonally expanded cells and matched peripheral blood using DNeasy Blood and Tissue kits (Qiagen) according to the protocol. DNA libraries for WGS were generated by an Accel-NGS 2S Plus DNA Library Kit (Swift Biosciences) from 1 µg of genomic DNA materials. WGS was performed on either the Illumina HiSeq X platform or the NovaSeq 6000 platform to generate mean coverage of 25.2X for 374 clonally expanded cells and 94.8X for 7 matched blood tissues. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  240 
 
  
    EGAD00001007033 
   
  
    
    De- and trans-differentiation is a rare and only poorly understood phenomenon in cutaneous melanoma. To study this disease more comprehensively we have retrieved 11 primary cutaneous melanomas from our pathology archives showing biphasic features characterized by a conventional melanoma and additional areas of de-/trans-differentiation as defined by a lack of immunohistochemical expression of all conventional melanocytic markers (S-100 protein, SOX10, Melan-A and HMB-45). The clinical, histologic and immunohistochemical findings were recorded and follow-up was obtained. The patients were mostly elderly (median: 81 years; range: 42-86 years) without significant gender predilection, and the sun-exposed skin of the head and neck area was most commonly affected. The tumors were deeply invasive with a mean tumor thickness of 7 mm (range: 4-80 mm). The dedifferentiated component showed atypical fibroxanthoma-like features in the majority (7), while additional rhabdomyosarcomatous and epithelial transdifferentiation was noted histologically and/or immunohistochemically in two tumors each. The background conventional melanoma component was of desmoplastic (4), superficial spreading (3), nodular (2), lentigo maligna (1) or spindle cell (1) types. For the 7 patients with available follow-up data (median follow-up period of 25 months; range: 8-36 months), 2 died from their disease and 3 developed metastases. Next-generation sequencing of the cohort revealed somatic mutation of established melanoma drivers including mainly NF1 mutations in the conventional component (5 cases), which were also detected in the corresponding de-/trans-differentiated components. In summary, the diagnosis of de-/trans-differentiated melanoma is challenging and depends on the morphologic identification of the conventional melanoma component. Molecular analysis is diagnostically helpful as the mutated gene profile is shared between the conventional and de-/trans-differentiated components. 
Importantly, de-/trans-differentiation does not appear to confer a more aggressive behavior. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  21 
 
  
    EGAD00001007034 
   
  
    
    De- and trans-differentiation is a rare and only poorly understood phenomenon in cutaneous melanoma. To study this disease more comprehensively we have retrieved 11 primary cutaneous melanomas from our pathology archives showing biphasic features characterized by a conventional melanoma and additional areas of de-/trans-differentiation as defined by a lack of immunohistochemical expression of all conventional melanocytic markers (S-100 protein, SOX10, Melan-A and HMB-45). The clinical, histologic and immunohistochemical findings were recorded and follow-up was obtained. The patients were mostly elderly (median: 81 years; range: 42-86 years) without significant gender predilection, and the sun-exposed skin of the head and neck area was most commonly affected. The tumors were deeply invasive with a mean tumor thickness of 7 mm (range: 4-80 mm). The dedifferentiated component showed atypical fibroxanthoma-like features in the majority (7), while additional rhabdomyosarcomatous and epithelial transdifferentiation was noted histologically and/or immunohistochemically in two tumors each. The background conventional melanoma component was of desmoplastic (4), superficial spreading (3), nodular (2), lentigo maligna (1) or spindle cell (1) types. For the 7 patients with available follow-up data (median follow-up period of 25 months; range: 8-36 months), 2 died from their disease and 3 developed metastases. Next-generation sequencing of the cohort revealed somatic mutation of established melanoma drivers including mainly NF1 mutations in the conventional component (5 cases), which were also detected in the corresponding de-/trans-differentiated components. In summary, the diagnosis of de-/trans-differentiated melanoma is challenging and depends on the morphologic identification of the conventional melanoma component. Molecular analysis is diagnostically helpful as the mutated gene profile is shared between the conventional and de-/trans-differentiated components. 
Importantly, de-/trans-differentiation does not appear to confer a more aggressive behavior. 
    
   
  
    
   
  18 
 
  
    EGAD00001007035 
   
  
    
    The dataset contains data for n=7211 FINRISK 2002 participants who underwent fecal sampling. Demultiplexed shallow shotgun metagenomic sequences were quality filtered and adapter trimmed using Atropos (Didion et al., 2017), and human filtered using Bowtie2 (Langmead and Salzberg, 2012). The files are in FASTQ format. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  7231 
 
  
    EGAD00001007037 
   
  
    
    Germ cell tumours (GCTs) are a collection of benign and malignant neoplasms derived from primordial germ cells (PGCs). They are uniquely able to generate embryonic and extraembryonic tissues, which in malignant GCTs carries prognostic and therapeutic significance. The developmental pathways underpinning GCT initiation and histogenesis are incompletely understood. Here, we studied the phylogenetic and transcriptional diversity of 15 malignant gonadal GCTs and four normal testis biopsies by sequencing 131 whole genomes and 416 transcriptomes from 14 gonadal histologies, excised by laser capture microdissection. Our findings demonstrate that tumours were initiated by whole genome duplication likely in embryogenesis, within ~5-8 cell divisions post-PGC specification, followed by chromosome 12p gains associated with invasive disease. Of note, 12p imbalances were not only generated through GCT-typical isochromosomes, but also through non-isochromosomic configurations. Whilst tumours developed along homogenous phylogenetic pathways, they spawned manifold tissues independent of genetic subclonal diversification. A key feature of GCT tissues was the expression of fetal-specific genes. The transcriptional diversity notwithstanding, we found universal transcriptional elements correlated with hallmark 12p gains. Overall, our study reveals stereotyped phylogenies and transcriptomes underpinning the development of GCT that originate in fetal life and may lend themselves to therapeutic manipulation. 
    
   
  
    
   
  416 
 
  
    EGAD00001007038 
   
  
    
    Germ cell tumours (GCTs) are a collection of benign and malignant neoplasms derived from primordial germ cells (PGCs). They are uniquely able to generate embryonic and extraembryonic tissues, which in malignant GCTs carries prognostic and therapeutic significance. The developmental pathways underpinning GCT initiation and histogenesis are incompletely understood. Here, we studied the phylogenetic and transcriptional diversity of 15 malignant gonadal GCTs and four normal testis biopsies by sequencing 131 whole genomes and 416 transcriptomes from 14 gonadal histologies, excised by laser capture microdissection. Our findings demonstrate that tumours were initiated by whole genome duplication likely in embryogenesis, within ~5-8 cell divisions post-PGC specification, followed by chromosome 12p gains associated with invasive disease. Of note, 12p imbalances were not only generated through GCT-typical isochromosomes, but also through non-isochromosomic configurations. Whilst tumours developed along homogenous phylogenetic pathways, they spawned manifold tissues independent of genetic subclonal diversification. A key feature of GCT tissues was the expression of fetal-specific genes. The transcriptional diversity notwithstanding, we found universal transcriptional elements correlated with hallmark 12p gains. Overall, our study reveals stereotyped phylogenies and transcriptomes underpinning the development of GCT that originate in fetal life and may lend themselves to therapeutic manipulation. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001007039 
   
  
    
    This dataset includes bam files of WES of clonally related neuroblastoma and teratoma as well as peripheral blood samples as a control. Neuroblastoma and teratoma samples were formalin-fixed paraffin embedded. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001007040 
   
  
    
    The dataset contains high-throughput sequencing data derived from a cancer autopsy series of 10 patients. As part of this study, whole-exome sequencing and RNA-seq were performed for spatially distinct tissue biopsies from the patients. In addition, plasma samples from the patients were sequenced using a custom panel to profile ctDNA. There are 106 files containing whole-exome sequencing data, 107 files containing RNA-seq data, and 9 files containing plasma sequencing data. 
    
   
  
    
   
  222 
 
  
    EGAD00001007041 
   
  
    
    We developed Genetic-Epigenetic Tissue Mapping (GETMap) to determine the tissue composition of plasma DNA carrying genetic variants not present in the constitutional genome through comparing their methylation profiles with relevant tissues. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  152 
 
  
    EGAD00001007042 
   
  
    
    Illumina PCR-free sequencing data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a sequencing depth of ~44×. Sequencing was performed at the Wellcome Centre for Human Genetics on the Illumina NovaSeq platform. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007043 
   
  
    
    Oxford Nanopore long-read sequencing data for individual HV31 generated using DNA from CD14+ monocytes, to a sequencing depth of ~63×. Sequencing was performed at the Wellcome Centre for Human Genetics using the Oxford Nanopore PromethION platform. 
    
   
  
    
      
      PromethION 
      
    
   
  1 
 
  
    EGAD00001007044 
   
  
    
    MGI standard short-read sequencing data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a sequencing depth of ~57×. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007045 
   
  
    
    MGI single-tube long fragment read (stLFR) linked-read sequencing data for individual HV31 generated using DNA from CD14+ monocytes, to a sequencing depth of ~51×. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007046 
   
  
    
    10x linked-read sequencing data for individual HV31 generated using DNA from CD14+ monocytes, to a sequencing depth of ~40×.  Sequencing was performed at Bart’s and the London Genome Centre on the Illumina HiSeq platform. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007047 
   
  
    
    PacBio continuous long read (CLR) sequencing data for individual HV31 generated on a PacBio Sequel II instrument, using DNA from CD14+ monocytes, to a sequencing depth of ~35×. Sequencing was performed at the Wellcome Sanger Institute. 
    
   
  
    
      
      Sequel 
      
    
   
  1 
 
  
    EGAD00001007048 
   
  
    
    MGI CoolMPS short-read sequencing data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a sequencing depth of ~57×. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007049 
   
  
    
    Bionano DLS optical mapping data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a molecule depth of ~153×.  Optical mapping was performed at the Weatherall Institute of Molecular Medicine using the Bionano Saphyr platform. 
    
   
  
    
   
  1 
 
  
    EGAD00001007050 
   
  
    
    De novo assembly of eight immune system regions for individual HV31, generated using a multi-platform pipeline. A full description of the generation of these assemblies can be found at https://doi.org/10.1101/2021.02.03.429586. 
    
   
  
    
   
  1 
 
  
    EGAD00001007051 
   
  
    
    Exome libraries were prepared using 100 ng of DNA from tumor tissue or matched normal DNA. Exome capture was performed using Agilent SureSelect Human Exome Library Preparation V5 or V6+COSMIC kits. 
    
   
  
    
      
      unspecified 
      
    
   
  184 
 
  
    EGAD00001007052 
   
  
    
    This contains H3K27ac ChIP-seq, RNA-seq and HiC fastq files. 
    
   
  
    
      
      Illumina Genome Analyzer 
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001007055 
   
  
    
    RNA-seq of 55 melanoma tumors that were used as a validation dataset in Garg et al., Nat Commun, 2021 Feb 18;12(1):1137. doi: 10.1038/s41467-021-21207-2. 
    
   
  
    
   
  - 
 
  
    EGAD00001007056 
   
  
    
    Low-coverage whole genome sequencing of 29 early breast cancer samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  29 
 
  
    EGAD00001007057 
   
  
    
    Whole-exome sequencing of 30 early breast cancer samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001007058 
   
  
    
    CITE-seq of early breast cancer samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001007060 
   
  
    
    The data are the aggregate results from an IGPP Consortium genome-wide survival study, showing overall risk for Parkinson disease progression associated with each variant in a longitudinal cohort study. 11.2 million deeply imputed variants in 3,821 PD patients who were prospectively tracked with 36,123 visits over a median of 6.7 years from disease onset (inter-quartile range, 4.2 years) were analyzed. Data include hazard ratio, SNP ID, and P value. 
    
   
  
    
   
  1 
 
  
    EGAD00001007061 
   
  
    
    Amplicon sequencing of (1) wildtype IPC298 cell line grown for 3-4 weeks with DMSO, amplified for ARAF exon 11
(2) IPC298 cells treated for 3-4 weeks with 10uM belvarafenib, isolated colony 9, amplified for ARAF exon 11
(3) MelJuso cell line grown with DMSO, amplified for ARAF exon 11 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  3 
 
  
    EGAD00001007062 
   
  
    
    Whole-exome sequencing of belvarafenib-resistant IPC-298 clones after treatment for 3-4 weeks with 10uM belvarafenib 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001007063 
   
  
    
    Multiregional analysis of three cases of GBM. For each tumor, 9 portions were analyzed by whole exome sequencing. A total of 27 bam files are present in our dataset. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001007064 
   
  
    
    Shotgun metagenomic sequencing data of a total of 2,338 fecal DNA samples from adults of the Pinggu cohort. 
    
   
  
    
      
      unspecified 
      
    
   
  2338 
 
  
    EGAD00001007066 
   
  
    
    Epigenome profiling in tumor tissues and paired normal tissues of LUAD patients, and transcriptome profiling in tumor tissues of LUAD patients. 
    
   
  
    
   
  83 
 
  
    EGAD00001007070 
   
  
    
    This study consists of over 200 data files from cfDNA and germline DNA from 69 patients and 32 healthy normal volunteers discussed in this publication. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  71 
 
  
    EGAD00001007071 
   
  
    
    PopCol is a cohort study in Stockholm, Sweden that includes a data-rich set of individuals with data available from bowel symptoms questionnaires, gastroenterology visits and biospecimens (genotype and 16S sequencing from blood and stool samples, respectively). Genotyping was carried out using the Illumina HumanOmniExpressExome-8v1 arrays at the SciLifeLab NGI facility in Uppsala, Sweden. Fecal DNA was extracted from samples kept at -80°C using Qiagen 5 QIAamp DNA Stool Mini Kits and analyzed using 16S rRNA gene amplicon sequencing (in the V1-V2 hypervariable region). This was performed on the Illumina MiSeq platform at the Institute of Clinical Molecular Biology (IKMB) in Kiel, Germany. Of these, six PopCol participants were PPI users and 12 used antibiotics. The study was approved by the local Committee of Research Ethics (Forskningskommitté Syd) at Karolinska Institutet, Stockholm, in November 2001. Written informed consent was obtained from all participants. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  134 
 
  
    EGAD00001007072 
   
  
    
    16p11.2 CNV iPSC derived dopaminergic neuron transcriptional gene expression data. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001007073 
   
  
    
    RNA-seq data of non-tumorous breast tissue. There are 32 samples in this cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  32 
 
  
    EGAD00001007074 
   
  
    
    The Maastricht IBS cohort with biobank aims to identify subgroups of IBS according to phenotypical and genotypical characterization. This dataset represents 16S amplicon sequencing of the gut microbiome of case samples and matched controls. Fecal DNA was extracted using the Qiagen AllPrep kit with a bead-beating step. Sequencing of the bacterial 16S gene, domain V4, was performed using the Illumina MiSeq platform. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  356 
 
  
    EGAD00001007075 
   
  
    
    This dataset includes linked-read whole-genome sequencing data (subfolder HFF7VCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  88 
 
  
    EGAD00001007076 
   
  
    
    We conducted whole-exome sequencing (WES) and microarray profiling on 19 micro-dissected tumor regions of different histologic subtypes from 9 patients with lung cancers of mixed histologic patterns including 6 LUAD, 6 LCNEC, 3 SCLC, 3 LUSC, and one poorly differentiated NSCLC-NOS. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  28 
 
  
    EGAD00001007078 
   
  
    
    Dataset contains WGS data from clonally expanded hematopoietic stem cells from 7 individual pediatric cancer patients. Samples were taken before (DX - diagnosis) or at follow-up (DX2/REM/FU - diagnosis 2, remission or follow-up, respectively). In addition, cord blood clones (designated CB) treated with X-ray radiation, Cisplatin, Maphosphamide, Vincristine and Doxorubicin, and untreated cord blood hematopoietic stem/progenitor cells, have been whole-genome sequenced (abbreviations RAD, CISPL, MAPH, VINC, DOX and CTRL, respectively). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  115 
 
  
    EGAD00001007079 
   
  
    
    RNA-Seq data from vocal fold fibroblasts from controls and patients with Reinke’s edema 
    
   
  
    
      
      NextSeq 550 
      
    
   
  27 
 
  
    EGAD00001007080 
   
  
    
    This dataset contains:
i) 241 deep (median 12x) whole-genome sequencing profiles of 95 patients with Ewing sarcoma, 31 patients with other pediatric sarcomas, and 22 additional profiles from healthy controls. Sequencing was performed on a NovaSeq 6000 instrument using the NovaSeq S4 2x100 bp configuration. In addition, pilot experiments for 18 cfDNA samples were performed using Illumina HiSeq 2000/2500 machines (2x75 bp configuration). Data is provided as .fastq.gz files (2 files, .R1 and .R2, per sample). 
ii) Low-coverage whole-genome sequencing on 43 tumor biopsy samples from patients with Ewing sarcoma with matched cfDNA samples. The samples were sequenced using a NovaSeq 6000 instrument with the NovaSeq S4 1x100 bp configuration. Data are provided as unmapped (raw) .bam files.
iii) Reduced-representation bisulfite sequencing data for 38 tumor biopsy samples from patients with Ewing sarcoma with matched cfDNA samples, and 2 control samples. RRBS libraries were sequenced on Illumina HiSeq 2000/2500 machines. Data are provided as unmapped (raw) .bam files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  346 
 
  
    EGAD00001007081 
   
  
    
    Paired single-cell sequencing dataset of T-cell receptors from both treated and untreated celiac patients. (Amplicon sequencing, paired-end fastq files). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  62 
 
  
    EGAD00001007082 
   
  
    
    Through the Peruvian Genome Project we generate and analyze the high coverage genomes of 150 individuals where the majority have >90% Native American ancestry and explore questions at the interface of evolutionary genetics, history, anthropology, and medicine. This is the most extensive sampling of high-coverage Native American and mestizo whole genomes to date. Reference: https://doi.org/10.1073/pnas.1720798115 
    
   
  
    
   
  150 
 
  
    EGAD00001007083 
   
  
    
    The innate immune response of cells of hepatic origin (Huh7, Huh7.5, PH5CH and primary human hepatocytes (PHH), 66 samples) was analyzed by transcriptome analysis (RNAseq) upon supernatant delivery or transfection of synthetic dsRNA (poly(I:C)). Expression of TLR3 and RIG-I was reconstituted by lentiviral transduction in Huh7 and Huh7.5 cells. The sequencing is single RNA-Seq on an Illumina HiSeq 4000 with the Illumina TruSeq stranded mRNA kit. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  66 
 
  
    EGAD00001007085 
   
  
    
    The dataset is composed of the raw and processed sequencing data generated from 8 Australian patients and 13 Argentinian patients affected by a form of male infertility characterised by vital but immotile sperm, often in combination with a spectrum of structural abnormalities of the sperm flagellum. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001007086 
   
  
    
    Whole genome sequencing for single cells for library A108757B 1644 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007087 
   
  
    
    Whole genome sequencing for single cells for library A98299B 991 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007088 
   
  
    
    Whole genome sequencing for single cells for library A108759B 1134 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007089 
   
  
    
    Whole genome sequencing for single cells for library A108768B 1126 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007090 
   
  
    
    Whole genome sequencing for single cells for library A108837A 1435 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007091 
   
  
    
    Whole genome sequencing for single cells for library A108846A 1670 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007092 
   
  
    
    Whole genome sequencing for single cells for library A108846B 1636 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007093 
   
  
    
    Whole genome sequencing for single cells for library A108879A 1567 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007094 
   
  
    
    Whole genome sequencing for single cells for library A110632A 1151 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007095 
   
  
    
    Whole genome sequencing for single cells for library A110632B 892 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007096 
   
  
    
    Whole genome sequencing for single cells for library A118833A 919 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007097 
   
  
    
    Whole genome sequencing for single cells for library A118833B 1147 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007098 
   
  
    
    Whole genome sequencing for single cells for library A118845B 1209 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007099 
   
  
    
    Whole genome sequencing for single cells for library A118869A 1165 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007100 
   
  
    
    Whole genome sequencing for single cells for library A118869B 1124 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007101 
   
  
    
    Whole genome sequencing for single cells for library A73044B 2085 cells; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001007102 
   
  
    
    Whole genome sequencing for single cells for library A73047D 2091 cells; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001007103 
   
  
    
    Whole genome sequencing for single cells for library A95626A 1040 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007104 
   
  
    
    Whole genome sequencing for single cells for library A95633B 2259 cells; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  7 
 
  
    EGAD00001007105 
   
  
    
    Whole genome sequencing for single cells for library A95634B 1335 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007106 
   
  
    
    Whole genome sequencing for single cells for library A95646A 1071 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007107 
   
  
    
    Whole genome sequencing for single cells for library A95653B 1342 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007108 
   
  
    
    Whole genome sequencing for single cells for library A95663B 1364 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007109 
   
  
    
    Whole genome sequencing for single cells for library A95675A 668 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007110 
   
  
    
    Whole genome sequencing for single cells for library A95731B 1307 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007111 
   
  
    
    Whole genome sequencing for single cells for library A96115A 1635 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007112 
   
  
    
    Whole genome sequencing for single cells for library A96115B 1201 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007113 
   
  
    
    Whole genome sequencing for single cells for library A96118A 1193 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007114 
   
  
    
    Whole genome sequencing for single cells for library A96130B 1267 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007115 
   
  
    
    Whole genome sequencing for single cells for library A96178A 788 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007116 
   
  
    
    Whole genome sequencing for single cells for library A96178B 846 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007117 
   
  
    
    Whole genome sequencing for single cells for library A96181C 1216 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007118 
   
  
    
    Whole genome sequencing for single cells for library A96193B 2410 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  4 
 
  
    EGAD00001007119 
   
  
    
    Whole genome sequencing for single cells for library A96225B 959 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  4 
 
  
    EGAD00001007120 
   
  
    
    Whole genome sequencing for single cells for library A96225C 1034 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  4 
 
  
    EGAD00001007121 
   
  
    
    Whole genome sequencing for single cells for library A96240A 1493 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007122 
   
  
    
    Whole genome sequencing for single cells for library A96240B 1683 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007123 
   
  
    
    Whole genome sequencing for single cells for library A98172A 898 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007124 
   
  
    
    Whole genome sequencing for single cells for library A98256A 1372 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007125 
   
  
    
    Whole genome sequencing for single cells for library A98256B 1437 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007126 
   
  
    
    Whole genome sequencing for single cells for library A98257B 1286 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007127 
   
  
    
    Whole genome sequencing for single cells for library A98258B 1357 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007128 
   
  
    
    Whole genome sequencing for single cells for library A98274B 1295 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007129 
   
  
    
    Whole genome sequencing for single cells for library A98282A 1198 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007130 
   
  
    
    Whole genome sequencing for single cells for library A98283A 1137 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007131 
   
  
    
    Whole genome sequencing for single cells for library A98285A 1422 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007132 
   
  
    
    Whole genome sequencing for single cells for library A98295B 1381 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007133 
   
  
    
    This dataset contains the bam files of WGS data from the paper by He et al. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  56 
 
  
    EGAD00001007134 
   
  
    
    This dataset contains the bam files of WES data from the paper by He et al. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001007135 
   
  
    
    This dataset contains the bam files of RNA-seq data from the paper by He et al. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  52 
 
  
    EGAD00001007136 
   
  
    
    ADMSC05 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007137 
   
  
    
    ADMSC06 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007138 
   
  
    
    ADMSC07 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007139 
   
  
    
    ADMSC08 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007140 
   
  
    
    Islet-derived-MSC01 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007141 
   
  
    
    Islet-derived-MSC02 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007142 
   
  
    
    Islet-derived-MSC03 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007143 
   
  
    
    Islet-derived-MSC04 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007144 
   
  
    
    Islet-derived-iPSC01 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007145 
   
  
    
    Islet-derived-iPSC02 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007146 
   
  
    
    Pancreas-Islet07 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007147 
   
  
    
    Fat-adipocyte03 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007148 
   
  
    
    Fat-adipocyte04 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007149 
   
  
    
    Fat-adipocyte05 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007150 
   
  
    
    ADMSC01 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007151 
   
  
    
    ADMSC02 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007152 
   
  
    
    ADMSC03 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007153 
   
  
    
    ADMSC05 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007154 
   
  
    
    ADMSC06 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007155 
   
  
    
    ADMSC07 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007156 
   
  
    
    ADMSC08 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007157 
   
  
    
    Fat-adipocyte03 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007158 
   
  
    
    Fat-adipocyte04 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007159 
   
  
    
    Fat-adipocyte05 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007160 
   
  
    
    Islet-derived-MSC01 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007161 
   
  
    
    Islet-derived-MSC02 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007162 
   
  
    
    Islet-derived-MSC03 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007163 
   
  
    
    Islet-derived-MSC04 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007164 
   
  
    
    Islet-derived-iPSC01 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007165 
   
  
    
    Islet-derived-iPSC02 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007166 
   
  
    
    Pancreas-Islet06 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007167 
   
  
    
    Pancreas-Islet07 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007168 
   
  
    
    Pancreas-Islet08 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007169 
   
  
    
    SMC01 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007170 
   
  
    
    SMC02 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007171 
   
  
    
    SMC03 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007172 
   
  
    
    SMC04 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007173 
   
  
    
    SMC05 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007174 
   
  
    
    SMC07 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007175 
   
  
    
    SMC08 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007176 
   
  
    
    SMC09 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007177 
   
  
    
    ADMSC05 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007178 
   
  
    
    ADMSC06 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007179 
   
  
    
    ADMSC07 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007180 
   
  
    
    ADMSC08 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007181 
   
  
    
    Islet-derived-MSC01 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007182 
   
  
    
    Islet-derived-MSC02 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007183 
   
  
    
    Islet-derived-MSC03 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007184 
   
  
    
    Islet-derived-MSC04 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007185 
   
  
    
    Islet-derived-iPSC01 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007186 
   
  
    
    Islet-derived-iPSC02 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007187 
   
  
    
    Pancreas-Islet06 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007188 
   
  
    
    Pancreas-Islet07 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007189 
   
  
    
    Pancreas-Islet08 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007190 
   
  
    
    Fat-adipocyte03 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007191 
   
  
    
    Fat-adipocyte04 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007192 
   
  
    
    Fat-adipocyte05 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007193 
   
  
    
    Pancreas-Islet02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007194 
   
  
    
    Pancreas-Islet02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007195 
   
  
    
    Pancreas-Islet02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007196 
   
  
    
    Pancreas-Islet02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007197 
   
  
    
    Pancreas-Islet02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007198 
   
  
    
    Pancreas-Islet02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007199 
   
  
    
    Pancreas-Islet02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007200 
   
  
    
    Pancreas-Islet03 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007201 
   
  
    
    Pancreas-Islet03 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007202 
   
  
    
    Pancreas-Islet03 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007203 
   
  
    
    Pancreas-Islet03 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007204 
   
  
    
    Pancreas-Islet03 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007205 
   
  
    
    Pancreas-Islet03 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007206 
   
  
    
    Pancreas-Islet03 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007207 
   
  
    
    Pancreas-Islet04 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007208 
   
  
    
    Pancreas-Islet04 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007209 
   
  
    
    Pancreas-Islet04 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007210 
   
  
    
    Pancreas-Islet04 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007211 
   
  
    
    Pancreas-Islet04 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007212 
   
  
    
    Pancreas-Islet04 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007213 
   
  
    
    Pancreas-Islet04 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007214 
   
  
    
    Pancreas-beta01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007215 
   
  
    
    Pancreas-beta01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007216 
   
  
    
    Pancreas-beta01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007217 
   
  
    
    Pancreas-beta01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007218 
   
  
    
    Pancreas-beta01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007219 
   
  
    
    Pancreas-beta01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007220 
   
  
    
    Pancreas-beta01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007221 
   
  
    
    Fat-Preadipocyte01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007222 
   
  
    
    Fat-Preadipocyte01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007223 
   
  
    
    Fat-Preadipocyte01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007224 
   
  
    
    Fat-Preadipocyte01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007225 
   
  
    
    Fat-Preadipocyte01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007226 
   
  
    
    Fat-Preadipocyte01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007227 
   
  
    
    Fat-Preadipocyte01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007228 
   
  
    
    Kidney-Podocyte01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007229 
   
  
    
    Kidney-Podocyte01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007230 
   
  
    
    Kidney-Podocyte01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007231 
   
  
    
    Kidney-Podocyte01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007232 
   
  
    
    Kidney-Podocyte01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007233 
   
  
    
    Kidney-Podocyte01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007234 
   
  
    
    Kidney-Podocyte01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007235 
   
  
    
    Kidney-Podocyte03 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007236 
   
  
    
    Kidney-Podocyte03 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007237 
   
  
    
    Kidney-Podocyte03 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007238 
   
  
    
    Kidney-Podocyte03 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007239 
   
  
    
    Kidney-Podocyte03 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007240 
   
  
    
    Kidney-Podocyte03 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007241 
   
  
    
    Kidney-Podocyte03 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007242 
   
  
    
    Kidney-Podocyte04 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007243 
   
  
    
    Kidney-Podocyte04 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007244 
   
  
    
    Kidney-Podocyte04 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007245 
   
  
    
    Kidney-Podocyte04 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007246 
   
  
    
    Kidney-Podocyte04 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007247 
   
  
    
    Kidney-Podocyte04 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007248 
   
  
    
    Kidney-Podocyte04 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007249 
   
  
    
    Kidney-mesangial01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007250 
   
  
    
    Kidney-mesangial01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007251 
   
  
    
    Kidney-mesangial01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007252 
   
  
    
    Kidney-mesangial01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007253 
   
  
    
    Kidney-mesangial01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007254 
   
  
    
    Kidney-mesangial01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007255 
   
  
    
    Kidney-mesangial01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007256 
   
  
    
    Kidney-mesangial02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007257 
   
  
    
    Kidney-mesangial02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007258 
   
  
    
    Kidney-mesangial02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007259 
   
  
    
    Kidney-mesangial02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007260 
   
  
    
    Kidney-mesangial02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007261 
   
  
    
    Kidney-mesangial02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007262 
   
  
    
    Kidney-mesangial02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007263 
   
  
    
    IPS-Fibroblast01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007264 
   
  
    
    IPS-Fibroblast01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007265 
   
  
    
    IPS-Fibroblast01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007266 
   
  
    
    IPS-Fibroblast01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007267 
   
  
    
    IPS-Fibroblast01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007268 
   
  
    
    IPS-Fibroblast01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007269 
   
  
    
    IPS-Fibroblast01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007270 
   
  
    
    IPS-Fibroblast02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007271 
   
  
    
    IPS-Fibroblast02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007272 
   
  
    
    IPS-Fibroblast02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007273 
   
  
    
    IPS-Fibroblast02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007274 
   
  
    
    IPS-Fibroblast02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007275 
   
  
    
    IPS-Fibroblast02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007276 
   
  
    
    IPS-Fibroblast02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007277 
   
  
    
    IPS-NPC01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007278 
   
  
    
    IPS-NPC01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007279 
   
  
    
    IPS-NPC01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007280 
   
  
    
    IPS-NPC01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007281 
   
  
    
    IPS-NPC01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007282 
   
  
    
    IPS-NPC01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007283 
   
  
    
    IPS-NPC01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007284 
   
  
    
    IPS-NPC02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007285 
   
  
    
    IPS-NPC02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007286 
   
  
    
    IPS-NPC02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007287 
   
  
    
    IPS-NPC02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007288 
   
  
    
    IPS-NPC02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007289 
   
  
    
    IPS-NPC02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007290 
   
  
    
    IPS-NPC02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007291 
   
  
    
    IPS-ENeuron01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007292 
   
  
    
    IPS-ENeuron01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007293 
   
  
    
    IPS-ENeuron01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007294 
   
  
    
    IPS-ENeuron01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007295 
   
  
    
    IPS-ENeuron01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007296 
   
  
    
    IPS-ENeuron01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007297 
   
  
    
    IPS-ENeuron01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007298 
   
  
    
    IPS-ENeuron02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007299 
   
  
    
    IPS-ENeuron02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007300 
   
  
    
    IPS-ENeuron02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007301 
   
  
    
    IPS-ENeuron02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007302 
   
  
    
    IPS-ENeuron02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007303 
   
  
    
    IPS-ENeuron02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007304 
   
  
    
    IPS-ENeuron02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007305 
   
  
    
    458 single cell samples of multiple colorectal cancer organoids 
    
   
  
    
   
  458 
 
  
    EGAD00001007306 
   
  
    
    Whole genome sequencing for single cells for library A95646B 507 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007307 
   
  
    
    Molecular profiling by exome sequencing of an AML case following treatment with a BCL2 inhibitor 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001007308 
   
  
    
    SMC01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007309 
   
  
    
    SMC01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007310 
   
  
    
    SMC01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007311 
   
  
    
    SMC01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007312 
   
  
    
    SMC01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007313 
   
  
    
    SMC01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007314 
   
  
    
    SMC02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007315 
   
  
    
    SMC02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007316 
   
  
    
    SMC02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007317 
   
  
    
    SMC02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007318 
   
  
    
    SMC02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007319 
   
  
    
    SMC02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007320 
   
  
    
    SMC03 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007321 
   
  
    
    SMC03 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007322 
   
  
    
    SMC03 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007323 
   
  
    
    SMC03 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007324 
   
  
    
    SMC03 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007325 
   
  
    
    SMC03 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007326 
   
  
    
    SMC04 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007327 
   
  
    
    SMC04 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007328 
   
  
    
    SMC04 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007329 
   
  
    
    SMC04 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007330 
   
  
    
    SMC04 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007331 
   
  
    
    SMC04 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007332 
   
  
    
    SMC05 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007333 
   
  
    
    SMC05 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007334 
   
  
    
    SMC05 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007335 
   
  
    
    SMC05 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007336 
   
  
    
    SMC05 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007337 
   
  
    
    SMC05 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007338 
   
  
    
    SMC07 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007339 
   
  
    
    SMC07 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007340 
   
  
    
    SMC07 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007341 
   
  
    
    SMC07 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007342 
   
  
    
    SMC07 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007343 
   
  
    
    SMC07 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007344 
   
  
    
    SMC08 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007345 
   
  
    
    SMC08 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007346 
   
  
    
    SMC08 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007347 
   
  
    
    SMC08 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007348 
   
  
    
    SMC08 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007349 
   
  
    
    SMC08 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007350 
   
  
    
    SMC09 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007351 
   
  
    
    SMC09 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007352 
   
  
    
    SMC09 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007353 
   
  
    
    SMC09 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007354 
   
  
    
    SMC09 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007355 
   
  
    
    SMC09 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007356 
   
  
    
    ADMSC01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007357 
   
  
    
    ADMSC01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007358 
   
  
    
    ADMSC01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007359 
   
  
    
    ADMSC01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007360 
   
  
    
    ADMSC01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007361 
   
  
    
    ADMSC01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007362 
   
  
    
    ADMSC02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007363 
   
  
    
    ADMSC02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007364 
   
  
    
    ADMSC02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007365 
   
  
    
    ADMSC02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007366 
   
  
    
    ADMSC02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007367 
   
  
    
    ADMSC02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007368 
   
  
    
    ADMSC03 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007369 
   
  
    
    ADMSC03 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007370 
   
  
    
    ADMSC03 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007371 
   
  
    
    ADMSC03 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007372 
   
  
    
    ADMSC03 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007373 
   
  
    
    ADMSC03 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007374 
   
  
    
    ADMSC05 h3k27Ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007375 
   
  
    
    ADMSC05 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007376 
   
  
    
    ADMSC05 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007377 
   
  
    
    ADMSC05 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007378 
   
  
    
    ADMSC05 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007379 
   
  
    
    ADMSC05 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007380 
   
  
    
    ADMSC05 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007381 
   
  
    
    ADMSC06 h3k27Ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007382 
   
  
    
    ADMSC06 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007383 
   
  
    
    ADMSC06 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007384 
   
  
    
    ADMSC06 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007385 
   
  
    
    ADMSC06 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007386 
   
  
    
    ADMSC06 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007387 
   
  
    
    ADMSC06 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007388 
   
  
    
    ADMSC07 h3k27Ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007389 
   
  
    
    ADMSC07 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007390 
   
  
    
    ADMSC07 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007391 
   
  
    
    ADMSC07 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007392 
   
  
    
    ADMSC07 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007393 
   
  
    
    ADMSC07 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007394 
   
  
    
    ADMSC07 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007395 
   
  
    
    ADMSC08 h3k27Ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007396 
   
  
    
    ADMSC08 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007397 
   
  
    
    ADMSC08 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007398 
   
  
    
    ADMSC08 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007399 
   
  
    
    ADMSC08 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007400 
   
  
    
    ADMSC08 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007401 
   
  
    
    ADMSC08 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007402 
   
  
    
    Islet-derived-MSC01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007403 
   
  
    
    Islet-derived-MSC01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007404 
   
  
    
    Islet-derived-MSC01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007405 
   
  
    
    Islet-derived-MSC01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007406 
   
  
    
    Islet-derived-MSC01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007407 
   
  
    
    Islet-derived-MSC01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007408 
   
  
    
    Islet-derived-MSC01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007409 
   
  
    
    Islet-derived-MSC02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007410 
   
  
    
    Islet-derived-MSC02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007411 
   
  
    
    Islet-derived-MSC02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007412 
   
  
    
    Islet-derived-MSC02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007413 
   
  
    
    Islet-derived-MSC02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007414 
   
  
    
    Islet-derived-MSC02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007415 
   
  
    
    Islet-derived-MSC02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007416 
   
  
    
    Islet-derived-MSC03 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007417 
   
  
    
    Islet-derived-MSC03 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007424 
   
  
    
    Islet-derived-MSC03 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007425 
   
  
    
    Islet-derived-MSC03 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007426 
   
  
    
    Islet-derived-MSC03 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007427 
   
  
    
    Islet-derived-MSC03 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007428 
   
  
    
    Islet-derived-MSC03 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007429 
   
  
    
    Islet-derived-MSC04 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007430 
   
  
    
    Islet-derived-MSC04 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007431 
   
  
    
    Islet-derived-MSC04 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007432 
   
  
    
    Islet-derived-MSC04 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007433 
   
  
    
    Islet-derived-MSC04 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007434 
   
  
    
    Islet-derived-MSC04 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007435 
   
  
    
    Islet-derived-MSC04 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007436 
   
  
    
    Islet-derived-iPSC01 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007437 
   
  
    
    Islet-derived-iPSC01 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007438 
   
  
    
    Islet-derived-iPSC01 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007439 
   
  
    
    Islet-derived-iPSC01 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007440 
   
  
    
    Islet-derived-iPSC01 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007441 
   
  
    
    Islet-derived-iPSC01 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007442 
   
  
    
    Islet-derived-iPSC01 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007443 
   
  
    
    Islet-derived-iPSC02 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007444 
   
  
    
    Islet-derived-iPSC02 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007445 
   
  
    
    Islet-derived-iPSC02 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007446 
   
  
    
    Islet-derived-iPSC02 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007447 
   
  
    
    Islet-derived-iPSC02 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007448 
   
  
    
    Islet-derived-iPSC02 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007449 
   
  
    
    Islet-derived-iPSC02 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007450 
   
  
    
    Pancreas-Islet06 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007451 
   
  
    
    Pancreas-Islet06 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007452 
   
  
    
    Pancreas-Islet06 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007453 
   
  
    
    Pancreas-Islet06 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007454 
   
  
    
    Pancreas-Islet06 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007455 
   
  
    
    Pancreas-Islet06 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007456 
   
  
    
    Pancreas-Islet06 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007457 
   
  
    
    Pancreas-Islet07 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007458 
   
  
    
    Pancreas-Islet07 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007459 
   
  
    
    Pancreas-Islet07 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007460 
   
  
    
    Pancreas-Islet07 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007461 
   
  
    
    Pancreas-Islet07 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007462 
   
  
    
    Pancreas-Islet07 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007463 
   
  
    
    Pancreas-Islet07 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007464 
   
  
    
    Pancreas-Islet08 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007465 
   
  
    
    Pancreas-Islet08 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007466 
   
  
    
    Pancreas-Islet08 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007467 
   
  
    
    Pancreas-Islet08 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007468 
   
  
    
    Pancreas-Islet08 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007469 
   
  
    
    Pancreas-Islet08 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007470 
   
  
    
    Pancreas-Islet08 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007471 
   
  
    
    Fat-adipocyte03 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007472 
   
  
    
    Fat-adipocyte03 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007473 
   
  
    
    Fat-adipocyte03 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007474 
   
  
    
    Fat-adipocyte03 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007475 
   
  
    
    Fat-adipocyte03 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007476 
   
  
    
    Fat-adipocyte03 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007477 
   
  
    
    Fat-adipocyte03 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007478 
   
  
    
    Fat-adipocyte04 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007479 
   
  
    
    Fat-adipocyte04 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007480 
   
  
    
    Fat-adipocyte04 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007481 
   
  
    
    Fat-adipocyte04 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007482 
   
  
    
    Fat-adipocyte04 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007483 
   
  
    
    Fat-adipocyte04 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007484 
   
  
    
    Fat-adipocyte04 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007485 
   
  
    
    Fat-adipocyte05 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007486 
   
  
    
    Fat-adipocyte05 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007487 
   
  
    
    Fat-adipocyte05 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007488 
   
  
    
    Fat-adipocyte05 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007489 
   
  
    
    Fat-adipocyte05 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007490 
   
  
    
    Fat-adipocyte05 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007491 
   
  
    
    Fat-adipocyte05 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007493 
   
  
    
    Data supporting: "Genomic analysis of response to neoadjuvant chemotherapy in esophageal adenocarcinoma" Izadi et al.
WGS for tumour and normal samples.
RNAseq for tumour samples. 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001007494 
   
  
    
    RNA-sequencing of meningiomas for integrative molecular classification. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  124 
 
  
    EGAD00001007495 
   
  
    
    Single-cell RNA-Sequencing of 26 primary breast cancers from Wu et al. (2021) study. Data was generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq 500 platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  26 
 
  
    EGAD00001007496 
   
  
    
    Data supporting: "Widespread reorganisation of the regulatory chromatin landscape facilitates resistance to inhibition of oncogenic ERBB2 signalling" Ogden et al.
WGS for tumour and normal samples.
RNAseq for tumour samples. 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001007497 
   
  
    
    One retinoblastoma sample was studied by single-cell RNA-sequencing (10X Genomics Chromium). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001007498 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001007499 
   
  
    
    38 PPTCs were sequenced using RNASeq to identify oncogenic variants in driver-unknown tumors and to explore gene expression patterns.
RNA libraries were quantified by qPCR using the Kapa Library Quantification Illumina/ABI Prism Kit protocol (KAPA Biosystems). Libraries were pooled in equimolar quantities and paired-end sequenced on 2 lanes of a High Throughput Run Mode flowcell with the V4 sequencing chemistry on an Illumina HiSeq 2500 platform following Illumina’s recommended protocol to generate paired-end reads of 126-bases in length. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  38 
 
  
    EGAD00001007501 
   
  
    
    The Dutch Microbiome Project (DMP) data includes shotgun metagenomic sequencing of faecal samples from 8,208 Dutch individuals. Paired-end sequencing was performed using the Illumina HiSeq 2000 platform. Data is archived in two batches to facilitate data access and upload to EGA. Batch 2 of DMP includes ~400 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3848 
 
  
    EGAD00001007502 
   
  
    
    Exome libraries from 47 blood and tissue samples were prepared using the Agilent SureSelect Human Exome Library Preparation V5 kit and the Agilent Bravo Automation System. Exome libraries were pooled and sequenced with the TruSeq SBS sequencing chemistry using a V4 high-throughput flowcell on a HiSeq 2500 platform following Illumina’s recommended protocol. Approximately 6-8 gigabases of raw 126-base paired-end data were generated per exome library. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  47 
 
  
    EGAD00001007503 
   
  
    
    Multiple metastatic sites were sampled at autopsy from four patients diagnosed with metastatic colorectal cancer and subjected to whole-genome sequencing using the Illumina HiSeq X Ten platform to identify somatic variants, structural rearrangements and mutational signatures. The number of tumour samples per patient ranged from 6 to 66. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  88 
 
  
    EGAD00001007504 
   
  
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001007505 
   
  
    
    This is the raw data obtained from shallow whole-genome sequencing of plasma DNA (plasma-seq) for calling of somatic copy number alterations as well as focal amplifications from patients with lung cancer. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001007506 
   
  
    
    This dataset combines single cell transcriptome data from fetal pancreas at 7-10 wpc, embryonic stem cell-derived pancreas progenitors and spheroids generated from both fetal pancreas and human pluripotent stem cell-derived pancreas progenitors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001007508 
   
  
    
    This dataset contains shallow whole genome sequencing data from paired cfDNA - tumor DNA samples of various pediatric cancer entities (total 215 samples). Files are provided in fastq format. Samples were sequenced on an Ion Proton sequencer (Thermo Fisher Scientific) or a HiSeq 3000 (Illumina). Data analysis is available at https://github.com/rmvpaeme/sWGS_pediatric_cancer. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
      Ion Torrent Proton 
      
    
   
  215 
 
  
    EGAD00001007509 
   
  
    
    Longitudinal genome-sequencing analysis of 12 patients with metastatic or refractory osteosarcoma. The study was approved at the University Hospital Basel, following the approval of the ethical committee for mutational analysis of anonymized samples (“Ethikkommission beider Basel” ref. 274/12). Informed consent was obtained from all 12 patients. All tumor samples were evaluated by an experienced bone pathologist to confirm the diagnosis. WES and low coverage WGS are aligned against the reference genome GRCh37. More details in the associated publication. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  72 
 
  
    EGAD00001007510 
   
  
    
    The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001007511 
   
  
    
    Dataset with 55 whole-exome sequences from Tunisian non-Imazighen samples and 20 whole-exome sequences from Tunisian Imazighen samples. 
    
   
  
    
      
      unspecified 
      
    
   
  75 
 
  
    EGAD00001007512 
   
  
    
    RNA sequencing data of anti-SARS-CoV-2 spike IgG (monoclonal or patient serum), poly(I:C) and Fostamatinib treated human primary IL10-M2 macrophages 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  14 
 
  
    EGAD00001007515 
   
  
    
    This dataset contains different samples from a single patient with SEF. The dataset contains whole genome, whole exome and RNAseq information. Two of the DNA sequencing samples also contain matched normals. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001007516 
   
  
    
    This dataset contains different samples from a single patient with SEF. The dataset contains whole genome, whole exome and RNAseq information. Two of the DNA sequencing samples also contain matched normals. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001007520 
   
  
    
    In this project we performed targeted sequencing of known and suspected melanoma susceptibility genes in a cohort of melanoma patients and matched controls. Our aim was to identify variants that predispose to melanoma development. The melanoma cases used in this study were population ascertained. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2731 
 
  
    EGAD00001007521 
   
  
    
    14 samples were processed for single cell DNA sequencing 
    
   
  
    
      
      NextSeq 500 
      
    
   
  14 
 
  
    EGAD00001007522 
   
  
    
    13 samples were processed by whole genome sequencing 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001007523 
   
  
    
    17 samples were processed for single cell RNA sequencing (SORT-seq) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001007524 
   
  
    
    We collected 80 NASH-HCC formalin-fixed paraffin-embedded (FFPE) samples from 5 different institutions. NASH was diagnosed in FFPE samples by at least two expert pathologists following a described histological algorithm (Bedossa et al., Hepatology, 2012). All NASH patients included in the study were HBV- and HCV-negative. Patients reporting alcohol consumption ≥ 20 g/day for women and 30 g/day for men, as well as patients with a known liver disease superimposed on NASH, were excluded. Tumour and paired non-tumour gDNA of NASH-HCC FFPE samples were submitted to Whole Exome Sequencing (WES). Exome capture and sequencing library preparation were performed using the SureSelect Human All Exon V5, no UTR hybridization capture kit from Agilent (Target Size 50 Mb). Libraries were sequenced on an Illumina HiSeq 4000 instrument with 100-bp paired-end reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  66 
 
  
    EGAD00001007525 
   
  
    
    This is 10X single cell RNA-seq data of esophageal adenocarcinoma organoids. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001007526 
   
  
    
    Novel optineurin frameshift insertion causing familial frontotemporal dementia and parkinsonism without amyotrophic lateral sclerosis 
    
   
  
    
   
  4 
 
  
    EGAD00001007527 
   
  
    
    SPECTA Lung - RP1335 14MG cram files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  38 
 
  
    EGAD00001007528 
   
  
    
    RNAseq from 13 uveal melanoma patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  13 
 
  
    EGAD00001007529 
   
  
    
    November 2020 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001007530 
   
  
    
    RNAseq data of total TXVI samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  227 
 
  
    EGAD00001007531 
   
  
    
    Comparative transcriptome of CD34+ hematopoietic progenitors from 4 myeloproliferative patients (MPN) and 4 control donors performed by RNA-Sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001007532 
   
  
    
    A molecular signature for IL-10-producing Th1 cells in protozoan parasitic diseases 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001007533 
   
  
    
    sn-RNAseq profiling of the impact of a cytokine storm model in human cardiac organoids 
    
   
  
    
      
      NextSeq 550 
      
    
   
  2 
 
  
    EGAD00001007563 
   
  
    
    We analyzed chromothripsis in 252 human breast cancers from two patient cohorts (149 metastatic breast cancers, 63 untreated primary tumors, 29 local relapses, 11 longitudinal pairs) using whole-genome and whole-exome (paired) sequencing. Many of the WGS samples were sequenced on the Illumina HiSeq X Ten using Illumina TruSeq Nano DNA. For exome sequencing, Agilent_SureSelect_V5+UTRs was used (sequencing on HiSeq 2000, HiSeq 2500 and HiSeq 4000). 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  516 
 
  
    EGAD00001007564 
   
  
    
    G3BP2-KIT drives leukemia amenable to kinase inhibition in Ph-like ALL: RNAseq for human sample 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001007565 
   
  
    
    We performed genetic analysis of HLA and immune escape genes in samples from 44 patients sequenced by whole exome sequencing (34 tumor samples, 32 normal samples) and whole genome sequencing (10 tumor samples, 12 normal samples). We also performed HLA targeted sequencing in 26/44 patients (26 tumor samples, 26 normal samples). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  139 
 
  
    EGAD00001007566 
   
  
    
    Matrix of gene x sample RNAseq read count data. 
    
   
  
    
   
  97 
 
  
    EGAD00001007567 
   
  
    
    Sample and clinical data from the Idiopathic Pulmonary Fibrosis Core Biopsy Study, including disease group, sex, diagnosis, and sample location. 
    
   
  
    
   
  97 
 
  
    EGAD00001007568 
   
  
    
    RNA-Seq fastq files for 97 Idiopathic Pulmonary Fibrosis samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  97 
 
  
    EGAD00001007569 
   
  
    
    RNAseq data set, Degradation of Janus kinases in CRLF2-rearranged acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001007570 
   
  
    
    WGS data set of 11 xenograft samples, Degradation of Janus kinases in CRLF2-rearranged acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001007571 
   
  
    
    Background and Aims: Homologous recombination deficiency (HRD) in pancreatic ductal adenocarcinoma (PDAC) remains poorly defined beyond germline (g) alterations in BRCA1, BRCA2 and PALB2. Methods: We interrogated whole genome sequencing (WGS) data on 391 patients including 49 carriers of pathogenic variants (PVs) in g_BRCA_ and PALB2. HRD classifiers were applied to the dataset and included: 1) the genomic instability score (GIS) used by Myriad MyChoice HRD assay; 2) substitution base signature 3 (SBS3); 3) HRDetect; and, 4) Structural Variant (SV) burden. Clinical outcomes and responses to chemotherapy were correlated with HRD status. Results: Biallelic tumour inactivation of g_BRCA_ or PALB2 was evident in 43/49 germline carriers identifying HRD-PDAC. HRDetect (score ≥0.7) predicted gBRCA1/PALB2 deficiency with highest sensitivity (98%) and specificity (100%). HRD genomic tumour classifiers suggested that 7-10% of PDAC that do not harbor g_BRCA/PALB2_ have features of HRD. Of the somatic HRDetect cases, 69% were attributed to alterations in BRCA1/2, PALB2, RAD51C/D and XRCC2, and a tandem duplicator phenotype. TP53 loss was more common in BRCA1- compared to BRCA2-associated HRD-PDAC. HRD status was not prognostic in resected PDAC. However, in advanced disease the GIS (p=0.02), SBS3 (p=0.03) and HRDetect score (p=0.005) were predictive of platinum response and superior survival. PVs in g_ATM_ (n=6) or g_CHEK2_ (n=2) did not result in HRD-PDAC by any of the classifiers. In four patients, BRCA2 reversion mutations were associated with platinum resistance. Conclusions: Germline and parallel somatic profiling of PDAC outperforms germline testing alone in identifying HRD-PDAC. An additional 7-10% of patients without gBRCA/PALB2 mutations may benefit from DNA damage response agents. 
    
   
  
    
   
  - 
 
  
    EGAD00001007572 
   
  
    
    This data set contains single cell transcriptomes generated using the Chromium 10X platform from both fresh cells and nuclei. The samples measured are derived from children with Wilms tumour, Clear Cell Sarcoma of the Kidney (CCSK), or Malignant Rhabdoid Tumours. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  55 
 
  
    EGAD00001007573 
   
  
    
    Purpose
Exploratory analyses of CheckMate 066 and 067 trials were conducted to investigate associations of tumor mutational burden (TMB), a 4-gene inflammatory gene expression signature, and BRAF mutation status with tumor response, progression-free survival (PFS), and overall survival (OS) in patients with advanced melanoma.
Patients and Methods
Patients with known programmed death ligand 1 (PD-L1) expression and BRAF mutation status received nivolumab (NIVO) or dacarbazine in CheckMate 066 and either NIVO, ipilimumab (IPI), or NIVO+IPI in CheckMate 067. Whole exome sequencing and RNA sequencing were used to determine TMB and inflammatory gene expression signature scores, respectively. These biomarkers were evaluated in terms of their association with PFS and OS.
Results
In the NIVO, NIVO+IPI, and IPI arms of CheckMate 067, longer survival was associated with high (> median) versus low (≤ median) TMB with hazard ratios (HRs) (95% confidence interval [CI]) for PFS of 0.45 (0.30–0.65), 0.55 (0.38–0.81), and 0.60 (0.43–0.82), and for OS of 0.46 (0.30–0.71), 0.53 (0.34–0.82), and 0.52 (0.36–0.74), respectively. For NIVO-treated patients, these results were confirmed in CheckMate 066. A survival benefit was observed with high TMB and absence of BRAF mutation. Survival was associated with high versus low inflammatory signature scores with HRs (95% CI) for PFS of 0.56 (0.34–0.94), 0.40 (0.23–0.72), and 0.43 (0.27–0.70), and for OS of 0.37 (0.20–0.66), 0.38 (0.19–0.74), and 0.46 (0.27–0.79), in the NIVO, NIVO+IPI, and IPI arms, respectively. Weak correlations were observed between PD-L1, TMB, and the inflammatory signature.
Conclusions
Combined assessment of TMB, inflammatory gene expression signature, and BRAF mutation status may be predictive for response to immunotherapy in advanced melanoma. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  122 
 
  
    EGAD00001007574 
   
  
    
    ctDNA data from IMvigor010: ctDNA data include TMB_status, cfDNA_extracted_ng, Plasma_Volume_Used_ml, Sample_Call, Sample_Number_Positive_Calls, Sample_Mean_VAF_In_Plasma (%), Sample_MTM_per_mL_In_Plasma for 581 patients across IMvigor010. 
    
   
  
    
   
  - 
 
  
    EGAD00001007575 
   
  
    
    RNAseq FASTQ files from 728 bulk pre-treatment tumors from IMvigor010. 
    
   
  
    
      
      unspecified 
      
    
   
  728 
 
  
    EGAD00001007576 
   
  
    
    Clinical data from IMvigor010: Clinical data include race, sex, baseline ECOG, tumor stage, node status, prior neoadjuvant, PD-L1 status, number lymph nodes resected, pridis, arm, overall survival and disease free survival for 809 patients across IMvigor010. 
    
   
  
    
   
  - 
 
  
    EGAD00001007577 
   
  
    
    RNAseq samples from the iAMP21 study 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      NextSeq 550 
      
    
   
  88 
 
  
    EGAD00001007578 
   
  
    
    ctDNA guiding adjuvant immunotherapy in urothelial carcinoma (Nature 2021). ABACUS clinical data cut (2020) used in 2021 paper: Clinical data include PCR, RFS_months, RFS_event, and OUTCOME for 40 patients in ABACUS with ctDNA data. 
    
   
  
    
   
  - 
 
  
    EGAD00001007579 
   
  
    
    ctDNA guiding adjuvant immunotherapy in urothelial carcinoma (Nature 2021). ABACUS ctDNA data used in 2021 paper: ctDNA data include TMB_(mut/Mb), cfDNA_extracted_ng, Plasma_Volume_Used_ml, Sample_Call, Sample_Number_Positive_Calls, Sample_Mean_VAF_In_Plasma (%), Sample_MTM_per_mL_In_Plasma across 40 patients in ABACUS. 
    
   
  
    
   
  - 
 
  
    EGAD00001007580 
   
  
    
    The Genomic DNA Clean & Concentrator kit (ZYMO Research) was used to remove EDTA from the DNA samples. Sample libraries were prepared using 100 ng of input according to the KAPA HyperPlus Kit (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Exomes were captured using the SeqCap EZ MedExome (Roche Nimblegen) according to SeqCap EZ HyperCap Library v1.0 Guide (Roche) with the xGen Universal blockers – TS Mix (Integrated DNA Technologies, Inc.). The amplified captured sample libraries were paired-end sequenced (2x100 bp) on the NovaSeq 6000 platform (Illumina) and aligned to the hg19 reference genome using the Burrows-Wheeler Aligner (BWA), v0.7.15-r1140. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  209 
 
  
    EGAD00001007581 
   
  
    
    Sample libraries were prepped using 500 ng of input RNA according to the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Amplified sample libraries were paired-end sequenced (2x100 bp) on the NovaSeq 6000 platform (Illumina) and aligned against the human genome (hg19) using STAR v2.5.4b2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  209 
 
  
    EGAD00001007582 
   
  
    
    ChIP-seq data were generated for a number of selected patients to investigate changes in enhancer and promoter regions. ChIP was performed as described previously with slight modifications. Briefly, cells were crosslinked with 1% formaldehyde for 10 minutes at room temperature and the reaction was quenched with glycine at a final concentration of 0.125 M. Chromatin was sheared using the Covaris S220 focused-ultrasonicator to an average size of 250–350 bp. A total of 2.5 µg of antibody against H3K27ac (Abcam, ab4729) was added to sonicated chromatin of 2 × 10⁶ cells and incubated overnight at 4 °C. Protein A sepharose beads (GE Healthcare) were added to the ChIP reactions and incubated for 2 h at 4 °C. Beads were washed and chromatin was eluted. After crosslink reversal, RNase A and proteinase K treatment, DNA was extracted with the Monarch PCR & DNA Cleanup kit (NEB). Sequencing libraries were prepared with the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB) according to the manufacturer’s instructions. The quality of dsDNA libraries was analyzed using the High Sensitivity D1000 ScreenTape Kit (Agilent) and concentrations were assessed with the Qubit dsDNA HS Kit (Thermo Fisher Scientific). Libraries were single-end sequenced on a HiSeq 4000 (Illumina). ChIP-seq reads were aligned to the human reference genome build hg19 with bowtie 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  72 
 
  
    EGAD00001007583 
   
  
    
    ATAC-seq data were generated for a number of selected patients to investigate changes in enhancer and promoter regions. ATAC-seq was essentially carried out as described previously. Briefly, prior to transposition the viability of the cells was assessed and 1 × 10⁶ cells were treated in culture medium with DNase I (Sigma) at a final concentration of 200 U ml⁻¹ for 30 minutes at 37 °C. After DNase I treatment, cells were washed twice with ice-cold PBS, and cell viability and the corresponding cell count were assessed. 5 × 10⁴ cells were aliquoted into a new tube and spun down at 500 × g for 5 minutes at 4 °C, before the supernatant was discarded completely. The cell pellet was resuspended in 50 µl of ATAC-RSB buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2) containing 0.1% NP-40, 0.1% Tween-20, and 1% Digitonin (Promega), and was incubated on ice for 3 minutes to lyse the cells. Lysis was washed out with 1 ml of ATAC-RSB buffer containing 0.1% Tween-20. Nuclei were pelleted at 500 × g for 10 minutes at 4 °C. The supernatant was discarded carefully and the cell pellet was resuspended in 50 µl of transposition mixture (25 µl 2× tagment DNA buffer, 2.5 µl transposase (100 nM final; Illumina), 16.5 µl PBS, 0.5 µl 1% digitonin, 0.5 µl 10% Tween-20, 5 µl H2O) by pipetting up and down six times. The reaction was incubated at 37 °C for 30 minutes with mixing before the DNA was purified using the Monarch PCR & DNA Cleanup Kit (NEB) according to the manufacturer’s instructions. Purified DNA was eluted in 20 µl elution buffer (EB) and 10 µl purified sample was subjected to a ten-cycle PCR amplification using Nextera i7- and i5-index primers (Illumina). Purification and size selection of the amplified DNA were carried out with Agencourt AMPure XP beads. For purification the ratio of sample to beads was set to 1:1.8, whereas for size selection the ratio was set to 1:0.55. Purified samples were eluted in 15 µl of EB. 
Quality and concentration of the generated ATAC libraries were analyzed using the High Sensitivity D1000 ScreenTape Kit (Agilent) and libraries were sequenced paired-end on a NovaSeq (Illumina).
ATAC-seq reads were aligned to the human reference genome build hg19 with bowtie2 
    
   
  
    
      
      NextSeq 500 
      
    
   
  64 
 
  
    EGAD00001007585 
   
  
    
    The file data.RData contains data objects needed to run the .rmd template that generates the manuscript's figures. These include ExpressionSets for the training and test sets, module content and annotations, PCA results, the lasso58 signature with coefficients, and Foundation Medicine CDKN2A copy-number alteration data. 
    
   
  
    
   
  1651 
 
  
    EGAD00001007586 
   
  
    
    This submission consists of 15 volunteer FASTQs, split by chemistry:
Chemistry v1.1: 7 FASTQ sets, one set for each volunteer (RA1-7)
Chemistry v2: 16 FASTQ sets, two for each volunteer (RA8-15; FASTQ set 1: Gene Expression (GEX); FASTQ set 2: TCR enrichment (VDJ)) 
    
   
  
    
      
      NextSeq 550 
      
    
   
  7 
 
  
    EGAD00001007587 
   
  
    
    In this prospective study, targeted deep sequencing was performed on a total of 160 primary tumors (474 regions) and 112 lymph nodes from 125 patients with stage I-III lung cancer (LuCaTH). Progressive evolution at clonal divergence scale was observed while specific driver events were positively selected for clonal sweeps during tumor development. Between-region genetic divergence (BRGD) of tumors was assessed and positively correlated with tumor differentiation. A machine learning algorithm was employed to evaluate clinicopathological and molecular parameters of primary tumors underlying lymph node metastasis. By analyzing clonal lineages and metastatic trajectories across multiple nodal stations, we unraveled a common sequential LNM seeding pattern but with divergent modes of clonal spread. 
    
   
  
    
   
  760 
 
  
    EGAD00001007589 
   
  
    
    Contains 30x coverage whole genome sequence data from 40 HIV+ South Africans. Samples sequenced using Illumina HiSeq. BAM files have been uploaded. Sequenced at Edinburgh Genomics. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  40 
 
  
    EGAD00001007591 
   
  
    
    Retinoblastoma is a rare childhood cancer of the retina. We studied retinoblastoma by whole-exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  182 
 
  
    EGAD00001007592 
   
  
    
    Whole genome sequencing for single cells for library A108732A 1139 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007593 
   
  
    
    Whole genome sequencing for single cells for library A108735A 714 cells; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  5 
 
  
    EGAD00001007594 
   
  
    
    Whole genome sequencing for single cells for library A108833B 866 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007595 
   
  
    
    Whole genome sequencing for single cells for library A108842A 1477 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007596 
   
  
    
    Whole genome sequencing for single cells for library A108851B 1681 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007597 
   
  
    
    Whole genome sequencing for single cells for library A108863A 1301 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007598 
   
  
    
    Whole genome sequencing for single cells for library A108870A 1795 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007599 
   
  
    
    Whole genome sequencing for single cells for library A110618A 1184 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007600 
   
  
    
    Whole genome sequencing for single cells for library A110618B 1047 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007601 
   
  
    
    Whole genome sequencing for single cells for library A110621A 1137 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007602 
   
  
    
    Whole genome sequencing for single cells for library A110673A 1089 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007603 
   
  
    
    Whole genome sequencing for single cells for library A110673B 1153 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007604 
   
  
    
    Whole genome sequencing for single cells for library A118830A 912 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007605 
   
  
    
    Whole genome sequencing for single cells for library A118862A 1070 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007606 
   
  
    
    Whole genome sequencing for single cells for library A118862B 1152 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007607 
   
  
    
    Whole genome sequencing for single cells for library A95623A 1897 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007608 
   
  
    
    Whole genome sequencing for single cells for library A95623B 1363 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007609 
   
  
    
    Whole genome sequencing for single cells for library A95668B 1905 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007610 
   
  
    
    Whole genome sequencing for single cells for library A95697B 1233 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  20 
 
  
    EGAD00001007611 
   
  
    
    Whole genome sequencing for single cells for library A95700A 1172 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007612 
   
  
    
    Whole genome sequencing for single cells for library A95703A 740 cells; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  5 
 
  
    EGAD00001007613 
   
  
    
    Whole genome sequencing for single cells for library A95720A 1467 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007614 
   
  
    
    Whole genome sequencing for single cells for library A95730A 739 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007615 
   
  
    
    Whole genome sequencing for single cells for library A96114A 798 cells; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  5 
 
  
    EGAD00001007616 
   
  
    
    Whole genome sequencing for single cells for library A96124B 1241 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007617 
   
  
    
    Whole genome sequencing for single cells for library A96155C 1316 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007618 
   
  
    
    Whole genome sequencing for single cells for library A96162A 1505 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007619 
   
  
    
    Whole genome sequencing for single cells for library A96183B 1191 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007620 
   
  
    
    Whole genome sequencing for single cells for library A96190A 1833 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007621 
   
  
    
    Whole genome sequencing for single cells for library A96226B 1274 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007622 
   
  
    
    Whole genome sequencing for single cells for library A96233A 637 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007623 
   
  
    
    Whole genome sequencing for single cells for library A96233B 1601 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007624 
   
  
    
    Whole genome sequencing for single cells for library A98168A 1294 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007625 
   
  
    
    Whole genome sequencing for single cells for library A98168B 1290 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007626 
   
  
    
    Whole genome sequencing for single cells for library A98234A 1240 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007627 
   
  
    
    Whole genome sequencing for single cells for library A98234B 1600 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007628 
   
  
    
    Whole genome sequencing for single cells for library A98244A 1271 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007629 
   
  
    
    Whole genome sequencing for single cells for library A98253A 1250 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007630 
   
  
    
    Whole genome sequencing for single cells for library A98253B 1388 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007631 
   
  
    
    Whole genome sequencing for single cells for library A98254A 1490 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007632 
   
  
    
    Whole genome sequencing for single cells for library A98255B 1394 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007633 
   
  
    
    Whole genome sequencing for single cells for library A98271A 1190 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007634 
   
  
    
    Whole genome sequencing for single cells for library A98289A 1066 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007635 
   
  
    
    Whole genome sequencing for single cells for library A98290B 869 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007636 
   
  
    
    Whole genome sequencing for single cells for library A98304A 1560 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001007637 
   
  
    
    Whole genome sequencing for single cells for library A98305A 986 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007638 
   
  
    
    Sequencing of libraries enriched for viral particles 
    
   
  
    
      
      NextSeq 550 
      
    
   
  33 
 
  
    EGAD00001007639 
   
  
    
    Sequencing of fecal metagenomic libraries 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  33 
 
  
    EGAD00001007640 
   
  
    
    Whole genome sequencing for single cells for library A96120A 1143 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  11 
 
  
    EGAD00001007641 
   
  
    
    Whole genome sequencing for single cells for library A98172B 941 cells; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001007642 
   
  
    
    We determined intra- and inter-vascular transcriptional heterogeneity within the circulatory system using single-cell RNA-sequencing of 113 CTCs isolated from four key vascular sites along their dissemination route in ten HCC patients. The dataset consists of 146 paired-end fastq files from 113 single CTCs, HuH-7 cell line, Hep3B cell line, white blood cell, tumor and normal bulk tissue. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  146 
 
  
    EGAD00001007644 
   
  
    
    Whole exome sequencing data obtained from sorted leukemic clones and a buccal swab as germline reference. The dataset contains raw sequencing data (paired-end reads) for 3 samples (2 leukemic, 1 buccal swab), for a total of 6 fastq files. 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001007645 
   
  
    
    The Genomic DNA Clean & Concentrator kit (ZYMO Research) was used to remove EDTA from the DNA samples. Sample libraries were prepared using 100 ng of input according to the KAPA HyperPlus Kit (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Exomes were captured using the SeqCap EZ MedExome (Roche Nimblegen) according to SeqCap EZ HyperCap Library v1.0 Guide (Roche) with the xGen Universal blockers – TS Mix (Integrated DNA Technologies, Inc.). The amplified captured sample libraries were paired-end sequenced (2x100 bp) on the NovaSeq 6000 platform (Illumina) and aligned to the hg19 reference genome using the Burrows-Wheeler Aligner (BWA), v0.7.15-r1140. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001007646 
   
  
    
    Sample libraries were prepped using 500 ng of input RNA according to the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Amplified sample libraries were paired-end sequenced (2x100 bp) on the NovaSeq 6000 platform (Illumina) and aligned against the human genome (hg19) using STAR v2.5.4b2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001007647 
   
  
    
    BAM files of unmapped sequencing reads of samples in EGAD00001007002. When used in combination with the corresponding aligned BAM files in dataset EGAD00001007002, they contain all sequencing reads of the samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001007648 
   
  
    
    The dataset contains phenotypes of 14 melanoma biopsies sequenced in connection with the study UV1-hTERT-mm, where the therapeutic cancer vaccine UV1 is combined with ipilimumab in treatment of melanoma patients. 
    
   
  
    
   
  1 
 
  
    EGAD00001007649 
   
  
    
    Exome sequences of three unrelated individuals of South Asian ancestry from the EXCEED study 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001007650 
   
  
    
    A dataset of ER+PR+ breast tumor samples that were analyzed in order to identify mutation enrichment in cis-regulatory elements and cistrome. The repository includes sequencing data from 88 patients. 26/88 were sequenced using ATAC-seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  26 
 
  
    EGAD00001007651 
   
  
    
    Three primary adult glioblastoma specimens were dissociated and nuclei were extracted. A portion of the nuclei was used for single-cell ATAC seq and the remainder were submitted for whole genome sequencing, to provide orthogonal validation of copy number variations in the samples compared to single-cell ATAC seq. Samples were sequenced on the Illumina NovaSeq 6000 in paired-end mode. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001007652 
   
  
    
    R code to run analyses on anonymized data from ABACUS for IMvigor010 ctDNA publication. 
    
   
  
    
   
  - 
 
  
    EGAD00001007653 
   
  
    
    TPM matrices of counts from RNAseq data for IMvigor010 from bulk pre-treatment tumors. 
    
   
  
    
   
  - 
 
  
    EGAD00001007654 
   
  
    
    R code to run analyses on anonymized data from IMvigor010 ctDNA publication. 
    
   
  
    
   
  - 
 
  
    EGAD00001007655 
   
  
    
    This dataset includes all FASTQ files for 11 samples where different capture-based methods for transcriptome profiling have been tested. Specifically, we have the 'traditional' RNA-seq experiment with fresh frozen (FF) material, and 3 different capture methods for the matching formalin-fixed paraffin-embedded samples: Agilent (AGI), IDT, and Twist Biosciences (TBS). 
In total, there are 43 samples with paired-end FASTQ files (1 sample did not have sufficient material to test all methods). 
Samples are identified by the R01-R11 IDs with a suffix that indicates the capture method used (or FF for fresh frozen). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  43 
 
  
    EGAD00001007656 
   
  
    
    Single cell sequencing analysis for Dengue patients using Smart-seq2 and 10X platforms 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  11 
 
  
    EGAD00001007657 
   
  
    
    Targeted sequencing was applied to an unselected population-based follicular lymphoma (FL) cohort  (n=548) diagnosed in the UK's Haematological Malignancy Research Network catchment population of ~4 million (14 centres).
DNA extracted from FL samples was sequenced with a 293-gene panel using the Illumina HiSeq 2500.  All data are provided in the CRAM format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  548 
 
  
    EGAD00001007658 
   
  
    
    Targeted sequencing was applied to an unselected population-based Burkitt lymphoma cohort  (n=39) diagnosed in the UK's Haematological Malignancy Research Network catchment population of ~4 million (14 centres).
DNA extracted from tumour samples was sequenced with a 293-gene panel using the Illumina HiSeq 2500.  All data are provided in the CRAM format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001007660 
   
  
    
    Treg and Tfh cells (8 samples) from the same donors were subjected to ATAC-seq processing. Paired-end FASTQ files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD00001007661 
   
  
    
    Regulatory T cells were FACS-sorted and processed with 10x Genomics Chromium Next GEM Single Cell V(D)J Reagent Kits v1.1 sequencing. In total 17 samples were processed. FASTQ files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  17 
 
  
    EGAD00001007662 
   
  
    
    Treg and Tfh cells (8 samples) from the same donors were subjected to RNA-seq (TruSeq Stranded Total RNA) processing. Single-end FASTQ files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD00001007663 
   
  
    
    T cells were isolated from tissues and FACS-sorted. In total 50 samples were processed in 5 replicates with the Takara SmartSeq Stranded kit. Single-end FASTQ files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  50 
 
  
    EGAD00001007664 
   
  
    
    T cells were isolated from tissues and FACS-sorted. In total 43 samples were processed in 5 replicates with the Takara SmartSeq Stranded kit. Single-end FASTQ files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  43 
 
  
    EGAD00001007665 
   
  
    
    Regulatory T cells were FACS-sorted and processed with 10x Genomics Chromium Next GEM Single Cell 5' v2 sequencing. In total 17 samples were processed. FASTQ files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  17 
 
  
    EGAD00001007666 
   
  
    
    Whole exome sequencing of upper urinary tract urothelial carcinoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  467 
 
  
    EGAD00001007667 
   
  
    
    RNA sequencing of upper urinary tract urothelial carcinoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  166 
 
  
    EGAD00001007669 
   
  
    
    This dataset contains 139 Tumor and Control WGS files for samples for Gerhauser et al., Cancer Cell, 2018, 34:996-1011. The WGS and sequencing protocol was described earlier in Weischenfeldt et al., Cancer Cell, 2013. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  141 
 
  
    EGAD00001007670 
   
  
    
    RNAseq data set of BCL11B, 519 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  506 
 
  
    EGAD00001007671 
   
  
    
    Paired-end BAM files with associated index files of 15 AML samples were generated using the Agilent SureSelect XT library capture system with a custom panel, as described previously (Takahashi et al., Blood 2018). Libraries were sequenced using Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001007672 
   
  
    
    Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics 5' single cell RNA sequencing following manufacturer's protocol (CG000086 User Guide RevG) for chemistry version 1.0 and targeting 10000 cells per sample. Libraries were sequenced using Illumina HiSeq 4000 using recommended cycling parameters. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001007673 
   
  
    
    Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics single cell BCR sequencing following manufacturer's protocol (CG000086 User Guide RevG) for chemistry version 1.0 and targeting 10000 cells per sample. Libraries were sequenced using Illumina NovaSeq 6000 using recommended cycling parameters. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001007674 
   
  
    
    Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics single cell TCR sequencing following manufacturer's protocol (CG000086 User Guide RevG) for chemistry version 1.0 and targeting 10000 cells per sample. Libraries were sequenced using Illumina NovaSeq 6000 using recommended cycling parameters. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001007675 
   
  
    
    Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics single cell ATAC sequencing following manufacturer's protocol (CG000168 User Guide RevB) for chemistry version 1.0 and targeting 10000 nuclei per sample. Libraries were sequenced using Illumina HiSeq 4000 using recommended cycling parameters. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001007676 
   
  
    
    Complete sequence of the mitochondrial DNA of 87 Hessequa-descendant individuals 
    
   
  
    
      
      PacBio RS II 
      
    
   
  87 
 
  
    EGAD00001007677 
   
  
    
    Single-nucleus RNA-sequencing of meningiomas for integrative molecular classification 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001007678 
   
  
    
    Complete reference epigenome (as defined by IHEC) of a SaOS-2 cell line with osteosarcoma. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007679 
   
  
    
    Complete reference epigenome (as defined by IHEC) of a lung epithelial cell line with non-small Cell Lung Adenocarcinoma 
    
   
  
    
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001007680 
   
  
    
    Complete reference epigenome (as defined by IHEC) of normal hTERT RPE1 cell line as well as hTERT RPE1 cell lines engineered to express EPC1-PHF1 and JAZF1-SUZ12 fusion proteins, found in Endometrial Stromal Sarcoma. 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001007682 
   
  
    
    Deep targeted sequencing of 56 genes associated with clonal haematopoiesis and haematological malignancy in peripheral blood-derived DNA from 385 older adults, each sampled 2-5 times over ~13 years. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  1269 
 
  
    EGAD00001007683 
   
  
    
    Deep targeted sequencing of 56 genes associated with clonal haematopoiesis and haematological malignancy in peripheral blood-derived DNA from 11 older adults, each previously sampled 2-5 times over the preceding ~13 years. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007684 
   
  
    
    Whole-genome sequencing of 288 single-cell-derived blood colonies from 3 elderly individuals with clonal haematopoiesis. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001007685 
   
  
    
    HTG EdgeSeq fastq files of bulk baseline tumors from IMbassador250: a randomised phase 3 trial comparing atezolizumab with enzalutamide vs enzalutamide alone in patients with metastatic castration-resistant prostate cancer. 
    
   
  
    
      
      unspecified 
      
    
   
  400 
 
  
    EGAD00001007686 
   
  
    
    Samples encompass primary colorectal tumors, adjacent normal colonic mucosa and metastases from 12 patients, collected by medical pathologists from surgically removed specimens. Normal mucosa samples were taken more than 2 cm away from the tumor. Tissues were embedded in optimal cutting temperature (OCT) medium, snap-frozen in liquid nitrogen within 40 minutes of collection and preserved at -80°C. Samples were collected between June 2010 and October 2017 as part of a prospective biobanking project. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD00001007687 
   
  
    
    We performed a prospective investigation in Sézary syndrome by the application of a standardized multiparameter flow cytometry, FACS-cell sorting, and RNA-sequencing for an in-depth immunophenotypic and transcriptional profiling of Sézary cells. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001007688 
   
  
    
    mRNA capture sequencing data of 57 seminal plasma samples. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  57 
 
  
    EGAD00001007689 
   
  
    
    Low-input RNA-Seq (~200 cells per sample) of CD4+ T cells in patients with kidney transplants, dialysis, or healthy controls after the second dose of Tozinameran COVID-19 vaccine. RNA-Seq was performed using the Smart-Seq v4 ultra-low input protocol. The dataset includes 4 healthy control samples, 3 kidney transplant recipients, and 4 patients undergoing dialysis. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  11 
 
  
    EGAD00001007690 
   
  
    
    ATAC-seq 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001007691 
   
  
    
    Follicular lymphoma (FL) is an indolent cancer of mature B-cells but carries an increased risk of transformation to a more aggressive histology over time. We present here comprehensive profiling of both tumor and immune compartments in 6 diagnostic FL biopsies by single-cell RNA sequencing. This confirmed results from 155 FL tumors characterized by mass cytometry (CyTOF), which revealed two distinct evolutionary trajectories with disparate risk of transformation and alternate biologies. 
    
   
  
    
   
  6 
 
  
    EGAD00001007692 
   
  
    
    Dataset consists of 19 bam files from RNA sequencing experiments batch1 and batch2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  19 
 
  
    EGAD00001007693 
   
  
    
    64 left atrial appendages from patients without atrial fibrillation (AF) undergoing cardiac surgery, patients with paroxysmal AF, and patients with persistent AF (~20 per group). Trizol RNA isolation, rRNA depletion, paired transcriptome sequencing on Illumina NovaSeq 6000. Provided are FastQ and BAM files. Additional data (e.g. clinical characteristics, RIN values etc.) can be provided upon reasonable request. The same RNA samples (62 out of 64) were used for miRNA sequencing. Results from miRNA sequencing are stored in the EGA database managed by the same DAC. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001007695 
   
  
    
    We analyzed the T-cell receptor (TCR) repertoires of twelve kidney transplant recipients, six of whom experienced cellular rejection after transplantation. TCR repertoires of CD4+ and CD8+ T-cells were assessed prior to transplantation and after transplantation at the time of allograft biopsy using RNA-based T-cell receptor beta next-generation sequencing (NGS). In addition, the pre-formed alloreactive TCR repertoire of each kidney transplant recipient was identified using mixed lymphocyte reaction, and donor-reactive T-cells were subjected to TCR beta sequencing. In two of the six patients with cellular rejection, the TCR repertoire of graft-infiltrating T-cells was additionally captured. This dataset comprises a total of 98 samples. NGS TCR beta libraries of all samples were sequenced on an Illumina NextSeq 500, and raw sequencing data (in the form of fastq files) as well as assembled clonotypes and their counts (in the form of clonotype tables) are provided. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  98 
 
  
    EGAD00001007696 
   
  
    
    mRNA capture sequencing data (FASTQ files) of 28 FFPE sarcoma tumors 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001007697 
   
  
    
    mRNA capture sequencing and small RNA sequencing data (FASTQ files) of the exRNAQC study phase 1. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  276 
 
  
    EGAD00001007698 
   
  
    
    Profiling of 24 human anterior cingulate cortex samples by bulk-tissue RNA-sequencing. Samples were derived from 5 non-neurological control individuals and 19 individuals with Lewy body disease (Parkinson’s disease = 7 individuals; Parkinson’s disease with dementia = 6 individuals; dementia with Lewy bodies = 6 individuals). Paired-end FASTQ files for each of the human samples are provided and are denoted by the suffixes R1 (read 1) and R3 (read 2). Fastp (v 0.20.0), a fast all-in-one FASTQ pre-processor, was used for adapter trimming, read filtering and base correction. Fastp default settings were used for quality filtering and base correction. Further details on parameters used are available here: https://github.com/RHReynolds/RNAseqProcessing/blob/master/QC/prealignmentQC_fastp_PEadapters.R. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001007700 
   
  
    
    Lowpass whole genome sequencing of 43 single CTCs and one tumor biopsy 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  44 
 
  
    EGAD00001007701 
   
  
    
   
  
    
   
  - 
 
  
    EGAD00001007702 
   
  
    
    Whole exome sequencing of FNH and two HCC compartments from a single patient, along with matched germline. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001007703 
   
  
    
    This dataset includes full tumor transcriptomes from 891 advanced NSCLC tumors. These data originate from pre-treatment samples from two large randomized clinical trials for second-line non-small cell lung cancer (POPLAR and OAK). The patients in these trials were treated with either the PD-L1 inhibitor atezolizumab or chemotherapy. 
    
   
  
    
      
      unspecified 
      
    
   
  891 
 
  
    EGAD00001007704 
   
  
    
    To estimate the contribution of early embryogenic cell lineages in adult tissues, we performed deep targeted sequencing on 379 bulk tissues from various organs of the five individuals (DB3, DB6, DB8, DB9, DB10). Of the 441 early embryonic mutations targeted, 411 mutations could have high-quality baits designed for them. DNA libraries were prepared by SureSelectXT Library Prep Kit (Agilent), hybridized to the appropriate capture panel, multiplexed on flow cells, and subjected to paired-end sequencing (150-bp reads) on the NovaSeq 6000 platform (Illumina) with a mean ~2,900x depth of coverage for the early mutations. Sequence reads were trimmed and mapped to the human reference genome (GRCh37) using the BWA-MEM algorithm. 
    
   
  
    
      
      unspecified 
      
    
   
  379 
 
  
    EGAD00001007705 
   
  
    
    This is an oral microbiota amplicon dataset derived from adult participants aged over 65 years.
It consists of a total of 491 samples, stored in 982 paired-end FASTQ files with sequence lengths of 200 nucleotides generated with an Illumina MiSeq. Of those 491 samples, 347 were used in analysis; the remaining 144 samples are control samples. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  491 
 
  
    EGAD00001007706 
   
  
    
    WGS TAML samples with their remission controls obtained from bone marrow. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001007707 
   
  
    
    This dataset includes ChIP-seq data for H3K27ac and H3K4me1 on 20 paired samples of colorectal cancer and adjacent normal mucosa. One tumor sample that failed QC is not available. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001007708 
   
  
    
    We performed deep targeted DNA sequencing  with a panel of 134 selected cancer-related genes previously identified to be recurrently mutated in BL/B-AL. A Nextera rapid capture custom kit was designed using the Illumina DesignStudio. For every gene, all regions in which previous mutations had been described were covered. For 40 of these genes, the entire coding sequence was covered. Targeted deep sequencing was performed on a MiSeq-Sequencer using MiSeq Reagent Kit v2 (300 cycle) with 24 samples per run. There are 396 samples in this targetedDNAseq-dataset with 298 tumors (288 patients and 10 cell lines) and 98 normals. There are 132 female, 262 male, and 2 unknown gender samples in this dataset. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  396 
 
  
    EGAD00001007709 
   
  
    
    We collected saliva samples from three nuclear families having 4, 5, and 7 children, respectively. One child in each family had been diagnosed with a pediatric tumor, and neither parent had been diagnosed with cancer. Diagnoses included Wilms tumor, low-grade astrocytoma, and Burkitt’s lymphoma, respectively. We used whole-genome sequencing with a linked-read technology to profile normal cells from each family member. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  22 
 
  
    EGAD00001007710 
   
  
    
    Transcriptomic sequencing on pre-immunotherapy melanoma patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  16 
 
  
    EGAD00001007711 
   
  
    
    Control cohort of lymphoma samples sequenced with a hybrid capture panel designed to be able to detect translocations and mutations in lymphoma samples.
Used in the paper "Robust detection of translocations in lymphoma FFPE samples using Targeted Locus Capture-based sequencing". 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  19 
 
  
    EGAD00001007712 
   
  
    
    This dataset contains two sets of samples. The reference sample set consists of a total of 669 samples that had been reported previously to be euploid by the NIPTIFY screening test. The validation sample set is based on a previously published validation study by Zilina et al. (1), consisting of 423 samples, of which 259 were high-risk pregnancies that had undergone diagnostic invasive prenatal analysis (1). 
All samples were sequenced with Illumina NextSeq 500 platform, producing 85 bp single-end reads with an average per-sample coverage of 0.32× at the University of Tartu, Institute of Genomics Core Facility, according to the manufacturer’s standard protocols, as described previously (1). This study was performed with the approval of the Research Ethics Committee of the University of Tartu (#315/T-13).
1. Zilina O, Rekker K, Kaplinski L, Sauk M, Paluoja P, Teder H, et al. Creating basis for introducing non‐invasive prenatal testing in the Estonian public health setting. Prenat Diagn [Internet]. 2019 Dec 6;39(13):1262-8. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/pd.5578 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1092 
 
  
    EGAD00001007713 
   
  
    
    Paired fastq files (12 pairs, WES) of EGFR treated and untreated PDX models of mCRCs of 2 patients sequenced on Illumina HiSeq 2000; the enrichment kit was Agilent SureSelect V5+UTRs. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001007714 
   
  
    
    Mutations in cancer-associated genes drive tumour outgrowth. However, the timing of driver mutations and dynamics of clonal expansion that lead to human cancers are largely unknown. We used 580,133 somatic mutations from whole-genome sequencing of 1013 clonal haematopoietic colonies to reconstruct the phylogeny of haematopoiesis, from embryogenesis to clinical disease, in 12 patients with myeloproliferative neoplasms which are blood cancers more common in older age. JAK2V617F, the pathognomonic mutation driving the majority of these cancers, was acquired in utero or childhood, with upper estimates of age of acquisition from 33 weeks gestation to 10.8 years, in all 5 patients in whom JAK2V617F was either the only or the first driver event. Driver mutations associated with age-related clonal haematopoiesis occurred prior to or following JAK2V617F, as independent clonal expansions in JAK2V617F-mutated patients, and as large clonal expansions in JAK2V617F-unmutated patients. These mutations were also acquired in utero or childhood, with DNMT3A mutations occurring by 8 weeks of gestation to 7.6 years across 4 patients, and PPM1D mutation occurring by age 5.8 years in a patient with MPN lacking phenotypic driver mutations. Sequential driver mutation acquisition was common, separated by decades across life, and often outcompeted ancestral clones. The mean latency between JAK2V617F acquisition and clinical presentation was 30 years (range 11-54 years). Rates of clonal expansion were inferred from phylogenetic trees and varied substantially (3% to 190% expansion/year), were affected by additional driver mutations, and were predictive of latency to clinical presentation. Driver mutations and rates of expansion would have been detectable in blood one to four decades before clinical presentation. 
This study reveals how driver mutation acquisition early in life with life-long growth and evolution underlie adult myeloproliferative neoplasms, providing opportunities for early detection and intervention and a new paradigm for cancer development. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  1029 
 
  
    EGAD00001007715 
   
  
    
    Mutations in cancer-associated genes drive tumour outgrowth. However, the timing of driver mutations and dynamics of clonal expansion that lead to human cancers are largely unknown. We used 580,133 somatic mutations from whole-genome sequencing of 1013 clonal haematopoietic colonies to reconstruct the phylogeny of haematopoiesis, from embryogenesis to clinical disease, in 12 patients with myeloproliferative neoplasms which are blood cancers more common in older age. JAK2V617F, the pathognomonic mutation driving the majority of these cancers, was acquired in utero or childhood, with upper estimates of age of acquisition from 33 weeks gestation to 10.8 years, in all 5 patients in whom JAK2V617F was either the only or the first driver event. Driver mutations associated with age-related clonal haematopoiesis occurred prior to or following JAK2V617F, as independent clonal expansions in JAK2V617F-mutated patients, and as large clonal expansions in JAK2V617F-unmutated patients. These mutations were also acquired in utero or childhood, with DNMT3A mutations occurring by 8 weeks of gestation to 7.6 years across 4 patients, and PPM1D mutation occurring by age 5.8 years in a patient with MPN lacking phenotypic driver mutations. Sequential driver mutation acquisition was common, separated by decades across life, and often outcompeted ancestral clones. The mean latency between JAK2V617F acquisition and clinical presentation was 30 years (range 11-54 years). Rates of clonal expansion were inferred from phylogenetic trees and varied substantially (3% to 190% expansion/year), were affected by additional driver mutations, and were predictive of latency to clinical presentation. Driver mutations and rates of expansion would have been detectable in blood one to four decades before clinical presentation. 
This study reveals how driver mutation acquisition early in life with life-long growth and evolution underlie adult myeloproliferative neoplasms, providing opportunities for early detection and intervention and a new paradigm for cancer development. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  57 
 
  
    EGAD00001007716 
   
  
    
    The data consist of 3 BAM files. Two of the three BAM files are from tumour FFPE samples (1 repaired FFPE; 1 unrepaired FFPE); the third BAM file is from normal colon tissue. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001007717 
   
  
    
    The data contain whole transcriptome sequencing of 499 Greenlanders. 
    
   
  
    
      
      unspecified 
      
    
   
  499 
 
  
    EGAD00001007718 
   
  
    
    Single-cell gene expression data set (5’ Chromium 10X) of healthy paediatric volunteers, and paediatric and adult COVID-19 patients. Gene expression was determined from samples of nasal, tracheal and bronchial brushings and blood (PBMCs). In addition to gene expression, PBMCs were assayed by CITE-seq. A subset of samples have VDJ sequencing data for T cell receptors (TCR) and B cell receptors (BCR). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  268 
 
  
    EGAD00001007722 
   
  
    
    DNA was extracted from fresh frozen LMS material for 29 untreated tumors (24 primary tumors, 5 metastatic relapses) and 13 tumors treated with radiation (7 primaries, 6 metastatic relapses).
DNA from matched blood was used as a normal reference. Whole-genome sequencing was performed using established protocols on Illumina instruments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  87 
 
  
    EGAD00001007723 
   
  
    
    We included 3 BAM files of the genome sequencing data: 2 of the 3 are from tumour samples, namely 1 repaired FFPE and 1 unrepaired FFPE; the third BAM file is from normal tissue of the FFPE block. There is also a VCF file containing all somatic mutations in the dataset. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001007724 
   
  
    
    Full clinical data for a cohort of 199 individuals with acute coronary syndrome.
Untargeted serum metabolomics using the Metabolon platform for individuals with ACS (n=156).
Serum metabolomics using the Nightingale Health (NMR) platform for individuals with ACS and controls (ACS, n=191; controls, n=961). 
    
   
  
    
   
  1 
 
  
    EGAD00001007725 
   
  
    
    16S rRNA gene V3-V4 region sequenced from 21 saliva samples of BaYaka hunter-gatherers from Congo. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  21 
 
  
    EGAD00001007726 
   
  
    
    16S rRNA gene V3-V4 region sequenced from 148 saliva samples of Agta hunter-gatherers from Philippines. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  148 
 
  
    EGAD00001007727 
   
  
    
    16S rRNA gene V3-V4 region sequenced from 15 saliva samples of farmers from Palanan (Philippines) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  15 
 
  
    EGAD00001007728 
   
  
    
    This dataset includes 406 samples from non-small cell lung cancer patients treated with neoadjuvant anti-PD-1. The Single Cell 5’ V(D)J and 5’ DGE kits (10X Genomics) were used to capture immune repertoire information and gene expression from the same cell in an emulsion-based protocol at the single cell level. Libraries were generated and sequenced on an Illumina NovaSeq instrument using 2x150bp paired-end sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  406 
 
  
    EGAD00001007729 
   
  
    
    Sex, age at recruitment (2014-2018), and birthdate of GCAT Cohort individuals. 
    
   
  
    
   
  19329 
 
  
    EGAD00001007730 
   
  
    
    First 20 principal components of 4988 GCAT Cohort individuals genotyped with the Infinium Multi-Ethnic Global (MEGAEX2) array, with data for chromosomes 1-22. PLINK files with QC and imputed (SHAPEIT+IMPUTE). 
    
   
  
    
   
  4988 
 
  
    EGAD00001007731 
   
  
    
    Disease diagnoses of GCAT Cohort participants obtained from electronic health records (EHR), mainly including the time period from 2012 to 2017. Disease diagnoses are codified in ICD-9, and the position of diagnosis refers to primary/secondary diagnoses (up to 14 secondary diagnoses per visit). The date and origin of the visit are also specified (AP: primary care, UGR: emergency, AH: hospital care, SMA: outpatient medical service, SMH: hospital medical service). 
    
   
  
    
   
  17155 
 
  
    EGAD00001007732 
   
  
    
    Two DNA samples extracted from GM09237 cell line cultured with either normal medium or medium with no folic acid for 5 days were sequenced using the BGISEQ500 platform (BGI whole genome 100 bp paired-end sequencing 60x) as well as PacBio Sequel long read sequencing. 
    
   
  
    
      
      Sequel 
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001007733 
   
  
    
    High-precision human leukocyte antigen (HLA) genotyping is crucial for anti-cancer immunotherapy, but existing tools predicting HLA genotypes using next-generation sequencing (NGS) data are insufficiently accurate. We compared the availability and accuracy of eight HLA genotyping tools (OptiType, HLA-HD, PHLAT, seq2HLA, arcasHLA, HLAscan, HLA*LA, and Kourami) using 1,275 cases from the 1000 Genomes Project data and created a new HLA-genotyping algorithm combining tools. Then, we assessed the new algorithm’s performance in 39 in-house samples with normal whole-exome sequencing (WES) data and polymerase chain reaction–sequencing-based typing (PCR-SBT) results. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  39 
 
  
    EGAD00001007734 
   
  
    
    Whole exome sequencing of cord bloods with activated IL7RA leading to leukemia outgrowth in NSG mice. Four leukemia samples and two corresponding normal controls were sequenced on an Illumina NextSeq 550 sequencer (paired end 2x 150 bp). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001007735 
   
  
    
    Metagenomic sequencing data of human gut microbiome 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  130 
 
  
    EGAD00001007736 
   
  
    
    Our probands A and B are monozygotic twin boys with a clinical diagnosis of severe intellectual impairment, developmental stagnation, and dysphasia. They were diagnosed at the Department of Medical Genetics and Genomics (University Hospital Brno). Parents provided written informed consent, which was approved by the Research Ethics Committee of Masaryk University and the Ethics Committee of University Hospital Brno. Peripheral blood samples were collected in sterile heparinized tubes for cytogenetic analysis. Genomic DNA samples were obtained from 1 ml peripheral blood in EDTA, according to the standard DNA isolation process using the MagNaPure system (Roche Diagnostics, Basel, Switzerland). Quality and quantity were checked using a DeNovix DS-11 Spectrophotometer (DeNovix Inc., Wilmington, DE, USA) and Qubit® 2.0 (Thermo Fisher Scientific, Inc., Waltham, MA, USA). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001007737 
   
  
    
    RNA sequencing from MSTO-211H cell line cultures treated for 72h with vehicle solution, palbociclib 250nM, or abemaciclib 250nM (N = 3, each). RNA-seq prepared using TruSeq Stranded mRNA libraries and sequenced with Illumina HiSeq 4000. Data is in raw fastq format, paired end. Some samples have been split in two lanes, with a final count of 24 fastq files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001007738 
   
  
    
    Whole exome sequencing of 3 patient-derived cell lines and patient blood (N = 6 samples), performed using Agilent SureSelect All Exon V5 and Illumina HiSeq 4000. Data is in raw fastq format (N = 10 fastq pairs, 20 files total), as some samples were split between two lanes. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001007739 
   
  
    
    RNA sequencing from 12 xenografts implanted using MSTO-211H cell line and treated with vehicle solution, cisplatin + pemetrexed, or palbociclib (N = 4, each). RNA-seq prepared using TruSeq Stranded mRNA libraries and sequenced with Illumina HiSeq 4000. Data is in raw fastq format, paired end. Some samples have been split in two lanes, with a final count of 34 fastq files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001007740 
   
  
    
    Approximately 200 ng of high-quality genomic DNA per sample was used for library preparation. DNA libraries were prepared using the Human Core Exome Kit according to the manufacturer’s recommendations (Twist Bioscience, San Francisco, CA, USA) and then sequenced on Illumina NovaSeq 6000 (Illumina, Inc., San Diego, CA, USA). A detailed protocol of WES data processing and variant analysis is available in the Supplementary data. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001007741 
   
  
    
    Samples from 17 NHL patients treated with CD19 CAR-T were sequenced using 10x Genomics. Refer to the manuscript supplementary tables and "https://github.com/hwanglab/hwanglab_2021_tigitCarT" for the sequencing sample sheets, the patient clinical information, processed data, and source code. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  109 
 
  
    EGAD00001007742 
   
  
    
    Cytogenetic analysis of the probands showed normal male karyotypes (46,XY). Microarray analysis on an oligonucleotide 180K CGH+SNP microarray platform was then performed and detected an 8q24.23q24.3 duplication (694 kb) in both probands. Family-based real-time PCR confirmed this CNV in both probands and their unaffected mother. Based on the information obtained from the databases mentioned above, it was classified as likely benign. 
    
   
  
    
      
      Illumina Genome Analyzer 
      
    
   
  2 
 
  
    EGAD00001007743 
   
  
    
    DNA samples were extracted from peripheral blood lymphocytes using a commercially available kit (Puregene Core Kit A, Qiagen) according to the manufacturer's protocol. The Agilent SurePrint G3 Human CGH Microarray 180K platform was used for screening of copy number aberrations (CNAs) using the array-CGH protocol recommended by the manufacturer (Agilent Technologies). Data mining and interpretation of array-CGH results were performed in the same manner as in our previously published results. 
    
   
  
    
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001007744 
   
  
    
    Five edited and two unedited organoid clones with one clone prior to editing were paired-end whole genome sequenced using the Illumina NovaSeq 6000 system. The reads were mapped to the hg38 genome assembly and data is provided as BAM files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001007745 
   
  
    
    35 samples from individuals with colorectal tumors, exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001007746 
   
  
    
    36 samples from individuals with colorectal tumors, whole genome sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  36 
 
  
    EGAD00001007747 
   
  
    
    48 samples of individuals with rare germline variants in familial multiple myeloma, whole genome sequencing 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007748 
   
  
    
    This dataset includes FASTQ files of low coverage whole genome sequencing of cell free DNA from plasma samples. The samples include 271 plasma samples of patients with an adnexal mass and 125 plasma samples of healthy individuals. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  396 
 
  
    EGAD00001007749 
   
  
    
    WES: (7 samples, BAM files), scRNA-Seq (4 samples, BAM files), TruSight (56 samples, BAM or FASTQ files) 
    
   
  
    
   
  67 
 
  
    EGAD00001007750 
   
  
    
    The dataset contains 2x75bp paired-end sequencing data in DNASE1L3-deficient human subjects. We performed bisulfite sequencing of plasma samples from three DNASE1L3-deficient subjects and one heterozygous parent to investigate how nuclease deficiencies alter plasma cell-free DNA methylation profiles. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
      Illumina HiSeq 4000 
      
    
   
  7 
 
  
    EGAD00001007751 
   
  
    
    The dataset contains sequencing data from wildtype, Dnase1-deficient and Dnase1l3-deficient mice. We performed 2 x 75bp paired-end whole genome bisulfite sequencing of pooled plasma cell-free DNA (cfDNA) and buffy coat genomic DNA. The effects of DNASE1L3 or DNASE1 deficiency on cfDNA methylation were explored in plasma of mice deficient in these nucleases. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  36 
 
  
    EGAD00001007752 
   
  
    
    This dataset contains two experiments. 
1) Single cell RNA-seq of peripheral blood diagnostic samples from patients with MLL-rearranged infant ALL that underwent relapse or not (samples ending in R relapsed, samples ending in N did not), sequenced with SORT-seq (see Cell Systems, 2016, doi:10.1016/j.cels.2016.09.002). For some of the patients, multiple independent plates were produced (each plate is a sample). Barcode-well correspondence can be found here: https://bitbucket.org/princessmaximacenter/sharq/src/master/data/celseq2_bc384-v4.csv .
2) Single cell RNA-seq of peripheral blood diagnostic samples from patients with MLL-rearranged infant ALL that underwent relapse or not (samples ending in R relapsed, samples ending in N did not), sequenced with 10x Genomics Version 2. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  26 
 
  
    EGAD00001007753 
   
  
    
    This data set includes bam files of WGS of 36 paired lymphomas in immune-privileged sites and normal controls. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  72 
 
  
    EGAD00001007754 
   
  
    
    RNA-sequencing on neuroblastoma PDX model COG-N-519 treated with control miR-1283 and test miR-99b-5p mimics. Three samples from each treatment condition were analysed. The NextSeq platform was used for sequencing. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001007755 
   
  
    
    Dataset from patients with retinal dystrophies. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  93 
 
  
    EGAD00001007756 
   
  
    
    OV2295-052021 dataset 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001007758 
   
  
    
    Shallow WGS of neuroblastoma cell lines with large-scale deletions induced through CRISPR-Cas9 and matching controls. Deletion of 11q was induced in the cell line SKNSH and loss of 6q was induced in the cell line NMB. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  13 
 
  
    EGAD00001007759 
   
  
    
    Raw sequencing reads of ATAC-seq of spermatogonia in FASTQ format, comprising 6 samples sequenced on the Illumina HiSeq 4000 platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001007760 
   
  
    
    This dataset contains BAM files mapped to hg19, derived either from primary bone marrow cells or from sorted human cells after long-term engraftment in NSG mice. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  229 
 
  
    EGAD00001007761 
   
  
    
    This file contains read identifiers for local CCS, CLR, ONT and MGI reads for each of the eight selected genomic regions (HLA, KIR, IGH, IGK, IGL, TRA, TRD, and TRG). We extracted these reads by aligning whole-genome sequencing data to a draft whole-genome de novo assembly, and selecting reads that map to contigs representing each region. These reads were involved in the polishing and validation of the HV31 assembly. Please refer to the relevant manuscript (https://doi.org/10.1101/2021.02.03.429586) for additional details. Read identifiers are stored in JSON format. Along with the full FASTQ files, this file enables convenient re-analysis of the HV31 sequencing data in the eight selected regions. 
    
   
  
    
   
  1 
 
  
    EGAD00001007762 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  82 
 
  
    EGAD00001007763 
   
  
    
    This dataset includes genome-wide autosomal array data and whole mtDNA sequences for 24 Merchero individuals. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  24 
 
  
    EGAD00001007764 
   
  
    
    Here we present the 1M-scBloodNL study, in which we performed single-cell RNA-seq on 120 individuals of the Northern Netherlands population cohort Lifelines. For each individual peripheral blood mononuclear cells (PBMCs) were sequenced in an unstimulated condition, and after 3 and 24 hour in vitro stimulation with C. albicans (CA), M. tuberculosis (MTB) and P. aeruginosa (PA), totalling approximately 1.3 million cells. scRNA-seq was conducted with the 10X Genomics 3'-end v2 (72 libraries) and v3 (33 libraries) technology. In general, each library contains PBMCs from 8 donors and 2 different stimulation-timepoint combinations. Donors were demultiplexed using a combination of SoupOrCell (https://www.nature.com/articles/s41592-020-0820-1) and genotype information to assign the correct donor to a donor-specific cell cluster. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  988 
 
  
    EGAD00001007765 
   
  
    
    Cellular suspensions (∼15000 cells, with expected recovery of ∼7500 cells) of sorted CD45+ HLA-DR+ CD14+ macrophages from colonic mucosa and muscularis propria were loaded on the 10X Chromium Controller instrument (10X Genomics) according to the manufacturer’s protocol using the 10X GEMCode proprietary technology. All samples from individual patients
were loaded in one batch. The Chromium Single Cell 3´ v2 Reagent kit (10X Genomics) was used to generate the cDNA and prepare the libraries, according to the manufacturer’s protocol.
The libraries were then equimolarly pooled and sequenced on an Illumina NextSeq500 using HighOutput flow cells v2.5. A coverage of 400M reads per sample was targeted, in order to obtain 50 000 reads per cell. The raw data were then demultiplexed and processed with the Cell Ranger software (10X Genomics) v2.1.1. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001007766 
   
  
    
    52 human placenta samples: 5 first-trimester, 7 second-trimester, and 40 term placentas. Data are uploaded as BAM files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  52 
 
  
    EGAD00001007767 
   
  
    
    Exome sequencing was carried out in a tall male (height 3.5 SDS) and his parents (3 samples). The samples were sequenced on an Illumina HiSeq 2000 and the library was prepared with Agilent SureSelect V4. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001007768 
   
  
    
    We profiled transcriptome and epigenome of BMP signaling effects on H3.3K27M DIPG. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  49 
 
  
    EGAD00001007769 
   
  
    
    This dataset includes 914 BAM files from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using single-cell reduced-representation bisulfite sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  914 
 
  
    EGAD00001007770 
   
  
    
    This dataset includes 60 BAM files from HF2354, HF3016 glioblastoma cell lines subjected to continuous stress (hypoxia, 3-day and 9-day), stress followed by recovery (irradiation, 4-day stress exposure and 5-day recovery), and no stress/normoxia controls and profiled using reduced-representation bisulfite sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  60 
 
  
    EGAD00001007771 
   
  
    
    This dataset includes 22 BAM files for tumor tissue and matched normal blood from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using whole genome sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001007772 
   
  
    
    This dataset includes paired-end fastq files from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using single-cell RNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001007773 
   
  
    
    This dataset includes genome-wide autosomal array data for 11 Iberian Roma individuals used in the Merchero project. 
    
   
  
    
   
  11 
 
  
    EGAD00001007774 
   
  
    
    This dataset contains genotypes (35.4M SNVs, indels and SVs) from 785 samples, after QC filtering, from the 808 WGS GCAT cohort. 
    
   
  
    
   
  785 
 
  
    EGAD00001007775 
   
  
    
    Intrahepatic cholangiocyte organoid clone from patients with chronic alcohol consumption, NASH (nonalcoholic steatohepatitis), and PSC (primary sclerosing cholangitis) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  19 
 
  
    EGAD00001007776 
   
  
    
    This dataset contains whole blood transcriptome data generated from 93 patients with COVID-19 across a range of severities and 23 healthy controls. All patients were PCR positive for SARS-CoV-2 and disease severity ranged from asymptomatic to severe disease requiring ventilation. Individuals without symptoms, or with mild symptoms, were recruited from routine screening of healthcare workers, while COVID-19 patients were recruited at or soon after admission to Addenbrooke’s or Royal Papworth hospitals. Blood samples were taken at recruitment and then again four weeks later. Further details of the cohort and the generation of the RNA-Sequencing data can be obtained from Bergamaschi, L. et al. Longitudinal analysis reveals that delayed bystander CD8+ T cell activation and early immune pathology distinguish severe COVID-19 from mild disease. Immunity 54, 1257-1275 e8 (2021). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  768 
 
  
    EGAD00001007777 
   
  
    
    This dataset contains multiplexed fastq files containing raw BCR repertoire data 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  11 
 
  
    EGAD00001007778 
   
  
    
    This dataset contains samples from 9 patients with alveolar rhabdomyosarcoma. 9 samples have whole exome data (one has multiple). 6 samples have RNAseq data (one has multiple). 6 samples have matched normal DNA sequence data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001007780 
   
  
    
    Whole genome sequencing of sick children in neonatal and paediatric intensive care units, aligned to reference assembly GRCh37. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001007782 
   
  
    
    Glioblastoma patient samples were acquired before and after standard chemoradiation or standard chemoradiation+TTFields (Novo-TTF100) treatment. The set includes paired before-after samples of 6 control patients (chemoradiation) and 6 TTFields+chemoradiation patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001007783 
   
  
    
    The PGDP dataset includes 58 whole genome sequences for Papua New Guinean individuals from different locations. DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGDP dataset provides FASTQ and BAM files. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  58 
 
  
    EGAD00001007785 
   
  
    
    Data from 496 OCCAMS (Oesophageal Cancer Clinical And Molecular Stratification) cases.
WGS
BAM files
496x oesophageal adenocarcinoma samples
496x normal samples 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001007786 
   
  
    
    This dataset contains the genotypes jointly called from whole genome sequencing data of 177 self-reported Peranakans in Singapore. Reads were aligned to the GRCh37 reference genome and jointly called with other WGS samples. Basic quality control measures and population phasing without a reference were performed on the called genotypes. The data are stored in VCF v4.3 format, and each .vcf.gz file stores the genotypes for one of the 23 chromosomes (22 autosomes + X chromosome). 
    
   
  
    
   
  177 
 
  
    EGAD00001007787 
   
  
    
    The study prospectively enrolled patients admitted for HF with LV ejection fraction (LVEF) ≥ 50% and LV wall thickness <12 mm. TTR cardiac amyloidosis was diagnosed according to accepted criteria, which include positive cardiac 99-Tc-DPD scintigraphy in the absence of monoclonal protein expansion in blood. In a cohort of patients with HFpEF without LVH, the prevalence of TTR cardiac amyloidosis was 5%. Transthyretin gene sequencing was performed in positive patients. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001007788 
   
  
    
    Patient-derived samples were profiled using 10X genomics single-cell CNV and single-cell ATAC kits. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  10 
 
  
    EGAD00001007789 
   
  
    
    SmMIP libraries using cord blood DNA were generated in replicates and were sequenced on the NovaSeq SP platform (Illumina) 
    
   
  
    
   
  16 
 
  
    EGAD00001007790 
   
  
    
    SmMIP libraries using bulk cell line DNA and DNA mixes were generated in replicates and were sequenced on the NovaSeq SP platform (Illumina) 
    
   
  
    
   
  44 
 
  
    EGAD00001007791 
   
  
    
    SmMIP libraries using DNA from patients diagnosed with myeloid malignancies were generated in replicates and were sequenced on the NovaSeq SP platform (Illumina) 
    
   
  
    
   
  336 
 
  
    EGAD00001007792 
   
  
    
    Gallbladder carcinoma is the most common cancer of the biliary tract, with dismal survival largely due to delayed diagnosis. Biliary tract intraepithelial neoplasia (BilIN) is a common benign tumor that is suspected to be a precancerous lesion. However, the genetic and evolutionary relationships between BilIN and carcinoma remain unclear. Here we performed whole-exome sequencing of coexisting low-grade BilIN (adenoma), high-grade BilIN, and carcinoma lesions, and normal tissues from the same patients. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  44 
 
  
    EGAD00001007793 
   
  
    
    Somatic mutations of RUNX1, which encodes the myeloid and lymphoid transcriptional factor RUNX1, are common in both B- and T- acute lymphoid leukemia (ALL) and are associated with poor prognosis of T-ALL. However, there has been no comprehensive investigation of the pattern or prevalence of RUNX1 germline mutation in both B- and T-ALL. Here we report germline RUNX1 variants in 1.23% of B-ALL and 2.11% of T-ALL, identifying 31 unique variants in 62 B-ALL and 18 unique variants in 26 T-ALL children. The majority of frameshift and nonsense variants affected RUNX1 function in transcriptional regulation, hematopoiesis, and cellular proliferation. We identified JAK3 as the most frequent somatic mutation in T-ALL with RUNX1 variants. These results not only identify RUNX1 as a leukemia predisposition gene but also further underline the importance of germline genetic variants to the development of ALL 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001007794 
   
  
    
    scGBS is a single-cell sequencing-based methodology to haplotype and copy-number profile single cells. Genomic size and complexity are reduced through restriction enzyme digestion and DNA is genotyped through sequencing of the restriction fragments. scGBS data serves as the input for haplarithmisis, an algorithm we previously developed for SNP array-based single-cell haplotyping (Zamani Esteki et al., 2015). We established technical parameters and developed an analysis pipeline enabling accurate concurrent haplotyping and copy-number profiling of single cells with the use of a HapMap cell line pedigree (7 single cells). For clinical validation of the methodology, a total of 14 single blastomeres and 3 trophectoderm biopsies from human preimplantation embryos for 6 PGT-M families, previously haplotyped via SNP array, were processed with scGBS. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  49 
 
  
    EGAD00001007796 
   
  
    
    The dataset for "Detection and characterization of lung cancer using cell-free DNA fragmentomes" includes 872 BAM files from whole genome next-generation sequencing on the Illumina HiSeq 2500. The samples analyzed include plasma samples from healthy individuals and patients with cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  872 
 
  
    EGAD00001007799 
   
  
    
    Analysis of RAD51C promoter methylation using targeted bisulfite sequencing (amplicon sequencing) in ovarian cancer pre-clinical models and patient samples. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  20 
 
  
    EGAD00001007800 
   
  
    
    To better understand variation in metastatic prostate cancer behaviour, we assembled and analyzed longitudinal clinical and autopsy records in 33 men. The dataset is contained in a self-explanatory Excel Workbook, with each patient identified as A1, A2, etc. as listed in the "Combined longitudinal clinical and autopsy phenomic assessment in lethal metastatic prostate cancer: recommendations for advancing precision medicine" publication in European Urology Open Science. Please see Jasu J, Tolonen T, Antonarakis ES, Beltran H, Halabi S, Eisenberger MA, Carducci MA, Loriot Y, Van der Eecken K, Lolkema M, Ryan CJ, Taavitsainen S, Gillessen S, Högnäs G, Talvitie T, Taylor RJ, Koskenalho A, Ost P, Murtola TJ, Rinta-Kiikka I, Tammela T, Auvinen A, Kujala P, Smith TJ, Kellokumpu-Lehtinen PL, Isaacs WB, Nykter M, Kesseli J, Bova GS. Combined Longitudinal Clinical and Autopsy Phenomic Assessment in Lethal Metastatic Prostate Cancer: Recommendations for Advancing Precision Medicine. Eur Urol Open Sci. 2021 Jul 2;30:47-62. doi: 10.1016/j.euros.2021.05.011. PMID: 34337548; PMCID: PMC8317817. for more details. 
    
   
  
    
   
  33 
 
  
    EGAD00001007801 
   
  
    
    FASTQ files generated during targeted sequencing of a 10 Mb genomic region surrounding the top GWAS hits in a subset of 86 individuals in case families. Paired-end sequencing was performed on Illumina NextSeq. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  86 
 
  
    EGAD00001007803 
   
  
    
    Whole-exome sequencing of IMFT tumor samples from 24 participants in the clinical phase II trial EORTC 90101 “CREATE” (CREATE IMFT cohort) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001007804 
   
  
    
    Whole-genome sequencing of IMFT tumor samples from 24 participants in the clinical phase II trial EORTC 90101 “CREATE” (CREATE IMFT cohort) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001007805 
   
  
    
    Mutational landscape of high-grade B-cell lymphoma with MYC-, BCL2 and/or BCL6 rearrangements characterized by whole-exome sequencing and panel sequencing. 
    
   
  
    
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  73 
 
  
    EGAD00001007806 
   
  
    
    26 Tumor/Control pairs of WGS data of PCNSL tumors, sequenced on either Illumina HiSeq2000/2500 instruments or HiSeq X Ten. The controls are blood or buffy coat samples in most cases. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  52 
 
  
    EGAD00001007807 
   
  
    
    Paired-end WGS data of 27 neuroblastoma patient samples (10 obtained at diagnosis, 6 at relapse and 11 matched blood samples as controls) used for detection of complex "seismic" amplification. Mean coverage is 24-55x per sample. The remaining patient samples of the dataset can be found under accession number EGAS00001001308. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  27 
 
  
    EGAD00001007808 
   
  
    
    Data supporting "Interplay of processes shapes structural variations undergoing selection in oesophageal adenocarcinoma" Ng, Contino et al.
WGS (BAM files)
383 oesophageal adenocarcinoma samples
383 normal samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001007809 
   
  
    
    Data supporting: "Interplay of processes shapes structural variations undergoing selection in oesophageal adenocarcinoma" Ng, Contino et al.
RNAseq (BAM files)
214 oesophageal adenocarcinoma samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001007810 
   
  
    
    Paired WGS samples, 24 tumor/control pairs of primary CNS lymphoma, sequenced on HiSeq X Ten using Illumina TruSeq Nano DNA for library preparation. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  24 
 
  
    EGAD00001007811 
   
  
    
    Primary lymphomas of the central nervous system (PCNSL) are diffuse large B-cell lymphomas (DLBCLs) which are confined to the central nervous system (CNS). Paired RNA-Seq sequencing was done on Illumina HiSeq2000 machines using Illumina TruSeq RNA library preparation kit. About 36 tumor samples were sequenced. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001007812 
   
  
    
    We analyzed 34 AGCTs (19 primary and 15 recurrent) and the KGN cell line by RNA-Seq. Our cohort comprised 3 AGCTs WT for FOXL2, 28 heterozygous and 3 homo/hemizygous for the pathogenic variant. Fresh-frozen AGCTs were selected from OVCARE’s Gynecological Tissue Bank in Vancouver, Canada for bulk RNA-seq. RNA was extracted from frozen tissue, and sections adjacent to the scrolls submitted for RNA-seq were stained with hematoxylin and eosin (H&E) to evaluate tumour cell purity. Cases with >80% tumour cell purity were selected for sequencing, with the majority of cases (29 of 34 patients) containing >90% tumour cells. Ribodepleted RNA libraries were constructed and paired-end sequencing (125 base pair reads) was performed. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  35 
 
  
    EGAD00001007813 
   
  
    
    RNAseq on 20 samples of multiple myeloma patients and 3 normal plasma cells. RNAseq was performed using 200 ng of total RNA by GATC Biotech. Directional libraries were prepared after polyA selection of mRNA using the UTP method. RNA-seq libraries were sequenced on an Illumina HiSeq 2500 machine using 100bp paired-end reads. Read alignment was performed using the STAR aligner (version 2.4.0f1) with human genome hg19 as the reference. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001007814 
   
  
    
    This dataset contains ATAC sequencing of plasma cells from multiple myeloma (MM) patients. The data was used to investigate genotype-specific chromatin accessibility quantitative trait locus (caQTL) using caQTLseg (https://github.com/abhisheknrl/caQTLseg). This dataset contains 161 bam files. 
    
   
  
    
      
      unspecified 
      
    
   
  161 
 
  
    EGAD00001007815 
   
  
    
    Genotyped data for 28,022 British individuals with South Asian ancestry from the Genes and Health cohort (Feb2020), which were imputed with the GenomeAsia pilot reference panel. 
    
   
  
    
   
  28022 
 
  
    EGAD00001007816 
   
  
    
    This dataset contains whole exome sequencing (WES) data (various enrichment methods) from tumor DNA samples of various pediatric cancer entities. Files are provided in fastq format. Samples were sequenced on a Novaseq6000 or Hiseq2500 (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001007817 
   
  
    
    This dataset contains 538 Tumor and Control WGS and WES files for samples already submitted and published in study EGAS00001004276 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  400 
 
  
    EGAD00001007818 
   
  
    
    Some data were previously submitted under study number EGAS00001004276. In this new dataset we provide additional WGS and Avenio Surveillance Panel data. We utilized 43 ALK+ NSCLC patients receiving targeted ALK therapy to evaluate ctDNA levels based on matched panel-based targeted next generation sequencing (tNGS) and untargeted shallow whole genome sequencing (sWGS). For the Avenio panel the sequencing was done on Illumina NextSeq 550 paired end 150 bp; for WGS the sequencing was done on Illumina HiSeq 4000, partly with KAPA_Hyper_Prep_Kit. In this dataset there are 132 WGS tumor samples and 134 panel sequencing datasets from plasma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 550 
      
    
   
  266 
 
  
    EGAD00001007819 
   
  
    
    Capture lncRNA and totalRNA sequencing of various sample types (including plasma, FFPE and high quality RNA). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001007820 
   
  
    
    RNAseq of liver organoids with a dG genotype (B20, nt115, U15) and with a TT genotype (nt5, U16, U19) was performed, which was used to study the impact of IFNλ4 on the cellular response to Sendai viral infection. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  18 
 
  
    EGAD00001007821 
   
  
    
    Whole genome bisulfite sequencing on 10 multiple myeloma cases. Quality control and adaptor trimming were performed with the Trimmomatic tool. Paired reads were mapped to the hg19 human reference with the methylCtools aligner. 
    
   
  
    
      
      Illumina Genome Analyzer 
      
    
   
  1 
 
  
    EGAD00001007822 
   
  
    
    Enhanced reduced representation bisulfite sequencing (eRRBS) on 45 multiple myeloma samples and 3 normal plasma cell samples. Libraries were sequenced on an Illumina HiSeq 4000 machine using 75bp paired-end reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001007824 
   
  
    
    Whole exome sequencing data (bam files) of 55 samples of myxofibrosarcoma and 44 matched pairs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  99 
 
  
    EGAD00001007825 
   
  
    
    Myxofibrosarcoma (MFS) is a rare subtype of sarcoma in the elderly, whose genetic basis is poorly understood. To elucidate it, whole genome sequencing was performed. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001007826 
   
  
    
    Myxofibrosarcoma (MFS) is a rare subtype of sarcoma in the elderly, whose genetic basis is poorly understood. To elucidate it, targeted-capture sequencing was performed. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  108 
 
  
    EGAD00001007827 
   
  
    
    Cryopreserved PBMCs from 10 individuals before and after vaccination were used to perform single cell RNA sequencing. Equal number of cells per individual were pooled together (5 individuals per pool) and single-cell RNA sequencing was performed in paired-end mode on NovaSeq 6000 (Illumina) with a depth of 50,000 reads per cell. DNA was isolated from PBMCs and then used for genotyping by Illumina GSA Beadchip. This dataset contains the fastq sequence files, genotypes of the donors used for demultiplexing the pools and files indicating the linkages between individuals, pools and fastq files.
The number of samples listed by EGA does not match the actual number of samples due to limitations on the upload scheme used. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001007828 
   
  
    
    661 bam files generated from high-throughput RNAseq of tumour biopsies from colorectal cancer patients 
    
   
  
    
      
      NextSeq 500 
      
    
   
  661 
 
  
    EGAD00001007829 
   
  
    
    This data set contains BAM files of the RNAseq analysis for three SCCOHT patient tumors. Total mRNA was isolated from fresh frozen tumor samples. RNA sequencing was performed using Illumina HiSeq 4000, paired end 150 bp. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  3 
 
  
    EGAD00001007830 
   
  
    
    Total collection of Samples. Exome sequencing and RNAseq from Mongolia and Western HCC samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  550 
 
  
    EGAD00001007831 
   
  
    
    Samples are from patients enrolled in an international multicentric study aiming to define the genetic determinants of recurrence of membranous nephropathy in the kidney graft. They include 248 samples from patients with MN including 105 patients who received a graft, their 105 graft donors, and 192 controls, all of Caucasian origin. Files from targeted capture of HLA and PLA2R loci are available as FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  545 
 
  
    EGAD00001007832 
   
  
    
    Basic phenotypes for BRACOVID cohort. 
    
   
  
    
   
  348 
 
  
    EGAD00001007833 
   
  
    
    Lab values for BRACOVID cohort. 
    
   
  
    
   
  234 
 
  
    EGAD00001007834 
   
  
    
    Basic phenotypes for BelCovid2 cohort. 
    
   
  
    
   
  392 
 
  
    EGAD00001007835 
   
  
    
    Lab values for BelCovid2 cohort. 
    
   
  
    
   
  262 
 
  
    EGAD00001007836 
   
  
    
    Basic phenotypes for GEN_COVID cohort. 
    
   
  
    
   
  1141 
 
  
    EGAD00001007837 
   
  
    
    Lab values for GEN_COVID cohort. 
    
   
  
    
   
  739 
 
  
    EGAD00001007838 
   
  
    
    Basic phenotypes for Hostage1 cohort. 
    
   
  
    
   
  847 
 
  
    EGAD00001007839 
   
  
    
    Lab values for Hostage1 cohort. 
    
   
  
    
   
  847 
 
  
    EGAD00001007840 
   
  
    
    Basic phenotypes for Hostage2 cohort. 
    
   
  
    
   
  306 
 
  
    EGAD00001007841 
   
  
    
    Lab values for Hostage2 cohort. 
    
   
  
    
   
  306 
 
  
    EGAD00001007842 
   
  
    
    Basic phenotypes for Hostage3 cohort. 
    
   
  
    
   
  71 
 
  
    EGAD00001007843 
   
  
    
    Lab values for Hostage3 cohort. 
    
   
  
    
   
  71 
 
  
    EGAD00001007844 
   
  
    
    Basic phenotypes for Hostage4 cohort. 
    
   
  
    
   
  121 
 
  
    EGAD00001007845 
   
  
    
    Lab values for Hostage4 cohort. 
    
   
  
    
   
  121 
 
  
    EGAD00001007846 
   
  
    
    Basic phenotypes for INMUNGEN_CoV2 cohort. 
    
   
  
    
   
  367 
 
  
    EGAD00001007847 
   
  
    
    Lab values for INMUNGEN_CoV2 cohort. 
    
   
  
    
   
  37 
 
  
    EGAD00001007848 
   
  
    
    Basic phenotypes for SPGRX cohort. 
    
   
  
    
   
  364 
 
  
    EGAD00001007851 
   
  
    
    Age-related loss of function in the human haematopoietic system is well documented, manifesting as reduced regenerative capacity, age-related cytopenias and immune dysfunction. However, the cellular and population level changes that underpin both this functional decline and the increased risk of clonal haematopoiesis and blood cancer in the elderly remain elusive. Here we performed whole genome sequencing on >3350 single haematopoietic stem cell / multipotent progenitors (HSC/MPP) derived colonies across 10 haematologically normal subjects aged 0 to 81. We found that HSC/MPPs accumulated 17 single nucleotide variants per year post birth and had a reduction in telomere length of 50bp per year throughout young adult life. We reconstructed phylogenies of the sampled HSC/MPPs to interrogate changes in clonal dynamics through life. Haematopoiesis in adults aged less than 65 was predominantly polyclonal, with few known driver mutations. In contrast, individuals aged over 75 displayed a profound change in clonal structure, with frequent clonal expansions, many unexplained by known driver mutations. The ratio of non-synonymous to synonymous mutations revealed widespread positive selection, estimating around 1000 driver mutations in the dataset (10-fold more than the number of known drivers). We identified novel genes ZNF318 and HIST2H3D as being under positive selection, despite not being enriched in myeloid malignancies. Our data show that HSC clonal dynamics is more complex than previously thought. One implication is that by old age, the majority of HSCs carry at least one of a number of largely undescribed driver mutations, which may underlie aspects of their functional decline. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  3601 
 
  
    EGAD00001007853 
   
  
    
    We have performed single cell RNA-sequencing for infant and childhood B-cell acute lymphoblastic leukemias as well as infant acute myeloid leukemias at diagnosis. The sequencing was performed with 10X Chromium single cell 3’ and 5’ chemistry. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001007854 
   
  
    
    We have performed single cell RNA-sequencing for infant and childhood B-cell acute lymphoblastic leukemias as well as infant acute myeloid leukemias at diagnosis. The sequencing was performed with 10X Chromium single cell 3’ and 5’ chemistry. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001007856 
   
  
    
    The dataset consists of 
- 126 whole exome sequencing analyses (SAMD9/9Lmut: 64; GATA2mut: 24; MDS wildtype: 38/471) performed using SureSelect Human All Exon V6 enrichment (Agilent, cat# 5190-8863). The generated libraries were sequenced on the Illumina HiSeq 2500 with 150 bp paired-end reads. FASTQ files were processed using the SeqNext platform (JSI medical system, Germany), with gene-based alignment to a virtual panel of 300 genes (including 28 MDS-associated genes, SAMD9, and SAMD9L), consisting of genes relevant to bone marrow failure, MDS predisposition, and hematological cancers as per the Pan-Cancer studies with cohorts of >10,000 cancers. The respective BAM files are provided. 
- A custom panel targeting SAMD9, SAMD9L, and 22 single nucleotide polymorphisms (SNPs) on chromosome 7q (allele frequency >35% in all ethnic sub-populations in gnomAD) (Ampliseq #IAD104171) was performed in 666/669 cases. A custom panel targeting 28 MDS-associated genes (GATA2, RUNX1, HOXA9, CEBPA, GATA1, KRAS, NRAS, CBL, PTPN11, ASXL1, EZH2, SETBP1, FLT3, KIT, JAK2, JAK3, CSF3R, MPL, SH2, BCOR, BCORL1, RAD21, STAG2, CTCF, TP53, PTEN, CALR, VPS45) was performed in 544 cases (Ampliseq #IAD51150). Both custom panel libraries were prepared using the NEBNext Ultra II DNA library prep kit (New England BioLabs, cat#E7645S/L) per the manufacturer’s instructions, and samples were sequenced on an Illumina MiSeq 2000 with 2 x 150 bp reads. The respective BAM files are provided.
- 4 SAMD9/9L patients were subjected to a MissionBio custom single-cell panel (CO-112) targeting 250 heterozygous gnomAD population polymorphisms on the 7q arm and 69 amplicons in SAMD9/9L and other cancer genes. All libraries were sequenced on an Illumina NovaSeq 6000 with 150 bp paired-end multiplexed runs. FASTQ files were processed using the Tapestri Pipeline V2 and the Python-based Mosaic package (multi-omics analysis, data visualization). The derived BAM and loom files are provided. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  437 
 
  
    EGAD00001007858 
   
  
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  234 
 
  
    EGAD00001007859 
   
  
    
    This dataset contains the raw sequencing FASTQ data for the article "A body map of somatic mutagenesis in morphologically normal human tissues". We sampled morphologically normal tissue biopsies from 5 donors. We performed low-depth WGS on 1,764 samples, high-depth WGS on 48 samples and WES on 1,772 samples. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  1792 
 
  
    EGAD00001007860 
   
  
    
    Collection of mostly matching primary and recurrent glioblastoma RNA-seq sample pairs, also matching with an earlier DNA sequencing study 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  346 
 
  
    EGAD00001007861 
   
  
    
    This dataset contains 318 tumor and control WGS files, submitted in another EGA box, for samples from Gerhauser et al., Cancer Cell, 2018, 34:996-1011. The WGS and sequencing protocol was described earlier in Weischenfeldt et al., Cancer Cell, 2013. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  320 
 
  
    EGAD00001007862 
   
  
    
    The dataset is composed of the raw and processed sequencing data generated from 185 patients affected by azoospermia or severe oligozoospermia, recruited from the Netherlands and the UK. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  555 
 
  
    EGAD00001007863 
   
  
    
    BCL11B PacBio data set, 4 samples 
    
   
  
    
      
      Sequel 
      
    
   
  4 
 
  
    EGAD00001007864 
   
  
    
    Part of the project: The INFORM Precision Medicine Study for High-Risk Pediatric Malignancies resulted in the publication of this study: Radiation-induced gliomas represent H3-/IDH-wild type pediatric gliomas with recurrent PDGFRA amplification and loss of CDKN2A/B. This dataset contains the subset of 17 patient exome sequencing data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001007865 
   
  
    
    Single-cell multi-omic profiling of COVID19 patients recruited from University College London. Data represent RNA-seq, surface protein measurements (CITE-seq) of 192 antibody targets, along with VDJ-seq profiling of single T cell and B cell receptors. Samples are pooled, with 4 donors per pool. Germ-line genotypes derived from previous single-cell RNA-sequencing are provided (VCF) to aid demultiplexing of single cells and their assignment to specific patient donor samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001007866 
   
  
    
    Single-cell multi-omic profiling of healthy controls, asymptomatic and hospital-admitted COVID19 patients recruited from Newcastle University hospitals. These data also include healthy control volunteers treated with IV-LPS as inflammatory controls. Data represent RNA-seq, surface protein measurements (CITE-seq) of 192 antibody targets, along with VDJ-seq profiling of single T cell and B cell receptors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  73 
 
  
    EGAD00001007867 
   
  
    
    Single-cell multi-omic profiling of healthy controls, asymptomatic and hospital-admitted COVID19 patients recruited from Addenbrookes and Royal Papworth hospitals, in collaboration with the NIHR Cambridge Bioresource. Data represent RNA-seq, surface protein measurements (CITE-seq) of 192 antibody targets, along with VDJ-seq profiling of single T cell and B cell receptors. Samples are pooled, with 4 donors per pool. Germ-line genotypes are provided (VCF) to aid demultiplexing of single cells and their assignment to specific patient donor samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  96 
 
  
    EGAD00001007868 
   
  
    
    Whole genome sequencing of sick children in neonatal and paediatric intensive care units, aligned to reference assembly GRCh38. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  449 
 
  
    EGAD00001007870 
   
  
    
    This dataset contains the 22 BAM files corresponding to the scRNA-seq performed on PDX models and cell lines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001007872 
   
  
    
    To QC and validate the TraCe-seq strategy, single-cell RNA-seq libraries were generated from a variety of human cancer cell lines transduced with the TraCe-seq library. Specifically, 5 different cell lines (PC9, MCF-10A, MDA-MB-231, NCI-H358, and NCI-H1373) were each transduced with a unique TraCe-seq barcode. The transduced cells were selected with puromycin only, dissociated to single-cell suspensions, and then mixed together. The complex mixture of the 5 cell lines was profiled by 10x scRNA-seq. Furthermore, transduced NCI-H1373 cells were sorted by FACS to enrich for the top 50% of eGFP-positive cells; the sorted cells were cultured briefly, used to construct scRNA-seq libraries, and profiled by 10x scRNA-seq.
To carry out the full TraCe-seq experiment, ~600 PC9 cells carrying unique TraCe-seq barcodes were expanded over 12 doublings to establish the barcoded population. A subset of the barcoded PC9 population was used to generate scRNA-seq libraries and profiled by 10x scRNA-seq prior to treatment to establish a baseline transcription profile for each barcoded clone. The rest of the cells were then treated for four days with 1 µM erlotinib, 1 µM GNE-069, or 1 µM GNE-104, respectively. scRNA-seq libraries were then generated from the treated cells and profiled by 10x scRNA-seq. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001007873 
   
  
    
    This dataset contains 26 whole-genome sequencing (13 paired tumor and normal), 106 whole-exome sequencing (53 paired tumor and normal), and 43 targeted sequencing data. Sequencing was performed using an Illumina platform. The data are BAM files aligned to the hg19 reference genome. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  175 
 
  
    EGAD00001007874 
   
  
    
    RNA-seq data from paired tumour and germline samples from mesothelioma patients for study EGAS00001005196 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  42 
 
  
    EGAD00001007875 
   
  
    
    Islet-derived_MSC06 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007876 
   
  
    
    Islet-derived_MSC08 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007877 
   
  
    
    Islet-derived_iPSC04 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007878 
   
  
    
    Islet-derived_MSC06 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007879 
   
  
    
    Islet-derived_MSC08 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007880 
   
  
    
    Islet-derived_iPSC04 mRNA-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007881 
   
  
    
    Islet-derived_MSC06 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007882 
   
  
    
    Islet-derived_MSC08 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007883 
   
  
    
    Islet-derived_iPSC04 miRNA-Seq single end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007884 
   
  
    
    Pancreas-Islet06 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007885 
   
  
    
    Short read whole genome sequencing (WGS) VCF files for the NIHR BioResource Rare Diseases WGS project – Participants from the Hypertrophic Cardiomyopathy (HCM) Rare Disease domain 
    
   
  
    
   
  - 
 
  
    EGAD00001007886 
   
  
    
    Short Description: This study contains 7 RRBS samples, including 3 ex vivo CD4+ Trm (2 x spleen and 1 x bone marrow) and 4 blood tetanus (TT) and measles (Me) antigen-reactive memory CD4+ cells before and one day post DTaP (diphtheria-tetanus-pertussis) and MMR (measles-mumps-rubella) vaccination, respectively (1 x tetanus D0, 1 x tetanus D1, 1 x measles D0, 1 x measles D1). 
Technology:  Illumina HiSeq 2500
Filetype: fastq format 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001007887 
   
  
    
    Pancreas-Islet08 WGBS paired end data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001007888 
   
  
    
    Islet-derived_iPSC04 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007889 
   
  
    
    Islet-derived_iPSC04 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007890 
   
  
    
    Islet-derived_iPSC04 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007891 
   
  
    
    Islet-derived_iPSC04 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007892 
   
  
    
    Islet-derived_iPSC04 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007893 
   
  
    
    Islet-derived_iPSC04 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007894 
   
  
    
    Islet-derived_iPSC04 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007895 
   
  
    
    Islet-derived_MSC06 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007896 
   
  
    
    Islet-derived_MSC06 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007897 
   
  
    
    Islet-derived_MSC06 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007898 
   
  
    
    Islet-derived_MSC06 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007899 
   
  
    
    Islet-derived_MSC06 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007900 
   
  
    
    Islet-derived_MSC06 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007901 
   
  
    
    Islet-derived_MSC06 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007902 
   
  
    
    Islet-derived_MSC08 h3k27ac ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007903 
   
  
    
    Islet-derived_MSC08 h3k27me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007904 
   
  
    
    Islet-derived_MSC08 h3k36me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007905 
   
  
    
    Islet-derived_MSC08 h3k4me1 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007906 
   
  
    
    Islet-derived_MSC08 h3k4me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007907 
   
  
    
    Islet-derived_MSC08 h3k9me3 ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007908 
   
  
    
    Islet-derived_MSC08 input ChIP-Seq paired end data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001007909 
   
  
    
    HCA Endometrium_LM The endometrium regenerates monthly and its transformation is executed through dynamic changes in states and interactions of multiple cell types. Using transcriptomics methods we seek to profile changes of the endometrium across the menstrual cycle. Our map will have implications for women's health and cancer, by enabling the interpretation of GWAS analyses or the study of functional consequences of somatic mutations.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD00001007910 
   
  
    
    Open chromatin regions in the MYC super-enhancer region were investigated by ATAC-seq in t(3;8) AML. ATAC-seq was performed as described (Buenrostro et al, 2013) with a modification in the lysis buffer (0.30 M sucrose, 10 mM Tris pH 7.5, 60 mM KCl, 15 mM NaCl, 5 mM MgCl2, 0.1 mM EGTA, 0.1% NP40, 0.15 mM Spermine, 0.5 mM Spermidine, 2 mM 6AA) to reduce mitochondrial DNA contamination. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001007911 
   
  
    
    DNA (exome) sequencing of uveal melanoma metastases. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  107 
 
  
    EGAD00001007912 
   
  
    
    RNA sequencing of uveal melanoma metastases. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  22 
 
  
    EGAD00001007913 
   
  
    
    Single-cell RNA and TCR sequencing of PBMC from patients with uveal melanoma. 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001007914 
   
  
    
    Hi-C (n=72) data from a variety of pediatric brain tumors including ependymoma (PFA, PFB, Ste, spinal), medulloblastoma (G3, G4, SHH), high grade glioma (H3K27 and H3-WT), pilocytic astrocytoma, and more. Raw data provided as FASTQ. Data generated on Illumina HiSeq2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  70 
 
  
    EGAD00001007915 
   
  
    
    RNA-seq (n=52) data from a variety of pediatric brain tumors including ependymoma (PFA, PFB, Ste, spinal), medulloblastoma (G3, G4, SHH), high grade glioma (H3K27 and H3-WT), pilocytic astrocytoma, and more. Raw data provided as FASTQ. Data generated on Illumina HiSeq2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  52 
 
  
    EGAD00001007916 
   
  
    
    Novaseq whole exome raw sequence files (FastQ) for breast cancer tumor core biopsies and blood normal control. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001007917 
   
  
    
    This dataset contains the TCR-beta sequences of 10 head and neck squamous carcinomas and 19 nasopharyngeal carcinomas. The library preparation method is a customised targeted amplification of the VDJ regions, sequenced on the Illumina MiSeq. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  29 
 
  
    EGAD00001007918 
   
  
    
    Targeted sequencing of non-small cell lung cancer samples. BAM files of paired end reads aligned to hg19 using BWA MEM v0.7.1573. This targeted panel covers 370 genes of clinical relevance in non-small cell lung cancer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  140 
 
  
    EGAD00001007919 
   
  
    
    February 2021 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      unspecified 
      
    
   
  5 
 
  
    EGAD00001007920 
   
  
    
    This paper describes the work by Akbari V. et al. on detection of allele-specific methylation using Oxford Nanopore sequencing data. They have developed a set of tools, SNVoter and NanoMethPhase, and a workflow that enable the detection of allele-specific methylation even in samples with sparse coverage of nanopore sequencing data. 
    
   
  
    
      
      PromethION 
      
    
   
  1 
 
  
    EGAD00001007921 
   
  
    
    Two sections of cryopreserved prostate cancer tissue from one untreated prostate cancer patient were profiled for spatial transcriptomics using the Visium Spatial library preparation protocol from 10x Genomics. The GRCh38 aligned sequencing reads from the two prostate cancer tissue sections are provided as BAM files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001007922 
   
  
    
    Raw FASTQ files for 77 RS + DLBCL + CLL samples. RNA-sequencing with single-end 50 nt reads. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  77 
 
  
    EGAD00001007923 
   
  
    
    PromethION-based whole genome sequencing of endothelial cells differentiated from patient derived induced pluripotent stem cells (iPSCs) of a hemophilic donor, transiently treated with a Cre recombinase, a RecF8 recombinase, and untreated cells. The dataset contains fastq files with all sequencing reads passing the standard quality filtering. 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001007930 
   
  
    
    Mutation analysis of 77 frequently mutated genes in NSCLC in plasma DNA and corresponding PBMCs of NSCLC patients under ICI using the AVENIO Expanded Kit. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 550 
      
    
   
  516 
 
  
    EGAD00001007931 
   
  
    
    Anonymised patient metadata and associated data dictionary. For further information regarding this dataset, please contact Alexander Mentzer at contact@combat.ox.ac.uk. 
    
   
  
    
   
  611 
 
  
    EGAD00001007932 
   
  
    
    SmartSeq2 RNAseq data from 16 samples. For further information regarding this dataset, please contact Julian Knight and Alexander Mentzer at contact@combat.ox.ac.uk. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001007933 
   
  
    
    mRNA capture seq of uterotubal lavage samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  74 
 
  
    EGAD00001007934 
   
  
    
    Whole exome and RNASeq raw sequencing data for a cohort of 24 patients with non-small cell lung cancer: 15 adenocarcinoma (8 female, 7 male) and 9 squamous cell carcinoma (5 female, 4 male). Median age at diagnosis was 69. Tumour tissue and PBMCs were used for whole exome sequencing and RNA sequencing. 
This data was generated as part of a study funded by a Cancer Research UK Centres Network Accelerator Award Grant (A21998). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  83 
 
  
    EGAD00001007935 
   
  
    
    Whole-exome sequencing (~250X coverage) of primary GBM tumours and matched patient-derived organoids and normal blood. Samples from two spatially distinct regions of seven tumours from five patients (five primary, two recurrent). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  29 
 
  
    EGAD00001007936 
   
  
    
    Single-cell RNA-seq of primary GBM tumours and matched patient-derived organoids and gliomasphere lines. Obtained using the 10X Genomics single-cell 3' expression solution (v2 chemistry). Primary samples and PDOs from 12 tumours from 10 patients (10 primary, two recurrent), and gliomasphere lines from a subset of five tumours. Samples were obtained from two spatially distinct regions of each tumour. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  99 
 
  
    EGAD00001007937 
   
  
    
    Single-cell whole-genome sequencing of primary GBM tumours and matched patient-derived organoids. Obtained using the 10X Genomics single-cell CNV solution. Samples from two spatially distinct regions of five tumours from three patients (three primary, two recurrent). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  16 
 
  
    EGAD00001007938 
   
  
    
    CM214 - Biomarker Analysis From the Phase 3 CheckMate 214 Trial of Nivolumab Plus Ipilimumab (N+I) or Sunitinib (S) in Advanced Renal Cell Carcinoma (aRCC) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  213 
 
  
    EGAD00001007939 
   
  
    
    This dataset contains samples from 9 patients with embryonal rhabdomyosarcoma. 9 samples have whole exome tumor data (one has multiple). 7 samples have tumor RNAseq data. 1 sample has matched normal DNA sequence data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001007940 
   
  
    
    Whole exome sequencing (WES) data of paired (germline and leukemic) samples of 60 adult patients affected by acute myeloid leukemia. 
    
   
  
    
   
  120 
 
  
    EGAD00001007941 
   
  
    
    Whole exome sequencing (WES) data of paired (germline and leukemic) samples of 100 adult patients affected by acute myeloid leukemia.
Targeted sequencing data of myeloid-related genes of 21 leukemia (not paired) samples from adult patients affected by acute myeloid leukemia. 
    
   
  
    
   
  221 
 
  
    EGAD00001007942 
   
  
    
    This is raw sequencing data, analysis of which is presented in the paper "Sensitivity to Immune Checkpoint Blockade and Progression-Free Survival is associated with baseline CD8+ T cell clone size and cytotoxicity", DOI: https://doi.org/10.1101/2020.11.15.383786 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  134 
 
  
    EGAD00001007943 
   
  
    
    Intellance-2: rRNA-minus RNA-seq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  224 
 
  
    EGAD00001007944 
   
  
    
    Intellance-2: TruSight Tumor 170 panel based RNA-seq 
    
   
  
    
      
      NextSeq 500 
      
    
   
  222 
 
  
    EGAD00001007945 
   
  
    
    Intellance-2: TruSight Tumor 170 panel based DNA-seq 
    
   
  
    
      
      NextSeq 500 
      
    
   
  216 
 
  
    EGAD00001007946 
   
  
    
    57 Bone marrow specimens for 5 healthy bone marrow and 24 CML samples profiled with 10X scRNA-seq 5' upon separation using MACS for CD34. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 3000 
      
      Illumina HiSeq 4000 
      
    
   
  57 
 
  
    EGAD00001007947 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  3 
 
  
    EGAD00001007948 
   
  
    
    Transcriptome sequencing of rhabdoid tumor tissue, organoids and SMARCB1-reconstituted organoids 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001007949 
   
  
    
    Cancer RNA-seq consisting of FASTQ single-end reads from 1 colon-cancer individual
RNA-seq was performed on Illumina.
This dataset contains reads from a single region. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  1 
 
  
    EGAD00001007950 
   
  
    
    Cancer and germline exomes consisting of FASTQ reads from 6 individuals (4 melanoma, 1 lung and 1 colon cancer).
Exome sequencing was performed on Illumina with a depth of 100x to 200x.
2 melanoma datasets contain reads from 2 different tumor regions.
2 melanoma datasets contain reads from 1 tumor region and from a tumor-derived cell line.
1 melanoma dataset contains reads from 2 healthy tissues.
The colon and lung datasets each contain 1 matched germline-tumor pair. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001007951 
   
  
    
    Cancer RNA-seq consisting of FASTQ paired-end reads from 6 individuals (4 melanoma, 1 lung and 1 colon cancer).
RNA-seq was performed on Illumina, TruSeq capture kit, 40M-80M clusters.
2 melanoma datasets contain reads from 1 tumor region and from a tumor-derived cell line.
2 melanoma, 1 colon and 1 lung datasets each contain reads from a single region. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001007952 
   
  
    
    This dataset consists of RNA-seq data from human monocyte-derived macrophages that were subjected to siRNA treatment targeting RAD21 and either left untreated, or stimulated with LPS. In total, it includes 24 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  24 
 
  
    EGAD00001007953 
   
  
    
    This dataset consists of ATAC-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, it includes 39 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  39 
 
  
    EGAD00001007954 
   
  
    
    This dataset consists of ChIP-seq data from human monocytes, monocyte-derived dendritic cells as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. ChIP-sequencing was done for H3K27, RAD21 and CTCF. In total, the data set includes 120 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  120 
 
  
    EGAD00001007955 
   
  
    
    This dataset consists of in situ HiC-seq data from human monocytes, monocyte-derived dendritic cells as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, the data set includes 42 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  42 
 
  
    EGAD00001007956 
   
  
    
    This dataset consists of RNA-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, it includes 63 samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  63 
 
  
    EGAD00001007957 
   
  
    
    Bulk RNAseq data from whole blood. For further information regarding this dataset, please contact Katie Burnham and Andrew Kwok at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  144 
 
  
    EGAD00001007958 
   
  
    
    Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway, which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequenced normal cell DNA from 10 individuals with MAP and studied the somatic mutation burden and mutational signatures. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  210 
 
  
    EGAD00001007959 
   
  
    
    gVCF file per patient obtained from the bulk/mini-bulk RNAseq data. For further information regarding this dataset, please contact Stephen Sansom and Alexander Mentzer at contact@combat.ox.ac.uk. 
    
   
  
    
   
  228 
 
  
    EGAD00001007960 
   
  
    
    fastq and filtered fasta files for B-cell receptor sequencing. For further information regarding this dataset, please contact Rachael Bashford-Rogers at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  96 
 
  
    EGAD00001007961 
   
  
    
    fastq and filtered fasta files for T-cell receptor sequencing. For further information regarding this dataset, please contact Rachael Bashford-Rogers at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  91 
 
  
    EGAD00001007962 
   
  
    
    Raw Illumina sequencing data and CellRanger BAM output files. For further information regarding this dataset, please contact Stephen Sansom at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001007963 
   
  
    
    Raw Illumina sequencing data from single-cell ATACSeq experiments. For further information regarding this dataset, please contact Julian Knight and Tatjana Sauka-Spengler at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001007964 
   
  
    
    Raw Illumina sequencing data. For further information regarding this dataset, please contact Rachael Bashford-Rogers at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001007965 
   
  
    
    Raw Illumina sequencing data. For further information regarding this dataset, please contact Benjamin Fairfax and Rachael Bashford-Rogers at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001007966 
   
  
    
    WGS data set used in the study, 2 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001007967 
   
  
    
    RNAseq data set used in the study, 10 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001007968 
   
  
    
    WGBS data set used in the study, 96 samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  96 
 
  
    EGAD00001007969 
   
  
    
    We investigated 10 female and 14 male SARS-CoV-2 positive children (age range: 0.8 to 18 years). Based on the WHO guidelines, 15 patients were classified as having mild COVID-19, while 7 children were classified as moderate COVID-19 cases. Two children were asymptomatic. 8 female and 10 male SARS-CoV-2 negative children were included as controls (age range: 4 to 16). 
12 SARS-CoV-2 positive female and 9 male adults were included (age range: 27 - 76) together with 13 female and 10 male SARS-CoV-2 negative adult controls (age range: 24 - 77). 10 adult COVID-19 patients had mild disease, while 12 had moderate COVID-19. We performed single-cell RNA sequencing experiments. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  86 
 
  
    EGAD00001007970 
   
  
    
    The dataset contains transcriptomic information of 36 oral potentially malignant disorders (OPMD), 14 fibroepithelial polyps (FEP), and 6 early-stage oral squamous cell carcinomas (OSCC) from the Asian population. Total RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue sections. RNA libraries were prepared using the NEB NextUltra RNA kit with Illumina Ribo-Zero rRNA removal as per the manufacturer’s instructions. RNA sequencing was performed on the HiSeq2500 platform to generate paired-end 150-nucleotide reads with a coverage of 50 million reads per sample. Uploaded bam files have been mapped to the GRCh38 human genome using TopHat2. Clinical and demographic data for these patients are available from the associated publications or by request. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001007971 
   
  
    
    The study will use RNA sequencing to aid in benchmarking different culture conditions in a set of genetically annotated human organoid lines. The data will be used to assess whether there are any clonal differences introduced when culturing these lines in different conditions. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001007972 
   
  
    
    We analyzed the cell free DNA methylomes using 30 plasma samples from patients with localized prostate cancer in the CPC-GENE project. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  30 
 
  
    EGAD00001007973 
   
  
    
    Exome sequencing on the HiSeq platform of 36 brain metastases with matched normal samples, 32 having matched RNA-seq. Published in Saunus et al., J Path (2015); https://doi.org/10.1002/path.4583 
    
   
  
    
   
  106 
 
  
    EGAD00001007975 
   
  
    
    The data contains paired-end fastq files of single-cell transcriptome sequencing data for 1440 cells from 4 Celiac disease patients. CD4+ T cells were sorted by HLA-DQ-gluten tetramers carrying four immunodominant gluten epitopes. All single-cell libraries were constructed following SmartSeq2 and sequenced on Illumina NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1440 
 
  
    EGAD00001007976 
   
  
    
    DNA methylation sequencing profiles of 1538 breast tumors and 244 normal breast tissues. Libraries were prepared using a custom Reduced Representation Bisulfite Sequencing pipeline. Sequencing was performed on the Illumina HiSeq 2500 (v4 chemistry), with single-end reads of 125 bp length. Multiplexing was conducted at the level of 8 samples per lane. FASTQ files are provided for 1538 breast tumors and 244 normal breast tissues. 
Reference: Batra et al. (2021). DNA methylation landscapes of 1538 breast cancers reveal a replication-linked clock, epigenomic instability and cis-regulation. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1782 
 
  
    EGAD00001007978 
   
  
    
    Neurofibromatosis type 1 (NF1) is caused by loss-of-function variants in the NF1 gene. Approximately 10% of these variants affect RNA splicing and are either missed by conventional DNA diagnostics or are misinterpreted by in silico splicing predictions. Therefore, a targeted RNAseq-based approach was designed to detect pathogenic RNA splicing and associated pathogenic DNA variants. An in-house developed tool (QURNAS) was used to calculate the enrichment score (ERS) for each splicing event. RNA enrichment of NF1 and SPRED1 was done using SPET (NUGEN - NF1 only) and using SureSelect (Agilent - NF1 and SPRED1). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  47 
 
  
    EGAD00001007979 
   
  
    
    Biomarkers to identify patients without benefit from adding everolimus to endocrine treatment in metastatic breast cancer (MBC) are needed. 
We report the results of the Pearl trial conducted in five Belgian centers assessing 18F-FDG-PET/CT non-response (n=45) and ctDNA detection (n=46) after 14 days of exemestane-everolimus (EXE-EVE) to identify MBC patients who will not benefit. 
The metabolic non-response rate was 66.6%. Median PFS in non-responding patients (using a cut-off of 25% for SUVmax decrease) was 3.1 months compared to 6.0 months in those showing response (HR: 0.77, 95% CI: 0.40-1.50, p=0.44). The difference was significant when using a “post-hoc” cut-off of 15% (PFS 2.2 months vs 6.4 months). ctDNA detection at D14 was associated with PFS: 2.1 months vs 5.0 months (HR: 2.5, 95% CI: 1.3-5.0, p=0.012). 
Detection of ctDNA and/or the absence of 18F-FDG-PET/CT response after 14 days of EXE-EVE identifies patients with a low probability of benefiting from treatment. Independent validation is needed. 
    
   
  
    
      
      Ion Torrent S5 XL 
      
    
   
  126 
 
  
    EGAD00001007980 
   
  
    
    The dataset includes 144 BAM files of WGS, WES, and RNA-seq data from primary and PDOX samples analyzed in Smith et al, Acta Neuropathologica, 2020 (PMID: 32519082). 
    
   
  
    
      
      unspecified 
      
    
   
  144 
 
  
    EGAD00001007981 
   
  
    
    Patients with idiopathic, heritable, or drug-induced pulmonary arterial hypertension (referred to throughout as PAH) were recruited from expert centers across the UK as part of the PAH Cohort study (www.ipahcohort.com). In each case, diagnosis was confirmed by right heart catheterization following established international guidelines, which remained unchanged for the duration of this study. Healthy volunteers were recruited at the same centers and samples processed using the same standard operating procedure at all sites. All individuals gave written, informed consent with local ethical committee approval. Whole blood (3 ml) was collected in Tempus Blood RNA Tubes, and RNAseq was performed using established Illumina methodologies (see online supplement for further details). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  359 
 
  
    EGAD00001007982 
   
  
    
    Single-cell analysis of the transcriptome, T cell immune receptors, and surface proteome (CITE-seq) from peripheral blood mononuclear cells (PBMCs) of COVID-19 patients with pre-existing autoimmune diseases (rheumatoid arthritis n = 5, psoriasis n = 4, or multiple sclerosis n = 3), as well as COVID-19 patients without pre-existing autoimmunity as controls (n = 10) to investigate altered immune responses. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  27 
 
  
    EGAD00001007983 
   
  
    
    Whole genome sequence of Philippine Ayta Magbukon. A total of 5 individuals. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001007989 
   
  
    
    WXS sequence data from 112 samples and RNA-seq sequence data from 117 samples. All sequence data are raw data in fastq format, sequenced on an Illumina platform. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  121 
 
  
    EGAD00001007990 
   
  
    
    The TIGER samples dataset contains PISA cohort samples which consist of paired RNA-seq and genotyping array data.
It contains 127 RNA-seq paired-end samples in fastq format and genotypes of 127 individuals in PLINK format. 
    
   
  
    
      
      unspecified 
      
    
   
  127 
 
  
    EGAD00001007991 
   
  
    
    BAM files containing paired-end mtDNA sequencing data from morphologically normal human liver. Clonal CCO-deficient patches of hepatocytes were identified in human liver samples, and samples were taken along a line spanning approximately from the portal triad to the central hepatic vein. Individual BAM files are named according to their patch, line and cut, where cut 1 is nearest to the portal triad, and cuts 2, 3 etc. lie further from the portal triad. Other file types include "Bulk" samples, which contain sequencing data of the remaining CCO-deficient cells that were not sampled as part of the line of cuts, and "Stroma" control samples (used for identifying germ-line variants). Sequenced on the NextSeq 500 platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  319 
 
  
    EGAD00001007992 
   
  
    
    We studied 44 rectal cancer patients enrolled onto a prospective population-based biomarker study, who were planned for curative-intent radiation therapy before definitive surgery, yet at high risk of metastatic progression beyond the pelvic cavity. The patients had full-length mtDNA sequencing of whole blood (WB) and peripheral blood mononuclear cells (PBMC), sampled at the time of diagnosis. Metastatic events were recorded up to 60 months of follow-up after completion of the multimodal treatment. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  66 
 
  
    EGAD00001007993 
   
  
    
    This dataset contains 10 fastq files from 10 cell lines (4 cell lines from 3 patients and 6 cell lines from 4 controls) that have undergone 50bp single end sequencing with PolyA enrichment strategy (BGI project number HUMpcsN).
*Please note that one of the samples (Patient_4) was named in error and should be corrected to Patient_3 during analysis. 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001007995 
   
  
    
    COVID-19 scRNA-seq, TCR-seq and BCR-seq for 291 samples collected from 109 patients. Among 291 samples, 249 of them have two libraries (sequencing runs) for each assay, while 42 have only one library. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  37 
 
  
    EGAD00001007996 
   
  
    
    scRNAseq data of scrambled and siRNA-mediated knock-down (96h) of  the minor spliceosome snRNA U6atac in androgen-sensitive LNCaP cells and in patient derived neuroendocrine organoids  (PM154). Three replicates for each cell line. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001007997 
   
  
    
    Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequenced normal cell DNAs from 10 individuals with MAP and study the somatic mutation burden and mutational signatures. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  31 
 
  
    EGAD00001007998 
   
  
    
    ATACseq FASTQ files from RT4 cells treated with KDM5i C70 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001007999 
   
  
    
    RNAseq FASTQ files from RT4 cells treated with FGFRi Erdafitinib 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001008000 
   
  
    
    RNAseq FASTQ files from RT4 cells treated with KDM5i C70 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001008001 
   
  
    
    Single-cell RNAseq FASTQ files for three muscle-invasive bladder tumors 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001008002 
   
  
    
    RNAseq FASTQ files from 31 post-treatment tumors from PURE01 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  31 
 
  
    EGAD00001008003 
   
  
    
    RNAseq FASTQ files from 82 pre-treatment tumors from PURE01 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  82 
 
  
    EGAD00001008004 
   
  
    
    Retinoblastoma is a rare childhood cancer of the retina. We studied retinoblastoma by Targeted Sequencing. 
    
   
  
    
   
  51 
 
  
    EGAD00001008005 
   
  
    
    Human skin samples were obtained from HS patients after informed consent (Ethical vote, University of Würzburg; No. 306/12). Lesional and perilesional samples were taken, and epidermis and dermis were separated. Isolated epidermal keratinocytes were further processed for RNA isolation. mRNA was extracted from five pairwise-matched lesional and perilesional epidermal HS pellets and RNA sequencing was performed. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  10 
 
  
    EGAD00001008006 
   
  
    
    Study metadata, containing the clinical information on samples and patients 
    
   
  
    
   
  125 
 
  
    EGAD00001008007 
   
  
    
    Raw Illumina sequencing data and CellRanger BAM output files. For further information regarding this dataset, please contact Stephen Sansom at contact@combat.ox.ac.uk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001008008 
   
  
    
    Linker file for COMBAT CITEseq sequencing data. Links COMBAT sample IDs with sequencing pools and their associated raw sequence data. Sequence data can be found in the following datasets:
ADT data: EGAD00001007962
GEX data: EGAD00001008007
VDJ (B-cell): EGAD00001007964
VDJ (T-cell): EGAD00001007965 
    
   
  
    
   
  140 
 
  
    EGAD00001008009 
   
  
    
    Other raw and processed phenotype data generated by the COMBAT consortium. 
    
   
  
    
   
  611 
 
  
    EGAD00001008010 
   
  
    
    The dataset contains exome and RNA fastq files from Renal Cell Carcinoma patients, belonging to the study "Integrated genomic analysis of tumor thrombus". 
    
   
  
    
   
  600 
 
  
    EGAD00001008011 
   
  
    
    This is a set of 20 10X Genomics Chromium WGS 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008012 
   
  
    
    Genome and transcriptome sequence data from a poorly differentiated chordoma of C1-C2 spine patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001008013 
   
  
    
    Genome and transcriptome sequence data from an unspecified tissue chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study 
    
   
  
    
   
  2 
 
  
    EGAD00001008014 
   
  
    
    RNA-sequencing dataset of post-mortem human brain tissue of FTD patients with mutations in GRN, MAPT and C9orf72 and healthy controls. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  47 
 
  
    EGAD00001008015 
   
  
    
    We performed bulk RNA-sequencing on peripheral blood collected from 4,732 blood donors recruited as part of the INTERVAL study. Using these data, we mapped gene expression and splicing quantitative trait loci (QTLs). Then, we integrated these data with protein, metabolite and lipid QTLs in the same individuals. The study aimed to identify the shared genetic etiology across transcriptional phenotypes, molecular traits and health outcomes in humans. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008016 
   
  
    
    The dataset comprises 84 bam files from exome sequencing data, including 40 tumor-normal pairs and 4 normal files. Each sample is numbered by the patient case ID such as 135, 156 and so on. The filenames are suffixed with "_tumor" and "_normal" to indicate tumor and normal bam files respectively. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  84 
 
  
    EGAD00001008020 
   
  
    
    To investigate intratumour heterogeneity and to better understand tumour evolution in neuroblastoma, we performed multi-region whole-exome sequencing on a total of 51 spatially separated tumour samples from 9 primary neuroblastomas (2 low-risk, 1 intermediate-risk and 6 high-risk) and 1 relapsed neuroblastoma. We also assessed the impact of chemotherapy on clonal expansion by sequencing tumour regions from one medium-risk and one high-risk tumour for which we had matched samples obtained at diagnosis and after chemotherapy. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  61 
 
  
    EGAD00001008021 
   
  
    
    Dataset contains whole mitochondrial DNA sequencing data in fastq format (Illumina MiSeq paired-end) of 62 samples, in total. Those samples include sequencing data of the endothelial cell populations of 10 different donors and of 26 early-passage iPSC clones derived thereof. Moreover, the dataset contains the data of 7 of those iPSC clones sequenced additionally in passage 30 and 50, each. Lastly, 4 iPSC clones were sequenced during directed cardiomyocyte differentiation, each at day 0, 5, and 15 of differentiation. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  62 
 
  
    EGAD00001008022 
   
  
    
    This dataset was conceived to characterize the genomic differences among different types of follicular-like thyroid lesions. To do so, we performed whole exome sequencing experiments on human biopsies corresponding to nodular hyperplasias, follicular thyroid adenomas, follicular thyroid carcinomas and Follicular Variant Thyroid Gland Papillary Carcinomas. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  54 
 
  
    EGAD00001008024 
   
  
    
    This dataset contains the raw sequencing data, in FASTQ format, for Hi-C assays from 17 primary prostate tissue samples. The sequencing data is paired-end, 150 bp sequencing data from an Illumina NovaSeq 6000 machine, and contains 5 benign tissue samples and 12 primary tumour samples from the Canadian Prostate Cancer Genome Network (CPC-GENE) project. Tumour samples have IDs starting with the "CPCG" prefix, and benign tissue samples have IDs starting with the "BP" prefix. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008026 
   
  
    
    Applying a refined m6A RNA immunoprecipitation method, we profiled the m6A epitranscriptome on 10 non-neoplastic lung (NL) tissues and 53 lung adenocarcinoma tumors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  126 
 
  
    EGAD00001008027 
   
  
    
    41 breast cancer patients with known functional homologous recombination status (matched normal and tumor genomes, n=82) 
    
   
  
    
   
  1 
 
  
    EGAD00001008029 
   
  
    
    The dataset comprises whole exome sequences from laser capture micro-dissected biopsies of 10 patients diagnosed with clear cell renal cell carcinoma. In total over 100 regions are sampled to allow 'focally exhaustive' sequencing and explore the limits of intra-tumoural heterogeneity. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  117 
 
  
    EGAD00001008030 
   
  
    
    The dataset comprises 5' single-cell RNA sequencing with TCR enrichment, using 10x Genomics' Chromium technology, of multiregional biopsies of human renal cell carcinomas. Biopsies from different tumour regions, the tumour-normal interface, normal kidney, normal adrenal, metastatic regions, peri-nephric fat, and peripheral blood were sequenced from 12 patients with kidney tumours. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001008031 
   
  
    
    Spatial transcriptome sequence data from HER2-positive human breast tumors obtained from the first generation of Spatial Transcriptomics arrays. 
The dataset contains 8 different tumors with 3 or 6 sections taken from each with paired-end sequencing. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  36 
 
  
    EGAD00001008032 
   
  
    
    The rates and patterns of somatic mutation in normal tissues are largely unknown outside of humans. Comparative analyses can shed light on the diversity of mutagenesis across species and on long-standing hypotheses regarding the evolution of somatic mutation rates and their role in cancer and ageing. Here, we used whole-genome sequencing of 208 intestinal crypts from 56 individuals to study the landscape of somatic mutation across 16 mammalian species. We found somatic mutagenesis to be dominated by seemingly endogenous mutational processes in all species, including 5-methylcytosine deamination and oxidative damage. With some differences, mutational signatures in other species resembled those described in humans, although the relative contribution of each signature varied across species. Remarkably, the somatic mutation rate per year varied greatly across species and exhibited a strong inverse relationship with species lifespan, with no other life-history trait studied displaying a comparable association. Despite widely different life histories among the species surveyed, including ~30-fold variation in lifespan and ~40,000-fold variation in body mass, the somatic mutation burden at the end of lifespan varied only by a factor of ~3. These data unveil common mutational processes across mammals and suggest that somatic mutation rates are evolutionarily constrained and may be a determinant of lifespan. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  36 
 
  
    EGAD00001008033 
   
  
    
    Whole exome sequencing from 51 patients with brain metastases from prostate cancer 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  235 
 
  
    EGAD00001008034 
   
  
    
    Bulk RNAseq data of scrambled and siRNA-mediated knock-down of the minor spliceosome snRNA U6atac in androgen-sensitive LNCaP cells (L),  androgen-insensitive C4-2 (C) and 22Rv1 (R) cells and in patient derived neuroendocrine organoids  PM154 (P). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD00001008035 
   
  
    
    RNA-sequencing on neuroblastoma PDX model COG-N-519 treated with control miR-1283 and test miR-99b-5p mimics. Three samples from each of the treatment conditions were analysed. The Next-Seq platform was used for sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008036 
   
  
    
    This dataset contains raw data from polyA RNAseq, hybrid capture target TCR panel data, and bam files from whole exome sequencing on 39 tumors with matched germline blood. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  79 
 
  
    EGAD00001008037 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008038 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008039 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008040 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008041 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008042 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008043 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008044 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008045 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008046 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008047 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008048 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008049 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008050 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008051 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008052 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008053 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008054 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008055 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008056 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008057 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008058 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008059 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008060 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008061 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008062 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008063 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008064 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008065 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008066 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008067 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008068 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008069 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008070 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008071 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008072 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008073 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008074 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008075 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008076 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008077 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008078 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008079 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008080 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008081 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008082 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008083 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008084 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008085 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008086 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008087 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008088 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008089 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008090 
   
  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008091 
   
  
    
    We applied this signature to a 567-patient GC cohort to establish genomic-based molecular subtypes and then used a support vector machine to build a molecular subtype-based risk-scoring model. Both source code and supplementary datasets for risk score prediction are available at https://github.com/hwanglab/Yonsei_gastric_cancer_32genes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001008092 
   
  
    
    Lynch Syndrome (LS) is an autosomal dominant disease conferring a high risk of colorectal cancer due to germline heterozygous mutations in a DNA mismatch repair (MMR) gene. Although cancers in LS patients show elevated somatic mutation burdens, information on mutation rates in normal tissues and understanding of the trajectory from normal to cancer cell is limited. Here we whole-genome sequenced 152 crypts from normal and neoplastic epithelial tissues from LS patients. In normal tissues the repertoire of mutational processes and mutation rates were similar to those found in wild type individuals. A morphologically normal colonic crypt with an increased mutation burden and mutational signatures consistent with MMR deficiency was identified, which may represent a very early stage of LS pathogenesis. Phylogenetic trees of tumour crypts indicated that the most recent ancestor cell of each tumour was already MMR deficient and had experienced multiple clonal evolution cycles. This study demonstrates the genomic stability of epithelial cells with heterozygous germline MMR gene mutations and highlights important differences in the pathogenesis of LS from other colorectal cancer predisposition syndromes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  161 
 
  
    EGAD00001008094 
   
  
    
    Single-end bulk RNA sequencing results of cell lines derived from patients described with NGLY1 deficiency as well as parent and CRISPR-edited controls. The cell lines represent 4 different cell types: fibroblasts, lymphoblastoid cells, induced pluripotent stem cells (iPSCs) and neural progenitor cells (NPCs). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  136 
 
  
    EGAD00001008095 
   
  
    
    This dataset contains whole genome sequencing data, as BAM files, for the three members of a trio. These BAM files contain data for chromosomes 21, X and Y and for the mitochondrial genome.
    
   
  
    
   
  3 
 
  
    EGAD00001008096 
   
  
    
    This dataset contains whole genome sequencing data, as paired-end FASTQ files, for the three members of a trio.
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001008097 
   
  
    
    This dataset contains whole genome sequencing data, in VCF format, for the three members of a trio.
    
   
  
    
   
  3 
 
  
    EGAD00001008098 
   
  
    
    The dataset contains rearranged TCR‐α and TCR‐β genes of Ttet+/Tpat+, Ttet-/Tpat+ and Ttet-/Tpat- CD4+ cells from gut biopsies (ex vivo) or from T cell clones generated from gut biopsies (in vitro) from 12 CeD patients.
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  36 
 
  
    EGAD00001008099 
   
  
    
    This dataset consists of 116 tumor and normal samples analyzed with whole exome sequencing on the HiSeq2500 instruments with 100bp paired-end reads as well as 760 tumor and normal samples analyzed with the PGDx elio tissue complete assay. The PGDx elio tissue complete assay is a hybrid capture approach targeting 500+ genes with sequencing on the NextSeq instruments with 150bp paired-end reads. The BAM files provided have been adapter masked and contain duplicate reads.
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  876 
 
  
    EGAD00001008100 
   
  
    
    May 2021 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      unspecified 
      
    
   
  17 
 
  
    EGAD00001008101 
   
  
    
    August 2021 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada  as part of the International Human Epigenome Consortium. 
    
   
  
    
      
      unspecified 
      
    
   
  13 
 
  
    EGAD00001008105 
   
  
    
    Exome sequencing samples from the Acute Care Flagship, Illumina sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  85 
 
  
    EGAD00001008106 
   
  
    
    Patients in IA cohort with PPIL4 mutations (Please see Supplementary Table 3 for clinical characteristics of the patients) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001008107 
   
  
    
    A lymphocyte suffers many threats to its genome, including programmed mutation during differentiation, antigen-driven proliferation and residency in diverse microenvironments. After developing protocols for single-cell lymphocyte expansions, we sequenced whole genomes from 717 normal naive and memory B and T lymphocytes and hematopoietic stem cells. All lymphocyte subsets carried more point mutations and structural variants than haematopoietic stem cells – the extra mutations were mostly acquired during differentiation, with burdens higher in memory than naive lymphocytes, although T cells also had a higher rate of mutation accumulation throughout life. Off-target effects of immunological diversification accounted for most of the additional differentiation-associated mutations in lymphocytes. Memory B cells acquired, on average, 18 off-target mutations genome-wide for every one on-target IGV mutation during the germinal centre reaction. Structural variation was 16-fold higher in lymphocytes than stem cells, with ~15% of deletions being attributable to off-target RAG activity. Mutational processes associated with ultraviolet light exposure and other sporadic mutational processes generated hundreds to thousands of mutations in some memory lymphocytes. The mutation burden and signatures of normal B lymphocytes were broadly comparable to those seen in many B-cell cancers, suggesting that malignant transformation of lymphocytes arises from the same mutational processes active across normal ontogeny. The mutational landscape of normal lymphocytes chronicles the off-target effects of programmed genome engineering during immunological diversification and the consequences of differentiation, proliferation and residency in diverse microenvironments. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  717 
 
  
    EGAD00001008108 
   
  
    
    This dataset contains single-cell RNA sequencing data from patients with thyroid cancer (n=7), multinodular goiter (n=3) and healthy individuals (n=5). Mononuclear cells were taken from both the peripheral blood and the bone marrow compartments. We used a pooled single-cell design where multiple individuals were pooled in a single sample for sequencing (NextSeq 500-V2) and later demultiplexed using their genotype data. Associated metadata contains information on the phenotypes per individual, the pooling design and the linkage between the supplied files and sequenced pools.
Due to limitations from EGA in uploading single-cell data, the raw fastq files were processed as follows: (i) I1/I2/R1/R2 fastq files were concatenated over the different lanes. (ii) Concatenated I1 and I2 files were interleaved, as were the concatenated R1 and R2 files, to generate two fastq files per pool containing all the information. To interleave the fastq files, the BBmap tool bbmap/reformat.sh was used, which can also be used to de-interleave the files.
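The interleaved layout described above can be illustrated with a minimal pure-Python sketch. It is not the BBMap implementation and is not a substitute for reformat.sh; it only shows the record order that interleaving produces and how it can be undone. The function names and the `Record` type are hypothetical, and the sketch assumes plain 4-line FASTQ records with both inputs in the same read order.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: header, sequence, separator, quality string.
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    it = iter(lines)
    while True:
        chunk = [line.rstrip("\n") for line in islice(it, 4)]
        if not chunk:
            return
        if len(chunk) != 4 or not chunk[0].startswith("@") or not chunk[2].startswith("+"):
            raise ValueError("malformed FASTQ record")
        yield (chunk[0], chunk[1], chunk[2], chunk[3])


def interleave(first: Iterable[Record], second: Iterable[Record]) -> Iterator[Record]:
    """Alternate records: first[0], second[0], first[1], second[1], ..."""
    b = iter(second)
    for rec_a in first:
        rec_b = next(b, None)
        if rec_b is None:
            raise ValueError("second input has fewer records than first")
        yield rec_a
        yield rec_b
    if next(b, None) is not None:
        raise ValueError("second input has more records than first")


def deinterleave(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Split an interleaved stream back into its two original streams."""
    it = iter(records)
    first: List[Record] = []
    second: List[Record] = []
    for rec_a in it:
        rec_b = next(it, None)
        if rec_b is None:
            raise ValueError("odd number of records in interleaved input")
        first.append(rec_a)
        second.append(rec_b)
    return first, second
```

Under these assumptions, the even-numbered records of an interleaved R1/R2 file (counting from zero) belong to R1 and the odd-numbered ones to R2; the same holds for the I1/I2 file.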
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001008109 
   
  
    
    Full information about the T cell receptor (TCR) variable regions found in the sequences of the VDJ region.
Columns:
barcode	is_cell	contig_id	high_confidence	length	chain	v_gene	d_gene	j_gene	c_gene	full_length	productive	cdr3	cdr3_nt	reads	umis	raw_clonotype_id	raw_consensus_id 
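A table with these columns could be read as sketched below. This is only an illustration under stated assumptions: the file is assumed to be tab-delimited with the header row exactly as listed above, and the `is_cell`, `high_confidence` and `productive` columns are assumed to hold the text `true`/`True` or `false`/`False`. The function names are hypothetical and not part of the dataset.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, Iterator, TextIO, Tuple

# Column names exactly as listed in the dataset description.
COLUMNS = [
    "barcode", "is_cell", "contig_id", "high_confidence", "length", "chain",
    "v_gene", "d_gene", "j_gene", "c_gene", "full_length", "productive",
    "cdr3", "cdr3_nt", "reads", "umis", "raw_clonotype_id", "raw_consensus_id",
]


def is_true(value: str) -> bool:
    """Interpret a text flag; assumes 'true'/'True' marks a positive value."""
    return value.strip().lower() == "true"


def productive_chains(handle: TextIO, delimiter: str = "\t") -> Iterator[Dict[str, str]]:
    """Yield rows that are cell-associated, high-confidence and productive."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in reader:
        if is_true(row["is_cell"]) and is_true(row["high_confidence"]) and is_true(row["productive"]):
            yield row


def cdr3_counts(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count how many contigs carry each (chain, CDR3 amino-acid) pair."""
    counts: Counter = Counter()
    for row in rows:
        key: Tuple[str, str] = (row["chain"], row["cdr3"])
        counts[key] += 1
    return counts
```

If the deposited files turn out to be comma-separated, passing `delimiter=","` is the only change this sketch needs.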
    
   
  
    
   
  2 
 
  
    EGAD00001008113 
   
  
    
    Pancreatic cancer biopsies and matching normal controls from 10 patients were exome sequenced. The same biopsies and PDX models derived from these were also subject to RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  57 
 
  
    EGAD00001008114 
   
  
    
    This dataset contains CLL2 data used in FLTseq paper. The dataset contains the data of CITEseq, FLTseq, RaCHseq, Exonseq and bulk RNAseq. 
    
   
  
    
      
      NextSeq 500 
      
      PromethION 
      
    
   
  1 
 
  
    EGAD00001008115 
   
  
    
    Source data in VCF format for a cohort of 46 patients with primary malignant glioma from a Chinese population.
    
   
  
    
   
  45 
 
  
    EGAD00001008117 
   
  
    
    This dataset contains samples from 13 patients with osteosarcoma. 13 samples have whole exome tumor data. 12 samples have tumor RNAseq data. 3 samples have matched normal DNA sequence data.
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001008118 
   
  
    
    This dataset contains 60 .bam files of shallow WGS data (~0.1X) from ovarian cancer cell lines. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  60 
 
  
    EGAD00001008119 
   
  
    
    This dataset contains 148 .bam files of shallow WGS data (~0.1X) from OV04 PDX samples. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  148 
 
  
    EGAD00001008120 
   
  
    
    This dataset contains tumor and normal whole exome DNA sequence data for a patient with neuroblastoma 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008121 
   
  
    
    This dataset contains 142 .bam files of shallow WGS data (~0.1X) from OV04 patient samples. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  142 
 
  
    EGAD00001008122 
   
  
    
    Contains the raw data from scRNA, scATAC and genotyping.
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001008123 
   
  
    
    Paired tumor and normal WGS of primary neuroblastomas. This is an update of the „Berlin Neuroblastoma Dataset” (EGAS00001004022). This data was used for the analysis of circular RNA expression and regulation in neuroblastoma. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  25 
 
  
    EGAD00001008124 
   
  
    
    Tumor Total RNA Seq data of primary neuroblastomas. This is an update of the „Berlin Neuroblastoma Dataset” (EGAS00001004022). This data was used for the analysis of circular RNA expression and regulation in neuroblastoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  105 
 
  
    EGAD00001008125 
   
  
    
    Hi-C sequencing data includes 5 samples collected from 4 B-ALL patients. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001008126 
   
  
    
    Bulk RNAseq of human skeletal muscle
RNAseq of FACS sorted human skeletal muscle cells
scRNAseq of human skeletal muscle 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  41 
 
  
    EGAD00001008127 
   
  
    
    RNA sequencing of 32 primary head and neck squamous cell carcinoma (HNSCC) samples prior to treatment with neoadjuvant anti-PD-1 (n=6) or anti-PD-1 + anti-CTLA-4 (n=26) immunotherapy, and 30 paired on-treatment HNSCC samples (i.e. after neoadjuvant immunotherapy). RNA quantity used: 10ng. Library Preparation Kit: SMART Stranded Total RNA Seq Kit (Takara). Sequencing parameters: NovaSeq 6000, 2x 100 bp. File type: fastQ 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  62 
 
  
    EGAD00001008128 
   
  
    
    RNAseq FASTq files of 181 bulk pre-treatment and 14 post-treatment tumors from GO30140 Ph1b group A and F and 177 bulk pre-treatment tumors of IMbrave150 PhIII 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  372 
 
  
    EGAD00001008129 
   
  
    
    WES FASTq files of 76 bulk pre-treatment tumors and 76 matched peripheral blood mononuclear cells from GO30140 group A 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  152 
 
  
    EGAD00001008130 
   
  
    
    Clinical data from GO30140 group A and group F and IMBrave150 biomarker populations including gender, confirmed RECIST response by independent review forum (IRF), overall survival (OS), progression survival by IRF, treatment group and treatment 
    
   
  
    
   
  1 
 
  
    EGAD00001008131 
   
  
    
    Standard RNA-Seq datasets. Check the associated paper for more details. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  584 
 
  
    EGAD00001008132 
   
  
    
    NuGen 99-Gene-Panel Targeted Sequencing of 574 DLBCL Cases of Non-China Cohort from Phoenix Clinical Trial. Check the Associated Publication for More Experimental Details. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  574 
 
  
    EGAD00001008133 
   
  
    
    To investigate intratumour heterogeneity and to better understand tumour evolution in neuroblastoma, we have performed multi-region RNA sequencing on a total of 51 spatially separated tumour samples from 9 primary neuroblastomas (2 low-risk, 1 intermediate-risk and 6 high-risk) and 1 relapsed neuroblastoma. We also assessed the impact of chemotherapy on clonal expansion by sequencing tumour regions from one medium-risk and one high-risk tumour for which we had matched samples obtained at diagnosis and after chemotherapy.
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  50 
 
  
    EGAD00001008134 
   
  
    
    RNAseq data set, panALL study, 16 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001008135 
   
  
    
    Oxidative bisulfite sequencing (oxBS-Seq) for APL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008136 
   
  
    
    Whole genome bisulfite sequencing (WGBS) for APL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008137 
   
  
    
    This dataset includes mutation profiling by whole-exome sequencing of 3 upper urinary tract urothelial tumours (UTUC).
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001008138 
   
  
    
    This dataset includes transcription profiling by RNA-seq of 3 upper urinary tract urothelial tumours (UTUC).
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001008139 
   
  
    
    Whole-exome sequencing of 32 primary head and neck squamous cell carcinoma samples prior to treatment with neoadjuvant anti-PD-1 (n=6) or anti-PD-1 + anti-CTLA-4 (n=26) immunotherapy. DNA quantity used: 50ng. Library Preparation Kit: Twist Human Core Exome Plus (Twist Bioscience). Sequencing parameters: NovaSeq 6000, 2x 100 bp. File type: fastQ. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001008140 
   
  
    
    ATAC-seq profiling bam files from colorectal carcinoma and adenoma. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1207 
 
  
    EGAD00001008141 
   
  
    
    Transcriptomic data for five patients with breast cancer undergoing neoadjuvant chemotherapy and hyperpolarised 13C-MRI for early response assessment 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001008142 
   
  
    
    Metagenomics data for "Combined Metabolic Activators Reduces Liver Fat in Nonalcoholic Fatty Liver Disease Patients". Samples were sequenced on a NovaSeq 6000 (NovaSeq Control Software 1.7.0/RTA v3.4.4) with a 151nt (Read1)-10nt (Index1)-10nt (Index2)-151nt (Read2) setup using the ‘NovaSeqXp’ workflow on an ‘S4’ mode flow cell.
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  189 
 
  
    EGAD00001008143 
   
  
    
    SNP data for Ovarian cancer PRS (cases) 
    
   
  
    
   
  217 
 
  
    EGAD00001008144 
   
  
    
    SNP data for 313 loci required for calculation of the Breast cancer PRS 
    
   
  
    
   
  - 
 
  
    EGAD00001008145 
   
  
    
    SNP data of 28 sites required for the Ovarian cancer PRS (controls) 
    
   
  
    
   
  - 
 
  
    EGAD00001008146 
   
  
    
    APL nanopore sequencing data are deposited into 2 data formats:
1. CRAM files
2. h5 files 
    
   
  
    
      
      GridION 
      
    
   
  1 
 
  
    EGAD00001008147 
   
  
    
    This dataset contains BAM files for 9 samples from individuals involved in a retrospective IVF trial. The BAM files were derived from whole genome sequencing. The 9 individuals consist of three trios containing mother, father and child samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001008149 
   
  
    
    This dataset comprises complete exome data from the study PMID27216186 (Harbst & Lauss et al, Cancer Research 2016). These data are from 49 samples (tumor and matched normal) from 8 patients representing multi-region sequencing of human melanoma. Files are in the BAM format and contain aligned and processed data used for e.g. somatic variant calling. The sequencing libraries were constructed using SureSelect target enrichment with Clinical Research Exome Panel (Agilent) and sequenced on a HiSeq2500 (Illumina).
    
   
  
    
   
  1 
 
  
    EGAD00001008150 
   
  
    
    Four PAIRED WGS samples, tumor and control, were sequenced on a HiSeq X Ten and the library preparation kit used was Illumina TruSeq Nano DNA. The tumor was multiple myeloma from bone marrow. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001008151 
   
  
    
    Raw fast5 file of Oxford Nanopore sequencing for an APL patient sample 
    
   
  
    
      
      GridION 
      
    
   
  1 
 
  
    EGAD00001008152 
   
  
    
    RNA-Seq data for systematic gene fusion detection in Pediatric Cancer 
    
   
  
    
   
  - 
 
  
    EGAD00001008153 
   
  
    
    smallRNA sequencing from healthy individuals and MCI patients, along with phenotypic information. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  145 
 
  
    EGAD00001008155 
   
  
    
    Intratumoral heterogeneity is a critical frontier in understanding how the tumor microenvironment (TME) propels malignant progression. Here, we deconvolute the human pancreatic TME through large-scale integration of histology-guided regional multiOMICs with clinical data and patient-derived preclinical models. We discover subTMEs, histologically definable tissue states anchored in fibroblast plasticity, with regional relationships to tumor immunity, subtypes, differentiation, and treatment response. Reactive subTMEs rich in complex but functionally coordinated fibroblast communities were immune-hot and inhabited by aggressive tumor cell phenotypes. The matrix-rich deserted subTMEs harbored less activated fibroblasts and tumor-suppressive features yet were markedly chemoprotective and enriched upon chemotherapy. SubTMEs originated in fibroblast differentiation trajectories and transitory states were notable both in single cell transcriptomics and in situ. The intratumoral co-occurrence of subTMEs produced patient-specific phenotypic and computationally predictable heterogeneity tightly linked to malignant biology. Therefore, heterogeneity within the plentiful, notorious pancreatic TME is not random but marks fundamental tissue organizational units.
    
   
  
    
      
      Illumina HiSeq 2500 
      
      unspecified 
      
    
   
  29 
 
  
    EGAD00001008156 
   
  
    
    To investigate intratumour heterogeneity and to better understand tumour evolution in neuroblastoma, we have performed multi-region targeted re-sequencing on a total of 140 samples from 9 primary neuroblastomas (2 low-risk, 1 intermediate-risk and 6 high-risk) and 2 relapsed neuroblastomas.
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  141 
 
  
    EGAD00001008157 
   
  
    
    This dataset contains targeted sequencing data of 204 surgical samples from resected NSCLC. Genomic profiling identifies five predictive biomarkers, which is then integrated into the Multiple-gene INdex to Evaluate the Relative benefit of Various Adjuvant therapies (MINERVA) score. The MINERVA score categorizes patients into three subgroups with relative disease-free survival and overall survival benefits from either adjuvant gefitinib or chemotherapy. This study demonstrates that predictive genomic signatures could potentially stratify resected EGFR-mutant NSCLC patients and provide precise guidance towards future personalized adjuvant therapy. 
    
   
  
    
   
  204 
 
  
    EGAD00001008158 
   
  
    
    Illumina-based RNA-Seq analysis of 93 liver samples. Biopsies of tumors and non-tumor tissues are included. Samples are stratified by response and non-response to TACE treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  93 
 
  
    EGAD00001008159 
   
  
    
    Single cell RNA sequence data generated from human primary nasal epithelium differentiated at air-liquid interface. This dataset comprises tissue from two donors, with cultures either unexposed or exposed to SARS-CoV-2. Libraries were prepared using the 10X Genomics pipeline as per manufacturer's instructions.
    
   
  
    
      
      NextSeq 500 
      
    
   
  24 
 
  
    EGAD00001008160 
   
  
    
    RNA sequences of a total of 24 samples, including 13 unrelated ASD patients (13 males) and 11 adult individuals of Spanish origin as controls (4 males, 7 females). The RNAseq study was conducted on a HiSeq 4000 (Illumina) and paired-end sequences were obtained (fastq files). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001008161 
   
  
    
    We performed single-cell RNA-sequencing of cells in the bronchoalveolar lavage (BAL) fluid of late severe COVID-19. This study provides detailed insights into the alveolar macrophage response to SARS-CoV-2 infection and reveals a profibrotic macrophage response in severe COVID-19 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008162 
   
  
    
    Whole exome sequencing data from 90 diagnostic lymphoma samples run and published as part of the Leukemia manuscript. 133 total exomes were sequenced including tumour and normal controls. Copy number array data was also generated for 95 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  91 
 
  
    EGAD00001008163 
   
  
    
    ADAPTeR study RNAseq from multi-region samples taken pre and post nivolumab. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  52 
 
  
    EGAD00001008164 
   
  
    
    ADAPTeR study WES from multi-region samples taken pre and post nivolumab 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  72 
 
  
    EGAD00001008165 
   
  
    
    ADAPTeR study multi-region TCRseq of pre and post nivolumab tumour and PBMC samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  234 
 
  
    EGAD00001008166 
   
  
    
    ADAPTeR study scRNA and scTCR data from TILs from two ccRCC patients treated with nivolumab 
    
   
  
    
      
      NextSeq 550 
      
    
   
  64 
 
  
    EGAD00001008182 
   
  
    
    All 122 HCC biopsies and 115 non-tumoral tissues from 114 patients were subjected to whole-exome sequencing. Whole-exome capture was performed using the SureSelectXT Clinical Research Exome (Agilent Technologies) or SureSelect Human All Exon V6+COSMIC (Agilent Technologies) platforms according to the manufacturer’s guidelines. Sequencing was performed on an Illumina HiSeq 2500 at the Genomics Facility Basel according to the manufacturer’s guidelines. Paired-end 101-bp reads were generated. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  237 
 
  
    EGAD00001008183 
   
  
    
    This dataset includes TruSeq paired-end, total RNA sequencing data from primary B-precursor acute lymphoblastic leukaemia (B-ALL) xenografts. It comprises 43 pairs of matched bone marrow (BM) and central nervous system (CNS) human leukaemia cells from individual immunodeficient mice. Xenografts were generated from 6 patients with B-ALL and include samples taken at diagnosis and relapse from 3 of 6 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  86 
 
  
    EGAD00001008184 
   
  
    
    Targeted sequencing using BD Rhapsody with 462 mRNA of healthy young adult bone marrow mononuclear cells from iliac crest aspirations (BM3/Young3). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001008185 
   
  
    
    Shallow targeted sequencing with 462 mRNA and 97 antibodies of AML patients’ bone marrow mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://doi.org/10.6084/m9.figshare.14780127.v1
AMLQ4_SMK1	AML314	male
AMLQ1_SMK2	AML116	female
AMLQ3_SMK3	AML127	female
AMLQ6_SMK4	AML183	male
AMLQ2_SMK5	AML327	female
AMLQ5_SMK6	AML334	male
APLQ5_SMK7	APL124	male
APLQ3_SMK8	APL142	male
APLQ6_SMK9	APL218	female
APLQ4_SMK10	APL147	male
APLQ2_SMK11	APL223	female
APLQ1_SMK12	APL224	female 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008186 
   
  
    
    Whole transcriptome sequencing using BD Rhapsody with 97 antibodies of healthy young adult bone marrow (Young3/BM3) mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://figshare.com/s/901bcddb9ee18e226031.
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001008187 
   
  
    
    SmartSeq2 read out of index cultured cell sorted with the classification and erythroid-myeloid panel developed in main Figure 6. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001008188 
   
  
    
    Targeted sequencing using BD Rhapsody with 462 mRNA and 97 antibodies of healthy young and aged adult as well as AML bone marrow mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://figshare.com/s/0fda29b169c719223ee3.
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001008189 
   
  
    
    Targeted sequencing using BD Rhapsody with 462 mRNA and 197 antibodies of healthy young adult bone marrow mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://figshare.com/s/313b5739ff469dac8c01
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001008190 
   
  
    
    Targeted sequencing with 462 mRNA and 97 antibodies of fresh, frozen and stored on ice (6h) healthy adult blood cells (multiplexed sample "fresh thawed ice": SMK1-Frozen, SMK2-thawed, SMK3-fresh), and targeted sequencing with 462 mRNA and 197 antibodies of CD34+ immature cells (multiplexed sample "CD34+ immature Abseq": SMK4-CD38+CD45RA-, SMK5-CD38+CD45RA+, SMK6-CD38-CD45RA+/-).
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008191 
   
  
    
    Dataset consists of Oncomine Comprehensive Cancer Panel v3 sequencing of 16 tumor-normal mucosa pairs. Tumors include 8 sessile serrated lesions (SSL), 3 sessile serrated lesions with dysplasia (SSL/D), 2 traditional serrated adenomas (TSA) and 3 tubular adenomas (TA).
    
   
  
    
      
      Ion Torrent S5 
      
    
   
  32 
 
  
    EGAD00001008192 
   
  
    
    Samples prepared using TruSeq Stranded Total RNA library kit. Samples sequenced on a HiSeq 2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001008193 
   
  
    
    Exome sequencing of panALL exome data set, total of 1948 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  598 
 
  
    EGAD00001008194 
   
  
    
    This dataset contains paired-end RNA sequencing data of blood CD34+ cells from random blood donors (155 paired-end FastQ files).
The data were used to perform expression quantitative trait locus (eQTL) analysis. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  155 
 
  
    EGAD00001008195 
   
  
    
    Variation and transmission of the human gut microbiota across generations - 16S sequencing data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  102 
 
  
    EGAD00001008196 
   
  
    
    In this study, we sequenced multiple stages of the B-lineage in elderly individuals and patients with lymphoplasmacytic lymphoma to study whether mutations are accumulated in normal-cell counterparts prior to lymphoma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  73 
 
  
    EGAD00001008197 
   
  
    
    We isolated naive and memory CD4+ T cells from 119 healthy individuals and stimulated the cells using anti-CD3/anti-CD28 coated beads. We profiled gene expression using single cell RNA-seq (10X-Genomics 3’ v2 kit) at resting state and three time points of activation (16h, 40h and 5 days post stimulation) and mapped expression quantitative trait loci. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  167 
 
  
    EGAD00001008198 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008199 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008200 
   
  
    
    This dataset contains the exome sequencing data (BAMs, VCFs and CNVs) from 5 schwannoma tumors from the same patient. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008201 
   
  
    
    This dataset includes FASTQ files of 808 samples from the GCAT cohort. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. For each sample the paired ends are provided in separate files. Each FASTQ is split into multiple lanes and grouped by the multiplex index.
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  808 
 
  
    EGAD00001008202 
   
  
    
    This dataset includes BAM files of 808 samples from the GCAT cohort. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. For each sample the paired ends are provided in separate files. Each FASTQ is split into multiple lanes and grouped by the multiplex index.
    
   
  
    
   
  808 
 
  
    EGAD00001008203 
   
  
    
   
  
    
   
  - 
 
  
    EGAD00001008204 
   
  
    
    RNA-seq dataset (BAM files) of 28 HCCs and 19 non-tumor livers derived from 8 patients undergoing sorafenib treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  47 
 
  
    EGAD00001008205 
   
  
    
    RNA-sequencing of 122 hepatocellular carcinoma biopsies and 15 normal liver biopsies. RNA-seq library prep was performed with 200 ng total RNA using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Gold (Illumina) according to manufacturer’s specifications. Single-end 126-bp sequencing was performed on an Illumina HiSeq 2500 using v4 SBS chemistry at the Genomics Facility Basel according to the manufacturer’s guidelines. Primary data analysis was performed with the Illumina RTA version 1.18.66.3. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  137 
 
  
    EGAD00001008207 
   
  
    
    Variation and transmission of the human gut microbiota across generations - shotgun sequencing data 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  102 
 
  
    EGAD00001008208 
   
  
    
    The dataset is composed of processed whole genome sequencing data generated from 53 children and their respective parents, forming 49 trios (mother, father and child) and 2 quartets (mother, father and two siblings). Of the children, 18 were born after spontaneous conception, 17 after in vitro fertilization (IVF) and 18 after intracytoplasmic sperm injection combined with testicular sperm extraction (ICSI-TESE).
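The family counts in this description can be cross-checked with a short sketch. It assumes each trio contributes one child, each quartet two, and that both parents of every family were sequenced; the function name is illustrative only.

```python
from typing import Dict, Tuple


def cohort_size(trios: int, quartets: int) -> Tuple[int, int]:
    """Return (children, individuals) for trios (2 parents + 1 child)
    and quartets (2 parents + 2 siblings)."""
    children = trios * 1 + quartets * 2
    individuals = trios * 3 + quartets * 4
    return children, individuals


# Figures taken from the description above.
children, individuals = cohort_size(trios=49, quartets=2)
by_conception: Dict[str, int] = {"spontaneous": 18, "IVF": 17, "ICSI-TESE": 18}

print(children)                      # 49 + 2*2 = 53
print(individuals)                   # 49*3 + 2*4 = 155
print(sum(by_conception.values()))   # 18 + 17 + 18 = 53
```

Under these assumptions the three conception groups add up to the 53 children, and the 155 individuals agree with the sample count listed for this dataset.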
    
   
  
    
      
      unspecified 
      
    
   
  155 
 
  
    EGAD00001008210 
   
  
    
    This dataset contains raw genotypes (SNVs, indels and SVs) from 785 samples of the 808-sample WGS GCAT cohort, without any filter applied.
    
   
  
    
   
  785 
 
  
    EGAD00001008211 
   
  
    
    Whole exome sequencing of diffuse intrinsic pontine gliomas (DIPG), primary patient-derived DIPG cell cultures and isogenic trametinib-resistant clones 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  56 
 
  
    EGAD00001008212 
   
  
    
    RNA sequencing of diffuse intrinsic pontine gliomas (DIPG), primary patient-derived DIPG cell cultures and isogenic trametinib-resistant clones 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001008213 
   
  
    
    RNA sequencing: 79 libraries were prepared from samples of osteosarcoma tumors biopsied at diagnosis, with the TruSeq Stranded mRNA kit following recommendations. The key steps consist of polyA mRNA capture with oligo-dT beads using 1 µg total RNA, fragmentation to approximately 400 bp, double-stranded DNA synthesis, ligation of Illumina adaptors, and amplification of the library by PCR for sequencing. Library sequencing was performed using Illumina sequencers (NextSeq 500 or HiSeq 2000/2500/4000) in 75 bp paired-end mode. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  79 
 
  
    EGAD00001008214 
   
  
    
    This is the RNAseq data from mucosal biopsies. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  440 
 
  
    EGAD00001008215 
   
  
    
    This is the dataset of 16S data from mucosal biopsies. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  833 
 
  
    EGAD00001008216 
   
  
    
    Dataset consists of 25 HCCs and 9 non-tumor livers from 8 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD00001008218 
   
  
    
    Whole genome sequencing for single cells for library A96228B 1125 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  6 
 
  
    EGAD00001008219 
   
  
    
    Whole genome sequencing for single cells for library A90679 1034 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001008220 
   
  
    
    Whole genome sequencing for single cells for library A95618A 876 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  9 
 
  
    EGAD00001008221 
   
  
    
    Whole genome sequencing for single cells for library A95628B 1335 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008222 
   
  
    
    Whole genome sequencing for single cells for library A95632D 644 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001008223 
   
  
    
    Whole genome sequencing for single cells for library A95635B 593 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001008224 
   
  
    
    Whole genome sequencing for single cells for library A95635D 367 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001008225 
   
  
    
    Whole genome sequencing for single cells for library A95654A 901 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008226 
   
  
    
    Whole genome sequencing for single cells for library A95662A 637 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001008227 
   
  
    
    Whole genome sequencing for single cells for library A95664B 630 samples; filetype=bam 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001008228 
   
  
    
    Whole genome sequencing for single cells for library A95707A 1359 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  9 
 
  
    EGAD00001008229 
   
  
    
    Whole genome sequencing for single cells for library A95724A 503 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  5 
 
  
    EGAD00001008230 
   
  
    
    Whole genome sequencing for single cells for library A95724B 581 samples; filetype=bam 
    
   
  
    
      
      NextSeq 550 
      
    
   
  5 
 
  
    EGAD00001008231 
   
  
    
    Whole genome sequencing for single cells for library A95732B 655 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008232 
   
  
    
    Whole genome sequencing for single cells for library A96145A 642 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  9 
 
  
    EGAD00001008233 
   
  
    
    Whole genome sequencing for single cells for library A96146A 1195 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008234 
   
  
    
    Whole genome sequencing for single cells for library A96149A 751 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008235 
   
  
    
    Whole genome sequencing for single cells for library A96149B 1792 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008236 
   
  
    
    Whole genome sequencing for single cells for library A96155B 1028 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008237 
   
  
    
    Whole genome sequencing for single cells for library A96157C 938 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008238 
   
  
    
    Whole genome sequencing for single cells for library A96162B 1316 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008239 
   
  
    
    Whole genome sequencing for single cells for library A96165A 779 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008240 
   
  
    
    Whole genome sequencing for single cells for library A96172A 1476 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008241 
   
  
    
    Whole genome sequencing for single cells for library A96172B 1694 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008242 
   
  
    
    Whole genome sequencing for single cells for library A96174B 1447 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008243 
   
  
    
    Whole genome sequencing for single cells for library A96175C 1473 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008244 
   
  
    
    Whole genome sequencing for single cells for library A96177B 683 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  9 
 
  
    EGAD00001008245 
   
  
    
    Whole genome sequencing for single cells for library A96179B 1396 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008246 
   
  
    
    Whole genome sequencing for single cells for library A96180B 1189 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008247 
   
  
    
    Whole genome sequencing for single cells for library A96183C 1036 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008248 
   
  
    
    Whole genome sequencing for single cells for library A96184A 1733 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008249 
   
  
    
    Whole genome sequencing for single cells for library A96186A 850 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008250 
   
  
    
    Whole genome sequencing for single cells for library A96186C 525 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008251 
   
  
    
    Whole genome sequencing for single cells for library A96199B 1170 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008252 
   
  
    
    Whole genome sequencing for single cells for library A96201A 536 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008253 
   
  
    
    Whole genome sequencing for single cells for library A96205A 1907 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  7 
 
  
    EGAD00001008254 
   
  
    
    Whole genome sequencing for single cells for library A96210C 1191 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008255 
   
  
    
    Whole genome sequencing for single cells for library A96211C 1397 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008256 
   
  
    
    Whole genome sequencing for single cells for library A96215A 1312 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008257 
   
  
    
    Whole genome sequencing for single cells for library A96216A 1001 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  9 
 
  
    EGAD00001008258 
   
  
    
    Whole genome sequencing for single cells for library A96220B 1267 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008259 
   
  
    
    Whole genome sequencing for single cells for library A96244A 1374 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008260 
   
  
    
    Whole genome sequencing for single cells for library A98176A 1003 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008261 
   
  
    
    Whole genome sequencing for single cells for library A98176B 1389 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008262 
   
  
    
    Whole genome sequencing for single cells for library A98284A 1265 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  4 
 
  
    EGAD00001008263 
   
  
    
    Whole genome sequencing for single cells for library A98284B 1582 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  2 
 
  
    EGAD00001008264 
   
  
    
    Whole genome sequencing for single cells for library A98289B 1032 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008265 
   
  
    
    Whole genome sequencing for single cells for library A98293B 703 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008266 
   
  
    
    Whole genome sequencing for single cells for library A98294A 994 samples; filetype=bam 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  5 
 
  
    EGAD00001008267 
   
  
    
    Whole exome sequencing of a Chinese girl with congenital cataract. 
The dataset contains one sample with two fastq files.
The novel PAX6 mutation (c.221G>A) is associated with congenital cataract, and the WFS1 mutation (c.2070_2079del) interactively aggravates this process. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008268 
   
  
    
    panALL exome sequencing, data set2, 700 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  700 
 
  
    EGAD00001008269 
   
  
    
    Whole exome, shallow whole genome, and RNA-sequencing data from a cohort of 168 women with breast cancer receiving neoadjuvant chemotherapy. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  336 
 
  
    EGAD00001008270 
   
  
    
    This dataset contains raw count data (10X CellRanger) for 28 Hodgkin lymphoma samples and 5 reactive lymph nodes, and merged data from all samples (RData object) including cell cluster assignments. 
    
   
  
    
   
  34 
 
  
    EGAD00001008271 
   
  
    
    Targeted sequencing of candidate regions on chromosome 22q predisposing to multiple schwannomas 
    
   
  
    
      
      NextSeq 550 
      
    
   
  51 
 
  
    EGAD00001008272 
   
  
    
   
  
    
   
  - 
 
  
    EGAD00001008273 
   
  
    
    Hypertensive disorders in pregnancy, of which the multisystem syndrome pre-eclampsia is the most severe, lead to preterm delivery, maternal mortality, and life-long complications. To elucidate early disease dynamics, we present the first spatio-temporal study comparing single-nuclei transcriptomes of human preterm pre-eclamptic placentae and healthy controls, contextualizing this in a comprehensive study including early and late gestational placentae. This study includes early placental samples from the fetal part (villi; n=10) and maternal part (decidua; n=3); late placental samples from healthy pregnancies (villi, n=6; decidua, n=4); and late placental samples from early-onset pre-eclamptic pregnancies (villi, n=5; decidua, n=5). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  22 
 
  
    EGAD00001008274 
   
  
    
    Whole exome sequencing of localized prostate cancer patients. The study contains paired tumor-normal samples with validated tumor content. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001008275 
   
  
    
    Fastq files for the 16S rDNA amplicon library of 714 fecal samples of 20 time series (as described in Vandeputte et al. 2021, Nature Communications) 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  714 
 
  
    EGAD00001008276 
   
  
    
    This dataset contains whole exome sequencing (WXS) and RNA-Seq on germline BRCA-mutant tumors from 18 patients. BAM files are provided for WXS on tumor and germline samples. FASTQ files are provided for the RNA-Seq samples. Sequencing was done on an Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  38 
 
  
    EGAD00001008277 
   
  
    
    Whole genome sequencing of cell free DNA from CSF across timepoints from medulloblastoma clinical trial patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  534 
 
  
    EGAD00001008278 
   
  
    
    We performed RNA-Seq in DIPG and hemispheric HGG. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  40 
 
  
    EGAD00001008279 
   
  
    
    We performed whole exome sequencing in DIPG and hemispheric HGG. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Ion Torrent Proton 
      
    
   
  30 
 
  
    EGAD00001008280 
   
  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  287 
 
  
    EGAD00001008281 
   
  
    
    Activating mutations in PIK3CA generate large clones in the aging human esophagus. Here we
investigate the underlying cellular mechanisms regulating their expansion by lineage tracing.
Expression of an activating heterozygous Pik3caH1047R mutation in single progenitor cells of the
mouse esophagus tilts cell fate towards proliferation, generating mutant clones that outcompete their
wild type neighbours. The mutation leads to increased aerobic glycolysis through the activation of
Hif1α transcriptional targets. In vitro and in vivo interventions that level out differences in activation
of the PI3K/HIF1α/aerobic glycolysis axis between wild type and Pik3caH1047R cells attenuate the
competitive advantage of the mutants. In contrast, metabolic conditions that alter Insulin/PI3K
signalling, such as type-1 diabetes or diet-induced insulin resistance, further increase Pik3caH1047R
mutant competitiveness in mice. Consistently, the density of activating PIK3CA mutations in human
esophagus is increased in overweight individuals. We conclude that the metabolic environment
influences the mutational landscape of normal epithelia. Clinically feasible interventions that even
out signalling imbalances between wild type and mutant cells may therefore limit the expansion of
oncogenic mutants in normal tissues. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  157 
 
  
    EGAD00001008282 
   
  
    
    This dataset contains samples from 5 patients with Ewing sarcoma. 5 samples have whole exome tumor data. 1 sample has tumor RNA-seq data. 4 samples have matched normal DNA sequence data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001008283 
   
  
    
    This dataset contains samples from 5 patients with Wilms tumor. 5 samples have whole exome tumor data. 4 samples have tumor RNA-seq data. 2 samples have matched normal DNA sequence data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  7 
 
  
    EGAD00001008284 
   
  
    
    16 DS and Control brain samples were prepared using the 10X Single Cell 3' v3 kit. Pre-fragmented libraries were selectively amplified for APP using custom designed primers. 
    
   
  
    
      
      Sequel 
      
    
   
  16 
 
  
    EGAD00001008285 
   
  
    
    16 DS and Control brain samples were prepared using the 10X Single Cell 3' v3 kit. Pre-fragmented libraries were selectively amplified for SPP1 using custom designed primers. 
    
   
  
    
      
      Sequel 
      
    
   
  16 
 
  
    EGAD00001008286 
   
  
    
    16 DS and Control brain samples were prepared using the 10X Genomics Single Cell 3' v3 kit. Pre-fragmented cDNA libraries were loaded onto a Pacific Biosciences Sequel II to sequence single-nucleus isoforms. 
    
   
  
    
      
      Sequel 
      
    
   
  16 
 
  
    EGAD00001008287 
   
  
    
    29 DS and Control brain samples were prepared using the 10X Genomics Single Cell 3' v3 kit. The cDNA libraries were sequenced on an Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD00001008288 
   
  
    
    16 DS and Control brain samples were prepared using the 10X Single Cell 3' v3 kit. Pre-fragmented libraries were selectively amplified for BIN1 using custom designed primers. 
    
   
  
    
      
      Sequel 
      
    
   
  16 
 
  
    EGAD00001008289 
   
  
    
    Whole-genome sequence (WGS) data of tumor-normal pairs from 139 ATL patients and RNA sequence (RNA-seq) data of tumors from 28 ATL patients. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  139 
 
  
    EGAD00001008290 
   
  
    
    panALL exome data set3, 650 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  650 
 
  
    EGAD00001008291 
   
  
    
    This dataset includes the whole exome sequencing of the tumor from a sinonasal glomangiopericytoma case together with the matching blood. The whole exome sequencing revealed somatic PIK3CA and CTNNB1 mutations. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001008297 
   
  
    
    Oligodendroglioma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008298 
   
  
    
    Oligodendroglioma, Anaplastic (WHO gr. 3) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008299 
   
  
    
    Oligodendroglioma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008300 
   
  
    
    Oligodendroglioma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008301 
   
  
    
    Oligodendroglioma, IDH-mutant, 1p19q codeleted 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008302 
   
  
    
    Oligodendroglioma, Anaplastic (WHO gr. 3) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008303 
   
  
    
    Unknown 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008304 
   
  
    
    Anaplastic Astrocytoma, IDH-mutant 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008305 
   
  
    
    Astrocytoma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008306 
   
  
    
    Oligodendroglioma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008307 
   
  
    
    Astrocytoma, Anaplastic (WHO gr. 3) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008308 
   
  
    
    Oligodendroglioma (WHO gr. 2) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008309 
   
  
    
    Sequencing data (BAM/CRAM) of diagnosis-relapse pairs of 12 children who relapsed very early, followed by a deep-sequencing validation of all identified mutations. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
      unspecified 
      
    
   
  88 
 
  
    EGAD00001008310 
   
  
    
    BAM files containing paired-end mtDNA sequencing data from human esophageal samples of individuals that had progressed to dysplasia or developed Barrett's esophagus (BE) post-esophagectomy. BE biopsies and the background mucosa were analysed. Each patient (JE*) has associated mtDNA sequencing data from biopsies of stroma, BE and squamous and cardia tissue. Two technical replicates, denoted "A" and "B", were analysed for each biopsy. Libraries were sequenced via the Illumina MiSeq platform v2 (Illumina, San Diego, CA, USA) 300 cycles (150 nt paired-end). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  80 
 
  
    EGAD00001008311 
   
  
    
    Single-cell count data generated by the Cellranger (10X Genomics). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  57 
 
  
    EGAD00001008314 
   
  
    
    This is a multi-omics study of an Asian longitudinal metastatic breast cancer (MBC) cohort treated with palbociclib plus endocrine therapy. It contains NGS data from baseline (BL) and progressive disease (PD) biopsies from 70 patients, consisting of 79 tumor/normal matched whole exome sequencing (WES) samples from 62 patients and 90 tumor whole transcriptome sequencing (WTS) samples from 70 patients. There were 56 BL biopsies profiled by WES and 64 by WTS; 23 PD biopsies were profiled by WES and 26 by WTS. Twenty and 23 patients had paired BL and PD biopsies profiled by WES and WTS, respectively. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  228 
 
  
    EGAD00001008315 
   
  
    
    This dataset contains FASTQ files generated from MT amplicon sequencing of 159 Sudanese individuals. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  159 
 
  
    EGAD00001008316 
   
  
    
    The dataset contains 295 plasma cfDNA samples from various stages of resectable esophageal adenocarcinoma from the PERFECT cohort and the nCRT cohort. Shallow WGS was performed on an Illumina NovaSeq S4 PE150bp. Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  295 
 
  
    EGAD00001008317 
   
  
    
    Bisulfite sequencing of a 3 kb region within the CpG island of NR3C1 exon 1 was performed with Illumina MiSeq. The 24 samples are from major hepatic or pancreatic surgery with complications (cases) and without complications (controls). The files are in FASTQ format. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  24 
 
  
    EGAD00001008318 
   
  
    
    Total RNA sequencing (SMARTer Stranded Total RNA-Seq Kit v2) data of extracellular RNA (exRNA) from liquid biopsies of a BRC0004PR PDX and SK-N-BE(2C) CDX mouse model, and total RNA sequencing profiles of the matching PDX tumors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  60 
 
  
    EGAD00001008319 
   
  
    
    The dataset contains sequencing data generated for the publication 'In utero origin of myelofibrosis presenting in adult monozygotic twins after a prolonged disease latency'. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008320 
   
  
    
    This dataset includes Illumina RNA Sequencing Data for 59 chronic lymphocytic leukemia patient samples. 57 samples are single end, 2 samples are paired end sequencing. 
    
   
  
    
   
  59 
 
  
    EGAD00001008321 
   
  
    
    The dataset contains 106 lung cancer, 12 healthy control and 11 non-cancerous lesion plasma cfDNA samples. Shallow WGS was performed on an Illumina NovaSeq S4 PE150bp. Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  129 
 
  
    EGAD00001008322 
   
  
    
    The dataset contains 6 lung cancer and 60 healthy control plasma cfDNA samples collected in EDTA, PAXGene and Norgen blood collection tubes at various locations. Shallow WGS was performed on an Illumina NovaSeq S4 PE150bp. Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  66 
 
  
    EGAD00001008323 
   
  
    
    Illumina sequencing data (fastq files) representing single-nucleus (sn) ATAC-seq, snRNA-seq, bulk ATAC-seq, and snATACseq+snRNAseq multiomics data from human and rat skeletal muscle samples (19 libraries total). Includes a README file that describes the relationship between libraries, samples, and files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001008325 
   
  
    
    In this study, we profiled the single-cell transcriptome (10x Genomics) of patient-derived xenograft (PDX) T-ALL relapse samples from patient P1. Primary human T-ALL cells were recovered from cryopreserved bone marrow aspirates of patients enrolled in the ALL-BFM 2009 study. PDX were generated as previously described by intrafemoral injection of 1 million viable primary ALL cells in NSG mice. PDX-derived (P1) cells were frozen until processing. For scRNA-seq library preparation, cryopreserved cells were thawed rapidly at 37 ℃ and resuspended in 10 ml warm Roswell Park Memorial Institute (RPMI) medium with 100 μg/ml DNase I. Cells were centrifuged for 5 min at 300 g, and resuspended in ice-cold phosphate buffered saline (PBS) with 2% foetal bovine serum (FBS) and 5 mM EDTA. Cells were stained on ice with anti-murine-CD45-PE (mCD45) (clone 30-F11; BioLegend; 1:20) in the dark for 30 min. DAPI (1:100) was added and incubated in the dark for 5 min before sorting. Triple negative cells (DAPI-mCD45-GFP-) were sorted (Fig. S27) using a BD FACSAria™ Fusion Cell Sorter into ice-cold 0.03% bovine serum albumin (BSA) in PBS. All isolated cells were immediately used for scRNA-seq libraries, which were generated as per the standard 10x Genomics Chromium 3′ (v.3.1 Chemistry) protocol. Completed libraries were sequenced on a NextSeq5000 sequencer (HIGH-mode, 75 bp paired-end). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001008326 
   
  
    
    Whole genome and Whole exome sequencing of patient-derived xenograft models of endometrial cancer 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  50 
 
  
    EGAD00001008327 
   
  
    
    This dataset contains .fastq files generated by targeted DNA sequencing of 542 cancer-associated and candidate genes (52 individuals), and targeted duplex sequencing of PIK3CA and TP53 genes (4 individuals). 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 550 
      
    
   
  156 
 
  
    EGAD00001008329 
   
  
    
    Exome sequencing and amplicon-based single-cell sequencing dataset on the patients and family members that were analyzed in this study. 
    
   
  
    
   
  11 
 
  
    EGAD00001008330 
   
  
    
    Set of 8 bam files from patients affected with Lupus. BAM alignments for exonic variants present in P2RY8 gene. VCF file describing the variants. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001008331 
   
  
    
    To model recovery dynamics, using severe COVID-19 as the example, we align heterogeneous recovery trajectories via a novel computational scheme applied to longitudinally sampled blood transcriptomes. We thus generate pseudotime trajectories, which we then link to cellular and molecular mechanisms based on cell deconvolution analysis and molecular pathway prediction, thus presenting a unique framework for studying recovery processes over time. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  258 
 
  
    EGAD00001008332 
   
  
    
    Tumor-blood paired whole-exome sequencing of 58 pairs of non-muscle-invasive bladder cancer samples (stage T1). Targeted sequencing of 112 non-muscle-invasive bladder cancer samples (34 stage T1; 78 stage Ta).
Please note the following files have been removed: EGAR00003025153, EGAR00003025435, EGAR00003025294, EGAR00003025262, EGAR00003025224. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  339 
 
  
    EGAD00001008333 
   
  
    
    Small variants in mtDNA of several Canary Islanders sequenced with Illumina WGS and WES and Oxford Nanopore Technologies WGS. 
    
   
  
    
   
  36 
 
  
    EGAD00001008334 
   
  
    
    Genomic data from a cohort of 19 MMR deficient colorectal cancers and 1 MMR proficient colorectal cancer. All cases were target gene DNA sequenced using multiple primary and where available metastatic tumour regions from surgical resection samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  91 
 
  
    EGAD00001008335 
   
  
    
    The dataset contains raw RNA-seq data of human adipocytes from 13 individuals. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  13 
 
  
    EGAD00001008336 
   
  
    
    The dataset includes sequencing data from 23 patients diagnosed with metastatic melanoma. The 23 metastatic melanoma subtypes consisted of cutaneous melanoma (CM, n=10); head and neck melanoma (HNM, n=7); uveal melanoma (UM, n=4); acral lentiginous melanoma (AM, n=1) and mucosal melanoma (MM, n=1). 
    
   
  
    
      
      unspecified 
      
    
   
  23 
 
  
    EGAD00001008337 
   
  
    
    This project includes RNA-sequencing data from human FSHD and control skeletal muscle biopsies. This project includes data from 28 FSHD patients (total 37 samples, including vastus lateralis and tibialis anterior muscles) and 12 control individuals (total 24 samples, including vastus lateralis and tibialis anterior muscles). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  65 
 
  
    EGAD00001008338 
   
  
    
    38 samples with DCIS and matched recurrences sequenced with a targeted mutation panel on IonTorrent. 
    
   
  
    
      
      Ion Torrent PGM 
      
    
   
  76 
 
  
    EGAD00001008339 
   
  
    
    Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – filtered vcf files 
    
   
  
    
   
  551 
 
  
    EGAD00001008340 
   
  
    
    Single-cell RNAseq dataset of paired normal and tumor human prostate biopsies from n=10 participants. Fastq files corresponding to R1, R2 and I1 are uploaded and were generated from cellranger mkfastq. Data was sequenced on Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001008341 
   
  
    
    Whole genome sequencing from paired tumour and germline malignant pleural mesothelioma samples 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  74 
 
  
    EGAD00001008342 
   
  
    
    Acne meta-analysis 
    
   
  
    
   
  1 
 
  
    EGAD00001008343 
   
  
    
    Patient neuroblastoma hybrid capture sequencing panel. 5 samples from 2 donors (BAM files). For each donor, we obtained neuroblastoma tumor samples and neuroblastoma ALKi resistant samples. This dataset was used to study ALKi resistance in neuroblastoma. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  5 
 
  
    EGAD00001008344 
   
  
    
    Enriched tumor epithelium, tumor-associated stroma, and whole tissue were collected by laser microdissection from thin sections across spatially separated levels of ten high-grade serous ovarian carcinomas (HGSOCs) and analyzed by mass spectrometry, reverse phase protein arrays, and RNA sequencing. Unsupervised analyses of protein abundance data revealed independent clustering of an enriched stroma and enriched tumor epithelium, with whole tumor tissue clustering driven by overall tumor “purity.” Comparing these data to previously defined prognostic HGSOC molecular subtypes revealed protein and transcript expression from tumor epithelium correlated with the differentiated subtype, whereas stromal proteins (and transcripts) correlated with the mesenchymal subtype. Protein and transcript abundance in the tumor epithelium and stroma exhibited decreased correlation in samples collected just hundreds of microns apart. These data reveal substantial tumor microenvironment protein heterogeneity that directly bears on prognostic signatures, biomarker discovery, and cancer pathophysiology and underscore the need to enrich cellular subpopulations for expression profiling. 
    
   
  
    
      
      Ion Torrent S5 XL 
      
    
   
  49 
 
  
    EGAD00001008345 
   
  
    
    Using the chromium 3' expression assay, we generated an atlas of neuroblastoma and the human fetal adrenal gland.  These data were complemented with whole genome sequencing of normal and tumour DNA from the neuroblastoma samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD00001008346 
   
  
    
    Raw RNAseq paired end fastq files of MCL control samples (3 samples) and MCL samples transduced with a retrovirus expressing mutated NOTCH1 (3 samples) or NOTCH2 (3 samples). Instrument used: Illumina NovaSeq 6000 
    
   
  
    
   
  1 
 
  
    EGAD00001008347 
   
  
    
    We profiled 4 high-grade gliomas patient brain tumor samples by single-cell ATAC-seq using the 10X Chromium 3' technology. The raw fastq and index files are provided. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001008348 
   
  
    
    We profiled 9 high-grade gliomas patient tumor samples by bulk RNA-seq. The raw fastqs are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  9 
 
  
    EGAD00001008349 
   
  
    
    We profiled 10 high-grade gliomas patient brain tumor samples by single-cell multiome ATAC + gene expression, using the 10X Chromium technology.
3 sets of fastq are provided for each sample: R1 and R2 for gene expression, R1 and R2 for ATAC-seq as well as index1 and index2 for ATAC-seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001008350 
   
  
    
    We profiled 15 patient brain tumor samples by ChIP-seq. Inputs are provided for 16 samples, H3K27ac is provided for 15 samples, H3K27me3 is provided for 10 samples and H3K27me3 is provided for 5 samples.
The raw bam files are provided. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  52 
 
  
    EGAD00001008351 
   
  
    
    We profiled 7 high-grade gliomas patient brain tumor samples by single-cell RNA-seq and 18 single-nuclei RNA-seq using 10X Chromium 3' technology. The raw fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD00001008352 
   
  
    
    WGS data of buffy coat from CRC patients 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  7 
 
  
    EGAD00001008353 
   
  
    
    The data contained in this dataset is ChipSeq BAM files aligned to reference genome hg38. The ChipSeq was based on a combination of six histone modifications as follows: H3K4me1, H3K4me3, H3K9me3, H3K27me3, H3K27Ac and H3K36me3. The samples are patient-derived xenografts generated by passaging primary patient CD138+ selected cells through the SCID-rab myeloma mouse model. 
    
   
  
    
      
      unspecified 
      
    
   
  42 
 
  
    EGAD00001008354 
   
  
    
    50 Whole genome sequences from 50 Mexican individuals with a high proportion of Native American ancestry. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  50 
 
  
    EGAD00001008356 
   
  
    
    RNA sequencing data of in vitro differentiated megakaryocyte cells transduced with E527K and WT SRC. CD34+ hematopoietic stem cells (HSC) were isolated from healthy controls before transduction with WT-SRC and E527K-SRC lentiviral vectors in triplicate and differentiation to MK. Three replicates each of two pools were generated for both WT and E527K SRC transduced cells, resulting in 3 WT pool 1 samples, 3 WT pool 2 samples, 3 E527K pool 1 samples and 3 E527K pool 2 samples for a total of 12 samples. RNA was extracted and sequenced with the following parameters: Platform: Illumina HiSeq4000, Library Prep Kit: TruSeq stranded mRNA, Sequencing Kit: Illumina HiSeq4000 100 cycles (76-8-8-7), Fragments: single end / fr-firststrand. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001008357 
   
  
    
    Briefly, twenty paired tumor and germline DNAs were extracted from patients’ BM and from buccal mucosa, respectively. Samples were subjected to massively parallel sequencing using the HiSeq 2000, HiSeq2500, HiSeq X Ten, and/or NovaSeq 6000 according to the manufacturer’s instructions. Sequencing reads were aligned to NCBI Human Reference Genome Build 37 (hg19) by Burrows−Wheeler Aligner, version 0.7.10, with default parameters (http://bio-bwa.sourceforge.net/). PCR duplicates were eliminated using Picard tools version 1.39 (GATK). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  40 
 
  
    EGAD00001008358 
   
  
    
    In vitro and in vivo drug screens of tumor cells identify novel therapies for high-risk child cancer 
    
   
  
    
      
      HiSeq X Ten 
      
      NextSeq 500 
      
    
   
  94 
 
  
    EGAD00001008359 
   
  
    
    WGS and WGBS data from monocyte-derived macrophages that were infected with Influenza A virus strain PR8WT, or a matching non-infected control. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  70 
 
  
    EGAD00001008360 
   
  
    
    Mutational burden and profiles to be studied in approx. 500 human primary melanomas with matched normal samples, part of the Leeds melanoma cohort. New custom design targeted capture panel covering melanoma-specific copy number alterations, promoter mutations, gene fusions, coding genes, HLA regions and IFNg/JAK/STAT pathway genes. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001008361 
   
  
    
    RNA-seq transcriptomics of whole blood samples from longitudinal follow-up of a cohort of visceral leishmaniasis (VL) patients with and without HIV coinfection, from active disease through apparent cure and potential relapse. Analysis will identify potential correlates of relapse to identify immune mechanisms underlying the high rate of relapse in HIV/VL coinfection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  249 
 
  
    EGAD00001008362 
   
  
    
    27 Fresh frozen tissue specimens were crushed by mortar and pestle, homogenized using the QIAShredder kit (Qiagen), and genomic DNA and total RNA were extracted using the AllPrep DNA/RNA Mini kit (Qiagen), according to the manufacturer’s instructions. RNA libraries were synthesized using 200 ng of total RNA using the Illumina TruSeq Stranded RNA LT Sample Prep Kit (Illumina), and subsequently sequenced on the NextSeq550 platform to a read depth of 80 million clusters and 160 million paired end reads (75 bp X 75 bp) using V2 chemistry. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  27 
 
  
    EGAD00001008363 
   
  
    
    Two tables containing RNA-Seq expression values for patients with RNA-Seq data in the study "Comprehensive genomic characterization of refractory multiple myeloma (HIPO_067)". From the bam files gene expression was calculated with the annotation of Gencode.v19. Raw Counts and TPM values are given in one table, the other contains filtered TMM normalized CPM values (genes < 1 CPM omitted). 
    
   
  
    
   
  - 
 
  
    EGAD00001008364 
   
  
    
    Genomic profiling of effusion-based fluid samples from 8 HHV8-negative effusion-based lymphoma patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001008365 
   
  
    
    Captured single-cell long-read data of a cohort of CLL patients receiving VEN treatment for resistance study 
    
   
  
    
      
      PromethION 
      
    
   
  25 
 
  
    EGAD00001008366 
   
  
    
    CITEseq data of CLL patients receiving VEN treatment for resistance study 
    
   
  
    
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD00001008367 
   
  
    
    Single-cell Long read data of a cohort of CLL patients receiving Venetoclax treatment for VEN resistance study. 
    
   
  
    
      
      PromethION 
      
    
   
  25 
 
  
    EGAD00001008368 
   
  
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  56 
 
  
    EGAD00001008370 
   
  
    
    ATAC-seq dataset on a patient (P) presenting with defects of immunity and two (C5, C6) healthy donors. This dataset contains raw and processed files from ATAC-seq chromatin accessibility analysis. 
There are 3 single-read (50 bp) fastq files (1 per patient/ donor). Processed files consist of narrowPeak files (1 per patient/ donor) and one file that contains read counts in consensus peaks. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  3 
 
  
    EGAD00001008371 
   
  
    
    RNA-seq dataset on a patient (P) presenting with defects of immunity and three healthy donors (C1, C5, C6). This dataset contains raw and processed files from RNA-seq transcriptome analysis performed according to the Smart-seq2. 
There are 24 single-read (50 bp) fastq files, 6 per patient/donor consisting of 2 cell types and 3 replicates per cell type. There is one count matrix file generated using featureCounts against Ensembl v98 gene models. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001008372 
   
  
    
    scRNA-seq dataset on a patient (P) presenting with defects of immunity and four healthy donors (C1, C2, C3, C4). This dataset contains raw and processed files from scRNA-seq performed on samples using the 10x Genomics Chromium Controller with the Chromium Single Cell 3′ Reagent Kit (v3 chemistry). 
There are 15 paired-end fastq files (3 per patient/donor - I1, R1, R2) and 15 processed files generated with 10x Genomics Cell Ranger v3.0.2 software against GRCh38 human reference transcriptome (3 per patient/donor - barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001008373 
   
  
    
    To gain insight into the clonal heterogeneity of diagnosis (Dx) and relapse (Re) pairs, we employed single-cell RNA-seq (SORT-seq) to longitudinally profile two t(8;21) (AML1-ETO = RUNX1-RUNX1T1), and four FLT3-ITD AML cases. All the samples are bone marrow aspirates. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  30 
 
  
    EGAD00001008374 
   
  
    
    To gain insight into the clonal heterogeneity of diagnosis (Dx) and relapse (Re) pairs, we employed RNA-seq to longitudinally profile two t(8;21) (AML1-ETO = RUNX1-RUNX1T1), and four FLT3-ITD AML cases. All the samples are bone marrow aspirates. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001008375 
   
  
    
    To gain insight into the clonal heterogeneity of diagnosis (Dx) and relapse (Re) pairs, we employed whole exome sequencing to longitudinally profile two t(8;21) (AML1-ETO = RUNX1-RUNX1T1), and four FLT3-ITD AML cases. All the samples are bone marrow aspirates. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  18 
 
  
    EGAD00001008376 
   
  
    
    105 Normal, DCIS and recurrence samples target-sequenced 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  105 
 
  
    EGAD00001008377 
   
  
    
    The raw fastq files for 30 whole exome and 30 whole genome sequencing for normal endometrial glands. The paired-end sequencing data sets (R1 and R2) are deposited. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  60 
 
  
    EGAD00001008379 
   
  
    
    KCL SNP array samples for copy number analysis 
    
   
  
    
      
      unspecified 
      
    
   
  96 
 
  
    EGAD00001008380 
   
  
    
    KCL lpWGS samples for copy number analysis 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  33 
 
  
    EGAD00001008381 
   
  
    
    We used single-cell transcriptomics to study cells from the developing human cerebellum, and show that different molecular subgroups of medulloblastoma resemble distinct glutamatergic progenitors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001008382 
   
  
    
    Multi-region WES from 4 NSCLC patients, totaling 12 tumor samples and 4 matched control samples. The files were submitted as bam files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  16 
 
  
    EGAD00001008383 
   
  
    
    This dataset consists of 60 mRNA sequencing runs from full blood of 31 myotonic dystrophy type 1 patients, of which for 27 patients reliable data is available before and after 10 months of cognitive behavioural therapy. 
>30 million 150 bp paired end reads were obtained with UMI-labeled adapters to facilitate filtering of PCR duplicates.
Via UMI-analysis we found samples with the aliases sample_01 and sample_02 to contain a very high number of PCR duplicates and recommend the use of these samples only with highest caution or not at all. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  60 
 
  
    EGAD00001008384 
   
  
    
    RNA Sequencing upon shRNA mediated depletion of RAF kinases or treatment with Cobimetinib (GDC-0973, 250nM, 6hrs) or with pan RAFi (AZ-628, 10uM, 6hrs) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD00001008385 
   
  
    
    Stage I and stage III/IV Follicular lymphoma samples, shallow whole genome sequencing for copy number analysis and targeted capture sequencing for mutation and translocation analysis. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  269 
 
  
    EGAD00001008386 
   
  
    
    Shallow whole genome sequencing and targeted sequencing of DLBCL patients treated in the PETAL trial 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  216 
 
  
    EGAD00001008387 
   
  
    
    Shallow whole genome sequencing for copy number analysis and targeted capture sequencing data for translocation and mutation analysis of paired primary and relapse PCNSL and PTL samples 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  335 
 
  
    EGAD00001008389 
   
  
    
    Shallow whole genome sequencing and targeted sequencing of DLBCL patients treated in the HOVON84 trial 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  220 
 
  
    EGAD00001008390 
   
  
    
    This dataset contains log2(TPM + 1) for 192 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from POPLAR (GO28753). 
    
   
  
    
   
  - 
 
  
    EGAD00001008391 
   
  
    
    This dataset contains log2(TPM + 1) for 699 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from OAK (GO28915). 
    
   
  
    
   
  - 
 
  
    EGAD00001008392 
   
  
    
    The purpose of this project is to provide public human datasets for the study of rare diseases. The use of public human genomic background combined with the in-silico insertion of real disease-causing variants enable to have a representative dataset for testing purposes without facing ethical and legal issues associated with the use of human sensitive data. This project aims to help development of technical implementations for rare disease data integration, analysis, discovery,  and federated access. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001008393 
   
  
    
    Raw FASTQ files obtained by RNA sequencing of tumor samples from patients (age 12-29) with newly diagnosed, recurrent intermediate or high-grade sarcoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001008394 
   
  
    
    Raw FASTQ files obtained from whole exome sequencing (WES) of tumor samples from patients with newly diagnosed, recurrent intermediate or high-grade sarcoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  51 
 
  
    EGAD00001008396 
   
  
    
    Targeted next-generation sequencing (NGS) of 93 frequently mutated genes in breast cancer using the QIAseq Human Breast Cancer Targeted Panel (QIAGEN), which uses digital sequencing by incorporating unique molecular barcodes (UMI). 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 550 
      
    
   
  187 
 
  
    EGAD00001008397 
   
  
    
    Paired end shallow whole genome sequencing (sWGS) data for the identification of genomewide somatic copy number alterations (SCNA) and the estimation of tumor fractions. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  185 
 
  
    EGAD00001008398 
   
  
    
    Exome sequencing data of 24 Brugada syndrome individuals 
    
   
  
    
      
      NextSeq 500 
      
    
   
  23 
 
  
    EGAD00001008399 
   
  
    
    42 NGS libraries of a 13y/o FFPE sample, a tissue-and-patient-matched FF sample, and a GIAB sample (NA12878). In technical replicates (untreated DNA, treated DNA, two different library types, at least library duplicates for each case). Illumina NextSeq, HiSeq and NovaSeq paired-end sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  42 
 
  
    EGAD00001008400 
   
  
    
    116 Whole Genome Sequencing (WGS) samples from the TB-DAR study, based on a cohort of adult pulmonary tuberculosis patients recruited in Dar es Salaam, Tanzania. 
WGS was performed at the Health2030 Genome Center in Geneva on the Illumina NovaSeq 6000 instrument (Illumina Inc, San Diego CA, USA), starting from 1 μg of whole blood genomic DNA and using Illumina TruSeq DNA PCR-Free reagents for library preparation and the 150nt paired-end sequencing configuration. Average coverage was above 30X for 75 samples, between 10X and 30X for 40 samples, and approximately 8X for a single sample. Sequencing reads were aligned to the GRCh38 (GCA_000001405.15) reference genome using bwa (Version 0.7.17). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  116 
 
  
    EGAD00001008401 
   
  
    
    128 samples with DCIS and matched recurrences sequenced with lpWGS 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Ion Torrent PGM 
      
    
   
  52 
 
  
    EGAD00001008402 
   
  
    
    small RNA next generation sequencing in head and neck cancer 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  51 
 
  
    EGAD00001008403 
   
  
    
    This dataset contains raw sequencing reads in FASTQ format from single-nuclei (30 samples) and bulk tissue (40 samples) transcriptome sequencing of pheochromocytoma and paraganglioma tissue specimens.  Additionally, data from single-nuclei sequencing of two normal adrenal medulla specimens is included. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  70 
 
  
    EGAD00001008405 
   
  
    
    Raw FASTQ files obtained from whole exome sequencing (WES) of normal samples from patients with newly diagnosed, recurrent intermediate or high-grade sarcoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001008407 
   
  
    
    RNAseq files for Klco RPAML study titled "Genomics of pediatric myeloid neoplasms" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  173 
 
  
    EGAD00001008408 
   
  
    
    RNA sequencing was performed on 15 T-LGLL patients and five control samples. The raw data is provided as fastq files. 
    
   
  
    
      
      unspecified 
      
    
   
  20 
 
  
    EGAD00001008409 
   
  
    
    Single-cell RNA sequencing was performed on viably frozen cells from 11 T-LGLL samples from 9 T-LGLL patients and 6 age-matched healthy samples. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  92 
 
  
    EGAD00001008410 
   
  
    
    Nascent transcriptome (GRO-seq) data representing bone marrow mononuclear cells of two diagnostic T-ALL samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001008411 
   
  
    
    Organoid cultures derived from normal colon and/or colorectal adenomas and/or colorectal carcinomas. RNA and DNA  was isolated from these cultures for genome wide profiling. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001008412 
   
  
    
    In this study, we identified miR-130a as a regulator of HSC self-renewal and differentiation. To characterize gene expression changes following enforced expression of miR-130a OE, we performed RNA-seq in CD34+ cord blood (CB) cells transduced with control and miR-130a OE lentiviruses. To capture miRNA targets in an unbiased, transcriptome-wide manner, we performed enhanced CLIPseq protocol in 2 replicates of CD34+ CB cells and Kasumi-1 cell line, which represent a model system for t(8;21) AML. We chose this cell line, as we found miR-130a to be highly expressed in this AML subtype where it is critical for maintaining the oncogenic molecular program mediated by AML1-ETO. Chimeric Ago2 eCLIPseq in CD34+ CB cells combined with Mass Spectrometry data analysis identified TBL1XR1 as a principal target of miR-130a. To elucidate gene expression changes associated with TBL1XR1 loss of function, we performed RNA-seq in CD34+CD38- CB cells transduced with control and shRNA targeted against TBL1XR1. To determine the functional significance of high miR-130a expression levels in Kasumi-1 cells on the molecular network controlled by AML1-ETO, we performed CUT&RUN assay and RNA-seq in Kasumi-1 cells following miR-130a knock-down (KD). Collectively, our findings reveal a unique role of miR-130a in regulating normal hematopoietic stem cell self-renewal and how elevated levels of miR-130a in t(8;21) AML contribute to the leukemogenesis of this AML subtype. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  36 
 
  
    EGAD00001008413 
   
  
    
    WGS files for Klco RPAML study titled "Genomics of pediatric myeloid neoplasms" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  158 
 
  
    EGAD00001008416 
   
  
    
    WGS (tumor and germline samples) was performed to identify structural variants in the UBTF/CDX2 subgroup.
RNA-seq was performed to detect gene fusion in the UBTF/CDX2 subgroup.
HiChIP was performed to investigate 3D chromatin architecture and enhancer landscapes of representative patient samples and cell lines harboring Type I and II FLT3-PAN3 deletions and amplifications. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001008417 
   
  
    
    Transcriptomic profiling of skin biopsies from psoriasis patients following treatment with deucravacitinib 
    
   
  
    
   
  120 
 
  
    EGAD00001008418 
   
  
    
    To understand the impact of enzymatic treatments on gene expression and epitope preservation on major immune cell populations, skin dissociation (SkinD) and solid soft tumor dissociation (TumorD) were tested on three healthy PBMC samples in triplicate (D1, D2, D3), against an untreated control.
CITE-seq performance was assessed on a solid biopsy cohort of 11 samples (5 healthy skin samples, 3 primary melanoma samples, 3 melanoma metastasis samples) as well as on a  liquid biopsy PBMC cohort consisting of three healthy donors and three immunotherapy-treated melanoma patients.
This dataset contains the GEX data for each sample.
Data is provided in the form of pooled BAM files. Linkage between samples, BAM files and hashtags is provided in a separate linkage file. 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001008419 
   
  
    
    To understand the impact of enzymatic treatments on gene expression and epitope preservation on major immune cell populations, skin dissociation (SkinD) and solid soft tumor dissociation (TumorD) were tested on three healthy PBMC samples in triplicate (D1, D2, D3), against an untreated control.
CITE-seq performance was assessed on a solid biopsy cohort of 11 samples (5 healthy skin samples, 3 primary melanoma samples, 3 melanoma metastasis samples) as well as on a  liquid biopsy PBMC cohort consisting of three healthy donors and three immunotherapy-treated melanoma patients.
This dataset contains the ADT/SPEX data for each sample.
Data is provided in the form of pooled BAM files. Linkage between samples, BAM files and hashtags is provided in a separate linkage file. 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001008420 
   
  
    
    Exome sequencing study on 4 individuals from a pedigree with CHH and cerebellar hypoplasia. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001008421 
   
  
    
    This dataset contains RNAseq data of 20 paired pre-post neoadjuvant chemotherapy breast cancer samples. In total the set contains n=20 biopsies, n=20 surgery specimens. Each sample has 2 fastq files, so n=80 fastq files are uploaded in total. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  40 
 
  
    EGAD00001008422 
   
  
    
    RNA-seq, ATAC-seq and ChIPmentation data from monocyte-derived macrophages that were infected with Influenza A virus strain PR8WT, or a matching non-infected control. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001008423 
   
  
    
    3 control iPSC lines differentiated into iPSC-derived motor neurons transduced with either EGFP or NOVA1 lentivirus. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001008424 
   
  
    
    iPSC-derived motor neurons from sporadic ALS and Controls. 4 sALS iPSC lines and 4 Ctrl iPSC lines. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001008425 
   
  
    
    iPSC-derived motor neurons from familial ALS and Controls. 2 fALS iPSC lines and 3 Ctrl iPSC lines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008426 
   
  
    
    eCLIP of TDP-43 from iPSC-derived motor neurons in 2 control lines. Per line input and IP samples and analysis including bigWig files and peak files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001008427 
   
  
    
    iPSC-derived motor neurons from 5 NOVA1 knock out and 5 NOVA1 wt lines in the CVB background. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001008428 
   
  
    
    eCLIP of NOVA1, NOVA2 and RBFOX2 from iPSC-derived motor neurons in 2 control lines. Per line and RNA-binding protein input and IP samples and analysis including bigWig files and peak files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001008429 
   
  
    
    Set of FASTQ sequences generated from Urine Liquid Biopsy in 12 Bladder Cancer Patients using the IDT PanCancer Panel, Illumina Nextera Flex for Enrichment libraries (aka DNA Prep libraries) and Illumina NovaSeq 2x150bp sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001008430 
   
  
    
    WES data of a HCC with neuroendocrine differentiation (HCC-NED), normal and organoid from a 74-year-old man. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001008431 
   
  
    
    Single-cell RNA-seq of first-, second-, and third-generation patient-derived organoids. Obtained using the 10X Genomics single-cell 3' expression solution (v3 chemistry). First- and second-generation PDOs from one patient and first-, second-, and third-generation PDOs from three additional patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001008432 
   
  
    
    Targeted panel sequencing data from PanNEN samples. Sample ID is annotated in the following manner: each patient is given a number and "P" is appended to the patient number if it is a primary tumor, "M" if it is metastasis and "N" if it is normal (healthy tissue) sample. All NETG1 and NETG2 samples underwent panel sequencing using a custom panel (in-house PanNEN panel). All NEC and NETG3 samples (except PNET2, PNET77P and PNET77M) underwent panel sequencing using a commercial CCP panel. 
    
   
  
    
   
  103 
 
  
    EGAD00001008433 
   
  
    
    This dataset contains RNAseq data of n=87 pre-treatment biopsies of triple negative and luminal- type breast cancer patients, all scheduled to receive neoadjuvant chemotherapy. Gene expression data is linked with treatment response and survival. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008434 
   
  
    
    This dataset includes 23 specimens from osteosarcoma patients (primary, relapsed, metastatic). It contains bam files from RNA sequencing using a library in which coding regions of cDNA are captured and short-read, paired-end sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001008435 
   
  
    
    This dataset contains 86 osteosarcoma samples and their matched normals that underwent RNA sequencing using size fractionation, NuGEN Ovation Ultralow Library System V2 preparation, and paired-end sequencing on Illumina HiSeq 2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008436 
   
  
    
    This dataset contains 86 osteosarcoma samples and their matched normals that underwent RNA sequencing using size fractionation, NuGEN Ovation Ultralow Library System V2 preparation, and paired-end sequencing on Illumina HiSeq 2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  86 
 
  
    EGAD00001008437 
   
  
    
    Aggregated VCF file from cancer genes panel seq for the initial (n=500) cohort of solid tumors screened for the Basket of Baskets study 
    
   
  
    
   
  1 
 
  
    EGAD00001008438 
   
  
    
    This dataset contains fastq files from four tumours that underwent targeted panel sequencing for suspected VHL disease. The samples in the dataset and their corresponding sample IDs are: ccRCC - M19-12422, Pheochromocytoma - M19-13800, Expelled lung tissue - M19-13801, and Liver biopsy - M19-13802. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001008441 
   
  
    
    This dataset contains multi-region sequencing of 16 RCC patients with venous tumor thrombus (VTT), 11 of whom were either metastatic on diagnosis or recurred with metastasis. Whole exome sequencing is available for 94 samples across all 16 patients, including 1 matched normal sample per patient, 2-3 primary tumor samples per patient, 1-2 VTT samples per patient, and 0-3 metastasis samples per patient (metastatic lesions were only sampled for 8 of the 11 metastatic patients). RNAseq, generated by exome capture, is available for 67 samples across 12 patients, including 0-1 matched normal samples per patient, 3 primary tumor samples per patient, 0-1 VTT samples per patient, and 0-3 metastasis samples per patient (RNAseq was only available for 4 of the 8 patients from whom metastatic lesions were sampled). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  94 
 
  
    EGAD00001008442 
   
  
    
    This dataset contains whole exome sequencing (WES) data of 20 paired pre- and post-neoadjuvant chemotherapy breast cancer samples. From every patient, a pre-treatment biopsy (B) and a post-treatment surgery (S) specimen have been sequenced. From most patients, a paired normal blood sample (N) has been sequenced as a reference control. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008443 
   
  
    
    scRNA-seq dataset on a patient (P_IKZF2-het) presenting with immune dysregulation. This dataset contains raw and processed files from scRNA-seq performed on samples using the 10x Genomics Chromium Controller with the Chromium Single Cell 3′ Reagent Kit (v3 chemistry). 
There are three paired-end (75 bp) fastq files (I1, R1, R2) and three processed files generated with 10x Genomics Cell Ranger v3.0.2 software against GRCh38 human reference transcriptome (scrnaseq_P_IKZF2-het_barcodes.tsv.gz, scrnaseq_P_IKZF2-het_features.tsv.gz, scrnaseq_P_IKZF2-het_matrix.mtx.gz). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001008444 
   
  
    
    Long-range sequencing with low error rate has been challenging. Sequence assembly and phasing usually require a high-quality reference genome for mapping, so working on highly-variable genomic regions or regions with no reference genome information would be difficult. In this study, we describe novel bench protocols and algorithms to obtain ultra-low-error-rate haplotype-phased sequence assemblies of regions 10 KB in length using a short-read sequencing platform that simultaneously solves the above two problems. We accomplish this by imprinting each template strand from a target region with a dense and unique mutation pattern. The mutation process randomly and independently converts ~50% of cytosines to uracils. Short-read sequencing libraries are made from both mutated and unmutated templates. A conservative de Bruijn graph approach seeds an assembly of the mutated templates, which we then extend by mapping paired-end reads. We next partition the template assemblies into two or more haplotypes after using the unmutated sequence library to recover almost all of the mutated bases. The final haplotype is assembled and corrected for residual template mutations and PCR errors. We obtain per-base-error rates below 10^-9. We apply this method to a human family, correctly assembling and phasing three genomic intervals, including the highly polymorphic HLA-B gene. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  4 
 
  
    EGAD00001008445 
   
  
    
    Functional screening on patient-derived organoids identifies a therapeutic bispecific antibody that triggers EGFR degradation in LGR5+ tumor cells 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  131 
 
  
    EGAD00001008446 
   
  
    
    Remaining WGS files for study titled "Genomics of pediatric myeloid neoplasms" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  10 
 
  
    EGAD00001008447 
   
  
    
    Whole genome sequence from paired tumour and germline samples from mesothelioma patients 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  42 
 
  
    EGAD00001008448 
   
  
    
    Stool samples were collected from 2,509 Estonian Biobank participants. The shotgun metagenomic paired-end sequencing was performed by Novogene Bioinformatics Technology Co., Ltd. using the Illumina NovaSeq6000 platform, resulting in 4.62 ± 0.44 Gb of data per sample (insert size, 350 bp; read length, 2 × 250 bp). A total of 2,513 samples belonging to 2,509 individuals were sequenced, including 4 biological replicates from one individual. First, the reads were trimmed for quality and adapter sequences. The host reads that aligned to the human genome were removed using SOAP2.21 (parameters: -s 135 -l 30 -v 7 -m 200 -x 400). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2513 
 
  
    EGAD00001008449 
   
  
    
    RNA-seq 10 subsets; 5 donors. ATAC-seq 9 subsets, 4 donors; Histone modification profiling 10 subsets, 2 donors all using human NK cell and T cell subsets. TF ChIP-seq Bcl11b, Bach2, Runx2, Gata3, PLZF.
Illumina sequencing platform; ATAC-seq is paired-end, RNA-seq/ChIP-seq is single-end. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 3000 
      
    
   
  1 
 
  
    EGAD00001008450 
   
  
    
    This study contains whole genome sequencing data and whole exome sequencing data of IMPC tumor and normal tissue samples. 
    
   
  
    
      
      unspecified 
      
    
   
  460 
 
  
    EGAD00001008452 
   
  
    
    We extracted DNA from whole blood or lymphoblast-derived cell lines and assessed the DNA quality with PicoGreen™ and gel electrophoresis. Whole genome sequencing was performed (Illumina HiSeq2000 and Illumina HiSeq X). WGS reads were mapped to the human reference genome assembly hg19 (GRCh37) using Burrows-Wheeler Aligner v.0.7.12 (TCAG) or Isaac v.2.0.13 (Macrogen). For each genome, we performed local realignment and quality recalibration and detected SNVs and small indels using GATK Haplotype Caller v.3.4.6 without genotype refinement. We detected CNVs using ERDS (estimation by read depth with single nucleotide variants) and CNVnator. We detected structural variants using Manta v.0.29.6. When supported by the variant caller (i.e. GATK and Manta), trio-based joint variant calling was conducted for each family. 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
    
   
  112 
 
  
    EGAD00001008453 
   
  
    
    Raw, unfiltered fastq files obtained through RNA-seq of endometrial organoids from MRKH patients and controls. The dataset is divided into three parts, depending on the growth conditions of the organoids, i.e. expansion medium or hormone treatment. Each sample consists of two paired-end fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  33 
 
  
    EGAD00001008454 
   
  
    
    We also collected samples from 8 NSCLC patients and 4 ovarian cancer patients. For all 8 NSCLC patients, a tumor biopsy sample, a WBC sample, and three plasma samples were collected. For all 4 ovarian cancer patients, a WBC sample and two serum samples were collected. We collected a tumor tissue sample from one ovarian cancer patient (OV4). The cfDNA was extracted from their plasma samples using the QIAamp circulating nucleic acid kit from QIAGEN (Germantown, MD). For serum cfDNA, AMPure XP bead size selection was further performed to eliminate gDNA contamination. In brief, 0.5 volume of beads was first added to the cfDNA samples. After incubation, the supernatant was transferred to a new tube and an additional 2.0 volume of beads was added. After an 80% ethanol wash, cfDNA was eluted from the beads. FA assays (Agilent Technologies) were performed to rule out the contamination of gDNA in the size-selected samples. The cfDNA WES library of all patients and the genomic DNA WES library of the 4 ovarian cancer patients were constructed with the SureSelect XT HS kit from Agilent Technologies (Santa Clara, CA) according to the manufacturer’s protocol. In brief, 10 ng of cfDNA was used as input material. After end repair/dA-tailing of cfDNA, the adaptor was ligated. The ligation product was purified with AMPure XP beads (Beckman-Coulter, Atlanta, GA) and the adaptor-ligated library was amplified with index primer in 10-cycle PCR. The amplified library was purified again with AMPure XP beads, and the amount of amplified DNA was measured using the Qubit 1X dsDNA HS assay kit (ThermoFisher, Waltham, MA). 700-1000 ng of DNA sample was hybridized to the capture library and pulled down by streptavidin-coated beads. After washing the beads, the DNA library captured on the beads was re-amplified with 10-cycle PCR. The final libraries were purified by AMPure XP beads. 
The library concentration was measured by Qubit, and the quality was further examined with Agilent Bioanalyzer before the final step of 2x150bp paired-end sequencing at an average coverage of 200. Whole-exome capture libraries of genomic DNA from the 8 NSCLC patients were constructed via Roche SeqCap EZ Exome V6 (Roche). Enriched exome libraries were sequenced on the Illumina HiSeq 3000 platform (Illumina) to generate 2x100bp paired-end reads at an average coverage of 200. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 3000 
      
    
   
  53 
 
  
    EGAD00001008455 
   
  
    
    54 samples consisting of COAD, ESCC, GA and OSCC 
    
   
  
    
   
  107 
 
  
    EGAD00001008456 
   
  
    
    Computationally reconstructed B-cell receptor sequences (using BraCeR) from scRNA-seq data for all cells passing quality control. 
    
   
  
    
   
  1 
 
  
    EGAD00001008457 
   
  
    
    Single cell multiomics from 2 donor controls, expression and chromatin accessibility. Samples belong to gray matter tissue from the brain. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001008458 
   
  
    
    We used single-cell transcriptomics to study cells from the developing human cerebellum, and show that different molecular subgroups of medulloblastoma resemble distinct glutamatergic progenitors. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  391 
 
  
    EGAD00001008460 
   
  
    
    Circulating tumor DNA (ctDNA) in blood plasma is an emerging tool for clinical cancer genotyping and longitudinal disease monitoring. We performed deep whole-genome sequencing of serial plasma and synchronous metastases in patients with aggressive prostate cancer. We comprehensively assess all classes of genomic alterations and demonstrate that ctDNA harbors multiple dominant populations whose evolutionary histories frequently indicate whole-genome doubling and shifts in mutational processes. Although tissue and ctDNA showed concordant clonally-expanded cancer driver alterations, each individual metastasis contributed only a minor share of total ctDNA. By comparing serial ctDNA before and after clinical progression on potent androgen receptor (AR) pathway inhibitors, we reveal population restructuring converging solely on AR augmentation as the dominant genomic driver of acquired treatment-resistance. Finally we leverage nucleosome footprints in ctDNA to infer mRNA abundance in synchronously biopsied metastases, including treatment-induced changes in AR pathway transcriptional activity. 
    
   
  
    
      
      unspecified 
      
    
   
  117 
 
  
    EGAD00001008462 
   
  
    
    This dataset consists of genome-wide 5hmC methylomes at various stages of prostate cancer, including not only 93 metastases from castration-resistant prostate cancer (mCRPC) patients, but also 5hmC patterns in cell-free DNA (cfDNA). There are 2000 runs in total as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  596 
 
  
    EGAD00001008463 
   
  
    
    Exome (*_{N,T}{1,2})
RNAseq (polyA - *_PolyA, and RiboZero - *_RibZ) 
Methylation (SeqCapEpi - MAPD*). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008464 
   
  
    
    We recruited 98 hospitalised patients displaying severe COVID-19 symptoms from the first wave of infection. Stringent exclusion criteria based on non-genetic factors such as age, blood oxygen, radiologic findings and other typical COVID-19 signs were applied. 
Gingival or peripheral blood samples were taken for 98 individuals and whole exome sequencing performed using ExomeCapture-Seq capture KAPA HyperExome on Illumina machines. 
    
   
  
    
      
      unspecified 
      
    
   
  100 
 
  
    EGAD00001008465 
   
  
    
    Raw fastq files from targeted sequencing of 112 genes in 1,298 endometrial glands and matched blood samples. The paired-end sequencing data sets (R1 and R2) are deposited. The targeted genes are: ABCC1, ACRC, ANK3, ARHGAP35, ARID1A, ARID5B, ATCAY, ATM, ATR, BARD1, BCOR, BRCA1, BRCA2, BRD4, BRIP1, CAMTA1, CDC23, CDYL, CFAP54, CHD4, CHEK1, CHEK2, CTCF, CTNNB1, CUX1, DGKA, DISP2, DYNC2H1, EMSY, FAAP24, FAM135B, FAM175A, FAM65C, FANCA, FANCB, FANCC, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCL, FANCM, FAT1, FAT3, FBN2, FBXW7, FGFR2, FRG1, GPR50, HEATR1, HIST1H4B, HNRNPCL1, HOOK3, KIAA1109, KIF26A, KMT2B, KMT2C, KRAS, LAMA2, LRP1B, MLH1, MON2, MRE11A, MSH2, MSH6, MTOR, NBN, PALB2, PHEX, PIK3CA, PIK3R1, PLXNB2, PLXND1, PMS2, POLE, POLR3B, PPP2R1A, PTEN, PTPN13, RAD50, RAD51, RAD51B, RAD51C, RAD51D, RAD52, RAD54B, RAD54L, RICTOR, SACS, SIGLEC9, SLC19A1, SLX4, SPEG, STT3A, TAF1, TAF2, TAS2R31, TFAP2C, TNC, TONSL, TP53, TTC6, UBA7, VNN1, WT1, XIRP2, ZBED6, ZC3H13, ZFHX3, ZFHX4, ZMYM4. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1334 
 
  
    EGAD00001008466 
   
  
    
   
  
    
   
  - 
 
  
    EGAD00001008467 
   
  
    
   
  
    
   
  - 
 
  
    EGAD00001008468 
   
  
    
    The dataset includes 6 FASTQ files with single cell transcriptome sequencing data of normal breast myoepithelial cells from ducts and TDLUs derived from reduction mammoplasties from three patients. The Chromium Single Cell 3’ Reagent Kit v2 or v3 (10x Genomics) was used for processing of cells, after which sequencing was performed using the Illumina® NextSeq500/550 High Output Kit v2. Cell Ranger was used for generating FASTQ files, and files from different lanes were concatenated prior to uploading the data to EGA. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001008469 
   
  
    
    SDH deficient renal cell carcinomas are a rare and recently defined subtype of kidney cancer, often associated with an inherited mutation in one of the SDH gene subunits. This dataset sought to understand the genomic events that underpin tumour formation, from putative cell of origin, characterisation of the tumour microenvironment, to the genomic evolution of these rare tumours. We performed whole genome and RNA sequencing of 4 patients with SDH deficient renal cell carcinomas, including one patient who had an additional paraganglioma. An additional patient in this cohort had the initial diagnosis revised to a clear cell renal cell carcinoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001008470 
   
  
    
    SDH deficient renal cell carcinomas are a rare and recently defined subtype of kidney cancer, often associated with an inherited mutation in one of the SDH gene subunits. This dataset sought to understand the genomic events that underpin tumour formation, from putative cell of origin, characterisation of the tumour microenvironment, to the genomic evolution of these rare tumours. We performed whole genome and RNA sequencing of 4 patients with SDH deficient renal cell carcinomas, including one patient who had an additional paraganglioma. An additional patient in this cohort had the initial diagnosis revised to a clear cell renal cell carcinoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001008473 
   
  
    
    This dataset includes RNAseq data of the fetal ISCs and of iPSCs derived from the fetal ISCs to confirm successful reprogramming. 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001008474 
   
  
    
    This data set includes deep targeted re-sequencing of fetal bulk tissues of the 4 fetuses (T21=2, D21=2). The tissues include fetal skin and intestinal organoid cultures (passage 0) of all 4 fetuses, and spleen of fetus N01 (T21). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001008475 
   
  
    
    This data set includes WGS data of the in vivo acquired mutations in fetal ISCs and HSPCs of 4 fetuses (T21=2, D21=2). In addition, this data set includes sub-clonal fetal ISCs to determine the culture-associated mutations of fetal ISCs. It also includes clone and subclone WGS data of iPSCs derived from the fetal ISCs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  47 
 
  
    EGAD00001008476 
   
  
    
    Data about copy number aberrations was obtained from primary CRC (n=90). DNA was collected from second primary CRC (in HL survivors, n = 39), and primary SBA (n=14). For second primary SBA (in HL and TC survivors), DNA was isolated for molecular analyses (n=7). Copy number aberrations were evaluated after low-coverage whole genome sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  60 
 
  
    EGAD00001008477 
   
  
    
    Whole genome sequencing of 209 pediatric probands with primary cardiomyopathy and their family members. All samples were sequenced using an Illumina short-read platform. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  114 
 
  
    EGAD00001008478 
   
  
    
    We analyzed the T-cell receptor (TCR) repertoires from ten kidney transplant recipients. Five of the ten kidney transplant recipients received ATLG while the other five received basiliximab as induction therapy. TCR repertoires of CD4+ and CD8+ T-cells were assessed prior to transplantation and within the first month after transplantation as well as at three and 12 months post-transplant. In addition, the pre-formed alloreactive TCR repertoire for each kidney transplant recipient was identified using mixed lymphocyte reaction, and donor-reactive T-cells were subjected to TCR beta sequencing. This dataset comprises a total of 106 samples. NGS TCR beta libraries of all samples were sequenced on an Illumina NextSeq 500, and raw sequencing data (in the form of fastq files) as well as assembled clonotypes and their counts (in the form of clonotype tables) are provided. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  84 
 
  
    EGAD00001008479 
   
  
    
    Organoid cultures derived from normal colon and/or colorectal adenomas and/or colorectal carcinomas. RNA and DNA were isolated from these cultures for genome-wide profiling. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
    
   
  37 
 
  
    EGAD00001008480 
   
  
    
    Organoid cultures derived from colorectal adenomas were transduced with a miR-17-92 expressing vector. RNA from miR17-92-overexpressing organoids and respective non-transduced organoids (controls) was isolated for expression analysis. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001008481 
   
  
    
    scRNA-seq data of B-lineage cells from the cerebrospinal fluid of 21 patients with multiple sclerosis. The data was generated with the Smart-seq2 protocol and sequenced on Illumina NextSeq500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001008482 
   
  
    
    Multi-region RNAseq from 4 NSCLC patients, totaling 12 tumor samples and 7 matched control samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001008484 
   
  
    
    Whole transcriptome RNA-sequencing of purified bone marrow blasts of 136 de novo, treatment naive AML patients. For further details, we refer to the manuscript "The Proteogenomic Landscape of AML" by Jayavelu, Wolf, Buettner et al. 
mRNA extraction and whole transcriptome sequencing
For transcriptome analysis, the TruSeq Total Stranded RNA kit was used, starting with 250 ng of total RNA, to generate RNA libraries following the manufacturer’s recommendations (Illumina, San Diego, CA, USA). 100 bp paired-end reads were sequenced on the NovaSeq 6000 (Illumina) with a median of 57 million reads per sample.
RNA Data Analysis
Data quality control was performed with FastQC v0.11.9. Reads were aligned to the human reference genome (Ensembl GRCh38 release 82) using STAR v2.6.1. Gene count tables were generated while mapping, using Gencode v31 annotations. All downstream analyses were carried out using R v4.0 and BioConductor v3.12 (Huber et al., 2015; R Core Team, 2020). Size-factor based normalization was performed using DESeq2 v1.28.1 (Love et al., 2014). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  177 
 
  
    EGAD00001008485 
   
  
    
    Sequencing data from targeted myeloid DNA panel sequencing at the MLL Dx, Munich lab. 
Targeted sequencing was performed using the Nextera DNA Flex library preparation kit, starting with 100 ng of genomic DNA (Illumina, San Diego, CA, USA). The target regions were enriched by a custom xGen Lockdown panel using a hybridization capture workflow (IDT Integrated DNA Technologies, Coralville, IA, USA). All libraries were sequenced with 100 bp paired-end reads on a NovaSeq6000 (Illumina) with a mean coverage of 3206x. Somatic variant calling was performed with Pisces and a sensitivity cut-off of 2%. Large deletions and medium-sized insertions, such as those found in CALR and FLT3, were called with Pindel. Variant annotation considered the publicly available databases COSMIC (v91), ClinVar (2020-03), gnomAD (non-cancer, v2.1.1), dbNSFP (v3.5) and UMD TP53 (2017_R2). Variants that are described as somatic, protein truncating or affecting splice sites were considered as mutations, while variants with no or discrepant database information were considered as variants of uncertain significance. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001008487 
   
  
    
    224 pairs of FASTQ files from metastatic Castration-Resistant Prostate Cancer (mCRPC) sequenced on HiSeq 4000 instruments. Patients were enrolled in the West Coast Dream Team study. Biopsies include various tissue sites including bone, soft tissue, and lymph node. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  224 
 
  
    EGAD00001008488 
   
  
    
    To infer the proteomic Mito signature in the LSC subcompartment, myeloid blasts from 10 patients from the discovery cohort were FACS-sorted into 
CD34-GPR56+NKG2DLigands- (CD34-), alias 61dc5fb798e2520001702c03
CD34+GPR56+NKG2DLigands- (CD34+), alias 61dc5fb798e2520001702c03
Detailed gating strategy will be described in Donato, Correia, Andresen and Trumpp et al., (manuscript in preparation) 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001008489 
   
  
    
    Whole exome and RNASeq raw sequencing data for a cohort of 7 male patients with oesophageal adenocarcinoma. Median age at diagnosis was 68. Tumour tissue and PBMCs were used for whole exome sequencing and RNA sequencing.
This data was generated as part of a study funded by a Cancer Research UK Centres Network Accelerator Award Grant (A21998). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001008491 
   
  
    
    This dataset includes linked-read whole-genome sequencing data from normal ileal tissue of the patient. The normal sample was sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001008492 
   
  
    
    This dataset includes linked-read whole-genome sequencing data (subfolder HF3FKCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  176 
 
  
    EGAD00001008493 
   
  
    
    This dataset includes linked-read whole-genome sequencing data (subfolder HF3J5CCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  176 
 
  
    EGAD00001008494 
   
  
    
    This dataset includes linked-read whole-genome sequencing data (subfolder HF3NYCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  176 
 
  
    EGAD00001008495 
   
  
    
    This dataset includes linked-read whole-genome sequencing data (subfolder HFFWLCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  176 
 
  
    EGAD00001008496 
   
  
    
    This dataset includes linked-read whole-genome sequencing data (subfolder HFG3FCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  176 
 
  
    EGAD00001008497 
   
  
    
    CRAM files and VCF for DDD_1 and their parents. Also de novo mutations file for hypermutated DDD_1 child as described in the manuscript ‘Genetic and chemotherapeutic influences of germline hypermutation’ by Kaplanis et al. which will be published in Nature shortly. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001008498 
   
  
    
    Exome sequencing data from transformed Follicular Lymphoma samples that express a PMBL-like gene expression signature 
    
   
  
    
   
  1 
 
  
    EGAD00001008499 
   
  
    
    The dataset includes 12 paired FASTQ files (6 samples) with single cell transcriptome sequencing data of normal breast luminal cells from ducts and TDLUs derived from reduction mammoplasties from three patients. The Chromium Single Cell 3’ Reagent Kit v2 or v3 (10x Genomics) was used for processing of cells, after which sequencing was performed using the Illumina® NextSeq500/550 High Output Kit v2. Cell Ranger was used for generating FASTQ files, and files from different lanes were concatenated prior to uploading the data to EGA. "R1" files include the feature barcodes and UMIs, while "R2" files include the reads. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001008501 
   
  
    
    Targeted myeloid DNA panel sequencing from purified bone marrow blasts of 104 treatment-naive AML patients from the discovery cohort. For more details, we refer to Jayavelu, Wolf, Buettner et al. 
Libraries were prepared from 40 ng DNA using the QIASeq Human Myeloid Neoplasms Panel (Qiagen) according to the manufacturer’s protocol. Samples were tagged with the QIAseq 96-Unique Dual Index Set A for Illumina platforms (Qiagen) to yield unique combinations of i5 and i7 barcodes for each sample. Sample fragment size distribution and concentration was estimated using the Agilent High Sensitivity DNA kit on a 2100 Bioanalyzer (Agilent). Samples were pooled in an equimolar fashion, denatured, and diluted to 1.5 pM according to Illumina’s recommendations. The diluted library was sequenced on a NextSeq 500 benchtop sequencer (Illumina) using NextSeq High Output cartridges. Demultiplexing was performed using the BaseSpace cloud platform (Illumina). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  19 
 
  
    EGAD00001008504 
   
  
    
    Paired tumor-normal exome data from 47 microsatellite-stable, early-onset sporadic rectal cancers from the Indian population 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  94 
 
  
    EGAD00001008505 
   
  
    
    Dataset consists of fastq files from bulk RNA-seq done on peripheral blood acquired from two well characterised hospitalised cohorts, a cohort of patients infected with influenza and a cohort of patients infected with SARS-CoV-2 during the first wave of the pandemic and prior to availability of COVID-19 treatments and vaccines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  160 
 
  
    EGAD00001008506 
   
  
    
    Profiling of co-mutations was done by targeted resequencing using the TruSight Myeloid assay (Illumina, Chesterford, UK) covering 54 genes recurrently mutated in AML: BCOR, BCORL1, CDKN2A, CEBPA, CUX1, DNMT3A, ETV6, EZH2, IKZF1, KDM6A, PHF6, RAD21, RUNX1, STAG2, ZRSR2, ABL1, ASXL1, ATRX, BRAF, CALR, CBL, CBLB, CBLC, CDKN2A, CSF3R, FBXW7, FLT3, GATA1, GATA2, GNAS, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MLL, MPL, MYD88, NOTCH1, NPM1, NRAS, PDGFRA, PTEN, PTPN11, SETBP1, SF3B1, SMC1A, SMC3, SRSF2, TET2, TP53, U2AF1 and WT1. For each reaction, 50 ng of genomic DNA was used. Library preparation was done as recommended by the manufacturer (TruSight Myeloid Sequencing Panel Reference Guide 15054779 v02, Illumina). Samples were sequenced paired-end (150 bp PE) on NextSeq- (Illumina) or (300 bp PE) MiSeq-NGS platforms, with a median coverage of 3076 reads (range 824–30565). Sequence data alignment of demultiplexed FastQ files, variant calling and filtering was done using the Sequence Pilot software package (JSI medical systems GmbH, Ettenheim, Germany) with default settings and a 5% variant allele frequency (VAF) mutation calling cut-off. Human genome build HG19 was used as reference genome for mapping algorithms. 
    
   
  
    
   
  - 
 
  
    EGAD00001008507 
   
  
    
    RPPA analysis from FAIRLANE Trial of Neoadjuvant Ipatasertib Plus Paclitaxel for Triple-Negative Breast Cancer. 
    
   
  
    
   
  1 
 
  
    EGAD00001008508 
   
  
    
    Whole transcriptome and 850k methylome profiling of human intraoperative or snap frozen and FFPE MBM. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001008510 
   
  
    
    Mice with medulloblastoma (Group 3) were treated with sham (isoflurane and imaging X-ray) or CSI as described by Abbas et al (2022). Total RNA was isolated with RNeasy Plus Mini Kit (Qiagen), library preparation (SureSelect, Agilent), rRNA depletion (Ribo-Zero Plus, Illumina) and sequencing were carried out by GenomicsWA or Australian Genome Research Facility. Libraries were sequenced on NovaSeq 6000 S1 flow cells as paired-end 150bp reads (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001008511 
   
  
    
    Mice with medulloblastoma (Group 3) were treated with saline, cyclophosphamide, or gemcitabine as described by Abbas et al. (2020). Total RNA was isolated with the RNeasy Plus Mini Kit (Qiagen); library preparation (SureSelect, Agilent), rRNA depletion (Ribo-Zero Plus, Illumina) and sequencing were carried out by GenomicsWA or the Australian Genome Research Facility. Libraries were sequenced on NovaSeq 6000 S1 flow cells as paired-end 150 bp reads (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001008512 
   
  
    
    RNA-seq libraries were prepared using the KAPA Stranded RNA-Seq Kit with RiboErase (Kapa Biosystems, Wilmington, MA) and sequenced to a target depth of 200-M reads on the Illumina HiSeq platform (Illumina, San Diego, CA). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  162 
 
  
    EGAD00001008514 
   
  
    
    We performed single cell RNA sequencing (scRNA-seq) from bone marrow on 11 pediatric (0-14 years old) and adolescent and young adult (AYA) (15-39 years old) de novo AML samples (Dx) (4 inv(16), 3 t(8;21) and 4 rMLL). In addition, for some patients a relapse sample was also sequenced (2 inv(16), 2 t(8;21) and 3 rMLL). Cells were sorted into CD34+/CD38- and CD34-/CD38+ and sequenced separately. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001008515 
   
  
    
    This deposit consists of DNA and RNA sequencing data from 32 EPS patients. 28 samples had tumor DNA sequencing data. 2 had matched normal sequencing data. 27 samples had tumor RNA sequencing data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  32 
 
  
    EGAD00001008516 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-BM8772 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008517 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-CM3220 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001008518 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-DM9089 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001008519 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GE7528 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008520 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-GI2070 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008521 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-JR9883 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008522 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-LC6372 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008523 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-ML9537 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008524 
   
  
    
    WGS data from a GBM patient PT-MS8478 
    
   
  
    
   
  - 
 
  
    EGAD00001008525 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-PR5617 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001008526 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-PV2594 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00001008527 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-RV2286 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008528 
   
  
    
    WGS data from a GBM patient PT-SB3465 
    
   
  
    
   
  - 
 
  
    EGAD00001008529 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-SJ5453 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001008530 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-SS3647 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001008531 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-WR7927 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008532 
   
  
    
    WGS and RNA-Seq data from a GBM patient PT-WT4796 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008533 
   
  
    
    This dataset comprises 2 pairs of experiments:
- RNA-seq Control versus Formate treated colorectal cancer T18 cells
- Humix device, RNA-seq of control versus co-culture colorectal cancer T18 cells with Fusobacterium nucleatum 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001008534 
   
  
    
    Set of 2 bam files from patients affected with Lupus. Fastq alignments for exonic variants present in TLR7 gene. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001008535 
   
  
    
    WGS data relative to 36 triple negative breast cancer PDX models. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD00001008537 
   
  
    
    RNA-seq dataset of high-grade serous ovarian cancer (HGSC) tumours from long-term survivors performed as part of the Multidisciplinary Ovarian Cancer Outcomes Group (MOCOG) study. 
The dataset includes fastq files from 56 HGSC tumours (53 primary tumours and 3 recurrent tumours) from 53 long-term survivor patients. 
Libraries were generated using the Illumina Stranded mRNA Prep and 150 bp paired-end sequencing was performed to a minimum of 100 million reads on Illumina NovaSeq 6000 instruments. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  56 
 
  
    EGAD00001008538 
   
  
    
    We monitored patients' anti-SARS-CoV-2 immune responses using an in vitro cross-presentation assay. The goal of this study was to identify immune correlates of clinical protection against SARS-CoV-2 infection. Briefly, peripheral blood mononuclear cells from each patient were divided into monocyte and lymphocyte fractions. Monocytes were differentiated into monocyte-derived dendritic cells (mo-DC) using GM-CSF and interferon alpha. Mo-DC were then loaded with SARS-CoV-2 culture lysates or VeroE86 lysates. SARS-CoV-2-loaded mo-DC were then used to stimulate their autologous lymphocytes, and T cell cytokine secretion was monitored in the supernatant. We discriminated patients producing IL-2 from patients producing IL-5. RNA sequencing was performed for 18 patients to identify gene profiles associated with IL-2 or IL-5 production. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD00001008541 
   
  
    
    In other analyses in the current manuscript, we find that a similar gene signature (to dissociation-based artifacts in mouse and human tissue) is present in post-mortem microglia and astrocytes across all snRNA-seq datasets analyzed, although it is highly variable between subjects.  
Using acutely resected neurosurgical tissue, we performed single-nucleus RNA-seq and reveal that a similar signature can be detected in microglia following prolonged exposure to room temperature. Tissue handling and methods details, as well as sequencing and analysis details, can be found in the methods section of the related manuscript (Marsh et al., 2022).
Together, these results suggest that the presence of this signature in post-mortem brain samples may be the result of a combination of acute pre-mortem (agonal state, cause of death, comorbidities, etc.) and post-mortem (post-mortem interval (PMI), storage time, RNA quality, etc.) variables and may not represent a normally present cell state. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001008542 
   
  
    
    RNAseq data relative to 41 triple negative breast cancer patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  41 
 
  
    EGAD00001008543 
   
  
    
    DNA-seq libraries were captured to exome regions using xGen Exome Research Panel v1.0 (IDT), and libraries were prepared using the KAPA Hyper prep kit. DNA libraries were sequenced to a target depth of ×200 for tumor samples and ×100 for normal samples on the Illumina HiSeq platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  256 
 
  
    EGAD00001008544 
   
  
    
    Intrahepatic cholangiocarcinomas (iCCs) are characterized by their rarity, difficulty in diagnosis, and overall poor prognosis. We performed comprehensive transcriptomic characterization of treatment-naive iCC. Whole transcriptome analyses identified two prognostic subtypes, concordant with previous reports. The findings could assist in patient stratification with iCCs and in developing rational therapeutic strategies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  91 
 
  
    EGAD00001008545 
   
  
    
    The compressed file contains PLINK-format files for the Illumina MEGA SNP array data of 255 individuals generated and analyzed in the Liu et al. study of genome-wide variation of the Massim region. 
    
   
  
    
   
  1 
 
  
    EGAD00001008546 
   
  
    
    This is a prospective study with 100 participants. The enzymatic digestion profiles after conventional PCR allowed the identification of different haplotypes of  hemoglobin in Abidjan. 
    
   
  
    
   
  - 
 
  
    EGAD00001008547 
   
  
    
    RNAseq data relative to 56 primary and treatment-naive ovarian carcinomas, from independent donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  56 
 
  
    EGAD00001008548 
   
  
    
    Relevant clinical data for POPLAR including treatment arm, histology, overall survival, progression-free survival, TLS-LA calls, and best confirmed overall response. 
    
   
  
    
   
  891 
 
  
    EGAD00001008549 
   
  
    
    Relevant clinical data for OAK including treatment arm, histology, overall survival, progression-free survival, and best confirmed overall response. 
    
   
  
    
   
  - 
 
  
    EGAD00001008550 
   
  
    
    Additional relevant biomarker data for OAK including PD-L1 tumor cell IHC by the 22C3 assay, tumor mutational burden status, and STK11, KEAP1, tissue type, and EGFR mutation status. 
    
   
  
    
   
  699 
 
  
    EGAD00001008551 
   
  
    
    Clinical data from IMblaze370: Clinical data include disease, treatment arm, MSI status, KRAS oncogenic mutation status, sex, and overall survival (1=dead, 0=alive) 
    
   
  
    
   
  - 
 
  
    EGAD00001008552 
   
  
    
    RNA-seq count matrix for 296 bulk pre-treatment tumors from IMblaze370 
    
   
  
    
   
  - 
 
  
    EGAD00001008553 
   
  
    
    RNA-seq FASTQ files from 296 bulk pre-treatment tumors from IMblaze370 
    
   
  
    
      
      unspecified 
      
    
   
  296 
 
  
    EGAD00001008554 
   
  
    
    WGS and WES data for manuscript titled: ctDNA as a biomarker of progression in oesophageal adenocarcinoma 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD00001008555 
   
  
    
    Raw sequencing reads were processed as single end sequencing, aligned to the human reference genome GRCh38 and processed using CellRanger 3.1. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  89 
 
  
    EGAD00001008556 
   
  
    
    Whole exome sequencing data of 224 Chinese Clear Cell Renal Cell Carcinoma patients. 
    
   
  
    
   
  1 
 
  
    EGAD00001008557 
   
  
    
    Intrahepatic cholangiocarcinomas (iCCs) are characterized by their rarity, difficulty in diagnosis, and overall poor prognosis. We performed comprehensive genomic characterization of treatment-naive iCC. This study reports a large-scale genomic analysis of iCC. The findings could assist in patient stratification with iCCs and in developing rational therapeutic strategies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001008558 
   
  
    
    For the cohort of 59 samples, we performed TruSeq DNA PCR-Free whole-genome sequencing library preparation according to the manufacturer’s instructions (Illumina, ILMN, San Diego, CA) on the automated NGS Star liquid handling platform (Hamilton, Bonaduz, Switzerland) followed by 2x150 bp paired-end sequencing on the HiSeqX or NovaSeq6000 (ILMN). An average coverage of >100x was achieved. 
For whole transcriptome analysis, the TruSeq Total Stranded RNA kit was used, starting with 250 ng of total RNA, to generate RNA libraries following the manufacturer’s recommendations (ILMN). 2x100 bp paired-end reads were sequenced on the NovaSeq 6000 with a median of 50 million reads per sample (ILMN). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  59 
 
  
    EGAD00001008559 
   
  
    
    Libraries were prepared from RNA-extracted cell lines using Illumina RNA library prep kit. Samples were sequenced on Illumina HiSeq 4000 or HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  103 
 
  
    EGAD00001008560 
   
  
    
    The samples AD_Library_1, AD_Library_2 and Control_Library were run on a Chromium Chip B with the Chromium Single Cell 3′ Library & Gel Bead Kit v3 (10x Genomics, CA, USA). The 3′ gene expression libraries were sequenced at an approximate depth of 50,000 reads per cell using NovaSeq 6000 S1 (Illumina, San Diego, CA, USA) flow cells. Cell Ranger v.3.0.2 was used to analyze the raw base call files. FASTQ files and raw gene-barcode matrices were generated and aligned to human genome GRCh37 (hg19). The samples were integrated in R v.4.0.3, and the generated Seurat objects, two related to AD samples and one to control samples, were analyzed using the Seurat package v.4.0.3 to perform downstream analysis, clustering of the cells and differential expression. 
    
   
  
    
   
  1 
 
  
    EGAD00001008562 
   
  
    
    ChIP-seq for AR, FOXA1 and H3K27ac in primary prostate tumors before and after 3 months of neoadjuvant enzalutamide treatment.
RNA-seq expression data of primary prostate tumors before and after 3 months of neoadjuvant enzalutamide treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  245 
 
  
    EGAD00001008564 
   
  
    
    Targeted DNA sequencing data of paired primary and relapse tumor material taken from a pediatric patient with neuroblastoma. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  10 
 
  
    EGAD00001008566 
   
  
    
    Whole Genome sequencing of colorectal cancer patients (SG-BULK-1) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  69 
 
  
    EGAD00001008567 
   
  
    
    We have assessed the molecular profile of a cohort of 70 patients with MDS by next-generation sequencing (NGS) using cfDNA and compared the results to paired bone marrow (BM) DNA. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  140 
 
  
    EGAD00001008568 
   
  
    
    Whole-exome sequencing data (Agilent SureSelectXT Human All Exon V7). Retrospective study of matched pairs of initial and post-therapeutic GBM cases treated with temozolomide+radiotherapy with a recurrence period greater than one year. Matched normal, initial and post-therapeutic samples for 27 patients and 1 patient (GBM046) with a matched normal and two post-therapeutic samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  84 
 
  
    EGAD00001008569 
   
  
    
    Whole genome sequencing data (bam) of tetralogy of Fallot study, including data derived from iPSCs of two controls and four patients with tetralogy of Fallot (two with DiGeorge syndrome (DG), two without DiGeorge syndrome (ND)). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001008571 
   
  
    
    RNA-seq was performed from 3 separate GINS3 patient fibroblast cultures and 1 replicate of fibroblasts derived from each of the two parents. RNA-seq libraries were generated with NEBNext Ultra II Directional RNA library prep for Illumina with NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs) and sequenced on Illumina NextSeq500 with paired-end 150 bp read length. SIRV Set 3 (Lexogen) spike-ins were added. Two fastq files are provided for each RNAseq sample. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001008572 
   
  
    
    The PYDP dataset includes 26 bam files of Y chromosome sequences for Papua New Guinean individuals from different locations, extracted from whole genome sequences. DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  24 
 
  
    EGAD00001008573 
   
  
    
    The IYDP dataset includes BAM files of 126 Y chromosomes extracted from whole genome sequences. These are from individuals from a broad range of Indonesian islands - communities close to mainland Asia through to New Guinea. The original whole genome sequencing libraries were prepared using TruSeq DNA PCR-Free and TruSeq Nano DNA HT kits depending on DNA quantity. 150 bp paired-end sequencing was performed on the Illumina HiSeq X sequencer. Individuals were sequenced to expected mean depth of 30x, with an achieved median depth of raw reads across samples of 43x. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  126 
 
  
    EGAD00001008574 
   
  
    
    Whole Genome sequencing of colorectal cancer patients (SG-BULK-2) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  66 
 
  
    EGAD00001008575 
   
  
    
    Common variable immunodeficiency (CVID) is the most prevalent primary immunodeficiency. Here the authors perform single cell omics analyses in CVID discordant monozygotic twins and show epigenetic and transcriptional alterations associated with activation in memory B cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  2838 
 
  
    EGAD00001008576 
   
  
    
    Contains 14 control samples and 26 case samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  26 
 
  
    EGAD00001008577 
   
  
    
    Joint called VCF for whole genome sequence data from 410 samples described in the paper: PMID:33116287.  It includes 314 high coverage (average 30X) samples sequenced on the Illumina X-Ten, also available as individual datasets under the H3Africa Chip study (EGAS00001002976) and 112 medium coverage (average 10X) samples from the TrypanGen study (EGAS00001002602) sequenced on Illumina HiSeq 2500. Supplementary table 3 of the paper describes the geographic breakdown of the samples. 16 samples from the Southern African Human Genome project have been removed from this VCF. 
    
   
  
    
   
  410 
 
  
    EGAD00001008578 
   
  
    
    Dataset includes fastq files for RNA-Seq experiments for tumor samples of PPGL patients. Single-end read fastq files are available for 102 different samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  102 
 
  
    EGAD00001008579 
   
  
    
    Dataset includes fastq files for WES experiments for tumor samples of PPGL patients. Paired-end read fastq files are available for 87 different samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  87 
 
  
    EGAD00001008581 
   
  
    
    WGS data relative to 63 primary and treatment-naive ovarian carcinomas, from independent donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  63 
 
  
    EGAD00001008583 
   
  
    
    Sanger sequencing and RT-qPCR data for validation used in Primary lymphomas of the central nervous system (PCNSL). 
    
   
  
    
   
  1 
 
  
    EGAD00001008584 
   
  
    
    All libraries were sequenced on Illumina HiSeq4000 until sufficient saturation was reached. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  9 
 
  
    EGAD00001008585 
   
  
    
    All libraries were sequenced on Illumina NextSeq or NovaSeq6000 until sufficient saturation was reached. 
    
   
  
    
      
      unspecified 
      
    
   
  26 
 
  
    EGAD00001008586 
   
  
    
    BAM files from an RNAseq study of regions of in situ and invasive human mammary ductal disease 
    
   
  
    
   
  1 
 
  
    EGAD00001008587 
   
  
    
    This dataset contains 46 samples for the early-stage ovarian high-grade serous carcinoma project: amplicon sequencing on 37 tumour samples from early-stage ovarian high-grade serous carcinoma as well as 5 adjacent normal tissue samples and 4 whole blood samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  46 
 
  
    EGAD00001008588 
   
  
    
    This shallow whole genome sequencing dataset contains 44 samples. All the samples are early-stage ovarian high-grade serous carcinoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  44 
 
  
    EGAD00001008589 
   
  
    
    This study compared different assays for the detection of circulating tumour DNA (ctDNA) in serial plasma from stage IA-IV breast cancer patients, targeting structural variants (SVs), single nucleotide variants (SNVs) and/or somatic copy-number aberrations (SCNAs). SV-multiplex PCR, SNV-/SV-hybrid capture, and different depths of whole-genome sequencing (WGS) were used to evaluate ctDNA levels, demonstrating concordant results. SNV-hybrid capture targeting 1,347-7,491 mutations was the most sensitive assay, detecting 67% (36/54) of samples down to an allele fraction (AF) of 0.00024%. SV-multiplex PCR, targeting 21-47 mutations, detected 63% (34/54) of samples down to 0.00047% AF and has potential as a clinical assay. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  1284 
 
  
    EGAD00001008590 
   
  
    
    This dataset contains 10x Genomics Single Cell 3’ Solution (version 2) scRNA-seq data from peripheral blood leukocytes of a single healthy donor. Data from 20939 cells were collected over 8 lanes and 2 sequencing runs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  32 
 
  
    EGAD00001008592 
   
  
    
    Whole Genome sequencing of colorectal cancer patients (SG-BULK-3) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  64 
 
  
    EGAD00001008593 
   
  
    
    Whole genome sequencing from two resectable patients with pancreatic cancer for both normal and tumour tissue samples; whole exome sequencing from the two resectable patients and five unresectable patients of peripheral blood mononuclear cells and blood plasma (1-5 time points per patient), and whole exome sequencing of plasma samples from three chronic pancreatitis patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  34 
 
  
    EGAD00001008594 
   
  
    
    Consists of 17 cases: 7 control cases and 10 cancer cases. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001008595 
   
  
    
    Fastq files for the single cell RNAseq data of Follicular lymphoma study. This dataset includes the paired single cell RNA sequencing data for 23 samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  23 
 
  
    EGAD00001008598 
   
  
    
    This dataset contains the FASTQ files for a portion of the samples in Tang F. et al. “Chromatin accessibility profiles of castration-resistant prostate cancers reveal novel subtypes and therapeutic vulnerabilities” published in Science.
It contains 51 samples sequenced with Illumina HiSeq 2500 or HiSeq 4000. The remaining samples can be found at dbGaP: phs000909.v1.p1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  51 
 
  
    EGAD00001008600 
   
  
    
    WGS files for Genomic Landscape ALL paper titled "The genomic landscape of pediatric acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  278 
 
  
    EGAD00001008601 
   
  
    
    Non-small cell lung cancer (NSCLC) is the leading cause of cancer deaths worldwide. Only a fraction of NSCLC harbour actionable driver mutations and there is an urgent need for patient-derived model systems that will enable the development of new targeted therapies. We generated  NSCLC patient-derived xenografts (PDXs) that recapitulate the histology and molecular features of primary NSCLC. Here, we completed whole exome sequencing on 122 NSCLC PDXs. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  122 
 
  
    EGAD00001008608 
   
  
    
    The Genomics of MPNST (GeM) Consortium dataset includes de-identified whole genome sequencing data (.bam) for germline samples (DNA primarily derived from blood) sequenced at standard (30x) coverage (n=88) and for tumor samples (DNA derived from fresh frozen tissue) sequenced at 90x coverage (n=105). This dataset also includes transcriptome profiling data (.fastq) for paired normal nerve samples (n=7) and for tumor samples (n=132). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  332 
 
  
    EGAD00001008609 
   
  
    
    For scRNA-Seq, single live cells were suspended in 0.4% BSA in DPBS buffer (1000 cells/µL) and subjected to GEM generation and barcoding. Library preparation was performed according to the manufacturer's recommended procedures for the Chromium Single Cell 3’ Reagent Kit V3.1 chemistry. 10,000 cells were targeted for capture; 9 cycles were used for cDNA amplification and 12 cycles for library formation, and sequencing was performed on an Illumina NovaSeq 6000 sequencer. For bulk RNA-Seq, RNA was purified using the miRNeasy™ RNA MiniPrep (Qiagen), and RNA-seq libraries were generated either using the Illumina TruSeq RNA Library Preparation Kits and sequenced on the Illumina HiSeq 2500 sequencer as 76 bp paired-end reads, or using the NEBNext UltraDirectional RNA Library Preparation Kits after rRNA depletion using the NEBNext rRNA depletion kit and sequenced on an Illumina HiSeq 2500 sequencer using 50 cycles of single-end sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001008610 
   
  
    
    Genomic DNA of 81 cases from Japanese gastric cancer was extracted from tumor and matched normal tissues, and libraries with an insert size of 350–550 bp were prepared.
The libraries were sequenced on a HiSeq 2500 instrument (Illumina) with paired-end reads of 101 bp. The read data are stored as FASTQ formatted files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  161 
 
  
    EGAD00001008611 
   
  
    
    This deposit consists of DNA and RNA sequencing data from 67 CCS samples. 55 samples had tumor DNA sequencing data. 6 had matched normal sequencing data. 34 samples had tumor RNA sequencing data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  67 
 
  
    EGAD00001008616 
   
  
    
    Targeted sequencing was applied to an unselected population-based diffuse large B-cell lymphoma cohort  (n=928) diagnosed in the UK's Haematological Malignancy Research Network catchment population of ~4 million (14 centres).
DNA extracted from tumour samples was sequenced with a 293-gene panel using the Illumina HiSeq 2500.  All data are provided in the CRAM format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  928 
 
  
    EGAD00001008617 
   
  
    
    WGS data relative to 42 primary and treatment-naive triple negative breast cancers, from independent donors. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  42 
 
  
    EGAD00001008618 
   
  
    
    Whole genome paired sequencing of Multiple Myeloma CD138-positive bone marrow plasma cells and saliva control samples (6 triplets: tumor1, tumor2, saliva control) from 6 patients. WGS was done on HiSeq X-Ten or NovaSeq 6000 with Illumina TruSeq Nano DNA. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001008619 
   
  
    
    RNA-Seq data on multiple myeloma CD138-positive bone marrow plasma cells, 11 samples, sequenced on HiSeq 4000 and HiSeq X-Ten, using mostly the TruSeq stranded mRNA Kit. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  11 
 
  
    EGAD00001008621 
   
  
    
    The dataset consists of whole exome sequencing data (fastq format) of 100 non-syndromic autism spectrum disorder patients from India. Whole exome sequencing data were generated using the Agilent SureSelect v6 capture kit and the Illumina HiSeq sequencing platform. Paired-end fastq files are available. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  100 
 
  
    EGAD00001008624 
   
  
    
    This dataset includes 10X Genomics 3' single cell-RNA-seq profiles from 24 human rashes and 7 healthy controls. BAM files are provided for each sample. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00001008625 
   
  
    
    Whole Genome sequencing of colorectal cancer patients (SG-BULK-4) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  64 
 
  
    EGAD00001008626 
   
  
    
    PacBio HiFi sequencing was performed on 48 barcoded patients' genomic DNA after a telobait-capture protocol to enrich for telomeric regions. The sequencing reads of each patient were de-multiplexed and presented as patient-specific PacBio CCS BAM files. 
    
   
  
    
      
      Sequel 
      
    
   
  48 
 
  
    EGAD00001008627 
   
  
    
    The dataset consists of shallow whole genome sequencing from plasma DNA of 1002 individuals, comprising 1048 samples. Raw fastq files from Illumina HiSeq series are available. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1048 
 
  
    EGAD00001008628 
   
  
    
    This dataset contains counts for 699 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from OAK (GO28915). 
    
   
  
    
   
  - 
 
  
    EGAD00001008629 
   
  
    
    This dataset contains counts per million for 699 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from OAK (GO28915). 
    
   
  
    
   
  - 
 
  
    EGAD00001008630 
   
  
    
    This dataset contains counts for 192 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from POPLAR (GO28753). 
    
   
  
    
   
  - 
 
  
    EGAD00001008631 
   
  
    
    This dataset contains counts per million for 192 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from POPLAR (GO28753). 
    
   
  
    
   
  - 
 
  
    EGAD00001008632 
   
  
    
    This dataset contains 8 samples, each of which has paired-end WXS fastq files for Tumour and Normal samples, as well as an RNA-Seq fastq file. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001008633 
   
  
    
    Bone marrow or peripheral blood samples were collected from adult patients at first diagnosis of B-precursor acute lymphoblastic leukemia. RNA was isolated from mononuclear cells and subjected to mRNA library prep using poly-A selection and sequencing on a NovaSeq 6000 system. The resulting gene expression profiles and gene fusion calls were used to allocate samples to molecular disease subtypes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  560 
 
  
    EGAD00001008634 
   
  
    
    We performed whole genome sequencing (WGS) in an ASD cohort of 68 individuals from 22 families enriched for recent shared ancestry. Samples were sequenced using the Illumina HiSeq X platform, and variants (single nucleotide variants (SNVs) and insertions or deletions (indels)) were detected using GATK with HaplotypeCaller. Quality control checks for (i) duplicate samples, (ii) samples per platform, (iii) genome call rate, (iv) missingness rate, (v) singleton rate, (vi) heterozygosity rate, (vii) homozygosity rate, (viii) Ti/Tv ratio, (ix) inbreeding coefficient, and (x) sex inference were performed as previously described. Variant call format (VCF) files for SNVs and indels were annotated with ANNOVAR using allele frequencies from the 1000 Genomes project (2015; 1000G), the Genome Aggregation Database (gnomAD), and the Greater Middle East Variome Project (GME). 
    
   
  
    
   
  1 
 
  
    EGAD00001008635 
   
  
    
    17 scRNA-seq and 16 scATAC-seq datasets on bone marrows derived from 16 patients. The sequencing dataset consists of 5 samples without bone marrow infiltration, defined as controls (C1, C2, C3, C4, and C5) and 11 neuroblastoma infiltrated bone marrow samples from patients with MYCN amplification (M1, M2, M3, M4), ATRX mutations (A1, A2), and cases lacking either alteration (S1, S2, S3, S4, S5). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001008636 
   
  
    
    Spatial transcriptome sequencing data from prostate cancer needle biopsies.
Dataset contains biopsies from before and after androgen deprivation therapy of 3 patients with 8 biopsies per patient in total.
2 sections (replicates) are taken from each biopsy. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  24 
 
  
    EGAD00001008637 
   
  
    
    Whole Genome sequencing of colorectal cancer patients (SG-BULK-5) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  66 
 
  
    EGAD00001008638 
   
  
    
    FPKM expression values of the CUP/reference/validation cohort used for tissue-of-origin prediction based on transcriptomic data 
    
   
  
    
   
  - 
 
  
    EGAD00001008639 
   
  
    
    The dataset comprises total RNA sequencing data obtained from two samples of testicular tissue from the individual M1911, who was diagnosed with meiotic arrest. 
    
   
  
    
      
      unspecified 
      
    
   
  2 
 
  
    EGAD00001008640 
   
  
    
    Whole-genome sequencing BAM files of a census-based elderly cohort of Brazilians (n=1171) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001008641 
   
  
    
    This dataset includes DNA methylation profiles before and after GH treatment (with a duration of ~18 months on average) on 47 healthy children using customized methylC-seq capture sequencing. It includes 360 fastq files (i.e. 180 paired-end fastq files) where 40 fastq files were generated with HiSeq and 320 fastq files were generated with NovaSeq. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  94 
 
  
    EGAD00001008642 
   
  
    
    WGS profiling bam files from colorectal carcinoma and adenoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  527 
 
  
    EGAD00001008643 
   
  
    
    9 tumor biopsies & 84 blood samples 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  93 
 
  
    EGAD00001008644 
   
  
    
    Spatial transcriptome sequence data from two tumour containing prostates. Entire cross section of organ divided into cubes to fit spatial transcriptomics arrays.
The dataset contains paired-end sequences from 21 sections of 1k array sections and 9 sections of 10x Visium sections for patient 1 as well as 28 sections of 10x Visium sections for patient 2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  58 
 
  
    EGAD00001008645 
   
  
    
    Results of scRNA-seq analysis of a PBMC collected from a male with a mosaic 45,X/48,XYYY karyotype 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001008646 
   
  
    
    WGS data for manuscript titled: Multi-omic features of oesophageal adenocarcinoma in patients treated with preoperative neoadjuvant therapy 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  178 
 
  
    EGAD00001008647 
   
  
    
    This dataset includes WGS data of samples from our paper titled "Dynamic phenotypic heterogeneity and the evolution of multiple RNA subtypes in Hepatocellular Carcinoma: The PLANET study." (National Science Review, nwab192. https://doi.org/10.1093/nsr/nwab192) 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001008648 
   
  
    
    This dataset includes RNA-seq data of samples from our paper titled "Dynamic phenotypic heterogeneity and the evolution of multiple RNA subtypes in Hepatocellular Carcinoma: The PLANET study." (National Science Review, nwab192. https://doi.org/10.1093/nsr/nwab192) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001008649 
   
  
    
    We performed single-cell RNA-sequencing (scRNA-seq) of cells in the bronchoalveolar lavage (BAL) fluid at 3-month follow-up of a multiple myeloma patient experiencing sarcoidosis-like pulmonary reactions after anti-BCMA CAR T-cell therapy (Sample alias: A8_3, A9_3; technical replicates). In addition, we performed scRNA-seq of an extramedullary relapse lesion at 6-month follow-up (Sample alias: B10_3). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001008650 
   
  
    
    This dataset contains 8 .bam files of shallow WGS (~0.1×) from fresh frozen tumour tissues of four matched patients and first generation PDX models. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001008651 
   
  
    
    This dataset contains paired-end fastq sequencing files (n=212) from shallow WGS of 106 dried blood spot (DBS) samples, containing 91 DBS collected from OV04 ovarian cancer PDX mice, 10 DBS collected from healthy non-tumour bearing NSG mice, and 5 DBS generated from whole blood samples from 4 OV04 ovarian cancer patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  106 
 
  
    EGAD00001008652 
   
  
    
    To study global transcriptional dynamics during human spermatogenesis we sequenced total RNA of human testicular biopsies with 5 specific histological phenotypes: Sertoli cell-only (SCO, n=3), spermatogenic arrests at the spermatogonial (SPG, n=4), spermatocyte (SPC, n=3), and spermatid (SPD, n=3) level, as well as normal spermatogenesis (Normal, n=3). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD00001008653 
   
  
    
    Single cell RNA-seq from 6 and single nuclei ATAC-seq from 3 human fetal tissue samples. Samples are from 8 to 11 weeks. Includes an 8.5-week sample with matching ATAC-seq and RNA-seq runs. Data were sequenced using 10x Genomics Chromium technology; the scRNA-seq samples belong to v2 and v3. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001008656 
   
  
    
    RNA-Seq data for manuscript titled: Multi-omic features of oesophageal adenocarcinoma in patients treated with preoperative neoadjuvant therapy 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      NextSeq 2000 
      
    
   
  79 
 
  
    EGAD00001008657 
   
  
    
    We generated a large transcriptome atlas of human skeletal muscles by collecting biopsies from 6 different muscles to determine molecular signatures that may be distinct between leg muscles. The biopsies were collected from gracilis (GR), semitendinosus (ST), vastus lateralis (VL), vastus medialis (VM), rectus femoris (RF), and gastrocnemius lateralis (GL) muscles. We also investigated molecular differences within the muscle by including two biopsies from the middle and distal sides of the semitendinosus muscle (STM and STD, respectively). In total, 128 samples from 20 individuals (aged 25 ± 3.6 yr) were analyzed. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  128 
 
  
    EGAD00001008658 
   
  
    
    Dataset includes whole genome and transcriptomic sequencing data from five T-cell acute lymphoblastic leukemia (T-ALL) patients. Whole genome sequencing was performed on both diagnostic (T-ALL sample) and control (remission sample) samples. RNA sequencing was performed on diagnostic samples. Samples were taken from the bone marrow or peripheral blood. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001008659 
   
  
    
    Whole-genome sequencing data from 38 leukemia patients and 12 leukemia cell lines, comprising 100 fastq files (two files per sample). 
    
   
  
    
      
      unspecified 
      
    
   
  11 
 
  
    EGAD00001008660 
   
  
    
    Iron accumulation in microglia has been observed in Alzheimer’s disease and other neurodegenerative disorders and is thought to contribute to disease progression through various mechanisms including neuroinflammation. To study the interaction between iron accumulation and inflammation, we treated human induced pluripotent stem cell-derived microglia (iPSC-MG) with an increasing concentration of iron, in combination with inflammatory stimuli such as interferon gamma and amyloid β, and performed RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001008661 
   
  
    
    Whole-genome sequencing (WGS) genotype data generated as part of the Interval project. The data are reported, separately per chromosome, in variant call format (VCF). The genotypes are denoted in diploid format (for chrY the genotype 1 is denoted as 1/1 and 0 is denoted as 0/0). Note that multi-allelic variants are present in the data, but encoded to appear on separate, consecutive lines. The data are reported in the following versions: unphased, phased, phased with imputation, and sites only. Note: the unphased version has additional genotype information, while the phased versions only contain the genotypes. 
    
   
  
    
   
  1 
 
  
    EGAD00001008662 
   
  
    
    Despite extensive studies on the chromatin landscape of exhausted T cells, the transcriptional wiring underlying functional and dysfunctional states of human tumor infiltrating lymphocytes (TILs) is incompletely understood. Here, we identify tissue-specific and general gene-regulatory landscapes in the wide breadth of CD8+ TIL functional states covering four cancer entities using single-cell chromatin profiling. We map enhancer-promoter interactions in human TILs by integrating single-cell chromatin accessibility with single-cell RNA-seq data from tumor entity-matching samples and prioritize key elements by super enhancer analysis. Our analysis reveals a human core chromatin trajectory to TIL dysfunction and identifies key enhancers, transcriptional regulators, and deregulated target genes involved in this process. Finally, we validate enhancer regulation at loci encoding PD1, TCF1, and TIM3 by targeting non-coding regulatory elements with potent CRISPR activators and repressors. In summary, our study advances the understanding of molecular regulation of TIL (dys-)function and provides a framework for modulating immunotherapeutic relevant TIL genes via their enhancers. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  49 
 
  
    EGAD00001008663 
   
  
    
    To investigate the influence of lifelong exercise training on the response of skeletal muscle to a bout of acute exercise, we generated global transcriptomic data from long-term endurance (8 men) and strength (8 men) trained individuals and healthy age-matched untrained controls (8 men). Skeletal muscle biopsies were taken from M. vastus lateralis before, directly after, and 1 h and 3 h after acute exercise. All subjects completed one bout of acute endurance exercise and one bout of acute resistance exercise, separated by 4-8 weeks. All 192 samples were multiplexed in 4 lanes and sequenced (2x250bp paired end) on the Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  192 
 
  
    EGAD00001008664 
   
  
    
    Whole exome sequencing data of CLL patients 
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001008665 
   
  
    
    ChIPseq data for CLL patients 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001008666 
   
  
    
    The dataset contains 90 lung cancer and 5 non-cancerous lung lesion plasma cfDNA samples collected in EDTA blood collection tubes. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  95 
 
  
    EGAD00001008667 
   
  
    
    Groups of cells belonging to different ploidy populations (based on PI staining) were collected from an undifferentiated soft tissue sarcoma. The different ploidy populations underwent RRBS, after which copy number signatures for the ploidy-sorted populations and the bulk population were extracted. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001008668 
   
  
    
    We investigated the impact of ploidy heterogeneity on copy number inference at a single cell level using fluorescence-activated cell sorted (FACS) nuclei from an undifferentiated soft tissue sarcoma. FACS revealed the presence of three aberrant subpopulations, including a haploid, a near diploid and a whole genome doubled population. Once sorted, single cell nuclei underwent whole genome sequencing using the chromium CNV single cell DNA library kit (10X Genomics). We sequenced single normal nuclei (2n) and single aberrant / tumour nuclei (1n, 2n and 2n+). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  4 
 
  
    EGAD00001008669 
   
  
    
    Beta values of methylation data of the CUP/reference/validation cohort (H021) used for the validation cohort described in the publication 
    
   
  
    
   
  1 
 
  
    EGAD00001008671 
   
  
    
    This study used whole exome sequencing on 21 patients with cholesteatoma from 10 families in order to identify variants that may contribute to cholesteatoma aetiology. Exomes were enriched using hybridisation selection and subjected to DNA sequencing. This dataset is formed of two batches, as they were sequenced at different times. Batch-1 exomes were selected using Nimblegen capture (4-plex) and sequenced on the Illumina HiSeq 4000, and batch-2 exomes were selected using Agilent SureSelect Human All Exon v6 and sequenced on the Illumina NovaSeq 6000. All samples within the same family were processed within the same batch. This dataset comprises BAM files mapped to GRCh38 using cgpMAP v3.2.0 (bwa-mem). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD00001008672 
   
  
    
    The genomic VCF data of the Integrative proteogenomic characterization of early esophageal cancer project. This dataset contains 90 VCF files. 
    
   
  
    
   
  90 
 
  
    EGAD00001008673 
   
  
    
    This study contains methyl-binding domain sequencing and shallow whole genome sequencing from circulating free DNA (cfDNA) for 79 patients with small cell lung cancer (SCLC) and 78 non-cancer controls. We also sequenced genomic DNA (both methyl-binding domain sequencing and shallow whole genome sequencing) from 30 circulating-tumour-cell derived explant models (CDXs, from 23 unique patients with SCLC), 20 patient derived explant models (PDXs, from 10 unique patients with SCLC) and 13 lung tissue samples. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001008674 
   
  
    
    To investigate the mode of action and potential side-effects, we analyzed differential gene expression in postmitotic C9orf72 iPSC-Neurons by RNAseq. The cells were treated in a 2 dose regime at 10 µM in 0.1 % DMSO for 9 days. Compound treated iPSC-Neurons were washed with PBS, frozen on dry ice and stored at -80°C until RNA isolation. The RNA was isolated using miRNA Mini Kit (Qiagen) using 700 μl of Qiazol. A total of 250 ng of RNA per sample was processed for mRNA library preparation as per the manufacturer’s instructions for Illumina® Stranded mRNA Prep Ligation to be used with the IDT® for Illumina® RNA UD Indexes Set B and sequenced using NextSeq 500/550 High Output Kit v2.5 (Illumina) on NextSeq 550 (Illumina). The data was processed and analyzed using CLC genomics workbench (Version 21.0.3, Qiagen). 
    
   
  
    
      
      NextSeq 550 
      
    
   
  27 
 
  
    EGAD00001008675 
   
  
    
    Whole-exome sequencing data in fastq format of matched tumour and germline DNA from 8 patients with metastatic basal cell carcinoma. 
Samples are labeled as Primary, Local, Metastasis or Germline:
Primary: Sample was obtained from primary tumour.
Local: Sample was obtained from local recurrence of primary tumour.
Metastasis: Sample was obtained from a metastatic site.
Germline: Sample was obtained from normal adjacent tissue.
DNA was isolated from FFPE tissue sections of the tumor biopsies using the AllPrep DNA/RNA FFPE Kit (Qiagen) and quality controls were conducted using the Qubit fluorometer (Thermo Fisher Scientific). Libraries were prepared using the Agilent SureSelect Human All Exon v7 XTHS2 probes and sequenced on a NovaSeq 6000 (S2, 2x100bp). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD00001008676 
   
  
    
    We detected a uniparental paternal isodisomy event of chromosome 4 in a child. DNA was extracted from the blood. HiSeq X generated the sequence data. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001008677 
   
  
    
    We generated whole genome sequence data from a family of monozygotic twins. DNA was obtained from blood, buccal epithelial cells, placenta, and umbilical cords of monozygotic twins. DNA from the parents was also obtained. Libraries were generated using TruSeq PCR-Free, TruSeq Nano, and NEBNext Ultra II depending on the availability of DNA. Data was generated on the Illumina NovaSeq platform. Raw sequence data was aligned to the human reference genome GRCh38 using the bwa mem aligner. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001008683 
   
  
    
    This dataset contains 278 miRNA and 20 mRNA transcriptomes generated as part of the study "miR-374a-5p regulates inflammatory genes and monocyte function in inflammatory bowel disease." 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  298 
 
  
    EGAD00001008685 
   
  
    
    The dataset contains whole exome sequencing data (libraries prepared using the Agilent SureSelect Human All Exon V6 kit, and paired-end sequenced on Illumina HiSeq4000 (2 x 150bp)) of 6 samples taken from peripheral blood mononuclear cells (PBMCs) (Samples 1-5) and bone marrow (BM). Data are provided as fastq files. Sample 1 was taken prior to initial venetoclax treatment. Sample 2 was taken at disease progression on venetoclax. Sample 3 was taken during response to BTK and PI3K inhibitor therapy. Sample 4 and the BM sample were taken at disease progression/prior to venetoclax re-treatment. Sample 5 was taken during venetoclax re-treatment response. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001008686 
   
  
    
    This dataset contains RNA-seq, ChIP-seq, WGBS and ATAC-seq of 1 human muscle stem cell sample. 
NCAM+ITGB1+ CD31−CD45−CD34− were used as the sorting strategy for the sample. 
H3K27ac, H3K27me3, H3K4me1, H3K4me3, H3K36me3 and H3K9me3 are the targets of ChIP-seq. 
    
   
  
    
      
      NextSeq 500 
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001008687 
   
  
    
    The mutagenicity of bacteria was assessed by serially exposing human small intestinal organoids to various bacterial species or isolated toxins.
We have used the following abbreviations: 
EWT: Organoids exposed to E. coli described in PMID: 32106218
EKO: Organoids exposed to isogenic E. coli as EWT, with knockout of the deltaClbQ gene, rendering them unable to produce colibactin 
DYE: Organoids exposed to FastGreen injection control dye
NIS: Organoids exposed to E. coli Nissle
ETBF: Organoids exposed to the protease toxin BFT produced by ETBF bacteria. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001008688 
   
  
    
    ChIP-seq and matched input data of AR, FOXA1 and H3K27ac for mCRPC patient samples taken prior to and after treatment with AR targeted therapy 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  68 
 
  
    EGAD00001008689 
   
  
    
    372 samples consisting of 185 patient paired CD138+ tumor and non-involved DNA pairs, plus 5 Horizon Diagnostic known mutation standards (HD). Samples were processed using the KAPA HyperCap protocol and hybridized onto a targeted panel for multiple myeloma and associated diseases. Reference: Sudha et al., Clinical Cancer Research, 2022. 
    
   
  
    
   
  372 
 
  
    EGAD00001008690 
   
  
    
    Long-read genome sequencing performed on the Oxford Nanopore Technologies' PromethION to resolve variants underlying breast cancer susceptibility in sixteen individuals with pathogenic germline SVs in BRCA1, BRCA2, CHEK2 or PALB2. 
    
   
  
    
      
      PromethION 
      
    
   
  16 
 
  
    EGAD00001008691 
   
  
    
    TCRab sequencing was performed on viably frozen cells from 11 T-LGLL samples from 9 T-LGLL patients and 6 age-matched healthy samples. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  68 
 
  
    EGAD00001008692 
   
  
    
    This data set contains whole exome and transcriptome data from 47 cases with BPDCN. For exome data, bam files are provided (mapped against GRCh38); for transcriptome data, raw fastq files (paired-end data) are provided. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  97 
 
  
    EGAD00001008693 
   
  
    
    In this study we employed Laser Capture Microdissection (LCM) for the multimodal profiling of lung macrophage cell populations as a function of location within the healthy tissue. In detail, macrophage mini-bulks (~100 cells) were collected from 4 healthy human donors in 5 different locations of the airways (a total of 20 biopsies), including parenchyma (L1 – lower left lobe (LLL); L6 – 80% distance from LLL tip), trachea (L2), bronchi (L3 – 1st/2nd generation; L5 – 3rd/4th generation), and processed for ATAC-seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  39 
 
  
    EGAD00001008694 
   
  
    
    In the reported study, we employed Laser Capture Microdissection (LCM) for the transcriptome profiling of lung macrophage cell populations as a function of location within the healthy tissue. In detail, macrophage mini-bulks (100 cells each) were collected by LCM from 4 healthy human donors in 5 different locations of the airways (a total of 20 biopsies), including parenchyma (L1 – lower left lobe (LLL); L6 – 80% distance from LLL tip), trachea (L2), bronchi (L3 – 1st/2nd generation; L5 – 3rd/4th generation), and processed for RNA-seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD00001008695 
   
  
    
    This dataset contains bam files mapped to hg19 (exome and panel) or hg38 (RNA), derived from either primary bone marrow cells or sorted human cells after long-term engraftment in NSG mice treated with LOXL inhibitor. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  147 
 
  
    EGAD00001008696 
   
  
    
    Circulating cell-free methyl-DNA (mcfDNA) contains promising cancer markers but its low abundance and possibly diverse origin pose challenges toward the accurate diagnosis of early stage cancers. By whole-genome bisulfite sequencing (WGBS) of cell-free DNA (cfDNA) from about 0.5 mL plasma of mice xenografted with human tumors, we obtained and aligned the reads to the human genome, filtered out the mouse and carrier bacterial sequences, and confirmed the tumor origin of methyl-cfDNA (mctDNA) by methylation-sensitive restriction enzyme digestion prior to species-specific PCR. We estimated that human tumor-specific reads (ctDNA) or mctDNA comprised about 0.29 or 0.01%, respectively of the xenograft mouse cfDNA, and about 0.029 or 0.001% of the cfDNA of human early stage cancer patients. Similar WGBS of early stage (0-II, node- and metastasis-free) breast, lung or colorectal cancer samples identified hundreds of specific DMRs (differentially methylated regions) compared to healthy controls. Their association with tumourigenesis was supported by stage-dependent methylation, tumor suppressor or oncogene clusters, and genes also identified in the xenograft samples. Using 20 three-cancer-common and 17 colorectal cancer-specific DMRs in combination (top 0.0018% of the WGBS methylation clusters) was sufficient to distinguish the stage I colorectal cancers from breast and lung cancers and healthy controls. Our data thus confirmed the tumor origin of mctDNA by sequence specificity, and provide a selection threshold for authentic tumor mctDNA markers toward precise diagnosis of early stage cancers solely by top DMRs in combination. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  24 
 
  
    EGAD00001008697 
   
  
    
    This dataset includes genome-wide autosomal array data and whole mtDNA sequences for 12 Resande and 17 Swedish individuals. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  27 
 
  
    EGAD00001008699 
   
  
    
    Transcriptome sequencing of nine patients diagnosed with chondrosarcoma. cDNA was generated using the NuGEN Ovation RNA-Seq FFPE system. Total RNA was randomly primed and thermally sheared and the resulting cDNA fragment was amplified. The cDNA was then mechanically sheared using sonication to generate ~250 bp fragments. Sequencing libraries were generated using the NuGEN Ovation Ultralow System V2 library prep kit and sequenced on HiSeq2500 TruSeq v3. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001008700 
   
  
    
    Dataset comprises one vcf file containing variants from a list of genes (DNA repair and metabolism associated genes) subset from WES of an adult AML cohort. The cohort contains 145 patient samples. WES was performed using Illumina platform. 
    
   
  
    
   
  1 
 
  
    EGAD00001008701 
   
  
    
    Advanced Visium Spatial Gene Expression assay for FFPE tissues with human and SARS-CoV-2 whole transcriptome (WT) information at a 55 µm resolution. The dataset consists of 13 tissue sections from 5 patient lung tissue samples, 3 from COVID-19 patients and 2 from control patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001008702 
   
  
    
    Dataset contains 5 exome BAM files from a child with neurofibromatosis and relapsed refractory acute lymphoblastic leukaemia. The samples are CD19 positive and CD19 negative bone marrow mononuclear cells at both diagnosis and relapse as well as mesenchymal stem cells as the germline control. Libraries were prepared using the SureSelect Clinical Research Exome v2 kit (Agilent Technologies, Santa Clara, CA, USA) and run on the Illumina NextSeq 500 platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001008705 
   
  
    
    The dataset is composed of raw RNA sequencing (n=6), targeted DNA sequencing (n=18) and whole exome sequencing (n=17) data from 19 patients with IG-MYC-rearrangements. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001008706 
   
  
    
    Primary central nervous system lymphoma (PCNSL) is a distinct extranodal lymphoma presenting with limited stage disease but variable response rates to treatment despite homogeneous pathological presentation. The likely underlying molecular heterogeneity and its clinical impact are poorly understood. The present dataset contains paired-end whole-exome sequencing information (n=115; HyperExome Kapa hyperprep), paired-end RNA-seq information (n=123; KAPA mRNA HyperPrep Kits), and paired-end bisulfite sequencing (n=64; TruSeq Methyl Capture EPIC) from fresh-frozen tumor tissue of immunocompetent, treatment naïve PCNSL patients. Additionally, the dataset includes single-end RNA-seq (n=93; QuantSeq 3’ mRNA-Seq Library Prep Kit) from formalin-fixed, paraffin-embedded tissue of immunocompetent, treatment naïve PCNSL patients. All samples were sequenced on an Illumina NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008707 
   
  
    
    Total RNA sequencing of cultured OM cells derived from patients with Alzheimer's disease (AD), individuals with mild cognitive impairment (MCI) and cognitively healthy controls. 
    
   
  
    
   
  1 
 
  
    EGAD00001008708 
   
  
    
    Human (n=34) and mouse (n=68) melanoma tumor WXS dataset. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  102 
 
  
    EGAD00001008709 
   
  
    
   
  
    
   
  - 
 
  
    EGAD00001008710 
   
  
    
    WES for 15 multiple meningioma samples from 6 individuals 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001008711 
   
  
    
    We analyzed the cell free DNA methylomes using 67 plasma samples from patients with mCRPC prostate cancer in the VPC project. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  62 
 
  
    EGAD00001008712 
   
  
    
    We analyzed the cell free DNA methylomes using 14 plasma samples from patients with mCRPC in the Barrier cohort. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  14 
 
  
    EGAD00001008713 
   
  
    
    We analyzed the cell free DNA methylomes using 22 plasma samples from patients with mCRPC prostate cancer in the WCDT project. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  22 
 
  
    EGAD00001008714 
   
  
    
    Heatrich-BS was performed on 14 patients monitored across 5-8 timepoints each. The predicted tumor fraction trend was compared with CEA values and tumor measurements from CT scans. 
    
   
  
    
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  79 
 
  
    EGAD00001008715 
   
  
    
    Heatrich-BS was performed on cell-free DNA from 5 healthy volunteers and 15 CRC patients. The Heatrich-BS predicted tumor fractions were compared with tumor burden values obtained by genomic methods such as targeted amplicon sequencing and low pass sequencing. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina MiSeq 
      
    
   
  20 
 
  
    EGAD00001008716 
   
  
    
    This dataset consists of shallow whole genome sequencing data and amplicon sequencing data for 26 ovarian cancer patients (21 high-grade serous ovarian cancer, 4 low-grade serous ovarian cancer and 1 clear cell ovarian cancer). The data are provided as single end FASTQ files for the shallow whole genome sequencing data (31 libraries) and paired end FASTQ files for the amplicon sequencing data (98 libraries). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001008717 
   
  
    
    Dataset including 13 sequenced mtDNA genome samples. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  13 
 
  
    EGAD00001008718 
   
  
    
    This dataset contains the raw WGBS data generated for CD14 monocytes from 7 hospitalized COVID-19 patients sampled at various time points (admission, day 5 and day 15); in total, 15 biospecimens are available. WGBS libraries have been sequenced on Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001008721 
   
  
    
    BAM files consisting of PET cases and healthy cases 
    
   
  
    
      
      Sequel 
      
    
   
  20 
 
  
    EGAD00001008722 
   
  
    
    Engineered Human Primary T Cell transcriptome study 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001008723 
   
  
    
    CLL PBMCs were isolated using a Ficoll gradient. They were treated with IBET762 or DMSO as solvent control, and ATAC-seq was performed on them. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001008724 
   
  
    
    Mixture of 4 unrelated individuals sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQ files are submitted as unprocessed files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008725 
   
  
    
    Deconvoluted files of the 5-9 individuals of in silico datasets (combination of biological mixture sequencing and publicly available data). The dataset includes the phenotypes used for clustering. 
    
   
  
    
   
  1 
 
  
    EGAD00001008726 
   
  
    
    Mixture of 2 unrelated individuals sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQ files are submitted as unprocessed files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008727 
   
  
    
    Mixture of 2 unrelated individuals (of close mtDNA haplogroup) sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQ files are submitted as unprocessed files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008728 
   
  
    
    Whole exome sequences serving as a reference for the individuals. Includes the FASTQ files of each individual (labelled S1-S5) and the called variants in VCF format, merged across all individuals. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008729 
   
  
    
    Mixture of 3 unrelated individuals sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQ files are submitted as unprocessed files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008730 
   
  
    
    Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection. This cohort comprises a subset of patients enrolled in the Genomic Advances in Sepsis (GAinS) study, an established biobank of adult sepsis patients. Patients with sepsis due to community acquired pneumonia or faecal peritonitis were recruited from 34 hospitals across the UK from 2005-2018, with samples for functional genomics and detailed clinical information collected on the first, third and/or fifth day following ICU admission. RNA was extracted from leukocytes isolated at the bedside using LeukoLOCK kits.
We have previously identified sepsis response signatures (SRSs), transcriptomic endotypes that are associated with differential early mortality (Davenport et al, Lancet Respir Med, 2016; Burnham et al, AJRCCM, 2017) and response to treatment in a clinical trial (Antcliffe et al, AJRCCM, 2018). We generated RNA sequencing data on 903 samples, including 134 samples repeated from our previously released microarray data. Libraries were prepared using NEB Ultra II Library Prep kits (Illumina) and sequenced on a NovaSeq 6000. Reads were aligned to the reference genome (GRCh38) using STAR and gene counts quantified using featureCounts (annotation Ensembl v99). Counts were TMM-normalised and log-transformed. Following QC, processed data were available on 864 samples from 667 unique patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  909 
 
  
    EGAD00001008731 
   
  
    
    Although cross-species transcriptional analysis has been generated for DCs, transcriptomic conservation between mouse and human FRCs at single-cell resolution has been unclear. To test whether GREM1+ FRCs might also play a role in DC homeostasis in humans, we performed scRNA-seq of CD45–PDPN+ stromal cells, as well as CD45+CD11c+ immune cells from healthy human LNs of three human donors. Data was generated using the 10x platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001008732 
   
  
    
    Consists of 14 BAM files 
    
   
  
    
      
      Sequel 
      
    
   
  14 
 
  
    EGAD00001008733 
   
  
    
    Whole exome sequencing from pre-treatment samples and matched blood normals from 22 patients. Of these individuals, on-treatment samples are available for a subset of 18 patients. RNA sequencing from pre-treatment samples from 21 patients. Of these individuals, on-treatment samples are available for a subset of 15 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  98 
 
  
    EGAD00001008734 
   
  
    
    Chromium 10x scRNA of 6 metastatic colorectal cancer organoids 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001008735 
   
  
    
    This dataset comprises the BAM files from targeted genome sequencing of CD138+ selected myeloma tumour samples from 21 individuals. In 5 cases there is only one tumour sample, but in the other 16 cases there are sequential samples, spanning treatment relapses. These are denoted Tumor A, B, C etc. There are therefore a total of 48 myeloma tumour samples.
For each individual there is also a germline control sample, obtained either from peripheral blood or from CD138-selected bone marrow cells. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  69 
 
  
    EGAD00001008736 
   
  
    
    This data set contains whole exome sequences from 8086 (mostly British Pakistani/British Bangladeshi, mostly self-reported parentally related) individuals from the following studies:
1. 5236 British Pakistani/British Bangladeshi adults from East London Genes & Health, now known as Genes & Health
2. 2624 British South Asian mothers from Born in Bradford (mostly Pakistani)
3. 1061 British South Asian adults from Birmingham (mostly Pakistani)
This dataset contains all the exome sequence data available for this study on 2022-04-26 
    
   
  
    
   
  1 
 
  
    EGAD00001008737 
   
  
    
    We analyzed the cell free DNA methylomes using 72 plasma samples from patients with mCRPC prostate cancer in the VPC project for validation. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology. Files from multiple lanes exist per sample. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  72 
 
  
    EGAD00001008739 
   
  
    
    Single nuclei RNA sequencing (snRNA-seq) on tissue samples from 11 patients (SDHB and RET). Files are fastq files of 10x-5'scRNAseq libraries. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001008740 
   
  
    
    - RNA-sequencing data: 5 normal pleurae and 40 malignant pleural mesotheliomas
- Targeted DNA-sequencing of the 165 genes included in the “Solid and Haematological tumors” panel (BRIGHTCore, Brussels, Belgium): 6 MPM samples.
2 FASTQ files for each sample (paired). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001008741 
   
  
    
    RNA-sequencing data for 12 MPM cell lines treated with 0.1% DMSO or 1 µM palbociclib for 9-10 days.
The experiment was performed in duplicate for sensitive cells (MPM08, MPM21, MPM38, MPM57, MPM59, Meso11, Meso13, Meso34 and Meso56), except for Meso11, which was done in triplicate, and only once for resistant cells (MPM31, MPM34 and MPM36). For Meso11, Meso13, Meso34 and Meso56, a drug washout of 48 hours was also performed. 4 MPM cell lines (MPM31, MPM34, MPM59 and MPM66) were also analyzed in the untreated condition. 
2 FASTQ files for each sample (n=56) (paired) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  56 
 
  
    EGAD00001008742 
   
  
    
    Paired RNA-seq of bisulfite treated VDH01 samples, either control or depleted for NSUN3 (4 replicates each). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq.
Paired RNA-seq of fCAB treated vdh01 samples, either control or depleted for NSUN3 (4 replicates each). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq.
Paired RNA-seq of fCAB treated VDH01 samples (3). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq.
Paired RNA-seq of bisulfite treated VDH01 samples (3). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  22 
 
  
    EGAD00001008743 
   
  
    
    Single RNA-seq of fCAB treated tRNAs from vdh01 samples (5 replicates). tRNAs were extracted using the mirVana kit (Invitrogen). The samples were prepped with Illumina TruSeq Small RNA and sequenced on Illumina NextSeq 550. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  5 
 
  
    EGAD00001008744 
   
  
    
    This dataset contains raw .fastq files of a whole exome sequencing experiment on primary mediastinal large B-cell lymphoma and contains 8 tumor-normal pairs and 14 unpaired tumor samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD00001008745 
   
  
    
    The cellular origin and differentiation status of glioblastoma by scRNA-seq 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001008746 
   
  
    
    We generated a single-cell RNA-seq atlas capturing over 100,000 cells spanning all stages of the mouse cerebral development. By examining data from over 100 cerebral tumour samples, our study reveals that, despite the phenotypic/genotypic differences between the tumour types, they are all comprised of developmental sublineages that map most closely to embryonic or juvenile stages of development. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001008747 
   
  
    
    This dataset consists of 1 sample 
    
   
  
    
      
      Sequel 
      
    
   
  1 
 
  
    EGAD00001008748 
   
  
    
    This dataset consists of 18 samples 
    
   
  
    
      
      Sequel 
      
    
   
  18 
 
  
    EGAD00001008752 
   
  
    
    Osteochondral explants were obtained from knee joints (n=17 explants) from the RAAK study. Paired-end 2x150 bp RNA sequencing (Illumina TruSeq mRNA Library Prep Kit, Illumina HiSeq X) was performed. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  17 
 
  
    EGAD00001008753 
   
  
    
    HGSOC patient-derived organoids and their tissue of origin 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001008755 
   
  
    
    WES files (FASTQ) from 19 germline DNA, 22 tumor DNA, 5 patient-derived xenograft (PDX) DNA, and 9 plasma cfDNA samples from 28 BRAF-mutated CRC patients, collected at baseline, prior to anti-BRAF/EGFR therapies. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  55 
 
  
    EGAD00001008756 
   
  
    
    41 WGS DNA sequences from: Phase I trial of CX-5461, a first-in-class G-quadruplex stabilizer in patients with advanced solid tumors enriched for DNA-repair deficiencies (CCTG IND.231) 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  56 
 
  
    EGAD00001008758 
   
  
    
    This dataset consists of functional genomic data from 20 Ankylosing Spondylitis patients and 35 Healthy Controls taken from CD4+ T cells, CD8+ T cells and CD14 Monocytes. It contains 364 paired end fastq files consisting of 104 total RNA-seq samples and 116 ATAC samples, for ChIP there are 46 H3K4me3 samples, 46 H3K27ac samples and 3 H3K4me1 samples, along with 49 paired input samples. The samples were sequenced on Illumina HiSeq4000, Illumina NextSeq500 and Illumina NovaSeq 6000 platforms. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  364 
 
  
    EGAD00001008759 
   
  
    
    DNA was extracted from archival tissue of 119 patients with various SGC subtypes and sequenced using a targeted NGS panel encompassing 523 cancer related genes (TruSight Oncology 500, TSO500). This dataset belongs to the publication entitled: 'Identification of fusion genes and targets for genetically matched therapies in a large cohort of salivary gland cancer patients'. 
    
   
  
    
   
  119 
 
  
    EGAD00001008760 
   
  
    
    Bulk tumor RNAseq FASTQ files from 124 samples from patients with hormone sensitive or castration resistant prostate cancer. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  124 
 
  
    EGAD00001008761 
   
  
    
    Clinical data from this cohort of patients, including hormone sensitive or castration resistant prostate cancer, overall survival, NMF subtypes, tumor TMB, prior treatment status, PD-L1 IC and TC scores from SP142 and SP263 as well as percentage of CD8 IHC at tumor center. 
    
   
  
    
   
  1 
 
  
    EGAD00001008762 
   
  
    
    Raw count matrix of the 124 bulk tumor RNAseq samples from patients with hormone sensitive or castration resistant prostate cancer. 
    
   
  
    
   
  1 
 
  
    EGAD00001008763 
   
  
    
    This dataset includes bam files (aligned to hg38) from the germline of pediatric cancer patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001008764 
   
  
    
    The single base substitution mutational signatures SBS2 and SBS13, likely caused by APOBEC cytosine deaminases, are common in many human cancer types. However, the stimulus activating APOBEC mutagenesis is unknown and understanding of when it occurs in the progression from normal to cancer cell is limited. Here, as part of a wider survey of human tissues, we whole genome sequenced 342 microdissected normal epithelial crypts from the small intestines of 39 individuals. SBS2/13 mutations were present in 17% normal small intestine crypts and were likely due to APOBEC3A activity. Localised clusters of SBS2/13 mutations (kataegis) were also commonly found. APOBEC mutation burdens were variable between individuals and between crypts from the same individual. Crypts with SBS2/13 often had immediate crypt neighbours without SBS2/13, suggesting that the underlying cause of SBS2/13 is cell-intrinsic rather than a widely distributed microenvironmental exposure, or needs to be permitted by cell-intrinsic conditions. APOBEC mutagenesis occurred throughout the human lifespan, including in young children, and was episodic with a small number of episodes occurring during the life history of a single cell. The results indicate that APOBEC mutagenesis is more common in the small intestine epithelium than in many other cell types, and is an episodic process in vivo initiated or permitted by cell intrinsic factors. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  408 
 
  
    EGAD00001008765 
   
  
    
    The dataset contains fecal WGS samples of 196 participants in the HELIUS study, as well as VLP filtrate sequencing for a subset of 48 participants. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  244 
 
  
    EGAD00001008766 
   
  
    
    This dataset contains 56 fastq files of paired-end RNA sequencing of an Illumina® TruSeq stranded mRNA library of 28 renal cell carcinoma PDX samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD00001008767 
   
  
    
    This study included nasal gene expression data collected from nasal brushes of adolescents in the PIAMA birth cohort, which are used in the project 'Nasal DNA methylation at three CpG sites predicts childhood allergic disease'. 186 samples were included in this analysis, and phenotypes (age and sex) were provided together with a gene expression count table. Gene expression was measured by the Illumina HiSeq2500 sequencer. 
    
   
  
    
   
  1 
 
  
    EGAD00001008768 
   
  
    
    Dataset contains paired-end whole exome sequencing data from 5 tumor samples and 1 normal blood sample from a single primary GBM patient. 
    
   
  
    
   
  6 
 
  
    EGAD00001008769 
   
  
    
    WGS data from healthy reference iPSC lines. The median coverage is 41-50x. >85% have a coverage of >30x. 97% of the variants are known. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001008770 
   
  
    
    We assessed the transcriptome and chromatin states of patient and control samples at both bulk and single-cell resolutions with RNA-seq and ATAC-seq. Maternal-fetal interface samples were collected from 7 patients infected with SARS-CoV-2 during late pregnancy, and from 7 gestational age-matched control donors. Raw and processed files are provided in this dataset. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  14 
 
  
    EGAD00001008771 
   
  
    
    The data set includes MBL2 genotypes and clinical phenotypes of a cohort of patients with critical Covid-19. It comprises a vcf file with single nucleotide variants and a file with clinical phenotypes. 
    
   
  
    
   
  331 
 
  
    EGAD00001008772 
   
  
    
    This dataset contains 6 Fastq files from 3 samples (pre-culture (n=1), post culture in standard (n=1) or niche-like (n=1) conditions) from 1 AML patient. They correspond to single-cell RNA sequencing on an Illumina platform of 3 libraries prepared with 10X Genomics gene expression V3.1 chemistry. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001008773 
   
  
    
    This dataset contains 18 Fastq files from 9 samples (pre-culture (n=3), post culture in standard (n=4) or niche-like (n=2) conditions) from 4 patients. They correspond to bulk RNA sequencing on an Illumina instrument of libraries prepared using SMARTer Universal Low Input RNA Kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001008774 
   
  
    
    1075 members of the LBC1936 were sequenced using the Illumina HiSeq X platform. This dataset contains the paired fastq files. 
    
   
  
    
      
      Illumina HiSeq X Ten 
      
    
   
  1075 
 
  
    EGAD00001008775 
   
  
    
    297 members of the LBC1921 were sequenced using the Illumina HiSeq X platform. This dataset contains the fastq files. 
    
   
  
    
      
      Illumina HiSeq X Ten 
      
    
   
  297 
 
  
    EGAD00001008777 
   
  
    
    This submission includes raw FASTQ files (for bulk RNA-seq and 10X joint snATAC+snRNA multiome profiling experiments), sample phenotype files, and genotypes for the data included in the manuscript. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001008778 
   
  
    
    Whole-genome sequencing (WGS) data. 
    
   
  
    
      
      unspecified 
      
    
   
  5 
 
  
    EGAD00001008779 
   
  
    
    Single-cell DNA sequencing (scDNA-seq) data. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001008780 
   
  
    
    Excess sugar consumption is common among youth and can have adverse health effects. However, the relationship between saliva microbiota and sugar consumption remains sparsely studied. We aimed to explore diversity, composition and functional capacities of saliva microbiota in 11–13-year-old Finnish children with low and high sweet treat consumption. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  453 
 
  
    EGAD00001008781 
   
  
    
    The dataset comprises transcriptomes of tissue sections derived from either the tumour-normal interface or tumour core from clear cell renal cell carcinomas. 16 sections are sampled in total using 10x Genomics' Visium technology. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001008782 
   
  
    
    Column 1 “rsid”: SNP identifier
Column 2 “chromosome”: name of chromosome on which the SNP is located
Column 3 “position”: base pair position on the chromosome
Column 4 “minor_test_allele”: the base that constitutes the minor allele
Column 5 “major_allele”: the base that constitutes the major allele
Column 6 “maf”: the frequency of the minor allele, indicated as a fraction of 1
Column 7 “allele_freq_cases”: the minor allele frequency in cases
Column 8 “allele_freq_controls”: the minor allele frequency in controls
Column 9 “regression_pvalue”: the p-value for the difference in allele frequency between cases and controls
Column 10 “odds_ratio”: the odds ratio, as calculated using logistic regression under an additive model with adjustment for the first ten principal components of ancestry 
    
   
  
    
   
  1 
 
  
    EGAD00001008783 
   
  
    
    This dataset contains one vcf file with variants from whole exome sequencing of 24 paediatric AML samples at diagnosis. 
    
   
  
    
   
  1 
 
  
    EGAD00001008785 
   
  
    
    Clinical data for KATHERINE: Clinical data include Treatment Arm, Invasive Disease Free Survival (IDFS), Clinical Stage at Presentation, Hormone Receptor Status, Preoperative HER2 Directed Therapy, Pathological Node. 
    
   
  
    
   
  1059 
 
  
    EGAD00001008786 
   
  
    
    Biomarker data for KATHERINE: Biomarker data include RNA-seq time point, Percent of tumor content, PAM50 subtypes, normalized gene expression of ERBB2, CD8 and CD274, normalized immune signature expression. 
    
   
  
    
   
  1059 
 
  
    EGAD00001008788 
   
  
    
    Chronic obstructive pulmonary disease (COPD) is a major respiratory disease characterized by small airway inflammation, emphysema and severe breathing difficulties. Low-grade systemic inflammation is an established hallmark of severe disease, however, the molecular changes in peripheral immune cells remain far from understood. We combined multi-color flow cytometry with single-cell RNA sequencing and showed that blood neutrophil numbers are significantly increased in COPD and they are a heterogeneous population. A transcriptomic state that expressed interferon response genes correlated with alveolar damage and acute exacerbations. Furthermore, bronchoalveolar neutrophils expressed gene signatures corresponding to certain blood neutrophil states. Last, our data in a murine model of cigarette smoke exposure demonstrated that bone marrow neutrophil progenitors are expanded in smoke-treated animals and display signs of immune activation. Our study provides evidence that COPD systemic inflammation may derive from an activated haematopoietic precursor compartment. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD00001008789 
   
  
    
    Results of comprehensive immune deconvolution analysis through the TIMER2 web portal with algorithms specified in the publication. 
    
   
  
    
   
  1 
 
  
    EGAD00001008790 
   
  
    
    Recurrently altered genes based on FoundationOne sequencing. 
    
   
  
    
   
  - 
 
  
    EGAD00001008791 
   
  
    
    TCR-beta sequences, frequencies, and VDJ usage. 
    
   
  
    
   
  - 
 
  
    EGAD00001008792 
   
  
    
    "Master" file of patient clinical characteristics and outcomes, samples, and the results of certain analyses, including immunohistochemistry. 
    
   
  
    
   
  1 
 
  
    EGAD00001008793 
   
  
    
    Log2 gene expression count data from RNA sequencing. 
    
   
  
    
   
  1 
 
  
    EGAD00001008794 
   
  
    
    TCR-beta specificity motifs based on GLIPH2. 
    
   
  
    
   
  - 
 
  
    EGAD00001008796 
   
  
    
    Individual FASTQ files from RNA sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  66 
 
  
    EGAD00001008797 
   
  
    
    Individual FASTQ files from TCR sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001008798 
   
  
    
    Illumina platform whole genome sequencing data for matched tumour-normal DNA samples from 570 melanoma patients 
    
   
  
    
   
  1139 
 
  
    EGAD00001008799 
   
  
    
    This dataset includes fastq files for total RNAseq of 104 patient biopsies with metastatic castration resistant prostate cancer. The RNAseq libraries were rRNA depleted and sequenced at 150bp paired-end on Illumina Novaseq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  104 
 
  
    EGAD00001008800 
   
  
    
    We copy number profiled 688 tumor regions from 300 patients presenting with advanced prostate cancer and prospectively followed-up (median, 7 years) in the control group of the STAMPEDE trial. Patients were categorised into four metastatic states, namely high-risk non-metastatic (with or without local lymph node involvement) or metastatic (low or high volume). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  603 
 
  
    EGAD00001008801 
   
  
    
    This dataset contains chromosomal conformation capture data from fourteen samples (eleven tumor samples and three tumor derived cell lines). Libraries were prepared using the Illumina TruSeq LT sequencing adaptors. Sequencing was performed on the HiSeq X or NovaSeq platforms resulting in 28 FASTQ files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001008802 
   
  
    
    Whole genome sequencing of six commonly used breast cancer cell lines and six patient derived xenograft models 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  14 
 
  
    EGAD00001008805 
   
  
    
    This dataset contains Whole Genome Bisulfite sequencing data from seven samples (six tumor samples and one tumor derived cell line). Sequencing was performed on an Illumina HiSeq 2000 machine resulting in 14 FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008806 
   
  
    
    This dataset contains CTCF ChIP-sequencing data from seven samples (six tumor samples and one tumor derived cell line). Following library amplification, DNA fragments were sequenced using Illumina HiSeq 2000 paired-end sequencing resulting in 14 FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008807 
   
  
    
    n=4 Ctrl and n=4 HD fibroblast lines were treated with DMSO or 10nM Branaplam for 72h and RNA-seq was performed. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001008808 
   
  
    
    n=3 Ctrl and n=3 HD iPSC lines differentiated into cortical neurons were treated with DMSO or 10nM Branaplam for 72h and RNA-seq was performed. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001008809 
   
  
    
    5 human plasma cell-free DNA cases (BS-seq) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001008810 
   
  
    
    36 mouse plasma cell-free DNA cases 
    
   
  
    
      
      NextSeq 500 
      
    
   
  36 
 
  
    EGAD00001008811 
   
  
    
    We profiled 87 primary-recurrent patient-matched paired GBM specimens with single-nucleus RNA and bulk-DNA sequencing and single-cell open-chromatin and spatial transcriptomics/proteomics assays. We found that recurrent GBMs are characterized by a shift to a mesenchymal phenotype in response to therapy. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  71 
 
  
    EGAD00001008815 
   
  
    
    Meta-data/patient information for the bulk RNAseq data 
    
   
  
    
   
  1 
 
  
    EGAD00001008816 
   
  
    
    Bulk RNAseq of sigmoid colon biopsies from healthy volunteers and ulcerative colitis patients. Subjects were treated with Placebo or IL-22Fc at different doses, and biopsies were collected at day 0 and day 30 post treatment and prepared for RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  83 
 
  
    EGAD00001008817 
   
  
    
    Fecal WMS data from NCT02749630 ulcerative colitis patients. Stool samples were collected at screening as well as on day 64 and prepared for whole metagenomic sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  93 
 
  
    EGAD00001008818 
   
  
    
    Fecal 16S-V4 rRNA gene sequence data from NCT02749630 ulcerative colitis patients. Stool samples were collected at screening as well as on days 29, 43, 64, 85, and 134 and processed for 16S-V4 rRNA gene sequencing. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  192 
 
  
    EGAD00001008819 
   
  
    
    Metadata for fecal WMS data from NCT02749630 healthy volunteers. 
    
   
  
    
   
  1 
 
  
    EGAD00001008820 
   
  
    
    Metadata for fecal WMS data from NCT02749630 ulcerative colitis patients. 
    
   
  
    
   
  1 
 
  
    EGAD00001008821 
   
  
    
    Metadata for fecal 16S-V4 rRNA gene sequence data from NCT02749630 healthy volunteers. 
    
   
  
    
   
  1 
 
  
    EGAD00001008822 
   
  
    
    Metadata for fecal 16S-V4 rRNA gene sequence data from NCT02749630 ulcerative colitis patients. 
    
   
  
    
   
  1 
 
  
    EGAD00001008823 
   
  
    
    Metadata for 16S-V4 rRNA gene sequence data for intestinal biopsies from NCT02749630 trial participants. 
    
   
  
    
   
  1 
 
  
    EGAD00001008824 
   
  
    
    RNASeq files for Roussel-MPBRG paper titled "Combination of ribociclib and gemcitabine for the treatment of medulloblastoma" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  98 
 
  
    EGAD00001008825 
   
  
    
    Exome sequencing data from seven phenotypically abnormal human fetal samples. Analysis was performed using Illumina NovaSeq 6000 and the Twist Bioscience Human Comprehensive Exome. Paired-end fastq files were aligned to the hg38 reference genome using BWA-MEM v0.7.15, followed by sorting using SAMtools sort v1.3.1, and duplicate reads were marked using Picard Tools MarkDuplicates v2.18.2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001008826 
   
  
    
    Mesothelioma of the peritoneum (n=21) and Pseudomyxoma peritonei/mucinous adenocarcinoma of the appendix (n=11) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  32 
 
  
    EGAD00001008827 
   
  
    
    Libraries of liCHi-C for different input cell numbers (50k, 100k, 250k, 500k and 1M cells) with 2 biological replicates each. Fastq file format. 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001008828 
   
  
    
    Libraries of liCHi-C for 9 blood cell types (HSC, CMP, CLP, Ery, Mon, MK, nB, nCD4 and nCD8) with 2 biological replicates each. Fastq file format. 
    
   
  
    
      
      unspecified 
      
    
   
  18 
 
  
    EGAD00001008829 
   
  
    
    Libraries of liCHi-C for 2 B-ALL (B Acute Lymphocytic Leukaemia) from human patients. Fastq file format. 
    
   
  
    
      
      unspecified 
      
    
   
  4 
 
  
    EGAD00001008830 
   
  
    
    RNAseq data generated from paired tumor frozen tissues in which the tumor organoids were established for cell-cell or cell-matrix adhesion dependency assay. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  52 
 
  
    EGAD00001008831 
   
  
    
    This dataset includes whole genome sequences of 75 synchronous primary tumors, 15 metastases, and corresponding normal samples from 13 patients with multifocal ileal neuroendocrine tumors. The whole genomes were sequenced on Illumina HiSeq X Ten to generate 151-bp paired-end reads, which were aligned to the GRCh38/hg38 reference assembly using BWA-MEM and duplicate-marked with Picard tools. GATK was used for base quality score recalibration and local indel realignment. The whole genomes are provided as CRAM files. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  108 
 
  
    EGAD00001008832 
   
  
    
    This dataset contains RNA-seq data (Fastq files of paired-end data) of 18 patient tumors used for identification of neotranscripts in 18 different types of fusion-driven sarcomas and other cancers as described in Vibert et al., Mol Cell 2022 (PMID: 35550257) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD00001008834 
   
  
    
    Single-cell RNA-seq from D0, D11, D16, D21, and D28 of dopaminergic differentiation from hESC lines H9 and HS980 using current protocols. Different time points along the differentiation for each cell line were multiplexed using BD™ Single-Cell Multiplexing Kit for use with the 10x Chromium™ Single Cell 3’ Reagent Kit v2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001008835 
   
  
    
    Single-cell RNA-seq cases (Tumor and adjacent tissue) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001008836 
   
  
    
    Plasma RNA sequencing (consists of 70 cases) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  70 
 
  
    EGAD00001008837 
   
  
    
    Illumina RNASeq sequencing of tumour samples from 230 cases of melanoma 
    
   
  
    
   
  230 
 
  
    EGAD00001008838 
   
  
    
    Consists of 76 mouse plasma cell-free DNA, 30 mouse liver DNA, and 10 human plasma cell-free DNA samples. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  116 
 
  
    EGAD00001008839 
   
  
    
    Homologous recombination deficiency (HRD) score in a large cohort of 55 triple-negative breast cancer PDX 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  55 
 
  
    EGAD00001008840 
   
  
    
    Fecal 16S-V4 rRNA gene sequence data from NCT02749630 healthy volunteers. Stool samples were collected at screening as well as on days 29, 43, 64, 85, and 134 and processed for 16S-V4 rRNA gene sequencing. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  206 
 
  
    EGAD00001008841 
   
  
    
    Fecal WMS data from NCT02749630 healthy volunteers. Stool samples were collected at screening as well as on days 29, 43, and 64 and prepared for whole metagenomic sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  53 
 
  
    EGAD00001008842 
   
  
    
    RNASeq files for Roussel paper titled "Combination of CDK4/6 with BET-bromodomain and PI3K/mTOR inhibitors in medulloblastoma in vitro and in vivo" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  39 
 
  
    EGAD00001008843 
   
  
    
    16S-V4 rRNA gene sequence data for intestinal biopsies from NCT02749630 trial participants. Biopsies from patients were collected at screening, day 30, and day 85 and prepared for 16S-V4 rRNA gene sequencing. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  132 
 
  
    EGAD00001008844 
   
  
    
    Rare cancer sequencing data of 23 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD00001008845 
   
  
    
    Rare cancer sequencing data of 28 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  22 
 
  
    EGAD00001008846 
   
  
    
    Rare cancer sequencing data of 45 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001008847 
   
  
    
    Rare cancer sequencing data of 95 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  64 
 
  
    EGAD00001008848 
   
  
    
    Rare cancer sequencing data of 55 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  44 
 
  
    EGAD00001008849 
   
  
    
    Rare cancer sequencing data of 87 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  58 
 
  
    EGAD00001008850 
   
  
    
    Rare cancer sequencing data of 59 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  38 
 
  
    EGAD00001008851 
   
  
    
    Rare cancer sequencing data of 75 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  49 
 
  
    EGAD00001008852 
   
  
    
    Rare cancer sequencing data of 50 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  34 
 
  
    EGAD00001008853 
   
  
    
    Rare cancer sequencing data of 44 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001008854 
   
  
    
    Rare cancer sequencing data of 40 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001008855 
   
  
    
    Rare cancer sequencing data of 97 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  61 
 
  
    EGAD00001008856 
   
  
    
    Rare cancer sequencing data of 48 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  33 
 
  
    EGAD00001008857 
   
  
    
    Rare cancer sequencing data of 58 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  40 
 
  
    EGAD00001008858 
   
  
    
    Rare cancer sequencing data of 49 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  35 
 
  
    EGAD00001008859 
   
  
    
    Rare cancer sequencing data of 243 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  164 
 
  
    EGAD00001008860 
   
  
    
    Rare cancer sequencing data of 47 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  41 
 
  
    EGAD00001008861 
   
  
    
    Rare cancer sequencing data of 92 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  62 
 
  
    EGAD00001008862 
   
  
    
    Rare cancer sequencing data of 145 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  104 
 
  
    EGAD00001008863 
   
  
    
    Rare cancer sequencing data of 119 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  76 
 
  
    EGAD00001008864 
   
  
    
    Thirty six samples were sequenced and analysed. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  36 
 
  
    EGAD00001008865 
   
  
    
    Rare cancer sequencing data of 87 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  87 
 
  
    EGAD00001008866 
   
  
    
    Rare cancer sequencing data of 48 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  48 
 
  
    EGAD00001008867 
   
  
    
    Rare cancer sequencing data of 12 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001008868 
   
  
    
    Rare cancer sequencing data of 94 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  91 
 
  
    EGAD00001008869 
   
  
    
    Rare cancer sequencing data of 46 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  46 
 
  
    EGAD00001008870 
   
  
    
    Rare cancer sequencing data of 62 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  59 
 
  
    EGAD00001008871 
   
  
    
    Rare cancer sequencing data of 119 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  117 
 
  
    EGAD00001008872 
   
  
    
    Rare cancer sequencing data of 54 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  52 
 
  
    EGAD00001008873 
   
  
    
    Rare cancer sequencing data of 30 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001008874 
   
  
    
    Rare cancer sequencing data of 18 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  18 
 
  
    EGAD00001008875 
   
  
    
    Rare cancer sequencing data of 64 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  61 
 
  
    EGAD00001008876 
   
  
    
    Rare cancer sequencing data of 18 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001008877 
   
  
    
    Rare cancer sequencing data of 55 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  55 
 
  
    EGAD00001008878 
   
  
    
    Rare cancer sequencing data of 38 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  31 
 
  
    EGAD00001008879 
   
  
    
    Rare cancer sequencing data of 56 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  54 
 
  
    EGAD00001008880 
   
  
    
    Rare cancer sequencing data of 86 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  83 
 
  
    EGAD00001008881 
   
  
    
    Rare cancer sequencing data of 49 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  44 
 
  
    EGAD00001008882 
   
  
    
    Rare cancer sequencing data of 91 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  78 
 
  
    EGAD00001008883 
   
  
    
    Rare cancer sequencing data of 162 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  149 
 
  
    EGAD00001008884 
   
  
    
    Rare cancer sequencing data of 83 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  71 
 
  
    EGAD00001008885 
   
  
    
    Rare cancer sequencing data of 96 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  89 
 
  
    EGAD00001008886 
   
  
    
    Rare cancer sequencing data of 29 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  28 
 
  
    EGAD00001008887 
   
  
    
    Rare cancer sequencing data of 66 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  58 
 
  
    EGAD00001008888 
   
  
    
    Rare cancer sequencing data of 48 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  41 
 
  
    EGAD00001008889 
   
  
    
    Rare cancer sequencing data of 22 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  17 
 
  
    EGAD00001008890 
   
  
    
    Rare cancer sequencing data of 76 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  72 
 
  
    EGAD00001008891 
   
  
    
    Rare cancer sequencing data of 164 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  159 
 
  
    EGAD00001008892 
   
  
    
    Rare cancer sequencing data of 42 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  42 
 
  
    EGAD00001008893 
   
  
    
    Rare cancer sequencing data of 112 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  100 
 
  
    EGAD00001008894 
   
  
    
    Rare cancer sequencing data of 34 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  28 
 
  
    EGAD00001008895 
   
  
    
    Rare cancer sequencing data of 49 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  43 
 
  
    EGAD00001008896 
   
  
    
    Rare cancer sequencing data of 137 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  138 
 
  
    EGAD00001008897 
   
  
    
    Rare cancer sequencing data of 246 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  250 
 
  
    EGAD00001008898 
   
  
    
    Rare cancer sequencing data of 34 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  34 
 
  
    EGAD00001008899 
   
  
    
    Rare cancer sequencing data of 6 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001008900 
   
  
    
    Rare cancer sequencing data of 142 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  134 
 
  
    EGAD00001008901 
   
  
    
    Rare cancer sequencing data of 28 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  24 
 
  
    EGAD00001008902 
   
  
    
    Rare cancer sequencing data of 85 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  77 
 
  
    EGAD00001008903 
   
  
    
    Rare cancer sequencing data of 36 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  34 
 
  
    EGAD00001008904 
   
  
    
    Rare cancer sequencing data of 112 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  106 
 
  
    EGAD00001008905 
   
  
    
    RNA-Seq, WES and WGS data of 5 rare tumor/control pairs which were submitted to other HIPO projects, not MASTER. The sequencing was always paired. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001008906 
   
  
    
    Part of the published data from EGAS00001004662 resulted in the publication of this study EGAS00001004813 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001008907 
   
  
    
    The TransplantLines Gut Microbiome study includes raw data generated by shotgun metagenomic sequencing of fecal samples of solid organ transplant recipients and basic phenotypes (age and sex, BMI). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1177 
 
  
    EGAD00001008908 
   
  
    
    Chronic obstructive pulmonary disease (COPD) is a major respiratory disease characterized by small airway inflammation, emphysema and severe breathing difficulties. Low-grade systemic inflammation is an established hallmark of severe disease, however, the molecular changes in peripheral immune cells remain far from understood. We combined multi-color flow cytometry with single-cell RNA sequencing and showed that blood neutrophil numbers are significantly increased in COPD and they are a heterogeneous population. A transcriptomic state that expressed interferon response genes correlated with alveolar damage and acute exacerbations. Furthermore, bronchoalveolar neutrophils expressed gene signatures corresponding to certain blood neutrophil states. Last, our data in a murine model of cigarette smoke exposure demonstrated that bone marrow neutrophil progenitors are expanded in smoke-treated animals and display signs of immune activation. Our study provides evidence that COPD systemic inflammation may derive from an activated haematopoietic precursor compartment. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001008949 
   
  
    
    Organoid cultures derived from colorectal adenomas. RNA and DNA were isolated from these cultures for genome-wide profiling. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001008950 
   
  
    
    WES sequencing of multiple regions per tumor from 8 lung cancer patients (LUSC, LCNEC and LUAD) and adjacent healthy lung tissue for each patient. 
    
   
  
    
      
      unspecified 
      
    
   
  111 
 
  
    EGAD00001008951 
   
  
    
    RNA-seq Revision 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  57 
 
  
    EGAD00001008952 
   
  
    
    We applied an integrative single-cell genomics strategy with single nucleus RNA sequencing (snRNA-seq) and single nucleus Assay for Transposase-Accessible Chromatin sequencing (snATAC-seq) together with spatial transcriptomics from the same tissue mapping human cardiac cells in homeostasis and after myocardial infarction (MI) at unprecedented spatial and molecular resolution. We profiled in total 31 samples from 23 patients including four non-transplanted donor hearts as controls and samples from tissues with necrotic tissue areas (ischemic zone, IZ), border zone (BZ), and the non-affected left ventricular myocardium (remote zone, RZ) of patients with acute MI. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  27 
 
  
    EGAD00001008953 
   
  
    
    Whole-exome sequencing 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001008954 
   
  
    
    Whole-genome sequencing data 
    
   
  
    
      
      unspecified 
      
    
   
  67 
 
  
    EGAD00001008955 
   
  
    
    Fastq files of single nucleus RNA sequencing data from 26 patients, including 26 lung adenocarcinoma and 12 matched healthy tissue samples, for 8 young female never smokers, 8 young female smokers, 7 elderly female never smokers and 3 male never smokers. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  38 
 
  
    EGAD00001008956 
   
  
    
    Aligned BAM files with removed duplicate reads of targeted sequencing data (exomes of a panel of 153 genes) from 12 skin and 5 oral epithelial bulk samples from 2 donors. Sequences generated by the BGI DNB-SEQ platform. 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001008957 
   
  
    
    GWAS genotype data of 2,393 Japanese COVID-19 cases. 
    
   
  
    
   
  2393 
 
  
    EGAD00001008958 
   
  
    
    Consists of 88 cases 
    
   
  
    
      
      unspecified 
      
    
   
  88 
 
  
    EGAD00001008959 
   
  
    
    RNA-seq data 
    
   
  
    
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001008960 
   
  
    
    Single cell DNA-seq data (CLL gene panel - Tapestri single-cell DNA CLL panel, Mission Bio) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001008961 
   
  
    
    Single cell RNA-sequencing of treatment naïve PDAC patient samples. We have 10 samples, sequenced using the 10X genomics chromium platform with 3 prime chemistry. We are submitting FASTQ files representing the index files (I1), Read1 (R1) and Read2 (R2). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001008962 
   
  
    
    ATAC-seq data. Dataset includes FASTQ files, BAM files, and analysis files with the ATAC-seq peaks determined using MACS2. 
    
   
  
    
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001008963 
   
  
    
    A method for multiplexed full-length single-molecule sequencing of the human mitochondrial genome - cell line data 
    
   
  
    
      
      GridION 
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008964 
   
  
    
    ChIP-seq peaks of H3K27ac. The dataset includes FASTQ files, BAM files, and analyses of the ChIP-seq peaks of H3K27ac determined using MACS2 
    
   
  
    
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001008965 
   
  
    
    RNAseq dataset containing 20 control and 2 SLFN14 K219N patient samples, derived from platelets. Sequencing libraries were constructed using an Illumina TruSeq stranded Ribo Zero Gold kit and paired end sequenced at a read depth of 30 million reads on an Illumina NovaSeq 6000 platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001008967 
   
  
    
    RNAseq of circulating monocytes of familial hypercholesterolaemia (FH) patients before and after treatment, and healthy controls.
Please cite original paper: Monocyte and macrophage lipid accumulation results in downregulated type-I interferon responses. Willemsen et al. Frontiers in Cardiovascular Medicine (2022)
Familial hypercholesterolemic patients (n=10) and healthy subjects (n=9): the study population, design, and further processing of these human study subjects and their samples have been extensively described (Stiekema et al., 2021). Briefly, untreated FH patients who indicated to start lipid-lowering therapy (statin, PCSK9 antibody, and/or ezetimibe) according to their treating physician were included. The healthy controls were age, sex, and body mass index (BMI) matched with the FH patients. After inclusion, FH patients fasted for at least 9 hours before blood samples were drawn for lipid measurements and monocyte isolation. This was repeated after 12 weeks of lipid-lowering therapy. RNA-seq was performed on circulating monocytes. V1 = visit 1. V2 = visit 2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001008968 
   
  
    
    Arcagen is an EORTC/SPECTA pan-European project that aims to recruit 1000 rare cancer patients from different tumour domains of EURACAN. This study collected samples from advanced or metastatic rare cancer from patients older than 12, and analysed them using Foundation Medicine next-generation sequencing (NGS) panels (FoundationOne CDx for FFPE samples or FoundationOne Liquid CDx for blood samples). 
Here we are submitting two datasets that contain NGS files from gastrointestinal rare cancers (n=119):
- Dataset 2 (87 patients): Intra-hepatic cholangiocarcinoma (n=47), Extra-hepatic cholangiocarcinoma (n=16), Not specified cholangiocarcinoma (n=9), Small bowel adenocarcinoma (n=6) and other rare GI cancer (n=9) 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  87 
 
  
    EGAD00001008969 
   
  
    
    Reads were processed with the RNA-seq workflow 1.3.0 developed by the DKFZ Omics IT and Data Management Core Facility (https://github.com/DKFZ-ODCF/RNAseqWorkflow). First, FASTQ reads were aligned via two-pass alignment using STAR 2.5.3a. The STAR index was generated from the 1000 Genomes assembly and GENCODE Version 19 gene models with a sjdbOverhang of 200. Duplicate marking of the resultant main alignment file was done with sambamba 0.6.5. Gene-specific read counting was performed using featureCounts (from Subread 1.5.1) over exon features based on GENCODE Version 19 gene models. Both reads of a paired fragment were used for counting, and the quality threshold was set to 255, indicating that STAR found a unique alignment. Strand-specific counting was also used. For RPKM and TPM calculations, all genes on chromosomes X and Y, the mitochondrial genome, as well as rRNA and tRNA genes were omitted as they are likely to introduce library size estimation biases. 
    
   
  
    
   
  10 
 
  
    EGAD00001008970 
   
  
    
    Nanopore RNA sequencing was done for 10 tumor samples. Direct cDNA sequencing was performed using the SQK-DCS109 kit (Oxford Nanopore Technologies). For analysis of a single sample on a MinION flow cell (version R9.4.1), 5 μg RNA was used as input. For multiplexing on a MinION flow cell, 2.5 μg RNA per sample was used as input, and the native barcoding expansion kit EXP-NBD104 was employed in conjunction with SQK-DCS109. After reverse transcription with Maxima H Minus Reverse Transcriptase (Thermo Scientific), second-strand synthesis was performed using the 2x LongAmp Taq Master Mix (New England Biolabs). The resulting double-stranded cDNA was subjected to end-repair and dA-tailing using the NEBNext Ultra End Repair/dA-Tailing Module (New England Biolabs). For multiplexed libraries, this step was followed by barcode ligation and library pooling. Next, libraries were quantified with a Qubit Fluorometer 3.0 (Life Technologies). Finally, sequencing adapters were added to the library preparations and ligated with Blunt/TA Ligase Master Mix (New England Biolabs), followed by further quality control using a Qubit. Samples ACC1 and ACC2 were analyzed on individual MinION flow cells, while the remaining eight samples were sequenced as multiplexed libraries on two MinION flow cells by pooling four samples for each run. Five ACC samples were also analyzed individually on Flongle flow cells. 
    
   
  
    
      
      MinION 
      
    
   
  10 
 
  
    EGAD00001008971 
   
  
    
    Fastq files of paired RNA-Seq of 10 different tumor samples, for which Nanopore and Illumina sequencing was compared. Illumina sequencing was carried out with HiSeq4000 or HiSeq X-Ten using the Illumina TruSeq stranded mRNA Kit. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001008972 
   
  
    
    Ovarian cancer EV RNA-seq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001008973 
   
  
    
    WGS 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001008974 
   
  
    
    This dataset contains DNA and RNA sequencing information for AML, gliomas, brain tumors (medulloblastoma and ependymoma), DIPG, rhabdoid tumors and soft tissue sarcomas. In total 39 samples are present (14 matched normal, 25 tumor samples). Not all samples have matched normals and not all samples have RNA sequencing data. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  37 
 
  
    EGAD00001008975 
   
  
    
    234 BAM files containing capture data of MCL tumours and constitutive DNA 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  235 
 
  
    EGAD00001008976 
   
  
    
    Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827)
File name "eGAdeposit_233CaboVerde_SampleInfo_FINAL_01062022.txt" 
Column 1 corresponds to individual alphanumeric codes as in the "eGAdeposit_233CaboVerde_GenotypeFile_FINAL_01062022.vcf" genotype file
Column 2 corresponds to individual's biological sex as per genetic inference
Column 3 corresponds to individual's self-reported age in years
Column 4 corresponds to individual's self-reported cumulated number of years spent in academic or professional education 
    
   
  
    
   
  1 
 
  
    EGAD00001008977 
   
  
    
    Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827)
File name "eGAdeposit_233CaboVerde_GEOcoordFULL_FINAL_01062022.txt"
Column 1 corresponds to individual alphanumeric codes as in the "eGAdeposit_233CaboVerde_GenotypeFile_FINAL_01062022.vcf" genotype file
Columns 2-3 correspond to X-Y GPS coordinates of individual's interview location in Cabo Verde
Columns 4-5 correspond to X-Y GPS coordinates of individual's self-reported residence location at the time of the interview
Columns 6-7 correspond to X-Y GPS coordinates of individual's self-reported birth-place location
Columns 8-9 correspond to X-Y GPS coordinates of individual's self-reported paternal birth-place location
Columns 10-11 correspond to X-Y GPS coordinates of individual's self-reported maternal birth-place location 
    
   
  
    
   
  1 
 
  
    EGAD00001008978 
   
  
    
    Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827)
File name "eGAdeposit_225CaboVerde_FreeSpeech_Utterance_counts_FINAL_01062022.txt"
Column 1 corresponds to individual alphanumeric codes as in the "eGAdeposit_233CaboVerde_GenotypeFile_FINAL_01062022.vcf" genotype file. Note that only 225 unrelated Cabo Verdean-born individuals are considered here, out of the 233 individuals in the genotype file.
Each of the 4831 other columns corresponds to the respective individual's utterance count in the free speech transcribed in ALUPEC and provided as column header.
See Material and Methods in Laurent R et al. eLife 2023 
    
   
  
    
   
  1 
 
  
    EGAD00001008979 
   
  
    
    Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827)
As per Materials and Methods therein, the genotype data correspond to 2,118,722 autosomal SNPs genotyped on the Illumina Omni 2.5 Million BeadChip for 233 Cabo Verdean volunteer participants, unrelated at the 2nd degree based on population genetics analyses (see Material and Methods).
SNP rsID, chromosome position and genetic position (in bp) are in Build GRCh38.
Cabo Verdean individuals are designated with an alphanumeric unique code 
    
   
  
    
   
  1 
 
  
    EGAD00001008980 
   
  
    
    31 pregnant women at different trimesters, 6 hepatitis B carriers, and 8 patients with hepatocellular carcinoma 
    
   
  
    
      
      PromethION 
      
    
   
  46 
 
  
    EGAD00001008981 
   
  
    
    Artificial mixtures of sonicated human and mouse DNA at different sizes were sequenced 
    
   
  
    
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001008982 
   
  
    
    Artificial mixtures of sonicated human and mouse DNA at different sizes were sequenced 
    
   
  
    
      
      Sequel 
      
    
   
  2 
 
  
    EGAD00001008983 
   
  
    
    Juntendo Muscle Study (JMS) dataset comprises 23 samples of paired-end RNA-Seq sequences in fastq format. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  23 
 
  
    EGAD00001008984 
   
  
    
    Muscle SATellite cell study (MSAT) dataset comprises 39 samples of paired-end RNA-Seq sequences in fastq format. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  39 
 
  
    EGAD00001008985 
   
  
    
    We profiled CD34+ enriched cells from GCSF mobilized bone marrow samples (n = 4) using single-cell RNA sequencing (10X) with targeted genotyping, and single-cell DNA methylation (RRBS) with single-cell RNA sequencing (Smart-Seq2) and targeted genotyping.
A 5th CH bone marrow aspirate sample was obtained to validate observed results. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001008987 
   
  
    
    Genomic VCF data of the Integrative proteogenomic characterization of early-stage DC project. This dataset contains 76 VCF files. 
    
   
  
    
   
  76 
 
  
    EGAD00001008988 
   
  
    
    This dataset was collected from viable bone marrow cells obtained at diagnosis from nine patients with high hyperdiploid ALL and one normal bone marrow sample. All samples were subjected to low-pass single-cell whole genome sequencing with a median sequencing coverage of 0.02x. Single nuclei in G0/G1 phase were isolated using a fluorescence-activated cell sorting (FACS) cytometer. DNA library construction and next-generation sequencing were carried out by the European Research Institute for the Biology of Ageing (ERIBA), University of Groningen, University Medical Center Groningen, Groningen, The Netherlands. Further details regarding DNA library construction are available in Bos et al., 2019 (https://link.springer.com/protocol/10.1007/978-1-4939-8931-7_15). 
    
   
  
    
      
      NextSeq 550 
      
    
   
  2842 
 
  
    EGAD00001008989 
   
  
    
    79 out of 336 GC samples 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001008991 
   
  
    
    Patients with progressive, metastatic castration-resistant prostate cancer (mCRPC) underwent metastatic tumor biopsy. 118 total samples including 14 paired samples (baseline and later progression). Various organ sites including soft tissue & bone were present. Published in DOI: 10.1200/JCO.2017.77.6880 Journal of Clinical Oncology 36, no. 24 (August 20, 2018) 2492-2503. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  118 
 
  
    EGAD00001008992 
   
  
    
    Raw, unfiltered, paired-end fastq files obtained through whole-genome and RNA-sequencing, respectively.
RNA-seq of affected individuals in three twin pairs.
WGS of blood in five twin pairs as well as uterine rudiment tissue of selected affected individuals. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001008993 
   
  
    
    In this study, the DNA of 44 subjects with severe COVID-19 has been sequenced in order to explore rare genetic variants. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD00001008994 
   
  
    
    WES 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001008995 
   
  
    
    RNA-seq 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD00001008996 
   
  
    
    single cell RNA-seq 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001008997 
   
  
    
    single cell DNA sequencing 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001008998 
   
  
    
    This dataset contains data from 11 uveal melanoma patients. Plasma samples were collected at baseline, 2-weeks, 3-, 6-, and 12-months post treatment (surgery/radiation). Samples underwent targeted panel, shallow whole genome, and cfMeDIP sequencing. A total of 46 plasma samples, 7 tumours, 11 buffy coats, and 10 healthy controls are included. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  74 
 
  
    EGAD00001008999 
   
  
    
    This dataset contains plasma WGS data from patients with stage IV colorectal cancer (CRC, n = 16) and healthy individuals (n = 21) used in the Pointy manuscript. Patients with CRC provided written consent and samples were collected as described previously (ClinicalTrials.gov number NCT01876511; Georgiadis et al., 2019, Le et al., 2017). Plasma samples from 21 healthy control individuals were procured through BioIVT. Cell-free DNA (cfDNA) was extracted from plasma using the QIAamp Circulating Nucleic Acid Kit. Libraries were prepared with 5 to 250 ng of cfDNA using the NEBNext DNA Library Prep Kit. Libraries were sequenced on HiSeq2000/2500. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  37 
 
  
    EGAD00001009000 
   
  
    
    Genomics of drug sensitivity in acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD00001009001 
   
  
    
    RRBS data for solid tumors and adjacent normal tissues 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  328 
 
  
    EGAD00001009002 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001009003 
   
  
    
    cfMethyl-Seq libraries were generated for 479 cfDNA samples and were sequenced with 150 bp paired-end reads. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  479 
 
  
    EGAD00001009004 
   
  
    
    Whole genome sequencing of lymph node metastases and blood DNA from 25 cSCC patients with regional metastases of the head and neck. We designed a multifaceted computational analysis at the whole genome level to provide a more comprehensive perspective of the genomic landscape of metastatic cSCC. This study contains the majority of 15 samples which were previously submitted in EGAC00001001100. 
    
   
  
    
   
  25 
 
  
    EGAD00001009005 
   
  
    
    Single cell transcriptomes, generated using chromium 10X 3' sequencing, for two tumour types (AT/RT, and Ewing's sarcoma).  For each individual, tumour and normal whole genome sequencing was also obtained using Illumina short read sequencing to an average depth of 30X.  These data were used to validate the accuracy of a method for identifying cancer cell transcriptomes based on the allelic shift produced by copy number changes. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001009006 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009007 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009008 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009009 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009010 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009011 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009012 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009013 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009014 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009015 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009016 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009017 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009018 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009019 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009020 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009021 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009022 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009023 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009024 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009025 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009026 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009027 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009028 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009029 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009030 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009031 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009032 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009033 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009034 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009035 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009036 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009037 
   
  
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009041 
   
  
    
    This dataset includes RNA-seq, DNA and ChIP-Seq data of samples from our paper. 
    
   
  
    
   
  - 
 
  
    EGAD00001009042 
   
  
    
    This dataset includes RNA-seq data of samples from our paper. 
    
   
  
    
   
  18 
 
  
    EGAD00001009043 
   
  
    
    RNA-Seq data for 9 JPA samples. 
    
   
  
    
      
      unspecified 
      
    
   
  11 
 
  
    EGAD00001009044 
   
  
    
    5 Hi-C datasets (4 JPA and 1 LGG). Hi-C data for the remaining 5 JPAs used in our paper as well as all controls have been uploaded to EGA under EGAS00001005476. 
    
   
  
    
      
      unspecified 
      
    
   
  5 
 
  
    EGAD00001009045 
   
  
    
    ChIP data from PFA (n = 10). Raw data provided as FASTQ. Data generated on Illumina NovaSeq 6000 PE50. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001009046 
   
  
    
    PDX aCGH (txt), WES (fastq) and RNASeq (fastq) samples from mice treated with cisplatin. Primary samples and matched PDX samples from multiple passages. Models from the Marie Curie Institute (HBCx1, HBCx4B, HBCx8, HBCx10, HBCx12B, HBCx14, HBCx15, HBCx16, HBCx17, HBCx23, HBCx24, HBCx27, HBCx28, HBCx30, HBCx31, HBCx33, HBCx39, HBCx40, HBCx43, HBCx51, HBCx63, HBCx66, HBCx92, HBCx106) and the NKI (T127, T162, T241, T250, T283, T302, T336). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  71 
 
  
    EGAD00001009047 
   
  
    
    Total RNA was isolated from 50 formalin-fixed paraffin-embedded nasopharyngeal cancer (NPC) specimens using the RecoverAll Total Nucleic Acid Isolation kit (Ambion). Tumor RNA libraries were prepared with 200ng RNA using the Illumina TruSeq Stranded Total RNA kit with Ribo-Zero Gold, and sequenced with >80 million 100 bp paired-end reads. 
    
   
  
    
      
      unspecified 
      
    
   
  50 
 
  
    EGAD00001009048 
   
  
    
    Somatic RNA for 40 samples matched to the WGS was extracted using the Qiagen Qiasymphony RNA protocol (cat no 931636). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including DNase digestion). The resulting RNA then underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as measures of RNA purity. A260/280 had to be 2.0 and A260/230 had to be 2.0-2.2. Then RNA was quantified using LifeTechnologies Qubit RNA BR kit (cat no Q10210). RNAseq was carried out by the Edinburgh Clinical Research Facility on an Illumina NextSeq 500.
Total RNA samples were assessed on the Agilent Bioanalyser (Agilent Technologies, #G2939AA) with the RNA 6000 Nano Kit (#5067-1512) for quality and integrity of total RNA, and then quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific Inc, #Q32866) and the Qubit RNA HS assay kit (#Q32855). Libraries were prepared from total RNA samples using the NEBNext Ultra 2 Directional RNA library prep kit for Illumina (#E7760S) with the NEBNext rRNA Depletion kit (#E6310) according to the provided protocol. 400ng of total RNA was then added to the ribosomal RNA (rRNA) depletion reaction using the NEBNext rRNA depletion kit (Human/mouse/rat) (#E6310). This step uses specific probes that bind to the rRNA in order to cleave it. rRNA-depleted RNA was then DNase treated and purified using Agencourt RNAClean XP beads (Beckman Coulter Inc, #66514). RNA was then fragmented using random primers before undergoing first strand and second strand synthesis to create cDNA. cDNA was end repaired before ligation of sequencing adapters, and libraries were enriched by PCR using the NEBNext Multiplex oligos for Illumina set 1 and 2 (#E7500). Final libraries had an average peak size of 271bp. Libraries were quantified by fluorometry using the Qubit dsDNA HS assay and assessed for quality and fragment size using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (#FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). Libraries were combined in an equimolar pool based on the library quantification results and run across 5 High-Output Flow Cell v2.5. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  - 
 
  
    EGAD00001009049 
   
  
    
    FASTQ reads for 81 matched tumour-normal WGS pairs for high grade serous ovarian cancer patients.
Scottish HGSOC samples were collected via local Bioresource facilities at Edinburgh, Glasgow, Dundee and Aberdeen and stored in liquid nitrogen until required. HGSOC patients were determined from pathology records and were included in the study where there were matched tumour and whole blood samples. Tumour samples were divided into two for DNA and RNA extraction and slivers of tissue were taken, fixed in formalin and embedded in paraffin wax (FFPE). Samples were only included if they were confirmed as HGSOC and there was greater than 40% tumour cellularity throughout the tumour, determined using H&E staining of the FFPE sections and pathology review. Somatic DNA was extracted from the tumour and germline DNA was extracted from whole blood.
Somatic DNA was extracted using the Qiagen DNeasy Blood and tissue kit (cat no 69504). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including RNase digestion step). Germline DNA was extracted from 1-3ml whole blood using the Qiagen FlexiGene kit (cat no 51206) following the manufacturer's recommended protocol. The resulting DNA underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as surrogate measures of DNA purity. A260/280 had to be 1.8 or greater and A260/230 had to be 2.0 or greater. Then, DNA was quantified using LifeTechnologies Qubit dsDNA BR kit (cat no Q32850) and we required a minimum of 50ul at 25ng/ul for WGS. Thirdly, DNA was diluted to 25ng/ul and a representative sample was loaded onto a 0.8% TAE gel, run at 100 V for 60 mins and then imaged using a BioRad ChemiDoc imaging system to visualise the DNA quality. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001009050 
   
  
    
    This dataset contains Hi-C samples from 2 human tumor-derived cell lines and 1 human tumor. The raw fastq files and the processed .hic file are provided for each sample. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001009051 
   
  
    
    The K562 cell line was treated with two different HSP90 inhibitors. After resistant clones emerged, they were genetically characterized using WES in comparison to the parental K562 cell line. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  3 
 
  
    EGAD00001009052 
   
  
    
    Single cell technologies allow the interrogation of tumor heterogeneity, providing insights into tumor evolution and treatment resistance. To better understand whether circulating tumor cells (CTCs) could complement metastatic biopsies for tumor genomic profiling, we characterized 11 single CTCs and 10 pooled CTC samples at the mutational and copy number aberration (CNA) levels, and compared these results with matched synchronous tumor biopsies from 3 metastatic breast cancer patients with triple-negative (TNBC), HER2-positive and estrogen receptor-positive (ER+) tumors.
Similar CNA profiles and the same patient-specific driver mutations were found in bulk tissue and CTCs for the HER2-positive and TNBC tumors, whereas different CNA profiles and driver mutations were identified for the ER+ tumor, which presented two distinct clones in CTCs defined by mutations in ESR1 Y537N and TP53, respectively. Furthermore, de novo mutational signatures derived from CTCs described patient-specific biological processes.
These data suggest that tumor tissue and CTCs provide complementary clinically relevant information to map tumor heterogeneity and tumor evolution. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  30 
 
  
    EGAD00001009054 
   
  
    
    RNA sequencing data of 141 samples from 141 patients with HER2+ breast cancer treated with letrozole or tamoxifen (SOLTI-1114 PAMELA trial) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  142 
 
  
    EGAD00001009056 
   
  
    
    RNA-seq was performed to compare gene expression profiles between 11 patient adherent ALL samples and nonadherent ALL samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001009057 
   
  
    
    Medulloblastoma intra-tumoural genetic heterogeneity and clonal evolution, and their role in disease pathogenesis and clinical behaviour, are poorly understood. We used single-cell whole-genome sequencing (sc-WGS) to reconstruct the natural history and temporal evolution of 14 medulloblastomas, representing its major clinico-molecular sub-classes. We identified wholly-clonal tumours which displayed single-clone expansion (i.e. linear evolution); all were observed in favourable-outcome sub-classes (i.e. MBWNT and infant MBSHH). In contrast, remaining tumours harboured sub-clonal structures which displayed punctuated or gradual trajectories; highest-risk sub-classes, typically characterised by MYC-amplification (MBGroup3) or TP53-mutation (MBSHH), and linked to genomic instability and LCA pathology, were most clonally-diverse. Clinically-adopted biomarkers were typically early-clonal/initiating events, representing exploitable targets for early-disease detection; in analyses of spatially-distinct tumour regions, a single biopsy was sufficient to assess their status.  sc-WGS revealed events not previously appreciated in bulk tumour analysis, which arose later and/or sub-clonally and more commonly displayed spatial diversity; their clinical significance and role in disease evolution post-diagnosis now require establishment. In summary, our findings reveal diverse modes of tumour initiation and clonal evolution in the major medulloblastoma sub-classes, highlighting their pathogenic relevance and clinical potential. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  430 
 
  
    EGAD00001009058 
   
  
    
    We carried out WGS and RNAseq on a cohort of 48 children and young adults with induction failure in T-cell Acute Lymphoblastic Leukemia (T-ALL) to identify genomic drivers of treatment resistance. The study includes WGS for 33 tumour/normal pairs and 15 tumour-only samples. In addition, there is RNAseq data for 37 cases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  118 
 
  
    EGAD00001009059 
   
  
    
    RNA-seq data for melanoma biopsies at baseline and after treatment 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009060 
   
  
    
    RRBS data for melanoma biopsies at baseline and after treatment 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009061 
   
  
    
    Clonal tracking of stem cells and their progeny by whole genome sequencing permits exploration of evolutionary genetics in human disease. In this study, we performed phylogenetic reconstruction of haematopoiesis using somatically acquired mutations in 323 single haematopoietic stem and progenitor cell-derived colonies from 10 individuals with an inherited disorder of ribosome assembly, Shwachman-Diamond syndrome. We observed numerous clonal expansions, with recurrent acquisition of mutually exclusive mutations (EIF6, TP53, RPL5, RPL22, PRPF8, chromosomes 7 and 15) in multiple different clones in utero or early childhood converging on the p53-dependent nucleolar surveillance pathway that monitors ribosome integrity. In contrast to clones carrying biallelic TP53 mutations, genomes derived from colonies carrying mono-allelic TP53 mutations displayed no increase in mutation burden or specific mutational signatures. Our study highlights striking loss of clonal diversity with convergent somatic evolution on the p53-dependent nucleolar surveillance pathway from early life to offset the deleterious effects of a germline mutation in a Mendelian haematopoietic disorder. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  323 
 
  
    EGAD00001009062 
   
  
    
    This Dataset contains RNA-Seq, H3K27Ac ChIP-Seq, and ATAC-Seq data for 13 cystic fibrosis (CF) patients and 8 healthy volunteers (HV). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  222 
 
  
    EGAD00001009063 
   
  
    
    This dataset contains the sputum metagenome from 99 COPD patients and 36 healthy individuals in China. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  135 
 
  
    EGAD00001009064 
   
  
    
    Profiling of 12 megabases of human non-coding DNA (including enhancers, promoters, and boundaries of topologically associating domains) in a longitudinal cohort of patients treated with endocrine therapies. For each patient, DNA from the primary and relapsed (metastatic) tumour, along with normal matched DNA, were profiled. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  300 
 
  
    EGAD00001009065 
   
  
    
    164 pairs of FASTQ files from metastatic Castration-Resistant Prostate Cancer (mCRPC) sequenced on HiSeq 4000 instruments. Patients were enrolled in the West Coast Dream Team study. Biopsies include various tissue sites including bone, soft tissue, and lymph node. 42 pairs prior to enzalutamide treatment and at progression from 21 patients are included. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
      Illumina HiSeq 4000 
      
    
   
  164 
 
  
    EGAD00001009066 
   
  
    
    The cohort (n=53) consists of prostate cancer patients from Australia. For each patient, a pair of blood and tumour samples were collected. The sequencing data was mapped to hg38 reference. Blood BAMs are named with “-B” as suffix and tumour BAMs are named with “-T” as suffix. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  106 
 
  
    EGAD00001009067 
   
  
    
    The cohort (n=130) consists of prostate cancer patients from South Africa (n=123) and Brazil (n=7). For each patient, a pair of blood and tumour samples were collected. The sequencing data was mapped to hg38 reference. Blood BAMs are named with “-B” as suffix and tumour BAMs are named with “-T” as suffix. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  260 
 
  
    EGAD00001009068 
   
  
    
    RNA sequencing of 10 breast cancer bone metastasis PDX. Samples were obtained from 5 PDX that acquired palbociclib resistance (palboR) and 5 from the parental PDX (palboS). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001009069 
   
  
    
    This dataset contains RNA-sequencing of blood samples from healthy controls (n=7), PSA patients (n=27) and RA patients (n=9). It also contains RNA-sequencing data of skin fibroblasts from healthy controls (n=3) and PSA patients (n=3). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  69 
 
  
    EGAD00001009070 
   
  
    
    GWAS genotype data of the Japanese population (N=2,380). 
    
   
  
    
   
  1 
 
  
    EGAD00001009071 
   
  
    
    This dataset contains genome-wide array data from Tunisian and Moroccan individuals. Tunisian individuals were sampled in the city of Tunis (n = 64), and Moroccan individuals in different urban areas in the country (n = 45). 
    
   
  
    
   
  1 
 
  
    EGAD00001009072 
   
  
    
    FASTQ files from RNA-seq of breast cancer bone metastasis PDX after treatment with IACS-010759. Five PDX are resistant to treatment and six are responders. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001009074 
   
  
    
    This dataset contains RNA-seq data of giant cell tumour of bone (GCTB) cell lines (n=3). Cell lines consist of neoplastic "stromal" cells harboring a heterozygous H3F3A p.G34W mutation. RNA-seq was performed on the BGISEQ-500 platform (PE100) and uploaded data contains fastq files, vcf files, and gene expression values (TPM). Cell lines, data generation, and data analysis are described in the following publication: Venneker et al., Histone deacetylase inhibitors as a therapeutic strategy to eliminate neoplastic “stromal” cells from giant cell tumors of bone, 2022. 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001009075 
   
  
    
    Time-dependent characterization of CNS response in COVID-19 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001009076 
   
  
    
    This dataset includes 3432 paired single-cell sequencing FASTQ files derived from synovial B cells of 5 early Rheumatoid Arthritis patients. Libraries were prepared using the SmartSeq2 protocol. All wells contained ERCC spike-ins. Libraries were sequenced on an Illumina NovaSeq instrument with 2x100bp paired-end reads, yielding a median of 2.5M reads/well. Sample aliases ending in 368 and 384 represent empty control wells. Sample aliases starting with P11417_4 and P13157_1 belong to ACPA- patient A7, sample aliases starting with P11417_5 and P11417_6 belong to ACPA+ patient A1, sample aliases starting with P13157_2 belong to ACPA- patient A3, sample aliases starting with P13157_3 and P13157_6 belong to ACPA+ patient A2, and sample aliases starting with P13157_4 and P13157_5 belong to ACPA- patient A4. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3432 
 
  
    EGAD00001009077 
   
  
    
    Sequencing of Huntington's disease patient samples. Whole exome sequencing (n = 463) and MiSeq HTT amplicon sequencing (n = 584) BAM/BAI files. Two analyses (one linear and one logistic) with relevant files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  363 
 
  
    EGAD00001009078 
   
  
    
    BAM files from whole-genome sequencing of 14 paired PMBCL samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD00001009079 
   
  
    
    We purified peripheral blood mononuclear cells from individuals living in India (N=10) and the Netherlands (N=10) at baseline and 10-12 weeks after BCG vaccination. We compared chromatin accessibility between the two populations at baseline, as well as gene transcription profiles and cytokine production capacities upon viral stimulation with influenza and SARS-CoV-2. 
    
   
  
    
      
      unspecified 
      
    
   
  157 
 
  
    EGAD00001009081 
   
  
    
    The dataset is based on 37 FFPE samples obtained from 12 patients diagnosed with breast or larynx cancer.
For each patient, 3 sample types were obtained: P - primary tumor, L - malignant lymph node, and C - benign lymph node (control).
For patient G46, two malignant lymph nodes were used.
DNA isolated from all samples was subjected to exome selection using Agilent SureSelect Human All Exon V7. The obtained material was sequenced on the NovaSeq 6000 platform with 2x150 reads. The sequencing was conducted by Novogene. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  37 
 
  
    EGAD00001009082 
   
  
    
    We used chromatin immunoprecipitation followed by sequencing (ChIP-Seq) with an antibody against H3K27ac (a bona fide histone mark for regulatory element activation) in sorted CLL cells from 15 CLL cases, including cases from stereotyped subsets #1, #2, #4, and #8. The samples were sequenced on an Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001009083 
   
  
    
    Samples: primary cutaneous melanoma (CM) non-associated or distal nevus (A); adjacent or CM-associated nevus (B); Primary-CM (C); and Lymph-Node Metastasis (LN-mts) (D). Whole-exome sequencing (WES) was performed on DNA extracted from the different samples (A-D) paired with the germline reference (G), processed with the Agilent SureSelect All Exon Human V5 Library on an Illumina HiSeq 4000 PE101 platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001009085 
   
  
    
    TCRab sequencing of viably frozen cells from 12 samples from four chronic-phase chronic myeloid leukemia patients. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00001009086 
   
  
    
    Single-cell RNA sequencing of viably frozen cells from 12 samples from four chronic-phase chronic myeloid leukemia patients. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001009087 
   
  
    
    RNA-sequencing (RNA-seq) efforts in acute lymphoblastic leukaemia (ALL) have identified numerous prognostically significant genomic alterations which can guide diagnostic risk stratification and treatment choices when detected early. However, a full RNA-seq bioinformatics workflow is time-consuming and costly in a clinical setting where rapid detection and accurate reporting of clinically relevant alterations are essential. To accelerate the identification of ALL-specific alterations (including gene fusions, single nucleotide variants and focal gene deletions), we developed the rapid screening tool RaScALL, capable of identifying more than 100 prognostically significant lesions directly from raw sequencing reads. RaScALL uses the k-mer based targeted detection tool km and known ALL variant information to achieve a high degree of accuracy for reporting subtype-defining genomic alterations compared to standard alignment-based pipelines. Gene fusions, including difficult-to-detect fusions involving EPOR and DUX4, were accurately identified in 98% (164 samples) of reported cases in a 180-patient Australian study cohort and 95% (n=63) of samples in a North American validation cohort. Pathogenic sequence variants were correctly identified in 75% of tested samples, including all cases involving subtype-defining variants PAX5 p.P80R (n=12) and IKZF1 p.N159Y (n=4). Intragenic IKZF1 deletions resulting in aberrant transcript isoforms were also detected with 98% accuracy. Importantly, the median analysis time for detection of all targeted alterations was 22 minutes per sample, significantly shorter than standard alignment-based approaches, ensuring accelerated risk stratification and therapeutic triage. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  180 
 
  
    EGAD00001009088 
   
  
    
    An increased incidence of endometrial cancer has been described for patients who have received tamoxifen to treat breast cancer. Using samples from endometrial tumors, isolated from surgical specimens of patients who previously received tamoxifen treatment for breast cancer, we aimed to identify whether there are specific somatic mutations enriched in this population, relative to endometrial tumors from the general population. For this, WES was performed on matched endometrial tumors and healthy tissue (n=21). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  42 
 
  
    EGAD00001009089 
   
  
    
    Transcriptomics data from 31 samples, used to simulate the effect of knocking out all targets of a drug on an objective function such as growth or energy balance. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  31 
 
  
    EGAD00001009090 
   
  
    
    16 CRC patient WGS data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  64 
 
  
    EGAD00001009091 
   
  
    
    16 CRC patients transcriptome data 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  96 
 
  
    EGAD00001009092 
   
  
    
    BARIA 100 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina Genome Analyzer IIx 
      
    
   
  420 
 
  
    EGAD00001009099 
   
  
    
    Gynecologic carcinosarcomas (CS), most commonly of uterine (endometrial) and less frequently of ovarian localization, are histologically defined as biphasic neoplasms composed of carcinomatous (C) epithelial and sarcomatous (S) malignant components. We report a comprehensive RNA sequencing analysis of macro-dissected samples of the C and S components from 20 patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  40 
 
  
    EGAD00001009100 
   
  
    
    Brain-Derived Neurotrophic Factor (BDNF) is crucial for neuronal survival, differentiation, synaptic plasticity, memory formation, and neurocognitive health. The molecular mechanisms by which BDNF promotes cellular survival and synaptic plasticity have been intensely studied, yet its role in genome regulation is obscure. Using human induced pluripotent stem cell (hiPSC)-derived neurons generated via lentiviral delivery of the neuronal transcription factor Ngn2, we performed temporal profiling (1h, 6h and 10h) of chromatin accessibility upon BDNF treatment or depolarization (KCl) to identify BDNF-specific chromatin-to-gene expression programs. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001009101 
   
  
    
    The data include exome sequencing FASTQ files of 335 patients receiving immune checkpoint blockade therapy. Only WXS data from tumor tissue are provided. 
    
   
  
    
   
  335 
 
  
    EGAD00001009102 
   
  
    
    Here we provide access to newly generated RNA-seq data for 101 human islet samples used to map genetic effects on gene expression and alternative splicing (eQTLs and sQTLs) in a total of 399 human islets. We also make publicly available genotyping array data for 128 human islets, including the fraction of 101 human islet samples with existing RNA-seq data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  101 
 
  
    EGAD00001009103 
   
  
    
    16S rRNA gene sequencing data for sputum samples from 61 COPD patients, generated using PacBio sequencing technology. 
    
   
  
    
      
      Sequel 
      
    
   
  40 
 
  
    EGAD00001009104 
   
  
    
    MTM-HD - fibroblast RNAseq. 57 samples from controls, pre-HD and early-HD patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  57 
 
  
    EGAD00001009105 
   
  
    
    MTM-HD - adipose tissue RNAseq. 60 samples from controls, pre-HD and early-HD patients 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  60 
 
  
    EGAD00001009106 
   
  
    
    MTM-HD - skeletal muscle RNAseq. 57 samples from controls, pre-HD and early-HD patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  57 
 
  
    EGAD00001009108 
   
  
    
    The RNA sequencing dataset for primary and recurrent ovarian granulosa cell tumors consists of 24 .bam files with aligned reads, including 8 primary and 16 recurrent tumors.
Total RNA was extracted from cryopreserved tissue of adult-type granulosa cell tumor samples. Libraries were prepared from cDNA using the NuGEN Ovation Ultralow Library System V2 (San Carlos, CA). Paired-end bulk RNA sequencing was performed on the Illumina HiSeq 2000 platform. RNA sequencing reads were aligned to the hg19/GRCh37 reference human genome using the STAR software (version 2.6.0b) with default parameters.
Sample descriptions:
GCT001	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT002	Adult-type Granulosa Cell Tumor: primary tumor
GCT003	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT004	Adult-type Granulosa Cell Tumor: primary tumor
GCT005	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT006	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT007	Adult-type Granulosa Cell Tumor: primary tumor
GCT008	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT009	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT010	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT011	Adult-type Granulosa Cell Tumor: primary tumor
GCT012	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT013	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT014	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT015	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT016	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT017	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT018	Adult-type Granulosa Cell Tumor: primary tumor
GCT019	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT020	Adult-type Granulosa Cell Tumor: primary tumor
GCT021	Adult-type Granulosa Cell Tumor: primary tumor
GCT022	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT023	Adult-type Granulosa Cell Tumor: recurrent tumor
GCT024	Adult-type Granulosa Cell Tumor: primary tumor 
    
   
  
    
   
  24 
 
  
    EGAD00001009109 
   
  
    
    6 trios and 1 proband were whole genome sequenced with PacBio Sequel II to a depth of 30X, using the HiFi chemistry. For each trio, the proband was affected with severe ID, and the parents were unaffected. Samples are grouped by trio. 
    
   
  
    
      
      Sequel 
      
    
   
  19 
 
  
    EGAD00001009110 
   
  
    
    Extracted regions from WGS of Ewing sarcoma spanning fusion breakpoints +/- 100kb for ctDNA tracking in plasma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009111 
   
  
    
    10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTERT-22 L9 112.109 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009112 
   
  
    
    10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTERT-22 L9 116.126 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009113 
   
  
    
    10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTERT-22 L9 83.86 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009114 
   
  
    
    10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTert L9 116.66 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009115 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 4 
    
   
  
    
      
      BGISEQ-500 
      
    
   
  1 
 
  
    EGAD00001009116 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009117 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009118 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 8 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009119 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 7 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009120 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA609 passage 6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009121 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1049 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009122 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1052 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009123 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1053 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009124 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1051 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009125 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1050 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009126 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1050 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009127 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1052 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009128 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1053 passage 1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009129 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1093 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009130 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1091 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009131 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1053 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009132 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1096 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009133 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1051 passage 1 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009134 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1052 passage 1 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009135 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1181 passage 1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009136 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1096 passage 1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009137 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1096 passage 1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009138 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1162 passage 1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009139 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient SA1096 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009140 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1049 passage 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009141 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1050 passage 1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009142 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA605 passage 3 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009143 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma cell line OV2295 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009144 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma cell line OV2295(R2) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009145 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient SA1184 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009146 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1047 passage 1 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009147 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 4 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009148 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 8 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009149 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 6 
    
   
  
    
      
      BGISEQ-500 
      
    
   
  1 
 
  
    EGAD00001009150 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009151 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 7 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009152 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 4 
    
   
  
    
      
      BGISEQ-500 
      
    
   
  1 
 
  
    EGAD00001009153 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 5 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009154 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009155 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 5 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009156 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 5 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009157 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 9 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009158 
   
  
    
    10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma cell line TOV2295(R) 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009159 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 9 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009160 
   
  
    
    10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA610 passage 3 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009161 
   
  
    
    Children with ALL treated with anti-CD19 therapy occasionally develop a phenotypically distinct AML. However, the precise clonal origin of such class switch leukemias remains unresolved. Here, we reconstructed the evolution of leukemia in a child with primary ALL, two ALL relapses and AML after treatment with anti-CD19 CAR-T and blinatumomab through whole-genome sequencing. The phylogeny revealed that the AML was a monoclonal outgrowth descending from the initial ALL and harbored biallelic loss of CDKN2A, PAX5 and TP53. However, none of the ALL or AML relapses directly descended from one another, suggesting the presence of a reservoir of persistent clones. Our findings suggest anti-CD19 treatment selects pre-existing clones, with many key genetic alterations underpinning the lineage switch detectable prior to treatment. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009162 
   
  
    
    We conducted whole exome sequencing (using the SureSelect Human All Exon V5 + UTRs target enrichment kit) of 90 individuals from the Arabian Peninsula (AP): 23 from Saudi Arabia, 24 from Yemen, 24 from Oman and 19 from UAE. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  90 
 
  
    EGAD00001009163 
   
  
    
    WGS files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001009164 
   
  
    
    WXS files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001009165 
   
  
    
    This dataset contains the methylation sequencing data of 60 non-cancer and 70 colorectal cancer cfDNA samples. The methylation library was constructed using the NEBNext Enzymatic-seq Kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  130 
 
  
    EGAD00001009166 
   
  
    
    The Roche Alzheimer’s disease dataset (Roche_AD) consists of 80 samples from 40 unique individuals (one sample from the temporal cortex and one from deep white matter for each individual, 12 cases, 25 controls, 3 dementia). A total of 12,000 estimated cells from each sample were loaded on the 10x Single Cell Next GEM G Chip. cDNA libraries were prepared using the Chromium Single Cell 3’ Library and Gel Bead v3 kit according to the manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a sequencing depth of minimum 30K reads/nucleus. 
    
   
  
    
   
  80 
 
  
    EGAD00001009167 
   
  
    
    This dataset comprises genetic variation data (as VCFs of somatic indels and SNVs) of 38 OPSCC tumors. WES was done using a NextSeq 500 System running in 150-cycle (2x 75bp paired-end) mode. Sequence information was converted to FASTQ format using bcl2fastq v2.20.0.422. VCFs were generated using the Strelka package. 
    
   
  
    
   
  1 
 
  
    EGAD00001009168 
   
  
    
    The Columbia Alzheimer’s dataset (white matter) consists of white matter samples from 24 individuals (12 controls, 12 cases). A total of 12,000 estimated cells from each sample were loaded on the 10x Single Cell Next GEM G Chip. cDNA libraries were prepared using the Chromium Single Cell 3’ Library and Gel Bead v3 kit according to the manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a sequencing depth of minimum 30K reads/nucleus. 
    
   
  
    
   
  24 
 
  
    EGAD00001009169 
   
  
    
    The Roche multiple sclerosis dataset (Roche_MS) consists of 166 cortical grey matter (GM) and white matter (WM) samples from 83 unique individuals (29 controls and 54 cases). A total of 12,000 estimated cells from each sample were loaded on the 10x Single Cell Next GEM G Chip. cDNA libraries were prepared using the Chromium Single Cell 3’ Library and Gel Bead v3 kit according to the manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a sequencing depth of minimum 30K reads/nucleus. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  166 
 
  
    EGAD00001009170 
   
  
    
    The whole exome was sequenced in two cancer-affected members (II:1 and III:1) of the family. The family studied here showed an autosomal dominant mode of CRC inheritance, fulfilling the Amsterdam I clinical criteria with three CRCs in two consecutive generations. 
The exome capture was performed using SureSelectXT Human All Exon V3 (51Mb, Agilent Technologies), and the library was sequenced on an Illumina HiSeq 2000 platform with paired-end reads of 101bp and a 50x average coverage depth. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001009171 
   
  
    
    Bank of patient-derived xenografts (PDXs) of metastatic colorectal cancer (mCRC) 
    
   
  
    
      
      unspecified 
      
    
   
  480 
 
  
    EGAD00001009172 
   
  
    
    A PDX model of T-ALL treated with CB-103 or vehicle was analyzed by single-cell transcriptomics using 10X Genomics technology. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001009173 
   
  
    
    This dataset contains 10x Genomics v3 3’ single-nuclei RNA sequencing (24 human schizophrenia and control samples) and 10x Genomics Visium spatial transcriptomics (14 human schizophrenia and control samples) data. 
Files are in .bam format, as output by cellranger v3.1.0 (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/3.1/using/count) for snRNA-seq samples and spaceranger v1.1.0 (https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/1.1/using/count) for Visium samples.
Reads were mapped against Release 97 of the human genome from Ensembl (http://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz and
http://ftp.ensembl.org/pub/release-97/gtf/homo_sapiens/Homo_sapiens.GRCh38.97.gtf.gz). 
More details on sample processing are available in the bioRxiv preprint (https://doi.org/10.1101/2020.11.17.386458) and an upcoming publication in Science Advances.
BAM files can be converted to FASTQ files using the bamtofastq tool (https://support.10xgenomics.com/docs/bamtofastq), followed by remapping with the tools and genomes of choice. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  38 
 
  
    EGAD00001009174 
   
  
    
    To explore intratumor heterogeneity of CLL_24 using a single-cell multi-omics approach, we generated single-cell CITE-seq data for CLL_24, coupling scRNA-seq with surface protein marker measurements using oligo-tagged antibodies. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009175 
   
  
    
    Low-pass whole genome sequencing of cell-free DNA from patients receiving CD19 CAR T-cell therapy for large B-cell lymphoma, consisting of 123 samples with FASTQ files and hg19-aligned BAM/BAI files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  94 
 
  
    EGAD00001009176 
   
  
    
    We sorted CD45-CD44+CD90+ stromal cells from multiple tumor types and performed bulk RNA-sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  171 
 
  
    EGAD00001009177 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA218 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009178 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA219 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009179 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA272 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009180 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA274 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009181 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA275 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009182 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA276 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009183 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA279 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009184 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA283 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009185 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA287 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009186 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA394 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009187 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA395 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009188 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA398 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009189 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA402 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009190 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA404 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009191 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA530 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009192 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA585 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009193 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA586 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009194 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA588 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009195 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA589 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009196 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA590 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009197 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA591 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009198 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA592 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009199 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA593 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009200 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA595 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009201 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA596 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009202 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA598 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009203 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA599 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009204 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA600 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009205 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA601 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009206 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA654 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009207 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA655 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009208 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA665 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009209 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA666 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009210 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA667 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009211 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA668 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009212 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA669 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009213 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA671 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009214 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA672 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009215 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA535 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009216 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1035 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009217 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA604 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009218 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA605 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009219 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA609 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009220 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA218 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009221 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA219 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009222 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA272 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009223 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA274 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009224 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA275 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009225 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA276 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009226 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA279 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009227 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA283 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009228 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA287 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009229 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA394 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009230 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA395 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009231 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA398 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009232 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA402 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009233 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA404 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009234 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA530 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009235 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA535 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009236 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA585 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009237 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA586 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009238 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA588 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009239 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA589 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009240 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA590 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009241 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA591 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009242 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA592 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009243 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA593 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009244 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA595 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009245 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA596 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009246 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA598 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009247 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA599 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009248 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA600 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009249 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA601 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009250 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA654 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009251 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA655 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009252 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA665 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009253 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA667 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009254 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA668 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009255 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA669 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009256 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA671 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009257 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA672 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009258 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1035 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009259 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA994 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009260 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA604 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009261 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA605 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009262 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA609 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009264 
   
  
    
    Profiling of paired nuclear and cytoplasmic fractions of anterior prefrontal cortex, cerebellar cortex and putamen samples by bulk-tissue RNA-sequencing. Samples were derived from 4 post-mortem neuropathologically-confirmed control individuals (anterior prefrontal cortex – 4 individuals, cerebellar cortex – 4 individuals, putamen – 3 individuals). Paired-end FASTQ files for each of the human samples are provided. Fastp (v 0.20.0), a fast all-in-one FASTQ pre-processor, was used for adapter trimming, read filtering and base correction. Fastp default settings were used for quality filtering and base correction. Further details on parameters used are available here: https://github.com/RHReynolds/RNAseqProcessing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  22 
 
  
    EGAD00001009265 
   
  
    
    Dataset comprising raw paired RNA-seq data in fastq.gz format for 7 samples of rosette forming brain tumors 
    
   
  
    
      
      NextSeq 500 
      
    
   
  7 
 
  
    EGAD00001009266 
   
  
    
    This dataset comprises results of mutect2 variant calling in VCF format on 9 samples of rosette forming brain tumors (5 with paired normal tissue and 4 without). Only variants specific to the tumor were kept, to comply with patient consent. 
    
   
  
    
   
  14 
 
  
    EGAD00001009267 
   
  
    
    59 samples were sequenced; 18 are HCC and beta thalassemia cases, while the remaining are control cases. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  59 
 
  
    EGAD00001009268 
   
  
    
    Levels of 92 circulating proteins measured on the Olink platform (CVDIII panel) 
    
   
  
    
   
  1 
 
  
    EGAD00001009269 
   
  
    
    Fastq or bam files are deposited for 28 patient H3-K27M diffuse midline gliomas. UMPEDD65 was profiled by targeted exome-sequencing using the TSO500 Illumina assay, while all other samples were sequenced by whole-exome sequencing. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001009270 
   
  
    
    10X Genomics scRNA- and TCR-sequencing (Chromium Next GEM Single Cell 5’ Reagent Kit v1.1) was performed on the plasma cell depleted mononuclear fraction of bone marrow aspirates from 6 patients with newly diagnosed multiple myeloma. Generated gene expression libraries were paired-end sequenced on the NovaSeq 6000 S2. Generated V(D)J libraries were paired-end sequenced on the NextSeq 550. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  - 
 
  
    EGAD00001009271 
   
  
    
    ATAC-Seq on OCIAML-22 CD34+, CD34-, and Bulk Fractions
RNA-Seq on OCIAML-22 CD34+/CD38-, CD34+/CD38+, CD34-/CD38+, CD34-/CD38- Fractions
WGS on Donor Bulk, OCIAML-22 Bulk, and CD34+ and CD34- Fractions out of OCIAML-22 Xenografts 
    
   
  
    
   
  36 
 
  
    EGAD00001009272 
   
  
    
    WES/WGS sequencing data of 234 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  233 
 
  
    EGAD00001009273 
   
  
    
    WES/WGS sequencing data of 86 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  86 
 
  
    EGAD00001009274 
   
  
    
    WES/WGS sequencing data of 337 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  319 
 
  
    EGAD00001009275 
   
  
    
    WES/WGS sequencing data of 56 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  56 
 
  
    EGAD00001009276 
   
  
    
    WES/WGS sequencing data of 75 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  74 
 
  
    EGAD00001009277 
   
  
    
    WES/WGS sequencing data of 44 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  40 
 
  
    EGAD00001009278 
   
  
    
    WES/WGS sequencing data of 242 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  242 
 
  
    EGAD00001009279 
   
  
    
    WES/WGS sequencing data of 239 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  218 
 
  
    EGAD00001009280 
   
  
    
    Whole-genome sequence (WGS) analysis of tumors from 22 TP53 mutation carriers. We observed somatic mutations affecting Wnt, PI3K/AKT signaling, epigenetic modifiers and homologous recombination genes as well as mutational signatures associated with prior chemotherapy. We identified near-ubiquitous early loss of heterozygosity of TP53, with gain of the mutant allele. This occurred earlier in these tumors compared to tumors with somatic TP53 mutations, suggesting the timing of this mark may distinguish germline from somatic TP53 mutations. Phylogenetic trees of tumor evolution, reconstructed from bulk and multi-region WGS, revealed that LFS tumors exhibit comparatively limited heterogeneity. Overall, our study delineates early copy number gains of mutant TP53 as a characteristic mutational process in LFS tumorigenesis, likely arising very early in life or in utero, years prior to tumor diagnosis. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  65 
 
  
    EGAD00001009281 
   
  
    
    This dataset includes WXS sequencing for 1 tumor FFPE sample and adjacent normal tissue from one family member. 
    
   
  
    
   
  2 
 
  
    EGAD00001009282 
   
  
    
    The study includes WGS data for DNA extracted from blood, fibroblasts or buccal swabs from sixteen family members, who represent four sub-families, each including two parents and one to three children, comprising a total of eight offspring. In two sub-families POLD1 L474P was carried by the father; in one sub-family, by the mother; and in the other sub-family, both parents had wild-type POLD1. 
    
   
  
    
   
  16 
 
  
    EGAD00001009283 
   
  
    
    This dataset contains WGS for fibroblast colonies obtained from carriers and non-carriers of germline POLD1 L474P. Data for single-cell colonies obtained from immortalized fibroblasts are present for eight out of 16 family members (six carriers and two non-carriers). Sequences obtained for colonies after approximately 40 passages are also present for six out of these eight colonies (four carriers and two non-carriers). Samples marked with "_F2" and "_F3" represent sequences of single cell-derived colonies and colonies after ~40 passages, respectively. 
    
   
  
    
   
  14 
 
  
    EGAD00001009284 
   
  
    
    BAM files of 17 samples from 11 different patients. The scRNAseq data were obtained using the 10X 3' Gene Expression kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001009285 
   
  
    
    Data used to validate the RNAmp tool. 
    
   
  
    
   
  27 
 
  
    EGAD00001009286 
   
  
    
    Dataset contains paired-end clinical cancer panel sequencing (UCSF500) data from 2 samples of an initial tumor and one sample of a recurrence from one GBM patient and one sample from a second GBM patient. 
    
   
  
    
   
  4 
 
  
    EGAD00001009287 
   
  
    
    The dataset consists of 258 BAM files generated by whole exome sequencing: 122 from IgAN-tGBM patients, 64 from IgAN patients and 72 from TBMN patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  258 
 
  
    EGAD00001009288 
   
  
    
    Sample information: The 56 samples produced in this project come from the human iPSC line GM17602 (Coriell), into which a tyrosine hydroxylase-T2A-mCherry reporter was inserted. The data combine epigenetic analysis (ATAC/ChIP) with transcriptomics. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 500 
      
    
   
  68 
 
  
    EGAD00001009289 
   
  
    
    Paired-end WGS data of 10 neuroblastoma patient samples (5 obtained at diagnosis and 5 matched blood samples as controls) used for analysis of telomeric content and sequence composition. Mean coverage is 11-65x per sample. The remaining patient samples of the dataset can be found under accession numbers EGAS00001001308 and EGAS00001005424, and mappings of the patient IDs can be found in the supplementary material of the publication. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001009291 
   
  
    
    Tumour biopsies were collected from twenty-three patients (46 samples) (of which 20 were matched) with locally advanced or metastatic melanoma (stage IIIB – stage IV). Library construction was done either with the Chromium Single Cell 3ʹ GEM, Library & Gel Bead Kit v3 (n = 16; 10x Genomics, Cat#1000092) or the Chromium Single Cell A Chip Kit and 5’ Library & Gel Bead Kit (10x Genomics, Cat#1000014). All libraries were sequenced on Illumina NextSeq, HiSeq4000 or NovaSeq6000 until sufficient saturation was reached (60% on average). The reference genome used in this study was GRCh38. 
    
   
  
    
      
      unspecified 
      
    
   
  46 
 
  
    EGAD00001009292 
   
  
    
    Using single-nucleus RNA sequencing, we characterized the transcriptome of 880,000 nuclei from 18 control and 61 failing, nonischemic human hearts with pathogenic variants in DCM and ACM genes or idiopathic disease. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  196 
 
  
    EGAD00001009293 
   
  
    
    Using a novel sorting strategy, we performed ultra-low-input RNAseq on FACS-sorted populations from diagnostic DNMT3Amut and NPM1mut AML patients. Primary samples were retrospectively collected based on their mutational profile. Samples were thawed, stained and FACS sorted using a combination of lineage markers, CD34, GPR56 and NKG2D ligands. RNA was extracted and libraries were prepared from 13 samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      NextSeq 550 
      
    
   
  7 
 
  
    EGAD00001009294 
   
  
    
    Archival de-identified formalin-fixed paraffin-embedded RCC tumor tissue blocks from nephrectomy or tumor biopsy were processed as described below, and the same sections were used for both DNA and RNA extractions.
For WES (ACE version 3; Illumina NovaSeq), samples were profiled using Personalis ACE Cancer Exome (Personalis, Inc, Menlo Park, CA).
Whole-transcriptome profiles were generated by RNA-seq (Accuracy and Content Enhanced (ACE) version 3; Illumina NovaSeq) using Personalis ACE Cancer Transcriptome (Personalis, Inc, Menlo Park, CA).
Of the 615 patients in the intent-to-treat population in the S-TRAC trial, 193 individual specimens were available for molecular profiling, of which 171 (27.8%) (sunitinib, n = 91; placebo, n = 80) returned results for the WES analysis, and 133 (21.6%) (sunitinib, n = 72; placebo, n = 61) returned results for the GES analysis. Of the 138 WTS samples with data, replicates for two patients were summarized by median expression, and three samples were excluded from the final analysis due to low counts. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  309 
 
  
    EGAD00001009296 
   
  
    
    Pulmonary atypical carcinoid. DNA WES and RNA-Seq on 42 tumour samples collected at autopsy, 10x Chromium linked read whole genome sequencing on four tumours plus one normal sample, and targeted DNA sequencing on two clinical biopsies and one blood plasma sample. Note – RNA-Seq dataset features complex batch effect attributable to tissue processing artefacts, as described in "Complex patterns of genomic heterogeneity identified in 42 tumor samples and ctDNA of a pulmonary atypical carcinoid patient" - Robb et al., 2022 and detailed in Supplementary Table S3.
    
   
  
    
      
      HiSeq X Ten 
      
      Ion Torrent S5 
      
      NextSeq 500 
      
      unspecified 
      
    
   
  47 
 
  
    EGAD00001009297 
   
  
    
    Whole genome sequencing of paired tumor-normal samples of pediatric Wilms tumors 
    
   
  
    
   
  - 
 
  
    EGAD00001009298 
   
  
    
    Bulk RNA-seq data of pediatric Wilms tumors 
    
   
  
    
   
  - 
 
  
    EGAD00001009299 
   
  
    
    This dataset consists of paired-end DNA sequencing (whole exome and targeted-capture) of tumours for the BEACCON study. There are 240 unique samples consisting of 92 matched tumour-germline pairs and 56 unmatched tumours totalling 148 patients. There are 33 paired and 6 unpaired tumours sequenced using the Agilent SureSelect All Human Exon v6 libraries, 1 paired and 8 unpaired tumours sequenced using Agilent SureSelect All Human Exon v7 libraries and 48 paired and 3 unpaired tumours sequenced using Twist Bioscience Comprehensive Human Exome v1 libraries totalling 176 whole exome samples. There are 3 paired and 58 unpaired tumours sequenced using a custom Agilent SureSelectXT library totalling 64 targeted capture samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  230 
 
  
    EGAD00001009300 
   
  
    
    Extended cohort of single-cell RNA-seq data from lung adenocarcinoma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  107 
 
  
    EGAD00001009301 
   
  
    
    RNA sequencing data of a collection of 6 pediatric ependymoma cases 
    
   
  
    
   
  6 
 
  
    EGAD00001009302 
   
  
    
    RNASeq files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  8 
 
  
    EGAD00001009303 
   
  
    
    Data generated through single nuclei RNA sequencing on 5 regions of the brain (frontal cortex, ganglionic eminence, hippocampus, thalamus and cerebellum) from 3 fetuses (two of 14 and one of 15 post-conception weeks, all female). Tissue was acquired from the MRC-Wellcome Trust Human Developmental Biology Resource (HDBR) with ethical approval.  
snRNA-seq libraries were prepared from ∼10,000 nuclei from each sample using Chromium Single Cell 3ʹ (v3) reagents (10X Genomics). Quality control of libraries was performed using the Agilent 5200 Fragment Analyzer before sequencing on an Illumina NovaSeq 6000 to a depth of at least 865 million (median = 1.01 billion) read pairs per library. Raw sequencing data were converted into FASTQ files.
For a full description of data generation, please see Cameron et al, Biological Psychiatry 2022, https://doi.org/10.1016/j.biopsych.2022.06.033. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001009304 
   
  
    
    Genomic profiling at diagnosis of B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) in adults is used to guide disease classification, risk stratification and treatment decisions. Patients for whom diagnostic screening fails to identify disease-defining or risk-stratifying lesions are classified as B-other ALL. We screened a cohort of 652 BCP-ALL cases enrolled in UKALL14 to identify and perform whole genome sequencing (WGS) on paired tumor-normal samples. For 52 B-other patients we compared WGS findings to data from clinical and research cytogenetics. WGS identifies a cancer-associated event in 51/52 cases, including an established subtype-defining genetic alteration in 5/52 that was previously missed by standard-of-care genetics. Of the 47 true B-other ALL cases, we identified a recurrent driver in 87% (41). Complex karyotype by cytogenetics emerges as a heterogeneous group, underlain by distinct genetic alterations associated with either favorable (DUX4-r) or poor outcomes (MEF2D-r, IGK::BCL2). For a subset of 31 cases, we integrate findings from RNA-sequencing (RNA-seq) analysis to include fusion gene detection and classification by gene expression. Compared to RNA-seq, WGS was sufficient to detect and resolve recurrent genetic subtypes; however, RNA-seq can provide orthogonal validation of findings. In conclusion, we demonstrate that WGS can identify clinically relevant genetic abnormalities missed by standard-of-care testing and identify leukemia driver events in virtually all cases of B-other ALL. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  115 
 
  
    EGAD00001009305 
   
  
    
    Genomic profiling at diagnosis of B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) in adults is used to guide disease classification, risk stratification and treatment decisions. Patients for whom diagnostic screening fails to identify disease-defining or risk-stratifying lesions are classified as B-other ALL. We screened a cohort of 652 BCP-ALL cases enrolled in UKALL14 to identify and perform whole genome sequencing (WGS) on paired tumor-normal samples. For 52 B-other patients we compared WGS findings to data from clinical and research cytogenetics. WGS identifies a cancer-associated event in 51/52 cases, including an established subtype-defining genetic alteration in 5/52 that was previously missed by standard-of-care genetics. Of the 47 true B-other ALL cases, we identified a recurrent driver in 87% (41). Complex karyotype by cytogenetics emerges as a heterogeneous group, underlain by distinct genetic alterations associated with either favorable (DUX4-r) or poor outcomes (MEF2D-r, IGK::BCL2). For a subset of 31 cases, we integrate findings from RNA-sequencing (RNA-seq) analysis to include fusion gene detection and classification by gene expression. Compared to RNA-seq, WGS was sufficient to detect and resolve recurrent genetic subtypes; however, RNA-seq can provide orthogonal validation of findings. In conclusion, we demonstrate that WGS can identify clinically relevant genetic abnormalities missed by standard-of-care testing and identify leukemia driver events in virtually all cases of B-other ALL. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  33 
 
  
    EGAD00001009306 
   
  
    
    Fresh nephrectomy samples were collected from a total of 5 untreated ccRCC patients. Out of these 5 patients, 7 samples were obtained. Two samples consisted of fresh versus frozen single cells from the primary tumor site of one patient. Two other samples consisted of matched primary and distant thrombus sites (the vena cava) of a second patient. The three remaining samples came from the primary tumor sites of three distinct ccRCC patients. Single cells were captured into 10x barcoded gel beads and RNA-sequencing library preparation was done using Chromium Single Cell 3' v2 chemistry. Sequencing was performed on an Illumina HiSeq 4000 sequencer. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  7 
 
  
    EGAD00001009307 
   
  
    
    In this study, we aimed to identify somatic structural variation in skin fibroblasts at the single-cell level and investigate its direct consequence on nucleosome occupancy using the scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of a skin fibroblast sample from a male donor. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  95 
 
  
    EGAD00001009308 
   
  
    
    This dataset contains data used in the paper titled "Significant and pervasive effects of RNA degradation on Nanopore direct RNA sequencing". The data consist of one post mortem sample that was sequenced with direct RNA sequencing from Oxford Nanopore Technologies on a PromethION flow cell. 
    
   
  
    
      
      PromethION 
      
    
   
  1 
 
  
    EGAD00001009309 
   
  
    
    This dataset contains:
1.) Whole-genome sequencing (WGS) data (~6x) of 259 cfDNA samples obtained from 50 colorectal cancer (CRC) patients and 61 healthy controls. Paired-end sequencing was performed with 2x101 bp reads on the NovaSeq 6000 system. Data is provided as mapped .bam files (aligned to GRCh38/hg38).
2.) WGS data (~1x) of 50 tumor biopsy and 45 saliva samples from CRC patients. Paired-end sequencing was performed with 2x101 bp reads on the NovaSeq 6000 system. Data is provided as mapped .bam files (aligned to GRCh38/hg38). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  354 
 
  
    EGAD00001009311 
   
  
    
    The dataset contains a genomic characterization of 35 triple-negative Asian breast tumours from the Malaysian Breast Cancer cohort. This includes whole-exome sequencing of tumour tissue at 80X, whole-exome sequencing of matched normal (blood) tissue at 40X, and RNA-seq of tumour tissue at 40X coverage (>15 million reads). Whole-exome libraries were prepared using the Nextera Rapid Capture Exome Kit; exome capture was performed in pools of 3 and subjected to paired-end 75 sequencing on a NovaSeq 6000 platform. RNA libraries were prepared using the TruSeq Stranded Total RNA HT kit with Ribo-Zero Gold as per manufacturer’s instructions and also subjected to paired-end 75 sequencing on a NovaSeq 6000 platform. Uploaded BAM files have been mapped to the hs37d5 human genome and processed using the standard GATK pipelines. Paired clinical, demographic, genotyping, and overall survival data for these patients are available from the associated publications or by request. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  105 
 
  
    EGAD00001009316 
   
  
    
    Single Cell Genome Sequence for Triple negative breast cancer patient-derived xenograft SA609 passage 3 on DLP+ library A95618B 
    
   
  
    
      
      NextSeq 550 
      
    
   
  15 
 
  
    EGAD00001009317 
   
  
    
    Spinocerebellar ataxia type 3 (SCA3) is the most common autosomal dominant inherited ataxia worldwide, caused by a CAG repeat expansion in the Ataxin-3 gene resulting in a polyglutamine (polyQ)-expansion in the corresponding protein. Here we have RNA-sequencing data from the cerebellum of individuals with SCA3 and matched controls. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  12 
 
  
    EGAD00001009318 
   
  
    
    Small variants in HAE of several Canary Islanders sequenced with Illumina WES. 
    
   
  
    
   
  1 
 
  
    EGAD00001009319 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient SA1162SB on DLP+ library A95628A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009320 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1051D passage 1 on DLP+ library A95629B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009321 
   
  
    
    Single Cell Genome Sequence for immortalized breast epithelium - BRCA1-/- Tp53-/- cell line 184-hTERT-22 L9 83.86 on DLP+ library A95632A 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD00001009322 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA609 passage 6 on DLP+ library A95632C 
    
   
  
    
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD00001009323 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1091A passage 1 on DLP+ library A95634A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009324 
   
  
    
    Single Cell Genome Sequence for immortalized breast epithelium - BRCA2-/-; Tp53-/- cell line 184-hTERT-22 L9 116.126 on DLP+ library A95635A 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001009325 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1052J passage 1 on DLP+ library A95650A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009326 
   
  
    
    Single Cell Genome Sequence for immortalized breast epithelium BRCA2+/- Tp53-/- cell line 184-hTert L9 116.66 on DLP+ library A95652A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001009327 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049A passage 1 on DLP+ library A95652B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009328 
   
  
    
    This ADPKD project has 3 different experiments, 7 different single cell RNA-seq data, 5 different single nuclei RNA-seq data and 6 different bulk ATAC-seq data objects. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  15 
 
  
    EGAD00001009330 
   
  
    
    Sequencing data for three HGSC patients with patient derived cell lines. WES data for two patient derived cell line samples and matched blood control samples. Fresh frozen tumor samples of two patients with WES or WGS sequencing data. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  9 
 
  
    EGAD00001009331 
   
  
    
    Single-cell RNA-seq, single-cell ATAC-seq, and genotypes used in the analysis for the study "Altered and allele-specific open chromatin landscape reveal epigenetic and genetic regulators of innate immunity in COVID-19". The RNA-seq and ATAC-seq are raw data in FASTQ format while the genotypes are in the VCF format which was filtered and imputed (more details are available in the main text of the study). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD00001009333 
   
  
    
    The phenotypic data for 6431 samples of the KDRN Study from Ghana and Nigeria. 
    
   
  
    
   
  6431 
 
  
    EGAD00001009335 
   
  
    
    The dataset for "Integrative modeling of tumor genomes and epigenomes for enhanced cancer diagnosis by cell-free DNA" includes 3784 whole-genome sequencing BAM files generated on the MGI and Illumina platforms. The analyzed samples include plasma samples from normal individuals and patients with cancer. 
    
   
  
    
   
  3784 
 
  
    EGAD00001009336 
   
  
    
    Creation of a living biobank of patient-derived ductal carcinoma in situ (DCIS) Mouse-INtraDuctal (MIND) xenografts to find factors explaining invasive growth. The dataset consists of both primary and PDX samples. Invasive growth was scored in the PDX samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  227 
 
  
    EGAD00001009337 
   
  
    
    Low-pass whole-genome sequencing of pretherapy lymphoma cfDNA and targeted sequencing of cfDNA, tumor tissue and whole-blood samples of NLG-LBC-05 patient samples and cfDNA of nine subjects with no known cancer. Hybrid capture target enrichment; panel target and sequencing information provided in PMID:34932792. FASTQ files provided for targeted sequencing, separate sequencing runs per sample noted with prefix "run" if applicable. Sequences from DTX1 and KLHL6 targets are advised to be excluded from analyses due to PCR/plasmid contaminants of in-house origin. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  391 
 
  
    EGAD00001009339 
   
  
    
    High-resolution lung adenocarcinoma expression subtypes identify tumors with dependencies on MET, CDK4, CDK6, and PD-L1 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  164 
 
  
    EGAD00001009340 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA673 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009341 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA674 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009342 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA675 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009343 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA676 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009344 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA677 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009345 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA678 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009346 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA679 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009347 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA680 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009348 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA681 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009349 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA682 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009350 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA683 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009351 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA221 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009352 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA238 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009353 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA239 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009354 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA300 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009355 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA423 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009356 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA425 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009357 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA495 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009358 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA286 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009359 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA289 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009360 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA291 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009361 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA280 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009362 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA221 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009363 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA238 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009364 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA239 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009365 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA280 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009366 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA286 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009367 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA289 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009368 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA291 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009369 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA300 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009370 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA423 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009371 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA425 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009372 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA495 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009373 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA673 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009374 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA674 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009375 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA675 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009376 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA676 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009377 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA677 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009378 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA678 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009379 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA679 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009380 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA680 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009381 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA681 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009382 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA682 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009383 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA683 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009384 
   
  
    
    Whole-genome sequencing data of 177 samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  177 
 
  
    EGAD00001009385 
   
  
    
    Single-cell mRNA-sequencing to generate a transcriptomic atlas of soft tissue sarcoma tumors 
    
   
  
    
      
      NextSeq 500 
      
    
   
  13 
 
  
    EGAD00001009386 
   
  
    
    Comprehensive map of first- and second-trimester gonadal development in humans using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays, and imaging. ArrayExpress Accession: E-MTAB-10551 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD00001009387 
   
  
    
    Comprehensive map of first- and second-trimester gonadal development in humans using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays, and imaging. ArrayExpress Accession: E-MTAB-10570 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009388 
   
  
    
    Comprehensive map of first- and second-trimester gonadal development in humans using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays, and imaging. ArrayExpress Accession: E-MTAB-11708 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001009389 
   
  
    
    This experiment consists of RNAseq of liver harvested from CDAHFD mice treated for 8 weeks with either the MGAT2 inhibitor compound BMS-963272 (N = 10) or with vehicle (N = 10). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001009390 
   
  
    
    This experiment consists of RNAseq of jejunum (small intestine) harvested from CDAHFD mice treated for 8 weeks with either the MGAT2 inhibitor compound BMS-963272 (N = 14) or with vehicle (N = 14). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD00001009391 
   
  
    
    A subset of meningiomas progress in histopathological grade and drivers of progression are poorly understood. We aimed to identify somatic mutations and copy number alterations (CNAs) associated with grade progression in a unique matched tumour dataset.
This dataset consists of DNA sequencing from 10 individuals with meningiomas, where the meningiomas have undergone grade progression.
50 meningiomas were sequenced from the 10 individuals using the hybrid capture-based TruSight Oncology 500 (TSO500) Library Preparation Kit, and 13 of those meningiomas were also sequenced using Agilent SureSelect Clinical Research Exome V2.
BAM files for the sequencing data are included in the dataset. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  63 
 
  
    EGAD00001009392 
   
  
    
    The dataset contains 12 lung cancer plasma cfDNA samples, 8 bladder cancer and 2 healthy control urine cfDNA samples collected in EDTA collection tubes. Shallow WGS was performed using both Oxford Nanopore Technologies' MinION platform with an R9.4.1 flow cell and the SQK-PBK004 kit (22 files) and the Illumina NovaSeq platform with an S4 flow cell in PE150bp configuration (2x22 files). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
    
   
  53 
 
  
    EGAD00001009393 
   
  
    
    Tumor-infiltrating macrophages and monocytes were sorted on an Aria II (Becton Dickinson) into TRIzol LS and flash frozen. RNA was extracted with chloroform. Isopropanol and linear acrylamide were added, and the RNA was precipitated with 75% ethanol. Total RNA (0.649–1 ng) with RNA integrity numbers 6.8 to 10 underwent amplification using the SMART-Seq v4 Ultra Low Input RNA Kit (Clontech; cat. #63488). Amplified cDNA (15 ng) was used to prepare libraries with the KAPA Hyper Prep Kit (Kapa Biosystems, KK8504) using 8 cycles of PCR. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001009394 
   
  
    
    Additional RNASeq files for Roussel paper titled "Combination of CDK4/6 with BET-bromodomain and PI3K/mTOR inhibitors in medulloblastoma in vitro and in vivo" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  19 
 
  
    EGAD00001009395 
   
  
    
    Total RNA sequencing of cultured ONS cells derived from patients with Alzheimer's disease (AD), individuals with mild cognitive impairment (MCI) and cognitively healthy controls. 
    
   
  
    
   
  1 
 
  
    EGAD00001009396 
   
  
    
    We performed deep targeted DNA sequencing with a panel of 74 selected cancer-related genes previously identified to be recurrently mutated in EBV-associated DLBCL. Sequencing was performed on a HiSeq platform (Illumina) with 250 bp paired-end reads. There are 68 FFPE samples in this targeted DNA-seq dataset, with 46 unpaired tumors and 22 normals used as a panel of normals. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  68 
 
  
    EGAD00001009397 
   
  
    
    PacBio HiFi sequencing was performed on 68 barcoded patients' genomic DNA after a telobait-capture protocol to enrich for telomeric regions. The sequencing reads of each patient were de-multiplexed and presented as patient-specific PacBio CCS BAM files. There are 56 new samples and 12 repeated samples from run 1. 
    
   
  
    
      
      Sequel 
      
    
   
  68 
 
  
    EGAD00001009398 
   
  
    
    Whole-genome sequencing of high-grade serous ovarian cancer (HGSC) tumours and matched normals from long-term survivors performed as part of the Multidisciplinary Ovarian Cancer Outcomes Group (MOCOG) study. The dataset includes fastq files from 58 HGSC tumours (53 primary tumours and 5 recurrent tumours) and 53 matched normals from 53 long-term survivor patients. Sequence libraries were generated from tumour and matched normal genomic DNA using the KAPA HyperPrep PCR-free library preparation kit (Roche) according to manufacturer’s instructions. Sequencing was carried out by the Kinghorn Centre for Clinical Genomics Sequencing Laboratory (Sydney, Australia) on the HiSeq X Ten System (Illumina) to a minimum base coverage of 30-fold for normal DNA and 60-fold for tumour DNA samples. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  111 
 
  
    EGAD00001009399 
   
  
    
    This dataset contains bulk transcriptomes from the inoperable cohorts of the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19). Transcriptomes were prepared from pre-treatment oesophageal tumour biopsies using a ribodepletion approach in order to assess both previously reported (e.g. PD-L1 expression) and novel predictive expression-based biomarkers for immunochemotherapy treatment in this setting. On-treatment and post-treatment biopsies were generated as well to characterize response to therapy, and samples were also prepared from paired normal GI tissues for a subset of patients. scRNA-seq based deconvolution was also applied to bulk transcriptomes in this study to estimate the cell composition of tumour biopsies and assess the link between the presence of specific cell types with immunochemotherapy outcomes. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  144 
 
  
    EGAD00001009400 
   
  
    
    This dataset contains whole genome sequencing (WGS) generated from the inoperable cohorts of the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19). WGS data were generated with the aim to assess previously reported (e.g. tumour mutational burden) and novel predictive genomic markers for immunochemotherapy treatment in this setting. Using these data, mutation and copy number profiles were generated for the LUD2015-005 study, which were assessed for correlation with patient outcomes from this trial. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  102 
 
  
    EGAD00001009401 
   
  
    
    This dataset contains single-cell RNA-seq generated from the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19) and additional donors with Barrett's oesophagus using the 5' Single Cell Gene Expression assay from 10x Genomics. Samples were generated from oesophageal tumours, Barrett's oesophagus, and normal oesophagus and gastric tissues, with the aim of generating a reference atlas for cell types found in normal and disease-associated tissue states in the upper GI tract. This reference atlas was used as the basis for a deconvolution workflow to estimate the cell composition of bulk transcriptomes from these tissues, and to assess cell type-specific expression patterns of potential predictive biomarkers for immunochemotherapy regimens in this setting. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  59 
 
  
    EGAD00001009402 
   
  
    
    Somatic mosaicism (SM), referring to the presence of somatic mutations in sub-populations of cells within healthy individuals, is associated with an increased risk of a variety of diseases, including cancer. Blood is at particularly high risk of SM, given its rapid turnover and functionally heterogeneous cell-type composition. While the roles of point mutations and large-scale rearrangements in blood SM have been scrutinised in recent years, the functional impact of mosaic structural variants (mSVs) remains poorly understood.
Using haplotype-resolved single-cell multi-omics based on Strand-seq technology, we explored the mSV landscape of human hematopoietic stem and progenitor cells (HSPCs). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1133 
 
  
    EGAD00001009403 
   
  
    
    RNA-Seq was performed on 249 DS-ALL samples. Library preparation was carried out using TruSeq Stranded Total RNA Library Prep Kit. The libraries were sequenced on a NovaSeq platform with read length of 2×101. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  249 
 
  
    EGAD00001009404 
   
  
    
    Cetuximab treatment in organoids 
    
   
  
    
      
      unspecified 
      
    
   
  62 
 
  
    EGAD00001009405 
   
  
    
    Primary lung fibroblasts were isolated from well-matched control donors (no COPD, n=3) and patients with COPD (GOLD stage I-IV, n=8). Total RNA was isolated from cultured human lung fibroblasts at passage 3 and rRNA was depleted. 75 bp single-end reads were generated from RNA libraries on an Illumina NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  11 
 
  
    EGAD00001009406 
   
  
    
    Primary lung fibroblasts were isolated from well-matched control donors (no COPD, n=3) and patients with COPD (GOLD stage I-IV, n=8). Genomic DNA was isolated from cultured human lung fibroblasts at passage 3, fragmented by tagmentation, and subjected to bisulfite treatment. 100 bp paired-end reads were generated from DNA libraries on the Illumina HiSeq 2500 platform. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001009407 
   
  
    
    29 paired FASTQ files from a Hi-C assay performed on mCRPC tumors. Sequencing was performed using 150 nt paired reads generated by a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD00001009408 
   
  
    
    Paired FASTQ files from a Hi-C assay performed on mCRPC tumors. Sequencing was performed using 150 nt paired reads generated by a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD00001009409 
   
  
    
    BAM files from RNA-seq of PDAC samples used in the COMPASS hENT1 study 
    
   
  
    
   
  - 
 
  
    EGAD00001009410 
   
  
    
    Germline variant calls were defined using the sequenced reads derived from 230 patients with hepatocellular carcinoma.
This dataset comprises one aggregated VCF file. 
    
   
  
    
   
  230 
 
  
    EGAD00001009411 
   
  
    
    ONT (PromethION) sequencing of chromothriptic medulloblastoma. Three samples: blood, primary tumor, and relapse tumor. Includes a fourth low-coverage run that multiplexes blood and primary tumor. 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001009412 
   
  
    
    Longitudinal plasma samples (n = 79) of 21 ALK-positive NSCLC patients and 13 healthy donors were collected alongside 15 ALK-positive tumor tissue and 10 healthy lung tissue specimens. All plasma and tissue samples were analyzed by cell-free DNA methylation immunoprecipitation sequencing to generate genome-wide 5-mC profiles. Paired cfMeDIP sequencing was performed on a NextSeq 550 using the KAPA Hyper Prep Kit. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  104 
 
  
    EGAD00001009413 
   
  
    
    Total RNA sequencing (SMARTer Stranded Total RNA-Seq Kit v2) data of extracellular RNA (exRNA) from liquid biopsies of the validation PDX/CDX cohort 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  60 
 
  
    EGAD00001009414 
   
  
    
    Clinical data including the treatment arm and HER2 status pre- and post-NAT. 
    
   
  
    
   
  - 
 
  
    EGAD00001009415 
   
  
    
    Biomarker data including the time point of sample collection, tumor content and ERBB2 gene expression. 
    
   
  
    
   
  1 
 
  
    EGAD00001009416 
   
  
    
    Amplicon sequencing data for 90 patients hospitalized for COVID-19 on a general ward. Patients had a median age of 60.5 (52.0 – 69.3) years and were overweight (body mass index: 28.4 (24.4 – 32.6) kg/m2). 35.6% of the cohort were female.
The following genes were sequenced on a NovaSeq 6000 instrument with an enrichment-based library preparation (IDT-xGEN) with a median coverage of 2000x:
ABL1, ASXL1, ATRX, BCOR, BCORL1, BRAF, CALR, CBL, CBLB, CBLC, CDKN2A, CEBPA, CSF3R, CUX1, DNMT3A, ETV6, EZH2, FBXW7, FLT3, FLT3-ITD, GATA1, GATA2, GNAS, GNB1, HRAS, IDH1, IDH2, IKZF1, JAK2, JAK3, KDM6A, KIT, KMT2A, KRAS, MPL, MYD88, NOTCH1, NPM1, NRAS, PDGFRA, PHF6, PPM1D, PTEN, PTPN11, RAD21, RUNX1, SETBP1, SF3B1, SMC1A, SMC3, SRSF2, STAG2, TET2, TP53, U2AF1, WT1, ZRSR2 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  90 
 
  
    EGAD00001009417 
   
  
    
    13 paired FASTQ files from a Hi-C assay performed on mCRPC tumors. Sequencing was performed using 150 nt paired reads generated by a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001009418 
   
  
    
    RNA-seq FASTQ files from 418 pre-treatment (Ven-Obi or Clb-Obi) CD19+ B cells. 
    
   
  
    
      
      unspecified 
      
    
   
  418 
 
  
    EGAD00001009419 
   
  
    
    RNA-seq FASTQ files from 44 pre-treatment (Ven-Obi or Clb-Obi) and 44 paired,
post-treatment relapsed CD19+ B cells. 
    
   
  
    
      
      unspecified 
      
    
   
  88 
 
  
    EGAD00001009420 
   
  
    
    Fastq transcriptomic sequencing files from Z138 mantle cell lymphoma (MCL) cell line upon MSI2 knockdown (KD) with two different shRNAs and after MSI2 inhibition with Ro 08-2750 small molecule. Dataset includes 4 samples of Z138 MSI2-KD, 4 of Z138 control, 3 of Z138 treated with Ro 08-2750 and 3 of Z138 treated with DMSO. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD00001009421 
   
  
    
    Fastq transcriptomic sequencing files from Z138 SOX11+ and JVM2 SOX11- mantle cell lymphoma (MCL) cell lines upon SOX11 knock out (KO) and ectopic overexpression, respectively. Dataset includes 3 samples of Z138-SOX11KO, 3 of Z138 control, 3 of JVM2 control and 3 of JVM2-SOX11 MCL cell lines. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001009422 
   
  
    
    Dataset includes FASTQ transcriptomic sequencing files from 8 conventional (SOX11+) and 4 non-nodal (SOX11-) mantle cell lymphoma (MCL) primary cases. RNA sequencing was performed on peripheral blood and lymph node diagnostic samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001009424 
   
  
    
    Illumina whole-genome sequencing of medulloblastoma blood, primary tumor, and relapse tumor. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009425 
   
  
    
    Illumina RNA-sequencing of Medulloblastoma primary and relapse tumor. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009426 
   
  
    
    Genetic data of 14 rigorously selected CUP samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001009427 
   
  
    
    8 pregnant women in the third trimester, 4 hepatitis B carriers, and 4 patients with hepatocellular carcinoma 
    
   
  
    
      
      Sequel 
      
    
   
  16 
 
  
    EGAD00001009428 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1181A passage 1 on DLP library A108765A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009429 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1182E passage 1 on DLP+ library A108847B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009430 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA610 passage 3 on DLP+ library A110660A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009431 
   
  
    
    Single Cell Genome Sequence for Immortalized breast epithelium BRCA2+/- Tp53-/- cell line 184-hTert L9 116.66 cell line SA1188 on DLP+ library A118357B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009432 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA605 passage 3 on DLP+ library A118368B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009433 
   
  
    
    Single Cell Genome Sequence for Telomerase immortalized breast epithelium cell line 184-hTERT 85.14 p20 cell line AT135 on DLP+ library A118389B 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  4 
 
  
    EGAD00001009434 
   
  
    
    Single Cell Genome Sequence for Immortalized breast epithelium BRCA2+/- Tp53-/- cell line 184-hTert L9 116.66 cell line SA1188 on DLP+ library A118425B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009435 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050B passage 1, patient-derived xenograft SA1050E passage 1 on DLP+ library A118782A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009436 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050B passage 1, patient-derived xenograft SA1050E passage 1 on DLP+ library A118784A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009437 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA610 passage 3, patient-derived xenograft SA1096C passage 1 on DLP+ library A118790A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009438 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1184D passage 1 on DLP+ library A118797B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009439 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1162B passage 1, patient-derived xenograft SA1096B passage 1 on DLP+ library A118804A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009440 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1096B passage 1 on DLP+ library A118808A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009441 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1096C passage 1 on DLP+ library A118808B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009442 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1180C passage 1 on DLP+ library A118812B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009443 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1162B passage 1 on DLP+ library A118814B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009444 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1180C passage 1 on DLP+ library A118816A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009445 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1184D passage 1 on DLP+ library A118857B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009446 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1053F passage 1 on DLP+ library A95663A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009447 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1162SA on DLP+ library A95668A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009448 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 5 on DLP+ library A95670A 
    
   
  
    
      
      NextSeq 550 
      
    
   
  3 
 
  
    EGAD00001009449 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 6 on DLP+ library A95670B 
    
   
  
    
      
      NextSeq 550 
      
    
   
  3 
 
  
    EGAD00001009450 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1096A passage 1 on DLP+ library A95717A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009451 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 6 on DLP+ library A95722A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  7 
 
  
    EGAD00001009452 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 2 on DLP+ library A96109A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009453 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049A passage on DLP+ library A96113A 
    
   
  
    
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001009454 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA609 passage 8 on DLP+ library A96130A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009455 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1162SA on DLP+ library A96142B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009456 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient SA1135, patient SA1162SA on DLP+ library A96154A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009457 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 8 on DLP+ library A96161A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009458 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 2, patient-derived xenograft SA611 passage 3 on DLP+ library A96171A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  7 
 
  
    EGAD00001009459 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 15 on DLP+ library A96173A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  7 
 
  
    EGAD00001009460 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 11 on DLP+ library A96174A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009461 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 7 on DLP+ library A96175A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009462 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 6 on DLP+ library A96177C 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009463 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 6 on DLP+ library A96180A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009464 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 11, patient-derived xenograft SA609 passage 7 on DLP+ library A96187A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  10 
 
  
    EGAD00001009465 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049C passage 1 on DLP+ library A96189A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009466 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1051A passage 1 on DLP+ library A96190B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009467 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1093C passage 1, patient SA1147 on DLP+ library A96192A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009468 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1053B passage 1 on DLP+ library A96194A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009469 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1053E passage 1 on DLP+ library A96194B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009470 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1052D passage 1 on DLP+ library A96200B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009471 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049A passage 1 on DLP+ library A96205B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001009472 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050F passage 1 on DLP+ library A96206B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009473 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1051A passage 1 on DLP+ library A96207A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009474 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1052B passage 1 on DLP+ library A96207B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009475 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1047B on DLP+ library A96210B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009476 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 7 on DLP+ library A96212B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009477 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 6, cell line SA1090 on DLP+ library A96213A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009478 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1105, patient SA1103, patient SA1106, patient SA1104 on DLP+ library A96222A 
    
   
  
    
      
      NextSeq 550 
      
    
   
  11 
 
  
    EGAD00001009479 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA919 passage 7, patient-derived xenograft SA1050A passage 1 on DLP+ library A98181A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009480 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA530 passage 3 on DLP+ library A98240A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  4 
 
  
    EGAD00001009481 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1096A, patient-derived xenograft SA1052D passage 1 on DLP+ library A98243B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009482 
   
  
    
    Intratumoral heterogeneity (ITH) has been linked to decreased efficacy of clinical treatments. However, although genetic, transcriptomic and epigenetic alterations are hallmarks of esophageal squamous cell carcinoma (ESCC), the extent to which these are heterogeneous in ESCC has not been explored in a unified framework. Further, tumor-infiltrating T lymphocytes (TILs) are directed against cancer cells, but how immune infiltration acts as a selective force to shape the clonal evolution of ESCC is unclear. In this study, we perform multi-omic sequencing on 186 samples from 36 primary ESCC patients. Through multi-omics analyses, we find that genomic, epigenomic, and transcriptomic ITH are underpinned by ongoing chromosomal instability. Based on the RNA-seq data, we observe diverse levels of immune infiltrate across different sites of the same tumor. We reveal genetic mechanisms of neoantigen evasion under distinct selection pressure from the diverse immune microenvironment. Overall, our work offers an avenue for dissecting the complex multi-omic contributions to ITH in ESCC and thereby supports the development of clinical therapy. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
    
   
  129 
 
  
    EGAD00001009483 
   
  
    
    Circle-Seq experiment. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009484 
   
  
    
   
  
    
      
      GridION 
      
    
   
  1 
 
  
    EGAD00001009485 
   
  
    
    Contains 4 clonal organoid samples + 1 bulk healthy tissue sample 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009486 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050A passage 1, patient SA1234 on DLP+ library A98279A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001009487 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 2 on DLP+ library A95621B 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009488 
   
  
    
    Single Cell Genome Sequence for Immortalized breast epithelium - BRCA2-/-; Tp53-/- cell line 184-hTERT-22 L9 112.109 cell line SA1055 on DLP+ library A95621A 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001009489 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050D passage 1 on DLP+ library A95717B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  3 
 
  
    EGAD00001009490 
   
  
    
    Oxford Nanopore Technologies (ONT) long-read sequencing of a paired diagnostic and post-therapy medulloblastoma (2 samples). One sample was sequenced on a GridION, the other on a P2 Solo. In both cases the SQK-LSK109 kit was used for library preparation. 
    
   
  
    
      
      GridION 
      
      PromethION 
      
    
   
  2 
 
  
    EGAD00001009491 
   
  
    
    High coverage whole genome sequencing data (total: 22; median coverage: ~56X; range: 27X – 82X) from fresh frozen postmortem tissues harvested from patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD00001009492 
   
  
    
    RNA-Seq data from 20 fresh-frozen postmortem samples from three patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001009493 
   
  
    
    High-coverage whole genome sequencing data (median coverage: 23.5X, range: 14.14X-32.62X) from white blood cells isolated from the buffy coat of blood drawn postmortem from patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001009494 
   
  
    
    Low-pass whole genome sequencing data (median coverage: 0.37X, range: 0.07X-5.8X) from diagnostic formalin-fixed paraffin-embedded tissue and fresh frozen postmortem tissues from ten organs and postmortem blood from patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer. For samples CA27_11, CA34_10, CA35_5, CA35_6, CA36_3, CA36_11, CA63_13, CA63_34, CA76_4, CA76_11, CA79_4, CA83_14 and CA83_26, we subsampled (using `samtools view -s 0.01` after mapping) from the respective high-coverage data in the "CASCADE tumour high-coverage whole genome sequencing data" dataset. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  152 
 
  
    EGAD00001009495 
   
  
    
    The study includes methylC-capture sequencing (MCC-Seq) on 73 cord blood DNA samples from natural pregnancies (control) and from pregnancies achieved through assisted reproductive technologies for infertile couples (ART/infertile). Samples were collected as part of the Quebec-based 3D (Design, Develop, Discover) longitudinal pregnancy cohort study. All data were generated with 100 bp paired-end reads using the Illumina NovaSeq 6000 system. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  73 
 
  
    EGAD00001009496 
   
  
    
    Dataset contains paired-end whole-exome sequencing data from 257 glioma samples from 28 patients. 26 normal blood samples are also included. 
    
   
  
    
   
  283 
 
  
    EGAD00001009497 
   
  
    
    Dataset contains paired-end RNA-seq data from 221 glioma samples. 
    
   
  
    
   
  221 
 
  
    EGAD00001009498 
   
  
    
    Nasal epithelial cells of PCD and non-PCD patients grown at air-liquid interface (ALI) for RNA-seq analysis. The dataset comprises 10 non-PCD patients (ALI day 14), 9 non-PCD patients (ALI day 21), 8 non-PCD patients (ALI day 28), 4 PCD patients (ALI day 14, day 21 and day 28), and 23 PCD patients (ALI day 21). Non-PCD patients and the 4 PCD patients on the three ALI days were sequenced at a depth of 100M reads; the remaining 23 PCD patients were sequenced at a depth of 70M reads. The overall sequencing design was rRNA depletion with 150 bp paired-end reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  65 
 
  
    EGAD00001009499 
   
  
    
    This dataset includes WES, WGS, and RNAseq data generated from autopsy samples. 
    
   
  
    
      
      unspecified 
      
    
   
  347 
 
  
    EGAD00001009500 
   
  
    
    Count matrix from 44 pre-treatment (Ven-Obi or Clb-Obi) and 44 paired, post-treatment relapsed CD19+ B cells. 
    
   
  
    
   
  1 
 
  
    EGAD00001009501 
   
  
    
    Count matrix from 418 pre-treatment (Ven-Obi or Clb-Obi) CD19+ B cells 
    
   
  
    
   
  1 
 
  
    EGAD00001009502 
   
  
    
    Table of treatment arm information for the 418 RNAseq evaluable population. 
    
   
  
    
   
  1 
 
  
    EGAD00001009504 
   
  
    
    10x Genomics 5' library scRNA-seq data for 4 iAMP21 patients 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001009505 
   
  
    
    This dataset contains 29 paired FASTQ files from a whole-genome bisulfite sequencing (WGBS) assay performed on mCRPC tumors. Sequencing was performed using 150 nt paired reads generated by a NovaSeq 6000 instrument. It also contains whole-genome sequencing BAM files aligned to hg38 using BWA from 36 patients, with tumor and matched tumor-adjacent normal samples. These data were generated on the HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001009506 
   
  
    
    Samples encompass primary colorectal tumors or metastases from 75 patients, collected by medical pathologists from surgically removed specimens. Tissues were embedded in optimal cutting temperature (OCT) medium, snap-frozen in liquid nitrogen within 40 minutes of collection and preserved at -80°C. Samples were collected between June 2010 and October 2017 as part of a prospective biobanking project. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  75 
 
  
    EGAD00001009507 
   
  
    
    FASTQ files and Mutect (SNV), Platypus (indel), and InfoGenomeR (SV and CNA) calls from whole genome sequencing data, together with FASTQ files of whole-transcriptome sequencing data, from five patients with pediatric medulloblastoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009508 
   
  
    
    We conducted whole-genome sequencing on blood and buccal specimens from a family with chimerism identified in the two monochorionic dizygotic twins. Blood DNA was obtained from all family members. In addition, we obtained buccal specimens from the chimeric twins. Whole-genome sequencing was conducted on an Illumina NovaSeq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009509 
   
  
    
    Targeted exome sequencing for a panel of 13 CLL driver genes 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  58 
 
  
    EGAD00001009510 
   
  
    
    Differential Presence of Exons in Cell-Free DNA Reveals Different Patterns in Colorectal Cancer Between Metastatic, Non-Metastatic Patients and Healthy Donors. 
159 samples, Illumina sequencing technology. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  159 
 
  
    EGAD00001009512 
   
  
    
    This dataset contains three sets of samples.
The first sample set contains euploid fetus pregnancies reported by NIPTIFY screening test and postnatal evaluation. Dataset was processed similarly to previously published guidelines from KU Leuven, with modifications [1]. Briefly, peripheral blood samples were collected in cell-free DNA BCT tubes (Streck, USA), and plasma was separated with standard dual centrifugation. Cell-free DNA was extracted from 3 ml plasma using MagMAX Cell-Free DNA Isolation Kit (ThermoFisher Scientific). Whole-genome libraries were prepared using the FOCUS (Fragmented DNA Compact Sequencing Assay, Competence Centre on Health Technologies, Estonia) NIPT method protocol with 12 cycles for the final PCR enrichment step. In the following quantification, equal amounts of 36 samples were pooled, and the quality and quantity of the pool were assessed on Agilent 2200 TapeStation (Agilent Technologies, USA). Whole genome sequencing was performed on the NextSeq 550 instrument (Illumina Inc.) with an average coverage of 0.32× (minimum 0.08 and maximum 0.42) and producing 85 bp single-end reads. 
The second sample set contains a single NIPT sample postnatally diagnosed with Prader-Willi syndrome. The sample was sequenced with Illumina NextSeq 500 platform, producing 85 bp single-end reads with an average per-sample coverage of 0.32× at the University of Tartu, Institute of Genomics Core Facility, according to the manufacturer’s standard protocols, as described previously [2]. 
The third sample set contains samples SC005 (SeraCare Life Sciences Inc lot #10446565), SC0042 (#10571706), and SC016 (#10560229). These are SeraCare Life Sciences Inc circulating cell-free DNA (ccfDNA)-like mixtures of human genomic DNA consisting of matched maternal and fetal DNA. SC005 and SC0042 consist of matched DNA from a mother and a fetus with DiGeorge syndrome. SC016 is a custom-ordered DNA mix in which the fetal DNA has a pathogenic loss of the terminal region of 20p13 and a pathogenic 3q29 duplication. SC016 was processed in the same way as the first sample set, and SC0042 in the same way as the second sample set. Sample SC005 was processed twice: once as in the first sample set and once as in the second.
This study was performed with the approval of the Research Ethics Committee of the University of Tartu (#352/M-12).
1. Bayindir B, Dehaspe L, Brison N, Brady P, Ardui S, Kammoun M, et al. Noninvasive prenatal testing using a novel analysis pipeline to screen for all autosomal fetal aneuploidies improves pregnancy management. Eur J Hum Genet. 2015;23: 1286–1293. doi:10.1038/ejhg.2014.282
2. Žilina O, Rekker K, Kaplinski L, Sauk M, Paluoja P, Teder H, et al. Creating basis for introducing noninvasive prenatal testing in the Estonian public health setting. Prenat Diagn. 2019;39: 1262–1268. doi:10.1002/pd.5578 
    
   
  
    
      
      NextSeq 550 
      
    
   
  377 
 
  
    EGAD00001009513 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 8 on DLP+ library A96141A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  212 
 
  
    EGAD00001009514 
   
  
    
    These are the raw subreads BAM files for the PacBio Iso-Seq data 
    
   
  
    
      
      NextSeq 500 
      
      PacBio RS II 
      
      Sequel 
      
    
   
  30 
 
  
    EGAD00001009516 
   
  
    
    This data set includes bam files (aligned to hg38) from the germline of parents whose children have CHEK2 germline mutations. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  48 
 
  
    EGAD00001009517 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA533 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009518 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA409 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009519 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA420 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009520 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA296 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009521 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA597 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009522 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA232 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009523 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA211 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009524 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA230 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009525 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA234 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009526 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA278 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009527 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA101 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009528 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA212 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009529 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA214 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009530 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA224 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009531 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA225 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009532 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA226 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009533 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA228 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009534 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA229 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009535 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA231 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009536 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA233 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009537 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA237 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009538 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA271 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009539 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA277 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009540 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA284 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009541 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA399 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009542 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA294 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009543 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA285 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009544 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA101 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009545 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA211 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009546 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA212 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009547 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA214 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009548 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA224 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009549 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA225 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009550 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA226 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009551 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA228 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009552 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA229 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009553 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA230 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009554 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA231 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009555 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA232 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009556 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA233 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009557 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA234 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009558 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA237 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009559 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA271 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009560 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA277 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009561 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA278 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009562 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA284 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009563 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA285 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009564 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA290 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009565 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA294 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009566 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA296 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009567 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA399 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009568 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA400 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009569 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA409 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009570 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA420 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009571 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA533 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009572 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA597 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009573 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA997 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009574 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA998 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009575 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA416 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009576 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA400 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009577 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA290 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009578 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA288 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009579 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA299 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009580 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA095 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009581 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA415 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009582 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA718 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009583 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA720 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009584 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA718 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009585 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA720 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009586 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1017 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009587 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1026 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009588 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1027 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009589 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1028 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009590 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1040 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009591 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1064 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009592 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1065 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009593 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1069 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009594 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1073 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009595 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1074 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009596 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA576 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009597 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA610 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009598 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA992 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009599 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA994 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009600 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA095 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009601 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA288 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009602 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA299 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009603 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA415 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009604 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA666 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009605 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1065 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009606 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1064 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009607 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1026 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009608 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1069 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009609 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1017 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009610 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1027 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009611 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1073 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009612 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1074 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009613 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA576 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009614 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1028 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009615 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1040 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009616 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA992 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009617 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA610 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009618 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA416 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009619 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1070 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009620 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1070 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009621 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA997 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009623 
   
  
    
    The data set contains information from 9 individuals (5 ALS + 4 controls) using single cell RNA sequencing in combination with TCR V(D)J sequencing to study the immune profile of the central nervous compartment (CSF). Sequencing was done using the 10x Genomics platform (5’ scRNAseq & V(D)J Reagent Kits v1.1). 5P and TCR libraries were then pooled for sequencing on the NovaSeq sequencer. Provided files are in .fastq.gz format, and four files are available per individual (Read 1&2 and Lane 1&2). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD00001009624 
   
  
    
    High-coverage whole genome sequencing of 38 samples was done on a patterned flowcell v.2.5 (150 bp paired end, HiSeq X Ten) with coverage of about 60x for the tumor and whole blood control samples. All tumors had a tumor cell content of ≥60%. Sequencing libraries were prepared using the TruSeq DNA Nano kit (Illumina) according to the manufacturer’s instructions and size selected using SPRI beads (Beckman Coulter Genomics). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  18 
 
  
    EGAD00001009625 
   
  
    
    Small RNA sequencing data (TruSeq small RNA library preparation kit v2) from serum samples and tumor tissue of orthotopically injected mice (SH-SY5Y cell line) and unengrafted mice, treated with idasanutlin, temsirolimus and vehicle control. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  128 
 
  
    EGAD00001009626 
   
  
    
    Whole genome sequencing of 14 cases of low-grade ovarian serous carcinoma with matched normal DNA 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  28 
 
  
    EGAD00001009627 
   
  
    
    For WGS, DNA of tumor or control samples was prepared for paired sequencing using the Illumina TruSeq DNA Nano Kit and sequenced on NovaSeq 6000. For RNA-Seq, the Illumina TruSeq stranded mRNA kit was used with the same sequencer. There are 4 samples for WGS (18 runs) and 4 samples for RNA (4 runs) available. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001009628 
   
  
    
    This dataset includes the FASTQ files from capture-based targeted high-throughput sequencing of bulk, monocytic and progenitor subfractions of the PHENOMUT11 sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009629 
   
  
    
    This dataset includes the FASTQ files from Mission Bio DNA+Protein single-cell multiomic sequencing of 11 NPM1-mutated AML diagnostic samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001009630 
   
  
    
    Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (Illumina) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001009631 
   
  
    
    Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (PacBio) 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001009632 
   
  
    
    Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (ONT) 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001009633 
   
  
    
    This dataset has the processed WGS data for the cancer models in CCMA. 
    
   
  
    
   
  148 
 
  
    EGAD00001009634 
   
  
    
    The dataset contains samples of 11 CRC patients (2 samples for each patient, tumor and normal adjacent tissue site, 22 samples in total).
The dataset consists of paired-end FASTQ files from 10x single-cell RNA-Seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001009635 
   
  
    
    The dataset contains samples of 30 CRC patients (3 samples for each patient, tumor and 2 normal adjacent tissue sites, 90 samples in total).
The dataset consists of paired-end FASTQ files from bulk RNA-Seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  90 
 
  
    EGAD00001009636 
   
  
    
    miRNA sequencing data, single-end, produced by an Illumina NextSeq 500 sequencer. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  216 
 
  
    EGAD00001009639 
   
  
    
    WES dataset obtained using Illumina HiSeq 2500, Swift Bioscience library kit, paired reads. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001009641 
   
  
    
    Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform WGS analysis of a panel of mesothelioma cell lines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001009642 
   
  
    
    Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform RNAseq analysis of a panel of mesothelioma cell lines. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  21 
 
  
    EGAD00001009643 
   
  
    
    This study presents Whole Genome Sequencing results from the Anson Street African Burial Ground Project, which is a community-based initiative aimed at understanding the histories of 37 Ancestors in Charleston, South Carolina. Here we report fastq files for all 37 Ancestors. DNA was extracted at the University of Tennessee-Knoxville following Dabney et al. 2013, and dual index libraries were prepared using a modified NEBNext Ultra II kit with partial USER enzyme digestion. Libraries were then enriched for human genomic DNA (MyBaits) and sequenced on Illumina platforms. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  31 
 
  
    EGAD00001009644 
   
  
    
    Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1105, patient SA1106 on DLP+ library A96168B 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  5 
 
  
    EGAD00001009645 
   
  
    
    Single Cell Genome Sequence for Immortalized lymphoblastoid cell line GM18507 cell line SA928, Triple negative breast cancer patient-derived xenograft SA609 passage 2 patient-derived xenograft SA609 passage 2 on DLP+ library A96228A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  6 
 
  
    EGAD00001009646 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA530 passage 3 on DLP+ library A98247A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001009647 
   
  
    
    We performed targeted NGS on follicular lymphoma samples at diagnosis, explored clinico-genetic correlations, and assessed four clinical or clinicogenetic risk models (FLIPI, FLIPI-2, PRIMA-IP or m7-FLIPI-molecular score) in patients with symptomatic FL who received frontline immunochemotherapy. Out of 191 patients with FL grade 1-3a, 109 were successfully genotyped. Treatment consisted of rituximab (R) plus CVP/CHOP (72.5%) or R-bendamustine (R-B) (27.5%). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  109 
 
  
    EGAD00001009648 
   
  
    
    bam files of sc-RNA and sc-BCR sequencing of multiple myeloma and precursors from 65 samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD00001009649 
   
  
    
    The cold pressor test (CPT) is a widely used pain provocation test to investigate both pain tolerance and cardiovascular responses. Twenty-two females were phenotypically assessed before and after a CPT, and blood samples were taken for RNA-sequencing. Files were processed and quantified with kallisto v0.42.5 using the human reference transcriptome (Gencode Release 28). Count data were rlog-transformed. 
    
   
  
    
   
  1 
 
  
    EGAD00001009650 
   
  
    
    This data set includes RNAseq from 38 follicular lymphoma tumours. All tumours were fresh frozen. Libraries were constructed by enriching for poly-A transcripts and sequenced as 75bp paired end reads on an Illumina HiSeq 2500 instrument. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  38 
 
  
    EGAD00001009651 
   
  
    
    This submission includes targeted and whole exome paired-end fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  866 
 
  
    EGAD00001009652 
   
  
    
    Whole Exome Sequencing Dataset (CRAM files) of 415 admixed Brazilians with Covid-19 extreme phenotypes from recovered nonagenarians and centenarians to deceased adults. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  415 
 
  
    EGAD00001009653 
   
  
    
    Targeted sequencing of a biobank of PDOs, PDXs and LMHs 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  302 
 
  
    EGAD00001009654 
   
  
    
    This dataset includes Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) data, in FASTQ format, from 70 metastatic castration-resistant prostate cancer tumor samples from the SU2C/PCF West Coast Dream Team (WCDT) project. The sequencing data is paired-end, 150 bp sequencing data from an Illumina NovaSeq 6000 machine. ATAC-seq libraries were prepared following the protocol described in Buenrostro et al. Nature Methods. 2013 (PMID: 24097267) and Corces et al. Nature Methods. 2017 (PMID: 28846090). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  70 
 
  
    EGAD00001009655 
   
  
    
    scRNA-seq of monocultures and co-cultures of patient-derived PDAC organoids and matched CAFs. 3 sample sets per patient. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001009657 
   
  
    
    2 patient-derived xenograft tumours and associated normal blood samples. Duplicated samples for each gave 8 pairs of fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009658 
   
  
    
    RNAseq from PDX tumours under treatment with DPBS or eribulin. Sarcomatous or mixed sarcoma/carcinoma. 6 PDX tumours each with 2 treatments gave 12 pairs of fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001009659 
   
  
    
    Original patient tumours from which PDX models were derived. TruSight Oncology RNA panel for 2 samples, sequenced over 4 lanes each, gave 8 pairs of fastq files 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  8 
 
  
    EGAD00001009660 
   
  
    
    5 samples as fastq file pairs. 1 solid tumour sample from patient #1105 with matched blood, and 2 solid tumour samples from patient #1177 with matched blood. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001009661 
   
  
    
    Shallow sequencing of organoid/xenograft or human colorectal metastases 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  302 
 
  
    EGAD00001009662 
   
  
    
    Single cell RNA Seq of: 193 MCSP+ DCC isolated from SLNs of melanoma patients, 9 MCSP+ cells isolated from LNs of non-melanoma patients, 14 melanocytes from a healthy donor. Bulk RNA Seq of 10 samples from 4 DCC-PDX-derived cell lines. Sequencing on NovaSeq6000.  Fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  226 
 
  
    EGAD00001009663 
   
  
    
    Nanopore low-pass WGS of human brain tumors for evaluation of DNA methylation-based classification of cancer 
    
   
  
    
      
      MinION 
      
    
   
  16 
 
  
    EGAD00001009664 
   
  
    
    RNA-seq for fusion gene discovery of human astroblastomas 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001009666 
   
  
    
    The incidence of non-melanoma skin cancer is 17-fold lower in Singapore compared to the UK [1], despite Singapore receiving 2-3 times more year-round ultraviolet radiation (UV) [2,3]. The ageing epidermis of the skin comprises competing somatic mutant clones [4,5], from which such cancers develop. We question if differences in keratinocyte skin cancer incidence are reflected in the mutational landscape by comparing ageing facial epidermis from donors of Singapore and the UK. We find UK skin to be a highly competitive, densely mutated landscape with 4-fold greater mutation burden compared to Singaporean skin and differences in clonal selection by country. We disproportionately observe multiple features common to keratinocyte skin cancers [6,7,8] in UK skin, such as UV mutagenesis, copy number aberration and hotspot mutations (in particular TP53 R248W). We conclude that keratinocyte skin cancer incidence is reflected in the somatic clones of non-cancerous epidermis. Finally, we re-analyse squamous cell carcinoma exomes from Korea [9] to show that, even in low incidence populations, carcinogenesis is driven by UV damage. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  191 
 
  
    EGAD00001009667 
   
  
    
    Human clonal intestinal organoids were treated with 1 µM MMF (Roche), 20 µM GCV (Hainan Poly Pharm Co Ltd), or combinations (1 µM MMF + 20 µM GCV or 1 µM MMF + 40 µM GCV) continuously for 4-6 weeks. DNA was extracted after drug treatment. WGS was performed with 150 bp PE sequencing at 30X using an Illumina NovaSeq sequencer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001009668 
   
  
    
    This dataset contains raw exome sequencing data from nine sinonasal undifferentiated carcinoma FFPE samples and matched normal tissue that were assigned to a shared epigenetic class using DNA methylation-based classification. They were analyzed using the Twist Human Core Exome Plus Kit (Twist Bioscience) on a NovaSeq 6000 sequencer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001009669 
   
  
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001009670 
   
  
    
    Sequencing data of 20 tumor runs (different tumors), which were uploaded to EGAS00001004813 and used in the ImmuNeo publication. The sequencing was always paired and run on Illumina HiSeq sequencers. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001009671 
   
  
    
    Sequencing data of 39 tumor and control runs (different tumors and blood controls), which were uploaded to EGAS00001004813 and reused in this ImmuNEO publication. The sequencing was always paired. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001009672 
   
  
    
    Whole genome sequencing of normal sample for triple negative breast cancer patient SA1058 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009673 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1058 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009674 
   
  
    
    Whole genome sequencing of tumour sample for triple negative breast cancer patient SA998 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001009675 
   
  
    
    RNA-Seq transcriptome data is only for academic use. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001009676 
   
  
    
    RNA-Seq data for both Academic and For-profit use 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  221 
 
  
    EGAD00001009677 
   
  
    
    There are two datasets: 
1. scRNA-seq of human cutaneous immune cells from psoriasis patients. These include pre- and post-Tildrakizumab treated patients and come in a BAM file format.
19006FL-25-01
19006FL-38-01
19006FL-32-01-03
19006FL-33-01
19006FL-28-01-05
19006FL-35-01-01
2. RNA-seq of ZFP36L2 CRISPR-deleted human T cells, in FASTQ file format.
19006XR-30-05
19006XR-30-04
19006XR-30-02
19006XR-30-01
19006XR-26-05
19006XR-26-04
19006XR-26-02
19006XR-26-01
19006R-22-04
19006R-22-08
19006R-22-05
19006R-22-01 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001009678 
   
  
    
    This dataset contains germline variants (in .vcf format) from six pediatric cancer patients (sample IDs D1 - D6). WES data of the children and their parents was mapped to hg38. A consensus of four variant callers was used to obtain germline variants of the children. 
    
   
  
    
   
  6 
 
  
    EGAD00001009679 
   
  
    
    Subset of 11 samples (RNA-Seq and WGS) from study EGAS00001005973, which were published earlier and are linked here to study EGAS00001006538 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001009680 
   
  
    
    Paired RNA sequencing of 30 RRMM samples using the Illumina TruSeq stranded mRNA kit and either HiSeq 2000 or HiSeq X Ten for sequencing. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  30 
 
  
    EGAD00001009681 
   
  
    
    Paired scRNA sequencing of 2 RRMM (relapsed refractory multiple myeloma) samples, using 10x Genomics library preparation and Illumina HiSeq 4000 for sequencing 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001009682 
   
  
    
    Mostly paired WGS data of RRMM: 45 samples (tumors and controls) in 86 runs. The data were produced using Illumina TruSeq Nano DNA and NovaSeq 6000 or HiSeq X Ten for sequencing. One tumor/control pair is WES data using Agilent SureSelect V5+UTRs and NovaSeq 6000 for sequencing. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD00001009683 
   
  
    
    scATAC sequencing was performed on 29 samples of RRMM tumors using 10x Genomics for library preparation and NovaSeq 6000 for sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD00001009684 
   
  
    
    A pan-cancer cohort of 1031 patients resistant to systemic therapies or with no approved therapeutic options. It includes whole-exome sequencing of 571 tumor and matched-normal samples, and transcriptome sequencing of 947 tumor samples. Biopsies were taken at entry into precision medicine trials, often after diagnosed resistance. Comprehensive clinical information is available for all patients and includes patient age at biopsy, tumor primary site and histological subtype, biopsy site, treatments received prior to biopsy, blood assessment results at biopsy, metastatic sites at biopsy, and survival time from the biopsy date. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  2089 
 
  
    EGAD00001009686 
   
  
    
    This dataset contains both snRNA-seq and bulk RNA-seq data from 19 different patients, comprising 9 healthy controls (4 Spinal Cord, 5 Motor Cortex) as well as 5 C9ALS patients and 5 sporadic ALS patients, each with paired data from the spinal cord and motor cortex. For the snRNA-seq data, fastq files containing the raw reads are provided. Many of these samples were pooled for sequencing and therefore require SNP-based demultiplexing with a tool such as freemuxlet. The paired bulk RNA-seq data (raw fastq files) can be used to acquire the ground truth per patient; a list of pooled samples can be found in map.txt. snRNA-seq data was produced using the 10X Genomics 3' v3 kit and bulk RNA-seq was produced using the Illumina TruSeq v2 kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001009687 
   
  
    
    scRNAseq dataset containing 5 healthy donors and 4 asthmatic donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009688 
   
  
    
    In this study single cell RNA-Seq data was used to train a deconvolution algorithm. The algorithm was validated on paired bulk RNA-Seq profiles. 
    
   
  
    
   
  4 
 
  
    EGAD00001009689 
   
  
    
    FAST5 original nanopore data from MinION sequencing of 10 tumor samples 
    
   
  
    
   
  10 
 
  
    EGAD00001009690 
   
  
    
    Reads were aligned to 1000 Genomes assembly reference (hs37d5) using minimap2 2.22. SAM-to-BAM conversion, BAM sorting and indexing were performed with SAMtools 1.13. Read summarization was performed with featureCounts (from Subread 2.0.3) over exon features based on GENCODE Version 19 gene models. Strand specific counting was used. 
    
   
  
    
   
  10 
 
  
    EGAD00001009691 
   
  
    
    6 organoids transcriptomic profiles 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001009692 
   
  
    
    Mutation calls from 68 MM patients, obtained with Mutect2. All tumor samples were CD138+ cells at diagnosis and all control samples were PBMCs. 
    
   
  
    
   
  68 
 
  
    EGAD00001009693 
   
  
    
    This dataset is composed of NGS data from 33 XP patients studied by WGS. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  65 
 
  
    EGAD00001009694 
   
  
    
    We profiled CD45- enriched, viable cells from GBM (n = 7) and IDH-MUT (n = 7) primary samples with multi-modality single-cell sequencing of scDNAme (by reduced representation bisulfite sequencing [RRBS]) and scRNAseq (Smart-seq2). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  2989 
 
  
    EGAD00001009695 
   
  
    
    Targeted sequencing data to look for the involvement of genes in the RAS-MAPK pathway, angiogenesis and brain vascular disorders among others, in brain AVMs 
    
   
  
    
      
      unspecified 
      
    
   
  30 
 
  
    EGAD00001009696 
   
  
    
    Whole genome sequencing data of brain AVM endothelial and non-endothelial cell fractions, as well as paired blood samples 
    
   
  
    
      
      unspecified 
      
    
   
  31 
 
  
    EGAD00001009697 
   
  
    
    Data includes whole exome sequenced bam files for matched tumor-normal pairs from the study. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  90 
 
  
    EGAD00001009698 
   
  
    
    Data from samples that are marked for academic use only 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001009699 
   
  
    
    Data from samples that are marked for both academic and for-profit use. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  425 
 
  
    EGAD00001009700 
   
  
    
    WES for Patient 9 to 14 of NIBIT-M4 clinical trial 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001009701 
   
  
    
    WES for Patient 1 to 8 of NIBIT-M4 clinical trial 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  8 
 
  
    EGAD00001009702 
   
  
    
    RNAseq for Patients of NIBIT-M4 clinical trial 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  14 
 
  
    EGAD00001009703 
   
  
    
    RRBS for Patients of NIBIT-M4 clinical trial 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  14 
 
  
    EGAD00001009704 
   
  
    
    Chronic obstructive pulmonary disease (COPD) is a major respiratory disease characterized by small airway inflammation, emphysema and severe breathing difficulties. Low-grade systemic inflammation is an established hallmark of severe disease, however, the molecular changes in peripheral immune cells remain far from understood. We combined multi-color flow cytometry with single-cell RNA sequencing and showed that blood neutrophil numbers are significantly increased in COPD and they are a heterogeneous population. A transcriptomic state that expressed interferon response genes correlated with alveolar damage and acute exacerbations. Furthermore, bronchoalveolar neutrophils expressed gene signatures corresponding to certain blood neutrophil states. Last, our data in a murine model of cigarette smoke exposure demonstrated that bone marrow neutrophil progenitors are expanded in smoke-treated animals and display signs of immune activation. Our study provides evidence that COPD systemic inflammation may derive from an activated haematopoietic precursor compartment. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009705 
   
  
    
    Paired Exome sequencing of 34 samples (tumors and controls) of different tumors. The samples were prepared using Agilent SureSelect V5+UTRs, the sequencing was done on Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  34 
 
  
    EGAD00001009706 
   
  
    
    Paired RNA sequencing data (21 runs/ 17 samples) of different tumors. The samples were prepared using the Illumina TruSeq stranded mRNA Kit. The sequencing was done on Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001009707 
   
  
    
    RRBS data from TRACERx non-small cell lung cancer (NSCLC) tumours and matched normal adjacent tissue.
TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early stage non-small cell lung cancer (NSCLC). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  155 
 
  
    EGAD00001009709 
   
  
    
    We profiled 111 patient medulloblastoma primary tumor samples by bulk RNA-seq (19 samples), 27ac (98 samples) / 27me3 (61 samples) ChIP-Seq, WGS (4 samples) and 27ac HiChIP (8 samples). Submitted data consists of data generated from previously unpublished tumors as well as complementary data for data sets already published for identical medulloblastoma tumors (e.g., 27me3 ChIP-Seq and RNA-Seq data submitted for a tumor with publicly available WGS data). The raw fastqs and hg19 aligned RNA-Seq bams are provided. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  111 
 
  
    EGAD00001009710 
   
  
    
    This data set contains the CRAM files for the samples in the CHILD cohort, sequenced on the Illumina HiSeq X platform. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  604 
 
  
    EGAD00001009711 
   
  
    
    244 infected single-cell alveolar bam files, 48 empty well bam files, and 52 RNA sequencing of amplicons (4 SARS-CoV-2 variants with 12 batches and 4 viral variants pool samples).
244 alveolar single cells were captured over 12 experimental batches; the experimental conditions are recorded in the metadata file "infected_cells_final_revision.csv" on GitHub (https://github.com/twkim-0510/SARS-CoV-2_viral_competition). Each bam file name corresponds to the sample_name column of the metadata. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  344 
 
  
    EGAD00001009712 
   
  
    
    RNAseq of a bank of human primary and metastatic colorectal cancer samples 
    
   
  
    
      
      unspecified 
      
    
   
  119 
 
  
    EGAD00001009713 
   
  
    
    Dataset contains mRNA capture sequencing data from plasma of 266 different human donors. The first, pan-cancer, cohort covers 25 high-grade to metastatic cancer types (8 cancer patients per type) and a control group (8 healthy donors). The validation cohort comprises additional plasma samples from ovarian, prostate and uterine cancer patients (12 per type) as well as additional samples from controls (22 new and 8 repeated). Samples were sequenced on a NovaSeq 6000 and are provided in FASTQ format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  274 
 
  
    EGAD00001009714 
   
  
    
    2 paired WGS samples of peritumour regions of colorectal cancer (2 patients). The library was prepared using the Illumina TruSeq Nano FFPE kit and the sequencing was done on NovaSeq6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001009715 
   
  
    
    Exome sequencing data from seven phenotypically abnormal human fetal samples. Analysis performed using Illumina NovaSeq 6000 and the Twist Bioscience Human Comprehensive Exome. Paired end fastq files were aligned to hg38 reference genome using BWA-MEM v0.7.15, followed by sorting using SAMtools sort v1.3.1, and duplicate reads marked using Picard Tools MarkDuplicates v2.18.2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001009718 
   
  
    
    This dataset consists of 39 noncancerous donor and 62 cancer patient plasma samples (including 29 patients with CRC across a total of 13 tumor types) that were analyzed with the PGDx elio plasma resolve assay. The PGDx elio plasma resolve assay is a hybrid capture approach targeting 33 genes with sequencing performed using the Illumina NextSeq with 150bp paired-end reads. The bam files provided have been adapter masked and contain duplicate reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  101 
 
  
    EGAD00001009719 
   
  
    
    The dataset "PGDx elio™ plasma resolve assay: targeted sequencing analyses of plasma cfDNA" includes paired end FASTQ reads of 183 cfDNA samples from metastatic colorectal cancer patients. Sequencing was performed using a panel consisting of 33 genes, covering over 237,000 bp, targeting 25,000x depth across the targeted regions. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  183 
 
  
    EGAD00001009720 
   
  
    
    The dataset "PGDx elio™ tissue complete assay: targeted sequencing analyses of tissue DNA" includes paired end FASTQ reads of 28 tissue samples from metastatic colorectal cancer patients. Sequencing was performed using a panel consisting of 505 genes, targeting 2,500x depth across the targeted regions. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD00001009721 
   
  
    
    The dataset "PGDx elio™ plasma resolve assay: targeted sequencing analyses of WBC DNA" includes paired end FASTQ reads of 49 white blood cell (WBC) genomic DNA samples from metastatic colorectal cancer patients. Sequencing was performed using a panel consisting of 33 genes, covering over 237,000 bp, targeting 25,000x depth across the targeted regions. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  49 
 
  
    EGAD00001009722 
   
  
    
    Paired end FASTQ files and multisample VCF file of 119 Iberian Roma whole exome sequence data (Illumina sequencing) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  119 
 
  
    EGAD00001009723 
   
  
    
    Dataset containing the FASTQ files of RNA (scr*) and TCR (vdj*) sequencing of 17 bronchoalveolar lavage fluid samples collected from ICI pneumonitis (n=11) and control (n=6) patients.  To comply with GDPR regulations, please note that individual sample identifiers used for this data deposit (alphabetical ID) are different from and cannot be traced back to the patient identifiers used throughout the manuscript (numerical ID). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001009724 
   
  
    
    mRNA capture sequencing and small RNA sequencing data (FASTQ files) of the exRNAQC study phase 2 (interaction study) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  180 
 
  
    EGAD00001009725 
   
  
    
    ctDNA data for IMpower150, including individual mutation calls (one mutation per sample per line), sample list including ctDNA status (one sample per line), and patient-level ctDNA summaries called ctDNA features (one patient per line). 
    
   
  
    
   
  - 
 
  
    EGAD00001009726 
   
  
    
    Clinical data for IMpower150 (one patient per line): anonymized_patient_id, train_test_split, ctDNA_status, ARM1, OS_months, OS_event, PFS_months, PFS_event, TTEOS_rebaseline_BL, TTEPFS_rebaseline_BL, TTEOS_rebaseline_C2D1, TTEPFS_rebaseline_C2D1, TTEOS_rebaseline_C3D1, TTEPFS_rebaseline_C3D1, TTEOS_rebaseline_C4D1, TTEPFS_rebaseline_C4D1, TTEOS_rebaseline_C8D1, TTEPFS_rebaseline_C8D1, pdl1_high, number_metastatic_sites, baseline_ECOG, age, sex_female, history_of_tobacco_use, sld_baseline, sld_wk6, sld_percent_change_bl_to_wk6, sld_difference_bl_to_wk6, AGEGRP, tumor_assessment_week_6, tumor_assessment_week_12, tumor_assessment_week_18, tumor_assessment_week_24, PFS_days, days_between_randomization_c3 
    
   
  
    
   
  - 
 
  
    EGAD00001009727 
   
  
    
    Clinical data from AVANT: Clinical data include race, age, sex, baseline ECOG, tumor stage, node status, treatment arm, KRAS and BRAF mutation status, tumor location, consensus molecular subtype, overall survival and disease free survival for 797 patients across AVANT. 
    
   
  
    
   
  1 
 
  
    EGAD00001009728 
   
  
    
    RNAseq FASTQ files from 797 tumors from AVANT. Sequencing libraries were generated with the TruSeq Stranded Total RNA kit (Illumina) following ribosomal RNA (rRNA) depletion with the Ribo-Zero Gold kit (Illumina). The libraries were sequenced on the HiSeq4000 (Illumina) with a sequencing protocol of 75 bp paired-end sequencing. Note: 10 samples used in the original publication were excluded from this upload due to regulations from the Human Genetics Resources Administration of China (HGRAC). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  797 
 
  
    EGAD00001009730 
   
  
    
    Paired RNA-Seq of four patients with advanced Parathyroid carcinoma (PC). The library was prepared using the Illumina TruSeq stranded mRNA Kit, the sequencing was done either on an Illumina HiSeq 4000 or on Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001009731 
   
  
    
    Paired WGS data of four patients with advanced Parathyroid carcinoma (PC). There are tumor/control pairs (buffy coat control). The library was prepared with Illumina TruSeq Nano DNA, the sequencing was done with HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  8 
 
  
    EGAD00001009732 
   
  
    
    The gut microbiota composition is unique to every individual but is shaped by common factors including diet, lifestyle, medication use, early-life determinants, living environment or genetics. Most of these factors may be influenced by ethnicity. This study explored variations in fecal microbiota composition in 6048 individuals with different ethnic backgrounds living in the same geographical area (Amsterdam, the Netherlands).
The HELIUS data are owned by the Amsterdam University Medical Centers, location AMC in Amsterdam, The Netherlands. To allow sharing of microbiome data collected in HELIUS with (inter)national researchers, the 16S rRNA sequence data have been stored at the European Genome-phenome Archive (EGA; accession code EGAD00001004106). Access to these data must be granted, also because the HELIUS data are stored with relevant phenotypical variables. Access is granted to all researchers affiliated with an internationally recognized research institution who request to use the HELIUS data within the EGA context, after having signed the data transfer agreement. Any researcher can request the data by submitting a proposal to the HELIUS Executive Board as outlined at http://www.heliusstudy.nl/en/researchers/collaboration, by email: heliuscoordinator at amsterdamumc dot nl. The HELIUS Executive Board will check that proposals do not conflict with the ethical approvals and informed consent forms of the HELIUS study. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  3885 
 
  
    EGAD00001009733 
   
  
    
    This data set contains KiCS cancer panel data for academic and for-profit use. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001009734 
   
  
    
    This data set contains KiCS cancer panel data for academic and for-profit use. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  521 
 
  
    EGAD00001009735 
   
  
    
    Files from Tapestri snDNA-seq of archival tissue samples from 16 pancreatic ductal adenocarcinoma (PDAC) patients. Matched bulk sequencing (whole-exome, whole-genome, MSK-IMPACT) data are attached for a subset of the patients. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD00001009736 
   
  
    
    This dataset contains RNA sequencing information for Chronic Myeloid Leukemia. In total 2 single-end RNA-seq tumor cell line samples are present. 
    
   
  
    
      
      Sequel 
      
    
   
  2 
 
  
    EGAD00001009737 
   
  
    
    The .cram files of the Trio or Quad sequencing data used for generation of the genomic autopsy study. This contains a mix of genome sequencing and exome sequencing data for probands and their parents. A subset of families (n=32) did not provide consent to publicly sharing their data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  156 
 
  
    EGAD00001009738 
   
  
    
    Whole genome sequencing data of 5 High-grade serous carcinoma (HGSC) patients (6 samples) sequenced with BGI. 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001009739 
   
  
    
    Phenotype data from pregnant mothers unexposed and exposed to the Rwandan genocide from 59 whole blood samples. 
    
   
  
    
   
  1 
 
  
    EGAD00001009741 
   
  
    
    Two primary tumor-derived PDAC organoids were subjected to SNP array, RNA-seq, and single-cell WGS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001009742 
   
  
    
    16 additional samples 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  16 
 
  
    EGAD00001009743 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009744 
   
  
    
    This dataset contains RNA-sequencing of Bone marrow-derived CD34+ cells from Healthy Controls (n=2) and SLE patients (n=10). 
SLE patients are divided into two categories based on severity: patients with moderate/mild disease (n=4) and patients with severe disease (n=6). 
Libraries were generated using the Illumina TruSeq Sample Preparation kit v2. Single-end 75-bp mRNA sequencing was performed on Illumina NextSeq 500. The raw fastq files are uploaded. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001009745 
   
  
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009746 
   
  
    
    Whole genome sequencing of high-grade serous ovarian cancer (HGSC) tumours and matched normals from 15 patients with homologous recombination deficiencies. The dataset includes fastq files from 56 HGSC tumours (1 primary, 1 relapse, 54 end-stage) and 15 matched normals.
Sequence libraries were generated from tumour and matched normal genomic DNA using the KAPA HyperPrep PCR-free library preparation kit (Roche), or the Illumina TruSeq DNA Nano kit according to manufacturer’s instructions. Sequencing was carried out by the Kinghorn Centre for Clinical Genomics Sequencing Laboratory (Sydney, Australia) on the HiSeq X Ten System (Illumina) or by the Australian Genome Research Facility (Melbourne, Australia) on an Illumina NovaSeq to a minimum base coverage of 30-fold for normal DNA and 60-fold for tumour DNA samples. 
    
   
  
    
      
      unspecified 
      
    
   
  66 
 
  
    EGAD00001009747 
   
  
    
    Targeted DNA sequencing of high-grade serous ovarian cancer (HGSC) tumour and normal samples from 15 patients with homologous recombination deficiencies. The dataset includes fastq files from 243 HGSC tumours (15 primary, 3 relapse, 225 end-stage) and 15 normals from 15 HGSC patients.
Following target hybrid capture of 63 genes involved in DNA repair and response to treatment with an Agilent SureSelect XT panel, sequencing libraries were generated using the SureSelect XT Low Input Target Enrichment System (Agilent) as per the manufacturer's protocol. Libraries were sequenced on an Illumina NextSeq 500 at the Peter MacCallum Cancer Centre (Melbourne, Australia). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  266 
 
  
    EGAD00001009748 
   
  
    
    This dataset comprises clinical-epidemiological (CE) data from an Erasmus MC cohort of 151 individuals who tested positive for COVID-19. 
    
   
  
    
   
  151 
 
  
    EGAD00001009749 
   
  
    
    This dataset includes 4 samples profiled by high-throughput Illumina sequencing, in bam format, aligned to GRCh37. Human patient T-ALL samples were serially propagated as xenografts in immunodeficient mice. The samples were collected after development of frank leukemia in recipient mice. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD00001009750 
   
  
    
    This dataset includes 90 samples profiled by high-throughput Illumina sequencing, in bam format, aligned to GRCh37. Normal human CD34+ cord blood (CB), bone marrow (BM), or post-natal thymus (PNT) cells were transduced with various combinations of T-ALL oncogenes and cultured in vitro. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  90 
 
  
    EGAD00001009751 
   
  
    
    This dataset includes 73 samples profiled by high-throughput Illumina sequencing, in bam format, aligned to GRCh37. Normal human CD34+ cord blood (CB), bone marrow (BM), or post-natal thymus (PNT) cells were transduced with various combinations of T-ALL oncogenes, cultured in vitro on OP9-DL1 feeders for up to 25 days, and then transplanted into immunodeficient NSG or NRG mice. The samples were collected after development of frank leukemia in recipient mice. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  73 
 
  
    EGAD00001009752 
   
  
    
    This dataset contains paired RNA sequencing data for end-stage kidney disease (ESKD) patients on dialysis. There are two cohorts. The first includes 179 samples from 51 COVID-19 patients recruited during the initial phase of the COVID-19 pandemic (April-May 2020) and 55 non-infected ESKD patients as controls. 17 patients initially recruited as controls as part of the Wave 1 cohort were later infected with COVID-19 in January-March 2021. We acquired a total of 90 samples during the acute infection and convalescent samples for 12 of the 17 patients following the acute COVID-19 episode. RNA-seq counts and full clinical metadata for these cohorts are available without restriction from Zenodo (https://doi.org/10.5281/zenodo.6497251). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  336 
 
  
    EGAD00001009754 
   
  
    
    Tumor-specific T cells are frequently exhausted by chronic antigenic stimulation. To explore new pathways for reinvigoration of anti-tumor immune functions, we developed a human ex vivo exhaustion model by repetitive antigenic stimulation of primary CD8 T cells. This results in T cells that resemble patient-derived T cells in tumors on a phenotypic and transcriptional level.
CD8+ T cells from four healthy human donors were isolated, transduced with an NY-ESO-1 TCR lentivirus construct, and stimulated in four different conditions (Trested, Ttumor, Tex, Teff) with T2 tumor cells and specific peptides over 12 days. Cells were then sorted for TCR Vbeta 13.1+ (NY-ESO-1 TCR) CD8+ CD3+ CD56- CD4- DAPI- cells. 
RNA-seq TruSeq libraries were generated from polyA-enriched mRNA isolated from the samples, and sequenced in paired-end mode (2x51bp) on 2 lanes of an Illumina NovaSeq 6000 flow-cell. 
FASTQ sequence files were generated with the Illumina RTA version 3.4.4 and Base-calling Version bcl2fastq-2.20.0.422. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001009755 
   
  
    
    scWGS-seq of flow sorted blast and normal cells from SJBALL021901 with 71 high quality cells sequenced (67 blast and 4 normal) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  84 
 
  
    EGAD00001009756 
   
  
    
    Whole genome sequencing of three independent cohorts. The three cohorts derive from projects including samples from individuals of Danish origin. Data are divided into males and females and compiled. 
    
   
  
    
   
  1 
 
  
    EGAD00001009757 
   
  
    
    Bam files aligned using hg19. Sequencing data generated with Illumina MiSeq, HiSeq 2500, or HiSeq 4000 instruments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  62 
 
  
    EGAD00001009758 
   
  
    
    A novel in-house-made pediatric MEF2D-BCL9 fusion positive acute lymphoblastic leukemia cell line was characterized 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001009759 
   
  
    
    A novel in-house-made pediatric MEF2D-BCL9 fusion positive acute lymphoblastic leukemia cell line was characterized 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001009760 
   
  
    
    Colorectal cancer samples will be submitted for Illumina sequencing using a custom capture of 116 genes implicated in colorectal tumourigenesis. Driver mutations will be detected and ultimately correlated with phenotypic data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  2229 
 
  
    EGAD00001009761 
   
  
    
    The dataset contains whole exome sequencing of a family, revealing a homozygous splice variant in the LGR4 gene responsible for salt wasting and adrenal zonation alteration. The family members sequenced were the proband with hypoaldosteronism, her parents, and her two healthy brothers in a consanguineous family. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001009763 
   
  
    
    This dataset includes the FASTQ files from sequencing data generated from diagnostic and remission bone marrow mononuclear cell samples using the Mission Bio Tapestri Platform, with sequencing libraries for both DNA amplicons and protein from antibody-derived tags. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  7 
 
  
    EGAD00001009764 
   
  
    
    Rmarkdown code, PDF,  and Rdata file to recapitulate the paper's primary figures and machine learning model development. 
    
   
  
    
   
  - 
 
  
    EGAD00001009766 
   
  
    
    Profiling of childhood neuroblastoma by single-cell RNA sequencing 
    
   
  
    
      
      NextSeq 500 
      
    
   
  24 
 
  
    EGAD00001009771 
   
  
    
    Manuscript Title: 
Co-targeting of BTK and MALT1 overcomes resistance to BTK inhibitors in mantle cell lymphoma
Journal: 
Journal of Clinical Investigation
Authors
Vivian Changying Jiang1, Yang Liu1, Junwei Lian1, Shengjian Huang1, Alexa Jordan1, Qingsong Cai1, Fangfang Yan3, Joseph Mitchell McIntosh1, Yijing Li1, Yuxuan Che1, Zhihong Chen1, Jovanny Vargas1, Maria Badillo1, JohnNelson Bigcal1, Heng-Huan Lee1, Wei Wang1, Yixin Yao1, Lei Nie1, Christopher Flowers1, and Michael Wang1, 2* 
Abstract
Bruton’s tyrosine kinase (BTK) is a proven target in mantle cell lymphoma (MCL), an aggressive subtype of non-Hodgkin lymphoma. However, resistance to BTK inhibitors is a major clinical challenge. We here report that MALT1 is one of the top overexpressed genes in ibrutinib-resistant MCL cells, while expression of CARD11, which is upstream of MALT1, is decreased. MALT1 genetic knockout or inhibition produced dramatic defects in MCL cell growth regardless of ibrutinib sensitivity. Conversely, CARD11 knockout cells showed anti-tumor effects only in ibrutinib-sensitive cells, suggesting that MALT1 overexpression could drive ibrutinib resistance via bypassing BTK-CARD11 signaling. Additionally, BTK knockdown and MALT1 knockout markedly impaired MCL tumor migration and dissemination, and MALT1 pharmacological inhibition decreased MCL cell viability, adhesion, and migration by suppressing NF-κB, PI3K-AKT-mTOR, and integrin signaling. Importantly, co-targeting MALT1 with safimaltib and BTK with pirtobrutinib induced potent anti-MCL activity in ibrutinib-resistant MCL cell lines and patient-derived xenografts. Therefore, we conclude that MALT1 overexpression associates with resistance to BTK inhibitors in MCL, targeting abnormal MALT1 activity could be a promising therapeutic strategy to overcome BTK inhibitor resistance, and co-targeting of MALT1 and BTK should improve MCL treatment efficacy and durability as well as patient outcomes. 
Dataset description:
The bulk RNA-seq dataset was generated for the cell lines below and used for two major purposes:
1.	DEG analysis and GSEA analysis comparing IBN-R and IBN-S cells
2.	DEG analysis and GSEA analysis comparing MCL cells with/without MI-2 treatment.
sample 	Cell	MI-2	Ibrutinib (IBN)	Venetoclax (VEN)	Used for IBN-R vs IBN-S comparison	Used for MI-2 vs untreated (DMSO)
H9	Granta519	-	R	S	yes	
H21	Granta519	-	R	S	yes	
H33	Granta519	-	R	S	yes	
H10	Granta519-VEN-R	-	R	R	yes	
H22	Granta519-VEN-R	-	R	R	yes	
H34	Granta519-VEN-R	-	R	R	yes	
H3	JeKo BTK KD_1 	-	R	R	yes	yes
H15	JeKo BTK KD_1 	-	R	R	yes	yes
H27	JeKo BTK KD_1 	-	R	R	yes	yes
H5	JeKo BTK KD_2 	-	R	R	yes	yes
H17	JeKo BTK KD_2 	-	R	R	yes	yes
H29	JeKo BTK KD_2 	-	R	R	yes	yes
H1	JeKo-1	-	S	R	yes	yes
H13	JeKo-1	-	S	R	yes	yes
H25	JeKo-1 	-	S	R	yes	yes
H7	Mino	-	S	S	yes	
H19	Mino	-	S	S	yes	
H31	Mino	-	S	S	yes	
H8	Mino-VEN-R	-	S	R	yes	
H20	Mino-VEN-R	-	S	R	yes	
H32	Mino-VEN-R	-	S	R	yes	
H11	Rec-1	-	S	S	yes	
H23	Rec-1	-	S	S	yes	
H12	Rec-VEN-R	-	S	S	yes	
H24	Rec-VEN-R	-	S	R	yes	
H36	Rec-VEN-R	-	S	R	yes	
H35	Rec-1	--	S	R	yes	
H4	JeKo BTK KD_1 + MI-2	+				yes
H16	JeKo BTK KD_1 + MI-2	+				yes
H28	JeKo BTK KD_1 + MI-2	+				yes
H6	JeKo BTK KD_2 + MI-2	+				yes
H18	JeKo BTK KD_2 + MI-2	+				yes
H30	JeKo BTK KD_2 + MI-2	+				yes
H2	JeKo-1 + MI-2	+				yes
H14	JeKo-1 + MI-2	+				yes
H26	JeKo-1 + MI-2	+				yes 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  35 
 
  
    EGAD00001009772 
   
  
    
    miRNA libraries of the AML-PMP project were sequenced on the Illumina HiSeq 2000 instrument, approximately 16 samples per HiSeq lane, to a median depth of 5.7 million reads per library. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00001009773 
   
  
    
    116,958 single-cell transcriptomes from samples of peripheral blood mononuclear cells (PBMCs) from five CVID patients at three distinct stages of the SARS-CoV-2 infection: 1) baseline, before viral infection, 2) progression, during viral infection, and 3) convalescence, once the viral infection had been resolved and the patient was PCR negative. CVID patients were under regular immunoglobulin replacement therapy and displayed only mild symptoms during SARS-CoV-2 infection. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001009774 
   
  
    
    Bone marrow aspirates were obtained from patients with relapsed/refractory large B cell lymphoma (rrLBCL), mononuclear cells isolated by ficoll density-gradient centrifugation, and loaded onto a 10X Chromium for single cell RNA-sequencing using 5’ chemistry without prior cryopreservation. Healthy donor bone marrow mononuclear cells were obtained from healthy allogeneic stem cell transplant donors and analyzed following viable cryopreservation. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001009775 
   
  
    
    Summary statistics
Korean PD (n=410) vs. Korean Healthy Control (n=200) 
    
   
  
    
   
  1 
 
  
    EGAD00001009777 
   
  
    
    Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 8 on DLP+ library A96141A 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009778 
   
  
    
    The goal of this study is to characterize immune cell populations by single cell RNA-sequencing (scRNA-seq) in tumor and uninvolved normal tissues from patients with resectable non-small cell lung cancer (NSCLC) who received neoadjuvant chemoimmunotherapy. scRNA-seq was performed on seven pairs of tumor and normal tissues as well as one lymph node (LN) sample. The data set includes paired-end fastq files for single cell RNA sequencing of 7 neo-immuno patients (15 samples and 50 runs in total). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001009780 
   
  
    
    Plasma from lung cancer patients from EDTA tubes was fractionated using size exclusion chromatography. Fractions 1-5, 7-11, 12-15, 16-20 were pooled, cfDNA was extracted from the fractions and paired unfractionated samples and PE150bp sequencing was performed on an Illumina Novaseq S4 flowcell. Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001009781 
   
  
    
    Multiple regions were cut out of each tumor extracted from breast cancer patients and two experiments were run:
- Whole exome sequencing on each of the regions plus an adjacent normal tissue sample
- Smart-Seq3 Single cell RNA sequencing on EPCAM+ CD45- sorted cells from different tumor regions 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1518 
 
  
    EGAD00001009783 
   
  
    
    This dataset includes 2x76 bp RNA-seq reads from 9 pigs sequenced using an Illumina NextSeq 500. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001009784 
   
  
    
    This dataset contains:
Ultra-deep sequencing data using the Duplex sequencing technology of: 
1.) SeraSeq cfDNA reference materials with spike-in variants with allele frequencies from 0% to 5%
2.) One cfDNA sample from a CRC patient
3.) One cfDNA sample from a patient with asymmetric overgrowth
Paired-end sequencing was performed with 2x151 bp reads on the NextSeq 500 system. Data is provided as mapped .bam files (aligned to GRCh38/hg38). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  13 
 
  
    EGAD00001009785 
   
  
    
    Contains control and PXR KD human small intestinal organoids 
    
   
  
    
      
      NextSeq 500 
      
    
   
  18 
 
  
    EGAD00001009786 
   
  
    
    This dataset contains 68 BAM files from matched normal-tumor pairs of HCV-positive lymphoma analyzed by exome sequencing on the Illumina platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009787 
   
  
    
    WGS files for CIC paper titled "Malignant progression of an ancestral bone marrow clone harboring a CIC-NUTM2A fusion in isolated myeloid sarcoma" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001009788 
   
  
    
    RNASeq files for CIC paper titled "Malignant progression of an ancestral bone marrow clone harboring a CIC-NUTM2A fusion in isolated myeloid sarcoma" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001009789 
   
  
    
    A DNA methylation atlas of normal human cell types. The atlas includes 205 whole-genome bisulfite sequencing (WGBS) samples, from 39 sorted cell types. The samples were paired-end sequenced with 30x coverage. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  410 
 
  
    EGAD00001009790 
   
  
    
    The TEP dataset consists of 549 Fastq samples which are divided into two experiments: a training data cohort, used to train the classifier, and a validation data cohort, used to assess classifier performance 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  548 
 
  
    EGAD00001009791 
   
  
    
    Whole genome sequencing data of 8 High-grade serous carcinoma (HGSC) patients (20 samples) sequenced with HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  20 
 
  
    EGAD00001009792 
   
  
    
    This dataset contains RNA-sequencing of whole blood samples from healthy controls (n=11) and AAV patients (n=30; GPA n=22, MPA n=8). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001009793 
   
  
    
    We demonstrate that ATRT tumoroids retain subgroup-specific epigenetic and gene expression profiles 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009794 
   
  
    
    We demonstrate that ATRT tumoroids retain subgroup-specific epigenetic and gene expression profiles 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001009796 
   
  
    
    12.5 ng of cfDNA was used as input for shallow whole-genome sequencing (sWGS), aiming for a coverage of 0.2-0.4-fold. Library preparation was performed using the TruSeq Nano DNA High Throughput Library Prep Kit (Illumina, San Diego, CA, USA) on an automated Hamilton STAR liquid handling system (Hamilton, Germany GmbH, Robotics, Gräfeling, Germany) with dual indexing, and sequencing was performed on the NextSeq500/550 platform (Illumina). The fraction of tumor-derived DNA in cell-free DNA was estimated using the R package ichorCNA. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009797 
   
  
    
    Clinical & biomarker data from IMagyn050: treatment arm, treatment approach, outcome of surgery, ECOG PS, PD-L1 status, race, age, disease stage, progression free survival (investigator assessed), overall survival, histology, tumor mutation burden and status, genomic loss of heterozygosity, microsatellite status, BRCA1/2 mutation status, tissue of origin. Mutation status based on FoundationOne NGS for the following genes is also being provided: TP53, BRCA1, CCNE1, MYC, NF1, PIK3CA, RAD21, TERC, PRKCI, KRAS, RB1, BRCA2, ARID1A, AKT2, PTEN, KDM5A, NOTCH3, FGF12, ERBB2, CDK12, EMSY, WHSC1L1, BCL2L1, CDKN2A, GNAS, ARFRP1, ZNF217, SOX2, CCND2, FGF6, FGF23, LYN, MUTYH, AURKA, FGFR1, MCL1, MLL2, MYCL1, ZNF703, BRAF, MAP2K4, CREBBP, TSC2 
    
   
  
    
   
  1 
 
  
    EGAD00001009798 
   
  
    
    Smart-seq3 scRNA-seq of cells from primary (OV2295) and metastatic (OV2295R2) high-grade serous ovarian cancer cell lines 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  768 
 
  
    EGAD00001009800 
   
  
    
    This dataset was used to compare gene expression profiles of ex vivo isolated classical CD14+ monocytes from patients with moderate COVID-19 to those of healthy individuals.
Blood samples were taken from patients with moderate COVID-19 admitted to hospitals in London (Hammersmith Hospital, Charing Cross Hospital, Saint Mary’s Hospital) 3-14 days after disease onset and 0-2 days after hospitalization and positive PCR, and before study treatment initiation. Moderate patients displayed mild or moderate COVID-19 pneumonia, defined as grade 3 or 4 WHO severity. Samples were collected from March 2020 to February 2021. Healthy donors were Imperial College staff with no prior diagnosis of or recent symptoms consistent with COVID-19, and where possible, were matched in age and sex distribution with COVID-19 patients. None of the participants of this study were COVID-19 vaccinated.
Peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll Hypaque (GE Healthcare) gradient centrifugation <4 hours after blood collection. CD14+ monocytes were isolated using a positive selection magnetic sorting kit (StemCell Technologies, UK) from total PBMC and stimulated with vehicle or UV-inactivated SARS-CoV-2 (CoV-2). RNA was isolated using the RNeasy Micro Plus Kit (QIAGEN) following the manufacturer’s guidelines. RNA-sequencing was performed by the Oxford Genomics Centre. PolyA-enriched strand-specific libraries were prepared using NEBNext Ultra II Directional RNA Library Prep Kits (Illumina). All samples were pooled together and 150bp PE reads were sequenced on a NovaSeq 6000 system. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  42 
 
  
    EGAD00001009801 
   
  
    
    The primitive streak emerges during the 14th day of human development and establishes dorsoventral and antero-posterior (craniocaudal) body axes. Segmentation along the craniocaudal axis is governed by Hox genes; four clusters of transcription factors whose 3’ to 5’ expression correlates with position along the axis. The precise utilisation of Hox genes in different cell types remains incompletely characterised in humans. In this study, we applied single-cell and spatial transcriptomics to contiguous regions of the human fetal spine between the 5th and 13th post-conception weeks. We built a detailed developmental atlas to examine the segmental expression of Hox genes across different cell types, observing that the Hox code was displayed by all anatomically fixed cell types along the craniocaudal axis. By contrast, mature derivatives of neural crest cells retained the anatomical Hox code of their origin within the crest, a pattern reproduced across neural crest derivatives in other human fetal organs. These findings indicate that scars of Hox gene expression persist in crest cells which may serve as barcodes of neural crest migration. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009806 
   
  
    
    64 paired-end Illumina RNAseq whole transcriptome stranded libraries from 32 pairs of matched primary and recurrent GBM 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  64 
 
  
    EGAD00001009807 
   
  
    
    Single-cell RNA sequencing of 18 peripheral blood samples from six melanoma patients. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001009808 
   
  
    
    Targeted DNA sequencing on 37 Merkel Cell Carcinomas from New Zealand with known Merkel cell polyomavirus status 
    
   
  
    
      
      Ion Torrent Proton 
      
      NextSeq 500 
      
    
   
  92 
 
  
    EGAD00001009809 
   
  
    
    WGS of MAPKi acquired resistant samples from patients and PDX models 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  104 
 
  
    EGAD00001009812 
   
  
    
    Cancers of adults typically arise through progressive rounds of clonal diversification and intratumoral selective sweeps which generate a long mutational trunk with shorter subclonal branches. Here, we investigated whether tumors of young children exhibit the same phylogenetic configuration. We studied three infants, including two newborns, with the childhood kidney cancer, Wilms tumour, through whole genome sequencing of bulk tissues, of single cell derived organoids, and of microdissections. All three cancers exhibited unusual driver events, with tumours of newborns harbouring FOXR2 rearrangements, delineating a distinct variant of Wilms tumour. Phylogenetic analyses suggest that tumors were seeded in an early, possibly confined window of development. Unusually, following seeding there was extensive polyclonal diversification with little evidence of clonal sweeps, leading to a distinct phylogenetic configuration more reminiscent of normal tissues rather than of adult cancers. These findings indicate that some childhood cancers may diversify via unorthodox phylogenetic pathways. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  76 
 
  
    EGAD00001009813 
   
  
    
    Cancers of adults typically arise through progressive rounds of clonal diversification and intratumoral selective sweeps which generate a long mutational trunk with shorter subclonal branches. Here, we investigated whether tumors of young children exhibit the same phylogenetic configuration. We studied three infants, including two newborns, with the childhood kidney cancer, Wilms tumour, through whole genome sequencing of bulk tissues, of single cell derived organoids, and of microdissections. All three cancers exhibited unusual driver events, with tumours of newborns harbouring FOXR2 rearrangements, delineating a distinct variant of Wilms tumour. Phylogenetic analyses suggest that tumors were seeded in an early, possibly confined window of development. Unusually, following seeding there was extensive polyclonal diversification with little evidence of clonal sweeps, leading to a distinct phylogenetic configuration more reminiscent of normal tissues rather than of adult cancers. These findings indicate that some childhood cancers may diversify via unorthodox phylogenetic pathways. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  41 
 
  
    EGAD00001009814 
   
  
    
    Concatenated long-read single-cell RNA sequencing samples prepared using 10X and the HIT-scIsoSeq protocol. The sequencing was performed on a PacBio Sequel II machine.
3 ovarian cancer patients, 5 omentum biopsy samples: 3 metastasis samples (one per patient), 2 healthy samples (one per patient except Patient2). 4 bam files per metastasis sample, 2 bam files per healthy sample. 
    
   
  
    
      
      Sequel 
      
    
   
  5 
 
  
    EGAD00001009815 
   
  
    
    Illumina NovaSeq paired-end single-cell RNA sequencing samples prepared using the 10X Genomics platform. 
3 ovarian cancer patients, 5 omentum biopsy samples: 3 HGSOC metastasis samples (one per patient), 2 healthy samples (one per patient except Patient2). 4 paired fastq files per sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001009816 
   
  
    
    This dataset contains the CRAM files of the samples used for the article "Neutrophil extracellular traps have auto-catabolic activity and produce mononucleosome-associated circulating DNA" published in Genome Medicine. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  12 
 
  
    EGAD00001009817 
   
  
    
    This dataset contains raw ITS amplicon sequencing data for 719 sputum samples from individuals in Guangdong province, China. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  625 
 
  
    EGAD00001009818 
   
  
    
    Whole genome sequencing of 27 paired (54 total) tumor-normal Hodgkin Lymphoma whole genomes.  Bulk tumor (-T) and flow-sorted Reed Sternberg cell (-HRS) samples.  Approximately 30x wgs normal depth, 40-50x wgs depth tumor samples.
Whole RNA Sequencing of Hodgkin Lymphoma paired samples (64 pairs - 128 total files) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  182 
 
  
    EGAD00001009819 
   
  
    
    This dataset contains the raw sequencing data (Runs) from the 10x Genomics single-cell Multiome Experiments which belong to the MCL group 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001009820 
   
  
    
    This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell CITE-seq Experiments, as well as the demultiplexed cell-donor identities matrices which were generated with Vireo (Analysis). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001009821 
   
  
    
    This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell Visium Experiments, as well as the corresponding imaging data (Analyses). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001009822 
   
  
    
    This dataset contains the raw sequencing data (Runs) for all 10x Genomics single-cell ATAC-seq Experiments. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001009823 
   
  
    
    This dataset contains the raw sequencing data (Runs) for all 10x Genomics single-cell RNA-seq Experiments. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001009824 
   
  
    
    This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell Multiome Experiments 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009825 
   
  
    
    TRACERx NSCLC - Whole exome multiregion sequencing data from the 421 cohort 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2193 
 
  
    EGAD00001009826 
   
  
    
    This is a test dataset derived from public data of the 1000 Genomes Project. Its purpose is not to allow for any inference about cohort data or results, but to aid bioinformaticians in the technical development and testing of tools, as well as data consumers in learning how to access information.  
This dataset consists of 3 pairs of light-weight (sliced) files: BAM + BAI, CRAM + CRAI and VCF + TBI. These files can be downloaded directly through the EGA-download-client PyEGA3 (https://github.com/EGA-archive/ega-download-client). 
For any further questions, please contact the DAC (Helpdesk - email: helpdesk [at] ega-archive [dot] org). 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001009827 
   
  
    
    Seven clonal organoid lines and one bulk wild-type control sample were paired-end whole-genome sequenced using the Illumina NovaSeq 6000 system. We sequenced four clonal intestinal organoid lines harbouring engineered TP53 and FBXW7 mutations as well as three lines targeted for oncogenic APC/TP53/PIK3CA/SMAD4 mutations. The reads were mapped to the hg38 genome assembly and the data are provided as BAM files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009828 
   
  
    
    FASTQ and BAM files for 46 samples (consisting of 35 samples for patients diagnosed with CLL, 5 samples consisting of a dilution series of patient DNA, and 6 samples of cell lines and dilutions involving the cell lines) from targeted capture next-generation sequencing from the LySeq panel. Libraries were sequenced on either Illumina HiSeq 2500 (LySeq66 PoP, R1-3) or Illumina NovaSeq 6000 (LySeq66 Validation Round). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD00001009829 
   
  
    
    11 plasma cases and 4 urine cases (mouse) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD00001009830 
   
  
    
    Sixteen patients with refractory solid cancers received up to three distinct neoTCR-transgenic cell products, each expressing a patient-specific neoTCR, in a cell dose-escalation, first-in-human phase 1 clinical trial (NCT03970382). Included are the tumor and normal WXS and tumor RNAseq for dosed patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  86 
 
  
    EGAD00001009831 
   
  
    
    Sorted single CD8+T cells expressing CD14 from human liver for SMARTSeq2.
Livers processed: Kucykowicz et al STAR Prot 2022:pubmed.ncbi.nlm.nih.gov/35516846/ 
Published: Pallett et al Nature 2022
Tissue CD14+CD8+T-cells are reprogrammed by myeloid cells and modulated by LPS
A modified SMART-seq2 protocol was performed on the single flow cytometry-sorted cells as previously described. After cDNA generation, libraries were prepared (384 cells per library) using the Illumina Nextera XT kit (Illumina). Each library was sequenced to a minimum depth of 1-2 million raw reads per cell on an Illumina HiSeq 4000 with v. 4 SBS chemistry to generate 75-bp paired-end reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  378 
 
  
    EGAD00001009834 
   
  
    
    RNA-Seq data of 46 matched lobular breast cancer metastatic samples obtained from 21 unique patients from the GELATO clinical trial, assayed at three timepoints: at baseline (directly after patient randomization), pre-atezolizumab (after induction treatment with carboplatin for two weeks) and on atezolizumab (after two cycles of atezolizumab combined with carboplatin). The included raw transcriptome sequencing data in fastq format was generated using Illumina NovaSeq 6000 from fresh frozen material. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD00001009835 
   
  
    
    RNA-Seq data of 10 lobular breast cancer primary tumors and 3 local recurrences obtained from 11 unique patients from the GELATO clinical trial. The included raw transcriptome sequencing data in fastq format was generated using Illumina NovaSeq 6000 from archived FFPE material. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001009836 
   
  
    
    Paired-end whole exome sequencing of 10 lobular breast cancer primary tumors, 3 local recurrences and matched normal samples obtained from 10 unique patients from the GELATO clinical trial. The included raw sequencing data in fastq format was generated using Illumina NovaSeq 6000 from archived FFPE material (tumor data) and fresh frozen material (matched normal data). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD00001009837 
   
  
    
    Paired-end whole exome sequencing of 19 lobular breast cancer metastatic tumors and matched normal samples obtained from 19 unique patients from the GELATO clinical trial. The included raw sequencing data in fastq format was generated using Illumina NovaSeq 6000 from fresh frozen material. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD00001009838 
   
  
    
    15 healthy controls, 25 colorectal cancer patients without liver metastasis and 24 colorectal cancer patients with liver metastasis (target capture) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  64 
 
  
    EGAD00001009839 
   
  
    
    18 plasma samples and their paired 18 urinary cfDNA samples without cancer 
    
   
  
    
      
      NextSeq 500 
      
    
   
  36 
 
  
    EGAD00001009840 
   
  
    
    12 Nasopharyngeal carcinoma patients without treatment and 12 Nasopharyngeal carcinoma patients with treatment (WGS) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  48 
 
  
    EGAD00001009844 
   
  
    
    RNAseq of 45 high-grade serous ovarian cancer tumour samples. Libraries were generated using the NEB Ultra II Directional RNA library Prep kit with polyA enrichment. Libraries were sequenced as paired-end 50 or 100bp on an Illumina NextSeq or NovaSeq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009845 
   
  
    
    Clonal evolution drives cancer progression and therapeutic resistance. Recent studies revealed divergent longitudinal trajectories in gliomas, but early molecular traits steering post-treatment cancer evolution remain unclear. We analyzed sequencing data of 544 initial-recurrent adult diffuse glioma pairs to identify genomic and transcriptomic early predictors of tumor evolution in each molecular subtype. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  366 
 
  
    EGAD00001009847 
   
  
    
    We provide whole exome DNA sequencing data in fastq format for 23 clinical samples of chronic myeloid leukemia stem cells (CML-SC) plus two buccal swab-derived normal samples. CML samples comprise 4 to 8 replicates from two patients, at diagnosis and after treatment. Single CML stem cells before treatment and single non-transformed hematopoietic stem cells (HSC) at remission were selected from bone marrow samples by FACS, according to newly identified genetic markers CD33+CD26+ at diagnosis and CD33+CD26-/CD33-CD26- at remission. WES libraries of colony forming assay-derived CML-SC and HSC populations were prepared using the Agilent SureSelect Human All Exon V6 kit and sequenced for 150 cycles (2x 75bp paired-end) on an Illumina NextSeq 500 platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD00001009848 
   
  
    
    Pathogenic germline variants in the protection of telomeres 1 gene (POT1) have been associated with predisposition to a range of tumor types, including melanoma, glioma, leukemia and cardioangiosarcoma. We sequenced all coding exons of the POT1 gene in 2,929 European-descent melanoma cases and 3,298 controls, identifying 43 protein-changing genetic variants. We performed functional studies on each of these variants and explored their possible contribution to disease risk. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  6226 
 
  
    EGAD00001009851 
   
  
    
    A collection of four induced pluripotent stem cell (iPSC) models derived from patients diagnosed with Spinocerebellar ataxia 15 (SCA15). SCA15 is a neurological condition characterised by progressive gait and limb ataxia as well as abnormalities in eye movement and difficulties with balance, speech and swallowing (Synofzik et al., 2011). Whole Genome Sequencing was performed to confirm the presence of heterozygous deletions in the inositol 1,4,5-trisphosphate receptor gene (ITPR1), characteristic of the disease. Cell model names: HPSI0216i-vieg_5 or Vieg_5 (WTSIi472-A), HPSI0216i-vieg_3 or Vieg_3 (WTSIi472-B), HPSI0216i-dacv_6 or Dacv_6 (WTSIi554-A), HPSI0216i-boho_3 or Boho_3 (WTSIi502-A). All iPSC models are available via ECACC-Culture Collections. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009852 
   
  
    
    Whole-exome sequence (WES) data of tumor-normal pairs from 40 ENKTCL patients and RNA sequence (RNA-seq) data of tumors from 20 ENKTCL patients. 
    
   
  
    
      
      unspecified 
      
    
   
  52 
 
  
    EGAD00001009853 
   
  
    
    198 exome sequencing samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  198 
 
  
    EGAD00001009854 
   
  
    
    This data set includes bam files (aligned to hg38) from the germline of children who have pathogenic mutations in cancer predisposing genes. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  4 
 
  
    EGAD00001009855 
   
  
    
    Dataset containing 2068 WES tumor and control samples of central nervous system neoplasm patients. The data was sequenced on an Illumina NextSeq 500 using a NPHD2015A kit. All sequencing was paired-end. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2068 
 
  
    EGAD00001009857 
   
  
    
    Fastq files from RNAseq of breast cancer bone metastases PDX of tumor HBC-124 treated by IACS-010759 (4 samples) or not (4 samples). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001009858 
   
  
    
    WGS data of multi-region samples from PLANET 123 Patient cohort 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001009859 
   
  
    
    RNA-seq data of multi-region samples from PLANET 123 Patient cohort 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  - 
 
  
    EGAD00001009860 
   
  
    
    The dataset for the study “Dynamics of sequence and structural cell-free DNA landscapes in small-cell lung cancer” includes 171 bam files from targeted next-generation sequencing (TEC-Seq) from plasma cell-free DNA and matched white blood cell DNA from 33 individuals with small cell lung cancer, alongside 10 bam files from whole exome sequencing of tumor and matched normal DNA for 5 individuals with small cell lung cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  181 
 
  
    EGAD00001009861 
   
  
    
    scRNASeq analysis of human Lin neg lymphocytes from control liver, cirrhotic liver, tonsil, duodenum, and colon tissues. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001009862 
   
  
    
    RNAseq data from the TRACERx 421 cohort 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1051 
 
  
    EGAD00001009863 
   
  
    
    The pediatric cancer cohort in this study included 70 PDX models from 65 different individuals. This cohort included a total of 16 different pediatric solid tumor subtypes, including fourteen Wilms tumors, thirteen hepatoblastomas, thirteen osteosarcomas, ten germ cell tumors, four neuroblastomas, three clear cell sarcomas, two adrenal cortical carcinomas, two Leydig cell tumors, two medulloblastomas, one embryonal rhabdomyosarcoma (ERMS), one Ewing sarcoma, one pleomorphic sarcoma, one adenocarcinoma, one glioblastoma, one mesothelioma and one ovarian tumor. Notably, we have five cases with multiple PDX models from the same patient, including two cases with duplicates (564 and 564-Dup, 1796 and 1796-Dup), one case with two different metastases (560-SM, 560-LM), one case with two blocks from the same tumor (1939 and 1939-Dup), and one case with different primary tumors from the same patient (2264 and 1932). We have a total of 353 sequencing datasets, including 82 RNA sequencing (RNA-seq), 138 whole-exome sequencing (WES) and 135 low-pass whole-genome sequencing (WGS) datasets. For RNA-seq, we have 61 PDXs and 21 PTs; for WES, we have 67 PDXs, 30 PTs and 40 matched normal germlines; for WGS, we have 64 PDXs, 30 PTs and 40 matched normal germlines. Of these, 19 PT-PDX pairs have RNA-seq data and 28 PT-PDX pairs have WES and WGS data. 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  352 
 
  
    EGAD00001009864 
   
  
    
    Data from NABUCCO cohort 2 (NCT03387761). This dataset includes whole exome DNA sequencing on bladder tumor samples matched with blood samples for patients from NABUCCO Cohort 2 (Cohort 2A and Cohort 2B). The data is pre-treatment. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  59 
 
  
    EGAD00001009865 
   
  
    
    Single-cell RNA sequencing was performed on bone marrow mononuclear cells of a patient with acute myeloid leukemia with erythroid differentiation of the blasts and on peripheral blood mononuclear cells of a patient with acute myeloid leukemia with megakaryocytic differentiation of the blasts. The dataset contains raw fastq files of these two samples with single-cell RNA sequencing performed using the 10x Genomics platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001009866 
   
  
    
    whole genome sequencing data 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  512 
 
  
    EGAD00001009867 
   
  
    
    methyl-seq data 64 cases 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  128 
 
  
    EGAD00001009868 
   
  
    
    ATAC-seq data 72 cases 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  144 
 
  
    EGAD00001009870 
   
  
    
    In Vivo Loss of Tumorigenicity in a Patient-Derived Orthotopic Xenograft Mouse Model of Ependymoma. Whitehouse et al. 2023 Frontiers in Oncology.
We describe the establishment of a patient-derived orthotopic xenograft (PDOX) model of posterior fossa A (PFA) EPN, derived from a metastatic cranial lesion. Patient and PDOX tumors were analyzed using RNA sequencing. 
The RNAseq data (paired end) provided here correspond to the primary tumour, two metastatic lesions (one spinal, one cranial), and a patient-derived xenograft. 
    
   
  
    
      
      unspecified 
      
    
   
  7 
 
  
    EGAD00001009871 
   
  
    
    van Hijfte snRNA glioblastoma dataset 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009872 
   
  
    
    LP2100030-DNA_A02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009873 
   
  
    
    LP2100082-DNA_A01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009874 
   
  
    
    LP2100030-DNA_A08 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009875 
   
  
    
    LP2100030-DNA_A01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009876 
   
  
    
    LP2100030-DNA_A07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009877 
   
  
    
    LP2100030-DNA_A06 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009878 
   
  
    
    LP2100030-DNA_A03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009879 
   
  
    
    LP2100030-DNA_A09 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009880 
   
  
    
    LP2100030-DNA_A05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009881 
   
  
    
    LP2100030-DNA_A04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009882 
   
  
    
    LP2100082-DNA_A02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009883 
   
  
    
    We have assessed the added value of long-read sequencing for PGx focusing on the clinically important and highly polymorphic CYP2C19 gene within 48 samples. 
    
   
  
    
      
      PacBio RS II 
      
    
   
  1 
 
  
    EGAD00001009884 
   
  
    
    WGS data normal and hypomethylation 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001009885 
   
  
    
    Nanopore pericentromere normal methylation 
    
   
  
    
      
      MinION 
      
    
   
  1 
 
  
    EGAD00001009886 
   
  
    
    Nanopore pericentromere hypomethylation 
    
   
  
    
      
      MinION 
      
    
   
  1 
 
  
    EGAD00001009887 
   
  
    
    RNA seq data normal and hypomethylation 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  6 
 
  
    EGAD00001009888 
   
  
    
    Whole genome sequencing data of 9 high-grade serous carcinoma (HGSC) patients (55 samples) sequenced with HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  55 
 
  
    EGAD00001009890 
   
  
    
    This dataset contains extensive metadata from the cross-sectional flow of the Isala Citizen Science project. The metadata file contains ENA accession numbers for 16S microbiome data and cleaned responses to questionnaires. 
    
   
  
    
   
  3453 
 
  
    EGAD00001009891 
   
  
    
    LP2100082-DNA_G04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001009892 
   
  
    
    LP2100082-DNA_A03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009893 
   
  
    
    LP2100082-DNA_A04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009894 
   
  
    
    LP2100082-DNA_A05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009895 
   
  
    
    LP2100082-DNA_A06 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009896 
   
  
    
    LP2100082-DNA_A07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009897 
   
  
    
    LP2100082-DNA_B01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009898 
   
  
    
    LP2100082-DNA_B02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009899 
   
  
    
    LP2100082-DNA_B03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009900 
   
  
    
    LP2100082-DNA_B04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009901 
   
  
    
    LP2100082-DNA_B05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009902 
   
  
    
    LP2100082-DNA_B06 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009903 
   
  
    
    LP2100082-DNA_B07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009904 
   
  
    
    LP2100082-DNA_C01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009905 
   
  
    
    LP2100082-DNA_C02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009906 
   
  
    
    LP2100082-DNA_C03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009907 
   
  
    
    LP2100082-DNA_C05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009908 
   
  
    
    LP2100082-DNA_D02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009909 
   
  
    
    LP2100082-DNA_D01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009910 
   
  
    
    LP2100082-DNA_D03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009911 
   
  
    
    LP2100082-DNA_D05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009912 
   
  
    
    LP2100082-DNA_D06 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009913 
   
  
    
    LP2100082-DNA_E01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009914 
   
  
    
    LP2100082-DNA_E02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009915 
   
  
    
    LP2100082-DNA_E03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009916 
   
  
    
    LP2100082-DNA_E04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009917 
   
  
    
    LP2100082-DNA_E05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009918 
   
  
    
    LP2100082-DNA_E06 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009919 
   
  
    
    LP2100082-DNA_F01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009920 
   
  
    
    LP2100082-DNA_F02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009921 
   
  
    
    LP2100082-DNA_F03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009922 
   
  
    
    LP2100082-DNA_F04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009923 
   
  
    
    LP2100082-DNA_F05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009924 
   
  
    
    LP2100082-DNA_F06 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009925 
   
  
    
    LP2100082-DNA_G01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009926 
   
  
    
    LP2100082-DNA_G02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009927 
   
  
    
    LP2100082-DNA_G03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009928 
   
  
    
    LP2100082-DNA_H01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009929 
   
  
    
    LP2100082-DNA_H02 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009930 
   
  
    
    LP2100082-DNA_H03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009931 
   
  
    
    LP2100082-DNA_H04 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009932 
   
  
    
    LP2100082-DNA_H05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009933 
   
  
    
    LP2100098-DNA_A01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009934 
   
  
    
    LP2100098-DNA_A03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009935 
   
  
    
    LP2100098-DNA_A05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009936 
   
  
    
    LP2100098-DNA_A07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009937 
   
  
    
    LP2100098-DNA_A09 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009938 
   
  
    
    LP2100098-DNA_B01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009939 
   
  
    
    LP2100098-DNA_B03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009940 
   
  
    
    LP2100098-DNA_B05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009941 
   
  
    
    LP2100098-DNA_B07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009942 
   
  
    
    LP2100098-DNA_B09 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009943 
   
  
    
    LP2100098-DNA_C01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009944 
   
  
    
    LP2100098-DNA_C03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009945 
   
  
    
    LP2100098-DNA_C05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009946 
   
  
    
    LP2100098-DNA_C07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009947 
   
  
    
    LP2100098-DNA_C09 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009948 
   
  
    
    LP2100098-DNA_D01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009949 
   
  
    
    LP2100098-DNA_D03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009950 
   
  
    
    LP2100098-DNA_D05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009951 
   
  
    
    LP2100098-DNA_D07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009952 
   
  
    
    LP2100098-DNA_D09 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009953 
   
  
    
    LP2100098-DNA_E03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009954 
   
  
    
    LP2100098-DNA_E05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009955 
   
  
    
    LP2100098-DNA_E07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009956 
   
  
    
    LP2100098-DNA_E09 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009957 
   
  
    
    LP2100098-DNA_F01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009958 
   
  
    
    LP2100098-DNA_F03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009959 
   
  
    
    LP2100098-DNA_F05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009960 
   
  
    
    LP2100098-DNA_F07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009961 
   
  
    
    LP2100098-DNA_G01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009962 
   
  
    
    LP2100098-DNA_G05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009963 
   
  
    
    LP2100098-DNA_G07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009964 
   
  
    
    Bolleboom-Gao peri-tumoral snRNA-seq glioblastoma dataset 2022/A 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009965 
   
  
    
    Imputed HLA alleles and variation. Imputation was carried out using the Multi-Ethnic HLA reference panel (version 1.0 2021) available on the Michigan imputation server 
    
   
  
    
   
  - 
 
  
    EGAD00001009966 
   
  
    
    Phenotype and covariates 
    
   
  
    
   
  - 
 
  
    EGAD00001009967 
   
  
    
    LP2100098-DNA_H01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009968 
   
  
    
    LP2100098-DNA_H03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009969 
   
  
    
    LP2100098-DNA_H05 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009970 
   
  
    
    LP2100098-DNA_H07 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009971 
   
  
    
    Variant Calls for all 97 consenting participants in the study. 
    
   
  
    
   
  1 
 
  
    EGAD00001009972 
   
  
    
    Low-pass whole genome sequencing samples from pediatric solid tumor patients who are deceased 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  26 
 
  
    EGAD00001009973 
   
  
    
    Autopsy-derived, later snap-frozen tissue fragments from a 5-year-old female with recurrent metastatic fusion-negative embryonal rhabdomyosarcoma (RMS) primary tumor of the left thigh were analyzed.  No quality matched tissue was available.  Prior panel analysis identified 2 prominent genetic changes:  NRAS Q61K, PIK3CA H1047R, CDKN2A/B loss, and ERBB3 overexpression. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001009974 
   
  
    
    Dataset contains four samples taken from a neonatal patient (CHI-0391) with congenital KMT2A-rearranged Acute Lymphoblastic Leukemia and rare IKZF1 gene fusions. Sequencing was carried out using mRNA-seq on an Illumina NextSeq 500 machine. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD00001009975 
   
  
    
    LP2100100-DNA_A01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009976 
   
  
    
    LP2100100-DNA_A03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009977 
   
  
    
    LP2100100-DNA_C01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009978 
   
  
    
    LP2100100-DNA_C03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009979 
   
  
    
    LP2100100-DNA_E01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009980 
   
  
    
    LP2100100-DNA_E03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009981 
   
  
    
    LP2100100-DNA_G01 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009982 
   
  
    
    LP2100100-DNA_G03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009983 
   
  
    
    LP2100100-DNA_H03 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001009984 
   
  
    
    Pre-diagnostic saliva microbiota samples of Finnish children (aged 11/12 years). This is a case-control study, where case refers to the children who developed Type 1 DM or IBD later in life and control refers to the children who were free from these diseases. The aim of the study was to find biomarkers in saliva microbiota that may help us predict DM or IBD before the onset of these diseases. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  163 
 
  
    EGAD00001009985 
   
  
    
    The dataset contains proteomics data of seven healthy family members. The samples were taken from peripheral blood mononuclear cells. 
    
   
  
    
   
  1 
 
  
    EGAD00001009986 
   
  
    
    Single-cell RNAseq and TCR sequencing data of 5 individuals. 10x Genomics VDJ single cell sequencing (17 runs) and 10x Genomics sc RNA-Seq (19 runs). The VDJ sequencing was done on a NextSeq 550 using the Chromium Single Cell VDJ Reagent Kit. The RNA-Seq was done either on HiSeq 4000 or NovaSeq 6000 using the Chromium Single Cell 5 Reagent Kit. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  10 
 
  
    EGAD00001009987 
   
  
    
    This dataset includes NGS profiling of 13 women with simultaneous bilateral breast cancer.   Seven women have WES of untreated surgical resections and matched healthy tissue.  The six other women have WES of healthy tissue, WES+RNAseq of pre-neoadjuvant tumor biopsies, and when residual disease was present (6 tumors in 4 patients), WES+RNAseq of residual disease from post-neoadjuvant therapy surgery. One patient from the WES+RNAseq cohort had multifocal bilateral disease at diagnosis so there are 2 pre-neoadjuvant biopsy samples from each breast. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  47 
 
  
    EGAD00001009988 
   
  
    
    The dataset includes cram files from WGS of 2 NEN tumors and matched PDTOs. For all tumors, cram files from WGS of matched normal tissue from the corresponding patients are included. Analysis VCF files are also included in the dataset. The sequencing was done with a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009989 
   
  
    
    The dataset includes cram files from WGS of 6 NEN tumors or metastases and matched PDTOs. For all tumors, cram files from WGS of either matched normal tissue or matched blood from the corresponding patients are included. Analysis VCF files are also included in the dataset. The sequencing was done with a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009990 
   
  
    
    The dataset includes cram files from WGS of 2 LCNEC tumors and matched PDTOs. For all tumors, cram files from WGS of matched normal tissue-derived organoids from the corresponding patients are included. Analysis VCF files are also included in the dataset. The sequencing was done with a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009991 
   
  
    
    The dataset includes fastq files from 4 NEN tumors or metastases and matched PDTOs. The sequencing was done with either a NextSeq 2000 or a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001009992 
   
  
    
    The dataset includes fastq files from 15 NEN tumors or metastases and matched PDTOs. The sequencing was done with either a NextSeq 2000 or a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009993 
   
  
    
    The dataset includes fastq files from 2 LCNEC tumors and matched PDTOs. The sequencing was done with either a NextSeq 2000 or a NovaSeq 6000 instrument. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001009994 
   
  
    
    The dataset includes RNA-seq expression R data, an RNA-seq gene counts matrix, and an RNA-seq gene FPKM matrix from 21 NEN tumors or metastases and matched PDTOs. The sequencing was done with either a NextSeq 2000 or a NovaSeq 6000 instrument. 
    
   
  
    
   
  1 
 
  
    EGAD00001009995 
   
  
    
    Adipose-derived mesenchymal stromal cells from subcutaneous (n=4) and visceral (n=4) tissue, along with dermal fibroblasts (n=3) were analyzed by single-cell RNA sequencing. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  11 
 
  
    EGAD00001009997 
   
  
    
    RNAseq in cryostat-microdissected metastatic and primary prostate cancer tissues and matched noncancerous tissues from the same study subjects. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  130 
 
  
    EGAD00001009998 
   
  
    
    Long-read (PacBio) RNA sequencing dataset of in vitro stimulated PBMC cells. 5 samples consisting of 1 RPMI control and 4 stimulus conditions (lipopolysaccharide (LPS), polyI-polyC, S. aureus and C. albicans) all originating from one donor. Files are raw BAM format files generated by Sequel 2 machine. 
    
   
  
    
      
      Sequel 
      
    
   
  5 
 
  
    EGAD00001009999 
   
  
    
    Cell-free methylated DNA immunoprecipitation sequencing of plasma samples from healthy control patients. 
    
   
  
    
   
  28 
 
  
    EGAD00001010000 
   
  
    
    Shallow whole genome sequencing of plasma samples from healthy control patients. 
    
   
  
    
   
  30 
 
  
    EGAD00001010001 
   
  
    
    Targeted panel sequencing of hereditary cancer syndrome-associated genes (TP53, BRCA1, BRCA2, PALB2, MLH1, MSH2, MSH6, PMS2, EPCAM, and APC) in plasma and buffy coat samples from healthy control patients. 
    
   
  
    
   
  23 
 
  
    EGAD00001010002 
   
  
    
    Shallow whole genome sequencing of plasma samples from patients with Li-Fraumeni syndrome. 
    
   
  
    
   
  173 
 
  
    EGAD00001010003 
   
  
    
    RNA-seq data. 287 Japanese RCC cases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  574 
 
  
    EGAD00001010004 
   
  
    
    CD8 T cells (5 treated samples, 5 control samples) from the same donors (5 donors, paired design) were subjected to RNA-seq (Illumina stranded mRNA) processing. Single-end fastq files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  10 
 
  
    EGAD00001010005 
   
  
    
    CD8 T cells were FACS sorted and processed with 10x Genomics Chromium Next GEM Single Cell V(D)J Reagent Kits v1.1 for sequencing. In total, 6 samples were processed. Fastq files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001010007 
   
  
    
    CD8 T cells were FACS sorted and processed with 10x Genomics Chromium Next GEM Single Cell V(D)J Reagent Kits v1.1 for sequencing. In total, 6 samples were processed. Fastq files are supplied. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD00001010008 
   
  
    
    In Vivo Loss of Tumorigenicity in a Patient-Derived Orthotopic Xenograft Mouse Model of Ependymoma. Whitehouse et al. 2023 Frontiers in Oncology.
We describe the establishment of a patient-derived orthotopic xenograft (PDOX) model of posterior fossa A (PFA) EPN, derived from a metastatic cranial lesion. Patient and PDOX tumors were analyzed using RNA sequencing. 
WGS data (paired end) provided here correspond to germline DNA, Surgical sample 4 (described in the above manuscript as a cranial metastasis of PFA ependymoma), and a patient-derived xenograft derived from the patient. 
    
   
  
    
      
      HiSeq X Ten 
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001010009 
   
  
    
    FASTQ files from 3 unaffected TET2 mutation carriers, 2 mutation carriers diagnosed with lymphoma and 3 family members without TET2 mutation. Time series data collected at 0, 6 and 12 months after starting a daily dose of 1g vitamin C. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001010010 
   
  
    
    Here, we explore the molecular signatures in RNA sequencing data from blood associated with disease severity as measured in Myotonic dystrophy type 1 (DM1) patients with less than 400 CTG-repeat length size in the DMPK gene in blood. These DM1 patients participated in the OPTIMISTIC study. This approach involved stratifying those within the OPTIMISTIC study into different patient groups with different degrees of disease severity, as measured by the muscle-impairment rating scale (MIRS) and assessed at baseline. Patients were divided into groups with mild (MIRS 1–2) and severe (MIRS 3–5) neuromuscular symptoms with different DMPK repeat length characteristics. These BAM files are therefore baseline samples from this study. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  32 
 
  
    EGAD00001010011 
   
  
    
    Targeted panel sequencing of hereditary cancer syndrome-associated genes (TP53, BRCA1, BRCA2, PALB2, MLH1, MSH2, MSH6, PMS2, EPCAM, and APC) in plasma and buffy coat samples from patients with Li-Fraumeni syndrome. 
    
   
  
    
      
      unspecified 
      
    
   
  151 
 
  
    EGAD00001010012 
   
  
    
    Dataset containing 48 samples: 12 per timepoint (before or after treatment) and group (MMR vaccine or Placebo). 
Each sequencing run contains the sequencing data from 4 randomized samples. Genotype data is used to demultiplex sample ids inside of each pool. Phenotype data contains the information per pool. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD00001010013 
   
  
    
    Cell-free methylated DNA immunoprecipitation sequencing of plasma samples from patients with Li-Fraumeni syndrome. 
    
   
  
    
      
      unspecified 
      
    
   
  174 
 
  
    EGAD00001010014 
   
  
    
    WGS Cram files from the Childhood Cerebral Palsy Integrated Neuroscience Discovery Network "CP-NET" - Clinical Database Platforms - Phase 3 project. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  287 
 
  
    EGAD00001010015 
   
  
    
    Genotype data typed on the Human Origins array for 1510 individuals published in "Dense sampling of ethnic groups within African countries reveals fine-scale genetic structure and extensive historical admixture." 
    
   
  
    
   
  1 
 
  
    EGAD00001010016 
   
  
    
    BAM files containing paired-end mtDNA sequencing data from morphologically normal human liver. CCO-proficient hepatocytes were acquired from human livers in which clonal CCO-deficient hepatocyte patches had been previously identified. Individual BAM files are named according to their patch, line and sample location, where PT denotes tissue near the portal triad, CV tissue near the central hepatic vein, and Mid tissue midway between these two structures. "Stroma" control samples were used for identifying germ-line variants. Sequenced on the NextSeq 500 platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  80 
 
  
    EGAD00001010017 
   
  
    
    This dataset is part of a study that aims to compare in vivo human trophoblast differentiation into EVTs to different in vitro trophoblast organoids using single-cell and single-nuclei RNA sequencing.  This specific dataset includes scRNA-seq and snRNA-seq data from trophoblast stem cells (TSCs). Trophoblast stem cell (TSC) lines BTS5 and BTS11 derived by Okae and colleagues were grown as described previously (Okae et al. 2018) together with EVT differentiation media. This study shows that the main regulatory programs mediating EVT invasion in vivo are preserved in in vitro models of EVT differentiation from primary trophoblast organoids and trophoblast stem cells. Data for primary trophoblast organoids is available under E-MTAB-12650. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001010018 
   
  
    
    Chimeric antigen receptor (CAR)-modified T-cells have become established as an effective treatment of haematological cancers. In the context of relapsed and refractory childhood pre-B cell acute lymphoblastic leukaemia (B ALL), CD19 targeting CAR T-cells often induce durable remissions. Previously, we generated a novel low-affinity CAR incorporating a CD19-specific single-chain variable fragment (scFV) called CAT, displaying a faster off-rate of interaction than the FMC63 CD19 binder used in prior clinical studies. Here, we systematically analysed CD19 CAR T-cells of ten children with relapsed or refractory B ALL enrolled in the CARPALL trial (NCT02443831). To characterize persisting CD19 CAR T-cells, we performed high throughput single-cell gene expression and T-cell receptor (TCR) sequencing of infusion products and serial blood and bone marrow samples up to five years post-infusion. We isolated CAR T-cells from peripheral blood or bone marrow by flow cytometry for CD3 and CAR expression, prior to single-cell sequencing on the Chromium 10X platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  66 
 
  
    EGAD00001010019 
   
  
    
    Sequencing data of 10 high-grade serous carcinoma (HGSC) patients (58 samples including blood derived normal samples as germline controls, fresh frozen tissue samples as tissue controls and organoids) sequenced with HiSeq X Ten / BGISEQ-500 / MGISEQ-2000. 
    
   
  
    
      
      HiSeq X Ten 
      
      unspecified 
      
    
   
  58 
 
  
    EGAD00001010020 
   
  
    
    Single-cell RNA-seq and spatial transcriptomics data for 12 patients with sarcoidosis. From each patient, we analyzed skin biopsies of both lesional and non-lesional skin and we performed spatial transcriptomics for lesional skin samples. The data are provided as aligned BAM files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  41 
 
  
    EGAD00001010021 
   
  
    
    Air Pollution Study - DuplexSeq data 
    
   
  
    
      
      unspecified 
      
    
   
  81 
 
  
    EGAD00001010022 
   
  
    
    Bank of treated and control PDXs: metastatic colorectal cancer sample RNAseq 
    
   
  
    
      
      unspecified 
      
    
   
  185 
 
  
    EGAD00001010023 
   
  
    
    Bulk B Cell Receptor high-throughput sequencing data across 27 metastatic breast tumours obtained from 8 donors with therapy-resistant lethal metastatic breast cancer at the time of a warm autopsy. The 35 samples were sequenced on an Illumina MiSeq instrument and their raw FastQ files deposited here. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  27 
 
  
    EGAD00001010024 
   
  
    
    Bronchial brushing dataset from healthy never-smokers after exposure to diesel exhaust. Includes 18 samples from 9 research participants who underwent bronchoscopy after controlled exposure to diesel exhaust. Main study design described in detail in Ryu et al 2022 AJRCCM (PMID: 35202552). This dataset was used in Hill et al Nature 2023. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001010025 
   
  
    
    scRNA
This dataset contains 50 scRNA-seq samples from bone marrow aspirates of 11 multiple myeloma patients experiencing long-term survival and 3 healthy donors. For each donor, total bone marrow and CD3+ T cells were sequenced. For multiple myeloma patients, paired samples were collected at initial diagnosis and between 7-17 years after first-line therapy. Bone marrow mononuclear cells were isolated by Ficoll density gradient centrifugation. For sorting of total bone marrow cells, singlet, live cells were gated and sorted; for sorting of T cells, CD45+, CD3+ cells were gated and sorted on either a FACSAria Fusion or a FACSAria II. Single-cell RNA sequencing libraries were generated using 10x Genomics single-cell RNAseq technology (Chromium Single Cell 3’ Solution v2) according to the manufacturer’s protocol and sequenced on an Illumina HiSeq4000 (paired end, 26 and 74 bp).
Bulk RNA
Singlet, live CD3+CD4- CXCR3+CD8+ and CD3+CD4- CXCR3-CD8+ cells were sorted from 7 bone marrow and 3 peripheral blood samples of 7 multiple myeloma patients using a FACSAria Fusion machine. Bulk-RNA sequencing libraries were generated using the SMART Seq Stranded Total RNA-Seq kit (Takara) and sequenced using the Illumina NovaSeq 6000 platform (2 x 100 bp). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  82 
 
  
    EGAD00001010026 
   
  
    
    The dataset consists of: 51 paired tumor/normal WGS samples (26 tumors and 25 normals), and 13 normal targeted samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001010028 
   
  
    
    Metagenomics shotgun sequencing was conducted on fecal samples from the Australian patients enrolled in the OpACIN-neo clinical trial (n = 38). Metagenomic shotgun sequencing was performed utilizing the same DNA from the same preparations as for the 16S rRNA gene analysis. Individual libraries were prepared using Nextera XT, and sequencing was performed on the Illumina NovaSeq 6000 S1 (2 x 150bp; Xp workflow). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD00001010029 
   
  
    
    Whole genome sequencing data of 19 high-grade serous carcinoma (HGSC) patients (47 samples) sequenced with HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  78 
 
  
    EGAD00001010030 
   
  
    
    Dataset with whole-genome sequencing tumor and normal samples from 14 neuroblastoma patients. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  28 
 
  
    EGAD00001010031 
   
  
    
    Whole-exome sequencing of tumour regions and deep targeted sequencing of plasma samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  1106 
 
  
    EGAD00001010033 
   
  
    
    RNAseq data from Passman et al 2023. Clonal CCO-deficient hepatocyte patches and nearby CCO-proficient hepatocytes were identified in morphologically normal human livers and sampled at varying distances along the PT-CV axis. Samples are characterised according to their location within the liver lobule, with "PT" denoting samples abutting the portal triad, "CV" denoting samples abutting the central hepatic vein, and "Mid" denoting samples acquired midway between these structures. Sequencing was performed on an Illumina NextSeq using a high output kit and 100 single-end cycles. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  114 
 
  
    EGAD00001010034 
   
  
    
    This dataset has the raw RNA sequencing data for the cancer models in CCMA. 
    
   
  
    
      
      unspecified 
      
    
   
  184 
 
  
    EGAD00001010035 
   
  
    
    This dataset has the mapped bam files from WGS for the cancer models in CCMA. 
    
   
  
    
      
      unspecified 
      
    
   
  146 
 
  
    EGAD00001010036 
   
  
    
    This dataset includes trios of germline/constitutional, primary tumor (small bowel carcinoid), and metastatic tumor (liver) samples for 5 patients (i.e. 15 samples total). Constitutional, primary tumor, and metastatic tumor samples all underwent whole exome sequencing (WES or WXS). Primary and metastatic tumor samples underwent RNA sequencing (RNA-seq). 
    
   
  
    
   
  15 
 
  
    EGAD00001010037 
   
  
    
    This dataset is part of a study that aims to provide a spatially resolved single-cell multiomics map of human trophoblast differentiation in early pregnancy. This dataset contains snucRNAseq data from three human implantation sites (between 8 and 12 post-conceptional weeks, PCW) from medical hysterectomies. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD00001010038 
   
  
    
    This dataset is part of a study that aims to provide a spatially resolved single-cell multiomics map of human trophoblast differentiation in early pregnancy. This dataset contains 10x multiome snRNA-seq/snATAC-seq from human implantation sites, decidual and placental samples from 8-9 PCW. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001010039 
   
  
    
    Whole-exome sequencing was performed on DNA extracted from blood samples of 50 children diagnosed with cutaneous melanoma prior to 18 years of age. Patients were all diagnosed in Queensland, Australia, and self-reported as of European descent. Sequencing was done on the Illumina platform using SureSelect V7 Post capture kits. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001010040 
   
  
    
    scRNA-seq analysis of 384 placental immune cells 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001010041 
   
  
    
    The cohort comprised 48 pediatric patients with 21 different relapsed or refractory solid neoplasms. This cohort was analysed by RNASeq. The corresponding dataset contains FASTQ files. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  46 
 
  
    EGAD00001010042 
   
  
    
    Molecular analysis of cancer genomes in children with Lynch syndrome: exploring causal associations WGS 
    
   
  
    
   
  1 
 
  
    EGAD00001010043 
   
  
    
    Molecular analysis of cancer genomes in children with Lynch syndrome: exploring causal associations WXS 
    
   
  
    
   
  1 
 
  
    EGAD00001010046 
   
  
    
    WES/WGS sequencing data of 37 germline runs, which were uploaded to umbrella studies. The sequencing was always paired. The WGS sequencing was on HiSeq X Ten using the Illumina TruSeq DNA Nano Kit. The WES Sequencing was on HiSeq4000 with Agilent Sureselect V5+UTR. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001010047 
   
  
    
    The control samples (mostly blood) of 351 samples (paired WGS and WES sequencing) are in this dataset. The WGS was performed in nearly all cases on an Illumina HiSeq X Ten with the Illumina TruSeq Nano DNA Kit. The WES was performed mostly on an Illumina HiSeq 4000 with the Agilent SureSelect V5 plus UTRs Kit. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  353 
 
  
    EGAD00001010049 
   
  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001010050 
   
  
    
    Dataset includes 1) RNAseq for Bob_Ngn2 cell differentiation from iPSC to iNeuron stage (time points 0, 24, 48 and 96 hours). 2) Bob_Ngn2 cell line ChIP-seq (H3K4me3, H3K4me1, H3K27me3, H3K9me3, H3K27ac, H3K36me3) for both iPSC and iNeuron stages with 3 replicates at each time point. 3) Single-cell CRISPR activation experiment with 96 endogenous genes. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD00001010051 
   
  
    
    Dataset of CageKid Targeted Sequencing DNA samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  1022 
 
  
    EGAD00001010052 
   
  
    
    Samples of blood, muscle and fat were collected from individuals with TS (n = 33) and KS (n = 22) and from male (n = 16) and female (n = 44) controls. The RNA-seq libraries were multiplexed, paired-end sequenced on an Illumina NovaSeq 6000 (100 bp) and subjected to initial quality control using FastQC (Babraham Bioinformatics). In addition to trimming of low-quality ends, adaptor removal was conducted using Trim Galore with default settings (Babraham Bioinformatics). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  212 
 
  
    EGAD00001010055 
   
  
    
    We have paired 39 individuals in 4 conditions: T0_RPMI (baseline, before BCG and without LPS); T0_LPS (before BCG and with LPS); T3m_RPMI (3 months after BCG, without LPS); T3m_LPS (3 months after BCG, with LPS). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD00001010056 
   
  
    
    This multi-centre, non-randomized, open-label, phase II trial (NCT03016338), assessed niraparib monotherapy (cohort 1, C1), or niraparib and dostarlimab (cohort 2, C2) in patients with recurrent serous or endometrioid endometrial carcinoma. The primary endpoint was clinical benefit rate (CBR). Secondary outcomes were safety and objective response rate (ORR). Translational research was an exploratory outcome. Potential biomarkers were evaluated in archival tissue by immunohistochemistry and next generation sequencing panel. Feasibility of liquid biopsy by ctDNA was assessed. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  86 
 
  
    EGAD00001010057 
   
  
    
    We analyzed multiple myeloma samples from two patients included in the observational prospective cohort MYRACLE before talquetamab treatment and after relapse. Five other myelomas from the same cohort were included for comparison. Normal plasma cells were also retrieved. All samples were analyzed by whole genome sequencing and single-nucleus Multiome, except one that could only be analyzed by bulk RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001010059 
   
  
    
    Dataset including paired tumor-normal whole-genome deep-sequenced samples from 18 neuroblastoma patients (part 1 of a total of 36 patients). 
    
   
  
    
      
      unspecified 
      
    
   
  36 
 
  
    EGAD00001010063 
   
  
    
    Dataset including paired tumor-normal whole-genome deep-sequenced samples from 18 neuroblastoma patients (part 2 of a total of 36 patients). 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001010064 
   
  
    
    38 STEMI patients at hospital admission, 24 hours (acute phase) and 6-8 weeks (chronic phase) after STEMI 
    
   
  
    
      
      unspecified 
      
    
   
  9 
 
  
    EGAD00001010065 
   
  
    
    Dataset for the paper: Non-muscle Invasive Bladder Cancer Molecular Subtypes Predict Differential Response to Intravesical Bacillus Calmette-Guérin 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  327 
 
  
    EGAD00001010066 
   
  
    
    Whole genome sequencing data of 7 high-grade serous carcinoma (HGSC) patients (32 samples) sequenced with HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  - 
 
  
    EGAD00001010067 
   
  
    
    Bank of metastasis-derived organoids (LMO) 
    
   
  
    
      
      unspecified 
      
    
   
  220 
 
  
    EGAD00001010068 
   
  
    
    156 shotgun gut metagenomics samples, corresponding to 51 patients with CRC, 54 patients with adenoma, and 51 healthy controls 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001010069 
   
  
    
    Dataset contains all available exome sequencing paired-end fastq files from our study "A generalizable machine learning framework for classifying DNA repair defects using ctDNA exomes" 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  310 
 
  
    EGAD00001010070 
   
  
    
    Targeted gene panel sequencing for 206 genes with relevance in normal and leukemic lymphopoiesis was performed in bone marrow / peripheral blood samples from n=96 patients with first diagnosis of B cell precursor acute lymphoblastic leukemia. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  96 
 
  
    EGAD00001010071 
   
  
    
    This dataset includes all data produced in the study describing "scEC&T-seq", a method for parallel sequencing of extrachromosomal circular DNA and transcriptome in single cells. This dataset includes: 
- Illumina scEC&T-seq Circle-seq data (scCircle-seq) for a total of 626 single cells / nuclei  - bam files
- Illumina scEC&T-seq RNA-seq data (scRNA-seq-Illumina) for the same single cells / nuclei - bam files 
- Nanopore scCircle-seq data for 18 single cells - bam files
- Nanopore bulk WGS for 2 cell lines and 2 primary tumor samples - bam files
- Illumina bulk WGS for 2 cell lines - bam files
- Illumina bulk Circle-seq data from 1 cell line - bam file
- Illumina ChIP-seq H3K27me3 data from 1 cell line - fasta files + peaks bed file + coverage bw file 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina MiniSeq 
      
      Illumina NovaSeq 6000 
      
      MinION 
      
    
   
  1180 
 
  
    EGAD00001010073 
   
  
    
    This data set includes FASTQ files for five experiments:
Human scRNA-Seq data (6 samples),
Human Visium spatial transcriptomic data (3 samples),
Mouse scRNA-seq data (4 samples),
Mouse scATAC-seq data (2 samples),
Mouse ChIP-seq data (8 samples).
For the mouse scATAC-seq data (2 samples), there were originally three FASTQ files: R1, R2 and R3. The R1 and R2 FASTQ files were merged into one larger file ("R1R2") per sample so that the data could be submitted under the paired-end sequencing setting, following the EGA guidelines. They can be split back into individual R1 and R2 files for processing with the Cell Ranger software.
The files with the IC1 suffix are ChIP-seq input controls for the anti-KDM6B antibody pull-down experiments, and the files with the IC2 suffix are ChIP-seq input controls for the anti-H3K27ME3 antibody pull-down experiments in the murine model. 
    
   
  
    
      
      NextSeq 500 
      
      unspecified 
      
    
   
  22 
 
  
    EGAD00001010074 
   
  
    
    Data supporting: “Single-cell RNA sequencing unifies developmental programs of Esophageal and Gastric Intestinal Metaplasia.” Nowicki-Osuch, Zhuang et al.
scRNAseq (FASTQ files)
59 samples 
    
   
  
    
      
      unspecified 
      
    
   
  16 
 
  
    EGAD00001010075 
   
  
    
    WGS was performed for five Japanese subjects. DNA samples isolated from whole blood were sequenced at Macrogen Japan Corporation. All libraries were constructed using the TruSeq DNA PCR-Free Library Preparation Kit according to the manufacturer’s protocols. Libraries were sequenced on HiSeqX (Illumina, San Diego, CA, USA) or Novaseq6000 (Illumina, San Diego, CA, USA). 
    
   
  
    
      
      Illumina HiSeq 3000 
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001010076 
   
  
    
    This dataset contains raw sequencing fastq data of 17 samples from 7 donors, generated by single-cell RNA sequencing using 10x Genomics technology on human postmenopausal fallopian tube and ovary tissues. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  17 
 
  
    EGAD00001010077 
   
  
    
    This dataset contains raw sequencing fastq data of 14 samples from 5 donors, generated by single-cell ATAC sequencing using 10x Genomics technology on human postmenopausal fallopian tube and ovary tissues. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  14 
 
  
    EGAD00001010078 
   
  
    
    RNA-sequencing profiling of leucocytes from peripheral blood samples from 9 KS patients, 9 control males and 13 female controls 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001010079 
   
  
    
    446 samples from COVID-19 patients. Raw reads in FASTQ format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  446 
 
  
    EGAD00001010080 
   
  
    
    This depository contains data from two bulk RNA sequencing experiments:
1) Bulk RNA sequencing data of peripheral blood neutrophils from healthy donors cultured with a) human adipose-derived stromal cells (ADSC) as a model for mesenchymal stromal cells (MSC), and b) IL-1β stimulated ADSC as a model for inflammatory MSC as found in multiple myeloma (MM). 
2) Bulk RNA sequencing data from ADSCs cultured a) without stimuli, b) with recombinant human IL-1β, c) with supernatant from iMSC-like cells, d) with neutrophils previously cultured with MSC, e) with neutrophils previously cultured with iMSC, f) with neutrophils previously cultured with iMSC in the presence of anti-human IL-1β or g) with neutrophils previously cultured with iMSC in the presence of an isotype control. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  41 
 
  
    EGAD00001010081 
   
  
    
    28 patients. Cell-free DNA and leukocyte DNA, both from before any neoadjuvant treatment. Tumor FFPE tissue (plus metastatic tissue for some patients) from after any neoadjuvant treatment. Some cfDNA from follow-up time points (e.g. after neoadjuvant treatment or in the clinical course). IDT xGen Pan-Cancer panel for hybridisation capture. Illumina TruSeq library prep for cfDNA and leukocyte DNA, Illumina DNA Prep for FFPE DNA. n=150 libraries in total. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  150 
 
  
    EGAD00001010082 
   
  
    
    Technical replicates 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010083 
   
  
    
    Technical replicates 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010084 
   
  
    
    Freezing replicates 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010085 
   
  
    
    Freezing replicates 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010086 
   
  
    
    FACS processing technical replicates 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010087 
   
  
    
    FACS processing technical replicates 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010088 
   
  
    
    time-course biological replicates; 10x lane replicate 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010089 
   
  
    
    time-course biological replicates; 10x lane replicate 1 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010090 
   
  
    
    time-course biological replicates; 10x lane replicate 2 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010091 
   
  
    
    time-course biological replicates; 10x lane replicate 2 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010092 
   
  
    
    time-course biological replicates; 10x lane replicate 3 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010093 
   
  
    
    time-course biological replicates; 10x lane replicate 3 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010094 
   
  
    
    Fastq files of single-cell RNA-sequencing data generated with 10X Genomics of twelve non-invasive cervical samples from pregnant women (7-12 weeks gestational age) and six placental biopsies from patients who had a recurrent miscarriage early during gestation (<12 weeks gestational age). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001010095 
   
  
    
    Whole-genome sequencing data in the form of multi-sample, per-chromosome VCFs for n=449 individuals across 47 unique ethnolinguistic groups. 
    
   
  
    
   
  1 
 
  
    EGAD00001010097 
   
  
    
    In order to investigate possible mechanisms underlying the phenotype of cell fitness decline observed following decrease of VRK3 in pontine DMG-K27 altered cells 7, we examined differences in global gene expression. RNA-seq was performed 44h and 60h post-transduction with two distinct shRNAs targeting VRK3 to be able to evaluate the early impact of VRK3 knock-down (KD) in four independent in vitro models of DMG 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD00001010098 
   
  
    
    Dataset contains 483 individuals of Irish origin with Covid19. For WGS, alignment was done using BWA-MEM (Sentieon v. 201808.03) and BAM files were generated. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  483 
 
  
    EGAD00001010099 
   
  
    
    The dataset contains RNA-seq alignment files in BAM format for 446 Covid19 patients, for both genomic and transcriptomic alignment. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  446 
 
  
    EGAD00001010100 
   
  
    
    These are aligned paired-end reads from Illumina NovaSeq 6000 whole-genome sequencing of 4 cfDNA samples extracted from blood plasma (plasma-Seq). Three samples from patients with breast cancer, prostate cancer, or colorectal cancer and one sample from a healthy individual were aligned to GRCh38 (GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set). The observed GC bias is different in each of these cfDNA samples which leads to different average GC content per sample. This bias is corrected using the GCparagon commandline tool. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001010101 
   
  
    
    Bank of primary sites (PRs) colorectal cancer of Patient Derived Xenografts (PDXs) 
    
   
  
    
      
      unspecified 
      
    
   
  159 
 
  
    EGAD00001010102 
   
  
    
    Whole exome and RNA sequencing of 5 samples of patient-derived xenograft (PDX). Available files are raw sequencing fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001010103 
   
  
    
    This dataset included 19 paired diagnostic and remission samples with high hyperdiploid acute lymphoblastic leukemia (ALL) that were collected from four different cohorts: the Division of Clinical Genetics, Lund University, Sweden. All samples were subjected to whole genome sequencing using the Illumina HiSeqX platform. Paired-end sequencing (2x150bp) was done to ~60x coverage for diagnostic samples and ~30x coverage for remission samples. The paired-end reads were aligned to the human reference genome GRCh37 (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Homo_sapiens/all_assembly_versions/GCF_000001405.25_GRCh37.p13/GCF_000001405.25_GRCh37.p13_genomic.fna.gz) by the Burrows-Wheeler Aligner tool (version 0.7.17). Duplicate reads marking and local realignment were performed by GATK (version 4.0.11.0). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  23 
 
  
    EGAD00001010104 
   
  
    
    LLD PhIP-Seq reads. Oligopeptides were designed in Eran Segal's group at the Weizmann Institute of Science. 
Dataset includes 1,783 plasma samples. In 340 participants, a second time point was taken after a 4-year follow-up. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1784 
 
  
    EGAD00001010105 
   
  
    
    This submission contains gzipped fastq files from paired-end targeted sequencing. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  242 
 
  
    EGAD00001010106 
   
  
    
    This dataset contains scRNA-seq fastq files for the paper entitled "Single-cell profiles reveal distinctive immune response in atopic dermatitis in contrast to psoriasis". The details of the experimental setup are described in the paper. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001010108 
   
  
    
    Whole genome, exome and RNA sequencing of TFCP2-rearranged rhabdomyosarcoma, 86 samples of paired fastq files, sequenced on: 
Illumina HiSeq 2500 using Agilent SureSelect WGS,
Illumina HiSeq 2500 using Illumina TruSeq RNA,
Illumina HiSeq 2500 using Agilent SureSelect v5 WES (+UTRs),
Illumina HiSeq 4000 using Agilent SureSelect WGS,
Illumina HiSeq 4000 using Illumina TruSeq stranded mRNA Kit,
Illumina HiSeq 4000 using Agilent SureSelect v5 WES stranded mRNA Kit,
Illumina NovaSeq 6000 using Illumina TruSeq Stranded mRNA,
Illumina NovaSeq 6000 using Illumina TruSeq Nano DNA,
Illumina HiSeq X Ten using TruSeq Nano DNA. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  50 
 
  
    EGAD00001010109 
   
  
    
    Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had breast cancer and those who are BRCA 1/2 carriers.
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  48 
 
  
    EGAD00001010110 
   
  
    
    Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 4000 
      
    
   
  92 
 
  
    EGAD00001010111 
   
  
    
    Sequencing of LCM-derived microbiopsies from explanted lung from COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD00001010112 
   
  
    
    Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  67 
 
  
    EGAD00001010113 
   
  
    
    Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001010114 
   
  
    
    Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  251 
 
  
    EGAD00001010115 
   
  
    
    Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  315 
 
  
    EGAD00001010116 
   
  
    
    Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  199 
 
  
    EGAD00001010117 
   
  
    
    Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. 
This dataset contains all the data available for this study on 2023-03-08. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  480 
 
  
    EGAD00001010118 
   
  
    
    PhIP-Seq experiment conducted in Eran Segal's group at WIS. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  497 
 
  
    EGAD00001010119 
   
  
    
    Formalin-fixed, paraffin-embedded samples from 643 colorectal adenomas collected in different hospitals in Norway, from which DNA was extracted, were analysed for DNA copy number alterations. Some adenomas had more than one block (n=42), thus 643 individuals, 685 blocks. Low-coverage whole genome sequencing was run in all samples. For 529 individuals all the clinical information was available. A subset was matched for follow-up time, age and sex in a nested case-control approach (n=366; cases - individuals who developed later on CRC; controls - individuals who did not develop CRC within the same follow-up time). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  685 
 
  
    EGAD00001010120 
   
  
    
    Microhaplotype amplicon sequencing of cervical samples (n=10), parental DNA (n=20), cfDNA (n=10) and control experiments using HapMap DNA in different spike-in percentages. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  81 
 
  
    EGAD00001010121 
   
  
    
    Early-stage Luminal B breast cancer is frequent and is a major cause of breast cancer death due to its poor prognosis. Our proposal aims to study the biology behind the sensitivity and resistance of Luminal B breast cancer to chemotherapy (CHT) or a non-CHT regimen composed of hormone therapy in combination with ribociclib, a CDK4/6 inhibitor. To accomplish this, we first completed the SOLTI-1402 CORALLEEN phase II trial, a study where 106 patients with early-stage Luminal B breast cancer were randomized to standard neoadjuvant CHT for 6 months, or neoadjuvant letrozole and ribociclib for 6 months. After treatment, patients underwent surgery. The primary results of the study, which showed that the response rate to letrozole+ribociclib was similar to CHT, were reported (Prat et al; Lancet Oncol). Tumor biopsies were available at baseline, week 3 and surgery. A total of 257 samples were analyzed using the Illumina TruSeq Stranded Total RNA w/Ribo Zero Gold with MiSeq in TGL (Sequencer NovaSeq S4/PE/100x) 
    
   
  
    
   
  257 
 
  
    EGAD00001010122 
   
  
    
    In this study we aim to characterise the landscape of mutation and clonal selection in normal lung and premalignant lung disease. The study combines targeted sequencing and whole-genome sequencing of microbiopsies of lung and bronchial epithelium. The range of patients studied will include healthy individuals, both smokers and non-smokers, and patients with premalignant lung disease. 
This dataset contains all the data available for this study on 2023-03-09. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010123 
   
  
    
    Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from breast tissue to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. We will sample from a wide age range of individuals (<20 to >70 years old) to determine whether these processes differ in pre- and post-menopausal women. We will also be comparing the tissue from healthy individuals (samples from breast reduction surgery) to those at elevated risk of breast cancer (mastectomy from BRCA1/2 patients) and those who have breast cancer (adjacent normal, distal normal, and tumour tissue from mastectomy). This will allow us to determine how these processes are different between these groups of individuals, and gain insight into the earliest stages of tumour development. 
This dataset contains all the data available for this study on 2023-03-09. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010124 
   
  
    
    Whole genome sequencing to identify subclonal variants for subsequent mapping back to fixed tissue specimens. 
This dataset contains all the data available for this study on 2023-03-09. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010125 
   
  
    
    This project is correlating the molecular profiling of renal tumours with multiparametric and 13C-MRI including by 13C-MRSI. 
This dataset contains all the data available for this study on 2023-03-09. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001010126 
   
  
    
    Melanoma is the most aggressive type of skin cancer, causing about 75% of dermatological cancer deaths. Acral lentiginous melanoma (ALM) is the most common subtype of melanoma in admixed Latin American populations, but very few tumour genomes and exomes, all from European-descent individuals, have been analysed across several studies. Because of this, the genomic landscape of ALM is mostly unknown. Our aim in this project is to define this landscape and identify driver somatic alterations by whole-exome sequencing a collection of ALM germline/tumour paired FFPE samples from the National Cancer Institute of Mexico. 
This dataset contains all the data available for this study on 2023-03-09. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001010128 
   
  
    
    Whole genome sequencing data from paediatric (<18-y) ETV6-RUNX1 fusion positive acute lymphoblastic leukemias.
Dataset includes fastq and BAM files from diagnostic and remission (control) samples of 33 patients. Dataset consists of two experiments 
distinguished by sequencing instrument: "Experiment 1" was sequenced on the Illumina HiSeq X Ten and "Experiment 2" on the Illumina NovaSeq 6000. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  66 
 
  
    EGAD00001010129 
   
  
    
    We perform targeted sequencing on 1217 pre-malignant gastric biopsies. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1900 
 
  
    EGAD00001010130 
   
  
    
    This dataset contains single cell RNA-seq data of stromal cells derived from two PDX models (N = 2 in total) and bulk RNA-seq data of two PDX models treated with gemcitabine and our novel antibody-drug conjugate, C6-EBET (N = 59 in total). Bulk RNA-seq experiments were performed with Agilent SureSelect Strand Specific RNA Library Prep Kit (Agilent). Single cell RNA-seq experiments were performed with Chromium Single Cell 3' Reagent Kits v2 Chemistry (10x Genomics). 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  61 
 
  
    EGAD00001010131 
   
  
    
    Bulk RNAseq from 183 premalignant gastric biopsies 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  183 
 
  
    EGAD00001010134 
   
  
    
    patient-derived head and neck cancer organoids 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD00001010135 
   
  
    
    FASTQ reads for 34 matched tumour-normal WGS pairs for high grade serous ovarian cancer patients. Scottish HGSOC samples were collected via local Bioresource facilities at Edinburgh, Glasgow, Dundee and Aberdeen and stored in liquid nitrogen until required. HGSOC patients were determined from pathology records and were included in the study where there were matched tumour and whole blood samples. Tumour samples were divided into two for DNA and RNA extraction and slivers of tissue were taken, fixed in formalin and embedded in paraffin wax (FFPE). Samples were only included if they were confirmed as HGSOC and there was greater than 40% tumour cellularity throughout the tumour, determined using H&E staining of the FFPE sections and pathology review. Somatic DNA was extracted from the tumour and germline DNA was extracted from whole blood. Somatic DNA was extracted using the Qiagen DNeasy Blood and tissue kit (cat no 69504). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including RNase digestion step). Germline DNA was extracted from 1-3ml whole blood using the Qiagen FlexiGene kit (cat no 51206) following the manufacturer's recommended protocol. The resulting DNA underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as surrogate measures of DNA purity. A260/280 had to be 1.8 or greater and A260/230 had to be 2.0 or greater. Then, DNA was quantified using LifeTechnologies Qubit dsDNA BR kit (cat no Q32850) and we required a minimum of 50ul at 25ng/ul for WGS. Thirdly, DNA was diluted to 25ng/ul and a representative sample was loaded onto a 0.8% TAE gel, run at 100V for 60mins and then imaged using a BioRad ChemiDoc imaging system to visualise the DNA quality. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  68 
 
  
    EGAD00001010136 
   
  
    
    DDD resource files (e.g. link between sample and individual ids) 
    
   
  
    
   
  - 
 
  
    EGAD00001010137 
   
  
    
    Candidate diagnostic variants reported into DECIPHER by 4 April 2022, annotated with clinical and automated pathogenicity assertions (see DOI: 10.1056/NEJMoa2209046). Genomic Diagnosis of Rare Pediatric Disease in the United Kingdom and Ireland, Wright et al, NEJM 2023. 
    
   
  
    
   
  - 
 
  
    EGAD00001010138 
   
  
    
    483 samples were collected for WGS and aligned with GRCh38 human genome. The variants were called using GATK (Sentieon v. 201808.03) in GVCF and VCF format. 
    
   
  
    
   
  - 
 
  
    EGAD00001010139 
   
  
    
    Somatic RNA for 37 samples was extracted using the Qiagen Qiasymphony RNA protocol (cat no 931636). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including DNase digestion). The resulting RNA then underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as measures of RNA purity. A260/280 had to be 2.0 and A260/230 had to be 2.0-2.2. Then RNA was quantified using LifeTechnologies Qubit RNA BR kit (cat no Q10210). RNAseq was carried out by the Edinburgh Clinical Research Facility on an Illumina NextSeq 500. Total RNA samples were assessed on the Agilent Bioanalyser (Agilent Technologies, #G2939AA) with the RNA 6000 Nano Kit (#5067-1512) for quality and integrity of total RNA, and then quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific Inc, #Q32866) and the Qubit RNA HS assay kit (#Q32855). Libraries were prepared from total RNA samples using the NEBNext Ultra 2 Directional RNA library prep kit for Illumina (#E7760S) with the NEBNext rRNA Depletion kit (#E6310) according to the provided protocol. 400ng of total RNA was then added to the ribosomal RNA (rRNA) depletion reaction using the NEBNext rRNA depletion kit (Human/mouse/rat) (#E6310). This step uses specific probes that bind to the rRNA in order to cleave it. rRNA-depleted RNA was then DNase treated and purified using Agencourt RNAClean XP beads (Beckman Coulter Inc, #66514). RNA was then fragmented using random primers before undergoing first strand and second strand synthesis to create cDNA. cDNA was end repaired before ligation of sequencing adapters, and libraries were enriched by PCR using the NEBNext Multiplex oligos for Illumina set 1 and 2 (#E7500). Final libraries had an average peak size of 271bp. 
Libraries were quantified by fluorometry using the Qubit dsDNA HS assay and assessed for quality and fragment size using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (#FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). Libraries were combined in an equimolar pool based on the library quantification results and run across 5 High-Output Flow Cell v2.5. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  37 
 
  
    EGAD00001010140 
   
  
    
    This dataset contains 55 whole-genome sequences from the study titled "Spatial transcriptomics reveal topological immune landscapes of Asian head and neck angiosarcoma". 
    
   
  
    
   
  59 
 
  
    EGAD00001010141 
   
  
    
    Shallow whole-genome sequencing data divided into three groups:
- sWGS data from Pap smears of patients with confirmed high grade serous ovarian cancer
- sWGS data from matched tumor tissue (at diagnosis) from the same patients
- sWGS data from Pap test smears of healthy women 
    
   
  
    
      
      NextSeq 550 
      
    
   
  186 
 
  
    EGAD00001010142 
   
  
    
    The PGAP dataset 2 includes 82 whole genome sequences for Papua New Guinean individuals sampled in Daru (N=1), Port Moresby (N=64) and Mount Wilhelm (N=17). DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides Fastq, mapped cram files (GRCh38) and phenotype measurements. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  82 
 
  
    EGAD00001010143 
   
  
    
    The PGAP dataset 1 includes 81 whole genome sequences for Papua New Guinean individuals sampled in Daru (N=38) and Mount Wilhelm (N=43). DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides Fastq, mapped cram files (GRCh38) and phenotype measurements. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  81 
 
  
    EGAD00001010144 
   
  
    
    Processed somatic variant calls 
    
   
  
    
   
  1 
 
  
    EGAD00001010145 
   
  
    
    Core phenotypic variables 
    
   
  
    
   
  1 
 
  
    EGAD00001010146 
   
  
    
    Whole-transcriptome sequencing (RNAseq) in patients with chronic rhinosinusitis with nasal polyps (CRSwNP) and controls. Differential expression patterns in genes involved in cilia, viral defense and NKT-cell specific pathways, suggesting a role of viral immunity in combination with cilia functionality in CRSwNP. 
    
   
  
    
   
  53 
 
  
    EGAD00001010147 
   
  
    
    This dataset includes an analyzed DMP file that provides information about differentially methylated positions based on the Illumina Infinium MethylationEPIC BeadChip. All samples (5 lung cancer cases vs. 5 benign lung disease controls) were obtained from bronchial washings at the site of the lesion under bronchoscopy manipulation. Of the five lung cancer cases, 3 are adenocarcinoma and 2 are squamous carcinoma. 
    
   
  
    
   
  1 
 
  
    EGAD00001010148 
   
  
    
    Contains IMCISION samples sequenced on Flongle flowcells 
    
   
  
    
      
      MinION 
      
    
   
  1 
 
  
    EGAD00001010149 
   
  
    
    Contains Synthetic samples sequenced on R9 flowcells 
    
   
  
    
      
      MinION 
      
    
   
  3 
 
  
    EGAD00001010150 
   
  
    
    Contains Healthy samples sequenced on Flongle flowcells 
    
   
  
    
      
      MinION 
      
    
   
  2 
 
  
    EGAD00001010151 
   
  
    
    Contains Healthy samples sequenced on R9 flowcells 
    
   
  
    
      
      MinION 
      
    
   
  7 
 
  
    EGAD00001010152 
   
  
    
    Contains Synthetic samples sequenced on Flongle flowcells 
    
   
  
    
      
      MinION 
      
    
   
  6 
 
  
    EGAD00001010153 
   
  
    
    Contains PREDICT samples sequenced on R10 flowcells 
    
   
  
    
      
      MinION 
      
    
   
  2 
 
  
    EGAD00001010154 
   
  
    
    Contains IMCISION samples sequenced on R9 flowcells 
    
   
  
    
      
      MinION 
      
    
   
  13 
 
  
    EGAD00001010155 
   
  
    
    Contains PREDICT samples sequenced on R9 flowcells 
    
   
  
    
      
      MinION 
      
    
   
  28 
 
  
    EGAD00001010156 
   
  
    
    We stratified 69 primary IDH-wt GBM patients into TMZ-resistant (n = 29) and sensitive (n = 40) groups, using TMZ screening of the corresponding patient-derived glioma stem-like cells (GSCs). Genomic and transcriptomic features were then examined to identify TMZ-associated molecular alterations. Subsequently, we developed a machine learning (ML) model to predict TMZ response from combined signatures. Moreover, TMZ response in multisector samples (52 tumor sectors from 18 cases) was evaluated to validate findings and investigate the impact of intra-tumoral heterogeneity on TMZ efficacy. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  113 
 
  
    EGAD00001010157 
   
  
    
    Whole genome sequencing of 5 IM samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001010158 
   
  
    
    NCCS-NSCLC-ITH2 dataset of 185 sectors from 48 patients with early-stage non-small cell lung cancer diagnosed in National Cancer Centre Singapore; these are paired-end, whole-exome and bulk RNA sequencing data, sequenced by Illumina HiSeq 4000/2000. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  336 
 
  
    EGAD00001010159 
   
  
    
    This dataset contains in-solution target-enrichment bisulfite sequencing of placental tissue, buffy coat and plasma DNA from pregnant women. Blood samples were taken for cell-free DNA (cfDNA) extraction from 64 women at the time of early-onset preeclampsia (PE) diagnosis, or from 38 controls (uncomplicated pregnancies) at a similar gestational age that did not develop preeclampsia subsequently. Among these subjects, plasma samples from 7 PE patients and 6 controls were also subjected to oxidative bisulfite sequencing. Placental tissues from 11 PE and 26 control subjects after delivery, and buffy coat from 16 PE and 16 control subjects at the same time of cfDNA sampling were profiled. A discovery cohort for early PE assessment in the first trimester was collected. In this cohort, cfDNA from 75 pregnancies that went on to develop early-onset PE and from 124 matched controls was collected and methylome sequencing was carried out. An independent validation cohort to validate early PE assessment with methylome profiling was collected as well. This validation cohort includes cfDNA samples from 61 PE and 136 control pregnancies. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  604 
 
  
    EGAD00001010161 
   
  
    
    77 samples collected from 35 multiple myeloma patients.
Each patient provided one healthy sample and one primary tumor sample and, in some cases, also samples collected after the progression of the disease. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  77 
 
  
    EGAD00001010162 
   
  
    
    tissue, organoid and normal bam files for whole exome samples 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001010163 
   
  
    
    Batch RNA sequencing of passages 5-7 of three patient-derived monolayer glioblastoma cultures in which TMZ and BMP4 synergize. 3 biological replicates, either untreated, only temozolomide, only BMP4 or temozolomide + BMP4 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001010164 
   
  
    
    High-throughput transcriptome sequencing data from paediatric (<18-y) ETV6-RUNX1 fusion positive acute lymphoblastic leukemias.
Dataset includes fastq and BAM files from diagnostic samples of 33 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD00001010165 
   
  
    
    Human small non-coding RNA sequencing of serum from sons of PCOS mothers (n=9) and sons of control mothers (n=9), see publication for details. 
    
   
  
    
   
  18 
 
  
    EGAD00001010166 
   
  
    
    Single-cell RNA sequencing on 10 antrum and 4 body gastric biopsies 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  14 
 
  
    EGAD00001010168 
   
  
    
    scRNAseq dataset of colonic organoids derived from epithelium from biopsies taken from three healthy human individuals. The organoids have either been grown in standard conditions (control) or treated with IL22 (treated). Includes 6 samples in total, one control from each individual (ctrl1, ctrl2, ctrl3) and one treated from each (treat1, treat2, treat3). 
The samples have been multiplexed using the antibody hashing technique. The 6 samples have been pooled into the one organoids sample. In order to analyse the raw files, they have to be demultiplexed first. Information necessary for demultiplexing, as well as which files belong to which sample, can be found in the map_file.csv, attached to each sample.   
Dataset includes raw Fastq files and processed csv count matrices. Fastq files are divided into HTO (hashtag) and RNA (transcriptome) files. HTO has one index (I1) and two read (R1, R2) files and RNA has two index (I1, I2) and two read (R1, R2) files. The fastq files are for the pooled (organoids) sample and need to be demultiplexed. Count matrices contain comma-separated values with cell barcodes as column names and gene names as row names. Since count matrices have been created after the demultiplexing step, there’s one matrix for each of the 6 individual samples. 
scRNA-seq data from human colon organoids was analysed in the same manner as for the Colitis dataset, apart from the following changes. Data was generated with the Cell Hashing technique, which uses oligo-tagged antibodies against surface proteins to barcode single cells. This allows for samples to be multiplexed together and run in a single experiment. The data was demultiplexed using the HTODemux() function from Seurat (Hao et al., 2021). 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD00001010169 
   
  
    
    This dataset includes WES and RNAseq from 11 patients with metastatic melanoma, lung, kidney, and stomach cancers, enrolled in phase I clinical trials of TIL ACT (NCT03475134 & NCT04643574). WES was performed on matched cancer and healthy samples, whereas RNAseq was performed on cancer tissues, using Illumina HiSeq 2500/4000 and Illumina NextSeq 550 systems. Paired-end reads are provided in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  36 
 
  
    EGAD00001010170 
   
  
    
    This dataset contains the open chromatin profiles of 8 patient H3-K27M mutant DMGs utilizing the single-cell/nucleus assay for transposase-accessible chromatin using sequencing (snATAC-seq) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001010171 
   
  
    
    Paired RNA-Seq was performed on 125 samples of low grade pediatric glioma. The sequencing was done with the Illumina NovaSeq 6000 and the Illumina TruSeq stranded mRNA kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  124 
 
  
    EGAD00001010172 
   
  
    
    Paired RNA seq of wild type VDH15 cells (3 replicates) - a cell line of oral squamous cell carcinoma (OSCC). RNA was extracted and sequencing libraries were prepared using the TruSeq Stranded mRNA Library Prep Kit following the manufacturer's protocol. The sequencing was done using the Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001010173 
   
  
    
    Fastq files of WES data. Primary tumor and/or metastatic samples of three chRCC patients. 
DNA libraries were produced using the SureSelect XT HumanAllExon V5 kit (Agilent Technologies), and sequencing was performed on a HiSeq instrument (Illumina) in 100-bp paired-end mode. 
    
   
  
    
   
  6 
 
  
    EGAD00001010174 
   
  
    
    Nanostring DSP spatial profiles of 8 patients whose antral sections contained histologically normal, IM, GC, lymphoid aggregates, and stromal regions, representing 480 regions of interest (ROIs) and 76 CD45-segmented areas of illumination (AOIs). 
    
   
  
    
      
      unspecified 
      
    
   
  556 
 
  
    EGAD00001010175 
   
  
    
    Organoid and tissue BAM files from an RNA-seq experiment 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001010176 
   
  
    
    These data correspond to samples used in the following papers: 
1. GWAS and meta-analysis identifies 49 genetic variants underlying critical Covid-19
2. Genetic determinants of monocyte splicing are enriched for disease susceptibility loci including for COVID-19.
3. Identification of Genetic Determinants of Transcriptional Response to Metformin in Primary Human Monocytes.
They are the raw RNA sequencing files as used in all downstream analysis. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  611 
 
  
    EGAD00001010178 
   
  
    
    iTHER is a prospective national precision oncology program aiming to define tumor molecular profiles in children and adolescents with primary very high-risk, relapsed, or refractory pediatric tumors in order to identify relevant aberrations to inform treatment. 
    
   
  
    
   
  1 
 
  
    EGAD00001010179 
   
  
    
    iTHER is a prospective national precision oncology program aiming to define tumor molecular profiles in children and adolescents with primary very high-risk, relapsed, or refractory pediatric tumors in order to identify relevant aberrations to inform treatment. 
    
   
  
    
   
  1 
 
  
    EGAD00001010180 
   
  
    
    We collected longitudinal samples from 15 patients with MCL at various clinical time points before and after infusion of the CAR-T therapy brexucabtagene autoleucel (BA). The patients were grouped into three categories based on their clinical responses after BA treatment: 1) responsive (n = 9), 2) relapsed (n = 5), and 3) refractory (n = 1). All patients in categories 1 and 2 had initially attained a complete response (CR) after BA therapy. The responsive group maintained CR with no relapse at the time of last follow up, while the relapsed group achieved initial CR but eventually relapsed. The 10x Chromium™ Single-Cell 5′ Reagent Kit v2 (PN-1000190, 10x GENOMICS) and Chromium Single-Cell Human TCR amplification Kit (PN-1000252, 10x GENOMICS) were used to perform single-cell separation, cDNA amplification, and library construction for gene expression and TCR repertoire following the manufacturer’s guidelines. Thirty-nine samples passed quality control and underwent single-cell transcriptome profiling with simultaneous single-cell T-cell receptor (TCR) repertoire analysis. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  654 
 
  
    EGAD00001010181 
   
  
    
    27 ChIP-seq samples from human CD4+ T cells 
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD00001010182 
   
  
    
    20 RNAseq samples of Human CD4+ cells 
    
   
  
    
      
      NextSeq 500 
      
    
   
  20 
 
  
    EGAD00001010183 
   
  
    
    Throughout an individual’s lifetime, genomic alterations accumulate in somatic cells. However, the mutational landscape induced by retrotransposition of long interspersed nuclear element-1 (L1), a widespread mobile element in the human genome, is poorly understood in normal cells. Here, we explored the whole-genome sequences of 406 normal colorectal clones, 12 MUTYH-associated adenomatous clones, and 19 matched colorectal cancer tissues. In addition, we analyzed promoter DNA methylation status of retrotransposition-competent L1 (in 139 clones) and read-through RNA expression profiles (in 116 clones) to investigate the epigenetic regulation of L1 activity. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  654 
 
  
    EGAD00001010184 
   
  
    
    This dataset includes Illumina EPIC Capture Sequencing Data of 376 samples from 188 men with prostate cancer. Samples were taken from primary tissue obtained at prostatectomy, with matched pathologically assessed non-cancer control material. This DNA methylation data includes donors and samples included in previously published WGS datasets (from CRUK-ICGC batches 1 to 3 [EGAD00001001116] and batches 4 to 6 [EGAD00001003225]; including the majority of donors used in Wedge et al, Nature Genetics 2018 [PMID: 29662167]).  The targeted DNA methylation sequencing data in this EGA dataset were generated using the Illumina TruSeq methyl capture method (EPICseq), covering over 3.3 million CpGs in the human genome, representing a total targeted hybridisation capture panel of 107Mbp.  According to the EPICseq protocol, DNA samples extracted from prostatectomy tissue samples were enriched for target regions using hybridisation capture, prior to bisulfite conversion, amplification and sequencing in pools of 12 samples (150 single end reads over two Illumina HiSeq4000 lanes).  This approach generated DNA methylation profiles from prostate cancer and control samples at base-pair resolution across millions of CpGs in the human genome. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  376 
 
  
    EGAD00001010185 
   
  
    
    The dataset contains the gene expression profile of each individual along with gene fusion events. 
    
   
  
    
   
  446 
 
  
    EGAD00001010186 
   
  
    
    This dataset contains 483 individuals of Irish origin with COVID-19. Paired-end whole-genome sequencing (WGS) was performed, and the data were later processed using Sentieon v201808.03. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  483 
 
  
    EGAD00001010187 
   
  
    
    Single-cell whole transcriptome and antibody expression for bone marrow samples from Cohorts A and B. The CITEseq protocol was followed. 37 and 77 surface markers were measured in each cohort, respectively (see Supplementary Table 1). For details on cell sorting prior to scRNAseq, see the methods section of the manuscript. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  49 
 
  
    EGAD00001010188 
   
  
    
    Bulk ATAC libraries were generated for patient samples A.1, A.3, A.5, A.6, A.7, A.11, A.12, A.13 and A.15. For each patient, one library of CD3- cells (myeloid cells, considered tumor cells) and one of CD3+ cells (T cells, considered healthy) were generated. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001010189 
   
  
    
    MutaSeq was applied to CD34+ cells from patient samples A.10, A.11 and A.12 of the study. The protocol followed is described in https://doi.org/10.1038/s41467-021-21650-1 and in the methods of the manuscript 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001010190 
   
  
    
    Libraries were constructed using the SureSelect HS XT Target Enrichment System v6 (Agilent). For each patient, one library of CD3- cells (myeloid cells, considered tumor cells) and one of CD3+ cells (T cells, considered healthy) were generated. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD00001010191 
   
  
    
    Optimized 10x library to increase the coverage of the mitochondrial genome from 3’ 10x gene expression data. See details of the experimental method in the methods section of the manuscript 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD00001010192 
   
  
    
    Optimized 10x library to increase the coverage of selected nuclear variants from 3’ 10x Genomics scRNAseq data. SNVs were selected based on exome data and criteria described in the Supplementary Information of the manuscript. See details of the experimental method in the methods section of the manuscript 
    
   
  
    
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001010193 
   
  
    
    Targeted DNA sequencing was applied to colonies grown from single-cells of patient A.6. The protocol followed is described in https://doi.org/10.1038/s41467-021-21650-1 and the methods of the manuscript 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010194 
   
  
    
    In mammals, X-chromosomal genes are expressed from a single copy since males (XY) possess a single X chromosome, while females (XX) undergo X inactivation. To compensate for this reduction in dosage compared to two active copies of autosomes, it has been proposed that genes from the active X chromosome exhibit dosage compensation. However, the existence and mechanism of X-to-autosome dosage compensation are still under debate. Here, we show that X-chromosomal transcripts are reduced in m6A modifications and more stable compared to their autosomal counterparts. Acute depletion of m6A selectively stabilises autosomal transcripts, resulting in perturbed dosage compensation in mouse embryonic stem cells. We propose that higher stability of X-chromosomal transcripts is directed by lower levels of m6A, indicating that mammalian dosage compensation is partly regulated by epitranscriptomic RNA modifications. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001010195 
   
  
    
    94 human adipocyte samples isolated from whole adipose tissues using collagenase digestion of tissue and flotation of lipid-laden adipocytes, followed by RNA isolation and RNA sequencing (SMARTer Stranded Total RNA-Seq library preparation, HiSeq 4000 100-bp paired-end reads). Adipocyte samples comprise subcutaneous and visceral adipocytes isolated from obese and lean people (N=24 obese-subcutaneous, N=24 obese-visceral, N=22 control-subcutaneous, N=24 control-visceral). Human adipocyte RNA sequencing data are provided as BAM files. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  93 
 
  
    EGAD00001010196 
   
  
    
    This dataset contains 46 fastq files of paired-end RNA sequencing of an Illumina® TruSeq stranded mRNA library of 23 glioblastoma PDX samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD00001010197 
   
  
    
    exome data from leiomyosarcoma, large cell neuroendocrine carcinoma, and clear cell sarcoma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD00001010198 
   
  
    
    Transcriptome sequencing from the Rare Cancer Research Foundation: leiomyosarcoma, large cell neuroendocrine carcinoma, clear cell carcinoma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001010199 
   
  
    
    Rare Cancer Research Foundation: leiomyosarcoma, large cell neuroendocrine carcinoma, clear cell carcinoma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001010200 
   
  
    
    We performed whole genome sequencing on 84 LFS family members from 47 families: 22 with wildtype TP53 and 62 with variant TP53. The variant TP53 cohort consists of 49 individuals who developed cancer and 13 individuals who remain cancer-free; 34 were from 13 families with 2-4 individuals sequenced within a given family and the remaining 28 had no family members sequenced. The wildtype cohort consists of 14 individuals who developed cancer and 8 individuals who are cancer-free, from 6 families. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  80 
 
  
    EGAD00001010201 
   
  
    
    294 formalin-fixed paraffin-embedded (FFPE) tissue samples were sent to the UNC Lineberger Comprehensive Cancer Center Translational Genomics Lab (TGL) for RNA isolation using the Maxwell 16 MDx Instrument (Promega AS3000) and the Maxwell 16 LEV RNA FFPE Kit (Promega AS1260) following the manufacturer’s protocol (Promega 9FB167). 279 total RNA sequencing libraries were prepared at TGL using a Bravo Automated Liquid-Handling Platform (Agilent G5562A) and the TruSeq Stranded Total RNA Library Prep Gold Kit (Illumina 20020599) following the manufacturer’s protocol (Illumina 1000000040499). RNAseq library quality and quantity were measured using a TapeStation 4200 (Agilent G2991AA) and Qubit 3.0 fluorometer (Life Technologies Q33216), pooled at equal molar ratios, and denatured following the manufacturer’s protocol (Illumina 1000000106351). 271 total RNA sequencing libraries were sequenced at TGL on NovaSeq 6000 (Illumina 20012850) S4 flow cells (Illumina 20028313) following the manufacturer’s protocol (Illumina 1000000019358) using a 2x50 bp paired-end configuration and pool sizes of 91 libraries to target a read depth of 110 million clusters per library on average. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  271 
 
  
    EGAD00001010202 
   
  
    
    Whole genome sequencing data of 31 high-grade serous carcinoma (HGSC) patients (101 samples) sequenced with HiSeq X Ten and BGISEQ-500 
    
   
  
    
      
      HiSeq X Ten 
      
      unspecified 
      
    
   
  90 
 
  
    EGAD00001010203 
   
  
    
    Spatial characterization by TCR sequencing in tumor core and matched stroma in seven cases of triple negative breast cancer. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  14 
 
  
    EGAD00001010204 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_003 for Follicular lymphoma patient sample TFRI_Cont_2 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010205 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0168_003 for Follicular lymphoma patient sample TFRI_Cont_3 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010206 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0169_003 for Follicular lymphoma patient sample TFRI_Cont_4 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010207 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0170_003 for Follicular lymphoma patient sample TFRI_Cont_5 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010208 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0171_003A for Follicular lymphoma patient sample TFRI_Cont_6 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010209 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0178_003A for Follicular lymphoma patient sample TFRI_Cont_8 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010210 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0183_003A for Follicular lymphoma patient sample TFRI_Cont_9 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010211 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0187_003A for Follicular lymphoma patient sample TFRI_Cont_10 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010212 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0188_003A for Follicular lymphoma patient sample TFRI_Cont_11 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010213 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0197_003A for Follicular lymphoma patient sample TFRI_Cont_12 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010214 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_003A for Follicular lymphoma patient sample TFRI_Cont_13 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010215 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_001 for Follicular lymphoma patient sample TFRIPAIR2_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010216 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0168_001 for Follicular lymphoma patient sample TFRIPAIR3_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010217 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0169_001 for Follicular lymphoma patient sample TFRIPAIR4_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010218 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0170_001A for Follicular lymphoma patient sample TFRIPAIR5_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010219 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0171_001A for Follicular lymphoma patient sample TFRI_Pair_6_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010220 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0178_001A for Follicular lymphoma patient sample TFRIPAIR8_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010221 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0183_001A for Follicular lymphoma patient sample TFRIPAIR9_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010222 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0187_001A for Follicular lymphoma patient sample TFRIPAIR10_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010223 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0188_001A for Follicular lymphoma patient sample TFRIPAIR11_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010224 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0197_001A for Follicular lymphoma patient sample TFRIPAIR12_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010225 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_001A for Follicular lymphoma patient sample TFRI_Pair_13_FL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010226 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_002 for Diffuse large B-cell lymphoma patient sample TFRIPAIR2_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010227 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0168_002 for Diffuse large B-cell lymphoma patient sample TFRIPAIR3_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010228 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0169_002 for Diffuse large B-cell lymphoma patient sample TFRIPAIR4_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010229 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0170_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR5_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010230 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0171_002A for Diffuse large B-cell lymphoma patient sample TFRI_Pair_6_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010231 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0178_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR8_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010232 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0183_002A for Diffuse large B-cell lymphoma patient sample TFRI_Pair_9_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010234 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0188_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR11_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010235 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0197_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR12_DLBCL_rel 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010236 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_002A for Diffuse large B-cell lymphoma patient sample TFRI_Pair_13_DLBCL 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010237 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_004A for Lymph node patient sample RLN_02_5Prime 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010238 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_005A for Lymph node patient sample RLN_03_5Prime 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010239 
   
  
    
    DLP+ Single Cell Genomic Library 98211 for Diffuse large B-cell lymphoma patient sample TFRIPAIR11_DLBCL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010240 
   
  
    
    DLP+ Single Cell Genomic Library A98180 for Diffuse large B-cell lymphoma patient sample TFRI_Pair_13_DLBCL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010241 
   
  
    
    DLP+ Single Cell Genomic Library A98193 for Follicular lymphoma patient sample TFRI_Pair_6_FL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010242 
   
  
    
    DLP+ Single Cell Genomic Library A98203 for Diffuse large B-cell lymphoma patient sample TFRIPAIR8_DLBCL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010243 
   
  
    
    DLP+ Single Cell Genomic Library A98205 for Follicular lymphoma patient sample TFRIPAIR9_FL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010244 
   
  
    
    DLP+ Single Cell Genomic Library A98208 for Diffuse large B-cell lymphoma patient sample TFRIPAIR10_DLBCL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010245 
   
  
    
    DLP+ Single Cell Genomic Library A98221A for Follicular lymphoma patient sample TFRIPAIR5_FL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010246 
   
  
    
    DLP+ Single Cell Genomic Library A98225 for Diffuse large B-cell lymphoma patient sample TFRIPAIR12_DLBCL_rel 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010247 
   
  
    
    DLP+ Single Cell Genomic Library A98288 for Follicular lymphoma patient sample TFRIPAIR2_FL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010248 
   
  
    
    DLP+ Single Cell Genomic Library A98297 for Follicular lymphoma patient sample TFRIPAIR3_FL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010249 
   
  
    
    DLP+ Single Cell Genomic Library A98167 for Diffuse large B-cell lymphoma patient sample TFRIPAIR4_DLBCL 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010250 
   
  
    
    Genomic data used in "ACT-Discover: identifying karyotype heterogeneity in pancreatic cancer evolution using ctDNA" 
    
   
  
    
      
      unspecified 
      
    
   
  111 
 
  
    EGAD00001010251 
   
  
    
    Single-cell RNA seq data of epithelial cells (EpCAM-enriched by FACS) isolated from cryopreserved human lung tissue (3 samples, 3 donors) 
    
   
  
    
      
      NextSeq 550 
      
    
   
  3 
 
  
    EGAD00001010252 
   
  
    
    Bulk RNA seq data from human lung fibroblasts isolated from fresh and cryopreserved lung tissue (18 samples, 3 donors) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD00001010253 
   
  
    
    This dataset contains 210 fastq files (RNA sequencing was performed in two centers) from 105 individuals (106 files in subcutaneous tissue and 104 files in visceral tissue). Of the 210 fastq files, 129 files are in PEx100 mode (appeared in a single fastq file) and 81 files are in PEx49bp mode (appeared in two separate fastq files). Sequencing was done on the Illumina HiSeq2000 platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  204 
 
  
    EGAD00001010254 
   
  
    
    This dataset contains 28 fastq files (11 files for subcutaneous tissue and 17 files for visceral tissue) from nine individuals. All samples were initially sequenced by SEx50 mode (16 files) and some of them were also sequenced by PEx100 mode (12 files). Sequencing was done on the Illumina HiSeq2000 platform. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  18 
 
  
    EGAD00001010255 
   
  
    
    This dataset contains a VCF file for 99 GM individuals genotyped on the Illumina HumanOmni2.5 array. The VCF file was generated after imputation (IMPUTE2) and filtering for minor allele frequency MAF≥0.05, imputation confidence score INFO of >0.4 and Hardy-Weinberg Equilibrium (HWE) p>1e-06, yielding ~6.3 million variants. 
    
   
  
    
   
  1 
 
  
    EGAD00001010257 
   
  
    
    Sequencing data of 77 sarcoma tumor and control runs, which were uploaded to EGAS00001004813. The sequencing was always paired. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001010258 
   
  
    
    Sequencing data of 144 sarcoma tumor and control runs, which were uploaded to EGAS00001004813. The sequencing was always paired. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001010259 
   
  
    
    Dataset contains total RNA sequencing data from plasma of 65 different human donors: 30 diffuse large B-cell lymphoma (DLBCL) and 13 primary mediastinal large B-cell lymphoma (PMBCL) patients, and 22 cancer-free controls. Samples were sequenced on a NovaSeq 6000 and are provided in FASTQ format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD00001010264 
   
  
    
    Additional WGS files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  3 
 
  
    EGAD00001010265 
   
  
    
    Additional WXS files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001010268 
   
  
    
    16S bacterial amplicon sequencing data for Guangzhou cohort 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001010269 
   
  
    
    This is the raw TCRseq data for the manuscript "T cell receptor repertoire sequencing reveals chemotherapy-driven clonal expansion in colorectal liver metastases". 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001010270 
   
  
    
    Additional WGS files for Genomic Landscape ALL paper titled "The genomic landscape of pediatric acute lymphoblastic leukemia" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  2 
 
  
    EGAD00001010272 
   
  
    
    Contains WES and RNA-seq for the 59 patients with first- and second-generation EGFR TKI-resistant metastatic EGFR-mutated NSCLC. 
    
   
  
    
   
  179 
 
  
    EGAD00001010273 
   
  
    
    This dataset contains raw FASTQ files from the RNA-Seq of 96 T-cell acute lymphoblastic leukemia (T-ALL) samples. Libraries were prepared using the Agilent SureSelect XT-HS2 RNA Reagent Kit. As such, reads contain molecular barcodes that can be handled specifically using the AGeNT tool. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  96 
 
  
    EGAD00001010274 
   
  
    
    There is a need for quantitative measurements of evolutionary metrics in controlled clinical trials with long term follow-up information. This is particularly true in advanced localised prostate cancer, which can recur more than a decade after diagnosis. Here we mapped genomic intra-tumour heterogeneity in 642 tumour samples from 114 patients who took part in the IMRT and DELINEATE clinical trials, for which full clinical information and 12y median follow-up was available. We concomitantly assessed phenotypic (morphological) heterogeneity using Deep Learning in 1,923 histological sections from 250 IMRT patients (fully overlapping with the genetic set). This study shows that combining genomics with AI-aided histopathology in clinical trials leads to novel clinical biomarkers.
This EGA repository contains data produced from tumour samples using low coverage whole genome sequencing and a prostate cancer specific gene panel data following compression of unique molecular identifiers. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1272 
 
  
    EGAD00001010276 
   
  
    
    Sequencing data of 410 sarcoma tumor runs, which were uploaded to EGAS00001004813. The sequencing was always paired. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001010277 
   
  
    
    Paired WGS samples (tumor and control) of one sarcoma case. The paired sequencing was done on HiSeq X Ten with Illumina TruSeq Nano DNA. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2 
 
  
    EGAD00001010278 
   
  
    
    Paired exome sequencing of sarcoma tumor and control of 10 samples (5 tumor/control pairs). The sequencing was done on Illumina HiSeq 4000 with the Agilent SureSelect V5+UTRs kit. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001010279 
   
  
    
    120 cachectic and non-cachectic samples. 240 FASTQ files from Illumina metagenomic shotgun paired-end sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001010280 
   
  
    
    This dataset contains the raw RNA-seq data (FASTQ files) of the samples used in the study titled "Evaluation of triple negative breast cancer with heterogeneous immune infiltration". The dataset is composed of 3 patients with 6 samples per patient (3 with high TILs and 3 with low TILs). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  18 
 
  
    EGAD00001010281 
   
  
    
    Ither NB in Organoids WXS dataset - We aimed to launch an online repository integrating genomics and transcriptomics with high-throughput drug screening (HTS) of nineteen commonly used neuroblastoma cell lines and fourteen generated neuroblastoma patient-derived organoids (NBL-PDOs) to improve identification of molecularly matched therapies and support clinical uptake. 
    
   
  
    
   
  17 
 
  
    EGAD00001010282 
   
  
    
    Ither NB in Organoids WGS dataset - We aimed to launch an online repository integrating genomics and transcriptomics with high-throughput drug screening (HTS) of nineteen commonly used neuroblastoma cell lines and fourteen generated neuroblastoma patient-derived organoids (NBL-PDOs) to improve identification of molecularly matched therapies and support clinical uptake. 
    
   
  
    
   
  17 
 
  
    EGAD00001010283 
   
  
    
    Ither NB in Organoids RNA-Seq dataset - We aimed to launch an online repository integrating genomics and transcriptomics with high-throughput drug screening (HTS) of nineteen commonly used neuroblastoma cell lines and fourteen generated neuroblastoma patient-derived organoids (NBL-PDOs) to improve identification of molecularly matched therapies and support clinical uptake. 
    
   
  
    
   
  9 
 
  
    EGAD00001010284 
   
  
    
    cfChIP-seq using an H3K4me3-specific antibody of 9 plasma samples collected after a marathon run. cfChIP-seq was performed as described in Sadeh et al. 2021. The dataset contains paired-end FASTQ files and BAM files of raw sequencing data. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001010287 
   
  
    
    This dataset contains 9 bulk RNA-seq samples of neuroblastoma patient tumors used for comparison with derived PDXs and/or single-cell data in Thirant C et al., Nature Communications, 2023. They were initially produced for Berlanga P., Cancer Discovery, 2022. 
    
   
  
    
   
  9 
 
  
    EGAD00001010288 
   
  
    
    scWGS-seq of flow-sorted blast and normal cells from SJBALL030072, with 66 high-quality cells sequenced (61 blast and 5 normal). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  66 
 
  
    EGAD00001010289 
   
  
    
    Dataset contains 50 paired-end Whole Exome sequencing samples from 3 patients. 3 normal blood samples are also included. 
    
   
  
    
   
  50 
 
  
    EGAD00001010290 
   
  
    
    Dataset contains 46 paired-end RNA-seq samples from 3 patients. 
    
   
  
    
   
  46 
 
  
    EGAD00001010292 
   
  
    
    This dataset contains WGS data in the form of BAM files for NPC268 - "Tumor" derived from snap-frozen tissue while "Cell line" derived from late passage NPC268 cell line. Extracted DNA was sent for 100x and 60x WGS with Novogene via Apical Scientific Sdn Bhd. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001010293 
   
  
    
    Paired-end shallow whole genome sequencing (sWGS) data (FASTQ) for the identification of genome-wide somatic copy number alterations (SCNA) and the estimation of tumor fractions. 
    
   
  
    
   
  30 
 
  
    EGAD00001010294 
   
  
    
    Aligned NGS data (BAM) of 77 frequently mutated genes in cancer using the AVENIO Expanded platform. 
    
   
  
    
   
  30 
 
  
    EGAD00001010295 
   
  
    
    There is a need for quantitative measurements of evolutionary metrics in controlled clinical trials with long term follow-up information. This is particularly true in advanced localised prostate cancer, which can recur more than a decade after diagnosis. Here we mapped genomic intra-tumour heterogeneity in 642 tumour samples from 114 patients who took part in the IMRT and DELINEATE clinical trials, for which full clinical information and 12y median follow-up was available. We concomitantly assessed phenotypic (morphological) heterogeneity using Deep Learning in 1,923 histological sections from 250 IMRT patients (fully overlapping with the genetic set). This study shows that combining genomics with AI-aided histopathology in clinical trials leads to novel clinical biomarkers.
This EGA repository contains data produced from tumour samples using low coverage whole genome sequencing and a prostate cancer specific gene panel data following compression of unique molecular identifiers. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  100 
 
  
    EGAD00001010297 
   
  
    
    sWGS dataset of 18 matched PDO and ascites samples, and scDNA sequencing of three of these PDOs. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  39 
 
  
    EGAD00001010298 
   
  
    
    One of the most dangerous forms of DNA damage are interstrand crosslinks (ICLs), which covalently crosslink the two strands of the DNA double helix. The repair of these lesions is crucial for cellular survival due to their ability to block transcription and DNA replication. Initially, the major pathway that has been described in ICL repair involves a network of 22 genes that are mutated in a severe human genetic disease known as Fanconi Anemia (FA). 
Using synthetic lethality screens in the near-haploid human HAP1 cell line, we recently identified two potentially novel regulators of ICL repair, C1orf112 and THAP12. Loss of C1orf112 and THAP12 causes hypersensitivity to ICL-inducing DNA damaging agents, such as Mitomycin C (MMC). Additionally, C1orf112-depleted cells show elevated levels of micronuclei and accumulation of DNA damage in S-phase. To better understand how C1orf112 and THAP12 mediate the repair of ICLs, we want to perform mutational signature analysis, using the BotSeq method. Therefore, WT, C1orf112 and THAP12 knockout cells were cultured in vehicle or MMC treated conditions for 10 days and the genomic DNA was isolated. FANCA and FANCD2 knockout cells are taken along as controls in this experimental setting. 
This dataset contains all the data available for this study on 2023-04-20. 
    
   
  
    
   
  18 
 
  
    EGAD00001010300 
   
  
    
    The circulating tumor DNA (ctDNA) mutation-based approach shows limited performance in minimal residual disease (MRD) detection, especially for landmark MRD detection in early-stage cancer after surgery. Here, a total of 87 NSCLC patients who received curative surgical resections (23 patients relapsed during follow-up) were enrolled in this study. A total of 163 plasma samples, collected at 7 days and 6 months after surgery, were used for high-throughput sequencing. 
    
   
  
    
   
  1 
 
  
    EGAD00001010301 
   
  
    
    Smart-seq2 single-cell RNA-seq of human liver non-parenchymal cells from lean and obese individuals 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  1351 
 
  
    EGAD00001010302 
   
  
    
    Bulk tumour, germline and DigiPico WGS BAM files from patients 11611, 11615, 11619 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      unspecified 
      
    
   
  11 
 
  
    EGAD00001010304 
   
  
    
    RNAseq and ATACseq data for the FMF patients and healthy control.
The RNAseq data was sequenced on a BGI MGI G400 machine, with PE100 reads.
ATAC-seq libraries were prepared with Illumina Nextera primers and sequenced on the NovaSeq 6000 platform with 50 bp paired-end sequencing, where each sample was sequenced to approximately 60 million reads. 
    
   
  
    
   
  58 
 
  
    EGAD00001010305 
   
  
    
    BAM and indexed BAM files after removal of duplicates and trimming of the unique molecular identifiers. Sequencing was performed on the NovaSeq 6000 platform. 
    
   
  
    
   
  1 
 
  
    EGAD00001010306 
   
  
    
    This dataset contains bulk transcriptomes from the operable cohorts of the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19), used to verify the deconvolution method used for the scRNA-seq dataset. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  24 
 
  
    EGAD00001010307 
   
  
    
    This dataset is derived from whole-transcriptome sequencing (RNA-seq) of RNA from 57 BCR-ABL1 lymphoblastic leukemias (53 diagnostic, 4 relapse). 
    
   
  
    
   
  57 
 
  
    EGAD00001010308 
   
  
    
    Bulk RNA-seq data of 8 RNA samples were generated. Libraries were prepared using the Stranded mRNA Library Prep, Ligation Kit (Illumina) following manufacturer’s recommendations and sequenced on a NextSeq 2000 (2x50 bp, Illumina). 
    
   
  
    
      
      unspecified 
      
    
   
  7 
 
  
    EGAD00001010309 
   
  
    
    Purified DNA from PDX samples was subjected to WGS. Libraries were prepared using the TruSeq DNA PCR-Free kit (Illumina) starting with 1 μg of input DNA, following the manufacturer's instructions. Libraries were sequenced on a NovaSeq 6000 (2x151 bp) instrument (Illumina). A mean coverage of 30.4x was obtained. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001010310 
   
  
    
    Revision Experiments UMI 
    
   
  
    
   
  12 
 
  
    EGAD00001010311 
   
  
    
    Dataset contains 70 paired-end ATAC-seq samples from 8 patients. 
    
   
  
    
   
  70 
 
  
    EGAD00001010312 
   
  
    
    Dataset contains 21 paired-end Hi-C samples from 9 patients. 
    
   
  
    
   
  21 
 
  
    EGAD00001010313 
   
  
    
    Dataset contains 11 paired-end snATAC-seq samples from 4 patients. 
    
   
  
    
   
  11 
 
  
    EGAD00001010314 
   
  
    
    Here, seven patients with BPDCN were characterized using RNA-seq and WXS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD00001010320 
   
  
    
    We performed single cell RNA sequencing (scRNA-seq) for 208,506 cells derived from 58 lung adenocarcinomas from 44 patients, which covers primary tumour, lymph node and brain metastases, and pleural effusion in addition to normal lung tissues and lymph nodes. The extensive single cell profiles depicted a complex cellular atlas and dynamics during lung adenocarcinoma progression which includes cancer, stromal, and immune cells in the surrounding tumor microenvironments. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001010321 
   
  
    
    Paired diagnostic and relapse medulloblastoma sequencing. Targeted panel sequencing (n=8 relapse, n=7 matched diagnostic). Whole exome sequencing (n=23 relapse, n=18 matched diagnostic). Capture using Agilent SureSelect. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  56 
 
  
    EGAD00001010322 
   
  
    
    180502:
RNA-Sequencing data of cocultured matched CRC patient (P4) derived normal fibroblasts (NFs), cancer associated fibroblasts (CAFs) and tumor spheroids.
200503_coculture:
RNA-Sequencing data of cocultured CRC patient derived normal fibroblasts (NFs) or cancer associated fibroblasts (CAFs) (P16, P19, P22, P32, P41, P42) and tumor spheroids (HT29).
200503_il1b:
RNA-Sequencing data of IL-1β stimulated fibroblasts (NFs and CAFs)
Cole:
scRNA-sequencing of matched CRC tumour samples and normal tissue counterparts derived from 3 patients.
220501:
RNA-Sequencing of FACS sorted IL1R1 high and IL1R1 low CT5.3 CAFs 
    
   
  
    
      
      NextSeq 500 
      
      unspecified 
      
    
   
  96 
 
  
    EGAD00001010323 
   
  
    
    This dataset is derived from whole-genome sequencing (WGS) of DNA from 57 BCR-ABL1 lymphoblastic leukemias (53 diagnostic, 4 relapse) and 53 germline samples. 
    
   
  
    
   
  110 
 
  
    EGAD00001010324 
   
  
    
    We have generated and analyzed genomic data from a cohort of metastatic urothelial carcinoma patients treated with ICI such as anti-PD-(L)1 monoclonal antibodies. The dataset contains whole exome sequencing data of 27 whole blood samples and 27 FFPE tumor samples. Further, it includes RNA sequencing data from 21 tumor samples. Following the RECIST criteria, 10 patients were classified as non-responders to the treatment, and 17 were responders. The dataset also contains a merged VCF file containing somatic mutations called by Strelka2 and Mutect2 following the GATK best practices pipeline. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  74 
 
  
    EGAD00001010326 
   
  
    
    A set of tumor/normal paired sequencing experiments, performed in short read WGS and 10X linked read whole genomes.
BAM files aligned to hg19 are provided.
Sample Alias number and Subject ID reflect the patient of origin. The T/N distinction discriminates between tumor and normal (peripheral blood) tissue. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  98 
 
  
    EGAD00001010327 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-10 from patient PBC0002108, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010328 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-11 from patient PBC0002429, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010329 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-12 from patient PBC0002459, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010330 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-13 from patient PBC0002595, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010331 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-14 from patient PBC0001051, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010332 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-16 from patient PBC0001627, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010333 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-18 from patient PBC0002062, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010334 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-20 from patient PBC0002108, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010335 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-21 from patient PBC0002429, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010336 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-22 from patient PBC0002459, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010337 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-23 from patient PBC0002595, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010338 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-4 from patient PBC0001051, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010339 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-6 from patient PBC0001627, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010340 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run002-8 from patient PBC0002062, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010341 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-10 from patient PBC0002459, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010342 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-11 from patient PBC0002595, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010343 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-12 from patient PBC0001224, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010344 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-13 from patient PBC0002383, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010345 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-14 from patient PBC0002816, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010346 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-15 from patient PBC0002824, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010347 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-18 from patient PBC0001845, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010348 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-20 from patient PBC0002680, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010349 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-4 from patient PBC0001627, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010350 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-5 from patient PBC0002062, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010351 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-8 from patient PBC0002108, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010352 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run005-9 from patient PBC0002429, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010353 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-10 from patient PBC0002680, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010354 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-12 from patient PBC0002816, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010355 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-13 from patient PBC0002824, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010356 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-15 from patient PBC0001255, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010357 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-18 from patient PBC0001845, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010358 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-20 from patient PBC0002680, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010359 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-22 from patient PBC0002816, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010360 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-23 from patient PBC0002824, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010361 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-24 from patient PBC0001051, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010362 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-5 from patient PBC0001255, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010363 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run007-8 from patient PBC0001845, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010364 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-10 from patient PBC0001673, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010365 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-11 from patient PBC0002255, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010366 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-12 from patient PBC0002294, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010367 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-13 from patient PBC0001328, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010368 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-14 from patient PBC0001224, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010369 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-15 from patient PBC0002383, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010370 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-18 from patient PBC0001432, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010371 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-20 from patient PBC0001673, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010372 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-21 from patient PBC0002255, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010373 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-22 from patient PBC0002294, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010374 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-23 from patient PBC0001328, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010375 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-4 from patient PBC0001224, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010376 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-5 from patient PBC0002383, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010377 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run011-8 from patient PBC0001432, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010378 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-12 from patient PBC0002108, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010379 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-13 from patient PBC0002459, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010380 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-14 from patient PBC0002595, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010381 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-17 from patient PBC0002824, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010382 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-20 from patient PBC0001432, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010383 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-22 from patient PBC0001673, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010384 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-23 from patient PBC0001255, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010385 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-24 from patient PBC0002255, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010386 
   
  
    
    Targeted capture ctDNA Library CRCQV34Run015-9 from patient PBC0001627, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010387 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-10 from patient PBC0004414, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010388 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-11 from patient PBC0004565, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010389 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-13 from patient PBC0005116, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010390 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-17 from patient PBC0004076, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010391 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-18 from patient PBC0004350, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010392 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-19 from patient PBC0004414, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010393 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-20 from patient PBC0004565, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010394 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-22 from patient PBC0005116, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010395 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-6 from patient PBC0001335, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010396 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-7 from patient PBC0003364, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010397 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-8 from patient PBC0004076, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010398 
   
  
    
    Targeted capture ctDNA Library CRCQV40Run027-9 from patient PBC0004350, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010399 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run016-11 from patient PBC0001467, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010400 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run016-16 from patient PBC0001396, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010401 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run016-17 from patient PBC0001467, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010402 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run016-9 from patient PBC0001396, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010403 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-10 from patient PBC0002255, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010404 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-11 from patient PBC0001845, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010405 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-12 from patient PBC0002429, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010406 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-13 from patient PBC0002383, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010407 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-15 from patient PBC0001673, plasma 24 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010408 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-16 from patient PBC0002816, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010409 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-17 from patient PBC0002062, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010410 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-18 from patient PBC0002294, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010411 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-19 from patient PBC0001432, plasma 18 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010412 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-20 from patient PBC0001051, plasma 18 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010413 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-22 from patient PBC0001627, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010414 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-24 from patient PBC0002294, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010415 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-8 from patient PBC0001224, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010416 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run017-9 from patient PBC0001255, plasma 18 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010417 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run018-18 from patient PBC0001859, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010418 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run018-20 from patient PBC0001859, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010419 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run019-12 from patient PBC0001295, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010420 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run019-14 from patient PBC0001304, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010421 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run019-18 from patient PBC0001295, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010422 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run019-21 from patient PBC0001310, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010423 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run019-6 from patient PBC0001295, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010424 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run019-8 from patient PBC0001304, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010425 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-10 from patient PBC0001315, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010426 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-14 from patient PBC0001312, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010427 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-15 from patient PBC0001315, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010428 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-23 from patient PBC0001310, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010429 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-24 from patient PBC0001310, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010430 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-4 from patient PBC0001312, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010431 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-5 from patient PBC0001315, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010432 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run020-9 from patient PBC0001312, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010433 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-10 from patient PBC0001470, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010434 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-11 from patient PBC0001516, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010435 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-12 from patient PBC0001653, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010436 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-13 from patient PBC0001299, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010437 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-15 from patient PBC0001323, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010438 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-16 from patient PBC0001470, plasma 1 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010439 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-17 from patient PBC0001516, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010440 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-18 from patient PBC0001653, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010441 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-19 from patient PBC0001299, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010442 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-21 from patient PBC0001323, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010443 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-22 from patient PBC0001470, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010444 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-23 from patient PBC0001516, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010445 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-24 from patient PBC0001653, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010446 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-4 from patient PBC0001328, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010447 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-6 from patient PBC0001304, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010448 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-7 from patient PBC0001299, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010449 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run021-9 from patient PBC0001323, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010450 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-13 from patient PBC0002826, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010451 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-16 from patient PBC0002406, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010452 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-17 from patient PBC0002744, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010453 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-18 from patient PBC0002826, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010454 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-19 from patient PBC0002680, plasma 6 month repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010455 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-20 from patient PBC0001845, plasma 12 month repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010456 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-22 from patient PBC0002062, plasma 12 month repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010457 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-6 from patient PBC0002406, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010458 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-7 from patient PBC0002744, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010459 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run022-8 from patient PBC0002826, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010460 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run023-10 from patient PBC0001470, earliest baseline plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010461 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run023-16 from patient PBC0002824, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010462 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run023-19 from patient PBC0002383, plasma baseline repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010463 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run023-20 from patient PBC0001224, plasma baseline repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010464 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run023-24 from patient PBC0002383, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010465 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-11 from patient PBC0001306, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010466 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-13 from patient PBC0001306, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010467 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-14 from patient PBC0002853, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010468 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-16 from patient PBC0003641, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010469 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-18 from patient PBC0001306, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010470 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-20 from patient PBC0001589, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010471 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-21 from patient PBC0002853, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010472 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-23 from patient PBC0003641, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010473 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-4 from patient PBC0001306, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010474 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-6 from patient PBC0001589, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010475 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run024-9 from patient PBC0003641, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010476 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-10 from patient PBC0001335, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010477 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-11 from patient PBC0003595, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010478 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-12 from patient PBC0003014, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010479 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-13 from patient PBC0003334, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010480 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-14 from patient PBC0003385, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010481 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-17 from patient PBC0003595, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010482 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-19 from patient PBC0003364, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010483 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-5 from patient PBC0001589, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010484 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-6 from patient PBC0003014, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010485 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-7 from patient PBC0003334, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010486 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run025-8 from patient PBC0003385, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010487 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-14 from patient PBC0003587, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010488 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-15 from patient PBC0002872, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010489 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-16 from patient PBC0005064, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010490 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-17 from patient PBC0003643, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010491 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-18 from patient PBC0001299, plasma baseline repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010492 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-19 from patient PBC0002872, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010493 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-21 from patient PBC0003587, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010494 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-22 from patient PBC0003364, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010495 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-24 from patient PBC0005064, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010496 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-4 from patient PBC0002872, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010497 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-5 from patient PBC0003014, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010498 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-6 from patient PBC0003595, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010499 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run026-9 from patient PBC0003385, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010500 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-10 from patient PBC0003587, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010501 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-11 from patient PBC0003643, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010502 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-12 from patient PBC0004076, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010503 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-13 from patient PBC0004414, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010504 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-14 from patient PBC0004565, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010505 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-16 from patient PBC0005064, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010506 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-17 from patient PBC0005116, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010507 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-20 from patient PBC0002533, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010508 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-23 from patient PBC0002533, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010509 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-4 from patient PBC0001335, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010510 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run028-9 from patient PBC0003364, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010511 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-11 from patient PBC0001353, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010512 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-12 from patient PBC0004173, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010513 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-16 from patient PBC0001353, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010514 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-17 from patient PBC0004173, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010515 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-21 from patient PBC0001353, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010516 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-23 from patient PBC0004350, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010517 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-24 from patient PBC0003334, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010518 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-5 from patient PBC0002533, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010519 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run029-7 from patient PBC0004173, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010520 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-14 from patient PBC0001315, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010521 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-15 from patient PBC0003014, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010522 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-19 from patient PBC0001323, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010523 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-20 from patient PBC0001673, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010524 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-21 from patient PBC0003641, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010525 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-23 from patient PBC0002853, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010526 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run030-5 from patient PBC0001051, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010527 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-10 from patient PBC0001306, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010528 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-14 from patient PBC0001310, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010529 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-16 from patient PBC0001328, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010530 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-18 from patient PBC0001375, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010531 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-19 from patient PBC0001404, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010532 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-21 from patient PBC0001413, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010533 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-23 from patient PBC0001467, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010534 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-24 from patient PBC0001470, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010535 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-7 from patient PBC0001295, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010536 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-8 from patient PBC0001299, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010537 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run031-9 from patient PBC0001304, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010538 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-12 from patient PBC0001224, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010539 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-13 from patient PBC0001255, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010540 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-16 from patient PBC0001353, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010541 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-17 from patient PBC0001396, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010542 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-19 from patient PBC0001432, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010543 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-20 from patient PBC0001516, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010544 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-21 from patient PBC0001627, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010545 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-22 from patient PBC0001653, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010546 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-5 from patient PBC0003334, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010547 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-8 from patient PBC0001589, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010548 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run032-9 from patient PBC0001312, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010549 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-10 from patient PBC0002255, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010550 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-11 from patient PBC0002294, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010551 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-12 from patient PBC0002406, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010552 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-13 from patient PBC0002429, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010553 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-14 from patient PBC0002459, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010554 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-15 from patient PBC0002533, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010555 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-16 from patient PBC0002595, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010556 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-18 from patient PBC0002680, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010557 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-19 from patient PBC0002744, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010558 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-20 from patient PBC0002826, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010559 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-22 from patient PBC0003385, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010560 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-23 from patient PBC0003595, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010561 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-5 from patient PBC0001845, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010562 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-6 from patient PBC0001859, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010563 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-7 from patient PBC0002062, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010564 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run033-9 from patient PBC0002108, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010565 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-10 from patient PBC0004076, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010566 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-12 from patient PBC0004173, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010567 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-13 from patient PBC0004350, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010568 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-14 from patient PBC0004414, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010569 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-15 from patient PBC0004565, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010570 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-18 from patient PBC0005064, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010571 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-20 from patient PBC0005116, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010572 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-24 from patient PBC0001396, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010573 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-4 from patient PBC0002872, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010574 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-5 from patient PBC0003364, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010575 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-6 from patient PBC0003587, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010576 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run034-7 from patient PBC0003643, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010577 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-10 from patient PBC0002872, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010578 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-11 from patient PBC0001589, plasma 30 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010579 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-12 from patient PBC0004414, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010580 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-13 from patient PBC0001859, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010581 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-14 from patient PBC0001859, plasma 18 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010582 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-15 from patient PBC0002744, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010583 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-18 from patient PBC0002853, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010584 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-19 from patient PBC0002853, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010585 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-20 from patient PBC0003334, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010586 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-21 from patient PBC0003595, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010587 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-23 from patient PBC0004350, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010588 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-24 from patient PBC0004565, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010589 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-6 from patient PBC0002406, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010590 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-7 from patient PBC0001353, plasma 36 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010591 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run035-8 from patient PBC0002533, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010592 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-10 from patient PBC0002062, plasma 21 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010593 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-11 from patient PBC0002826, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010594 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-12 from patient PBC0003587, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010595 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-13 from patient PBC0001323, plasma 36 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010596 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-14 from patient PBC0001653, plasma 24 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010597 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-17 from patient PBC0001299, plasma 36 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010598 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-19 from patient PBC0004076, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010599 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-20 from patient PBC0003364, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010600 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-21 from patient PBC0004173, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010601 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-23 from patient PBC0002406, plasma 21 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010602 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-24 from patient PBC0001328, plasma 36 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010603 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-5 from patient PBC0001470, plasma 30 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010604 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-6 from patient PBC0005064, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010605 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run036-8 from patient PBC0003641, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010606 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-10 from patient PBC0001299, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010607 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-11 from patient PBC0001323, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010608 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-12 from patient PBC0001328, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010609 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-14 from patient PBC0001353, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010610 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-15 from patient PBC0001432, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010611 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-16 from patient PBC0001653, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010612 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-17 from patient PBC0001673, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010613 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-18 from patient PBC0002108, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010614 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-19 from patient PBC0002383, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010615 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-20 from patient PBC0002429, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010616 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-21 from patient PBC0003014, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010617 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-22 from patient PBC0003364, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010618 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-23 from patient PBC0003643, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010619 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-24 from patient PBC0001413, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010620 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-4 from patient PBC0001859, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010621 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-5 from patient PBC0002406, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010622 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-6 from patient PBC0002853, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010623 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-7 from patient PBC0002853, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010624 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-8 from patient PBC0003334, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010625 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run039-9 from patient PBC0001335, saliva, repeat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010626 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-12 from patient PBC0001413, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010627 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-13 from patient PBC0001413, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010628 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-17 from patient PBC0002406, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010629 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-20 from patient PBC0002744, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010630 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-21 from patient PBC0001335, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010631 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-22 from patient PBC0005064, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010632 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-23 from patient PBC0002294, plasma 21 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010633 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-24 from patient PBC0005116, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010634 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-4 from patient PBC0001375, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010635 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-5 from patient PBC0001375, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010636 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-6 from patient PBC0001404, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010637 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-7 from patient PBC0001404, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010638 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-8 from patient PBC0001404, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010639 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run041-9 from patient PBC0001404, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010640 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-10 from patient PBC0001051, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010641 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-11 from patient PBC0001310, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010642 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-12 from patient PBC0003385, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010643 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-13 from patient PBC0001295, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010644 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-15 from patient PBC0001470, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010645 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-16 from patient PBC0002255, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010646 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-17 from patient PBC0004350, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010647 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-18 from patient PBC0002995, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010648 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-19 from patient PBC0004537, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010649 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-20 from patient PBC0003244, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010650 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-21 from patient PBC0005013, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010651 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-22 from patient PBC0003448, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010652 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-23 from patient PBC0002480, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010653 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-24 from patient PBC0002329, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010654 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-4 from patient PBC0002294, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010655 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-5 from patient PBC0001516, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010656 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-6 from patient PBC0001255, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010657 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-7 from patient PBC0001589, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010658 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-8 from patient PBC0001304, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010659 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run043-9 from patient PBC0001306, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010660 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-10 from patient PBC0001488, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010661 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-11 from patient PBC0001515, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010662 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-12 from patient PBC0001550, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010663 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-13 from patient PBC0001665, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010664 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-14 from patient PBC0001667, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010665 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-15 from patient PBC0001818, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010666 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-16 from patient PBC0005586, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010667 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-17 from patient PBC0005602, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010668 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-18 from patient PBC0003376, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010669 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-19 from patient PBC0005963, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010670 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-20 from patient PBC0005209, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010671 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-21 from patient PBC0005373, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010672 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-22 from patient PBC0006040, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010673 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-23 from patient PBC0005498, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010674 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-24 from patient PBC0005531, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010675 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-4 from patient PBC0001048, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010676 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-5 from patient PBC0001099, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010677 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-6 from patient PBC0001279, plasma 18 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010678 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-7 from patient PBC0001314, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010679 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-8 from patient PBC0001376, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010680 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run044-9 from patient PBC0001445, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010681 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-10 from patient PBC0002989, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010682 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-11 from patient PBC0004156, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010683 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-12 from patient PBC0001068, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010684 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-13 from patient PBC0001727, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010685 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-14 from patient PBC0001468, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010686 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-15 from patient PBC0001714, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010687 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-16 from patient PBC0001782, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010688 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-17 from patient PBC0001982, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010689 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-18 from patient PBC0001810, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010690 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-19 from patient PBC0001311, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010691 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-20 from patient PBC0002317, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010692 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-21 from patient PBC0002651, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010693 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-22 from patient PBC0001456, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010694 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-23 from patient PBC0002481, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010695 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-24 from patient PBC0001409, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010696 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-4 from patient PBC0002458, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010697 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-5 from patient PBC0002851, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010698 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-6 from patient PBC0004124, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010699 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-7 from patient PBC0005444, plasma 18 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010700 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run045-8 from patient PBC0006360, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010701 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-10 from patient PBC0001552, plasma 12 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010702 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-11 from patient PBC0001284, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010703 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-12 from patient PBC0001135, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010704 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-13 from patient PBC0002625, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010705 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-14 from patient PBC0001065, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010706 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-15 from patient PBC0001183, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010707 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-16 from patient PBC0001042, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010708 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-17 from patient PBC0001329, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010709 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-18 from patient PBC0001067, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010710 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-19 from patient PBC0001527, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010711 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-20 from patient PBC0001176, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010712 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-21 from patient PBC0001828, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010713 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-22 from patient PBC0001486, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010714 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-23 from patient PBC0002622, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010715 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-24 from patient PBC0002494, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010716 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-4 from patient PBC0001776, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010717 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-5 from patient PBC0001242, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010718 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-6 from patient PBC0001666, plasma 3 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010719 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-7 from patient PBC0001528, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010720 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-8 from patient PBC0001134, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010721 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run046-9 from patient PBC0002769, plasma 6 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010722 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-10 from patient PBC0001051, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010723 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-11 from patient PBC0001310, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010724 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-12 from patient PBC0003385, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010725 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-13 from patient PBC0001295, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010726 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-14 from patient PBC0001470, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010727 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-15 from patient PBC0002255, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010728 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-16 from patient PBC0004350, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010729 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-17 from patient PBC0002995, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010730 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-18 from patient PBC0002458, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010731 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-19 from patient PBC0002851, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010732 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-20 from patient PBC0004124, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010733 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-21 from patient PBC0005444, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010734 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-22 from patient PBC0006360, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010735 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-24 from patient PBC0002989, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010736 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-4 from patient PBC0002294, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010737 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-5 from patient PBC0001516, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010738 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-6 from patient PBC0001255, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010739 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-7 from patient PBC0001589, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010740 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-8 from patient PBC0001304, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010741 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run047-9 from patient PBC0001306, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010742 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-10 from patient PBC0001488, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010743 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-11 from patient PBC0001515, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010744 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-12 from patient PBC0001550, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010745 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-13 from patient PBC0001665, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010746 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-14 from patient PBC0001667, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010747 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-15 from patient PBC0001818, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010748 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-16 from patient PBC0005586, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010749 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-17 from patient PBC0005602, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010750 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-18 from patient PBC0003376, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010751 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-19 from patient PBC0005963, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010752 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-20 from patient PBC0005209, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010753 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-21 from patient PBC0005373, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010754 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-22 from patient PBC0006040, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010755 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-23 from patient PBC0005498, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010756 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-24 from patient PBC0005531, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010757 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-4 from patient PBC0001048, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010758 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-5 from patient PBC0001099, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010759 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-6 from patient PBC0001279, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010760 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-7 from patient PBC0001314, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010761 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-8 from patient PBC0001376, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010762 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run048-9 from patient PBC0001445, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010763 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-10 from patient PBC0004156, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010764 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-11 from patient PBC0001068, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010765 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-12 from patient PBC0001727, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010766 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-13 from patient PBC0001468, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010767 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-14 from patient PBC0001714, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010768 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-15 from patient PBC0001782, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010769 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-16 from patient PBC0001982, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010770 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-17 from patient PBC0001810, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010771 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-18 from patient PBC0001311, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010772 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-19 from patient PBC0002317, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010773 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-20 from patient PBC0002651, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010774 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-21 from patient PBC0001456, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010775 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-22 from patient PBC0002481, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010776 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-23 from patient PBC0001409, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010777 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-4 from patient PBC0004537, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010778 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-5 from patient PBC0003244, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010779 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-6 from patient PBC0005013, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010780 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-7 from patient PBC0003448, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010781 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-8 from patient PBC0002480, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010782 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run049-9 from patient PBC0002329, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010783 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-10 from patient PBC0001552, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010784 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-11 from patient PBC0001284, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010785 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-12 from patient PBC0001135, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010786 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-13 from patient PBC0002625, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010787 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-14 from patient PBC0001065, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010788 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-15 from patient PBC0001183, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010789 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-16 from patient PBC0001042, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010790 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-17 from patient PBC0001329, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010791 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-18 from patient PBC0001067, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010792 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-19 from patient PBC0001527, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010793 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-20 from patient PBC0001176, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010794 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-21 from patient PBC0001828, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010795 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-22 from patient PBC0001486, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010796 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-23 from patient PBC0002622, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010797 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-24 from patient PBC0002494, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010798 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-4 from patient PBC0001776, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010799 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-5 from patient PBC0001242, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010800 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-6 from patient PBC0001666, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010801 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-7 from patient PBC0001528, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010802 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-8 from patient PBC0001134, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010803 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run050-9 from patient PBC0002769, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010804 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-10 from patient PBC0001665, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010805 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-11 from patient PBC0002989, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010806 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-12 from patient PBC0001516, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010807 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-13 from patient PBC0001319, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010808 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-14 from patient PBC0002989, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010809 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-15 from patient PBC0001224, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010810 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-16 from patient PBC0002533, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010811 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-17 from patient PBC0002995, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010812 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-19 from patient PBC0002294, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010813 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-20 from patient PBC0003587, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010814 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-21 from patient PBC0001665, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010815 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-22 from patient PBC0002989, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010816 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-23 from patient PBC0001516, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010817 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-24 from patient PBC0001319, buffy coat sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010818 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-4 from patient PBC0001224, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010819 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-5 from patient PBC0002533, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010820 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-6 from patient PBC0002995, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010821 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-8 from patient PBC0002294, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010822 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run051-9 from patient PBC0003587, plasma sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010823 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-10 from patient PBC0003643, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010824 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-12 from patient PBC0001375, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010825 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-13 from patient PBC0001467, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010826 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-14 from patient PBC0001375, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010827 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-16 from patient PBC0001413, TNBC FFPE sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010828 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-17 from patient PBC0001375, saliva sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010829 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-4 from patient PBC0001375, plasma baseline sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010830 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-6 from patient PBC0001224, plasma 9 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010831 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-7 from patient PBC0001299, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010832 
   
  
    
    Targeted capture ctDNA Library CRCQV42Run42-8 from patient PBC0003385, plasma 15 month sample 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010833 
   
  
    
    Paired WGA (whole-genome amplified) samples from 65 single cell-derived organoids and six fresh single cell-derived organoid samples, both of which were established from normal mammary epithelial cells, 84 FFPE LCM (laser-capture microdissection) samples of breast cancer and related clones, 79 fresh-frozen LCM samples of non-cancer lobules of breast cancer patients, and 36 matched germline controls were subjected to whole genome sequencing using the NovaSeq 6000 system (Illumina) or DNBSEQ-G400RS (MGI Tech). 
    
   
  
    
      
      unspecified 
      
    
   
  335 
 
  
    EGAD00001010834 
   
  
    
    RNAseq of pancreatic cancer organoids. 
    
   
  
    
      
      unspecified 
      
    
   
  74 
 
  
    EGAD00001010835 
   
  
    
    Whole-genome sequencing of pancreatic cancer organoids and matched germline controls. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  100 
 
  
    EGAD00001010837 
   
  
    
    Total mononuclear cells (MNC) were isolated from peripheral blood (PB) or bone marrow (BM) samples of 105 Ph-/-/- (B-cell acute lymphoblastic leukemia [B-ALL] triple negative) and 31 Ph+ B-ALL adult patients using Lymphosep (Biowest, Nuaillé, France). A total of 15 samples from healthy subjects were processed, including hematopoietic stem-progenitor cells (CD34+) from bone marrow specimens (n = 3) and bone marrow mononuclear cell samples (n = 3) from STEMCELL Technologies (Vancouver, Canada), PB MNC samples (n = 5), and cord blood samples (n = 4). CD34+ cells were enriched from cord blood samples by immunomagnetic separation (CD34 MicroBead Kit, Miltenyi Biotec, Bergisch Gladbach, Germany). Libraries were prepared using the TruSight RNA Pan-Cancer Panel Kit (Illumina, San Diego, California, USA), following the manufacturer’s protocol. Sequencing was performed using the Illumina MiSeq instrument. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD00001010838 
   
  
    
    DEFLeCT metadata:
1) A table containing the mutational profile for all human genes across the cohort of the 81 PROMOLE lung cancer samples.
2) A table containing the expression levels for all human genes across the cohort of the 81 PROMOLE lung cancer samples.
3) A table containing the clinical, histology data and follow-ups of the cohort of 81 PROMOLE lung cancer patients. 
    
   
  
    
   
  1 
 
  
    EGAD00001010839 
   
  
    
    100 bp paired-end fastq RNAseq files for 25 sarcoma samples.  RNAseq data from exon capture library prep. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD00001010840 
   
  
    
    NGS profiling of the entire miRNome conducted on the plasma of 24 samples obtained from well-differentiated, advanced, metastatic and inoperable G1, G2 and G3 GEP-NET patients. Sequencing was performed on the NextSeq platform. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  24 
 
  
    EGAD00001010841 
   
  
    
    Stranded RNA-seq libraries were prepared from 150 ng of mRNA using the TruSeq library kit (Illumina, San Diego, CA, USA). Libraries were sequenced on a NextSeq 2000 (Illumina) with 2x50 bp reads. 
    
   
  
    
      
      unspecified 
      
    
   
  12 
 
  
    EGAD00001010842 
   
  
    
    The mutational status of 121 genes recurrently altered in B-cell lymphoma was examined in 55 of 56 diagnostic and 10 of 12 relapse samples using a custom targeted NGS panel. Libraries were generated from 150 ng of DNA using molecular-barcoded library adapters (ThruPLEX Tag-seq kit; Takara) coupled with a custom hybridization capture-based method (SureSelectXT Target Enrichment System Capture strategy, Agilent Technologies). The quality of the libraries was determined using the Bioanalyzer high sensitivity DNA kit (Agilent) and quantified by PCR using the KAPA library quantification kit (KAPA Biosystems). Finally, the libraries were pooled and sequenced on the MiSeq instrument (Illumina). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  65 
 
  
    EGAD00001010843 
   
  
    
    Single-cell RNA-sequencing of 8 patients, from primary and relapse tumour (total of 16 samples). Patients were treated with nivolumab prior to relapse surgery. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001010844 
   
  
    
    Bulk RNA-sequencing of 30 relapse tumour samples. 20 patients were treated with nivolumab prior to relapse surgery, while 10 patients are control patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD00001010845 
   
  
    
    Whole genome sequencing data of 36 high-grade serous carcinoma (HGSC) patients (89 samples) sequenced with HiSeq X Ten. 
    
   
  
    
      
      unspecified 
      
    
   
  77 
 
  
    EGAD00001010846 
   
  
    
    Dataset with 150 whole-exome sequences from Algerian Amazigh (Chaoui and Mozabite) and non-Amazigh samples. 
    
   
  
    
      
      unspecified 
      
    
   
  124 
 
  
    EGAD00001010847 
   
  
    
    WES sequencing of 23 samples from PCNSL tumor and blood control samples. Sequencing was performed on a NovaSeq 6000 and NextSeq 500 using Agilent SureSelectXT HS Human All Exon V7 and V8. Sequencing was always paired. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD00001010848 
   
  
    
    The dataset contains 23 ovarian cancer and 2 healthy control urine cfDNA samples. Shallow WGS was performed on an Illumina NovaSeq S4 (PE 150 bp). Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD00001010849 
   
  
    
    Linked-read data from 25 medulloblastomas and their matching controls. Dataset consists of 25 group 4 medulloblastomas (G4) as well as 2 sonic hedgehog medulloblastomas (SHH-MB) samples and 2 group 3 medulloblastomas.
The data consists of BAM files generated by the LongRanger pipeline developed by 10x Genomics. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  50 
 
  
    EGAD00001010850 
   
  
    
    RNA-Seq data from 12 medulloblastoma samples, all group 4 medulloblastomas.
The data consists of BAM files aligned using STAR 
    
   
  
    
      
      unspecified 
      
    
   
  12 
 
  
    EGAD00001010851 
   
  
    
    Nanopore data from 3 medulloblastoma samples, of which 2 are tumor-normal pairs sequenced with the MinION and one is tumor-only data sequenced on the PromethION.
The data consists of BAM files aligned using minimap2 
    
   
  
    
      
      MinION 
      
      PromethION 
      
    
   
  5 
 
  
    EGAD00001010852 
   
  
    
    PacBio data from 5 medulloblastoma tumor-normal pairs. 
The data consists of BAM files aligned using NGMLR 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001010856 
   
  
    
    Sequence of breast cancer bone metastases PDX obtained by 2 targeted panels 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001010860 
   
  
    
    Sequence of breast cancer bone metastases PDX obtained by 2 targeted panels 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  9 
 
  
    EGAD00001010867 
   
  
    
    The Sys4MS cohort clinical data - 419 patients 
    
   
  
    
   
  419 
 
  
    EGAD00001010871 
   
  
    
    Genomic and epigenomic sequencing of 5 oesophageal adenocarcinomas with evidence of chromothripsis. Genomic sequencing includes: PacBio circular consensus sequencing, PacBio continuous long read sequencing, 10X linked read and Illumina HiSeq X Ten sequencing. Epigenomic sequencing includes: Hi-C chromosome capture, ATAC-seq, ChIP-seq (for H3K27ac, H3K4me3, H3K27me3 and CTCF) and long read RNA sequencing. All data types have the BAM files which have not undergone haplotype resolution (demarcated as unresolved) and some data types also have haplotype resolved reads (demarcated as resolved). 
    
   
  
    
   
  - 
 
  
    EGAD00001010872 
   
  
    
    Allogeneic haematopoietic cell transplantation (HCT) replaces the stem cells responsible for blood production with those harvested from a donor, and is received by 40,000 patients worldwide each year. To quantify dynamics of long-term stem cell engraftment, we sequenced whole genomes of 2,824 single-cell-derived haematopoietic colonies from blood samples of 10 donor-recipient pairs taken 9-31 years after HLA-matched sibling HCT. With younger donors, 10,000-50,000 stem cells had engrafted and were still contributing to haematopoiesis at time of sampling, but estimates were 10-fold lower with older donors. Engrafted stem cells made multilineage contributions to myeloid, B-lymphoid and T-lymphoid populations, although individual clones often showed biases towards one or other mature cell type. Recipients had lower clonal diversity than matched donors, equivalent to ~10-15 years of additional ageing, arising from up to 25-fold greater expansion of stem cell clones. An HCT-related population bottleneck alone could not explain these differences: instead, phylogenetic trees evinced two distinct modes of HCT-specific selection. In 'pruning selection', cell divisions underpinning recipient-enriched clonal expansions had occurred in the donor, preceding transplant - their selective advantage derived from preferential mobilisation, harvest, survival ex vivo or initial homing. In 'growth selection', cell divisions underpinning clonal expansion occurred through proliferative advantage in the recipient's marrow after homing - clones with multiple driver mutations especially demonstrated this pattern. Uprooting stem cells from their native environment and transplanting them to foreign soil exaggerates selective pressures, distorting and accelerating the loss of clonal diversity compared to the unperturbed haematopoiesis of donors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001010874 
   
  
    
    Allogeneic haematopoietic cell transplantation (HCT) replaces the stem cells responsible for blood production with those harvested from a donor, and is received by 40,000 patients worldwide each year. To quantify dynamics of long-term stem cell engraftment, we sequenced whole genomes of 2,824 single-cell-derived haematopoietic colonies from blood samples of 10 donor-recipient pairs taken 9-31 years after HLA-matched sibling HCT. With younger donors, 10,000-50,000 stem cells had engrafted and were still contributing to haematopoiesis at time of sampling, but estimates were 10-fold lower with older donors. Engrafted stem cells made multilineage contributions to myeloid, B-lymphoid and T-lymphoid populations, although individual clones often showed biases towards one or other mature cell type. Recipients had lower clonal diversity than matched donors, equivalent to ~10-15 years of additional ageing, arising from up to 25-fold greater expansion of stem cell clones. An HCT-related population bottleneck alone could not explain these differences: instead, phylogenetic trees evinced two distinct modes of HCT-specific selection. In 'pruning selection', cell divisions underpinning recipient-enriched clonal expansions had occurred in the donor, preceding transplant - their selective advantage derived from preferential mobilisation, harvest, survival ex vivo or initial homing. In 'growth selection', cell divisions underpinning clonal expansion occurred through proliferative advantage in the recipient's marrow after homing - clones with multiple driver mutations especially demonstrated this pattern. Uprooting stem cells from their native environment and transplanting them to foreign soil exaggerates selective pressures, distorting and accelerating the loss of clonal diversity compared to the unperturbed haematopoiesis of donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001010875 
   
  
    
    Whole exome sequencing of neoplastic colorectal lesions, matched normal mucosa and peripheral blood leucocytes from 7 individuals. Data is contained within FASTQ files. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  21 
 
  
    EGAD00001010876 
   
  
    
    Exome sequencing was performed on n=28 treatment-naïve esophageal adenocarcinomas (EACs). Three to four biopsies sampling different areas of each tumor were pooled before nucleic acid extractions to mitigate the elevated heterogeneity described for EAC. WES was performed on EAC biopsies at 120X average coverage, with autologous PBMCs used as germline controls at 80X average coverage. Libraries were prepared from 30 ng of input DNA using the SureSelect QXT Human All Exon V7 kit (Agilent Technologies) and sequenced on the NextSeq 550 (Illumina), 2x150 bp. BCL files were demultiplexed to FastQ files using bcl2fastq2 software (Illumina). Three paired-end sequencing batches were analyzed independently (Batch1: samples 8, 10, 11, 12, 15, 17, 18; Batch2: samples 20, 24, 25, 26, 27, 29, 30, 31, 33, 34; Batch3: samples 35, 37, 39, 40, 41, 43, 45, 48, 54, 55, 57). 
RNA sequencing was performed on n=26 treatment-naïve esophageal adenocarcinomas (EACs). Three to four biopsies sampling different areas of each tumor were pooled before nucleic acid extractions to mitigate the elevated heterogeneity described for EAC. RNAseq libraries were prepared from 50 ng of total RNA (with RNA integrity number RIN >=7) with the TruSeq Stranded mRNA library preparation kit (Illumina) in accordance with the low-throughput protocol. After PCR enrichment (15 cycles) and purification of adapter-ligated fragments, the concentration and length of DNA fragments were measured using the D1000 ScreenTape System (Agilent), obtaining a median insert size of 311 nucleotides. RNAseq libraries were then sequenced using the Illumina NovaSeq platform, 1x100 bp, obtaining on average 100 million single reads per sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  166 
 
  
    EGAD00001010877 
   
  
    
    Dataset for the initial melanoma PEACE paper in Cancer Discovery, March 2023. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
      NextSeq 550 
      
    
   
  894 
 
  
    EGAD00001010878 
   
  
    
    In this study a next-generation sequencing based method was applied to comprehensively screen for recurrent, disease-relevant copy number aberrations in a cohort of Hungarian patients. Diagnostic bone marrow samples from 260 children with B-cell acute lymphoblastic leukemia as well as 72 control samples were investigated by digital multiplex ligation-dependent probe amplification using the disease-specific D007 probemix. Whole chromosome gains and losses, as well as subchromosomal copy number aberrations, were simultaneously profiled. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  332 
 
  
    EGAD00001010879 
   
  
    
    MS Risk Gene RNAseq datasets of immune cell subsets (CD4, CD8, B cell, monocyte) from healthy controls and untreated MS cases. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  578 
 
  
    EGAD00001010880 
   
  
    
    The RRBS libraries of the genomic DNA from the 521 tissue samples were constructed following the standard RRBS protocol. 100-200 ng of intact genomic DNA in a volume of 21.5 µl was used as input material. Restriction digestion was done with 2.5 µl 10x CutSmart buffer and 1 µl MspI (NEB) for 18 h at 37 °C and 20 min at 65 °C. 0.5 µl 10x CutSmart buffer, 0.3 µl dACGTP mixture (100 mM dATP, 10 mM dCTP, 10 mM dGTP), 1 µl Klenow (exo-, 5 U/µl, NEB), 2.6 µl RT-PCR water and 0.6 µl 50 mM DTT (ThermoFisher) were added to the mixture for end repair and A-overhang addition with the program 30 °C for 20 min, 37 °C for 1 h and 75 °C for 20 min. Adapter ligation was then performed with 1 µl 10x ThermoFisher HC T4 ligase buffer, 0.4 µl 100 mM ATP (ThermoFisher), 0.2 µl 50 mM DTT, 1 µl ThermoFisher HC T4 DNA ligase (30 Weiss Unit/µl), 30 ng home-made duplex UMI adapter with all the cytosines methylated (protocol adopted from Kennedy et al.) at 16 °C for 20 h and 65 °C for 20 min. Bisulfite conversion of the adapter-ligated product was carried out with QIAGEN EpiTect plus DNA bisulfite kit following their protocol for two rounds of conversion. The converted product was purified with Qiagen MinElute spin column and eluted with 20 µl RT-PCR water. PCR amplification was done using the NEBNext Multiplex Oligos for Illumina (2.5 µl of universal and index primer each) and 25 µl KAPA HiFi HotStart Uracil+ ReadyMix (Roche) with the following cycling conditions: 98 °C for 45 s, 9 cycles of 98 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s, followed by a final extension at 72 °C for 5 min. The PCR product was purified with 1x AmpureXP beads and eluted with 30 µl EB buffer. DNA concentration was measured by Qubit 1x dsDNA HS assay. 5% TBE-UREA PAGE and bioanalyzer assays were performed as quality control on each library before sequencing. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  521 
 
  
    EGAD00001010881 
   
  
    
    The cfMethyl-Seq libraries of the serial plasma cfDNA samples from the four NSCLC patients were constructed following the standard protocol. 10 ng of cfDNA in a volume of 25 µl was used as input material. 5'-end dephosphorylation was done with 3 µl 10x CutSmart buffer and 2 µl quick CIP from NEB (Ipswich, MA) at 37 °C for 30 min then heat-inactivated at 80 °C for 5 min. The 3'-end blocking was done with 0.5 µl 10x CutSmart buffer, 3 µl 2.5 mM CoCl2, 1 µl terminal transferase (all from NEB), and 0.5 µl 1 mM ddGTP at 37 °C for 2 h followed by 75 °C for 20 min. The mixture was then purified with 2x AmpureXP beads (Beckman Coulter, Indianapolis, IN) and eluted in 21.5 µl RT-PCR grade water (Thermo-Fisher, Waltham, MA). Restriction digestion was done with 2.5 µl 10x CutSmart buffer and 1 µl MspI (NEB) for 18 h at 37 °C and 20 min at 65 °C. 0.5 µl 10x CutSmart buffer, 0.3 µl dACGTP mixture (100 mM dATP, 10 mM dCTP, 10 mM dGTP), 1 µl Klenow (exo-, 5 U/µl, NEB), 2.6 µl RT-PCR water and 0.6 µl 50 mM DTT (ThermoFisher) were added to the mixture for end repair and A-overhang addition with the program 30 °C for 20 min, 37 °C for 1 h and 75 °C for 20 min. Adapter ligation was then performed with 1 µl 10x ThermoFisher HC T4 ligase buffer, 0.4 µl 100 mM ATP (ThermoFisher), 0.2 µl 50 mM DTT, 1 µl ThermoFisher HC T4 DNA ligase (30 Weiss Unit/µl), 5 ng home-made duplex UMI adapter with all the cytosines methylated (protocol adopted from Kennedy et al.) at 16 °C for 20 h and 65 °C for 20 min. Bisulfite conversion of the adapter-ligated product was carried out with QIAGEN EpiTect plus DNA bisulfite kit following their protocol for two rounds of conversion. The converted product was purified with Qiagen MinElute spin column and eluted with 20 µl RT-PCR water. 
PCR amplification was done using the NEBNext Multiplex Oligos for Illumina (2.5 µl of universal and index primer each) and 25 µl KAPA HiFi HotStart Uracil+ ReadyMix (Roche) with the following cycling conditions: 98 °C for 45 s, 15 cycles of 98 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s, followed by a final extension at 72 °C for 5 min. The PCR product was purified with 1x AmpureXP beads and eluted with 30 µl EB buffer. DNA concentration was measured by Qubit 1x dsDNA HS assay. 5% TBE-UREA PAGE and bioanalyzer assays were performed as quality control on each library before sequencing. 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  12 
 
  
    EGAD00001010883 
   
  
    
    130 runs / 65 samples of paired RNA-Seq data of chemo-naïve and post-chemotherapy pancreatic ductal adenocarcinoma (PDAC). The sequencing was done on a NovaSeq 6000 with the Illumina TruSeq Stranded mRNA Kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD00001010884 
   
  
    
    Paired RNA-Seq of 32 samples of chemo-naïve and post-chemotherapy PDAC tumors (HIPO_015) to define the molecular and cellular impact of neoadjuvant chemotherapy. Transcriptome analysis combined with high resolution mapping of whole tissue sections identified GATA6 (Classical), KRT17 (Basal-like) and Cytochrome P450 3A (CYP3A) co-expressing cells that were preferentially enriched in post-CTX resected samples. The sequencing was done on HiSeq2000/HiSeq2500 using the Takara_SMARTer_Ultra_Low_Input_RNA_and_NEBNext_ChIP-Seq Kit. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
    
   
  32 
 
  
    EGAD00001010887 
   
  
    
    Reninomas are exceedingly rare renin-secreting kidney tumours that derive from juxtaglomerular cells, specialised smooth muscle cells that reside at the vascular inlet of glomeruli. They are the central component of the juxtaglomerular apparatus which controls systemic blood pressure through the secretion of renin. We assessed somatic changes in reninoma and found structural variants that generate canonical activating rearrangements of NOTCH1, whilst removing its negative regulator, NRARP. Accordingly, in single reninoma nuclei we observed excessive renin and NOTCH1 signalling mRNAs, with a concomitant non-excess of NRARP expression. Re-analysis of previously published reninoma bulk transcriptomes further corroborates our observation of dysregulated Notch pathway signalling in reninoma. Our findings reveal NOTCH1 rearrangements in reninoma, therapeutically targetable through existing NOTCH1 inhibitors, and indicate that unscheduled Notch signalling may be a disease-defining feature of reninoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001010888 
   
  
    
    Reninomas are exceedingly rare renin-secreting kidney tumours that derive from juxtaglomerular cells, specialised smooth muscle cells that reside at the vascular inlet of glomeruli. They are the central component of the juxtaglomerular apparatus which controls systemic blood pressure through the secretion of renin. We assessed somatic changes in reninoma and found structural variants that generate canonical activating rearrangements of NOTCH1, whilst removing its negative regulator, NRARP. Accordingly, in single reninoma nuclei we observed excessive renin and NOTCH1 signalling mRNAs, with a concomitant non-excess of NRARP expression. Re-analysis of previously published reninoma bulk transcriptomes further corroborates our observation of dysregulated Notch pathway signalling in reninoma. Our findings reveal NOTCH1 rearrangements in reninoma, therapeutically targetable through existing NOTCH1 inhibitors, and indicate that unscheduled Notch signalling may be a disease-defining feature of reninoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD00001010889 
   
  
    
    Reninomas are exceedingly rare renin-secreting kidney tumours that derive from juxtaglomerular cells, specialised smooth muscle cells that reside at the vascular inlet of glomeruli. They are the central component of the juxtaglomerular apparatus which controls systemic blood pressure through the secretion of renin. We assessed somatic changes in reninoma and found structural variants that generate canonical activating rearrangements of NOTCH1, whilst removing its negative regulator, NRARP. Accordingly, in single reninoma nuclei we observed excessive renin and NOTCH1 signalling mRNAs, with a concomitant non-excess of NRARP expression. Re-analysis of previously published reninoma bulk transcriptomes further corroborates our observation of dysregulated Notch pathway signalling in reninoma. Our findings reveal NOTCH1 rearrangements in reninoma, therapeutically targetable through existing NOTCH1 inhibitors, and indicate that unscheduled Notch signalling may be a disease-defining feature of reninoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  6 
 
  
    EGAD00001010890 
   
  
    
    The introduction of bowel cancer screening has led to a significant increase in the proportion of patients being diagnosed with asymptomatic, early-stage colorectal cancer (CRC). Although the majority of these patients are successfully treated with surgery alone, a small proportion of patients have 'born-to-be-bad' aggressive lesions with early dissemination leading to distant metastases. Current standard of care histological assessment is unable to distinguish between these aggressive versus non-aggressive early lesions which is essential to provide appropriate clinical management decisions. This study aims to carry out molecular and histological profiling of approximately 300 T1 CRCs in order to develop a molecular stratifier based on the risk of relapse in early-invasive CRC. This novel T1 cohort will represent the world's largest molecularly characterised T1 cohort of samples, with digital pathology assessment alongside whole exome sequencing, copy number variation analysis and 3' RNA-seq. This data will be used to generate a robust panel of molecular and/or histological markers applicable to formalin-fixed paraffin embedded (FFPE) archival tissue which discriminates between T1 lesions based on risk of relapse, which will ultimately be used to inform clinical management of CRC at the earliest stages of the disease. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  255 
 
  
    EGAD00001010891 
   
  
    
    The dataset for Single molecule genome-wide mutation profiles of cell-free DNA for non-invasive detection of cancer includes 57 BAM files from whole genome next-generation sequencing on the Illumina HiSeq2500.  The samples analyzed include plasma samples from individuals with and without cancer. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  57 
 
  
    EGAD00001010892 
   
  
    
    Hybrid capture sequencing was performed on 3 purified Hodgkin and Reed-Sternberg (HRS) cell samples. In brief, probes for 177 genes were designed and synthesized by Twist Bioscience. Hybridization capture of DNA libraries was performed using the Twist Hybridization and Wash Kit (Twist Bioscience). The captured library was measured using an Agilent Bioanalyzer High Sensitivity chip and the Qubit dsDNA HS Assay Kit and run on an Illumina NextSeq 550. The BAM files were generated from the raw sequencing data using the Cell Ranger (v6.0.2) mkfastq and count commands. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  4 
 
  
    EGAD00001010893 
   
  
    
    Fastq.gz files for mRNA sequenced from Mtb-infected and uninfected neutrophils after 1 and 6 hrs. Samples were sequenced in 2 batches as indicated per experiment. Batch 1 was unstranded, 100bp single-end on an Illumina HiSeq4000 sequencer and batch 2 was unstranded, 150bp paired-end on an Illumina NovaSeq6000 sequencer. Phenotypic data for the samples is also included. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001010894 
   
  
    
    This data set includes RNAseq from 77 follicular lymphoma tumours. All tumours were fresh frozen. Libraries were constructed by enriching for poly-A transcripts and sequenced as 75bp paired end reads on an Illumina HiSeq 2500 instrument. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001010895 
   
  
    
    Transcriptome analysis of NK cells sorted from PBMCs at baseline and after addition of a CD20 (B cell)-targeted T cell dependent bispecific antibody (TDB). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001010896 
   
  
    
    ChIP-seq was performed on 4 healthy and 5 tumor fresh-frozen endometrial tissues from post-menopausal patients. Immunoprecipitation was performed for H3K27ac and ERa. Raw single-end fastq data were aligned with bwa-mem using the Hg19 genome assembly as reference. Alignment bam files are provided. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  21 
 
  
    EGAD00001010897 
   
  
    
    4C-seq performed on 10 slices (30 µm thick) of fresh-frozen endometrial tissues. These tissues include 2 healthy tissues and 4 tumor tissues (post-menopausal patients) in replicate. The library was prepared using DpnII (primary) and NlaIII (secondary) restriction enzymes. Raw single-end fastq.gz files are provided. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  11 
 
  
    EGAD00001010898 
   
  
    
    Hi-C libraries were prepared by enzymatic digestion with MboI restriction enzyme and sonication. Illumina single-indexing primers were used to amplify ligated fragments. Hi-C experiments were performed on 10 slices (30 µm thick) of fresh-frozen tissues derived from 3 healthy and 3 tumor endometrial tissues of post-menopausal patients. Raw paired-end fastq.gz files are provided. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD00001010899 
   
  
    
    nNGM analysis of treatment-naive MIBC (N=49) and NMIBC (N=16). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  68 
 
  
    EGAD00001010900 
   
  
    
    This dataset contains genome-wide array data from Amazigh (Chaoui and Mozabite) and non-Amazigh Algerian individuals. Chaoui were sampled in Oum El Bouaghi (n=47), Batna (n=46), and Khenchela (n=37). Mozabite were sampled in Ghardaïa (n=14). Non-Imazighen were sampled in Algiers (n=34). 
    
   
  
    
   
  1 
 
  
    EGAD00001010904 
   
  
    
    Short-read whole genome sequencing analysis of off-target effects after prime editing in the iPSC line KCNQ2 R201C. Comparison of parental KCNQ2 R201C with two corrected clonal lines (3 samples in total). Dataset contains CRAM files and VCF files for the respective samples. 
    
   
  
    
      
      unspecified 
      
    
   
  3 
 
  
    EGAD00001010905 
   
  
    
    Total RNA sequencing (SMARTer Stranded Total RNA-Seq Kit v2) data of extracellular RNA (exRNA) from liquid biopsies of neuroblastoma xenograft models. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  67 
 
  
    EGAD00001010906 
   
  
    
    Human data for transcriptome (bulk RNA-Seq) in eight B-cell precursors: HSC, CLP, pro-B, pre-B, Immature B, Transitional B, Naive B CD5-, Naive B CD5+ cells 
    
   
  
    
      
      NextSeq 500 
      
    
   
  79 
 
  
    EGAD00001010907 
   
  
    
    Human data for transcriptome (scRNA-Seq) in CD34+ B cell precursors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010908 
   
  
    
    Human data for chromatin accessibility (ATAC-Seq) in eight B-cell precursors: HSC, CLP, pro-B, pre-B, Immature B, Transitional B, Naive B CD5-, Naive B CD5+ cells 
    
   
  
    
      
      NextSeq 500 
      
    
   
  78 
 
  
    EGAD00001010909 
   
  
    
    Human data for chromatin accessibility (ATAC-Seq) in eight B-cell precursors: HSC, CLP, pro-B, pre-B, Immature B, Transitional B, Naive B CD5-, Naive B CD5+ cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  78 
 
  
    EGAD00001010910 
   
  
    
    Human data for chromatin accessibility (scATAC-Seq) in CD34+ B cell precursors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001010911 
   
  
    
    10 samples sequenced by targeted sequencing of a panel of 571 genes (Illumina NovaSeq 6000)
- Raw FASTQ data
- Annotated VCF 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001010912 
   
  
    
    15 samples sequenced by RNA-seq. This dataset contains their raw FASTQ data, the raw count table, and the TPM count table. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001010913 
   
  
    
    While gene therapy (GT) provides a potentially curative treatment option for patients with sickle cell disease (SCD), the occurrence of myeloid malignancies in clinical trials has prompted concern. To interrogate potential mechanisms underlying increased cancer risk, we used hematopoietic stem cell (HSC) clonal tracking by whole genome sequencing (WGS) to map the somatic mutation and clonal landscape of 2,592 gene modified as well as unmodified single stem and progenitor cells from six SCD patients undergoing gene therapy (7-26 years old, average 12.7× depth). Pre-GT phylogenetic trees in SCD were highly polyclonal and mutation burdens per cell were elevated in some, but not all, patients. Post-GT, no clonal expansions were identified. However, an increased frequency of driver mutations associated with myeloid neoplasms or clonal hematopoiesis (DNMT3A- and EZH2-mutated clones in particular) was seen in both genetically modified and unmodified cells, suggesting positive selection of mutant clones during gene therapy. This work sheds light on the mutation landscape and HSC clonal dynamics in gene therapy for SCD and highlights enhanced fitness of some HSCs harboring pre-existing driver mutations following gene therapy. Future studies should define the long-term fate of mutant clones including any contribution to expansions associated with myeloid neoplasms. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3394 
 
  
    EGAD00001010914 
   
  
    
    While gene therapy (GT) provides a potentially curative treatment option for patients with sickle cell disease (SCD), the occurrence of myeloid malignancies in clinical trials has prompted concern. To interrogate potential mechanisms underlying increased cancer risk, we used hematopoietic stem cell (HSC) clonal tracking by whole genome sequencing (WGS) to map the somatic mutation and clonal landscape of 2,592 gene modified as well as unmodified single stem and progenitor cells from six SCD patients undergoing gene therapy (7-26 years old, average 12.7× depth). Pre-GT phylogenetic trees in SCD were highly polyclonal and mutation burdens per cell were elevated in some, but not all, patients. Post-GT, no clonal expansions were identified. However, an increased frequency of driver mutations associated with myeloid neoplasms or clonal hematopoiesis (DNMT3A- and EZH2-mutated clones in particular) was seen in both genetically modified and unmodified cells, suggesting positive selection of mutant clones during gene therapy. This work sheds light on the mutation landscape and HSC clonal dynamics in gene therapy for SCD and highlights enhanced fitness of some HSCs harboring pre-existing driver mutations following gene therapy. Future studies should define the long-term fate of mutant clones including any contribution to expansions associated with myeloid neoplasms. 
    
   
  
    
   
  24 
 
  
    EGAD00001010915 
   
  
    
    Transcriptome sequencing of three normal skeletal muscle and 15 pleomorphic rhabdomyosarcoma patient tumors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  20 
 
  
    EGAD00001010917 
   
  
    
    To define a transcriptomic reference of human B lymphopoiesis, bone marrow aspirates were obtained from n=4 healthy donors (study registration DRKS00023583). After immunodensity cell separation, samples were FACS-sorted into 7 established lymphopoietic differentiation stages. RNA was extracted from 5,000-320,000 cells per differentiation stage and subjected to ultra-low-input RNA sequencing after generation of stranded sequencing libraries. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD00001010918 
   
  
    
    BAM files from capture-sequencing dataset described in Veilleux et al. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  158 
 
  
    EGAD00001010919 
   
  
    
    Using RNAseq, we compare the 15% of poorest responders (PRs, n=177) as measured by proportional Ki67 changes after 2 weeks of neoadjuvant aromatase inhibitors to good responders (GRs, n=190) selected from the top 50% responders in the POETIC trial and matched for baseline Ki67 categories. In the POETIC trial, 4,480 postmenopausal women with primary ER+ BC were randomised 2:1 to receive either treatment with a non-steroidal AI (letrozole or anastrozole) for 2 weeks before and 2 weeks after surgery or to no perisurgical treatment. Only AI-treated patients with HER2- tumors, paired baseline and surgery Ki67 available, and baseline Ki67 immunohistochemistry (IHC) >10% (to minimise imprecision in proportional Ki67 falls) were included for selection. Data is baseline RNAseq and targeted exome DNA sequencing analysis of POETIC Good/Poor Responders to aromatase inhibitors based on change in Ki67. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  365 
 
  
    EGAD00001010920 
   
  
    
    Transcriptome profiling of 121 high-risk paediatric cancer samples for identifying T-cell infiltration signatures using poly-A capture by Truseq and sequenced on NextSeq 500 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  240 
 
  
    EGAD00001010921 
   
  
    
    This dataset contains the Visium Spatial Transcriptomics data from treatment naive melanoma lymph node metastatic samples. 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001010922 
   
  
    
    Phenotype data for Lassa Fever cases and population controls from Nigeria and Sierra Leone associated with genotype data generated using Illumina Omni 2.5M and 5M. 
    
   
  
    
   
  2667 
 
  
    EGAD00001010923 
   
  
    
    Phenotype data for Lassa Fever cases and population controls from Nigeria and Sierra Leone associated with genotype data generated using Illumina H3Africa array version 1. 
    
   
  
    
   
  1345 
 
  
    EGAD00001010924 
   
  
    
    Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform RNAseq analysis of a panel of mesothelioma cell lines. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  5 
 
  
    EGAD00001010925 
   
  
    
    This dataset includes metagenomic sequencing of faecal samples from 7,190 Israeli individuals. Single-end sequencing was performed using a NovaSeq sequencing platform (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7190 
 
  
    EGAD00001010926 
   
  
    
    Illumina NovaSeq 6000 30x WGS of 26 samples, each with up to 5 matched timepoints. Timepoints A, B, C, D, and E correspond to Pretreatment, Week 3, Week 6, Week 9, and Week 12 after treatment, respectively. Additional sample metadata (sample recurrence, treatment course, age, sex, comorbidities, etc.) are present in the sample description. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  281 
 
  
    EGAD00001010927 
   
  
    
    This dataset contains raw sequencing files associated with the paper "Neutrophils and emergency granulopoiesis drive immune suppression and an extreme response endotype during sepsis" (https://doi.org/10.1038/s41590-023-01490-5).
It is composed of two experiments, which are as follows:
1. Single-cell profiling of whole blood leukocytes: This experiment comprises sequencing files generated using the BD Rhapsody single-cell multi-omics profiling (RNA + protein) platform. This platform was used to profile the whole blood leukocyte population in a cohort of sepsis patients, cardiac surgery controls and healthy controls. There are 48 samples and four files per sample: two paired-end FASTQ files (R1 and R2) corresponding to the RNA-seq library, and two paired-end FASTQ files (R1 and R2) corresponding to the protein profiling (ADT-based AbSeq) library.
2. Single-cell profiling of circulating HSPCs: This experiment comprises sequencing files generated using the 10X single-cell multi-omics (RNA-seq + ATAC-seq) profiling platform. This platform was used to profile circulating HSPCs in a cohort of sepsis patients and healthy controls. There are 5 samples (or plexes), each of which consists of a pool of individuals for whom cells were multiplexed and sequenced as a single sample. There are four files per sample: two paired-end FASTQ files (R1 and R2) corresponding to the RNA-seq library, and two paired-end FASTQ files (R1 and R2) corresponding to the ATAC-seq library. Because these samples consist of multiplexed pools, index files are also provided. These can be used for sample deconvolution using the Cell Ranger pipeline. 
NOTE: Due to EGA's constraints on the number of files permitted per sample, index files (I1 and I2) for the HSPC data set are provided as a separate experiment named "FASTQ index files (I1 and I2) for deconvolution of 10X single-cell multiomics libraries". 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  106 
 
  
    EGAD00001010928 
   
  
    
    Sequencing libraries were constructed according to standard procedures from 600 ng
of tumor and paired constitutional DNA. WES was captured using Agilent SureSelect V5 (50
Mb), Clinical Research Exome (54 Mb) kit, SureSelect XT human All exon CRE version 1 or
2, or Twist Human Core Exome Enrichment System. Sequencing of subsequent libraries was
performed using Illumina sequencers (NextSeq 500 or HiSeq 2000/2500/4000) in 75 bp
paired-end mode, aiming for a mean depth of coverage of 100x. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1376 
 
  
    EGAD00001010931 
   
  
    
    The dataset for Detecting Liver Cancer Using Cell-Free DNA Fragmentomes includes 444 BAM files from whole genome next-generation sequencing on the Illumina NovaSeq 6000.  The samples analyzed include plasma samples from individuals with and without cancer. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  444 
 
  
    EGAD00001010932 
   
  
    
    Total RNA paired-end sequencing was performed on whole blood samples from 74 Lupus nephritis (LN) patients and 20 healthy controls using Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  84 
 
  
    EGAD00001010934 
   
  
    
    The dataset consists of targeted sequencing data obtained from an independent cohort of 11 patients to validate discriminative DMRs (differentially methylated regions) for acute coronary syndrome (ACS) subtypes. The cohort includes 2 healthy subjects, 4 STEMI (ST-segment elevation myocardial infarction), 3 NSTEMI (non-ST-segment elevation myocardial infarction), and 2 UA (unstable angina) patients. The sequencing panel targeted 18,831 CpGs for analysis, reaching a coverage of at least 5 reads for 75% of the targeted CpGs in each sample. The sequencing was performed using the NEBNext Enzymatic Methyl-seq Module and the Nonacus Cell3™ Target: Library Preparation kit, followed by probe hybridization, capture enrichment, and sequencing on the NovaSeq 6000 platform, generating 400 million reads. The dataset is in raw FASTQ format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001010935 
   
  
    
    The dataset contains genomic data from a discovery cohort of 29 individuals: 8 healthy individuals (control), 8 patients with ST-segment elevation myocardial infarction (STEMI), 7 patients with non-ST-segment elevation myocardial infarction (NSTEMI), and 6 patients with unstable angina (UA). The data were obtained by isolating circulating cell-free DNA (ccfDNA) and subjecting it to bisulfite conversion using a low-input BS-seq (PBAT) protocol. Sequencing was done on the NovaSeq 6000 platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001010936 
   
  
    
    The objective of the colonoscopy study is to carry out 16S sequencing of colon biopsies and faecal samples provided in the ExHiBITT study, in order to compare potential fluctuations in the microbiota at different sites. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1021 
 
  
    EGAD00001010937 
   
  
    
    Low-coverage whole genome methylation sequencing of cell-free DNA (cfDNA) from healthy volunteers (n=2) and allograft transplant recipients (n=11). The cfDNA was extracted from urine and plasma and sequenced using both a single- and double-strand library preparation method (n=15 and n=18). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  33 
 
  
    EGAD00001011041 
   
  
    
    PC-9 cells were barcoded following the protocol from Chang et al. Nature Biotech. 2021, which does not affect downstream analysis. PC-9 cells were trypsinized into single-cell suspensions and processed using the Chromium Single Cell Gene Expression 3’ Library and Gel Bead Kit V2.0 following the manufacturer’s instructions (10X Genomics). Cells were counted and checked for viability using a Vi-CELL XR cell counter (Beckman Coulter), and then injected into microfluidic chips to form Gel Beads-in-Emulsion (GEMs) in the 10X Chromium instrument. Reverse transcription was performed on the GEMs, and RT products were purified and amplified. Expression libraries were made from the cDNA, profiled using the Bioanalyzer High Sensitivity DNA kit (Agilent Technologies) and quantified with the Kapa Library Quantification Kit (Kapa Biosystems). An Illumina HiSeq 2500 (Illumina) was used to sequence the libraries. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001011042 
   
  
    
    This dataset contains bulk RNA sequencing data from different timepoints post-BNT162b2 mRNA COVID-19 vaccination. Stimulation experiments were performed at each of the timepoints (RPMI, Influenza, R848 and Influenza stimuli), resulting in 242 libraries distributed over 4 stimulations and 4 timepoints. Libraries were sequenced on the DNBSEQ platform. 
    
   
  
    
      
      unspecified 
      
    
   
  242 
 
  
    EGAD00001011043 
   
  
    
    This dataset contains whole genome sequences from Illumina NovaSeq devices, sequenced at the WGGC Bonn to study the effects of prolonged paternal exposure to ionizing radiation. Here we provide the reads of all samples, mapped to the hg19 reference genome. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  103 
 
  
    EGAD00001011045 
   
  
    
    Single-cell RNA sequencing was performed on viable frozen tumor dissociated cells from three RCC patients. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001011046 
   
  
    
    TCRab sequencing was performed on viable frozen tumor dissociated cells from three RCC patients. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  20 
 
  
    EGAD00001011047 
   
  
    
    Short-read whole genome and long-read Oxford Nanopore sequencing of matched tumor/normal material from 10 melanoma cases and 1 TNBC case. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      PromethION 
      
    
   
  44 
 
  
    EGAD00001011048 
   
  
    
    Pediatric patients with recurrent and refractory cancers are most in need of new treatments. This study developed patient-derived xenograft (PDX) models within the European MAPPYACTS cancer precision medicine trial (NCT02613962).
To date, 131 PDX models were established following heterotopic and/or orthotopic implantation in immunocompromised mice: 76 sarcomas, 25 other solid tumors, 12 central nervous system tumors, 15 acute leukemias, and 3 lymphomas. The PDX establishment rate was 43%. Histology, whole exome and RNA sequencing revealed a high concordance with the primary patient’s tumor profile, human leukocyte antigen characteristics and specific metabolic pathway signatures. A detailed patient molecular characterization, including specific mutations prioritized in the clinical molecular tumor boards, is provided. Ninety models were shared with the IMI2 ITCC Paediatric Preclinical Proof-of-concept Platform (IMI2 ITCC-P4) for further exploitation.
This new PDX biobank of unique recurrent childhood cancers provides essential support for basic and translational research and for the development of new treatments in advanced pediatric malignancies. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  166 
 
  
    EGAD00001011049 
   
  
    
    This dataset contains a collection of tumour samples from high grade serous ovarian carcinoma patients with recurrent disease collected near the point of diagnosis as well as tumour samples collected after patient relapse upon or some time after study entry. The majority of diagnosis samples were preserved in neutral buffered formalin whereas the majority of post-relapse samples were preserved in universal molecular fixative (UMFIX, Sakura Finetek USA, Inc). DNA was extracted and whole genome sequencing libraries were prepared either using the TruSeq DNA Nano kit (Illumina) or the ThruPLEX DNA-Seq Kit (Takara Bio), which were respectively sequenced at low depth (~0.1-0.5X) on Illumina HiSeq 2500 and HiSeq 4000 sequencing platforms. Sequenced reads were aligned to the GRCh37 reference genome (release hs37d5). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  679 
 
  
    EGAD00001011050 
   
  
    
    ATAC-seq of 79 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). ATAC-seq of CD34+ HSPCs from 3 healthy donors is also included. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001011051 
   
  
    
    Hi-C of 17 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Moreover, Hi-C of CD34+ HSPCs from 3 healthy donors is included. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD00001011052 
   
  
    
    MCIP-seq of 77 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Moreover, MCIP-seq of CD34+ HSPCs from 3 healthy donors is included. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00001011053 
   
  
    
    Whole exome sequencing of tumor material derived from 14 mixed myeloid/lymphoid leukemias with a CpG Island Methylator Phenotype (CIMP). For 4 of these patients, normal material was also sequenced and used as control (files with the same identifiers prepended by an “h”). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011054 
   
  
    
    Total RNA-seq of blasts derived from 131 adult T-ALL cases, 7 AML cases and 1 mixed myeloid/lymphoid leukemia with CpG Island Methylator Phenotype (CIMP). The other RNA-seq data used in this study has been previously published and is available at EGAD00001007581 (AML) and EGAD00001007646 (CD34+ cells). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  81 
 
  
    EGAD00001011055 
   
  
    
    Four different types of transcriptomic single-cell sequencing of blood from childhood B acute lymphoblastic leukemia. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001011056 
   
  
    
    Bulk RNA-seq was conducted on highly purified CD45- CD71- CD235a- CD31- CD271+ BMSCs isolated from a cohort of 62 newly diagnosed AML patients, uniformly treated within an intensive chemotherapy clinical trial and selected to represent the mutational landscape of AML. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  70 
 
  
    EGAD00001011057 
   
  
    
    We performed single-cell RNA sequencing (scRNA-seq) on viably frozen bone marrow (BM) aspirates from 4 healthy donors and 6 NPM1+ AML patients at diagnosis. The aim was to generate a cellular taxonomy of the human NBM and AML that represents both rare hematopoietic stem/progenitor cells (HSPCs) and stromal niche populations, allowing assessment of their cellular diversity and predicted intercellular signaling. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001011058 
   
  
    
    This dataset contains a collection of tumour samples from high grade serous ovarian carcinoma patients with recurrent disease collected near the point of diagnosis as well as tumour samples collected after patient relapse upon or some time after study entry. Whole blood samples were also collected from patients at study entry for the purpose of germline variant detection. The majority of diagnosis samples were preserved in neutral buffered formalin whereas the majority of post-relapse samples were preserved in universal molecular fixative (UMFIX, Sakura Finetek USA, Inc). DNA was extracted and the tagged-amplicon deep sequencing assay was applied (TAm-Seq, Forshew et al. 2012, Sci Transl Med) with the aid of Fluidigm Access Array technology. Targeted loci were sequenced at very high depths (typically >100X) for the detection of both somatic and germline variants. Sequenced reads were aligned to the GRCh37 reference genome (release hs37d5). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina MiSeq 
      
    
   
  933 
 
  
    EGAD00001011059 
   
  
    
    CTCF ChIP-seq of 39 primary samples derived from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001011060 
   
  
    
    H3K27ac ChIP-seq of 79 primary samples derived from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). In addition, 4 samples derived from CD34+ cord blood cells of healthy donors were included. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  3 
 
  
    EGAD00001011061 
   
  
    
    The dataset includes spatially resolved and single-cell antigen receptor data, as well as gene expression data, from two different HER2+ breast cancer patients. The tumor piece obtained during surgery from each patient was divided into several regions and tissue sections were used for spatial transcriptomics (Visium, 10x Genomics). As indicated, some tissue sections were analyzed by a new method (Spatial VDJ) to spatially resolve antigen receptor sequences (target capture), which was developed in our publication. In parallel, tissue pieces from the same tumor were dissociated for single-cell gene expression analysis (10x Genomics GEX, VDJ, and feature barcoding/Hash Tag Oligonucleotide). The deposited data is in the form of FASTQ files. All processed data, metadata, micrographs of the tissue sections (of those used for spatial transcriptomics), and scripts used for the analysis are publicly available at Zenodo (DOI: 10.5281/zenodo.7961605). Final libraries were sequenced on NextSeq 2000 (Illumina) or NovaSeq 6000 (Illumina) and analyzed with Cell Ranger, Seurat, Space Ranger, and STutility pipelines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
      Sequel 
      
    
   
  43 
 
  
    EGAD00001011062 
   
  
    
    The dataset includes spatially resolved gene expression and antigen receptor data from two tonsil samples (Tonsil 1 and Tonsil 2). Tissue sections from the tonsil samples were used for spatial transcriptomics (Visium, 10x Genomics). Tonsil 2 tissue sections were analyzed by a new method (Spatial VDJ) to spatially resolve antigen receptor sequences (target capture), which was developed in our publication. Nearby or adjacent tissue sections (from Tonsil 2) were also analyzed by a bulk antigen receptor sequencing approach (amplicon sequencing), by a method also newly developed by us in the same publication (Bulk SS3 VDJ). For Visium, the data were anonymized (all SNPs removed) using Bamboozle (Ziegenhain and Sandberg, Nature Communications 2021). The deposited data is in the form of FASTQ files. All remaining data, metadata, micrographs of the tissue sections (of those used for spatial transcriptomics), and scripts used for the analysis are available at Zenodo (DOI: 10.5281/zenodo.7961605). Final libraries were sequenced on NextSeq 2000 (Illumina) or NovaSeq 6000 (Illumina) and analyzed with Seurat, Space Ranger, and STutility pipelines. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
      Sequel 
      
    
   
  86 
 
  
    EGAD00001011063 
   
  
    
    We performed a global mutational landscape analysis using tumor samples from the 47 urothelial cancer patients included in MATCH-R or MOSCATO studies, with advanced metastatic disease and WES available (RNAseq is included for 38 patient samples). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  133 
 
  
    EGAD00001011064 
   
  
    
    This dataset contains CLCNKA/CLCNKB locus alignment data from 27 patients with Bartter syndrome and structural variants encompassing the CLCNKB gene. Due to data protection regulations and in accordance with the patient consent, only relevant alignments from the following regions are shared:
hg19: chr1:16,300,000-16,400,000
hg38 (linked read dataset only): chr1:16,000,000-16,100,000
The methods used to generate libraries were: long-range amplicon PCR (24 samples), targeted long-fragment enrichment (Samplix/Xdrop technology, 4 samples), long-read whole genome sequencing (PacBio Sequel II HiFi reads, 3 samples), and 10X linked-read short-read whole genome sequencing (1 sample). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      Sequel 
      
    
   
  27 
 
  
    EGAD00001011065 
   
  
    
    Data supporting: "SMAD4 and KCNQ3 Alterations are Associated with Lymph Node Metastases in Oesophageal Adenocarcinoma" RNAseq (FASTQ files) 6 samples 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001011066 
   
  
    
    Live CD4 T cells were sorted from inflamed and non-inflamed tissue samples of IBD patients or from healthy and IBD blood samples. ATAC-Seq libraries were generated from live CD4 T cells sorted from i) inflamed and non-inflamed tissue samples, ii) healthy and IBD blood samples, or iii) CD4 T cell subsets polarised from healthy blood samples. After isolating crude nuclei, live CD4+ T cells were treated with Tagment DNA buffer and Tagment DNA Enzyme (Nextera DNA Library Prep Kit, Illumina), and then the DNA was purified with the MinElute PCR Purification Kit (Qiagen). Transposed DNA fragments were amplified using specific adapters, followed by purification with the MinElute PCR Purification Kit (Qiagen). Fragments of 240-360 bp were selected in the PippinHT system (Sage Science). The quality of the library and its DNA concentration were assessed with Bioanalyzer instruments (Agilent Technologies), and the library was then submitted for sequencing on an Illumina HiSeq 2500 sequencer with V4 chemistry. Separately, single-cell RNA-Seq libraries were generated exclusively from inflamed and non-inflamed tissue samples of Crohn’s disease patients. Briefly, live CD4 T cells were captured and encapsulated before cDNA amplification using the 10X Genomics Chromium Platform. Samples were prepared as outlined in the 10x Genomics Single Cell 3’ Reagent Kits v2 user guide. Samples were sequenced on a HiSeq 2500 with the following run parameters: Read 1 – 26 cycles, read 2 – 98 cycles, index 1 – 8 cycles. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  80 
 
  
    EGAD00001011067 
   
  
    
    69 OAMZL patients were sequenced on the Illumina platform. The data files are available in BAM format. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  69 
 
  
    EGAD00001011068 
   
  
    
    Ovarian Carcinosarcoma DNA and RNA sequencing of patient samples in the UK cohort (n=18). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  65 
 
  
    EGAD00001011069 
   
  
    
    Whole blood samples were collected at baseline and week 12 in PAXgene Blood RNA tubes. RNA sequencing on the Illumina NovaSeq 6000 System generated 150 bp paired-end reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  164 
 
  
    EGAD00001011074 
   
  
    
    This dataset contains whole-exome and RNA sequencing of biopsies obtained from patients enrolled in a phase I clinical trial investigating the UV1 vaccine in combination with pembrolizumab in patients with advanced melanoma. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  86 
 
  
    EGAD00001011075 
   
  
    
    Phenotype data for 3421 samples from Nigeria and Ghana, sequenced with the Illumina NextSeq 500. 
    
   
  
    
   
  3421 
 
  
    EGAD00001011076 
   
  
    
    Data supporting: "The transcriptional landscape of endogenous retroelements delineates esophageal adenocarcinoma subtypes" RNAseq for 279 samples 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  - 
 
  
    EGAD00001011077 
   
  
    
    FASTQ files describing paired-end RNA-sequencing of isogenic TIRM+ and TIRM- muscle biopsies from 24 FSHD patients (48 samples) and vastus lateralis muscle biopsies from 11 matched control individuals. FASTQ files are also provided describing RNA-sequencing of 15 FSHD peripheral blood mononuclear samples and 14 matched controls. For muscle biopsies sequencing was at 21.7-35.5 million reads/sample. RNA was extracted from PBMCs followed by globin depletion with sequencing at 19.7-46.5 million reads/sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  88 
 
  
    EGAD00001011078 
   
  
    
    Single nuclei transcriptomic data from the AMBITION study. 
    
   
  
    
      
      unspecified 
      
    
   
  10 
 
  
    EGAD00001011079 
   
  
    
    In this study, we used RNA-sequencing gene expression profiling to characterize specific phenotypic traits in DIPG-derived glioma stem cell (GSC) models. Twenty-two primary GSC models derived from biopsies collected at diagnosis were cultured in an ECM-mimicking compound before RNA extraction and subsequent rRNA depletion and sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD00001011080 
   
  
    
    In this study, we used RNA-sequencing gene expression profiling to characterize specific phenotypic traits in DIPG. Seventeen primary tumor samples were stereotactically biopsied at diagnosis by neurosurgeons in the Necker-Enfants Malades hospital (Paris, France) and snap-frozen for RNA extraction and subsequent poly-A mRNA purification. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001011081 
   
  
    
    Whole-genome sequencing data of 168 paired samples, of which 79 samples are archived here. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  79 
 
  
    EGAD00001011082 
   
  
    
    For human samples, total cellular RNA was isolated from post-sorted Mito+ and Mito CD8+ cells using the RNeasy Mini Kit (Qiagen). Stranded RNA libraries were created using SMARTSeq Stranded. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  9 
 
  
    EGAD00001011083 
   
  
    
    Targeted DNA sequencing was performed on 195 bone marrow samples to identify cases of clonal haematopoiesis, and on 99 paired peripheral blood samples. The SeqCap EZ HyperCap protocol was followed, and targeted capture performed against a panel of 97 genes recurrently mutated in myeloid malignancies and clonal hematopoiesis. One BAM file (mapped to the hg38 reference genome) is provided per sample. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  294 
 
  
    EGAD00001011086 
   
  
    
    Whole genome sequencing data of 21 high-grade serous carcinoma (HGSC) patients (59 samples) sequenced with MGISEQ-2000. 
    
   
  
    
      
      unspecified 
      
    
   
  59 
 
  
    EGAD00001011087 
   
  
    
    ZPM WES pilot consisting of 30 paired tumor/normal samples analyzed by WES at four different laboratories in Germany. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  30 
 
  
    EGAD00001011088 
   
  
    
    Our objective was to establish a liquid biopsy-based monitoring strategy for pediatric high-risk neuroblastomas that harbor genomic TERT rearrangements at diagnosis. TERT rearrangement breakpoints are detected in tumor material by hybrid capture-based neuroblastoma DNA panel sequencing (published in PMID: 34442335); they are reflected in cell-free tumor DNA and can serve as robust biomarkers for disease activity. Within the dataset, 5 tumors from 4 pediatric neuroblastoma patients were DNA sequenced. Provided are FASTQ data files, bam and bambai files, as well as breakpoint spanning and encompassing read (enspan) bam and bambai files. The bam files are aligned to the GRCh37.p13 reference genome (processed). 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  5 
 
  
    EGAD00001011089 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
This dataset contains all the data available for this study on 2023-06-22. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  41 
 
  
    EGAD00001011090 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
This dataset contains all the data available for this study on 2023-06-22. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD00001011091 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
This dataset contains all the data available for this study on 2023-06-22. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD00001011092 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
This dataset contains all the data available for this study on 2023-06-22. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  81 
 
  
    EGAD00001011093 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
This dataset contains all the data available for this study on 2023-06-22. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  140 
 
  
    EGAD00001011094 
   
  
    
    Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. 
This dataset contains all the data available for this study on 2023-06-22. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  91 
 
  
    EGAD00001011095 
   
  
    
    Data supporting: "The transcriptional landscape of endogenous retroelements delineates esophageal adenocarcinoma subtypes" WGS for 452 samples 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001011097 
   
  
    
    Additional datasets linked to EGAS00001006692 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011098 
   
  
    
    This dataset contains paired whole exome sequencing data of 5 patients (control/tumor pairs) and paired whole genome sequencing data of 1 patient (control/tumor pair) with Lynch syndrome from the INFORM registry. Paired sequencing was done mostly on Illumina HiSeq 4000, with a few samples on HiSeq 2500 and NovaSeq 6000. Library preparation used either Agilent SureSelect Human_All_Exon V5 (hg19) or Agilent SureSelectXT HS Human_All_Exon V7 (hg19). The WGS samples were prepared with Agilent SureSelect WGS. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001011099 
   
  
    
    To further decipher the cell type composition within single spheres, we performed RNA-Seq of 25 selected spheres from 4 different categories (8 signature-/balanced, 9 signature+/balanced, 1 signature-/aberrant, 7 signature+/aberrant). Samples were sequenced using Illumina NextSeq 2000. FASTQ reads were processed using an in-house RNA-Seq workflow. 
    
   
  
    
      
      unspecified 
      
    
   
  25 
 
  
    EGAD00001011100 
   
  
    
    This dataset contains samples from 8 patients with sclerosing epithelioid fibrosarcoma. All 8 samples have whole exome tumor data and tumor RNA-seq data. 3 samples also have matched normal DNA sequence data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  11 
 
  
    EGAD00001011102 
   
  
    
    This dataset consists of RNA-seq data of 126 whole-blood samples derived from patients after abdominal surgery. Total RNA collected preoperatively and at 3 time points postoperatively (2-6, 24 and 48 h) was analysed using RNA sequencing. RNA was collected in PAXGene (Qiagen) tubes and extracted using the PAXGene RNA extraction kit, with a DNase step to remove contaminating DNA. Globin and ribosomal RNA were removed via a Ribo-zero kit (Illumina) and library preparation was carried out using the TruSeq Stranded Total RNA Library Prep Kit (Illumina). Next generation sequencing was performed on the NovaSeq sequencing platform (Illumina) at the Wellcome Centre for Human Genetics (WCHG) in Oxford. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  252 
 
  
    EGAD00001011103 
   
  
    
    Single-cell RNA-seq analysis of cutaneous immune cells isolated from skin biopsies of psoriasis patients undergoing IL-23 blockade 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  42 
 
  
    EGAD00001011104 
   
  
    
    Optimizing single-cell transcriptomic discrimination of atopic dermatitis versus psoriasis vulgaris 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001011105 
   
  
    
    Bulk RNA-seq on normal human CD19+ cells 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  25 
 
  
    EGAD00001011106 
   
  
    
    A large cohort of 600 cases with familial breast cancer, as classified by the Spanish Society of Medical Oncology (SEOM) Clinical Guidelines Update and recruited over 6 years at Hospital Universitario Morales Meseguer (Murcia, Spain), was retrospectively evaluated to select 16 cases with no positive finding in NGS analysis of 20 genes implicated in this disease. These 16 cases were selected for further investigation using nanopore sequencing. This method involved adaptive sampling enrichment, targeting a panel of 18 human genome regions that contained the 20 genes (PTEN, ATM, BRCA2, PALB2, CDH1, TP53, NF1, RAD51D, BRCA1, RAD51C, BRIP1, STK11, CHEK2, EPCAM, MSH2, MSH6, BARD1, MLH1, PMS2, NBN).
In 5 samples (P1, P2, P4, P15 and P16) no selection of long reads was performed. Additionally, in 3 samples (P7, P9 and P10), both procedures were performed in two independent runs, and for the second run of P7, the DNA was previously fragmented using g-TUBE Covaris® (ref 520079) according to the protocol for 6 kb fragments. 
    
   
  
    
      
      MinION 
      
    
   
  19 
 
  
    EGAD00001011108 
   
  
    
    Sample sheet linking anonymized sample IDs and anonymized patient IDs. 
    
   
  
    
   
  1 
 
  
    EGAD00001011109 
   
  
    
    This study includes 542 multi-region whole exome sequencing samples from 83 pancreatic cancer patients in 6 individual experiments: Kras wildtype, local recurrence, treatment-naive (MetomeV2), radiotherapy plus chemotherapy (Radome), chemotherapy-only group (Treatome) and Proj_B-100-478. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  706 
 
  
    EGAD00001011110 
   
  
    
    195 exome sequencing samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  195 
 
  
    EGAD00001011111 
   
  
    
    No. of samples: 80 (28 ULP-WGS, 26 WES, 26 RNA-SEQ)
File types: FASTQ (28 ULP-WGS, 19 WES, 18 RNA), BAM (7 WES, 8 RNA) and VCF (26 WES, 2 ULP-WGS)
Technology used: Sequencing - Illumina NovaSeq 6000; Map/Align - Illumina DRAGEN v3.7.5; Genome assembly - GRCh38.p13
Filename nomenclature: 
- SampleName_Passage_SampleType_TissueType_SequencingType
- Passage of: PX = unknown; PZ = from patient; P0 = first passage from patient on plastic; P1 = first passage from plastic/PDX/organoid
- SampleType: STN = normal; STT = tumor
- SampleType STN: 00 = tissue unknown; 01 = adjacent normal; 02 = fibroblast; 03 = germline blood; 21 = cell line from patient tissue; 22 = cell line from PDX; 23 = cell line from patient fibroblast
- SampleType STT: 00 = tissue unknown; 01 = primary tumor; 21 = cell line from patient; 22 = cell line from PDX
- TissueType: WT = Wilms tumor; 00 = kidney unknown; 01 = kidney left; 02 = kidney right
- SequencingType: 00 = unknown; 02 = ultra-low pass whole-genome sequencing; 20 = whole-exome; 61 = bulk RNA-sequencing 
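
The nomenclature above can be decoded mechanically. The following is a minimal sketch, not part of the dataset: it assumes the five fields are joined by underscores, that the SampleType token is `STN` or `STT` followed by a two-digit code, and that the TissueType token is a run of two-character codes. The function name `decode_filename` and the example stem `S01_P0_STT01_WT01_20` are hypothetical.

```python
# Lookup tables transcribed from the filename nomenclature above.
PASSAGE = {
    "PX": "unknown",
    "PZ": "from patient",
    "P0": "first passage from patient on plastic",
    "P1": "first passage from plastic/PDX/organoid",
}
SAMPLE_TYPE = {"STN": "normal", "STT": "tumor"}
SAMPLE_DETAIL = {
    "STN": {
        "00": "tissue unknown", "01": "adjacent normal", "02": "fibroblast",
        "03": "germline blood", "21": "cell line from patient tissue",
        "22": "cell line from PDX", "23": "cell line from patient fibroblast",
    },
    "STT": {
        "00": "tissue unknown", "01": "primary tumor",
        "21": "cell line from patient", "22": "cell line from PDX",
    },
}
TISSUE = {"WT": "Wilms tumor", "00": "kidney unknown",
          "01": "kidney left", "02": "kidney right"}
SEQUENCING = {
    "00": "unknown",
    "02": "ultra-low pass whole-genome sequencing",
    "20": "whole-exome",
    "61": "bulk RNA-sequencing",
}


def decode_filename(stem):
    """Decode a file stem (no extension) into human-readable fields.

    Assumed layout: SampleName_Passage_SampleType_TissueType_SequencingType.
    """
    # Split from the right so a sample name containing "_" stays intact.
    name, passage, sample, tissue, seq = stem.rsplit("_", 4)
    kind, code = sample[:3], sample[3:]  # assumed: e.g. "STT" + "01"
    return {
        "sample_name": name,
        "passage": PASSAGE.get(passage, "unrecognized"),
        "sample_type": SAMPLE_TYPE.get(kind, "unrecognized"),
        "sample_detail": SAMPLE_DETAIL.get(kind, {}).get(code, "unrecognized"),
        # Assumed: the tissue token is a sequence of two-character codes.
        "tissue": [TISSUE.get(tissue[i:i + 2], "unrecognized")
                   for i in range(0, len(tissue), 2)],
        "sequencing": SEQUENCING.get(seq, "unrecognized"),
    }
```

Unknown codes are reported as "unrecognized" rather than raising, since the exact token layout of the deposited files is not stated here and should be checked against the real filenames.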
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD00001011112 
   
  
    
    Sample count: 950
Experimentation: Illumina MiSeq amplicon sequencing 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  950 
 
  
    EGAD00001011113 
   
  
    
    This dataset consists of functional genomic data from 6 healthy donors taken from CD14+ monocytes upon different immune stimulations. It contains 150 paired-end fastq files consisting of 30 total RNA-seq samples across 4 runs and 30 ATAC-seq samples. The samples were sequenced on the Illumina HiSeq 4000 platform. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  150 
 
  
    EGAD00001011114 
   
  
    
    Matched tumor-normal data for 6 JPAs. Generated using 10X Genomics Linked-reads 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  12 
 
  
    EGAD00001011115 
   
  
    
    RNA-seq of FACS sorted AT2 cells from ex-smokers with (n=6) and without (n=3) COPD at different disease stages. Fastq files are provided. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  11 
 
  
    EGAD00001011116 
   
  
    
    WGBS of FACS sorted AT2 cells from ex-smokers with (n=6) and without (n=3) COPD at different disease stages. Fastq files are provided. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  11 
 
  
    EGAD00001011117 
   
  
    
    Whole Exome Sequencing Data for 10 patients for treatment with the ICI Nivolumab 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  26 
 
  
    EGAD00001011118 
   
  
    
    The affected twins had their lymph nodes and buccal swabs sequenced with WGS. The unaffected sibling's buccal swab was also sequenced with WGS as a control. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001011119 
   
  
    
    RNA sequencing of 19 samples from PCNSL tumors. Sequencing was performed on a HiSeq X Ten using Illumina TruSeq Stranded mRNA Kit. Sequencing was always paired. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  19 
 
  
    EGAD00001011122 
   
  
    
    We provide clinical data sets of Array CGH, targeted RNA-seq, total RNA-seq, whole genome bisulfite sequencing (WGBS) and whole genome DNA sequencing (WGS) obtained from bone marrow (BM) or peripheral blood (PB) mononuclear cells of 57 pediatric patients with dicentric chromosome dic(9;20) positive acute lymphocytic leukemia (ALL), in 6 of which a DNMT3B gene rearrangement was identified. This data is complemented by total RNA-seq and WGBS of samples from 4 additional ALL patients with a t(12;21) translocation and ETV6-RUNX1 gene fusion. DNA was isolated from BM or PB B-lymphocytes using Qiagen QIAamp DNA Blood Midi Kit to perform i) Array CGH of 58 dic(9;20) positive samples by hybridizing 500ng DNA using an Agilent 400K SurePrint G3 Custom CGH Human Genome Microarray (e-Array design 84704) ii) WGBS of 6 DNMT3B rearrangement positive samples and 4 ETV6-RUNX1 positive samples using Tecan TrueMethyl oxBS-Seq module for library preparation and Illumina NovaSeq 6000 platform to run 2x151 cycles iii) WGS of DNMT3B rearrangement positive samples using Illumina Lotus DNA Library Prep Kit followed by sequencing running 2x160 cycles on an Illumina NovaSeq 6000 platform. RNA was isolated from PB B-lymphocytes using the PerkinElmer Chemagic 360 instrument, followed by i) targeted RNA-seq of 56 dic(9;20) positive samples prepared using Illumina TruSight RNA Pan-Cancer Panel and sequenced on an Illumina MiSeq platform running 2x75 cycles ii) total RNA-seq of 6 DNMT3B rearrangement positive samples and 4 ETV6-RUNX1 positive samples utilizing TruSeq Stranded Total RNA Library Prep Gold kit and running 2x100 cycles on an Illumina NovaSeq 6000 platform. 
    
   
  
    
      
      Illumina MiSeq 
      
      Illumina NovaSeq 6000 
      
    
   
  60 
 
  
    EGAD00001011123 
   
  
    
    Evaluation of somatic mutations in cervicovaginal samples as a non-invasive method for the detection and molecular classification of endometrial cancer 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001011124 
   
  
    
    Limbal stem cells were obtained during penetrating keratoplasty from patients with congenital aniridia (Lagali Stage 4). Limbal tissue was digested in collagenase A solution (4 mg/ml) in keratinocyte serum-free medium (KSFM) (Thermo Fisher Scientific; Waltham, MA) for 20 h at 37 °C. Cell suspensions were filtered through a Flowmi® micro strainer (SP Bel-Art; Wayne, NJ). LSC clusters were dissociated with trypsin-EDTA (0.05%) solution and cultivated in KSFM. Medium was refreshed every other day. Subconfluent (80–90%) limbal epithelial cells were harvested at passage 2. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD00001011125 
   
  
    
    MNC were isolated from bone of 95 AML patients at initial diagnosis via Ficoll gradient. WES has been performed on all 95 samples. Paired exome sequencing was done on a NovaSeq 6000 sequencer with Twist human core exome plus kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  95 
 
  
    EGAD00001011126 
   
  
    
    Whole genome sequencing data of 57 single (PTA), clonally expanded and bulk human cells. Cells were obtained from bone marrow samples of patients with Fanconi Anemia (PMCFANCNN) or pediatric AML (PBNNNNN), from a clonal intestinal organoid line (STE0072/D-ORGWTNISL), from human cord blood (PMCCB15) and from a human lymphoblastoid cell line (PMCAHH1). WGS libraries were sequenced to ~15-30x genomic coverage (paired-end) on an Illumina NovaSeq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  58 
 
  
    EGAD00001011127 
   
  
    
    Targeted DNA-based panel of multiple ctDNA samples from 10 patients throughout clinical care to assess treatment response. The panel is custom-designed and property of UGS, IC. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  107 
 
  
    EGAD00001011128 
   
  
    
    The dataset contains 16 xenograft plasma cfDNA samples from mice grafted with a human colorectal cell line. Shallow WGS was performed on an Illumina NovaSeq S4 PE150bp. Samples are provided as raw reads without any prior processing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001011129 
   
  
    
    Whole genome sequencing data for germline BRCA pancreatic cancer 
    
   
  
    
   
  1 
 
  
    EGAD00001011130 
   
  
    
    Sequencing data from a phase II study of nivolumab and ipilimumab in recurrent or refractory cancer of unknown primary (CheCUP trial). Panel sequencing data from baseline FFPE biopsies were used to perform a comprehensive genomic profiling of CUP metastases. Combined targeted next-generation sequencing of patient-specific hotspot mutations and shallow whole genome sequencing of baseline and follow-up liquid biopsy samples were used to analyze ctDNA and to evaluate response to immune checkpoint inhibitor treatment. In some cases, whole exome sequencing of peripheral blood mononuclear cells was performed to screen for potential CHIP and germline mutations. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  53 
 
  
    EGAD00001011131 
   
  
    
    This study utilized blood samples collected in Lalitpur, Nepal as part of the Strategic Typhoid Alliance across Africa and Asia (STRATAA) study. The dataset comprises whole blood RNAseq of 376 febrile individuals from Nepal. Blood was collected in PAXgene tubes and sent to Monash University (Melbourne, Australia) where RNA was extracted using the PAXgene Blood RNA kit before being sent to the Wellcome Sanger Institute (Hinxton, UK) for sequencing. Library prep used NEBNext Ultra II RNA custom kits on an Agilent Bravo WS automation platform with poly(A) pulldown. After PCR, plates were purified using Agencourt AMPure XP SPRI beads and libraries were quantified using Biotium Accuclear Ultra high sensitivity dsDNA Quantitative kits. Pooled libraries were normalised to 2.8 nM. Samples were globin depleted using KAPA RNA HyperPrep with RiboErase. Libraries were then subjected to 2x100bp paired-end sequencing on Illumina NovaSeq. Each library was sequenced to an average of 80 million reads. The STRATAA study was approved by the Nepal Health Research Council (NHRC, ref 283 306/2015) and OxTREC (Oxford Tropical Research Ethics Committee, ref 39-15). All participants provided informed consent for human genetic tests. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  376 
 
  
    EGAD00001011132 
   
  
    
    In this study we performed whole genome sequencing on matched tumor-normal CD138+ bone marrow mononuclear cells from 60 patients with newly diagnosed multiple myeloma treated with daratumumab, carfilzomib, lenalidomide, and dexamethasone (NCT03290950-MANHATTAN trial). In addition, we performed 5’-single-cell RNA-sequencing (10X Genomics) coupled with V(D)J sequencing and capture of the surface protein markers (TotalSeq-C, Biolegend) of the CD138- bone marrow mononuclear cells to interrogate the composition of the immune microenvironment at baseline and after eight cycles of induction therapy in 22 patients with newly diagnosed multiple myeloma. Samples were multiplexed using hashtag oligos. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  131 
 
  
    EGAD00001011134 
   
  
    
    RNA sequencing data from children with febrile illness and multisystem inflammatory syndrome in children (MIS-C). Samples used were whole blood. Febrile illness controls include children with bacterial and viral infections and healthy controls. This dataset contains samples from patients recruited into the DIAMONDS study. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD00001011135 
   
  
    
    This dataset contains ATAC-seq data from the MM.1S cell line under EtOH (control) or dexamethasone (treatment) conditions. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001011136 
   
  
    
    This dataset gathers ChIP-seq data produced in our own laboratory by immunoprecipitating the CTCF factor in the MM.1S cell line under EtOH and Dex conditions. It also gathers ChIP-seq data produced by an external laboratory (Active Motif) for the H3K27ac mark and the GR transcription factor in the same cell line and conditions (MM.1S EtOH/Dex). 
    
   
  
    
      
      Illumina MiSeq 
      
      unspecified 
      
    
   
  2 
 
  
    EGAD00001011137 
   
  
    
    This dataset gathers all RNA-sequencing data from the MM.1S cell line under control and Dex conditions, each in 3 biological replicates. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD00001011138 
   
  
    
    This dataset gathers HiChIP data for the H3K27ac mark in the MM.1S cell line under control and Dex conditions, each in two biological replicates. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011139 
   
  
    
    This dataset gathers scRNA-seq data from the MM.1S cell line under control and Dex conditions at 4h and 24h. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011140 
   
  
    
    This dataset gathers scMultiomic data, including RNA-seq and ATAC-seq, from the MM.1S cell line under control and Dex conditions at 1h and 4h. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011141 
   
  
    
    BAM files from total RNA sequencing of samples from breast cancer patients in the TNT trial. Data includes 186 primary tumour samples and 13 matched recurrence samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  199 
 
  
    EGAD00001011142 
   
  
    
    TPM matrices of counts from RNA sequencing (RNAseq) from baseline CD138(–) BM fractions. 
    
   
  
    
   
  1 
 
  
    EGAD00001011143 
   
  
    
    Matrix of normalized values from 39-marker CyTOF assay performed on longitudinal BMMC. FCS files from the CyTOF assay were normalized and concatenated using Fluidigm's CyTOF software and then de-multiplexed using Astrolabe Diagnostics, Inc., a commercial, cloud-based platform for single-cell analysis. 
    
   
  
    
   
  1 
 
  
    EGAD00001011144 
   
  
    
    TPM matrices of counts from RNA sequencing (RNAseq) from longitudinal CD138+ enriched BM fractions. 
    
   
  
    
   
  1 
 
  
    EGAD00001011145 
   
  
    
    TPM matrices of counts from RNA sequencing (RNAseq) from longitudinal CD138(–) BM fractions. 
    
   
  
    
   
  1 
 
  
    EGAD00001011146 
   
  
    
    Matrix of normalized values from 39-marker CyTOF assay performed on baseline BMMC. FCS files from the CyTOF assay were normalized and concatenated using Fluidigm's CyTOF software and then de-multiplexed using Astrolabe Diagnostics, Inc., a commercial, cloud-based platform for single-cell analysis. 
    
   
  
    
   
  1 
 
  
    EGAD00001011147 
   
  
    
    Matrix of normalized values from Olink assay performed on baseline BM plasma. The Olink Immuno-Oncology multiplex proteomic panel included 92 proteins associated with human inflammatory conditions. Data were analyzed using real-time PCR analysis software via the Ct method and Normalized Protein Expression (NPX) manager. Data were normalized using internal controls in every sample, inter-plate controls, negative controls and a correction factor, and expressed on a log2 scale, which is proportional to the protein concentration. A difference of one NPX corresponds to a doubling of the protein concentration. 
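
Because NPX is on a log2 scale, a difference in NPX converts to a fold change by exponentiation. The sketch below illustrates only that arithmetic; the function name `npx_fold_change` is hypothetical, and the result is a relative fold change for one protein compared between two samples, not an absolute concentration.

```python
def npx_fold_change(npx_a, npx_b):
    """Relative fold change of sample A over sample B for one protein.

    NPX values are log2-scaled, so a difference of 1 NPX is a factor of 2.
    """
    return 2.0 ** (npx_a - npx_b)


# One NPX unit higher means twice the relative protein level.
print(npx_fold_change(5.0, 4.0))
```

Comparisons of this kind are only meaningful within the same protein assay, since NPX is a relative unit.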
    
   
  
    
   
  1 
 
  
    EGAD00001011148 
   
  
    
    TPM matrices of counts from RNA sequencing (RNAseq) from baseline CD138+ enriched BM fractions. 
    
   
  
    
   
  1 
 
  
    EGAD00001011149 
   
  
    
    Exome sequencing data from two small cell prostate cancer patients - 4 cancer samples (FFPE) from Patient 1 collected at 3 different time points and 2 cancer samples (FFPE) from Patient 2 collected at 1 time point. Exonic DNA was enriched using the TruSeq Exome Kit (Illumina) and sequenced on the Illumina NextSeq 500 as 75bp paired end reads (total read length 150bp). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  8 
 
  
    EGAD00001011150 
   
  
    
    Single-cell genotyping data for bone marrow samples from 9 cases with clonal hematopoiesis and 1 control sample. The TARGET-seq+ protocol was used to generate plate-based 3' transcriptome data. For details on cell sorting and the TARGET-seq+ protocol see the methods section of the manuscript. One FASTQ file is provided per cell. Cells are named with their plate and well IDs and the subject ID. Empty wells (no-cell controls) are named "blank". Corresponding transcriptome files use the same naming with the "_transcriptome" suffix. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  11712 
 
  
    EGAD00001011151 
   
  
    
    Capture-based NGS obtained using the "all-CLL" panel (also known as SOPHiA DDM (TM) Community CLL Clonality Solution). Library preparation was performed following SOPHiA GENETICS recommendations using 200 ng genomic DNA. Libraries were sequenced on a MiSeq instrument (2x300 bp, Illumina) aiming at a mean coverage of 1,000x. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  118 
 
  
    EGAD00001011153 
   
  
    
    The sample cohort (n=48) consists of healthy individuals and patients with atrophic gastritis or gastric cancer. Samples from some gastric cancer patients were collected at separate time points: -1 before the operation; -2 after the operation; -3 during the control visit. For hybridisation capture, a unique 15-gene gastric cancer-related panel was developed, and very deep sequencing was performed using TruSight Oncology Unique Molecular Identifier (UMI) Reagents (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001011154 
   
  
    
    RNASeq data from one small cell prostate cancer patient - 4 cancer samples (FFPE) from Patient 1 collected at 3 different time points. mRNA was selected using the Magnetic mRNA Isolation Module (NEB) and sequenced on the Illumina NextSeq 500 as 75bp paired end reads (total read length 150bp). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001011155 
   
  
    
    This dataset contains raw FASTQ files from single cell RNA sequencing of SarBC-01 cells treated with Dexamethasone vs DMSO, with or without Matrigel, processed with MULTI-seq. Cells from 4 different culture conditions (+Mat/Dex, +Mat/DMSO, -Mat/Dex, -Mat/DMSO) were harvested, processed for multiplexing using the MULTI-seq protocol and loaded in a Chromium Single Cell 3ʹ GEM Library and Gel Bead Kit v3 (10x Genomics). Gene expression (cDNA) and MULTI-seq libraries were prepared according to the manufacturers’ protocol of Chromium Next GEM Single Cell 3’ reagents Kits v3.1 (Dual Index). Finally, cDNA and MULTI-seq libraries were analyzed using an Agilent Bioanalyzer (DNA High Sensitivity kit) and sequenced on a NovaSeq6000 (S2 flow cell) platform.
MULTISEQ BARCODES:
TTAGCCAG => Matrigel/DMSO
CCACAATG => Matrigel/Dexamethasone
GCACACGC => noMatrigel/DMSO
AGAGAGAG => noMatrigel/Dexamethasone 
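
The barcode table above maps directly onto a lookup structure. The following is a simplified sketch of assigning a culture condition to one cell from its MULTI-seq barcode counts. It assumes exact barcode matches and takes the most abundant known barcode; it is not the demultiplexing procedure used by the submitters, and the function name `assign_condition` is hypothetical. Real workflows typically add count thresholds and doublet detection.

```python
from collections import Counter

# Barcode-to-condition table transcribed from the list above.
MULTISEQ_BARCODES = {
    "TTAGCCAG": "Matrigel/DMSO",
    "CCACAATG": "Matrigel/Dexamethasone",
    "GCACACGC": "noMatrigel/DMSO",
    "AGAGAGAG": "noMatrigel/Dexamethasone",
}


def assign_condition(barcode_counts):
    """Return the condition of the dominant known barcode, or None.

    barcode_counts maps barcode sequence -> read (or UMI) count for one cell.
    Barcodes not present in the table are ignored.
    """
    known = {bc: n for bc, n in barcode_counts.items()
             if bc in MULTISEQ_BARCODES}
    if not known:
        return None
    top_barcode = max(known, key=known.get)
    return MULTISEQ_BARCODES[top_barcode]


# Hypothetical counts for a single cell.
cell = Counter({"TTAGCCAG": 40, "AGAGAGAG": 2})
print(assign_condition(cell))
```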
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011156 
   
  
    
    This dataset contains raw FASTQ files from bulk RNA sequencing of SarBC-01 organoids at different passages (Passage 6, Passage 19, Passage 59), UroBC-01 organoids (Passage 70), UroBC-16 organoids (Passage 19) and UroBC-22 organoids (Passage 8). RNA was isolated using the Quick-DNA/RNA Miniprep kit (Zymo Research, Irvine, CA, USA, D7001) and subjected to bulk RNA sequencing. TruSeq Stranded mRNA kit (Illumina, 20020594) was used for the library preparation according to manufacturer’s guidelines. Sequencing was performed on Illumina NovaSeq 6000 using paired-end 100-bp reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001011157 
   
  
    
    This dataset contains raw FASTQ files from DNA of SarBC-01-related samples (SarBC-01 patient Germline, SarBC-01 patient Urothelial tumor, SarBC-01 patient Sarcomatoid tumor, SarBC-01 Passage 6 organoids, SarBC-01 Passage 20 organoids) and UroBC-01 related samples (UroBC-01 patient Germline, UroBC-01 patient Urothelial tumor, UroBC-01 Passage 19 organoids, UroBC-01 Passage 6 organoids). DNA was extracted from FFPE tissue, using RecoverAll RNA/DNA extraction kit (Invitrogen, Carlsbad, CA, USA, AM1975) followed by incubation with Uracil-DNA Glycosylase, and fresh or flash frozen tissue, using Quick-DNA/RNA Miniprep kit (Zymo Research, Irvine, CA, USA, D7001).  Twist Human Core Exome + RefSeq + Mito-Panel kit (Twist Bioscience, 102031) was used for the whole exome capturing. Sequencing was performed on Illumina NovaSeq 6000 using paired-end 100-bp reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD00001011160 
   
  
    
    The dataset contains panel sequencing data of 170 genes from 380 patients of the EORTC-26101 trial. The corresponding methylation data is available via the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) repository with the accession number GSE237103. 
    
   
  
    
   
  1 
 
  
    EGAD00001011161 
   
  
    
    Olink Explore 1536 dataset from serum of patients enrolled in the COVACTA trial. This dataset includes limit of detection values provided by Olink. 
    
   
  
    
   
  1 
 
  
    EGAD00001011162 
   
  
    
    Sample metadata for the RNA-seq dataset. This dataset includes subject-level data and longitudinal visit day information for the corresponding samples. 
    
   
  
    
   
  1 
 
  
    EGAD00001011163 
   
  
    
    This dataset contains data for CBC counts and absolute protein abundance measurements from ELISA experiments. 
    
   
  
    
   
  1 
 
  
    EGAD00001011164 
   
  
    
    RNA-seq data from PAXgene extracted whole blood of patients enrolled in the COVACTA trial. This dataset includes read counts per gene. Read counts were generated for the Gencode v27 annotation using the summarizeOverlaps method from Bioconductor in mode “IntersectionStrict”. 
    
   
  
    
   
  1 
 
  
    EGAD00001011165 
   
  
    
    Sample metadata for the Olink dataset. This dataset includes subject-level data and longitudinal visit day information for the corresponding samples. 
    
   
  
    
   
  1 
 
  
    EGAD00001011166 
   
  
    
    Linking anonymized sample IDs and anonymized patient IDs. 
    
   
  
    
   
  1 
 
  
    EGAD00001011167 
   
  
    
    Olink Explore 1536 dataset from serum of patients enrolled in the COVACTA trial. This dataset includes QC warning flag provided by Olink. 
    
   
  
    
   
  1 
 
  
    EGAD00001011168 
   
  
    
    RNA-seq data from PAXgene extracted whole blood of patients enrolled in the COVACTA trial. This dataset includes raw FASTQ files. 
    
   
  
    
      
      unspecified 
      
    
   
  1646 
 
  
    EGAD00001011169 
   
  
    
    Olink Explore 1536 dataset from serum of patients enrolled in the COVACTA trial. This dataset includes the NPX values provided by Olink. 
    
   
  
    
   
  1 
 
  
    EGAD00001011171 
   
  
    
    In this study, we aimed to assess RNA expression and surface antigen expression of acute myeloid leukemia with complex karyotype (CK-AML) at the single-cell level. For this purpose, we performed cellular indexing of transcriptomes and epitopes (CITE-seq) of primary leukemia samples from four CK-AML patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001011172 
   
  
    
    In this study, we aimed to identify somatic structural variation of acute myeloid leukemia with complex karyotype (CK-AML) at the single-cell level and to investigate its direct consequence on the nucleosome occupancy using scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of primary leukemia samples from four CK-AML patients. We also performed strand-specific single-cell sequencing of two patient-derived xenografts (PDXs). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  364 
 
  
    EGAD00001011173 
   
  
    
    In colorectal cancers (CRC) the tumor microenvironment (TME) plays a key role in prognosis and therapy efficacy. Patient-derived tumor organoids (PDTOs) show enormous potential for preclinical testing; however, in purely epithelial cultures, features including the ‘consensus molecular subtypes’ (CMS) are largely eradicated. To better reflect the cell type heterogeneity, we established the CRC organoid-stroma biobank of matched PDTOs and cancer-associated fibroblasts (CAFs) from 30 patients. Whole exome sequencing and transcriptome analysis in various in vitro and in vivo contexts were performed to study the influence of the TME on the CRC phenotype. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
    
   
  258 
 
  
    EGAD00001011174 
   
  
    
    We performed whole genome sequencing on 42 prostate cancer samples from the prostate, seminal vesicles and regional lymph nodes of five treatment-naive patients with locally advanced disease who underwent radical prostatectomy. Whole genome sequencing was performed as 150bp paired end reads on the Illumina NovaSeq 6000 platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD00001011175 
   
  
    
    Single-cell whole transcriptome sequencing data for bone marrow samples from 9 cases with clonal hematopoiesis and 4 control samples. The TARGET-seq+ protocol was used to generate plate-based 3' transcriptome data. For details on cell sorting and the TARGET-seq+ protocol see the methods section of the manuscript. One FASTQ file is provided per cell. Cells are named with their plate and well IDs and the subject ID. Empty wells (no-cell controls) are named "blank". Corresponding genotyping files use the same naming without the "_transcriptome" suffix. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14073 
 
  
    EGAD00001011176 
   
  
    
    This dataset contains RNA-seq data from four conditions: untreated small intestinal organoids; small intestinal organoids treated with the chemotherapeutic busulfan; untreated small intestinal organoids co-cultured with mesenchymal stromal/stem cells (MSCs); and busulfan-treated small intestinal organoids co-cultured with MSCs. The same set of samples was generated for 3 different primary bone marrow MSC donors. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001011178 
   
  
    
    This study contains methyl-binding domain sequencing and shallow whole genome sequencing from circulating cell-free DNA (cfDNA) for 143 patients with metastatic cancer of known type, 41 patients with Cancer of Unknown Primary (CUP) and 27 non-cancer controls. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  211 
 
  
    EGAD00001011180 
   
  
    
    The dataset contains the methylome EM-sequencing raw data (fastq) of different spermatogenic cells from 5 human males (three controls and two cryptozoospermic). The datasets correspond to the following cell types: undifferentiated spermatogonia, differentiating spermatogonia, 4C spermatocytes, and 1C spermatids (this cell type only for the control individuals). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD00001011186 
   
  
    
    Set of 19 patients with colorectal cancer with matching preoperative and postoperative blood plasma, PBMC, and tumor biopsy sequencing data. Originally referenced by Zviran et al., Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring. Nat Med. 2020. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  72 
 
  
    EGAD00001011187 
   
  
    
    Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WGS OACs/BOs/normals) 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  63 
 
  
    EGAD00001011188 
   
  
    
    Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WES OACs/BOs/normals) 
    
   
  
    
      
      unspecified 
      
    
   
  170 
 
  
    EGAD00001011189 
   
  
    
    Data supporting: "TBC" Ganguli et al (sWGS for 75 samples) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001011190 
   
  
    
    Data supporting: "TBC" Ganguli et al (RNA for 394 samples) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      unspecified 
      
    
   
  49 
 
  
    EGAD00001011191 
   
  
    
    Data supporting: "TBC" Ganguli et al (WGS for 1298 samples) 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  42 
 
  
    EGAD00001011192 
   
  
    
    This study aims to investigate the dysregulation of RNA translation and identify functional non-canonical open reading frames (ORFs) as potential targets for medulloblastoma treatment. The study involves ribosome profiling and RNAseq of medulloblastoma tissues and cell lines to observe the translation of non-canonical ORFs. Multiple CRISPR-Cas9 screens will be used to identify functional non-canonical ORFs implicated in medulloblastoma cell survival. 
    
   
  
    
   
  4 
 
  
    EGAD00001011193 
   
  
    
    This study aims to investigate the dysregulation of RNA translation and identify functional non-canonical open reading frames (ORFs) as potential targets for medulloblastoma treatment. The study involves ribosome profiling and RNAseq of medulloblastoma tissues and cell lines to observe the translation of non-canonical ORFs. Multiple CRISPR-Cas9 screens will be used to identify functional non-canonical ORFs implicated in medulloblastoma cell survival. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  4 
 
  
    EGAD00001011194 
   
  
    
    Single-cell ATAC-seq of pediatric AML tumours from patients enrolled in the clinical trial AAML1031. Obtained using the 10X Chromium NextGEM Single Cell ATAC Reagent Kit, v1.1. A total of 64 samples were obtained from biopsies at diagnosis, remission and relapse from 25 patients. 
    
   
  
    
      
      DNBSEQ-G400 
      
    
   
  64 
 
  
    EGAD00001011195 
   
  
    
    Single-cell RNA-seq of pediatric AML tumours from patients enrolled in the clinical trial AAML1031. Obtained using the 10X Chromium Single Cell 3’ Reagent Kit, v3.0. A total of 75 samples were obtained from biopsies at diagnosis, remission and relapse from 28 patients. 
    
   
  
    
      
      DNBSEQ-G400 
      
    
   
  62 
 
  
    EGAD00001011196 
   
  
    
    Data supporting: "Mutational signature dynamics shaping the evolution of oesophageal adenocarcinoma" Abbas et al (WGS for 1397 samples) 
    
   
  
    
      
      HiSeq X Five 
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  6 
 
  
    EGAD00001011197 
   
  
    
    Transcriptomic data generated by RNA-sequencing for adult human AMLs with STAG2 or RAD21 mutations or no cohesin mutations (CTRL-AMLs). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011198 
   
  
    
    High-throughput chromosome conformation capture (Hi-C) data generated for cohesin-mutated (STAG2 or RAD21) and cohesin-wildtype AMLs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011199 
   
  
    
    ChIP-Seq targeting the major cohesin core subunit RAD21 to represent cohesin occupancy and binding sites in cohesin-mutated (STAG2 or RAD21 mutations) and wildtype adult AMLs. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011200 
   
  
    
    ChIP-Seq targeting CTCF in cohesin-mutated (STAG2 or RAD21 mutations) and wildtype adult AMLs (CTRL-AMLs). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011201 
   
  
    
    This data includes scRNA-seq, scTCR-seq and scBCR-seq of 21 individuals post COVID-19 vaccination. Individuals range in age from 52 to 75. Samples were genotype multiplexed in an overlapping mixture design, pooled, and sequenced on 16 lanes (10X, 5' GEM). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001011204 
   
  
    
    ChIP-seq targeting the H3K27ac histone modification in cohesin-mutated (STAG2 or RAD21 mutation) and cohesin wildtype (CTRL-AMLs) AMLs. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001011205 
   
  
    
    ChIP-Seq targeting the cohesin subunit STAG2 in STAG2-mutant or cohesin wildtype adult AMLs. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  5 
 
  
    EGAD00001011206 
   
  
    
    ChIP-Seq targeting the cohesin subunit STAG1 in STAG2-mutated AMLs or cohesin wildtype AMLs. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011207 
   
  
    
    Low-pass whole-genome DNA sequencing of cohesin-mutated (STAG2 or RAD21 mutations) and wildtype (CTRL-AML) adult AMLs generated from ultrasound-fragmented genomic DNA. Samples were sequenced only at shallow depth (20-40 million reads). Used for digital karyotyping and ChIP-seq background/copy-number normalization/correction. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001011208 
   
  
    
    The dataset contains methylation values of all SNP-filtered CpG sites for all samples from the air pollution study (total n=60). Nasal lavage samples were collected from n=29 moderately exposed (residing in Stuttgart) and n=31 lowly exposed (residing in Simmerath) individuals. For methods and study details, please see PMID 37343754. 
    
   
  
    
   
  1 
 
  
    EGAD00001011209 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0063_000 for Triple negative breast cancer patient-derived xenograft SA609X3XB01584 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011210 
   
  
    
    10x Single Cell Gene Expression library TENX068 for Triple negative breast cancer patient-derived xenograft SA609X4XB03080 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011211 
   
  
    
    10x Single Cell Gene Expression library TENX069 for Triple negative breast cancer patient-derived xenograft SA609X4XB03083 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011212 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0150_001 for Triple negative breast cancer sample SA609X5XB03230 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001011213 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0150_002 for Triple negative breast cancer sample SA609X5XB03231 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq X 
      
    
   
  1 
 
  
    EGAD00001011214 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0146_002 for Triple negative breast cancer patient-derived xenograft SA609X5XB03223 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011215 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0152_001 for Triple negative breast cancer patient-derived xenograft SA609X6XB03401 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011216 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0152_002 for Triple negative breast cancer sample SA609X6XB03404 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011217 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0172_001 for Triple negative breast cancer sample SA609X6XB03447 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011218 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0163_002 for Triple negative breast cancer sample SA609X7XB03510 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011219 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0163_001 for Triple negative breast cancer patient-derived xenograft SA609X7XB03505 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011220 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0172_002 for Triple negative breast cancer sample SA609X7XB03554 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011221 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0148_001 for Triple negative breast cancer patient-derived xenograft SA535X4XB02498 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011222 
   
  
    
    10x Single Cell Gene Expression library TENX048 for Triple negative breast cancer sample SA535X5XB02895 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011223 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0146_006 for Triple negative breast cancer sample SA535X6XB03099 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011224 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0175_001 for Triple negative breast cancer patient-derived xenograft SA535X6XB03101 
    
   
  
    
      
      BGISEQ-500 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011225 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0182_001 for Triple negative breast cancer patient-derived xenograft SA535X7XB03304 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011226 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0189_001 for Triple negative breast cancer patient-derived xenograft SA535X7XB03448 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011227 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0184_001 for Triple negative breast cancer sample SA535X7XB03305 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011228 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0146_004 for Triple negative breast cancer sample SA535X7XB03305 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011229 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0189_002 for Triple negative breast cancer patient-derived xenograft SA535X8XB03663 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011230 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0182_002 for Triple negative breast cancer sample SA535X8XB03431 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011231 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0184_002 for Triple negative breast cancer patient-derived xenograft SA535X8XB03434 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011232 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0173_002 for Triple negative breast cancer patient-derived xenograft SA535X9XB03617 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011233 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0173_001 for Triple negative breast cancer patient-derived xenograft SA535X9XB03616 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011234 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0195_001 for Triple negative breast cancer patient-derived xenograft SA535X8XB03664 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011235 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0206_002 for Triple negative breast cancer patient-derived xenograft SA535X10XB03696 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011236 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0206_001 for Triple negative breast cancer sample SA535X10XB03693 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011237 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0208_002 for Triple negative breast cancer patient-derived xenograft SA535X9XB03776 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011238 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0142_002 for Triple negative breast cancer sample SA1035X4XB02879 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011239 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0071_000 for Triple negative breast cancer sample SA1035X5XB03015 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011240 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0076_000 for Triple negative breast cancer sample SA1035X5XB03021 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011241 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0079_001 for Triple negative breast cancer sample SA1035X6XB03216 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011242 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0079_002 for Triple negative breast cancer patient-derived xenograft SA1035X6XB03211 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011243 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0142_004 for Triple negative breast cancer sample SA1035X6XB03209 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011244 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0145_001 for Triple negative breast cancer patient-derived xenograft SA1035X7XB03338 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011245 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0145_002 for Triple negative breast cancer patient-derived xenograft SA1035X7XB03340 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011246 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0162_002 for Triple negative breast cancer sample SA1035X7XB03502 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011247 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0151_001 for Triple negative breast cancer patient-derived xenograft SA1035X8XB03425 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011248 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0162_001 for Triple negative breast cancer sample SA1035X8XB03420 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011249 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0175_002 for Triple negative breast cancer patient-derived xenograft SA1035X8XB03631 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011250 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0164_002 for Triple negative breast cancer sample SA530X3XB03295 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011251 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0014_001 for Triple negative breast cancer patient-derived xenograft SA604X6XB01979 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011252 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0015_001 for Triple negative breast cancer patient-derived xenograft SA604X6XB01979 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011253 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0019_001 for Triple negative breast cancer patient-derived xenograft SA604X7XB02089 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011254 
   
  
    
    10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0020_002 for Triple negative breast cancer patient-derived xenograft SA604X8XB02164 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011255 
   
  
    
    Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WGS BOs/normals) 
    
   
  
    
      
      unspecified 
      
    
   
  28 
 
  
    EGAD00001011256 
   
  
    
    Dataset contains WGS sequencing data from bulk sorted therapy-related myeloid neoplasms (t-MN) and reference cells (MSCs/B cells/T cells). In addition, from 4 patients, also WGS data from single hematopoietic stem and progenitor cells, obtained from samples of t-MN diagnosis, are included. These are either clonally expanded before WGS, or DNA was directly amplified via the primary template-directed amplification (PTA) protocol (mentioned in the sample name). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  80 
 
  
    EGAD00001011257 
   
  
    
    WGS data of clonally expanded HSPCs from a Li-Fraumeni patient at the time of second cancer (Burkitt lymphoma and <5% t-MN) after primary osteosarcoma diagnosis and a reference MSC bulk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD00001011258 
   
  
    
    We identified a T-cell receptor (TCR) reactive to the recurrent FLT3 D835Y mutation in the tyrosine-kinase domain. To validate the elimination efficacy of leukemia cells, we transplanted human acute myeloid leukemia (AML) cells with FLT3 D835Y mutations into NSG-SGM3 mice and treated either with TCR FLT3 D835Y redirected T cells, or control TCR (TCR 1G4). After treatment, we performed flow sorting of AML blasts (CD3-CD19-) and primary T cells (CD3+CD8+orCD4+CD19-CD33-) and performed whole-exome sequencing. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  4 
 
  
    EGAD00001011259 
   
  
    
    10x genomics single-cell RNAseq of an isogenic human iPSC model for SMA and control. The transcriptomic analysis was performed at 3 timepoints, day 4, day 20 and day 40. The analysis of this dataset was reported in the manuscript "An isogenic human iPSC model unravels neurodevelopmental abnormalities in SMA" from Grass et al.: 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001011260 
   
  
    
    A prospective study of individuals with suspicion of a hereditary cancer syndrome for whom previous clinical targeted genetic testing was either not informative or was not available. To identify pathogenic disease-causing variants explaining participant presentation, germline whole-genome sequencing and a comprehensive cancer virtual gene panel analysis were undertaken. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  182 
 
  
    EGAD00001011264 
   
  
    
    This dataset contains the count matrices and corresponding metadata for our study on the response of bronchial epithelial cells to RSV in healthy individuals and in asthma. This scRNAseq data is from primary cells that have been differentiated in ALI cultures and infected with RSV. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001011265 
   
  
    
    Data from sequencing of microbiopsies of keratinocytes isolated via laser capture microscopy from lesional and non-lesional skin biopsies from psoriasis patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1211 
 
  
    EGAD00001011267 
   
  
    
    ATAC-seq (Illumina TDE1 Transposase) to profile accessible chromatin regions of cohesin-mutated (STAG2 or RAD21 mutations) and -wildtype adult AMLs. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011268 
   
  
    
    10x Single Cell Gene Expression library TENX063 for Triple negative breast cancer patient-derived xenograft SA609X3XB01584 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001011269 
   
  
    
    Data supporting: "Mutational signature dynamics shaping the evolution of oesophageal adenocarcinoma" Abbas et al (RNA for 197 samples) 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  - 
 
  
    EGAD00001011271 
   
  
    
    Whole genome sequencing on DNA from snap frozen tumour sample and matched whole blood of patient #130. Analysis used the QIAamp DNA Mini Kit. Libraries were prepared using the Illumina TruSeq Nano library method using 200ng of DNA. Extracted DNA was sheared using the Covaris M220 Focused-ultrasonicator with a target fragment length of 550bp through bead size selection. The libraries were sequenced at a depth of 40x for germline DNA and 80-100x for tumour DNA using paired 150bp reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011272 
   
  
    
    Whole Exome Sequencing of blood and FFPE tumour of patient #368. 150–300 ng of DNA was fragmented to approximately 200 bp using a focal acoustic device. Libraries were prepared with the Kapa Hyper Prep Kit and SureSelectXT adaptors. Hybridisation capture was performed with SureSelect Clinical Research Exome V2 baits following the SureSelectXT recommended protocol (Agilent). Indexed libraries were sequenced on an Illumina NovaSeq 6000 to generate paired-end 150 bp reads with an average of 70-fold base coverage for the germline sample and 330-fold coverage for the tumour sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011273 
   
  
    
    Single RNA-Seq of CD11b Beads selected tumor associated macrophages (TAMs) of 3 glioblastoma patients treated with small molecule inhibitors. The sequencing was done on HiSeq 4000 with the SmarTer Ultra Low Input RNA v4 and NEBNext ChIP-Seq Kit. The TAMs were treated with GW2580, BLZ945 and PLX3397 and DMSO as control. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  12 
 
  
    EGAD00001011274 
   
  
    
    Tumor Organoids from glioblastoma, 2 patients, treated with different small molecule inhibitors. Paired RNA-Seq was done on NovaSeq 6000 with the Illumina TruSeq stranded mRNA Kit. The small inhibitors GW2580, BLZ945 and PLX3397 were used. DMSO was used as control. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD00001011275 
   
  
    
    This dataset contains scRNA-seq data from 8 co-cultures of GSCCs and macrophages, and 1 monoculture of macrophages. Samples were individually labeled and pooled using MULTI-seq technology and processed with 10x Genomics Technology. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011276 
   
  
    
    This dataset contains DNA sequencing data from 7 GSCCs (data for BT569 is not available). We performed focused exome sequencing (43 most mutated genes in GBM), combined with OneSeq analysis (Agilent) on the GSCCs and identified both shared and GSCC-specific mutations. For CME038, whole genome sequencing was performed. 
    
   
  
    
      
      unspecified 
      
    
   
  7 
 
  
    EGAD00001011278 
   
  
    
    This dataset contains paired-end whole-exome sequencing data (2x50 bp) from the normal samples, the primary tumors and the recurrences/metastases of 8 head and neck cancer patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  27 
 
  
    EGAD00001011279 
   
  
    
    This dataset contains paired-end RNA sequencing data (2x50 bp) from the primary tumors and the recurrences/metastases of 6 head and neck cancer patients. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  15 
 
  
    EGAD00001011281 
   
  
    
    Single-cell CITE(cellular indexing of transcriptomes and epitopes)-seq from MDS (n = 2, MDS02 in 2 replicates). cDNA from 10x Genomics 3' V3. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD00001011282 
   
  
    
    Single-cell long-read (ONT) transcriptome sequencing from CH (n = 3), MDS (n = 5, MDS02 in 2 replicates) and AML (n =1, in 2 replicates) samples with mutations in splicing factors (SF3B1 - n = 8, U2AF1 n = 1) or transcription regulators (DNMT3A - CH04). Full length cDNA from 10x Genomics 3' V3. 
    
   
  
    
      
      MinION 
      
      PromethION 
      
    
   
  11 
 
  
    EGAD00001011283 
   
  
    
    Single-cell RNA sequencing from CH (n = 2), MDS (n = 6, MDS02 in 2 replicates) and AML (n = 1, in 2 replicates) samples with mutations in splicing factors (SF3B1 - n = 8, U2AF1 n = 1). cDNA from 10x Genomics 3' V3. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD00001011284 
   
  
    
    Single-cell targeted amplicon RNA sequencing from CH (n = 3), MDS (n = 6, MDS02 in 2 replicates) and AML (n =1, in 2 replicates) samples with mutations in splicing factors (SF3B1 - n = 8 [2 -CH, 6 MDS], U2AF1 n = 1) or transcription regulators (DNMT3A - CH04). The transcripts are targeted to detect a particular CH mutation in the listed genes. U2AF1 GoT are full-length cDNA, while the rest follow the 10x 3' V3 protocol. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD00001011285 
   
  
    
    Multiple large-scale genomic profiling efforts have been undertaken in osteosarcoma to define the genomic drivers of tumorigenesis, therapeutic response, and disease recurrence. The spatial and temporal intratumor heterogeneity could also play a role in promoting tumor growth and treatment resistance. Here, we conducted longitudinal whole-genome sequencing of 37 tumor samples from eight patients with osteosarcoma that relapsed or became refractory to initial therapy. We found that the chemoresistant population in recurrent osteosarcoma is subclonal at diagnosis, emerges at the time of primary resection due to selective pressure from neoadjuvant chemotherapy, and is characterized by unique oncogenic amplifications. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  37 
 
  
    EGAD00001011286 
   
  
    
    AD Samples 1 and 2.
AD1 has 2 runs of additional sequencing, AD2 has 1 run of additional sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD00001011288 
   
  
    
    Single cell transcriptomics (10x 3') of human adrenal gland. The adrenal gland of this particular individual is characterized by the presence of a mutant clone with aneuploidy in chromosomes 8, 9, 13 and 22, occupying the zona glomerulosa and zona fasciculata. This dataset contains the primary sequencing data generated from the 10x genomics 3' library in fastq format. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  1 
 
  
    EGAD00001011291 
   
  
    
    This dataset contains sequencing data from VLP-enriched fecal microbiome from LLNEXT project 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  206 
 
  
    EGAD00001011293 
   
  
    
    This dataset contains sequencing data from fecal microbiome from LLNEXT project 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  321 
 
  
    EGAD00001011294 
   
  
    
    RNASeq files for paper titled "Proposal of a new genomic framework for categorization of pediatric acute myeloid leukemia associated with prognosis" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  307 
 
  
    EGAD00001011295 
   
  
    
    WGS files for paper titled "Proposal of a new genomic framework for categorization of pediatric acute myeloid leukemia associated with prognosis" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  264 
 
  
    EGAD00001011296 
   
  
    
    WXS files for paper titled "Proposal of a new genomic framework for categorization of pediatric acute myeloid leukemia associated with prognosis" 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  215 
 
  
    EGAD00001011297 
   
  
    
    Dataset consists of 216 samples, with each sample having a BAM, BAI and VCF file. 
    
   
  
    
      
      Ion Torrent S5 
      
    
   
  216 
 
  
    EGAD00001011300 
   
  
    
    116 single cell ATAC runs and 23 single cell multiome (ATAC+RNA) runs on various brain regions during human first-trimester development. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  116 
 
  
    EGAD00001011301 
   
  
    
    Dataset is described in doi: https://doi.org/10.1101/2022.12.13.22283363 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  108 
 
  
    EGAD00001011302 
   
  
    
    We performed Whole Exome (WXS) and RNASeq sequencing on samples obtained from the same site before and during therapy from our prospective clinical trial (CA209-153, NCT02066636) of nivolumab in advanced Non-small cell lung cancer (NSCLC) patients that progressed on chemotherapy. There are 58 pre and 42 on therapy WXS samples and 24 pre and 12 on therapy RNASeq samples. All WXS tumor samples have matching normal samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  200 
 
  
    EGAD00001011303 
   
  
    
    In this study nanopore sequencing was applied to obtain sparse DNA methylation profiles from pediatric CNS tumor samples. A neural network was used to classify the tumor based on the obtained methylation profile. 
    
   
  
    
      
      MinION 
      
      PromethION 
      
    
   
  62 
 
  
    EGAD00001011304 
   
  
    
    Whole genome sequencing data of 26 high-grade serous carcinoma (HGSC) patients (87 samples) sequenced with MGISEQ-2000 and HiSeq X Ten. 
    
   
  
    
      
      HiSeq X Ten 
      
      unspecified 
      
    
   
  84 
 
  
    EGAD00001011305 
   
  
    
    As part of the study "Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation" we wanted to identify which SNVs are located in STRC and which in STRCP1. For this we applied targeted long-read sequencing. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  7 
 
  
    EGAD00001011309 
   
  
    
    Diffuse large B-cell lymphoma (DLBCL) is the most common non-Hodgkin lymphoma (NHL), comprising 25-30% of all NHL in developed countries with an annual incidence in the USA of 7 cases/100000 persons/year. Collectively, DLBCL is classified based on a common morphological appearance of diffuse growth of large transformed B-cells, immunophenotype, high proliferation rate and aggressive behaviour. Despite these similarities, DLBCLs are a heterogeneous collection of malignancies with distinct clinical and molecular characteristics that do not always correlate with immunohistological features. This gene expression dataset includes transcriptomes of ABC-DLBCLs and of GCB-DLBCLs where cell of origin is determined by the HTG-EdgeSeq quantitative nuclease protection assay. Also included are clonality results from BCR profiling from high-grade B-cell lymphomas sequenced using a NOVA sequencer 
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD00001011311 
   
  
    
    We performed single cell RNA- and TCR-sequencing (10x Genomics) on immune infiltrates (CD45+ cells) from 18 HNSCC patients enrolled in the IMCISION trial (Vos et al. 2021). Viable immune cells were isolated from pre-treatment and post-treatment primary tumor biopsies of 10 patients responding (1 partial pathological response and 9 major pathological responses) and 7 patients non-responding to anti-PD-1 and anti-CTLA4 combination immunotherapy. One patient treated with anti-PD-1 monotherapy (1 major pathological response) was included in the dataset. Bulk TCR-seq was performed on the PBMCs of responding patients, pre- and post-treatment. 
    
   
  
    
      
      unspecified 
      
    
   
  137 
 
  
    EGAD00001011312 
   
  
    
    This dataset contains raw sequencing data from multi-timepoint cell-free methylated DNA immunoprecipitation and sequencing (cfMeDIP-seq) of plasma samples from the INSPIRE study (NCT02644369). Details about the study, including inclusion/exclusion criteria and interventions are available at https://clinicaltrials.gov/study/NCT02644369. Briefly, five cohorts were included, INS-A (head & neck squamous cell carcinoma), INS-B (triple-negative breast cancer), INS-C (high-grade serous ovarian cancer), INS-D (melanoma), and INS-E (mixed solid tumors). All patients in the study were diagnosed with advanced cancer and received treatment with pembrolizumab. Plasma samples were collected at baseline and every 3 cycles until disease progression, death, loss of follow-up, or completion of the study. Plasma samples were first processed for cell-free DNA mutations and remaining samples were then processed by cfMeDIP-seq, prioritizing baseline and post-cycle 3 samples. In total, data from 204 timepoints from 87 distinct patients are deposited. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  204 
 
  
    EGAD00001011315 
   
  
    
    We profiled 16 patient tumor samples by single-cell or single-nuclei RNA-seq using 10X Chromium 3'. It includes 4 low-grade gliomas and 12 ependymomas. The raw fastqs are provided. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00001011317 
   
  
    
    Total RNA sequencing of olfactory mucosa (OM) cells derived from cognitively healthy individuals exposed to traffic-related ultrafine particles (UFPs) for 24h and 72h in submerged cultures. The UFPs used for exposures were: A0, A20 and Euro6. Exposures were compared to the corresponding blank samples. 
    
   
  
    
   
  1 
 
  
    EGAD00001011318 
   
  
    
    Multiomics data for a cohort of 20 COVID-19 patients (10 patients mild, 3 patients moderate, 4 patients severe, 3 patients critical) obtained from peripheral blood mononuclear cells (PBMCs) sampled longitudinally at hospital admission, discharge, and 1 month thereafter. The data has been obtained using a multiwell-based single-cell technology (BD Rhapsody) that includes the analysis of PBMCs' whole transcriptome and a set of 52 surface proteins. The samples of the different patients at different collection times were labelled using a cell hashing strategy with the BD Single-Cell Multiplexing Kit (6 samples for each run). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001011319 
   
  
    
    Multiomics data for a cohort of 20 COVID-19 patients (10 patients mild, 3 patients moderate, 4 patients severe, 3 patients critical) obtained from peripheral blood mononuclear cells (PBMCs) sampled longitudinally at hospital admission, discharge, and 1 month thereafter. The data has been obtained using a multiwell-based single-cell technology (BD Rhapsody) that includes the targeted expression of the BD Immune Response Targeted Panel, the TCR/BCR profiling and a set of 52 surface proteins. The samples of the different patients at different collection times were labelled using a cell hashing strategy with the BD Single-Cell Multiplexing Kit (6 samples for each run). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD00001011320 
   
  
    
    This dataset includes WGS data of 12 ancient individuals (97–688 years BP) from Zambia and South Africa, presented in Fortes-Lima et al. Nature 2023. Further details like C14-dates and archaeological descriptions were reported elsewhere (Meyer et al. Azania 2021; Steyn et al. African Archaeological Review 2022). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00001011322 
   
  
    
    Raw molecular data of Vd2 T cells upon treatment with different stimuli and mevalonate pathway inhibitors. 
    
   
  
    
      
      unspecified 
      
    
   
  20 
 
  
    EGAD00001011323 
   
  
    
    The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed.
This dataset contains targeted DNA sequencing data of 40 tumor/normal pairs from the validation cohort plus 10 tumor/normal pairs of patients from the screening cohort for technical validation. Data was generated on Illumina HiSeq 2000 device in paired-end mode and is stored in BAM file format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  100 
 
  
    EGAD00001011324 
   
  
    
    The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed.
This dataset contains whole exome sequencing data of 37 tumor/normal pairs from the screening cohort plus an additional relapse tumor of one of those 37 patients. Data was generated on Illumina HiSeq 2000 device in paired-end mode and is stored in BAM file format. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  75 
 
  
    EGAD00001011325 
   
  
    
    The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed.
This dataset contains 44 samples derived from RNA-sequencing data of bile duct and CCA cell lines. Data was generated on Illumina NovaSeq 6000 device in paired-end mode and is stored in compressed FASTQ file format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD00001011326 
   
  
    
    The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed.
This dataset contains fusion gene analysis using multiplex single primer extension-based RNA-sequencing for a subset of 25 patients of the screening cohort. Data was generated on Illumina NextSeq 550 device in paired-end mode and is stored in compressed FASTQ file format. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  25 
 
  
    EGAD00001011331 
   
  
    
    HGSC cases were selected for which matched fresh frozen and FFPE samples were available from the same tissue specimens.  Fresh frozen, FFPE and normal blood were subject to WGS for the purpose of assessing the possibility of using FFPE WGS in place of fresh frozen for somatic mutation calling. 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  6 
 
  
    EGAD00001011333 
   
  
    
    This dataset consists of RNA sequencing data (FASTQs) from intestinal mucosal biopsies from 9 IBD patients. All patients had endoscopically active disease and were not receiving immunosuppressive or biologic therapies. All biopsies (6 per donor) were collected from a single inflamed site. Biopsies were cultured for 18 hours at an air-liquid interface in media containing either DMSO (vehicle control), PD-0325901 (0.5uM) or infliximab (10ug/ml; MSD) - two biopsies per condition. After 18 hours, biopsies were harvested and snap frozen. After lysis, RNA was extracted using an AllPrep DNA/RNA Mini Kit (Qiagen). Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NovaSeq 6000 (100bp, PE reads). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  27 
 
  
    EGAD00001011334 
   
  
    
    CTD-ILD BALF and blood scRNA-seq dataset consists of single-cell transcriptome data of bronchoalveolar lavage fluid (BALF) and blood derived from 30 connective tissue disease-associated interstitial lung disease (CTD-ILD) and 12 idiopathic interstitial pneumonia (IIP) patients. 
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  161 
 
  
    EGAD00001011335 
   
  
    
    RNA-Seq and ATAC-Seq of iPSC derived neurons under baseline and KCl stimulation conditions from 10 distinct donors, including 5 healthy controls and 5 schizophrenic individuals. 
scATAC of human post mortem prefrontal cortex from 4 adult individuals including 2 neurotypical individuals and 2 schizophrenic individuals. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD00001011336 
   
  
    
    Intestinal organoids treated with either busulfan, fludarabine or clofarabine for 24h 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD00001011337 
   
  
    
    This dataset comprises raw RNA sequencing from inflammatory (TPP) macrophages that were treated with the MEK inhibitor PD-0325901 (100nM or 500nM) or vehicle control (n=3 donors). MEK inhibitor or vehicle control was added on day 4 and cells were harvested on day 6 and lysed. RNA was extracted from cell lysates using an AllPrep DNA/RNA Mini Kit (Qiagen). Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NextSeq500. Raw data are provided as 50 bp paired-end Illumina reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD00001011338 
   
  
    
    This dataset comprises raw RNA sequencing data (FASTQs) from primary inflammatory (TPP) macrophages that were unedited (non-targeting control, NTC), edited to delete the disease-associated chr21q22 enhancer region (n=5), or edited to disrupt ETS2 with 1 of 2 independent gRNAs (n=9). We also performed RNA-sequencing in NTC or ETS2-edited TPP macrophages that were treated for 12 hours with vehicle (DMSO) or roxadustat (30 uM, n=3). Macrophages were detached using Accutase and lysed. RNA was extracted from cell lysates using an AllPrep DNA/RNA Mini Kit (Qiagen). Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NextSeq500. Raw data are provided as 50 bp paired-end Illumina reads. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  38 
 
  
    EGAD00001011339 
   
  
    
    Fastq files are deposited for single-cell transcriptomes of patient H3-K27M diffuse midline gliomas, generated using the SMART-seq2 method. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  13 
 
  
    EGAD00001011340 
   
  
    
    The dataset consists of 12 samples of monocytes, which were analyzed using RNA-sequencing (RNA-seq). These samples were categorized into four distinct groups, each comprising three samples. The dataset was designed to investigate the transcriptomic and metabolic profiles of monocytes across different age groups and stimulation conditions. 3 neonatal controls, 3 neonatal LPS stimulated, 3 adult controls and 3 adult LPS stimulated samples. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
    
   
  12 
 
  
    EGAD00001011342 
   
  
    
    Single-cell RNA sequencing of 43 bronchoalveolar lavage fluid (BALF) samples from 15 CAPA and 22 COVID-19 mechanically ventilated patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  43 
 
  
    EGAD00001011343 
   
  
    
    Whole genome sequencing for tumour/normal matched pairs from 78 samples 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  - 
 
  
    EGAD00001011345 
   
  
    
    scRNAseq and scTCRseq of serial peripheral blood mononuclear cell (PBMC) samples (n=72) taken at various timepoints before and during treatment (Week 0 (W0), Week 3 (W3), Week 6 (W6)). PBMC samples were pooled together into 37 pools, loading two or three samples per lane in the 10X Genomics chip, in equal proportions, according to a pre-designed pooling matrix. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  70 
 
  
    EGAD00001011346 
   
  
    
    Single-cell TotalSeqC protein data from serial peripheral blood mononuclear cell (PBMC) samples taken from advanced HCC patients. TotalSeqC data are available for 41 out of 72 PBMC samples included in the study, which were combined into 20 pools. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00001011347 
   
  
    
    Single-cell RNA and TCR sequencing of 40 advanced HCC pre-treatment biopsies from 38 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  80 
 
  
    EGAD00001011348 
   
  
    
    TrypanoGEN+ phenotype data containing 183 samples from DRC, Malawi and Uganda. 
    
   
  
    
   
  183 
 
  
    EGAD00001011349 
   
  
    
    This dataset consists of sequencing data from an ETS2 CUT&RUN experiment in primary inflammatory (TPP) macrophages (n = 2).  Pre-cultured TPP macrophages were harvested and processed immediately using the CUT&RUN Assay kit (Cell Signaling) according to the manufacturer’s instructions with the following modifications (essentially, avoiding the use of ConA-coated beads). Anti-ETS2 (ThermoFisher) or IgG control (Cell Signaling) antibodies were used for targeted digestion of chromatin. For each donor, 5x10^5 cells were pelleted, washed in Wash Buffer, and resuspended in Antibody Binding buffer. Cells were incubated with antibodies (1:100 dilution for anti-ETS2) for 2h at 4°C. After washing in Digitonin Buffer, cells were incubated with pA/G-MNase for 1h at 4°C. Cells were washed twice in Digitonin Buffer, resuspended in the same buffer and cooled for 5 minutes on ice. Calcium chloride was added to activate pA/G-MNase digestion and cells were incubated for 30 minutes at 4°C before Stop Buffer was added, and cells were incubated for 10 min at 37°C to release cleaved chromatin fragments. Supernatants were collected by centrifugation and DNA extracted using DNA Purification Buffers and Spin Columns (Cell Signaling). Library preparation was performed according to a protocols.IO protocol (dx.doi.org/10.17504/protocols.io.bagaibse) using the NEBNext Ultra II DNA Library Prep Kit. Size selection was performed using AMPure XP beads (Beckman Coulter) and fragment sizes were determined using an Agilent 2100 Bioanalyzer (High Sensitivity DNA kit). Equimolar pools of indexed libraries were sequenced on an Illumina NovaSeq 6000 (100bp PE reads). Raw data are provided as 100 bp paired-end Illumina reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001011350 
   
  
    
    We performed scRNA-seq on 15 fresh lymph node core biopsies from 7 Lymphoma patients treated with CD20xCD3 bispecific antibodies. We also performed whole-exome sequencing (WES) and bulk RNA-seq on formalin-fixed paraffin embedded (FFPE) tumor samples from these patients. WES was additionally performed on matched germline samples (blood) for each patient. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD00001011351 
   
  
    
    This dataset consists of H3K27ac ChIP-sequencing in inflammatory (TPP) macrophages from 2 minor allele homozygotes and 2 major allele homozygotes at rs2836882. Monocytes were positively selected from PBMC using CD14 Microbeads and inflammatory macrophage differentiation performed using conditions that model chronic inflammation (TPP): 3 days GM-CSF (50ng/mL) followed by 3 days GM-CSF, TNFa (50ng/mL), PGE2 (1mg/mL), and Pam3CSK4 (1mg/mL). After harvesting, cells were cross-linked, quenched, lysed, and sheared. Immunoprecipitation of histone-DNA complexes was performed overnight at 4C with rotation using an anti-H3K27ac antibody and the SimpleChIP Plus Sonication ChIP kit (Cell Signaling Technology). Following reverse cross-linking, 50ng of immunoprecipitated DNA or input DNA were used to prepare sequencing libraries using the iDeal Library Preparation kit (Diagenode), according to manufacturer instructions. 10 PCR cycles were used for the amplification step and size selection was not performed. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 (Agilent) and the libraries were sequenced in pools of 8, with each pool being sequenced in 2 lanes of an Illumina HiSeq2500 high output flow-cell (50bp, single-end reads). Raw data are provided as raw and aligned single-end sequencing reads from H3K27ac-bound DNA and the input chromatin. Raw reads were trimmed using Trim Galore and aligned to the reference human genome (hg19) using Burrows-Wheeler Aligner (v0.7.12) with default parameters.  Aligned reads were converted to BAM files, sorted, and technical duplicates merged before indexing – all using SAMtools (v1.4). PCR duplicates were identified using Picard tools (v2.18.1) and removed together with unmapped reads using SAMtools (v1.4). The resulting BAM files were re-sorted and indexed after filtering. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001011352 
   
  
    
    Includes 4 Datasets:
1. 35 Control Plasma WGS samples sequenced on NovaSeq V1.0 Chemistry at the New York Genome Center. Denoted by CTRL-2XX naming scheme.
2. Plasma WGS from 17 patients with Small Cell Lung Cancer. Samples extracted at either Pretreatment or Postoperative at weeks 2 or 3. Denoted by SCLC-XX naming scheme.
3. Plasma WGS from a synthetic mixing study of a high-burden melanoma plasma sample with plasma bag, at estimated tumor concentrations of 10e-3, 10e-4, 10e-5, and 10e-6 for 2 replicates. Denoted by SM-repX naming scheme.
4. Assorted Plasma, Tumor, and Normal WGS from patients with NSCLC expressing high-burden for use in model training. Denoted by NSCLC-2XX naming scheme. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  83 
 
  
    EGAD00001011353 
   
  
    
    Whole genome sequencing data of 6 high-grade serous carcinoma (HGSC) patients (11 samples) sequenced with MGISEQ-2000 
    
   
  
    
      
      unspecified 
      
    
   
  11 
 
  
    EGAD00001011354 
   
  
    
    RNAseq fastq files for 254 samples for the neoALTTO study of lapatinib, trastuzumab or combination in HER2+ breast cancer patients. Those are pre-treatment baseline samples. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  254 
 
  
    EGAD00001011359 
   
  
    
    The dataset for the study "Early ctDNA molecular response captures therapeutic response in the first stage of CCTG BR.36 ctDNA-directed, multi-center phase II study of molecular response adaptive immunotherapy in non-small cell lung cancer", includes 134 bam files from hybrid capture targeted error-correction next-generation sequencing (PGDx Elio plasma resolve) from plasma cell-free DNA and matched white blood cell DNA from 35 individuals with non-small cell lung cancer on the BR.36 trial, alongside 11 bam files from targeted next generation sequencing (PGDx Elio tissue complete) of  tumor DNA from 11 individuals with non-small cell lung cancer on the BR.36 trial. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  153 
 
  
    EGAD00001011361 
   
  
    
    Single-cell RNA-seq data of angioimmunoblastic T-cell lymphoma 
    
   
  
    
   
  - 
 
  
    EGAD00001011363 
   
  
    
    We generated a dataset consisting of 79 VCF files, and respective FASTQ and CRAM files, methodically generated using the GLIMPSE1 imputation algorithm leveraging the 1000 Genomes Project Phase 3 dataset as the reference panel of haplotypes. In total this dataset is composed of approximately 325 GB of FASTQ data, 156 GB of CRAM data, and 6 GB of VCF data. Our samples were specifically derived from sequenced DNA from a highly selective cohort of patients, mostly comprised of Iberian Populations in Spain (IBS) individuals but also containing some individuals with other genetic backgrounds, who presented severe COVID-19 symptoms during the initial wave of the SARS-CoV-2 pandemic in Madrid, Spain. On average, each VCF file in this rich dataset contains 9.49 million high-confidence single nucleotide variants [95%CI: 9.37 million - 9.61 million]. 
    
   
  
    
      
      unspecified 
      
    
   
  80 
 
  
    EGAD00001011364 
   
  
    
    This dataset includes the 2 extra normal samples adjacent to PTC tumors that were multiplexed and profiled using kits from different batches. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001011365 
   
  
    
    This dataset includes all spatial transcriptomics experiments. Samples coming from the same patient were sequenced on the same flowcell. Patients PTC4 to PTC9 were also sequenced on the same flowcell, as well as ATC1 and ATC2 on another one, and ATC3A and ATC3B on another one. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  37 
 
  
    EGAD00001011366 
   
  
    
    This dataset includes the first 9 PTC samples and 6 ATC samples profiled on the same sequencing flowcell. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD00001011367 
   
  
    
    This dataset contains metagenomic sequencing of stool samples of babies from CS Baby Biome project, and their mothers 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  195 
 
  
    EGAD00001011368 
   
  
    
    Whole genome sequencing data of 35 high-grade serous carcinoma (HGSC) patients (112 samples) sequenced with Illumina NovaSeq 6000 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  112 
 
  
    EGAD00001011369 
   
  
    
    221 patient samples including 164 initial tumor (TI) samples (53/164 fresh frozen TI samples, and 111/164 formalin-fixed paraffin-embedded (FFPE) TI samples), 22 paired matched normal samples, and 35 unpaired normal samples from healthy donors;  FASTQ file format, Agilent SureSelect Human All Exon V6 Kit (Agilent Technologies, Inc., Santa Clara, California, USA) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011373 
   
  
    
    BAM files for two families recruited to the HICF2 genome sequencing project due to craniosynostosis.  One family is a singleton and the other is an affected mother-daughter duo. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  3 
 
  
    EGAD00001011374 
   
  
    
    HNF1A haploinsufficiency causes decreased insulin expression, dysregulation of pancreatic progenitor signature genes and affects chromatin accessibility 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  - 
 
  
    EGAD00001011376 
   
  
    
    RNA-seq data 
    
   
  
    
      
      unspecified 
      
    
   
  7 
 
  
    EGAD00001011377 
   
  
    
    Whole-genome sequencing data 
    
   
  
    
      
      unspecified 
      
    
   
  23 
 
  
    EGAD00001011378 
   
  
    
    Paired tumor-normal whole genome sequencing data from primary tumors of patients diagnosed with neuroblastoma, Ewing sarcoma, Wilms tumor, hepatoblastoma and rhabdomyosarcoma 
    
   
  
    
   
  1 
 
  
    EGAD00001011379 
   
  
    
    HLA sequence data and final calls for VaccGene and 1000Gp3 African populations 
    
   
  
    
   
  1 
 
  
    EGAD00001011581 
   
  
    
    Exome sequencing data for study of the microenvironment of angioimmunoblastic T-cell lymphoma 
    
   
  
    
   
  1 
 
  
    EGAD00001011645 
   
  
    
    Embryogenesis is a vulnerable time. Mutations in developmental cells can result in the seeding of cells predisposed to disease within mature organs, creating a field effect. We characterise an embryonic cancer mutation that drives multifocal, multiphenotypic renal tumours in a 14-year-old girl. Their shared MTOR mutation, absent from normal tissues, increases protein flexibility which enables a FAT domain hinge to dramatically increase mTORC1 activity. Developmental mutations, not usually detected in traditional genetic screening, have vital clinical importance in guiding prognosis, targeted treatment, and family screening decisions for paediatric tumours. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011646 
   
  
    
    Embryogenesis is a vulnerable time. Mutations in developmental cells can result in the seeding of cells predisposed to disease within mature organs, creating a field effect. We characterise an embryonic cancer mutation that drives multifocal, multiphenotypic renal tumours in a 14-year-old girl. Their shared MTOR mutation, absent from normal tissues, increases protein flexibility which enables a FAT domain hinge to dramatically increase mTORC1 activity. Developmental mutations, not usually detected in traditional genetic screening, have vital clinical importance in guiding prognosis, targeted treatment, and family screening decisions for paediatric tumours. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001011647 
   
  
    
    Embryogenesis is a vulnerable time. Mutations in developmental cells can result in the seeding of cells predisposed to disease within mature organs, creating a field effect. We characterise an embryonic cancer mutation that drives multifocal, multiphenotypic renal tumours in a 14-year-old girl. Their shared MTOR mutation, absent from normal tissues, increases protein flexibility which enables a FAT domain hinge to dramatically increase mTORC1 activity. Developmental mutations, not usually detected in traditional genetic screening, have vital clinical importance in guiding prognosis, targeted treatment, and family screening decisions for paediatric tumours. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001011666 
   
  
    
    TrypanoGEN+ data containing fastq files of 183 samples from DRC, Malawi and Uganda using NextSeq500. 
    
   
  
    
      
      unspecified 
      
    
   
  183 
 
  
    EGAD00001011676 
   
  
    
    Germline BAMs from blood/saliva samples from patients diagnosed with both uveal and cutaneous melanoma. Reads have been aligned, deduplicated and recalibrated. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  81 
 
  
    EGAD00001011677 
   
  
    
    RNA-sequencing data of 5 human thyroid cancer cell lines cultured in control conditions 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD00001011678 
   
  
    
    RNA-sequencing data: 5 normal thyroid tissues, 14 papillary thyroid carcinomas, 2 lymph node metastases, 19 poorly differentiated thyroid carcinomas and 17 anaplastic thyroid carcinomas;
Targeted DNA-sequencing of the 165 genes included in the “Solid and Haematological tumors” panel (BRIGHTCore, Brussels, Belgium): 2 normal thyroid tissues, 2 poorly differentiated thyroid carcinomas and 7 anaplastic thyroid carcinomas;
2 FASTQ files for each sample (paired). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  68 
 
  
    EGAD00001011679 
   
  
    
    Blood plasma samples (n=168) and matched diagnostic formalin-fixed paraffin-embedded (FFPE) tissue samples (n=69) of DLBCL patients, PMBCL patients and healthy controls were collected between 2016-2021. Plasma samples were collected at diagnosis, at interim evaluation, after treatment, and in case of refractory or relapsed disease. RNA was extracted from 200 µl plasma using the miRNeasy serum/plasma kit and from FFPE tissue using the miRNeasy FFPE kit. RNA was subsequently sequenced on a NovaSeq 6000 instrument using the SMARTer Stranded Total RNA-seq pico v3 library preparation kit. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  172 
 
  
    EGAD00001011812 
   
  
    
    The dataset includes 43 high coverage (30x) whole genome samples mostly from the Sahelian belt. 
    
   
  
    
   
  1 
 
  
    EGAD00001011813 
   
  
    
    WES of LUAD 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD00001011815 
   
  
    
    Primary sclerosing cholangitis (PSC) is a T-cell mediated, chronic inflammatory condition of the biliary tree that is strongly associated with inflammatory bowel disease. Genome-wide association studies have identified 22 non-HLA genetic risk variants associated with PSC. Identifying the genes impacted by these variants has proven difficult as the majority lie in non-coding regions of the genome. Knowledge of the genes and biological pathways these non-coding variants are perturbing is vital to understanding the disease biology. One means of assessing the impact of non-coding variants within disease associated loci upon genes is via colocalisation with eQTL. Many eQTL are cell-type specific, requiring the analysis of disease relevant cell types to detect colocalisation. We have collected PSC-relevant T-cell-subtypes from the peripheral blood of PSC patients via fluorescence activated cell sorting in preparation for RNA sequencing and mapping of eQTL. Samples were collected at the Norfolk and Norwich University Hospital, for which local ethical approval has been granted. Lysed cell samples will be transferred to WTSI and DNA/RNA will be extracted from lysed cell samples by T143 before genotyping (DNA) and custom library preparation and sequencing (RNA). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  12 
 
  
    EGAD00001011816 
   
  
    
    This data set includes serial biopsies from 75 patients with DLBCL. Tumour tissue was preserved either in FFPE or frozen. Each biopsy was sequenced with either whole genome or exome. A custom targeted sequencing data set is available to match most whole genome samples. RNAseq data were available for a subset of biopsies. 
    
   
  
    
      
      unspecified 
      
    
   
  237 
 
  
    EGAD00001011817 
   
  
    
    Plasma samples from patients with melanoma (stage II/III/IV) and breast cancer (stage IV) as well as healthy individuals were subjected to low-coverage whole-genome sequencing (less than 10x average depth). This dataset contains raw fastq files from 39 breast cancer, 127 melanoma and 42 healthy control plasma samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  208 
 
  
    EGAD00001011819 
   
  
    
    Study describing the dynamics of chromatin organization within malignant rhabdoid tumors. This study describes how this chromatin organization changes upon SMARCB1 rescue within patient-derived organoid models from malignant rhabdoid tumors. Identification of a novel super-enhancer of MYC and identification of patient-specific enhancer utilization to activate MYC expression in these tumors. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD00001011820 
   
  
    
    Study describing the dynamics of chromatin organization within malignant rhabdoid tumors. This study describes how this chromatin organization changes upon SMARCB1 rescue within patient-derived organoid models from malignant rhabdoid tumors. Identification of a novel super-enhancer of MYC and identification of patient-specific enhancer utilization to activate MYC expression in these tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD00001011821 
   
  
    
    Study describing the dynamics of chromatin organization within malignant rhabdoid tumors. This study describes how this chromatin organization changes upon SMARCB1 rescue within patient-derived organoid models from malignant rhabdoid tumors. Identification of a novel super-enhancer of MYC and identification of patient-specific enhancer utilization to activate MYC expression in these tumors. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      NextSeq 2000 
      
    
   
  5 
 
  
    EGAD00001011822 
   
  
    
    The dataset for the study “Circulating tumor DNA, pathological and immunologic responses to neoadjuvant nivolumab or nivolumab plus relatlimab and chemoradiotherapy in resectable esophageal/gastroesophageal junction cancer” includes 173 bam files from hybrid capture targeted error-correction next-generation sequencing (TEC-Seq) from plasma cell-free DNA and matched white blood cell DNA from 32 individuals with esophageal/gastroesophageal junction cancer,  who received immunotherapy-containing regimens. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 550 
      
    
   
  173 
 
  
    EGAD00001011989 
   
  
    
    Dataset contains one paired-end Whole Exome sequencing sample. One normal blood sample is also included. 
    
   
  
    
   
  2 
 
  
    EGAD00001011990 
   
  
    
    Dataset contains one paired-end RNA-seq sample. 
    
   
  
    
   
  1 
 
  
    EGAD00001011991 
   
  
    
    The dataset for the study “Elucidating the heterogeneity of immunotherapy response and immune-related toxicities by longitudinal ctDNA and immune cell compartment tracking in lung cancer” includes 207 bam files from hybrid capture targeted error-correction next-generation sequencing (TEC-Seq) from plasma cell-free DNA and matched white blood cell DNA from 30 individuals with non-small cell lung cancer, alongside 46 bam files from whole exome sequencing of tumor and matched normal DNA for 21 individuals with non-small cell lung cancer who received immunotherapy-containing regimens. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  253 
 
  
    EGAD00001011992 
   
  
    
    CRAM files of 340 human genomes from Angola and Mozambique. Paired-end reads were generated on an Illumina HiSeq X Ten and were mapped against the human reference genome build hg19/GRCh37. More details about the sequencing and the samples are given in Tallman et al. 2023, Nature Communications. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001011997 
   
  
    
    Visium spatial transcriptomics (10X Genomics) performed on 4 CCA samples. Each sample has two paired-end sequencing runs: the first (I1 & I2) are a pair reading indexes; the second (R1 & R2) are a pair reading inserts, with R1 additionally reading 10X barcodes. For histology images, please contact authors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD00001011998 
   
  
    
    Fastq files for RNA-seq for 60 CCAs, 6 normal bile duct tissues, 14 CCA cell-lines (including replicates), and 2 normal cholangiocyte cell-lines (including replicates). RNA was extracted using the Qiagen RNeasy Mini kit. Illumina Tru-Seq Stranded Total RNA kit (Illumina, San Diego, California, USA) was used to prepare RNA libraries from 1 µg of total RNA. Paired-end 150 bp sequencing was performed on an Illumina HiSeq4000 sequencer. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  82 
 
  
    EGAD00001012100 
   
  
    
    Fastq files for WGS for 16 CCAs (with matched normal tissues), and 4 cell lines in an AA-treatment experiment (0, 10, 20, 40 µg AA treatment after 180 days; 2 runs each for the 10 µg and 40 µg experiments). Genomic DNA was extracted using DNeasy Blood and Tissue Kit (Qiagen). Sequencing libraries were prepared from DNA extracted using the SureSelect XT2 Target Enrichment System for the Illumina Multiplexed Sequencing platform (Illumina) according to the manufacturer’s instructions. Whole genome sequencing was performed on an Illumina HiSeq 4000 sequencer with paired-end sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  36 
 
  
    EGAD00001012101 
   
  
    
    Telomere fusions (TFs) can trigger the accumulation of oncogenic alterations leading to malignant transformation and drug resistance. Despite their relevance in tumour evolution, our understanding of the patterns and consequences of TFs in human cancers remains limited. Here, we characterize the rates and spectrum of somatic TFs across >30 cancer types using whole-genome sequencing data. TFs are pervasive in human tumours with rates varying markedly across and within cancer types. In addition to end-to-end fusions, we find novel patterns of TFs that we mechanistically link to the activity of the alternative lengthening of telomeres (ALT) pathway. We show that TFs can be detected in the blood of cancer patients, which enables cancer detection with high specificity and sensitivity even for early-stage tumours and cancers of high unmet clinical need. Overall, we report a novel genomic footprint that enables characterization of the telomere maintenance mechanism of tumours and liquid biopsy analysis. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001012102 
   
  
    
    Geographic variation of mutagenic exposures in kidney cancer genomes – sequence data (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001012103 
   
  
    
    Fastq files for H3K27ac ChIP-seq (with matched input-DNA control) for 63 CCAs, 8 normal bile duct tissues, 16 CCA cell-lines (including replicates), and 3 normal cholangiocyte cell-lines (including replicates). Library prep was performed with the NEBNext ChIP-seq library preparation kit. Each library (including matching input DNA) was sequenced to an average depth of 20 to 30 million raw reads on an Illumina HiSeq 4000 sequencer, with paired-end sequencing (except for 3 normal bile tissues, which were sequenced single-end). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  180 
 
  
    EGAD00001012116 
   
  
    
    The dataset includes RNA sequencing data on pre-treatment biopsies of lymph node metastasis (n=80).
The technology used for sequencing is Illumina HiSeq 2500. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  80 
 
  
    EGAD00001012121 
   
  
    
    This dataset contains 10 samples from 9 patients with chronic graft-versus-host disease (GVHD). Each sample is analysed with Chromium V(D)J and 5' Gene Expression Platform v1.1 (10X Genomics). The raw data includes fastq files for Gene expression and fastq files for V(D)J Expression. The processed data have been deposited in the ArrayExpress database at EMBL-EBI (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-13419. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  120 
 
  
    EGAD00001012222 
   
  
    
    Geographic variation of mutagenic exposures in kidney cancer genomes – filtered vcf files (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001012223 
   
  
    
    Geographic variation of mutagenic exposures in kidney cancer genomes – patient metadata files (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001012227 
   
  
    
    Single-cell profiling of sero-negative and sero-positive humans that were inoculated with SARS-CoV-2. The cellular response during SARS-CoV-2 is profiled using single-cell transcriptomics, CITE-seq and single cell immune profiling, by sampling PBMCs and nasal swabs before and at multiple time points during SARS-CoV-2 infection. This one-of-a-kind cellular map will give unique temporal resolution of how nasal and immune cells respond to SARS-CoV-2 exposure and infection. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001012228 
   
  
    
    Data from Representation of genomic intratumor heterogeneity in multi-region non-small cell lung cancer patient-derived xenograft models 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  237 
 
  
    EGAD00001012229 
   
  
    
    1 sample is pure plasmid DNA and 8 samples are cell pellets for genomic DNA extraction.
CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001012230 
   
  
    
    1 sample is pure plasmid DNA and 10 samples are cell pellets for genomic DNA extraction.
CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001012231 
   
  
    
    1 sample is pure plasmid DNA and 10 samples are cell pellets for genomic DNA extraction.
CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  10 
 
  
    EGAD00001012232 
   
  
    
    8 cell pellet samples for genomic DNA extraction.
CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD00001012233 
   
  
    
    8 cell pellet samples for genomic DNA extraction.
CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001012234 
   
  
    
    8 samples are cell pellets for genomic DNA extraction.
CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1 
 
  
    EGAD00001012235 
   
  
    
    This dataset contains RNA Seq, 10x scRNA Seq and Exome sequencing of glioblastoma samples. Sequencing was performed on an Illumina NovaSeq 6000 and an Illumina HiSeq 4000. All sequencing was paired-end. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  61 
 
  
    EGAD00001012437 
   
  
    
    Fresh peripheral blood mononuclear cells of four human donors were cultured together with either lung adenocarcinoma A549 cancer cells or A549 cancer cells expressing H1N1 sialidase. These treatments induced the differentiation of donor cells into immunosuppressive MDSC-like cells, which were further subjected to bulk RNA sequencing. 
RNA-seq TruSeq libraries were generated from polyA-enriched mRNA isolated from the samples, and sequenced in paired-end mode on 4 lanes of an Illumina NextSeq 500 flow-cell. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  32 
 
  
    EGAD00001012638 
   
  
    
    Re-aligned BAM files for manuscript titled Discrepancies in Tumour Mutation Burden (TMB) reporting from sequential Endobronchial ultrasound trans bronchial needle aspiration (EBUS TBNA) samples within single lymph node stations for Copy Number Variant Calling. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  45 
 
  
    EGAD00001012639 
   
  
    
    Stitched BAM files for manuscript titled Discrepancies in Tumour Mutation Burden (TMB) reporting from sequential Endobronchial ultrasound trans bronchial needle aspiration (EBUS TBNA) samples within single lymph node stations for SNP and INDEL Variant Calling. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  45 
 
  
    EGAD00001012841 
   
  
    
    This dataset contains 15 TCRab sequencing samples from 6 CML patients before and after TKI-cessation. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  240 
 
  
    EGAD00001012842 
   
  
    
    This dataset contains 15 single-cell RNA sequencing samples from 6 CML patients before and after TKI-cessation. The raw data is available as fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  272 
 
  
    EGAD00001013726 
   
  
    
    Geographic variation of mutagenic exposures in kidney cancer genomes – structural variation vcf files (Mutographs) 
    
   
  
    
   
  - 
 
  
    EGAD00001013727 
   
  
    
    Geographic variation of mutagenic exposures in kidney cancer genomes – copy number variants (Mutographs) 
    
   
  
    
   
  - 
 
  
    EGAD00001014787 
   
  
    
    Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  89 
 
  
    EGAD00001014788 
   
  
    
    Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  52 
 
  
    EGAD00001014789 
   
  
    
    Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  72 
 
  
    EGAD00001014790 
   
  
    
    Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  41 
 
  
    EGAD00001015012 
   
  
    
    BAMs from deep sequencing using a custom panel for the study 'Early evolutionary branching across spatial domains predisposes to clonal replacement under chemotherapy in neuroblastoma' 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  69 
 
  
    EGAD00001015157 
   
  
    
    Molecular characterization of 41 tumors from 17 individuals with CMMRD to gain a better understanding of mutational processes driving subsequent tumor development. The molecular characterization includes the investigation of tumor mutational load and mutational signatures. 
    
   
  
    
   
  1 
 
  
    EGAD00001015158 
   
  
    
    Molecular characterization of 41 tumors from 17 individuals with CMMRD to gain a better understanding of mutational processes driving subsequent tumor development. The molecular characterization includes the investigation of tumor mutational load and mutational signatures. 
    
   
  
    
   
  1 
 
  
    EGAD00001015241 
   
  
    
    Shallow whole genome sequencing of 196 formalin-fixed paraffin-embedded p53abn endometrial cancers. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  196 
 
  
    EGAD00001015251 
   
  
    
    The mutational landscape of haematopoietic cells will be characterized by WGS following amplification of DNA and preparation of libraries by PTA (primary template-directed amplification). Samples have been sourced from the Cambridge Biobank. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  145 
 
  
    EGAD00001015252 
   
  
    
    The transcriptional landscape of haematopoietic cells will be characterized by RNA Seq following amplification of RNA/cDNA and preparation of libraries by PTA (primary template-directed amplification). Samples have been sourced from the Cambridge Biobank. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  111 
 
  
    EGAD00001015255 
   
  
    
    T-cell lymphoblastic lymphoma (T-LBL) is a common pediatric malignancy accounting for approximately 20% of the non-Hodgkin lymphomas during childhood. Survival rates of T-LBL are ~80%, but outcome after relapse is dismal, with salvage rates reaching only ~15%. Considering the extremely poor prognosis after relapse and absence of clinically relevant high-risk genetics, there is an urgent need for the identification of molecular risk factors and new prognostic biomarkers in T-LBL, as well as identification of new therapeutic strategies. In this study we present a novel entity of high-risk pediatric T-LBL patients characterized by previously unknown NOTCH1 gene fusions and highly elevated blood TARC levels. 
    
   
  
    
   
  1 
 
  
    EGAD00001015263 
   
  
    
    Genome and transcriptome sequence data from an infantile fibrosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015264 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015265 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015266 
   
  
    
    Genome and transcriptome sequence data from a neurofibromatosis type 1 (NF1) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015267 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015268 
   
  
    
    Genome and transcriptome sequence data from a CNS sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015269 
   
  
    
    Genome and transcriptome sequence data from an ocular melanoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015270 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015271 
   
  
    
    Genome and transcriptome sequence data from a fibrovascular brain tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015272 
   
  
    
    Genome and transcriptome sequence data from an angiosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015273 
   
  
    
    Genome and transcriptome sequence data from a craniopharyngioma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015274 
   
  
    
    Genome and transcriptome sequence data from an NHL large B-cell patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015275 
   
  
    
    Genome and transcriptome sequence data from a malignant granular cell tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015276 
   
  
    
    Genome and transcriptome sequence data from a papillary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015277 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015278 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015279 
   
  
    
    Genome and transcriptome sequence data from an aggressive fibromatosis patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015280 
   
  
    
    Genome and transcriptome sequence data from a pineoblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015281 
   
  
    
    Genome and transcriptome sequence data from a multifocal glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015282 
   
  
    
    Genome and transcriptome sequence data from a progressive facial plexiform neurofibroma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015283 
   
  
    
    Genome and transcriptome sequence data from a plexiform neurofibroma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015284 
   
  
    
    Genome and transcriptome sequence data from a diffuse intrinsic pontine glioma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015285 
   
  
    
    Genome and transcriptome sequence data from an acute lymphoblastic leukemia patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015286 
   
  
    
    Genome and transcriptome sequence data from an Ewing sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015287 
   
  
    
    Genome and transcriptome sequence data from an ependymoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015288 
   
  
    
    Genome and transcriptome sequence data from a glioblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015289 
   
  
    
    Genome and transcriptome sequence data from a NUT midline carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015290 
   
  
    
    Genome and transcriptome sequence data from an angiosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015291 
   
  
    
    Genome and transcriptome sequence data from a pre-B ALL (2nd relapse in CNS) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015292 
   
  
    
    Genome and transcriptome sequence data from a gliomatosis cerebri anaplastic astrocytoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015293 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015294 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015295 
   
  
    
    Genome and transcriptome sequence data from a metastatic alveolar rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015296 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015297 
   
  
    
    Genome and transcriptome sequence data from a minimally invasive adenocarcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015298 
   
  
    
    Genome and transcriptome sequence data from an aggressive fibromatosis patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015299 
   
  
    
    Genome and transcriptome sequence data from an aggressive fibromatosis patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015300 
   
  
    
    Genome and transcriptome sequence data from a glioblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015301 
   
  
    
    Genome and transcriptome sequence data from a neurofibromatosis type 1 patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015302 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015303 
   
  
    
    Genome and transcriptome sequence data from a papillary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015304 
   
  
    
    Genome and transcriptome sequence data from a relapsed Wilms tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015305 
   
  
    
    Genome and transcriptome sequence data from a plexiform neurofibroma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015306 
   
  
    
    Genome and transcriptome sequence data from a metastatic osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015307 
   
  
    
    Genome and transcriptome sequence data from an embryonal rhabdomyosarcoma of the nasopharynx patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015308 
   
  
    
    Genome and transcriptome sequence data from a diffuse large B-cell lymphoma (relapse) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015309 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015310 
   
  
    
    Genome and transcriptome sequence data from a malignant rhabdoid tumour patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015311 
   
  
    
    Genome and transcriptome sequence data from a relapsed osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015312 
   
  
    
    Genome and transcriptome sequence data from a relapsed blastic plasmacytoid dendritic cell neoplasm patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015313 
   
  
    
    Genome and transcriptome sequence data from a synovial sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015314 
   
  
    
    Genome and transcriptome sequence data from a recurrent nasopharyngeal rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015315 
   
  
    
    Genome and transcriptome sequence data from an Ewing sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015316 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015317 
   
  
    
    Genome and transcriptome sequence data from a pineal parenchymal tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015318 
   
  
    
    Genome and transcriptome sequence data from a rhabdomyosarcoma, alveolar patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015319 
   
  
    
    Genome and transcriptome sequence data from an anaplastic astrocytoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015320 
   
  
    
    Genome and transcriptome sequence data from a rosette-forming glioneuronal tumor (RGNT) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015321 
   
  
    
    Genome and transcriptome sequence data from a choroid plexus carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015322 
   
  
    
    Genome and transcriptome sequence data from a metastatic malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015323 
   
  
    
    Genome and transcriptome sequence data from a malignant rhabdoid tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015324 
   
  
    
    Genome and transcriptome sequence data from an embryonal rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015325 
   
  
    
    Genome and transcriptome sequence data from a NUT midline carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015326 
   
  
    
    Genome and transcriptome sequence data from a diffuse midline glioma, H3K27 mutant patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015327 
   
  
    
    Genome and transcriptome sequence data from a CNS non-germinoma germ cell tumour patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015328 
   
  
    
    Genome and transcriptome sequence data from a GBM (H3 K27M mutant) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015329 
   
  
    
    Genome and transcriptome sequence data from an alveolar rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015330 
   
  
    
    Genome and transcriptome sequence data from an embryonal rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015331 
   
  
    
    Genome and transcriptome sequence data from an acute myeloid leukemia patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015332 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015333 
   
  
    
    Genome and transcriptome sequence data from a pilomyxoid astrocytoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015334 
   
  
    
    Genome and transcriptome sequence data from a Wilms tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015335 
   
  
    
    Genome and transcriptome sequence data from a rhabdomyosarcoma, alveolar patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015336 
   
  
    
    Genome and transcriptome sequence data from a high-grade glioma, glioblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015337 
   
  
    
    Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015338 
   
  
    
    Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study 
    
   
  
    
   
  1 
 
  
    EGAD00001015344 
   
  
    
    This dataset includes cfMethyl-Seq data of 15 plasma cfDNA samples from 15 lung cancer patients and RRBS data of 58 lung tumor tissue samples from 58 lung cancer patients. The data were generated following the standard protocols. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  73 
 
  
    EGAD00001015345 
   
  
    
    In this study, we investigate differences in the cellular landscape and functionality of ex vivo cultured nasal epithelial cells in response to SARS-CoV-2 infection across different age groups: paediatric (<12y), adult (30-50y), and older adults (>70y). We find that, while ciliated cells serve as primary sites for viral replication consistently across all age groups, a distinctive goblet inflammatory subtype emerges in infected paediatric cultures, characterized by heightened expression of interferon-stimulated genes and incomplete viral replication. Conversely, older adult cultures infected with SARS-CoV-2 exhibit a proportional surge in basaloid-like cells, which not only facilitate viral dissemination but also demonstrate associations with altered epithelial repair pathways. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD00001015347 
   
  
    
    Contains fast5 data for each of the 10 samples sequenced. 
    
   
  
    
      
      PromethION 
      
    
   
  10 
 
  
    EGAD00001015348 
   
  
    
    Whole-genome sequences from the Korea4K project are available. These encompass 3913 CRAM files derived from 3776 Korean whole-genome sequencing (WGS) datasets generated using Illumina HiSeqX10 or Illumina NovaSeq6000 platforms. Further details can be found in the published paper, accessible via doi.org/10.1093/gigascience/giae014. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  3776 
 
  
    EGAD00001015349 
   
  
    
    Targeted panel sequencing of 188 formalin-fixed paraffin-embedded p53abn endometrial cancer samples. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  188 
 
  
    EGAD00001015350 
   
  
    
    Each gene contains individual nanopore long-read amplicon sequencing FASTQ files for: individual (IND) 01-05 and brain regions: Brodmann Area (BA): 10, 24, 9, 46, caudate (CAUD), cerebellum (CBM) and temporal cortex (TCX). 
    
   
  
    
      
      GridION 
      
    
   
  31 
 
  
    EGAD00001015351 
   
  
    
    The landscapes of somatic mutation in normal cells inform on the processes of mutation and selection operative throughout life, permitting insight into embryogenesis, normal ageing and the earliest stages of cancer development. Here, by whole-genome sequencing and targeted panel sequencing of microdissections from 30 individuals, including 18 with gastric cancer, we elucidate the developmental trajectories of normal and malignant gastric epithelium. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD00001015352 
   
  
    
    The landscapes of somatic mutation in normal cells inform on the processes of mutation and selection operative throughout life, permitting insight into embryogenesis, normal ageing and the earliest stages of cancer development. Here, by whole-genome sequencing and targeted panel sequencing of microdissections from 30 individuals, including 18 with gastric cancer, we elucidate the developmental trajectories of normal and malignant gastric epithelium. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  72 
 
  
    EGAD00001015353 
   
  
    
    We sequence >1000 whole genomes from 9 patients with CML, providing the largest sequencing dataset for this cancer. We reconstruct phylogenetic trees using somatic mutations and infer BCR::ABL1 timing and tumour growth rates. We correlate mutation landscapes and clonal trajectories with clinical features. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015356 
   
  
    
    This data set includes 27 full-length transcript sequences generated with PacBio IsoSeq that were used to verify the cancer-specific exons identified in three genes: FN1, COL6A3 and TNC. The data were generated from PDX models of osteosarcoma patients. 
    
   
  
    
      
      PacBio RS 
      
    
   
  1 
 
  
    EGAD00001015357 
   
  
    
    Due to the lower incidence of T-LBL and difficulties in obtaining diagnostic T-LBL material, extensive research on T-LBL has been hampered whereas genetic aberrations in T-ALL are thoroughly characterized. Given the similarities and differences between T-LBL and T-ALL, the question has been raised whether T-LBL and T-ALL represent two different diseases or different manifestations of the same disease. This study aims to identify the genomic and transcriptomic landscape of T-LBL and compare the findings to what is found in T-ALL. Comparison of the molecular aberrations between T-LBL and T-ALL can provide insights into the overlap and differences in malignant development between the two entities, which could lead to improved risk stratification in T-LBL in order to eventually adapt T-LBL treatment protocols based on molecular-genetic prognostic factors. 
    
   
  
    
   
  1 
 
  
    EGAD00001015358 
   
  
    
    CEBPA/PU.1/TCF7 ChIP-seq of 6 primary samples derived from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Low-coverage whole genome sequencing (ChIP input) of the same samples is also included as a control to be used in peak calling. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015361 
   
  
    
    STAG1 and STAG2-ChIP-seq in RAD21-mutant adult acute myeloid leukemia 
    
   
  
    
      
      NextSeq 500 
      
    
   
  1 
 
  
    EGAD00001015362 
   
  
    
    In 43 patients, pretreatment tumor biopsies, resected tumors and normal tissue of sufficient quality and quantity were obtained to longitudinally explore the mutational profiles of a comprehensive set of cancer-related genes.
For tumor samples, one to four FFPE sections (10 µm thickness, number depending on sample size) were lysed for genomic DNA isolation. Isolation was performed semi-automatically on the Maxwell purification system (Maxwell RSC DNA FFPE Kit, AS1450, Promega) as specified by the manufacturer. DNA was eluted in 50 µl RNase-free water and quantified fluorescently for library preparation using a Qubit 2.0 fluorometer (Life Technology) with its appertaining DNA broad-range assay. Corresponding normal DNA was isolated from blood or PBMCs using routinely available QIAGEN technology. DNA was stored at -20°C before use. 
Whole-exome sequencing (WES) was performed using the Twist Human Core + RefSeq + Mitochondrial Panel (Twist Bioscience), and 2 x 100 bp fragment sizes were sequenced using a NovaSeq6000 (Illumina). Demultiplexing of sequenced reads was achieved using bcl2fastq (version 2.2). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  145 
 
  
    EGAD00001015363 
   
  
    
    Repli-seq data for "Replication timing alterations are associated with mutation acquisition during tumour evolution in breast and lung cancer" 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD00001015364 
   
  
    
    Colorectal cancer – unmapped reads (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001015365 
   
  
    
    Wnt signalling must be 'just right' to promote tumour growth. Basal cell adenoma (BCA) and basal cell adenocarcinoma (BCAC) of the salivary gland are rare tumours that can be difficult to distinguish from each other and other salivary gland tumour subtypes. Due to their rarity, the genomic profiles of BCA and BCAC have not been explored. Using whole-exome and transcriptome sequencing of BCA and BCAC cohorts, we identify a novel recurrent FBXW11 missense mutation (p.F517S) in BCA that was mutually exclusive with the previously reported CTNNB1 p.I35T gain-of-function (GoF) mutation. These driver events collectively accounted for 94% of BCAs. In vitro, mutant FBXW11 had a dominant-negative effect, characterised by defective binding to β-catenin and the accumulation of β-catenin in cells. This was consistent with the nuclear expression of β-catenin observed in BCA cases harbouring the FBXW11 p.F517S mutation and activation of the Wnt/β-catenin pathway and defines a novel mechanism of Wnt pathway control. The genomic profiles of BCAC were distinct from BCA, with hotspot DICER1 and HRAS mutations and putative driver mutations affecting PI3K/AKT and NF-κB signalling pathway genes. A single BCAC, which may represent a malignant transformation of BCA, harboured the recurrent FBXW11 mutation. These findings have important implications for the diagnosis and treatment of BCA and BCAC, which, despite histopathologic overlap, may be unrelated entities. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  124 
 
  
    EGAD00001015366 
   
  
    
    Wnt signalling must be 'just right' to promote tumour growth. Basal cell adenoma (BCA) and basal cell adenocarcinoma (BCAC) of the salivary gland are rare tumours that can be difficult to distinguish from each other and other salivary gland tumour subtypes. Due to their rarity, the genomic profiles of BCA and BCAC have not been explored. Using whole-exome and transcriptome sequencing of BCA and BCAC cohorts, we identify a novel recurrent FBXW11 missense mutation (p.F517S) in BCA that was mutually exclusive with the previously reported CTNNB1 p.I35T gain-of-function (GoF) mutation. These driver events collectively accounted for 94% of BCAs. In vitro, mutant FBXW11 had a dominant-negative effect, characterised by defective binding to β-catenin and the accumulation of β-catenin in cells. This was consistent with the nuclear expression of β-catenin observed in BCA cases harbouring the FBXW11 p.F517S mutation and activation of the Wnt/β-catenin pathway and defines a novel mechanism of Wnt pathway control. The genomic profiles of BCAC were distinct from BCA, with hotspot DICER1 and HRAS mutations and putative driver mutations affecting PI3K/AKT and NF-κB signalling pathway genes. A single BCAC, which may represent a malignant transformation of BCA, harboured the recurrent FBXW11 mutation. These findings have important implications for the diagnosis and treatment of BCA and BCAC, which, despite histopathologic overlap, may be unrelated entities. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  68 
 
  
    EGAD00001015367 
   
  
    
    Sebaceous tumours are a rare cutaneous cancer with potential for aggressive behaviour. However, limited information is available on these cancers with few published cases. Here we wish to exome sequence these cancers to define the first genomic landscape for this malignancy. We will extract DNA from formalin-fixed, paraffin-embedded (FFPE) cores. Cores may be obtained from lesional and non-lesional tissues of primaries as well as matching metastases. The extracted DNA will be used for exome sequencing. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  156 
 
  
    EGAD00001015368 
   
  
    
    Sebaceous tumours are a rare cutaneous cancer with potential for aggressive behaviour. However, limited information is available on these cancers with few published cases. Here we wish to exome sequence these cancers to define the first genomic landscape for this malignancy. We will extract RNA from formalin-fixed, paraffin-embedded (FFPE) cores. Cores may be obtained from lesional and non-lesional tissues of primaries as well as matching metastases. The extracted RNA will be used for RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015369 
   
  
    
    This cohort comprises a subset of patients enrolled in the Genomic Advances in Sepsis (GAinS) study, an established biobank of adult sepsis patients. Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection. Patients with sepsis due to community acquired pneumonia or faecal peritonitis were recruited from 35 hospitals across the UK from 2005-2018, with samples for functional genomics and detailed clinical information collected over the first five days of ICU admission to investigate how host genetics affects the individual response to sepsis. DNA was extracted from buffy coat or whole blood samples using the Qiagen DNA extraction protocol, the automated Maxwell Blood purification kit (Promega), or the QIAamp Blood Midi kit protocol (Qiagen). Genotyping data were generated using the Illumina HumanOmniExpress BeadChip (295 patients), the Infinium CoreExome BeadChip (655 patients), and the Infinium Global Screening Array BeadChip (307 patients). Genotyping QC and imputation into the Haplotype Reference Consortium was performed within each batch. The datasets were combined and following post-imputation filtering data were available on 1168 samples. 
    
   
  
    
   
  1 
 
  
    EGAD00001015370 
   
  
    
    Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights from population genetics to rare disease associations. Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14,775 babies of predominantly White ethnicity in the Avon county of south-west England between 1991 and 1992. Born in Bradford (BiB) is similarly focused on a particular local area, the city of Bradford in the north of England, and recruited 13,858 babies between 2007 and 2011, of whom ~41% self-report as white British and ~59% as other ethnicities, predominantly Pakistani. Millennium Cohort Study (MCS) is a national cohort that recruited 18,827 children born between 2000 and 2002, intentionally over-sampling areas with high child poverty, large ethnic minority populations, and smaller UK nations (Wales, Scotland and Northern Ireland). Available here is a subset of exome-sequenced parents and children from these studies (CRAMs and post-QC VCFs), as detailed in the publication at https://doi.org/10.12688/wellcomeopenres.22697. Phenotypic data is also available by submitting an application to the corresponding cohort: https://borninbradford.nhs.uk/ 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001015371 
   
  
    
    Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights from population genetics to rare disease associations. Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14,775 babies of predominantly White ethnicity in the Avon county of south-west England between 1991 and 1992. Born in Bradford (BiB) is similarly focused on a particular local area, the city of Bradford in the north of England, and recruited 13,858 babies between 2007 and 2011, of whom ~41% self-report as white British and ~59% as other ethnicities, predominantly Pakistani. Millennium Cohort Study (MCS) is a national cohort that recruited 18,827 children born between 2000 and 2002, intentionally over-sampling areas with high child poverty, large ethnic minority populations, and smaller UK nations (Wales, Scotland and Northern Ireland). Available here is a subset of exome-sequenced parents and children from these studies (CRAMs and post-QC VCFs), as detailed in the publication at https://doi.org/10.12688/wellcomeopenres.22697. Phenotypic data is also available by submitting an application to the corresponding cohort: https://www.bristol.ac.uk/alspac/researchers/our-data/ 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001015372 
   
  
    
    Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights from population genetics to rare disease associations. Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14,775 babies of predominantly White ethnicity in the Avon county of south-west England between 1991 and 1992. Born in Bradford (BiB) is similarly focused on a particular local area, the city of Bradford in the north of England, and recruited 13,858 babies between 2007 and 2011, of whom ~41% self-report as white British and ~59% as other ethnicities, predominantly Pakistani. Millennium Cohort Study (MCS) is a national cohort that recruited 18,827 children born between 2000 and 2002, intentionally over-sampling areas with high child poverty, large ethnic minority populations, and smaller UK nations (Wales, Scotland and Northern Ireland). Available here is a subset of exome-sequenced parents and children from these studies (CRAMs and post-QC VCFs), as detailed in the publication at https://doi.org/10.12688/wellcomeopenres.22697. Phenotypic data is also available by submitting an application to the corresponding cohort: https://cls.ucl.ac.uk/cls-studies/millennium-cohort-study/ 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15173 
 
  
    EGAD00001015373 
   
  
    
    Dataset for manuscript titled: Spatial Intra-Tumour Heterogeneity and Treatment-Induced Genomic Evolution in Oesophageal Adenocarcinoma: Implications for Prognosis and Therapy 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1 
 
  
    EGAD00001015374 
   
  
    
    This dataset includes raw nanopore, base-called, and 6mA frequency data for EcoGII-treated NA12878 and MCF7 chromatin samples. It also includes raw nanopore data for the HG002 EcoGII-treated DNA. 
    
   
  
    
      
      PromethION 
      
    
   
  3 
 
  
    EGAD00001015376 
   
  
    
    Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015377 
   
  
    
    Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015378 
   
  
    
    Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015379 
   
  
    
    Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015382 
   
  
    
    Exome capture was performed using xGen Exome Research Panel v1.0 based on standard protocols. Paired-end sequencing (2 x 151 bp) was performed using Illumina NovaSeq6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD00001015383 
   
  
    
    Single cell encapsulation and DNA libraries were prepared by Chromium™ Single Cell DNA Reagent Kits and Chromium™ Single Cell C and D Chip Kits. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015384 
   
  
    
    T cells develop from circulating precursor cells, which enter the thymus and migrate
through specialised sub-compartments that support their maturation and selection. In
humans, this process starts in early fetal development and is highly active until
thymic involution in adolescence. To map the micro-anatomical underpinnings of this
process in pre- and early postnatal stages, we established a novel quantitative
morphological framework for the thymus, the Cortico-Medullary Axis, and used it to
perform a spatially resolved analysis. By applying this framework to a curated
multimodal single-cell atlas, spatial transcriptomics, and high-resolution multiplex
imaging data, we demonstrate establishment of the lobular cytokine network,
canonical thymocyte trajectories and thymic epithelial cell distributions within the first
trimester of fetal development. We pinpoint tissue niches of thymic epithelial cell
progenitors and distinct subtypes associated with Hassall’s corpuscles and uncover
divergence in the timing of medullary entry between CD4 vs. CD8 T cell lineages.
These findings provide a basis for a detailed understanding of T lymphocyte
development and are complemented with a holistic toolkit for cross-platform imaging
data analysis, annotation, and Organ Axis construction (TissueTag), which can be
applied to any tissue. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001015386 
   
  
    
    The complexity of tobacco smoke induced mutagenesis in head and neck cancer - sequence data (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001015387 
   
  
    
    The complexity of tobacco smoke induced mutagenesis in head and neck cancer - patient metadata files (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001015388 
   
  
    
    The complexity of tobacco smoke induced mutagenesis in head and neck cancer - filtered vcf files (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001015389 
   
  
    
    The complexity of tobacco smoke induced mutagenesis in head and neck cancer - structural variation vcf files (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001015390 
   
  
    
    The complexity of tobacco smoke induced mutagenesis in head and neck cancer - copy number variants (Mutographs) 
    
   
  
    
   
  1 
 
  
    EGAD00001015391 
   
  
    
    Patient-matched normal kidney organoid (103H) and MRT tumoroid (103T2) models were treated for 24h with either DMSO (ctrl), 400nM MTX, or 50nM BAY to investigate the direct effects of drug treatment on the expression of key metabolic enzymes in the nucleotide biosynthesis pathways. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  1 
 
  
    EGAD00001015395 
   
  
    
    DNA Whole Exome Sequence for manuscript titled: Evaluation of Endobronchial Ultrasound-Guided Transbronchial Needle Aspiration (EBUS-TBNA) Samples from Advanced Non-Small Cell Lung Cancer for Whole Genome, Whole Exome and Comprehensive Panel Sequencing 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015396 
   
  
    
    Illumina TSO500 DNA Dataset for Manuscript titled: Evaluation of Endobronchial Ultrasound-Guided Transbronchial Needle Aspiration (EBUS-TBNA) Samples from Advanced Non-Small Cell Lung Cancer for Whole Genome, Whole Exome and Comprehensive Panel Sequencing 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001015397 
   
  
    
    Illumina TSO500 RNA Dataset for Manuscript titled: Evaluation of Endobronchial Ultrasound-Guided Transbronchial Needle Aspiration (EBUS-TBNA) Samples from Advanced Non-Small Cell Lung Cancer for Whole Genome, Whole Exome and Comprehensive Panel Sequencing 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD00001015398 
   
  
    
    Cancer predisposition syndromes mediated by recessive cancer genes generate tumours via somatic variants (second hits) in the unaffected allele. Second hits may or may not be sufficient for neoplastic transformation. Here, we performed whole genome and exome sequencing on 479 tissue biopsies from a child with neurofibromatosis type 1, a multi-system cancer-predisposing syndrome mediated by constitutive monoallelic NF1 inactivation. We identified multiple independent NF1 driver variants in histologically normal tissues, but not in 610 biopsies from two non-predisposed children. We corroborated this finding using targeted duplex sequencing, including a further nine adults with the same syndrome. Overall, truncating NF1 mutations were under positive selection in normal tissues from individuals with neurofibromatosis type 1. We demonstrate that normal tissues in neurofibromatosis type 1 commonly harbour second hits in NF1, the extent and pattern of which may underpin the syndrome's cancer phenotype. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  - 
 
  
    EGAD00001015399 
   
  
    
    DNA WGS Short Read Sequence (Illumina NovaSeq) for manuscript titled: "Performance of Somatic Structural Variant Calling in Lung Cancer using Oxford Nanopore Sequencing Technology" 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD00001015400 
   
  
    
    DNA WGS Long Read Sequence (PromethION) for manuscript titled: "Performance of Somatic Structural Variant Calling in Lung Cancer using Oxford Nanopore Sequencing Technology" 
    
   
  
    
      
      PromethION 
      
    
   
  1 
 
  
    EGAD00001015401 
   
  
    
    Pediatric acute lymphoblastic leukemia (ALL) is marked by low mutational load at initial diagnosis, which increases at relapse. The elevated mutational load at relapse can partly be explained by at least two therapy-related effects and a combination of therapy and underlying mismatch repair deficiency. However, our understanding of the type and timing of mutational mechanisms in relapsed ALL is limited, and it is unclear to what extent mutational processes contribute to disease progression. We collected a cohort of 29 Dutch ALL patients across multiple treatment protocols who had multiple relapses. Using whole genome sequencing of the sequential tumor samples of each patient we were able to distinguish the mutational processes active in relapsed ALL and could track the activity of mutational processes over time. This allowed us to investigate whether subtype-specific mutational processes at diagnosis can continue in relapse or emerge at relapse if absent in initial diagnosis. Furthermore, we assessed whether the activity of mutational processes contributed to disease development and relapse. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  135 
 
  
    EGAD00010000050 
   
  
    
    Matched tumor-negative pancreas tissues 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  15 
 
  
    EGAD00010000051 
   
  
    
    Cell line derived from microdissected primary pancreatic ductal adenocarcinoma tissues 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  15 
 
  
    EGAD00010000052 
   
  
    
    Monozygotic twins that are discordant for schizophrenia (Genotyping) 
    
   
  
    
      
      CompleteGenomics build 1.4.2.8 - CG Build 1.4.2.8 
      
    
   
  36 
 
  
    EGAD00010000096 
   
  
    
    DBA case samples using 250K Nsp 
    
   
  
    
      
      Affymetrix_250K(Nsp) - gtype 
      
    
   
  27 
 
  
    EGAD00010000124 
   
  
    
    Psoriasis cases as part of WTCCC2 phase 2 
    
   
  
    
      
      Illumina_670k - Illuminus 
      
    
   
  2622 
 
  
    EGAD00010000130 
   
  
    
    Cerebellar ataxia, mental retardation, and disequilibrium syndrome (CAMRQ) samples 
    
   
  
    
      
      Illumina 
      
      Illumina 300 Duo V2 - Bead Studio 
      
    
   
  2 
 
  
    EGAD00010000144 
   
  
    
    Healthy volunteer collection of European Ancestry 
    
   
  
    
      
      Illumina OmniExpress v1.0 - Illumina GenomeStudio 
      
    
   
  288 
 
  
    EGAD00010000148 
   
  
    
    tumour samples using Affymetrix Genome-Wide SNP6.0 arrays 
    
   
  
    
      
      Affymetrix_GenomeWide_SNP6.34 
      
    
   
  104 
 
  
    EGAD00010000150 
   
  
    
    WTCCC2 project samples from Ankylosing spondylitis Cohort 
    
   
  
    
      
      Illumina_670k - Illuminus 
      
    
   
  2005 
 
  
    EGAD00010000158 
   
  
    
    Affymetrix 6.0 cel files 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  473 
 
  
    EGAD00010000160 
   
  
    
    Illumina HT 12 IDATS 
    
   
  
    
   
  - 
 
  
    EGAD00010000162 
   
  
    
    Illumina HT 12 IDATS 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  - 
 
  
    EGAD00010000164 
   
  
    
    Affymetrix 6.0 CEL files 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  - 
 
  
    EGAD00010000202 
   
  
    
    Case samples (Illumina_660K & Illumina_670K) 
    
   
  
    
      
      Illumina_660K/Illumina_670K 
      
    
   
  1478 
 
  
    EGAD00010000210 
   
  
    
    Normalized expression data; discovery set 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  1 
 
  
    EGAD00010000211 
   
  
    
    Normalized expression data; validation set 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  - 
 
  
    EGAD00010000212 
   
  
    
    Normalized expression data; normals 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  - 
 
  
    EGAD00010000213 
   
  
    
    Segmented (CBS) copy number aberrations (CNA); discovery set  
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  - 
 
  
    EGAD00010000214 
   
  
    
    Segmented (CBS) copy number variants (CNV); discovery set  
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  - 
 
  
    EGAD00010000215 
   
  
    
    Segmented (CBS) copy number aberrations (CNA); validation set 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  - 
 
  
    EGAD00010000216 
   
  
    
    Segmented (CBS) copy number variants (CNV); validation set 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  - 
 
  
    EGAD00010000217 
   
  
    
    Segmented (HMM) copy number aberrations (CNA); discovery set  
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  - 
 
  
    EGAD00010000220 
   
  
    
    Ovarian & matched normal (Genotypes) 
    
   
  
    
      
      Complete Genomics - CG Build 1.4.2.8 
      
    
   
  2 
 
  
    EGAD00010000230 
   
  
    
    WTCCC2 samples from Hypertension Cohort  
    
   
  
    
      
       - Illuminus 
      
    
   
  2943 
 
  
    EGAD00010000232 
   
  
    
    WTCCC2 samples from Type 2 Diabetes Cohort  
    
   
  
    
      
       - Illuminus 
      
    
   
  2975 
 
  
    EGAD00010000234 
   
  
    
    WTCCC2 samples from 1958 British Birth Cohort  
    
   
  
    
      
      Illumina HumanExome-12v1_A-GenCall, zCall 
      
    
   
  12241 
 
  
    EGAD00010000236 
   
  
    
    WTCCC2 samples from Coronary Artery Disease Cohort  
    
   
  
    
      
       - Illuminus, GenoSNP 
      
    
   
  3125 
 
  
    EGAD00010000238 
   
  
    
    CLL Expression array 
    
   
  
    
      
      Affymetrix GeneChip Human Genome U133 plus 2.0 
      
    
   
  64 
 
  
    EGAD00010000246 
   
  
    
    Coeliac disease cases and control samples. (1958BC samples excluded) 
    
   
  
    
      
      GenoSNP 
      
      Illumina ImmunoBeadChip - Illuminus 
      
    
   
  10758 
 
  
    EGAD00010000248 
   
  
    
    1958BC control samples  
    
   
  
    
      
      GenoSNP 
      
      Illumina ImmunoBeadChip - Illuminus 
      
    
   
  6812 
 
  
    EGAD00010000250 
   
  
    
    NBS control samples 
    
   
  
    
      
      GenoSNP 
      
      Illumina ImmunoBeadChip - Illuminus 
      
    
   
  3030 
 
  
    EGAD00010000252 
   
  
    
    CLL Expression Arrays 
    
   
  
    
      
      Affymetrix U219 
      
    
   
  137 
 
  
    EGAD00010000254 
   
  
    
    CLL Methylation Arrays 
    
   
  
    
      
      Illumina HumanMethylation450 
      
    
   
  165 
 
  
    EGAD00010000260 
   
  
    
    PNET genotyping 
    
   
  
    
      
      Illumina OmniQuad 2.5 - CNVpartition 
      
    
   
  77 
 
  
    EGAD00010000262 
   
  
    
    WTCCC2 project Schizophrenia (SP) samples 
    
   
  
    
      
      Affymetrix 6.0 - CHIAMO 
      
    
   
  3019 
 
  
    EGAD00010000264 
   
  
    
    WTCCC2 project samples from Ischaemic Stroke Cohort 
    
   
  
    
      
      Illumina_670k - Illuminus 
      
    
   
  4205 
 
  
    EGAD00010000266 
   
  
    
    Metabric breast cancer samples (Genotype raw data) 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  543 
 
  
    EGAD00010000268 
   
  
    
    Metabric breast cancer samples (Expression raw data) 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  543 
 
  
    EGAD00010000270 
   
  
    
    Metabric breast cancer samples (Images) 
    
   
  
    
      
      Aperio image - H&E stained tissue_section 
      
    
   
  564 
 
  
    EGAD00010000272 
   
  
    
    Colon tumour samples 
    
   
  
    
      
      Illumina_2.5M 
      
    
   
  75 
 
  
    EGAD00010000274 
   
  
    
    Colon matched tumour samples 
    
   
  
    
      
      Illumina_2.5M 
      
    
   
  74 
 
  
    EGAD00010000276 
   
  
    
    SCLC tumor genotypes 
    
   
  
    
      
      Illumina_2.5M 
      
    
   
  56 
 
  
    EGAD00010000278 
   
  
    
    SCLC matched normal genotypes 
    
   
  
    
      
      Illumina_2.5M 
      
    
   
  51 
 
  
    EGAD00010000280 
   
  
    
    CLL Expression array 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  4 
 
  
    EGAD00010000282 
   
  
    
    Pharmacogenomic response to Statins samples (Genotypes/Phenotypes) 
    
   
  
    
      
      Affymetrix 6.0 - CHIAMO 
      
    
   
  4134 
 
  
    EGAD00010000284 
   
  
    
    NBS control samples only (Hap300) 
    
   
  
    
      
      Illumina (Various) 
      
    
   
  2500 
 
  
    EGAD00010000286 
   
  
    
    All cases and controls (Hap550) 
    
   
  
    
   
  11950 
 
  
    EGAD00010000288 
   
  
    
    All cases and Finnish, Dutch, Italian control samples (Hap550) 
    
   
  
    
   
  6313 
 
  
    EGAD00010000290 
   
  
    
    NBS control samples only (Hap550) 
    
   
  
    
   
  2276 
 
  
    EGAD00010000292 
   
  
    
    All cases and Finnish, Dutch, Italian control samples (Hap300) 
    
   
  
    
   
  10339 
 
  
    EGAD00010000294 
   
  
    
    1958BC control samples only (Hap300) 
    
   
  
    
   
  2436 
 
  
    EGAD00010000296 
   
  
    
    1958BC control samples only (Hap550) 
    
   
  
    
   
  2224 
 
  
    EGAD00010000298 
   
  
    
    All cases and controls (Hap300) 
    
   
  
    
   
  13761 
 
  
    EGAD00010000300 
   
  
    
    Summary statistics from Haemgen RBC GWAS 
    
   
  
    
      
      Affymetrix 
      
      Illumina 
      
      Perlegen 
      
    
   
  1 
 
  
    EGAD00010000371 
   
  
    
    Case and control samples (Genotypes) 
    
   
  
    
      
      Infinium_370k - GenomeStudio 
      
    
   
  170 
 
  
    EGAD00010000377 
   
  
    
    DNA methylation analysis of 6 primary lymphoma samples 
    
   
  
    
      
      HumanMethylation450k Bead Chip - Genome Studio 
      
    
   
  6 
 
  
    EGAD00010000379 
   
  
    
    DNA methylation analysis of 2 peripheral blood samples 
    
   
  
    
      
      HumanMethylation450k Bead Chip - Genome Studio 
      
    
   
  2 
 
  
    EGAD00010000381 
   
  
    
    MRCE sample using 300K 
    
   
  
    
      
      Illumina 300K - GenomeStudio 
      
    
   
  543 
 
  
    EGAD00010000383 
   
  
    
    MRCA sample using 100K 
    
   
  
    
      
      Illumina 100K - GenomeStudio 
      
    
   
  1 
 
  
    EGAD00010000385 
   
  
    
    MRCA sample using 300K 
    
   
  
    
      
       Illumina 300K - GenomeStudio 
      
    
   
  394 
 
  
    EGAD00010000387 
   
  
    
    Cambridge control samples using a 1.2M genotyping chip from Illumina 
    
   
  
    
      
      Illumina Human 1.2M Duo custom BeadChips v1 - Genome Studio 
      
    
   
  188 
 
  
    EGAD00010000389 
   
  
    
    Cambridge control samples using a 24k expression array from Illumina 
    
   
  
    
      
      Illumina Human-Ref 8 v3.0 expression array 
      
    
   
  395 
 
  
    EGAD00010000391 
   
  
    
    Cambridge control samples using a 660K genotyping chip from Illumina 
    
   
  
    
      
      Illumina Human 660K Quad BeadChips - Illuminus 
      
    
   
  232 
 
  
    EGAD00010000395 
   
  
    
    Myeloma case sample genotype using Affymetrix SNP6.0 
    
   
  
    
      
      Affymetrix_SNP6 
      
    
   
  19 
 
  
    EGAD00010000417 
   
  
    
    Han Chinese samples using Illumina OMNIExpress (cases) 
    
   
  
    
      
      Illumina OMNIExpress 
      
    
   
  62 
 
  
    EGAD00010000419 
   
  
    
    Han Chinese samples using Affymetrix (cases) 
    
   
  
    
      
      Affymetrix_6.0 
      
    
   
  62 
 
  
    EGAD00010000421 
   
  
    
    Han Chinese samples using Affymetrix (controls) 
    
   
  
    
      
      Affymetrix_6.0 
      
    
   
  187 
 
  
    EGAD00010000423 
   
  
    
    Han Chinese samples using Illumina OMNIExpress (controls) 
    
   
  
    
      
      Illumina OMNIExpress 
      
    
   
  213 
 
  
    EGAD00010000425 
   
  
    
    Han Chinese samples using Immunochip 
    
   
  
    
      
      HanChinese_Immunochip 
      
    
   
  192 
 
  
    EGAD00010000427 
   
  
    
    DNA methylation analysis of 4 peripheral blood samples 
    
   
  
    
      
      HumanMethylation450k Bead Chip - Genome Studio 
      
    
   
  4 
 
  
    EGAD00010000429 
   
  
    
    DNA methylation analysis of 4 primary lymphoma samples 
    
   
  
    
      
      HumanMethylation450k Bead Chip - Genome Studio 
      
    
   
  4 
 
  
    EGAD00010000434 
   
  
    
    Normalised mRNA expression 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  1302 
 
  
    EGAD00010000436 
   
  
    
    Illumina HT 12 IDAT files 
    
   
  
    
      
      Illumina HT 12 
      
    
   
  1302 
 
  
    EGAD00010000438 
   
  
    
    Normalized miRNA expression data 
    
   
  
    
      
      Agilent ncRNA 60k 
      
    
   
  1480 
 
  
    EGAD00010000440 
   
  
    
    Segmented copy number data 
    
   
  
    
      
      Affymetrix_SNP6_raw 
      
    
   
  1302 
 
  
    EGAD00010000442 
   
  
    
    Affymetrix SNP 6.0 CEL files 
    
   
  
    
      
      Affymetrix_SNP6_raw 
      
    
   
  1302 
 
  
    EGAD00010000444 
   
  
    
    Agilent ncRNA 60k txt files 
    
   
  
    
      
      Agilent ncRNA 60k 
      
    
   
  1480 
 
  
    EGAD00010000446 
   
  
    
    Monocyte Gene Expression 
    
   
  
    
      
      Illumina Human-Ref-8 v3 beadchip 
      
    
   
  758 
 
  
    EGAD00010000448 
   
  
    
    Macrophage Gene Expression 
    
   
  
    
      
      Illumina Human-Ref-8 v3 beadchip 
      
    
   
  758 
 
  
    EGAD00010000450 
   
  
    
    Genome Wide Genotype Data 
    
   
  
    
      
      Illumina Human Custom 1,2M and Human 610 Quad Custom arrays 
      
    
   
  758 
 
  
    EGAD00010000452 
   
  
    
    Chondrosarcoma case sample genotype using Affymetrix SNP6.0 
    
   
  
    
      
      Affymetrix_SNP6 
      
    
   
  36 
 
  
    EGAD00010000456 
   
  
    
    Leukemia samples using 450K DNA methylation 
    
   
  
    
   
  800 
 
  
    EGAD00010000458 
   
  
    
    Controls using 450K DNA methylation 
    
   
  
    
   
  151 
 
  
    EGAD00010000460 
   
  
    
    GENCORD2 DNA methylation  
    
   
  
    
   
  294 
 
  
    EGAD00010000462 
   
  
    
    SJLGG Case samples using Gene Expression Array 
    
   
  
    
      
      Affymetrix_U133v2 
      
    
   
  75 
 
  
    EGAD00010000464 
   
  
    
    Down syndrome SNP genotyping data 
    
   
  
    
      
      Illumina 550K - Illumina Genome Studio 
      
    
   
  338 
 
  
    EGAD00010000466 
   
  
    
    Down syndrome CNV genotyping data 
    
   
  
    
      
      NimbleGen 135K aCGH - NimbleScan  
      
    
   
  108 
 
  
    EGAD00010000468 
   
  
    
    Uveal melanoma matched Tumour and blood samples 
    
   
  
    
      
      Illumina HumanOmni2.5 
      
    
   
  24 
 
  
    EGAD00010000470 
   
  
    
    CLL Expression Array 
    
   
  
    
      
      GPL570 
      
    
   
  20 
 
  
    EGAD00010000472 
   
  
    
    CLL Expression Array 
    
   
  
    
      
      Affymetrix U219 
      
    
   
  219 
 
  
    EGAD00010000474 
   
  
    
    blood-based gene expression from breast cancer cases and age-matched controls in case-control series 2 (CC2) 
    
   
  
    
      
      Illumina 
      
    
   
  98 
 
  
    EGAD00010000476 
   
  
    
    blood-based gene expression from breast cancer cases and age-matched controls in case-control series 1 (CC1) 
    
   
  
    
      
      Illumina 
      
    
   
  110 
 
  
    EGAD00010000478 
   
  
    
    blood-based gene expression from breast cancer cases and age-matched controls in case-control series 3 (CC3) 
    
   
  
    
      
      Illumina 
      
    
   
  118 
 
  
    EGAD00010000480 
   
  
    
    ccRCC case samples using 250K Nsp 
    
   
  
    
      
      Affymetrix_250K(Nsp) - gtype 
      
    
   
  240 
 
  
    EGAD00010000482 
   
  
    
    ccRCC case samples using methylation array 
    
   
  
    
      
      Illumina Infinium HumanMethylation 450K - GenomeStudio 
      
    
   
  1 
 
  
    EGAD00010000484 
   
  
    
    ccRCC control samples using 250K Nsp 
    
   
  
    
      
      Affymetrix_250K(Nsp) - gtype 
      
    
   
  234 
 
  
    EGAD00010000486 
   
  
    
    ccRCC case samples using expression array 
    
   
  
    
      
      Agilent Human Whole Genome 4x44k v2 - Feature Extraction 
      
    
   
  101 
 
  
    EGAD00010000488 
   
  
    
    Chondroblastoma case sample genotype using Affymetrix SNP6.0 
    
   
  
    
      
      Affymetrix_SNP6- 
      
    
   
  7 
 
  
    EGAD00010000490 
   
  
    
    Affymetrix Genome-Wide Human SNP Array 6.0 data 
    
   
  
    
      
      Affymetrix 6.0- 
      
    
   
  19 
 
  
    EGAD00010000492 
   
  
    
    Cases_Human660W-Quad_v1_A 
    
   
  
    
      
      Illumina_Human660W-Quad_v1_A-Not supplied 
      
    
   
  4 
 
  
    EGAD00010000494 
   
  
    
    Controls_Human660W-Quad_v1_A 
    
   
  
    
      
      Illumina_Human660W-Quad_v1_A-Not supplied 
      
    
   
  4 
 
  
    EGAD00010000496 
   
  
    
    Genome-wide SNP genotyping of African rainforest hunter-gatherers and neighbouring agriculturalists 
    
   
  
    
      
      Illumina HumanOmni1-Quad-Illumina GenomeStudio 
      
    
   
  260 
 
  
    EGAD00010000498 
   
  
    
    Affymetrix SNP6.0 genotype data for prostate cancer patients 
    
   
  
    
      
      Affymetrix_SNP6- 
      
    
   
  18 
 
  
    EGAD00010000500 
   
  
    
    Case samples using U133 Plus 2.0 Array  
    
   
  
    
      
      Affymetrix_U133plus2- 
      
    
   
  35 
 
  
    EGAD00010000502 
   
  
    
    Case samples using SNP Array 6.0 
    
   
  
    
      
      Affymetrix_U133plus2- 
      
    
   
  35 
 
  
    EGAD00010000504 
   
  
    
    Control samples using SNP Array 6.0 
    
   
  
    
      
      Affymetrix_U133plus2- 
      
    
   
  35 
 
  
    EGAD00010000506 
   
  
    
    WTCCC2 BO (Barretts oesophagus) samples 
    
   
  
    
      
      Illumina_670k-Illuminus 
      
    
   
  1991 
 
  
    EGAD00010000508 
   
  
    
    Matched control samples using SNP 6.0 Array 
    
   
  
    
      
      GenomeWideSNP_6-BirdseedV2 
      
    
   
  12 
 
  
    EGAD00010000510 
   
  
    
    Matched control samples using HumanOmni1-Quad 
    
   
  
    
      
      GenomeWideSNP_6-BirdseedV2 
      
    
   
  12 
 
  
    EGAD00010000512 
   
  
    
    Case samples using HumanOmni1-Quad 
    
   
  
    
      
      GenomeWideSNP_6-BirdseedV2 
      
    
   
  12 
 
  
    EGAD00010000514 
   
  
    
    Case samples using SNP 6.0 Array 
    
   
  
    
      
      GenomeWideSNP_6-BirdseedV2 
      
    
   
  12 
 
  
    EGAD00010000516 
   
  
    
    Samples from the Pomak Villages in Greece, Pomak isolate 
    
   
  
    
      
      HumanExome_12v1.1_A -GenCall, zCall 
      
    
   
  1046 
 
  
    EGAD00010000518 
   
  
    
    Samples from the Greek island of Crete, MANOLIS cohort 
    
   
  
    
      
      HumanExome_12v1.1_A -GenCall, zCall 
      
    
   
  1280 
 
  
    EGAD00010000520 
   
  
    
    Healthy volunteer collection of European Ancestry 
    
   
  
    
      
      Illumina OmniExpress v1.0-Illumina GenomeStudio 
      
    
   
  144 
 
  
    EGAD00010000522 
   
  
    
    Samples from the Greek island of Crete, MANOLIS cohort 
    
   
  
    
      
      HumanOmniExpress-12 v1.1 BeadChip-GenCall 
      
    
   
  1364 
 
  
    EGAD00010000526 
   
  
    
    SNP 6.0 arrays of small cell lung cancer 
    
   
  
    
      
      Affymetrix_SNP_6.0- 
      
    
   
  63 
 
  
    EGAD00010000528 
   
  
    
    Illumina HumanHT-12 v4 array 
    
   
  
    
   
  - 
 
  
    EGAD00010000532 
   
  
    
    Illumina Human Omni1-Quad SNP genotyping array 
    
   
  
    
   
  - 
 
  
    EGAD00010000534 
   
  
    
    Illumina HumanMethylation450 BeadChip 
    
   
  
    
   
  - 
 
  
    EGAD00010000536 
   
  
    
    21 unlinked autosomal microsatellite loci for 30 Central Asian populations  
    
   
  
    
      
      Applied Biosystems 3100 automated sequencer-GeneMarker v.1.6 (Softgenetics) 
      
    
   
  1702 
 
  
    EGAD00010000538 
   
  
    
    28 unlinked autosomal microsatellite loci for 20 African and 4 Philippine populations 
    
   
  
    
      
      Applied Biosystems 3100 automated sequencer-GeneMarker v.1.6 (Softgenetics) 
      
    
   
  1702 
 
  
    EGAD00010000542 
   
  
    
    Cushing's syndrome normal samples using 250K 
    
   
  
    
      
      Affymetrix 250K Nsp-GTYPE 
      
    
   
  16 
 
  
    EGAD00010000544 
   
  
    
    Cushing's syndrome tumor samples using 250K 
    
   
  
    
      
      Affymetrix 250K Nsp-GTYPE 
      
    
   
  16 
 
  
    EGAD00010000546 
   
  
    
    SNP 6.0 arrays of carcinoid samples 
    
   
  
    
      
      Affymetrix_SNP_6.0- 
      
    
   
  74 
 
  
    EGAD00010000552 
   
  
    
    Neuroblastoma samples 
    
   
  
    
   
  130 
 
  
    EGAD00010000554 
   
  
    
    SNP 6.0 arrays of small cell lung cancer 
    
   
  
    
   
  1032 
 
  
    EGAD00010000556 
   
  
    
    SNP 6.0 arrays of small cell lung cancer 
    
   
  
    
   
  1 
 
  
    EGAD00010000558 
   
  
    
    SNP 6.0 arrays of small cell lung cancer 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  54 
 
  
    EGAD00010000560 
   
  
    
    SNP array of 7 HCCs and matched background liver in children with bile salt export pump deficiency 
    
   
  
    
      
      Illumina HumanOmniExpress-12 v1. 
      
    
   
  14 
 
  
    EGAD00010000562 
   
  
    
    Medulloblastoma DNA methylation 
    
   
  
    
      
      Illumina_HumanMethylation450 
      
    
   
  115 
 
  
    EGAD00010000564 
   
  
    
    HipSci - Healthy Normals - Expression Array - May 2014 
    
   
  
    
   
  120 
 
  
    EGAD00010000566 
   
  
    
    HipSci - Healthy Normals - Genotyping Array - May 2014 
    
   
  
    
   
  120 
 
  
    EGAD00010000568 
   
  
    
    HipSci - Healthy Normals - Methylation Array - May 2014 
    
   
  
    
   
  - 
 
  
    EGAD00010000570 
   
  
    
     Imputation-based meta-analysis of severe malaria in Kenya.   
    
   
  
    
   
  3343 
 
  
    EGAD00010000572 
   
  
    
     Imputation-based meta-analysis of severe malaria in Gambia.   
    
   
  
    
   
  2870 
 
  
    EGAD00010000574 
   
  
    
    Pleuropulmonary blastoma samples using 250K 
    
   
  
    
   
  14 
 
  
    EGAD00010000578 
   
  
    
    Gencode case samples using 550K 
    
   
  
    
   
  249 
 
  
    EGAD00010000580 
   
  
    
    Gencode control samples using 550K 
    
   
  
    
   
  217 
 
  
    EGAD00010000584 
   
  
    
    WTCCC2 Glaucoma samples using Illumina 670k array 
    
   
  
    
      
      Illumina 670k (custom Illumina Human660W-Quad) 
      
    
   
  2765 
 
  
    EGAD00010000594 
   
  
    
    SCOOP severe early-onset obesity cases 
    
   
  
    
   
  1720 
 
  
    EGAD00010000596 
   
  
    
    PCGP Ph-likeALL GEA 
    
   
  
    
   
  837 
 
  
    EGAD00010000598 
   
  
    
    PCGP Ph-likeALL SNP6 
    
   
  
    
   
  1724 
 
  
    EGAD00010000600 
   
  
    
    Prostate Adenocarcinomas samples using 450K 
    
   
  
    
      
      Illumina450K 
      
    
   
  80 
 
  
    EGAD00010000602 
   
  
    
    WTCCC2 Reading and Mathematics ability (RM) samples from UK using the Affymetrix 6.0 array 
    
   
  
    
   
  3665 
 
  
    EGAD00010000604 
   
  
    
    DNA methylation data using Illumina 450K 
    
   
  
    
   
  2195 
 
  
    EGAD00010000606 
   
  
    
    SNP6 data for matched normal samples 
    
   
  
    
   
  8 
 
  
    EGAD00010000608 
   
  
    
    SNP6 data for seminoma samples 
    
   
  
    
   
  8 
 
  
    EGAD00010000610 
   
  
    
    Samples from the Greek island of Crete, MANOLIS cohort 
    
   
  
    
   
  221 
 
  
    EGAD00010000612 
   
  
    
    Celiac disease North Indian samples using Immunochip 
    
   
  
    
   
  - 
 
  
    EGAD00010000614 
   
  
    
    40 Druze Trios  
    
   
  
    
   
  120 
 
  
    EGAD00010000616 
   
  
    
    HumanOmni1-Quad genotyping array 
    
   
  
    
   
  230 
 
  
    EGAD00010000618 
   
  
    
    Ischemic stroke cases 
    
   
  
    
   
  3682 
 
  
    EGAD00010000620 
   
  
    
    Controls 
    
   
  
    
   
  3683 
 
  
    EGAD00010000622 
   
  
    
    SNP array data for gastric cancer cell lines 
    
   
  
    
   
  30 
 
  
    EGAD00010000624 
   
  
    
    A new beta-globin mutation responsible for beta-thalassemia (HbVar database ID 2928) was observed in 8 unrelated French families. The mutation carriers originated from Nord-Pas-de-Calais, a Northern French region whose chief town is Lille.
5 unrelated mutation carriers were genotyped for a set of 12 microsatellites from chromosome 11, around the beta-globin gene. Among the 5 mutation carriers, 4 were genotyped for 97 European Ancestry Informative SNPs (EAIMs). 
    
   
  
    
   
  - 
 
  
    EGAD00010000626 
   
  
    
    A new beta-globin mutation responsible for beta-thalassemia (HbVar database ID 2928) was observed in 8 unrelated French families. The mutation carriers originated from Nord-Pas-de-Calais, a Northern French region whose chief town is Lille.
5 unrelated mutation carriers were genotyped for a set of 12 microsatellites from chromosome 11, around the beta-globin gene. Among the 5 mutation carriers, 4 were genotyped for 97 European Ancestry Informative SNPs (EAIMs). 
    
   
  
    
   
  37 
 
  
    EGAD00010000628 
   
  
    
    The TEENAGE study target population comprised adolescent students aged 13 to 15 years attending the first three classes of public secondary schools located in the wider Athens area of Attica. 
    
   
  
    
   
  1 
 
  
    EGAD00010000630 
   
  
    
    The TEENAGE study target population comprised adolescent students aged 13 to 15 years attending the first three classes of public secondary schools located in the wider Athens area of Attica. 
    
   
  
    
   
  436 
 
  
    EGAD00010000632 
   
  
    
    WTCCC2 People of the British Isles (POBI) samples using Illumina 1.2M array 
    
   
  
    
   
  2912 
 
  
    EGAD00010000634 
   
  
    
    WTCCC2 People of the British Isles (POBI) samples using Affymetrix 6.0 array 
    
   
  
    
   
  2930 
 
  
    EGAD00010000636 
   
  
    
    WTCCC2 Visceral Leishmaniasis samples from Brazil using Illumina 670k 
    
   
  
    
      
      0 
      
    
   
  1 
 
  
    EGAD00010000638 
   
  
    
    WTCCC2 Visceral Leishmaniasis samples from India using Illumina 670k 
    
   
  
    
      
      0 
      
    
   
  1 
 
  
    EGAD00010000640 
   
  
    
    WTCCC2 Visceral Leishmaniasis samples from Sudan using Illumina 670k 
    
   
  
    
      
      0 
      
    
   
  1 
 
  
    EGAD00010000642 
   
  
    
    CLL Expression Array 
    
   
  
    
   
  1 
 
  
    EGAD00010000644 
   
  
    
    Affymetrix SNP6.0 cancer cell line exome sequencing data 
    
   
  
    
   
  1022 
 
  
    EGAD00010000646 
   
  
    
    DNA methylation analysis of 35 prostate tumor and 6 normal prostate samples 
    
   
  
    
   
  41 
 
  
    EGAD00010000648 
   
  
    
    nccRCC tumor/normal genotypes 
    
   
  
    
   
  1 
 
  
    EGAD00010000650 
   
  
    
    Genotypes from Omni2.5 chip 
    
   
  
    
   
  1213 
 
  
    EGAD00010000652 
   
  
    
    Genotyped samples using Illumina HumanOmni2.5 
    
   
  
    
   
  402 
 
  
    EGAD00010000654 
   
  
    
    Control samples using SNP 6.0 Arrays 
    
   
  
    
   
  1 
 
  
    EGAD00010000656 
   
  
    
    Case samples using SNP 6.0 Array 
    
   
  
    
   
  1 
 
  
    EGAD00010000658 
   
  
    
    DLBCL 148 SNP 6.0 Cohort 
    
   
  
    
   
  1 
 
  
    EGAD00010000662 
   
  
    
    Finnish population cohort genotyping 
    
   
  
    
   
  - 
 
  
    EGAD00010000664 
   
  
    
    Finnish population cohort genotyping_B 
    
   
  
    
   
  - 
 
  
    EGAD00010000666 
   
  
    
    Purified plasma cells from tonsil of healthy donor 
    
   
  
    
   
  1 
 
  
    EGAD00010000668 
   
  
    
    Purified plasma cells from bone marrow of monoclonal gammopathy of unknown significance patient 
    
   
  
    
   
  1 
 
  
    EGAD00010000670 
   
  
    
    Purified plasma cells from bone marrow of pooled healthy donors 
    
   
  
    
   
  1 
 
  
    EGAD00010000672 
   
  
    
    Purified plasma cells from bone marrow of multiple myeloma patient 
    
   
  
    
   
  1 
 
  
    EGAD00010000674 
   
  
    
    ELSA genome-wide genotypes, excluding estimated related individuals. There are 3 files: .fam, .bim, .bed 
    
   
  
    
   
  7412 
 
  
    EGAD00010000676 
   
  
    
    ELSA genome-wide genotypes, including estimated related individuals. There are 3 files: .fam, .bim, .bed 
    
   
  
    
   
  7452 
 
  
    EGAD00010000678 
   
  
    
    Tumor sample SNP arrays 
    
   
  
    
      
      Illumina SNP array 
      
    
   
  11 
 
  
    EGAD00010000680 
   
  
    
    Tumor sample CGH arrays 
    
   
  
    
      
      Agilent CGH array 
      
    
   
  4 
 
  
    EGAD00010000682 
   
  
    
    glioma tumor samples using 250K 
    
   
  
    
   
  762 
 
  
    EGAD00010000684 
   
  
    
    glioma normal samples using cytoscan 
    
   
  
    
   
  3 
 
  
    EGAD00010000686 
   
  
    
    glioma tumor samples using cytoscan 
    
   
  
    
   
  5 
 
  
    EGAD00010000688 
   
  
    
    glioma normal samples using 250K 
    
   
  
    
   
  119 
 
  
    EGAD00010000690 
   
  
    
    Genome-wide SNP genotyping of African rainforest hunter-gatherers and neighbouring agriculturalists by Illumina HumanOmniExpress 
    
   
  
    
   
  160 
 
  
    EGAD00010000692 
   
  
    
    Genome-wide DNA methylation epigenotyping of African rainforest hunter-gatherers and neighbouring agriculturalists by Illumina HumanMethylation450 
    
   
  
    
   
  372 
 
  
    EGAD00010000694 
   
  
    
    HCC array for cnv 
    
   
  
    
   
  55 
 
  
    EGAD00010000696 
   
  
    
    PCGP ETP ALL SNP6 
    
   
  
    
   
  - 
 
  
    EGAD00010000698 
   
  
    
    PCGP INF ALL SNP6 
    
   
  
    
   
  - 
 
  
    EGAD00010000702 
   
  
    
    SNP-chip genotyping data for one proband in the DDD study (Ref : Carvalho AJHG 2015) 
    
   
  
    
   
  1 
 
  
    EGAD00010000704 
   
  
    
    610k genotyping imputed on Hapmap 3 and 1000G Phase 1 CEU 
    
   
  
    
   
  714 
 
  
    EGAD00010000708 
   
  
    
    Human samples typed on Illumina Omni 5M 
    
   
  
    
   
  - 
 
  
    EGAD00010000710 
   
  
    
    ATRT genotyping blood 
    
   
  
    
   
  11 
 
  
    EGAD00010000712 
   
  
    
    ATRT genotyping 
    
   
  
    
   
  40 
 
  
    EGAD00010000714 
   
  
    
    aplastic anemia tumor samples using 250K 
    
   
  
    
      
      Affymetrix 250K Nsp-GTYPE 
      
    
   
  440 
 
  
    EGAD00010000716 
   
  
    
    BLUEPRINT DNA Methylation of different B-cell subpopulations 
    
   
  
    
   
  35 
 
  
    EGAD00010000718 
   
  
    
    BLUEPRINT Gene expression of different B-cell subpopulations 
    
   
  
    
   
  42 
 
  
    EGAD00010000722 
   
  
    
    Pilot experiment on functional genomics in osteoarthritis (coreex) 
    
   
  
    
   
  1 
 
  
    EGAD00010000724 
   
  
    
    Pilot experiment on functional genomics in osteoarthritis (methyl) 
    
   
  
    
   
  - 
 
  
    EGAD00010000730 
   
  
    
    WTCCC2 Psychosis Endophenotype samples from UK, Germany, Holland, Spain and Australia using the Affymetrix 6.0 array 
    
   
  
    
   
  1 
 
  
    EGAD00010000736 
   
  
    
    AAD case and control samples from UK and Norway 
    
   
  
    
   
  117 
 
  
    EGAD00010000738 
   
  
    
    Generation Scotland APOE data 
    
   
  
    
   
  18336 
 
  
    EGAD00010000740 
   
  
    
    Osteoarthritis cases genotyped on Illumina HumanOmniExpress from the arcOGEN Consortium (http://www.arcogen.org.uk/) with broader consent. 
    
   
  
    
   
  674 
 
  
    EGAD00010000742 
   
  
    
    Subset 1 of osteoarthritis cases genotyped on Illumina610k from the arcOGEN Consortium (http://www.arcogen.org.uk/) with broader consent. 
    
   
  
    
   
  5383 
 
  
    EGAD00010000744 
   
  
    
    Subset 2 of osteoarthritis cases genotyped on Illumina 610k from the arcOGEN Consortium (http://www.arcogen.org.uk/) with consent for osteoarthritis studies only. 
    
   
  
    
   
  2326 
 
  
    EGAD00010000748 
   
  
    
    Genotyping using Illumina Human OmniExpress12v1.0 
    
   
  
    
   
  1 
 
  
    EGAD00010000750 
   
  
    
    German glioma control germline genotypes using Illumina HumanExome-12v1_A array 
    
   
  
    
      
      Illumina HumanExome-12v1_A 
      
    
   
  2391 
 
  
    EGAD00010000752 
   
  
    
    German glioma case germline genotypes using Illumina HumanExome-12v1_A array 
    
   
  
    
      
      Illumina HumanExome-12v1_A 
      
    
   
  899 
 
  
    EGAD00010000754 
   
  
    
    UK glioma case germline genotypes using Illumina HumanExome-12v1_A array 
    
   
  
    
      
      Illumina HumanExome-12v1_A 
      
    
   
  596 
 
  
    EGAD00010000756 
   
  
    
    French glioma control germline genotypes using Illumina HumanExome-12v1_A array 
    
   
  
    
      
      Illumina HumanExome-12v1_A 
      
    
   
  699 
 
  
    EGAD00010000758 
   
  
    
    French glioma case germline genotypes using Illumina HumanExome-12v1_A array 
    
   
  
    
      
      Illumina HumanExome-12v1_A 
      
    
   
  906 
 
  
    EGAD00010000764 
   
  
    
    Ovarian tumor samples using Illumina  
    
   
  
    
   
  1 
 
  
    EGAD00010000766 
   
  
    
    We have established a mechanism for collecting postal DNA samples from consenting National Joint Registry for England and Wales (NJR) patients and have carried out genome-wide genotyping of 903 patients with Developmental Dysplasia of the Hip (DDH) on the Illumina CoreExome array 
    
   
  
    
   
  903 
 
  
    EGAD00010000768 
   
  
    
    Replication data for HipSci normal samples using both HumanCoreExome-12_v1 and HumanOmni2.5-8 BeadChips 
    
   
  
    
   
  - 
 
  
    EGAD00010000771 
   
  
    
    HipSci - Healthy Normals - Methylation Array - April 2015 
    
   
  
    
   
  - 
 
  
    EGAD00010000773 
   
  
    
    HipSci - Healthy Normals - Genotyping Array - November 2014 
    
   
  
    
      
      Illumina 
      
    
   
  580 
 
  
    EGAD00010000775 
   
  
    
    HipSci - Healthy Normals - Expression Array - November 2014 
    
   
  
    
      
      Illumina 
      
    
   
  580 
 
  
    EGAD00010000777 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Genotyping Array - November 2014 
    
   
  
    
   
  - 
 
  
    EGAD00010000779 
   
  
    
    HipSci - Monogenic Diabetes - Genotyping Array - November 2014 
    
   
  
    
      
      Illumina 
      
    
   
  9 
 
  
    EGAD00010000781 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Methylation Array - April 2015 
    
   
  
    
   
  - 
 
  
    EGAD00010000783 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Expression Array - November 2014 
    
   
  
    
   
  - 
 
  
    EGAD00010000785 
   
  
    
    HipSci - Monogenic Diabetes - Expression Array - November 2014 
    
   
  
    
   
  - 
 
  
    EGAD00010000787 
   
  
    
    Epigen-Brasil samples using HumanOmni2.5 
    
   
  
    
   
  6487 
 
  
    EGAD00010000789 
   
  
    
    ATRT expression 
    
   
  
    
      
      Illumina Human HT6-v3 Array 
      
    
   
  4 
 
  
    EGAD00010000790 
   
  
    
    ATRT expression 
    
   
  
    
      
      Illumina Human HT6-v3 Array 
      
    
   
  41 
 
  
    EGAD00010000791 
   
  
    
    Illumina HumanOmni2.5-8 BeadChip 
    
   
  
    
   
  1 
 
  
    EGAD00010000807 
   
  
    
    Illumina HumanCoreExome genotyping data from the British Society for Surgery of the Hand Genetics of Dupuytren’s Disease consortium (BSSH-GODD consortium) collection 
    
   
  
    
   
  4201 
 
  
    EGAD00010000811 
   
  
    
    ATL tumor samples using Illumina 610K SNP array 
    
   
  
    
   
  1 
 
  
    EGAD00010000813 
   
  
    
    ATL tumor samples using Illumina 450K Methylation array 
    
   
  
    
   
  1 
 
  
    EGAD00010000815 
   
  
    
    ATL tumor samples using Affymetrix 250K SNP array 
    
   
  
    
   
  1 
 
  
    EGAD00010000817 
   
  
    
    HipSci - Monogenic Diabetes - Methylation Array - April 2015 
    
   
  
    
   
  - 
 
  
    EGAD00010000819 
   
  
    
    Summary statistics from meta-analysis for BP phenotypes 
    
   
  
    
   
  - 
 
  
    EGAD00010000823 
   
  
    
    Results of SNP arrays on synchronous CRC samples 
    
   
  
    
   
  1 
 
  
    EGAD00010000827 
   
  
    
    Illumina Infinium 450K array data 
    
   
  
    
   
  1 
 
  
    EGAD00010000829 
   
  
    
    Illumina Infinium 450K array data 
    
   
  
    
   
  70 
 
  
    EGAD00010000831 
   
  
    
    BLUEPRINT EpiMatch: harnessing epigenetics for haematopoietic stem cell transplantation 
    
   
  
    
   
  85 
 
  
    EGAD00010000847 
   
  
    
    Genotyping using Affymetrix SNP6.0 
    
   
  
    
   
  49 
 
  
    EGAD00010000850 
   
  
    
    BLUEPRINT DNA methylation profiles of monocytes, neutrophils and T cells from healthy donors 
    
   
  
    
   
  525 
 
  
    EGAD00010000853 
   
  
    
    VeraCode GoldenGate GT Assay technology 
    
   
  
    
   
  147 
 
  
    EGAD00010000854 
   
  
    
    WTCCC3 UK maternal cases of pre-eclampsia 
    
   
  
    
      
      Illumina Human670-QuadCustom_v1 
      
    
   
  3980 
 
  
    EGAD00010000858 
   
  
    
    Achalasia cases & controls 
    
   
  
    
   
  8151 
 
  
    EGAD00010000859 
   
  
    
    Smad3  
    
   
  
    
      
      Illumina ChIP-Sequencing 
      
    
   
  16 
 
  
    EGAD00010000860 
   
  
    
    Pol2 
    
   
  
    
      
      Illumina ChIP-Sequencing 
      
    
   
  16 
 
  
    EGAD00010000862 
   
  
    
    H3K27me3 
    
   
  
    
      
      Illumina ChIP-Sequencing 
      
    
   
  16 
 
  
    EGAD00010000863 
   
  
    
    H3K27Ac 
    
   
  
    
      
      Illumina ChIP-Sequencing 
      
    
   
  16 
 
  
    EGAD00010000865 
   
  
    
    MBDSEQ 
    
   
  
    
      
      Illumina MBD-Sequencing 
      
    
   
  16 
 
  
    EGAD00010000867 
   
  
    
    Expression Arrays 
    
   
  
    
      
      Illumina beadarray 
      
    
   
  16 
 
  
    EGAD00010000868 
   
  
    
    Targeted bisulfite sequencing 
    
   
  
    
      
      Illumina Bisulfite-Sequencing 
      
    
   
  16 
 
  
    EGAD00010000869 
   
  
    
    RNA expression microarray 
    
   
  
    
      
      Illumina_HumanHT-12v4 
      
    
   
  62 
 
  
    EGAD00010000870 
   
  
    
    DNA methylation microarray 
    
   
  
    
      
      Illumina_Infinium_HumanMethylation450 
      
    
   
  48 
 
  
    EGAD00010000871 
   
  
    
    CLL and normal B cell samples using 450K 
    
   
  
    
   
  226 
 
  
    EGAD00010000872 
   
  
    
    Genotyped case and control samples using HumanExome Beadchip 
    
   
  
    
   
  1610 
 
  
    EGAD00010000874 
   
  
    
    Understanding Society Sequenom genotypes 
    
   
  
    
      
      Sequenom 
      
    
   
  8590 
 
  
    EGAD00010000875 
   
  
    
    CLL Expression Array 
    
   
  
    
      
      Affymetrix U219 
      
    
   
  - 
 
  
    EGAD00010000881 
   
  
    
    Digital images of ovarian cancer sections 
    
   
  
    
      
      Aperio 
      
    
   
  91 
 
  
    EGAD00010000883 
   
  
    
    The ARGO-Larissa GWAS. 
    
   
  
    
      
      Illumina HumanCoreExome-24v1-0 
      
    
   
  859 
 
  
    EGAD00010000886 
   
  
    
    samples using Affymetrix HG_U133_+2 
    
   
  
    
      
      Affymetrix HG_U133_+2 
      
    
   
  99 
 
  
    EGAD00010000887 
   
  
    
    Freeze 1 of the RP3 project 
    
   
  
    
      
      Illumina Human Methylation 450k BeadChip 
      
    
   
  3898 
 
  
    EGAD00010000889 
   
  
    
    Gencode control samples using SNP6.0 
    
   
  
    
      
      SNP6.0 
      
    
   
  183 
 
  
    EGAD00010000890 
   
  
    
    Understanding Society GWAS, all samples 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0 
      
    
   
  20926 
 
  
    EGAD00010000891 
   
  
    
    Understanding Society GWAS, samples that passed quality control 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0 
      
    
   
  19888 
 
  
    EGAD00010000892 
   
  
    
    Healthy individuals from Italy 
    
   
  
    
      
      Illumina 
      
    
   
  300 
 
  
    EGAD00010000897 
   
  
    
    Infinium 450K in Rhabdomyosarcoma 
    
   
  
    
      
      Infinium HumanMethylation450 BeadChip 
      
    
   
  53 
 
  
    EGAD00010000901 
   
  
    
    Russian Tuberculosis samples using Affymetrix 6.0 
    
   
  
    
      
      Affymetrix Genome-Wide Human SNP Array 6.0 Genotypes 
      
    
   
  11937 
 
  
    EGAD00010000902 
   
  
    
    Genome-wide study of resistance to severe malaria in eleven worldwide populations:Gambia 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  5594 
 
  
    EGAD00010000903 
   
  
    
    Genome-wide study of resistance to severe malaria in eleven worldwide populations:Malawi 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  3088 
 
  
    EGAD00010000904 
   
  
    
    Genome-wide study of resistance to severe malaria in eleven worldwide populations:Kenya 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  3865 
 
  
    EGAD00010000908 
   
  
    
    Illumina SNP-arrays for matching retinoblastoma-blood pairs and retinoblastoma cell lines. 
    
   
  
    
      
      HumanOmni1 Quad BeadChip 
      
    
   
  132 
 
  
    EGAD00010000909 
   
  
    
    HipSci - Embryonic Stem Cells - Methylation Array - April 2016 
    
   
  
    
      
      Illumina 
      
    
   
  2 
 
  
    EGAD00010000910 
   
  
    
    HipSci - Embryonic Stem Cells - Expression Array - April 2016 
    
   
  
    
      
      Illumina 
      
    
   
  2 
 
  
    EGAD00010000911 
   
  
    
    HipSci - Embryonic Stem Cells - Genotyping Array - April 2016 
    
   
  
    
      
      Illumina 
      
    
   
  2 
 
  
    EGAD00010000912 
   
  
    
    SEA 610K 
    
   
  
    
      
      Illumina 610K 
      
    
   
  1 
 
  
    EGAD00010000913 
   
  
    
    SEA 660K 
    
   
  
    
      
      Illumina 660K 
      
    
   
  3 
 
  
    EGAD00010000915 
   
  
    
    Affymetrix SNP6.0 breast cancer genome sequencing data 
    
   
  
    
      
      Affymetrix SNP6.0 
      
    
   
  344 
 
  
    EGAD00010000916 
   
  
    
    BASIS breast cancer DNA methylation Illumina 450k 
    
   
  
    
      
      Illumina 450k 
      
    
   
  457 
 
  
    EGAD00010000917 
   
  
    
    399 tumors profiled using Agilent miRNA microarrays (Product Number G4872A, design ID 046064). The arrays are based on miRBase release 19.0 and 2006 human miRNAs are represented. 150 ng total RNA was used as input. 
    
   
  
    
      
      Agilent miRNA microarrays 
      
    
   
  399 
 
  
    EGAD00010000918 
   
  
    
    Understanding Society GWAS, samples that passed quality control, imputed to UK10K + 1000 Genomes combined reference panel 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0 chip, UK10K + 1000 Genomes combined reference panel imputed 
      
    
   
  19888 
 
  
    EGAD00010000919 
   
  
    
    samples using Illumina HUMANOMNI1QUAD 
    
   
  
    
      
      HUMANOMNI1QUAD 
      
    
   
  2 
 
  
    EGAD00010000920 
   
  
    
    samples using Illumina HUMANOMNIEXPRESS 
    
   
  
    
      
      HUMANOMNIEXPRESS 
      
    
   
  50 
 
  
    EGAD00010000921 
   
  
    
    samples using Affymetrix CYTOSCANHD 
    
   
  
    
      
      CYTOSCANHD 
      
    
   
  12 
 
  
    EGAD00010000922 
   
  
    
    Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-24v1-0 with broader consent. 
    
   
  
    
      
      Illumina HumanCoreExome-24v1-0 
      
    
   
  494 
 
  
    EGAD00010000923 
   
  
    
    Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-0 with consent for osteoarthritis studies only. 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0 
      
    
   
  463 
 
  
    EGAD00010000924 
   
  
    
    Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-1 with consent for osteoarthritis studies only. 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-1 
      
    
   
  991 
 
  
    EGAD00010000925 
   
  
    
    Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-0 with broader consent. 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0 
      
    
   
  855 
 
  
    EGAD00010000926 
   
  
    
    Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-1 with broader consent. 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-1 
      
    
   
  3075 
 
  
    EGAD00010000927 
   
  
    
    Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-24v1-0 with consent for osteoarthritis studies only. 
    
   
  
    
      
      Illumina HumanCoreExome-24v1-0 
      
    
   
  248 
 
  
    EGAD00010000928 
   
  
    
    WTCCC3_Primary Biliary Cirrhosis Replication Post-QC 
    
   
  
    
      
      Illumina ImmunoChip 
      
    
   
  2861 
 
  
    EGAD00010000929 
   
  
    
    WTCCC3_Primary Biliary Cirrhosis Replication 
    
   
  
    
      
      Illumina ImmunoChip 
      
    
   
  2981 
 
  
    EGAD00010000934 
   
  
    
    Agilent miRNA dataset 
    
   
  
    
      
      Agilent SurePrint Human miRNA Microarray 
      
    
   
  2 
 
  
    EGAD00010000935 
   
  
    
    ACGH 244K dataset 
    
   
  
    
      
      Agilent 244K 
      
    
   
  10 
 
  
    EGAD00010000936 
   
  
    
    Affymetrix Exon Array dataset 
    
   
  
    
      
      Affymetrix GeneChip Human Exon 1.0 ST 
      
    
   
  2 
 
  
    EGAD00010000937 
   
  
    
    ACGH 180K dataset 
    
   
  
    
      
      Agilent 180K 
      
    
   
  5 
 
  
    EGAD00010000938 
   
  
    
    mRNA Array Agilent 44K dataset 
    
   
  
    
      
      Agilent 44K 
      
    
   
  16 
 
  
    EGAD00010000939 
   
  
    
    Illumina 1M SNP Array dataset 
    
   
  
    
      
      Illumina 1M SNP Array 
      
    
   
  2 
 
  
    EGAD00010000940 
   
  
    
    Gambian specimens with trachomatous scarring WHO grade C2/C3 
    
   
  
    
      
      Illumina Omni 2.5 
      
    
   
  1531 
 
  
    EGAD00010000941 
   
  
    
    Gambian specimens without trachomatous scarring 
    
   
  
    
      
      Illumina Omni 2.5 
      
    
   
  1531 
 
  
    EGAD00010000942 
   
  
    
    Breast lesions assayed with Affymetrix SNP 6.0 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  125 
 
  
    EGAD00010000943 
   
  
    
    Sahel population study using 2.5M 
    
   
  
    
      
      Illumina HumanOmni2.5 
      
    
   
  161 
 
  
    EGAD00010000944 
   
  
    
    Genotyping data from Southeast Borneo individuals 
    
   
  
    
      
      Illumina Human Omni Express Bead Chip-24 v1.0 
      
    
   
  41 
 
  
    EGAD00010000946 
   
  
    
    Human samples, 450k analysis 
    
   
  
    
      
      Illumina 450k 
      
    
   
  127 
 
  
    EGAD00010000947 
   
  
    
    Lymphoma samples using CytoSNP 
    
   
  
    
      
      Illumina CytoSNP 
      
    
   
  35 
 
  
    EGAD00010000948 
   
  
    
    Lymphoma samples using 450k 
    
   
  
    
      
      Illumina 450k 
      
    
   
  95 
 
  
    EGAD00010000949 
   
  
    
    Lymphoma samples using HumanOmni 
    
   
  
    
      
      Illumina HumanOmni2.5 
      
    
   
  104 
 
  
    EGAD00010000950 
   
  
    
    WTCCC2 Bacteraemia Susceptibility (BS) samples using Affymetrix 6.0 
    
   
  
    
      
      Affymetrix 6.0 
      
    
   
  4924 
 
  
    EGAD00010000951 
   
  
    
    SNP array data for 668 cancer cell lines 
    
   
  
    
      
      Illumina 2.5M 
      
    
   
  668 
 
  
    EGAD00010000952 
   
  
    
    Where Are You From? samples typed at 517K SNP loci 
    
   
  
    
      
      Illumina HumanOmniExpress-24 BeadChip 
      
    
   
  598 
 
  
    EGAD00010000953 
   
  
    
    Healthy adult volunteers and newborns recruited in various countries across Oceania. 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  937 
 
  
    EGAD00010000954 
   
  
    
    Healthy volunteers recruited in New Caledonia 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  356 
 
  
    EGAD00010000955 
   
  
    
    Rheumatic heart disease cases recruited in Fiji with higher density genotyping 
    
   
  
    
      
      HumanOmniExpressExome-8 BeadChip 
      
    
   
  32 
 
  
    EGAD00010000956 
   
  
    
    Rheumatic heart disease cases recruited in New Caledonia with higher density genotyping 
    
   
  
    
      
      HumanOmniExpressExome-8 BeadChip 
      
    
   
  34 
 
  
    EGAD00010000957 
   
  
    
    Rheumatic heart disease cases recruited in New Caledonia 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  465 
 
  
    EGAD00010000958 
   
  
    
    Healthy volunteers recruited in Fiji with higher density genotyping 
    
   
  
    
      
      HumanOmniExpressExome-8 BeadChip 
      
    
   
  32 
 
  
    EGAD00010000959 
   
  
    
    Healthy volunteers recruited in Fiji 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  854 
 
  
    EGAD00010000960 
   
  
    
    Definite and borderline rheumatic heart disease cases and patients with mild non-diagnostic valvulopathy recruited in Samoa 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  126 
 
  
    EGAD00010000961 
   
  
    
    Rheumatic heart disease cases recruited in Fiji 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  535 
 
  
    EGAD00010000962 
   
  
    
    Healthy volunteers and missing phenotype individuals recruited in New Caledonia with higher density genotyping 
    
   
  
    
      
      HumanOmniExpressExome-8 BeadChip 
      
    
   
  30 
 
  
    EGAD00010000963 
   
  
    
    Healthy volunteers recruited in Samoa 
    
   
  
    
      
      HumanCore-24 BeadChip 
      
    
   
  24 
 
  
    EGAD00010000965 
   
  
    
    Array data from 4778 individuals from general population of rural Uganda 
    
   
  
    
   
  4778 
 
  
    EGAD00010000983 
   
  
    
    MeDIP-seq RPM chromosome BED files for Peripheral Blood from EPITWIN Project (Columns 4-4353 represent samples) 
    
   
  
    
      
      MeDIP-seq 
      
    
   
  4350 
 
  
    EGAD00010001001 
   
  
    
    Primary renal cell carcinoma (RCC), RCC metastases and cell lines by Illumina 450K 
    
   
  
    
      
      Illumina 450K 
      
    
   
  62 
 
  
    EGAD00010001003 
   
  
    
    This data set contains two data files. The first data file (file name: PREDO_GA_EGA_methylation_data.csv) includes methylation data from 485512 sites across the human genome from 96 individuals, acquired from the Illumina 450K chip. The other data file (file name: PREDO_GA_EGA_phenotypes.csv) contains the gestational ages and the genders of the 96 samples. 
    
   
  
    
      
      Illumina 450K-chip (methylation data) 
      
    
   
  96 
 
  
    EGAD00010001004 
   
  
    
    WTCCC1 project samples from 1958 British Birth Cohort 
    
   
  
    
      
      Infinium 550K 
      
    
   
  1504 
 
  
    EGAD00010001005 
   
  
    
    Illumina HumanCoreExome-12v1-1_A chip typing in a Greek adolescent population 
    
   
  
    
      
      Illumina Human Core Exome 12v1.1 
      
    
   
  120 
 
  
    EGAD00010001006 
   
  
    
    Proteomics LC-MS MS dataset 
    
   
  
    
      
      Liquid chromatography–mass spectrometry 
      
    
   
  8 
 
  
    EGAD00010001012 
   
  
    
    BLUEPRINT DNA Methylation 450K data of mantle cell lymphoma 
    
   
  
    
      
      Illumina HumanMethylation 450K 
      
    
   
  86 
 
  
    EGAD00010001025 
   
  
    
    BLUEPRINT DNA methylation profiles of monocytes, T cells and B cells in type 1 diabetes-discordant monozygotic twins 
    
   
  
    
      
      Illumina 450K 
      
    
   
  302 
 
  
    EGAD00010001029 
   
  
    
    Summary statistics for a multi-cohort epigenome-wide association study. This includes summary statistics (effect-size, standard error, p-value) for 470,000 methylation markers. 
    
   
  
    
   
  - 
 
  
    EGAD00010001032 
   
  
    
    RNA Expression using Illumina HT12 v3 
    
   
  
    
      
      Illumina HT12 v3 
      
    
   
  153 
 
  
    EGAD00010001034 
   
  
    
    WTCCC3 Anorexia Nervosa GWAS 
    
   
  
    
      
      Illumina Human670-QuadCustom_v1_A 
      
    
   
  1696 
 
  
    EGAD00010001040 
   
  
    
    Methylation changes in OA patients with chronic exposure to cobalt and chromium 
    
   
  
    
      
      Illumina HumanMethylation450 
      
    
   
  68 
 
  
    EGAD00010001043 
   
  
    
    WTCCC3 Anorexia Nervosa Infinium-HumanCoreExome 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0_A and HumanCoreExome-24v1-0_A 
      
    
   
  925 
 
  
    EGAD00010001045 
   
  
    
    APCDR AGV Project: Array data from 99 Igbo. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2.5-4v1_B 
      
    
   
  - 
 
  
    EGAD00010001046 
   
  
    
    APCDR AGV Project: Array data from 86 Sotho. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001047 
   
  
    
    APCDR AGV Project: Array data from 107 Ethiopians (Amhara, Oromo, Somali; subset of Ethiopian Genome Project Genotyping). Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001048 
   
  
    
    APCDR AGV Project: Array data from 79 Jola. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001049 
   
  
    
    APCDR AGV Project: Array data from 99 Kikuyu. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2.5-4v1_B 
      
    
   
  - 
 
  
    EGAD00010001050 
   
  
    
    APCDR AGV Project: Array data from 78 Wolof. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001051 
   
  
    
    APCDR AGV Project: Array data from 97 Barundi. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001052 
   
  
    
    APCDR AGV Project: Array data from 100 Kalenjin. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2.5-4v1_B 
      
    
   
  - 
 
  
    EGAD00010001053 
   
  
    
    APCDR AGV Project: Array data from 100 Banyarwanda. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001054 
   
  
    
    APCDR AGV Project: Array data from 74 Fula 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001055 
   
  
    
    APCDR AGV Project: Array data from 100 Baganda. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
   
  - 
 
  
    EGAD00010001056 
   
  
    
    APCDR AGV Project: Array data from 100 Zulu. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001057 
   
  
    
    APCDR AGV Project: Array data from 88 Mandinka. Raw data, intensity files and post-QC Plink files. 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  - 
 
  
    EGAD00010001062 
   
  
    
    blood-based gene expression from breast cancer cases and age-matched controls 
    
   
  
    
      
      IlluminaHuman AWG-6 and HT12 
      
    
   
  455 
 
  
    EGAD00010001063 
   
  
    
    blood-based gene expression from breast cancer cases 
    
   
  
    
      
      IlluminaHuman AWG-6 and HT12 
      
    
   
  173 
 
  
    EGAD00010001064 
   
  
    
    tumor-based gene expression from breast cancer cases 
    
   
  
    
      
      IlluminaHuman HT12 
      
    
   
  173 
 
  
    EGAD00010001074 
   
  
    
    Rare CNVs from schizophrenia cases and controls 
    
   
  
    
      
      Multiple CNV platforms 
      
    
   
  1 
 
  
    EGAD00010001075 
   
  
    
    Argentine samples using 250K 
    
   
  
    
      
      Illumina Exome 250K 
      
    
   
  391 
 
  
    EGAD00010001079 
   
  
    
    Affymetrix SNP6.0 array breast cancer data 
    
   
  
    
      
      Affymetrix SNP6.0 
      
    
   
  66 
 
  
    EGAD00010001081 
   
  
    
    Summary statistics for Malaria Genomic Epidemiology Network, "A novel locus of resistance to severe malaria in a region of ancient balancing selection", Nature (2015) 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  1 
 
  
    EGAD00010001099 
   
  
    
    Digital images of ovarian cancer metastases 
    
   
  
    
      
      Aperio 
      
    
   
  127 
 
  
    EGAD00010001101 
   
  
    
    Genotype data from Chad, Lebanon, and Yemen 
    
   
  
    
      
      Illumina HumanOmni2.5-8 v1.1 B 
      
    
   
  - 
 
  
    EGAD00010001102 
   
  
    
    Genotype data from Chad, Lebanon, and Yemen 
    
   
  
    
      
      Illumina HumanOmni2.5-8 v1.2 A 
      
    
   
  - 
 
  
    EGAD00010001103 
   
  
    
    Genotype data from Chad, Lebanon, and Yemen 
    
   
  
    
      
      Illumina HumanOmni2.5-8 v1.1 B 
      
    
   
  - 
 
  
    EGAD00010001131 
   
  
    
    The 100 European-descent (EUB) and 100 African-descent (AFB) Belgians studied were genotyped for a total of 4,301,332 SNPs on the Illumina HumanOmni5-Quad BeadChips. Whole-exome sequencing was carried out for the same 200 individuals with the Nextera Rapid Capture Expanded Exome kit, on the Illumina HiSeq 2000 platform, with 100-bp paired-end reads. This kit delivers 62 Mb of genomic content per individual, including exons, untranslated regions (UTR), and microRNAs. Omni5 and exome datasets were merged, yielding a concordance rate between platforms of 99.93%. 
    
   
  
    
      
      Illumina HumanOmni5-Quad and exome sequencing 
      
    
   
  200 
 
  
    EGAD00010001139 
   
  
    
    HipSci - Healthy Normals - Methylation Array - October 2016 
    
   
  
    
      
      Illumina 
      
    
   
  181 
 
  
    EGAD00010001141 
   
  
    
    Summary data from Meta-analysis of Genome-Wide-Association Studies for plasma levels of Coagulation Factor XI (FXI) 
    
   
  
    
   
  - 
 
  
    EGAD00010001143 
   
  
    
    HipSci - Healthy Normals - Expression Array - September 2016 
    
   
  
    
      
      Illumina 
      
    
   
  613 
 
  
    EGAD00010001145 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Methylation Array - October 2016 
    
   
  
    
      
      Illumina 
      
    
   
  45 
 
  
    EGAD00010001147 
   
  
    
    HipSci - Healthy Normals - Genotyping Array - September 2016 
    
   
  
    
      
      Illumina 
      
    
   
  613 
 
  
    EGAD00010001149 
   
  
    
    HipSci - Monogenic Diabetes - Methylation Array - October 2016 
    
   
  
    
      
      Illumina 
      
    
   
  35 
 
  
    EGAD00010001153 
   
  
    
    Family Trios on aCGH 8x60K 
    
   
  
    
      
      Agilent 8x60K 
      
    
   
  138 
 
  
    EGAD00010001155 
   
  
    
    Crohn's disease DNA samples genotyped using UK Biobank Axiom array 
    
   
  
    
      
      Axiom UKB 
      
    
   
  1676 
 
  
    EGAD00010001157 
   
  
    
    Genotyping of additional Inflammatory Bowel Disease cases - 2014 (QC pass samples) 
    
   
  
    
      
      Illumina Human Core Exome 12v1-1_a 
      
    
   
  9247 
 
  
    EGAD00010001158 
   
  
    
    Genotyping of additional Inflammatory Bowel Disease cases - 2014 (all samples) 
    
   
  
    
      
      Illumina Human Core Exome 12v1-1_a 
      
    
   
  11767 
 
  
    EGAD00010001161 
   
  
    
    Oncotrack metastatic samples using 450K. The shared AF analysis files oncotrackDNAmAnalysis.R and oncotrackDNAmBetaScores.txt, which were applied to both the Oncotrack_450K_tumor (EGAD00010001162) and Oncotrack_450K_metastatic (EGAD00010001161) datasets, are included in the Oncotrack_450K_tumor (EGAD00010001162) dataset. 
    
   
  
    
      
      Illumina 450K 
      
    
   
  15 
 
  
    EGAD00010001162 
   
  
    
    Oncotrack primary tumor samples using 450K. The dataset includes the shared AF analysis files oncotrackDNAmAnalysis.R and oncotrackDNAmBetaScores.txt, which were applied to both the Oncotrack_450K_tumor (EGAD00010001162) and Oncotrack_450K_metastatic (EGAD00010001161) datasets. 
    
   
  
    
      
      Illumina 450K 
      
    
   
  67 
 
  
    EGAD00010001176 
   
  
    
    This dataset contains 15 control SNP-array datasets from 15 EGFR mutant lung adenocarcinoma patients. 
    
   
  
    
      
      Illumina 
      
    
   
  15 
 
  
    EGAD00010001177 
   
  
    
    This dataset contains 61 tumor SNP-array datasets from 15 EGFR mutant lung adenocarcinoma patients. 
    
   
  
    
      
      Illumina 
      
    
   
  61 
 
  
    EGAD00010001179 
   
  
    
    Tissue samples using Illumina HumanOmniExpress-FFPE-12 v1.0 BeadChip 
    
   
  
    
      
      Illumina HumanOmniExpress-FFPE-12 v1.0 BeadChip 
      
    
   
  22 
 
  
    EGAD00010001184 
   
  
    
    This data set includes the following summary level data file used for the imputation data: imputation.sv.assoc.txt: results from single variant association analysis in imputed samples 
    
   
  
    
   
  - 
 
  
    EGAD00010001185 
   
  
    
    This data set includes the following summary level data files used for the GoT2D WGS analysis: wgs.assoc.samples.list: list of samples to keep for association analysis; wgs.assoc.variants.list: list of variants to keep for association analysis; wgs.sv.assoc.txt: single variant association results 
    
   
  
    
   
  - 
 
  
    EGAD00010001187 
   
  
    
    This data set includes the following summary level data file used for the exome chip analysis: exome_chip.sv.assoc.txt: results from single variant association analysis in exome chip 
    
   
  
    
   
  - 
 
  
    EGAD00010001188 
   
  
    
    This data set includes the following summary level data files used for the 13k analysis of T2D-GENES data: wes.variants.list: list of variants to keep for any analysis of the exomes data; wes.assoc.samples.list: list of samples to keep for association analysis; wes.assoc.variants.list: list of variants to keep for association analysis; wes.sv.assoc.txt: single variant association analysis results; wes.gene.ptv.variants.list.txt: list of protein truncating variants to use in gene-level analysis; wes.gene.ptv.assoc.txt: results from gene-level tests of protein truncating variants; wes.gene.nsstrict.variants.list.txt: list of NSstrict variants to use in gene-level analysis; wes.gene.nsstrict.assoc.txt: results from gene-level tests of NSstrict variants; wes.gene.nsbroad.variants.list.txt: list of NSbroad variants to use in gene-level analysis; wes.gene.nsbroad.assoc.txt: results from gene-level tests of NSbroad variants; wes.gene.ns.variants.list.txt: list of non synonymous variants to use in gene-level analysis; wes.gene.ns.assoc.txt: results from gene-level tests of non synonymous variants 
    
   
  
    
   
  - 
 
  
    EGAD00010001192 
   
  
    
    Germline genotype data on 56,479 ovarian cancer cases and controls 
    
   
  
    
      
      Illumina OncoArray 
      
    
   
  56479 
 
  
    EGAD00010001196 
   
  
    
    Raw Array data from the CPCGene BRCA study 
    
   
  
    
      
      Affymetrix OncoScan FFPE Express 
      
    
   
  48 
 
  
    EGAD00010001198 
   
  
    
    Case control samples using Infinium Omni2.5 
    
   
  
    
      
      Infinium Omni2.5M 
      
    
   
  274 
 
  
    EGAD00010001200 
   
  
    
    Genotyping data from Indonesian sea nomad and surrounding populations 
    
   
  
    
      
      Illumina Omni 5 
      
    
   
  105 
 
  
    EGAD00010001202 
   
  
    
    Human genotyping data for patients infected by hepatitis C virus 
    
   
  
    
      
      Affymetrix UKBiobank Array 
      
    
   
  563 
 
  
    EGAD00010001204 
   
  
    
    MacTel Project consortium case and control genotypes from Illumina Omni5 chip 
    
   
  
    
      
      Illumina Omni5 
      
    
   
  1 
 
  
    EGAD00010001209 
   
  
    
    Genome-wide SNP genotyping data for 1,235 western Africans by Illumina HumanOmniExpress-12 array, used in the EGAS00001002078 study 
    
   
  
    
      
      Illumina HumanOmniExpress-12 
      
    
   
  1235 
 
  
    EGAD00010001211 
   
  
    
    Inverse variance weighted fixed effect meta-analysis of three European GWAS studies of the offspring of Pre-eclampsia affected births (2658 Cases and 308267 Controls). 
    
   
  
    
   
  - 
 
  
    EGAD00010001212 
   
  
    
    Genetic studies of pregnancy-related cardiometabolic disorders in Central Asian, Northern European, and Colombian populations 
    
   
  
    
      
      Illumina HumanOmniExpress-12v1_J 
      
    
   
  - 
 
  
    EGAD00010001216 
   
  
    
    Melanoma cell lines CNV by SNP6 
    
   
  
    
      
      SNP6 
      
    
   
  22 
 
  
    EGAD00010001218 
   
  
    
    Raw Array data from the CPCGene 200PG study 
    
   
  
    
      
      Affymetrix OncoScan FFPE Express 
      
    
   
  502 
 
  
    EGAD00010001221 
   
  
    
    Illumina Omni 2.5M SNPchip data (build37) of Ethiopian samples from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 
    
   
  
    
      
      Illumina HumanOmni2-5_8v1_A 
      
    
   
  124 
 
  
    EGAD00010001223 
   
  
    
    Illumina Omni 2.5M SNPchip data (build37) of Egyptian samples from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019) 
    
   
  
    
      
      Illumina HumanOmni2-5M-8v1-1_B 
      
    
   
  100 
 
  
    EGAD00010001228 
   
  
    
    Primary and PDX SqCC samples using Infinium OmniExpress-24 
    
   
  
    
      
      Infinium_OmniExpress-24v1.0 
      
    
   
  24 
 
  
    EGAD00010001232 
   
  
    
    CN/LOH-profile of Translocation-negative FL_8 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001233 
   
  
    
    CN/LOH-profile of Translocation-negative FL_5 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001234 
   
  
    
    CN/LOH-profile of Translocation-negative FL_9 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001235 
   
  
    
    CN/LOH-profile of Translocation-negative FL_11 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001236 
   
  
    
    CN/LOH-profile of Translocation-negative FL_4 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001237 
   
  
    
    CN/LOH-profile of Translocation-negative FL_10 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001238 
   
  
    
    CN/LOH-profile of Translocation-negative FL_2 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001239 
   
  
    
    CN/LOH-profile of Translocation-negative FL_6 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001240 
   
  
    
    CN/LOH-profile of Translocation-negative FL_1 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001241 
   
  
    
    CN/LOH-profile of Translocation-negative FL_7 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  1 
 
  
    EGAD00010001243 
   
  
    
    UK TGCT control samples using the Infinium 1.2M array 
    
   
  
    
      
      Illumina Infinium 1.2M array 
      
    
   
  4946 
 
  
    EGAD00010001246 
   
  
    
    UK TGCT control samples using the Infinium OncoArray-500K BeadChip 
    
   
  
    
      
      Infinium OncoArray-500K BeadChip 
      
    
   
  7422 
 
  
    EGAD00010001247 
   
  
    
    UK TGCT case samples using the Infinium OncoArray-500K BeadChip 
    
   
  
    
      
      Infinium OncoArray-500K BeadChip 
      
    
   
  3206 
 
  
    EGAD00010001249 
   
  
    
    TGCT - GWAS loci Hi-C data 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  1 
 
  
    EGAD00010001251 
   
  
    
    Epigenome of 36 rainforest hunter-gathering Baka of Cameroon by Illumina HumanMethylation450 array, used in the EGAS00001002226 study 
    
   
  
    
      
      Illumina HumanMethylation450 
      
    
   
  38 
 
  
    EGAD00010001253 
   
  
    
    Affymetrix SNP 6.0 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  245 
 
  
    EGAD00010001255 
   
  
    
    Autosomal STR genotypes using 15 Identifiler loci 
    
   
  
    
      
      Applied Biosystems 
      
    
   
  990 
 
  
    EGAD00010001258 
   
  
    
    Pilot study on the interplay between genetic, epigenetic, and environmental risk factors for obesity and related cardiometabolic diseases with 973 samples from South Africa genotyped on Illumina Human MetaboChip array. 
    
   
  
    
      
      Human Cardio Metabochip 
      
    
   
  973 
 
  
    EGAD00010001260 
   
  
    
    DNAm Case samples using Illumina Infinium 450K 
    
   
  
    
      
      Illumina 450K array 
      
    
   
  33 
 
  
    EGAD00010001261 
   
  
    
    DNAm Case samples using Illumina Infinium 450K 
    
   
  
    
      
      Illumina 450K array 
      
    
   
  33 
 
  
    EGAD00010001262 
   
  
    
    DNAm Case samples using Illumina Infinium 450K 
    
   
  
    
      
      Illumina 450K array 
      
    
   
  32 
 
  
    EGAD00010001265 
   
  
    
    original population (oMSC) and highly migrative subpopulation (sMSC) of murine eGFP+ bone marrow MSC 
    
   
  
    
      
      Affymetrix Mouse Gene ST 2.0 
      
    
   
  6 
 
  
    EGAD00010001273 
   
  
    
    Affymetrix GeneChip® Human Transcriptome Array 2.0 
    
   
  
    
      
      Affymetrix GeneChip® Human Transcriptome Array 2.0 
      
    
   
  34 
 
  
    EGAD00010001274 
   
  
    
    Expression profiling by Nanostring cancer immune 
    
   
  
    
      
      Nanostring Cancer Immune 
      
    
   
  30 
 
  
    EGAD00010001275 
   
  
    
    Affymetrix GeneChip® Human Transcriptome Array 2.0 
    
   
  
    
      
      Affymetrix GeneChip® Human Transcriptome Array 2.0 
      
    
   
  34 
 
  
    EGAD00010001276 
   
  
    
    Expression profiling by Nanostring cancer pathway 
    
   
  
    
      
      Nanostring cancer pathway 
      
    
   
  30 
 
  
    EGAD00010001278 
   
  
    
    ATRX SNP6 data on Affymetrix 600k 
    
   
  
    
      
      Affymetrix 600K 
      
    
   
  - 
 
  
    EGAD00010001280 
   
  
    
    Transcriptome array dataset 
    
   
  
    
      
      Affymetrix HG_U133_+2 
      
    
   
  25 
 
  
    EGAD00010001281 
   
  
    
    SNP array dataset 
    
   
  
    
      
      HUMANOMNIEXPRESS 
      
    
   
  50 
 
  
    EGAD00010001283 
   
  
    
    Illumina HumanOmni5-Quad BeadChips 
    
   
  
    
      
      Illumina 
      
    
   
  229 
 
  
    EGAD00010001285 
   
  
    
    Genotyping of knee osteoarthritis patients who have undergone total joint replacement 
    
   
  
    
      
      Illumina InfiniumCoreExome-24v1-1_A 
      
    
   
  17 
 
  
    EGAD00010001287 
   
  
    
    Array methylation profiling of knee osteoarthritis patients who have undergone total joint replacement 
    
   
  
    
      
      Illumina HumanMethylation450K 
      
    
   
  68 
 
  
    EGAD00010001289 
   
  
    
    Resolving the Genetic Architecture of Aseptic Loosening After Total Hip Replacement 
    
   
  
    
      
      Illumina InfiniumCoreExome-24v1-1_A 
      
    
   
  2880 
 
  
    EGAD00010001291 
   
  
    
    Methylation profiling of hip osteoarthritis patients who have undergone total joint replacement 
    
   
  
    
      
      Illumina HumanMethylation450K 
      
    
   
  27 
 
  
    EGAD00010001292 
   
  
    
    Genotyping of hip osteoarthritis patients who have undergone total joint replacement 
    
   
  
    
      
      Illumina InfiniumCoreExome-24v1-1_A 
      
    
   
  9 
 
  
    EGAD00010001294 
   
  
    
    Methylation data using 450K 
    
   
  
    
      
      Illumina 450k 
      
    
   
  1128 
 
  
    EGAD00010001296 
   
  
    
    DNA methylation analysis from primary human JMML and normal blood samples using 450K 
    
   
  
    
      
      Illumina_450K 
      
    
   
  - 
 
  
    EGAD00010001298 
   
  
    
    primary human ACC and normal samples using 450K 
    
   
  
    
      
      Illumina_450K 
      
    
   
  110 
 
  
    EGAD00010001300 
   
  
    
    Medulloblastoma expression profiling 
    
   
  
    
      
      Affymetrix expression array 
      
    
   
  146 
 
  
    EGAD00010001301 
   
  
    
    Medulloblastoma expression profiling 
    
   
  
    
      
      Affymetrix expression array 
      
    
   
  246 
 
  
    EGAD00010001304 
   
  
    
    Genotyping data from Comorian individuals 
    
   
  
    
      
      Illumina Human Omni5 Bead Chip 
      
    
   
  49 
 
  
    EGAD00010001307 
   
  
    
    iOmics gene expression data using Expression Array 
    
   
  
    
      
      Affymetrix Human Gene 1.0 ST Array 
      
    
   
  269 
 
  
    EGAD00010001308 
   
  
    
    iOmics miRNA data via qPCR quantification 
    
   
  
    
      
      patented mSMRT-qPCR miRNA assay (MIRXES) 
      
    
   
  351 
 
  
    EGAD00010001309 
   
  
    
    iOmics genomic data using 2.5M and Exome array 
    
   
  
    
      
      Illumina 2.5M and Illumina Exome array 
      
    
   
  323 
 
  
    EGAD00010001310 
   
  
    
    iOmics lipid data via mass spectrometry (MS) 
    
   
  
    
      
      Agilent 1200 LC system 
      
    
   
  359 
 
  
    EGAD00010001315 
   
  
    
    Single cell transcriptomics of PBMCs of 47 donors from the Lifelines Deep cohort (general population, Northern part of the Netherlands). Cells of five or six different donors were pooled together in one sample pool, resulting in eight different sample pools. In total, 28,855 cells were captured and their transcriptomes were sequenced to an average depth of 74k. Genotype data was available for each donor, allowing us to use the Demuxlet method, which uses variable SNPs between the pooled individuals to determine which cell belongs to which individual. Because genotype information was lacking for 2 individuals, the transcriptomes of only 45 individuals could be retrieved. 
    
   
  
    
      
      Illumina HiSeq4000 
      
    
   
  45 
 
  
    EGAD00010001319 
   
  
    
    Medulloblastoma methylation profiling 
    
   
  
    
      
      Illumina Infinium HumanMethylation450 BeadChip 
      
    
   
  345 
 
  
    EGAD00010001323 
   
  
    
    Medulloblastoma methylation profiling 
    
   
  
    
      
      Illumina Infinium HumanMethylation450 BeadChip 
      
    
   
  911 
 
  
    EGAD00010001326 
   
  
    
    Papuan Genotyping 
    
   
  
    
      
      Illumina Multi-EthnicGlobal_A1 
      
    
   
  380 
 
  
    EGAD00010001328 
   
  
    
    HipSci - Healthy Normals - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  - 
 
  
    EGAD00010001330 
   
  
    
    HipSci - Healthy Normals - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001332 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001334 
   
  
    
    HipSci - Monogenic Diabetes - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001340 
   
  
    
    HipSci - Bardet-Biedl Syndrome - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001342 
   
  
    
    HipSci - Monogenic Diabetes - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001344 
   
  
    
    HipSci - Hereditary Cerebellar Ataxias - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001346 
   
  
    
    HipSci - Hereditary Spastic Paraplegia - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001348 
   
  
    
    HipSci - Kabuki Syndrome - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001350 
   
  
    
    HipSci - Usher Syndrome - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001352 
   
  
    
    HipSci - Alport Syndrome - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001354 
   
  
    
    HipSci - Congenital Hyperinsulinia - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001356 
   
  
    
    HipSci - Hypertrophic Cardiomyopathy - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001358 
   
  
    
    HipSci - Primary Immune Deficiency - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  23 
 
  
    EGAD00010001360 
   
  
    
    HipSci - Bleeding and Platelet Disorders - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001362 
   
  
    
    HipSci - Macular Dystrophy - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001364 
   
  
    
    HipSci - Retinitis Pigmentosa - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001366 
   
  
    
    HipSci - Battens Disease - Genotyping Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001368 
   
  
    
    HipSci - Hereditary Cerebellar Ataxias - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001370 
   
  
    
    HipSci - Hereditary Spastic Paraplegia - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001372 
   
  
    
    HipSci - Kabuki Syndrome - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001374 
   
  
    
    HipSci - Usher Syndrome - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001376 
   
  
    
    HipSci - Alport Syndrome - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001378 
   
  
    
    HipSci - Congenital Hyperinsulinia - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001380 
   
  
    
    HipSci - Hypertrophic Cardiomyopathy - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001382 
   
  
    
    HipSci - Primary Immune Deficiency - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001384 
   
  
    
    HipSci - Bleeding and Platelet Disorders - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001386 
   
  
    
    HipSci - Macular Dystrophy - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001388 
   
  
    
    HipSci - Retinitis Pigmentosa - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001390 
   
  
    
    HipSci - Battens Disease - Expression Array - July 2017 
    
   
  
    
      
      Illumina 
      
    
   
  1 
 
  
    EGAD00010001392 
   
  
    
    Genotyping data from Swahili individuals 
    
   
  
    
      
      Illumina Human Omni5 Bead Chip 
      
    
   
  91 
 
  
    EGAD00010001395 
   
  
    
    A replication cohort consisting of 1428 adult survivors of any non-ALL pediatric cancer 
    
   
  
    
      
      Genome-Wide Human SNP Array 6.0 - Thermo Fisher Scientific 
      
    
   
  1428 
 
  
    EGAD00010001396 
   
  
    
    A discovery cohort of 856 adult survivors of pediatric ALL 
    
   
  
    
      
      Genome-Wide Human SNP Array 6.0 - Thermo Fisher Scientific 
      
    
   
  856 
 
  
    EGAD00010001400 
   
  
    
    Difference in gene expression values between case and control, log2 values. Blood transcriptome from women participating in the Norwegian Women and Cancer study (NOWAC) Post-genome Cohort taken up to eight years before breast cancer diagnosis. Illumina HumanWG-6 version 3 or Illumina HumanHT-12 expression bead chip, combined on identical nucleotide universal identifiers. 
    
   
  
    
      
      Illumina HumanWG-6 
      
    
   
  467 
 
  
    EGAD00010001403 
   
  
    
    Gene expression read counts 
    
   
  
    
      
      Illumina HiSeq2000 
      
    
   
  132 
 
  
    EGAD00010001406 
   
  
    
    Breast cancer tissue and controls 
    
   
  
    
      
      Exiqon 7th generation miRCURY LNA microRNA microarray system 
      
    
   
  149 
 
  
    EGAD00010001408 
   
  
    
    Illumina Infinium 450K array data 
    
   
  
    
      
      Illumina 450K 
      
    
   
  34 
 
  
    EGAD00010001410 
   
  
    
    Genotyped samples using Illumina Infinium HumanCoreExome Beadchip 
    
   
  
    
      
      Illumina Infinium HumanCoreExome Beadchip 
      
    
   
  502 
 
  
    EGAD00010001412 
   
  
    
    Blood transcriptome from women participating in the Norwegian Women and Cancer study (NOWAC) 
    
   
  
    
      
      Illumina HumanWG-6 version 3 or Illumina HumanHT-12 expression bead chip, combined on identical nucleotide universal identifiers. 
      
    
   
  920 
 
  
    EGAD00010001414 
   
  
    
    Raw Array data from the PRAD-CA for ICGC DCC Release26 
    
   
  
    
      
      Affymetrix OncoScan FFPE Express 
      
    
   
  - 
 
  
    EGAD00010001416 
   
  
    
    BBMRI - BIOS project - Freeze 2 - methylation 
    
   
  
    
      
      Illumina Human Methylation 450k BeadChip 
      
    
   
  4386 
 
  
    EGAD00010001418 
   
  
    
    HumanOmni25M-8v1-1 
    
   
  
    
      
      Illumina 
      
    
   
  24 
 
  
    EGAD00010001420 
   
  
    
    Read counts determined using HTSeq-count for the BBMRI BIOS Freeze 2 RNAseq data 
    
   
  
    
      
      RNAseq 
      
    
   
  3560 
 
  
    EGAD00010001422 
   
  
    
    1000G Phase 3 Imputed cases and controls from NSAID-induced PUD study 
    
   
  
    
      
      Illumina Omni 2.5 
      
    
   
  676 
 
  
    EGAD00010001424 
   
  
    
    Codelink Human Whole Genome from Blood taken at 72 hours after birth (11 cases) 
    
   
  
    
      
      Codelink Human Whole Genome Bioarray 
      
    
   
  11 
 
  
    EGAD00010001425 
   
  
    
    Codelink Human Whole Genome from Blood taken at 72 hours after birth (9 controls) 
    
   
  
    
      
      Codelink Human Whole Genome Bioarray 
      
    
   
  9 
 
  
    EGAD00010001427 
   
  
    
    Cardio-Metabochip genotypes for B99 cohort 
    
   
  
    
      
      Illumina 
      
    
   
  1336 
 
  
    EGAD00010001428 
   
  
    
    Cardio-Metabochip genotypes for IHIT cohort 
    
   
  
    
      
      Illumina 
      
    
   
  2791 
 
  
    EGAD00010001430 
   
  
    
    Gene expression analysis from primary human JMML samples using Illumina Human HT-12 v4 
    
   
  
    
      
      Illumina_HumanHT-12_V4 
      
    
   
  15 
 
  
    EGAD00010001433 
   
  
    
    ATRT methylation 
    
   
  
    
      
      Illumina HumanMethylation450 BeadChip 
      
    
   
  162 
 
  
    EGAD00010001443 
   
  
    
    SNP array 
    
   
  
    
      
      Affymetrix SNP6.0 
      
    
   
  154 
 
  
    EGAD00010001447 
   
  
    
    Array-based association data 
    
   
  
    
      
      Illumina Omni Express/Illumina Core Exome 
      
    
   
  784 
 
  
    EGAD00010001449 
   
  
    
    Methylation Control samples using 450K Array 
    
   
  
    
      
      Illumina_450K 
      
    
   
  22 
 
  
    EGAD00010001450 
   
  
    
    Methylation JMML samples using 450K Array 
    
   
  
    
      
      Illumina_450K 
      
    
   
  92 
 
  
    EGAD00010001452 
   
  
    
    Genome-wide SNP genotyping data for 102 Pakistani individuals by Illumina HumanOmni2.5-8 array, used in the EGAS00001002558 study 
    
   
  
    
      
      Illumina HumanOmni2.5-8 
      
    
   
  102 
 
  
    EGAD00010001455 
   
  
    
    Illumina 450K 
    
   
  
    
      
      450K 
      
    
   
  1347 
 
  
    EGAD00010001457 
   
  
    
    These are the log2CPM (log2 counts per million) fragments per gene counts associated with the BAM files in EGAD00001003806, in tab separated format. Counts for 36 postmortem brain samples from 9 non-demented control subjects and 9 Hereditary cerebral hemorrhage with amyloidosis-Dutch type subjects are included (1 Frontal cortex sample and 1 Occipital cortex sample per subject). RNA samples were depleted for ribosomal RNA with the Ribo Zero Gold Human kit (Illumina) and strand specific RNA-Seq libraries were generated. Paired-end sequencing was performed on a HiSeq2500 Illumina system (2x50bp reads). Alignments were performed using GSNAP v2014-12-23 with setting "--npaths 1" on GRCh38 reference genome without the alternative contigs. Fragment per gene counting was performed using HTSeq-count v0.6.1p1 with setting "--stranded reverse". The gene annotation used for quantification was the UCSC RefSeq gene set for GRCh38 downloaded on 2015-07-13. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  36 
 
  
    EGAD00010001461 
   
  
    
    Illumina 450K 
    
   
  
    
      
      450K 
      
    
   
  472 
 
  
    EGAD00010001463 
   
  
    
    Genotype cases using Illumina HumanOmni5 
    
   
  
    
      
      Illumina HumanOmni5 
      
    
   
  279 
 
  
    EGAD00010001470 
   
  
    
    Himalayan population genetic study raw data (Himalaya) 
    
   
  
    
      
      Illumina HumanOmniExpress-12-v1-0 
      
    
   
  170 
 
  
    EGAD00010001471 
   
  
    
    Himalayan population genetic study QC filtered data 
    
   
  
    
      
      Illumina HumanOmni1-Quad_v1-0, HumanOmniExpress-12-v1-0, humanomniexpress-24-v1-1, HumanOmni25-8v1-2_A1 
      
    
   
  738 
 
  
    EGAD00010001472 
   
  
    
    Himalayan population genetic study raw data (Himalaya) 
    
   
  
    
      
      Illumina HumanOmni1-Quad_v1-0 
      
    
   
  565 
 
  
    EGAD00010001473 
   
  
    
    Himalayan population genetic study raw data (Tibet) 
    
   
  
    
      
      Illumina humanomniexpress-24-v1-1 
      
    
   
  148 
 
  
    EGAD00010001479 
   
  
    
    SNP data for 991 Irish individuals 
    
   
  
    
      
      Illumina 
      
    
   
  991 
 
  
    EGAD00010001481 
   
  
    
    CONTROL SAMPLES USING QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA) 
    
   
  
    
      
      OpenArray 
      
    
   
  258 
 
  
    EGAD00010001482 
   
  
    
    CASE SAMPLES USING QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA) 
    
   
  
    
      
      OpenArray 
      
    
   
  657 
 
  
    EGAD00010001484 
   
  
    
    Genetic Overlap between Metabolic and Psychiatric disease 
    
   
  
    
      
      Illumina HumanCoreExome-12v1-0 
      
    
   
  2611 
 
  
    EGAD00010001486 
   
  
    
    290 controls 
    
   
  
    
      
      Illumina HumanOmniExpress BeadChip 
      
    
   
  290 
 
  
    EGAD00010001487 
   
  
    
    252 dengue fever patients and 159 dengue shock syndrome patients 
    
   
  
    
      
      Illumina Human 660W Quad BeadChip 
      
    
   
  411 
 
  
    EGAD00010001489 
   
  
    
    Genotype data for 5,699,237 genotyped and imputed SNPs in the 816 healthy donors of the Milieu Intérieur cohort 
    
   
  
    
   
  816 
 
  
    EGAD00010001491 
   
  
    
    ADP array data, comprising 2217 samples of Asian ancestry (excluding the Japanese population from ADP). Samples were genotyped on different Illumina or Affymetrix platforms. 
    
   
  
    
      
      Affymetrix/Illumina 
      
    
   
  3933 
 
  
    EGAD00010001495 
   
  
    
    Intensity files for Immunochip genotypes from blood 
    
   
  
    
      
      Illumina Immunochip 
      
    
   
  314 
 
  
    EGAD00010001499 
   
  
    
    EXOME ARRAY ANALYSIS OF ADVERSE REACTIONS TO FLUOROPYRIMIDINE-BASED THERAPY FOR GASTROINTESTINAL CANCER 
    
   
  
    
      
      Illumina HumanExome Array 
      
    
   
  504 
 
  
    EGAD00010001500 
   
  
    
    miRNA profiling of human plucked hair follicle from frontal and occipital scalp 
    
   
  
    
      
      Affymetrix miRNA 4.0 Array 
      
    
   
  48 
 
  
    EGAD00010001501 
   
  
    
    mRNA profiling of human plucked hair follicle from frontal and occipital scalp 
    
   
  
    
      
      Illumina HT12 
      
    
   
  48 
 
  
    EGAD00010001506 
   
  
    
    Methylation array dataset 
    
   
  
    
      
      Illumina 450k 
      
    
   
  38 
 
  
    EGAD00010001509 
   
  
    
    A WTCCC2 project - replication study for bacteraemia susceptibility in 2518 individuals from Kenya, genotyped on the Illumina Immunochip chip. 
    
   
  
    
      
      Illumina Infinium ImmunoChip 
      
    
   
  2518 
 
  
    EGAD00010001511 
   
  
    
    SNP 6.0 arrays of LCNEC samples 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  54 
 
  
    EGAD00010001513 
   
  
    
    Copy Number Variation as determined on Illumina Omni Arrays 
    
   
  
    
      
      Illumina Beadchip 
      
    
   
  122 
 
  
    EGAD00010001515 
   
  
    
    Nanostring PanCancer immune profiling data for The interface of malignant and immunologic clonal dynamics in high-grade serous ovarian cancer 
    
   
  
    
      
      Nanostring 
      
    
   
  120 
 
  
    EGAD00010001519 
   
  
    
    Raw Array data from the PRAD-CA for ICGC DCC Release27 
    
   
  
    
      
      Affymetrix OncoScan FFPE Express 
      
    
   
  110 
 
  
    EGAD00010001521 
   
  
    
    Bisulfite Converted DNA obtained from Whole Blood analysed on IlluminaHumanMethylationEPIC BeadChip microarrays processed with bigmelon R package 
    
   
  
    
      
      IlluminaHumanMethylationEPIC 
      
    
   
  1175 
 
  
    EGAD00010001526 
   
  
    
    DNA for 2482 individuals from Chongqing was extracted from peripheral blood and genotyped by Illumina Omni Zhonghua-8 version 2 gene chips. 
    
   
  
    
      
      Illumina 
      
    
   
  2482 
 
  
    EGAD00010001527 
   
  
    
    DNA for 1546 individuals from Chongqing was extracted from peripheral blood and genotyped by Illumina Omni Zhonghua-8 version 1 gene chips. 
    
   
  
    
      
      Illumina 
      
    
   
  1546 
 
  
    EGAD00010001528 
   
  
    
    DNA for 2979 individuals from Guangzhou was extracted from peripheral blood and genotyped by Sequenom, with digit-number working memory, visuospatial working memory, and recent long-term memory measured. 
    
   
  
    
      
      Sequenom 
      
    
   
  2979 
 
  
    EGAD00010001533 
   
  
    
    A cohort of 2886 participants of the Japan PBC-GWAS Study 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide ASI 1 Array 
      
    
   
  2886 
 
  
    EGAD00010001535 
   
  
    
    mRNA expression profile of kidney cancer 
    
   
  
    
      
      nanostring 
      
    
   
  126 
 
  
    EGAD00010001536 
   
  
    
    kidney cancer tissue sample 
    
   
  
    
      
      Illumina CytoSNP 12 bead array 
      
    
   
  129 
 
  
    EGAD00010001538 
   
  
    
    502 genotypes obtained from Illumina DNA-arrays. Available as plink formatted files 
    
   
  
    
   
  502 
 
  
    EGAD00010001540 
   
  
    
    Oncoscan CHP files for the Mesothelioma Project 
    
   
  
    
      
      Illumina Oncoscan Array 
      
    
   
  100 
 
  
    EGAD00010001542 
   
  
    
    Expression data for 42 PMBCL patient samples (32 IL4R WT cases and 10 cases with mutations in IL4R) 
    
   
  
    
      
      Illumina DASL Assay 
      
    
   
  42 
 
  
    EGAD00010001544 
   
  
    
    Imputed genetic data for INTERVAL proteomics cohort 
    
   
  
    
      
      Affymetrix Axiom UK Biobank + imputation to 1000GP3 and UK10K 
      
    
   
  3301 
 
  
    EGAD00010001546 
   
  
    
    ATRT expression 
    
   
  
    
      
      Illumina HumanHT-12 v4.0 Array 
      
    
   
  43 
 
  
    EGAD00010001551 
   
  
    
    The Kibbutzim Family Study (KFS) aimed to investigate the environmental and genetic determinants of cardiometabolic traits (phenotype is LDL-C) 
    
   
  
    
      
      Illumina HumanCoreExome BeadChip array 
      
    
   
  901 
 
  
    EGAD00010001557 
   
  
    
    503 genotypes obtained from Illumina DNA-arrays. Available as plink formatted files 
    
   
  
    
      
      Illumina arrays 
      
    
   
  503 
 
  
    EGAD00010001561 
   
  
    
    Quantile-normalised and batch corrected 
    
   
  
    
      
      Illumina HT12.4 
      
    
   
  703 
 
  
    EGAD00010001562 
   
  
    
    WG mRNA profiling in FFPE primary melanoma 
    
   
  
    
      
      Illumina HT12.4 
      
    
   
  703 
 
  
    EGAD00010001564 
   
  
    
    Primary renal cell carcinoma (RCC) by Affymetrix GeneChip miRNA 4.0 
    
   
  
    
      
      Affymetrix GeneChip miRNA 4.0 
      
    
   
  56 
 
  
    EGAD00010001566 
   
  
    
    Allelic imbalance data for cell lines derived from RPE1 with TP53 knockout 
    
   
  
    
      
      humanomniexpress-24-v1-1-a 
      
    
   
  2 
 
  
    EGAD00010001569 
   
  
    
    Summary statistics from Stage-1 GWAS for blood pressure phenotypes 
    
   
  
    
   
  5 
 
  
    EGAD00010001571 
   
  
    
    Genomic Landscape of Chordoid Glioma 
    
   
  
    
      
      Illumina HumanCoreExome-24 array 
      
    
   
  9 
 
  
    EGAD00010001573 
   
  
    
    Variations on the Y chromosome from 44 samples 
    
   
  
    
   
  44 
 
  
    EGAD00010001574 
   
  
    
    Alignment including 83 MT AA sequences and 2 reference sequences, rCRS and RSRS 
    
   
  
    
   
  83 
 
  
    EGAD00010001575 
   
  
    
    This dataset contains the per-chromosome RFMix input and output files for the local ancestry inference of 59 Aboriginal Australian genomes as reported in Malaspinas et al., 2016. Local ancestry was inferred assuming four mixing ancestral populations represented by: Europeans (27 individuals), Asians (29 individuals), Papuans (13 individuals) and Native Australians (7 individuals from the WCD region). 
    
   
  
    
   
  66 
 
  
    EGAD00010001577 
   
  
    
    RNA from the same tumor sample (n=98) was also processed using the 3' IVT kit (Affymetrix) and hybridized to U133 Plus 2.0 arrays (Affymetrix). 
    
   
  
    
      
      Affymetrix GeneChip Scanner 3000 7g 
      
    
   
  98 
 
  
    EGAD00010001579 
   
  
    
    This dataset contains files generated from Affymetrix Oncoscan Arrays. For each sample there are two paired cel files containing the raw data from AT and GT channels. Raw data has been transformed to OSCHP signal files also within this dataset. 
    
   
  
    
      
      Oncoscan Array 
      
    
   
  157 
 
  
    EGAD00010001581 
   
  
    
    Copy Number Alterations arrays from 21 patients and 24 samples performed by Affymetrix 6.0, Agilent 1M and Oncoscan CNV platforms 
    
   
  
    
      
      Affymetrix 6.0; Agilent 1M; Oncoscan CNV 
      
    
   
  24 
 
  
    EGAD00010001582 
   
  
    
    Gene Expression Profiling from 21 cases: 14 CCND1-negative Mantle Cell Lymphoma and 7 CCND1-positive Mantle Cell Lymphoma 
    
   
  
    
      
      Genechip Human Genome U133 Plus 2.0 array 
      
    
   
  21 
 
  
    EGAD00010001584 
   
  
    
    The CentralAfricanCMC_Pemberton dataset encompasses 153,798 SNPs from the Illumina Cardio-MetaboChip (Voight et al. 2012) genotyped in 406 individuals from 19 Central African populations from Gabon, Cameroon, the Central African Republic and Uganda. Phenotypic and cultural information at the individual level for this data set encompasses gender, lifestyle (hunter-gatherer or farmer), and, when available, stature phenotype (standing height in cm, sitting height in cm, and weight in kg). Other cultural, linguistic, and geographical location information about the sampled populations can be found in Pemberton et al., Human Genetics, 2018 (https://doi.org/10.1007/s00439-018-1902-3). This dataset can only be accessed and used for non-commercial research purposes that comply with the informed consent provided by Central African donors, for the study of human evolutionary history only. 
    
   
  
    
      
      Illumina Cardio-MetaboChip 
      
    
   
  406 
 
  
    EGAD00010001586 
   
  
    
    This data set contains an .Rdata file for all the processed segmentation profiles from 81 lpWGS samples used in downstream analyses from the GitHub repository Evo_history_CACRC. Lastly, there is an .Rdata object with 50 segmentations for 50 total samples from 25 sporadic SNP adenomas used in the comparison with colitis samples. 
    
   
  
    
      
      Low Pass Whole Genome Sequencing (LP-WGS) 
      
    
   
  131 
 
  
    EGAD00010001587 
   
  
    
    This dataset contains 30 idat files each from 15 SNP array runs on patient colitis-associated colorectal cancer tumours. All phenotypes are cancer. See Baker et al. 2018 Supplementary Table 2 for patient details of 12 tumours used in the analyses in the publication. 
    
   
  
    
      
      SNP array 
      
    
   
  15 
 
  
    EGAD00010001589 
   
  
    
    Primary renal cell carcinoma (RCC) and RCC metastases by Affymetrix GeneChip HTA 2.0 
    
   
  
    
      
      Affymetrix Human Transcriptome Array 2.0 
      
    
   
  112 
 
  
    EGAD00010001591 
   
  
    
    SNPtest association statistics from case-control analysis (includes imputed SNPs), namely: rsID, Chromosome, Position, Beta, SE. 
    
   
  
    
      
      Illumina_OncoArray-500K Bead Array 
      
    
   
  8169 
 
  
    EGAD00010001593 
   
  
    
    This dataset includes raw data (.idat) for the Illumina Human450k beadchip and methylation levels (.txt files). Methylation levels were processed with normalization and background subtraction. We removed probes with at least one of the following characteristics: (1) weak signal (p > 0.01) (2128 CpG sites), (2) SNP-enriched sites (4100 sites), (3) out of a CpG context (not on a CG) (3149 sites), or (4) located on sex chromosomes (11,129 sites). A total of 465,071 CpG sites were analyzed initially. Signal was then normalized, first by scaling to the internal controls using the methylumi R package, then by applying the method of subset-quantile within array normalization (SWAN) implemented in the minfi R package. 
    
   
  
    
      
      Illumina 450K 
      
    
   
  167 
 
  
    EGAD00010001594 
   
  
    
   
  
    
      
      Illumina 450K 
      
    
   
  24 
 
  
    EGAD00010001596 
   
  
    
    DNA methylation data from patient RMS tumor samples from Illumina 450 K arrays 
    
   
  
    
      
      Illumina EPIC 450 K 
      
    
   
  32 
 
  
    EGAD00010001598 
   
  
    
    Batch 1 of unfiltered genotype data for DDD Study patients (N=2,997), some of which were used in the neurodevelopmental disorder discovery GWAS (Niemi et al., Nature 2018). Samples were genotyped on the Illumina HumanCoreExome BeadChip. QC'd data is available in release EGAD00010001604 
    
   
  
    
      
      Illumina HumanCoreExome-24v1-0 
      
    
   
  3000 
 
  
    EGAD00010001600 
   
  
    
    Batch 2 of unfiltered genotype data for DDD Study patients (N=8,286), some of which were used in the neurodevelopmental disorder discovery GWAS (Niemi et al., Nature 2018). Samples were genotyped on the Illumina InfiniumCoreExome Beadchip. QC'd data is available in release EGAD00010001604 
    
   
  
    
      
      Illumina HumanCoreExome-24v1-1 
      
    
   
  8207 
 
  
    EGAD00010001602 
   
  
    
    Unfiltered genotype data for DDD Study trios (patient and parents) (N=2,166 samples), some of which were used for replication of neurodevelopmental disorder polygenic risk (Niemi et al., Nature 2018). Samples were genotyped on the Illumina HumanOmniExpress BeadChip 
    
   
  
    
      
      Illumina SangerDDD_OmniExPlusv1_15019773 
      
    
   
  3822 
 
  
    EGAD00010001604 
   
  
    
    Post-QC (pre-imputation) genotype data for N=6,983 DDD probands included in the neurodevelopmental disorder discovery GWAS (Niemi et al., Nature 2018). Consists of filtered set of samples and variants from EGAD00010001598 and EGAD00010001600. Includes patient HPO phenotype terms and GWAS summary statistics (including imputed variants). Samples were genotyped on the Illumina HumanCoreExome BeadChip and Illumina InfiniumCoreExome Beadchip 
    
   
  
    
      
      Illumina HumanCoreExome-24v1 
      
    
   
  6987 
 
  
    EGAD00010001606 
   
  
    
    Post-QC (pre-imputation) genotype data for N=2,166, a subset of trios described in EGAD00010001602. These data form N=722 complete trios in which the proband has a neurodevelopmental phenotype (Niemi et al. Nature 2018). Includes HPO phenotype terms for patients. Samples were genotyped on the Illumina HumanOmniExpress BeadChip 
    
   
  
    
      
      Illumina SangerDDD_OmniExPlusv1_15019773 
      
      MiSeq 
      
    
   
  2225 
 
  
    EGAD00010001608 
   
  
    
    The T cell Receptor Sequencing dataset contains 84 files related to T cell receptor sequences obtained using ImmunoSeq by Adaptive Biotechnologies and phenotype metadata from 23 patients enrolled on a phase II clinical trial of neoadjuvant immune checkpoint blockade in high-risk resectable melanoma at MD Anderson Cancer Center (NCT02519322). Included are data on baseline and on-treatment samples from tumor and blood. 
    
   
  
    
      
      MiSeq 
      
    
   
  59 
 
  
    EGAD00010001610 
   
  
    
    DNA methylation of NF1-glioma 
    
   
  
    
      
      Illumina 850K Epic Array 
      
    
   
  31 
 
  
    EGAD00010001612 
   
  
    
    MAGEcontrol samples using omni 2.5M 
    
   
  
    
      
      Genotype 
      
    
   
  737 
 
  
    EGAD00010001618 
   
  
    
    Genome-wide DNA methylation profiles of MZ twins clinically discordant for MS generated using Illumina’s Infinium MethylationEPIC BeadChip assay (EPIC array) 
    
   
  
    
      
      Illumina Infinium MethylationEPIC BeadChip assay 
      
    
   
  90 
 
  
    EGAD00010001620 
   
  
    
    Single cell RNA-seq analysis of human skin. 
    
   
  
    
      
      single cell RNA-seq 
      
    
   
  12 
 
  
    EGAD00010001622 
   
  
    
    Human Core Exome Genotyping for 1471 samples from the STudy Into Lean and Thin Subjects (STILTS) cohort 
    
   
  
    
      
      Illumina humancoreexome-12v1-1_a 
      
    
   
  1471 
 
  
    EGAD00010001623 
   
  
    
    Human Core Exome Genotyping for 1456 severe early onset obesity cases (SCOOP) 
    
   
  
    
      
      Illumina humancoreexome-12v1-1_a 
      
    
   
  1456 
 
  
    EGAD00010001624 
   
  
    
    Genetics of thinness compared to obesity - summary statistics 
    
   
  
    
      
      Illumina humancoreexome-12v1-1_a 
      
    
   
  2927 
 
  
    EGAD00010001626 
   
  
    
    mpMRI visible prostate tumour samples (PI-RADSv2 5) 
    
   
  
    
      
      OncoScan 
      
    
   
  20 
 
  
    EGAD00010001627 
   
  
    
    mpMRI invisible prostate tumour samples 
    
   
  
    
      
      OncoScan 
      
    
   
  20 
 
  
    EGAD00010001629 
   
  
    
    Methylation of anaplastic meningioma samples 
    
   
  
    
      
      Illumina Infinium HumanMethylationEPIC BeadChip array 
      
    
   
  26 
 
  
    EGAD00010001631 
   
  
    
    SNP array data of matched cancer-PNE 
    
   
  
    
      
      GeneChip Human Mapping 250K NspI 
      
    
   
  124 
 
  
    EGAD00010001633 
   
  
    
    Genotyping data for 32 individuals from a family affected by HPAH 
    
   
  
    
      
      Illumina Infinium CoreExome-24 BeadChip v1.1 
      
    
   
  32 
 
  
    EGAD00010001635 
   
  
    
    Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array 
    
   
  
    
   
  415 
 
  
    EGAD00010001636 
   
  
    
    Over 2.5 million SNP and CNV loci are screened by Illumina Infinium Omni2.5Exome-8 Kit 
    
   
  
    
   
  196 
 
  
    EGAD00010001637 
   
  
    
    Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array 
    
   
  
    
   
  539 
 
  
    EGAD00010001638 
   
  
    
    Over 2.5 million SNP and CNV loci are screened by Illumina Infinium Omni2.5Exome-8 Kit 
    
   
  
    
   
  262 
 
  
    EGAD00010001640 
   
  
    
    The individuals were genotyped on the Illumina Human Omni Express Bead Chip (OmniExpress), containing 741,000 SNPs. 22 samples with more than 10% missing genotypes were excluded 
    
   
  
    
   
  478 
 
  
    EGAD00010001642 
   
  
    
   
  
    
      
      Illumina EPIC methylation bead array 
      
    
   
  25 
 
  
    EGAD00010001643 
   
  
    
   
  
    
      
      Illumina 450k methylation bead array 
      
    
   
  73 
 
  
    EGAD00010001645 
   
  
    
    Genotype of PTPN22 SNPs in LOTx donors and recipients 
    
   
  
    
   
  290 
 
  
    EGAD00010001647 
   
  
    
    Genotype data from the Affymetrix 6.0 platform for 4,375 Colombian pre-eclampsia cases and controls 
    
   
  
    
      
      Affymetrix 6.0 
      
    
   
  4375 
 
  
    EGAD00010001649 
   
  
    
    This set features unfiltered, aligned, UMI-based single cell RNA sequencing count data for 5290 Blood, intraepithelial ileum (IEL) and lamina propria ileum (LPL) T cells from Crohn's disease patients as published in Uniken Venema et al, Gastroenterology 2019 
    
   
  
    
      
      Smartseq2_adapted 
      
    
   
  3 
 
  
    EGAD00010001651 
   
  
    
    The individuals were assayed for genome-wide SNP genotypes using the Illumina Human Omni5 Bead Chip (Illumina), which surveys 4,284,426 single nucleotide markers regularly spaced across the genome 
    
   
  
    
      
      Illumina Human Omni5 Bead Chip (4,284,426 SNPs) 
      
    
   
  3 
 
  
    EGAD00010001653 
   
  
    
    Binary Plink files for post-GWAS quality control in 7409 samples genotyped using Axiom 815K Spanish Biobank array (Thermo Fisher) 
    
   
  
    
      
      Axiom 815K Spanish Biobank array 
      
    
   
  7409 
 
  
    EGAD00010001654 
   
  
    
    Binary Plink files for pre-GWAS quality control in 7409 samples genotyped using Axiom 815K Spanish Biobank array (Thermo Fisher) 
    
   
  
    
      
      Axiom 815K Spanish Biobank array 
      
    
   
  7409 
 
  
    EGAD00010001655 
   
  
    
    Intensity calculation on the pixel values of the DAT file for 7409 samples genotyped using Axiom 815K Spanish Biobank array (Thermo Fisher) 
    
   
  
    
      
      Axiom 815K Spanish Biobank array 
      
    
   
  7409 
 
  
    EGAD00010001657 
   
  
    
    Meta-analysis association statistics from case-control analysis (includes imputed SNPs) 
    
   
  
    
      
      NA 
      
    
   
  30657 
 
  
    EGAD00010001659 
   
  
    
    Transcriptomics analysis results 
    
   
  
    
      
      Illumina HumanHT-12 v4 
      
    
   
  36 
 
  
    EGAD00010001660 
   
  
    
    Methylomics analysis results formatted as a beta matrix 
    
   
  
    
      
      Illumina HumanMethylation-450k 
      
    
   
  36 
 
  
    EGAD00010001664 
   
  
    
    4988 samples issued from GCAT cohort, genotyped with MEGAex-Infinium Array, with data for Cr1-22. Plink files with QC and imputed (SHAPEIT+IMPUTE). 
    
   
  
    
      
      Illumina-Genotyping Array 
      
    
   
  4988 
 
  
    EGAD00010001665 
   
  
    
    4988 samples issued from GCAT cohort, genotyped with MEGAex-Infinium Array, with data for Cr1-22. Plink files with QC but not imputed. 
    
   
  
    
      
      Illumina-Genotyping Array 
      
    
   
  4988 
 
  
    EGAD00010001667 
   
  
    
    33 ETMR samples were genotyped using Illumina HumanOmni2.5M array. Hybridization and scanning were done according to the manufacturer's instructions (Illumina). Copy number (Log R) and B allele frequency estimates were obtained using the Genotyping module (v1.9.4) in GenomeStudio v2011.1 (Illumina). Normalized and log2-transformed copy number measurements were imported from GenomeStudio and analysed using R package CopyNumber to identify segments with similar copy number. 
    
   
  
    
      
      Illumina Omni2.5 
      
    
   
  33 
 
  
    EGAD00010001669 
   
  
    
    77 ETMR samples were profiled using methylation array. DNA from frozen tissue and formalin-fixed, paraffin-embedded (FFPE) materials were analyzed with the Illumina Infinium HumanMethylation450 (450k) and MethylationEPIC (EPIC) array according to manufacturer’s instructions and with a modified method that was previously described (Torchia et al. Cancer Cell. 2016; Triche et al. Nucleic Acids Res. 2013). 
    
   
  
    
      
      Illumina Infinium HumanMethylation450K 
      
    
   
  77 
 
  
    EGAD00010001671 
   
  
    
    Raw sequencing reads from H3K27ac ChIP and input DNA from lymphoblastoid cells of three TET2 mutation carriers and two wild-type family members were quality and adapter trimmed with cutadapt version 1.16 in Trim Galore version 0.3.7 using default parameters. Trimmed reads were aligned to hs37d5 reference genome using Bowtie2 (version 2.1.0). Duplicate reads were removed with samtools rmdup (v1.7). Fragment coverage of paired-end reads was calculated from bam files with BEDtools genomecov (v2.26.0). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00010001673 
   
  
    
    Genotype array data from normal tissue 
    
   
  
    
      
      Illumina HumanOmni2.5-8 BeadChip 
      
    
   
  21 
 
  
    EGAD00010001674 
   
  
    
    Methylation data from tumor tissue 
    
   
  
    
      
      Agilent SureSelectXT Human Methyl. Seq 
      
    
   
  21 
 
  
    EGAD00010001675 
   
  
    
    Genotype array data from tumor tissue 
    
   
  
    
      
      Illumina HumanOmni2.5-8 BeadChip 
      
    
   
  21 
 
  
    EGAD00010001676 
   
  
    
    Expression array data from tumor tissue 
    
   
  
    
      
      Thermo Fisher Scientific GeneChip Human Transcriptome Array 2.0 
      
    
   
  21 
 
  
    EGAD00010001678 
   
  
    
    Genotype data for 140 present-day individuals from five populations in Pakistan in The first horse herders and the impact of early Bronze Age steppe expansions into Asia DOI: 10.1126/science.aar7711. Sampling details are presented in supplementary section S2.1 Data generation 
    
   
  
    
      
      Infinium OmniExpressExome-8 v.1.3 BeadChip 
      
    
   
  140 
 
  
    EGAD00010001680 
   
  
    
    Illumina Infinium Human 450k methylation arrays 
    
   
  
    
      
      Illumina 
      
    
   
  17 
 
  
    EGAD00010001681 
   
  
    
    Affymetrix PrimeView Human Gene Expression 
    
   
  
    
      
      Affymetrix 
      
    
   
  56 
 
  
    EGAD00010001683 
   
  
    
    Case_control_meta_analysis 
    
   
  
    
      
      Array 
      
    
   
  1 
 
  
    EGAD00010001685 
   
  
    
    EXCEED samples imputed to HRC reference panel using Michigan Imputation server 
    
   
  
    
      
      Axiom UK Biobank array 
      
    
   
  5216 
 
  
    EGAD00010001687 
   
  
    
    Meningioma Methylation 
    
   
  
    
      
      Illumina 
      
    
   
  280 
 
  
    EGAD00010001689 
   
  
    
    Tumor biopsies from LAM disease were retrospectively analyzed by multiple techniques to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations. 
    
   
  
    
      
      Affymetrix OncoScan 
      
    
   
  24 
 
  
    EGAD00010001691 
   
  
    
    4 AC samples, each with adenoma, carcinoma and normal colon tissue (12 samples in total) were analysed on the Infinium MethylationEPIC BeadChip for copy number alteration analyses. 
    
   
  
    
      
      Beadarray 
      
    
   
  12 
 
  
    EGAD00010001695 
   
  
    
    Islet_HumanMethylation450K_ThurnerEtAl 
    
   
  
    
      
      HumanMethylation450K 
      
    
   
  41 
 
  
    EGAD00010001699 
   
  
    
    EXCEED genotyping 
    
   
  
    
      
      Axiom UK Biobank array 
      
    
   
  5216 
 
  
    EGAD00010001701 
   
  
    
    Total RNA (100ng) from 21 ETMRs with C19MC structural alterations and 28 other PBTs was prepared with nCounter miRNA Sample Prep Kit according to standard protocol. miRNA expression profiling was conducted with human v1, v2, or v3 miRNA panel on nCounter miRNA expression platform (NanoString Technologies, Seattle, WA) according to manufacturer’s protocol. Signal normalization was done using nSolver Analysis and batch corrected using ComBat (Johnson et al. Biostatistics. 2007). 565 miRNAs overlapped between all three versions and were used for further analyses. Fold change and supervised t-test with FDR correction were calculated between the ETMRs and other PBTs. 
    
   
  
    
      
      Nanostring 
      
    
   
  49 
 
  
    EGAD00010001703 
   
  
    
    RNAseq reads were aligned with STAR 2.5.3a and gene expression was quantified with RSEM 1.3.0 
    
   
  
    
   
  144 
 
  
    EGAD00010001705 
   
  
    
    ALL SAMPLES USING ClariomD microarray (Affymetrix) 
    
   
  
    
      
      Affymetrix ClariomD 
      
    
   
  54 
 
  
    EGAD00010001707 
   
  
    
    Table of gene-level RNA counts from 21 newborn screening dried blood spot (DBS) samples. These DBS samples were obtained from extremely low gestational age newborns, where 10 of them were affected by a fetal inflammatory response (FIR) before birth, and 11 were unaffected. Total RNA was sequenced using an Illumina NextSeq-500 instrument. The sample preparation protocol included the depletion of rRNA and globin mRNA using the Globin Zero Gold rRNA Removal Kit from Illumina. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs). Rows correspond to genes and columns to samples, where there is an additional column (BS13sub), corresponding to sample BS13, which was downsampled to 1/4 of its original depth. 
    
   
  
    
      
      Illumina NextSeq-500 
      
    
   
  21 
 
  
    EGAD00010001709 
   
  
    
    Gene expression for 303 ADME and ADME related genes (averaged log2 signal intensities using Human-WG6v2 Expression BeadChip) 
    
   
  
    
      
      Illumina Human-WG6v2 Expression BeadChip 
      
    
   
  150 
 
  
    EGAD00010001711 
   
  
    
    Illumina Infinium MethylationEPIC BeadChip kit (Illumina, Inc., San Diego, CA). Standard Illumina procedures using Illumina iScan scanner. 
    
   
  
    
      
      Illumina Infinium MethylationEPIC BeadChip 
      
    
   
  120 
 
  
    EGAD00010001713 
   
  
    
    Illumina Infinium Omni2.5 Genome-Wide Genotyping Array 
    
   
  
    
      
      Illumina Infinium Omni2.5 BeadChip 
      
    
   
  48 
 
  
    EGAD00010001715 
   
  
    
    Data on Affymetrix 6.0 arrays for Genome-Wide Association Study of colorectal cancer in the Spanish population. Additionally, geographical origin for each sample is provided, which constitutes the largest to-date Spanish genomic sample population 
    
   
  
    
      
      Affymetrix 6.0 array 
      
    
   
  1299 
 
  
    EGAD00010001717 
   
  
    
    This is the Affymetrix gene expression data of the metastatic tumours related to this study. 
    
   
  
    
      
      Human Clariom D Arrays 
      
    
   
  11 
 
  
    EGAD00010001719 
   
  
    
    RNA-sequencing data 
    
   
  
    
      
      Paired RNA-sequencing data 
      
    
   
  20 
 
  
    EGAD00010001720 
   
  
    
    Data from Infinium EPIC 850K DNA methylation beadchip 
    
   
  
    
      
      Infinium EPIC DNA methylation beadchip 
      
    
   
  77 
 
  
    EGAD00010001722 
   
  
    
    GWAS results in epacts format for Danjou et al, Nature Genetics 2015 
    
   
  
    
      
      Illumina arrays 
      
    
   
  6305 
 
  
    EGAD00010001724 
   
  
    
    598764 SNPs genotyped for 719 individuals, merged from Illumina Omni1 and Illumina Omni2.5 
    
   
  
    
      
      Illumina Omni1 and Illumina Omni2.5 
      
    
   
  719 
 
  
    EGAD00010001726 
   
  
    
    Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare hematologic malignancy that is most similar in expression profiles to plasmacytoid dendritic cells. However, patients often exhibit features of AML and can progress to AML. In this project, we will determine the differentially and commonly expressed genes between BPDCN and AML specimens. Available BPDCN and TET2-mutated AML specimens were taken for transcriptome microarray analysis. 
    
   
  
    
      
      ThermoFisher Scientific ClariomTM D Pico Assay 
      
    
   
  7 
 
  
    EGAD00010001727 
   
  
    
    Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare hematologic malignancy that is most similar in expression profiles to plasmacytoid dendritic cells. However, patients often exhibit features of AML and can progress to AML. In this project, we will determine the differentially and commonly expressed genes between BPDCN and AML specimens. Available BPDCN and TET2-mutated AML specimens were taken for transcriptome microarray analysis. 
    
   
  
    
      
      ThermoFisher Scientific ClariomTM D Pico Assay 
      
    
   
  6 
 
  
    EGAD00010001729 
   
  
    
    2619 individuals with visual contour perception phenotype scores, in the form of averaged accuracy 
    
   
  
    
      
      NA 
      
    
   
  2619 
 
  
    EGAD00010001731 
   
  
    
    DNA methylation in bronchial biopsies of asthmatics, asthma in remission and healthy subjects 
    
   
  
    
      
      Infinium HumanMethylation450 BeadChip array (450k array) 
      
    
   
  179 
 
  
    EGAD00010001733 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Vietnam 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  1728 
 
  
    EGAD00010001734 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Ghana 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  782 
 
  
    EGAD00010001735 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:PNG 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  815 
 
  
    EGAD00010001736 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Nigeria 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  419 
 
  
    EGAD00010001737 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Mali 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  900 
 
  
    EGAD00010001738 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Gambia 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  5594 
 
  
    EGAD00010001739 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:BurkinaFaso 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  1446 
 
  
    EGAD00010001740 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Cameroon 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  1471 
 
  
    EGAD00010001741 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Malawi 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  3088 
 
  
    EGAD00010001742 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Kenya 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  3865 
 
  
    EGAD00010001743 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Tanzania 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  979 
 
  
    EGAD00010001746 
   
  
    
    Functional genomics approaches to understand osteoarthritis 
    
   
  
    
      
      Illumina HumanCoreExome-24v1-1 
      
    
   
  77 
 
  
    EGAD00010001748 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations: phased genotypes 
    
   
  
    
      
      Illumina Omni 2.5M 
      
    
   
  17960 
 
  
    EGAD00010001755 
   
  
    
    Skin tumour 
    
   
  
    
      
      Illumina EPIC 
      
    
   
  7 
 
  
    EGAD00010001757 
   
  
    
    Tumor biopsies from LAM disease were analyzed by MLPA to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations. The dataset includes 44 samples. 
    
   
  
    
      
      3730xl 
      
    
   
  44 
 
  
    EGAD00010001761 
   
  
    
    Tumor biopsies from LAM disease were analyzed by Sanger sequencing to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations. The dataset includes 21 samples. 
    
   
  
    
      
      sanger(3730 XL) 
      
    
   
  21 
 
  
    EGAD00010001763 
   
  
    
    Association results from Polish cohort 
    
   
  
    
      
      Immunochip 
      
    
   
  1062 
 
  
    EGAD00010001764 
   
  
    
    Association results from the Dutch cohort 
    
   
  
    
      
      Immunochip 
      
    
   
  3378 
 
  
    EGAD00010001765 
   
  
    
    Association results from Spanish cohort 
    
   
  
    
      
      Immunochip 
      
    
   
  2325 
 
  
    EGAD00010001766 
   
  
    
    Association results from the Argentinian cohort two 
    
   
  
    
      
      Immunochip 
      
    
   
  465 
 
  
    EGAD00010001767 
   
  
    
    Association results from British cohort 
    
   
  
    
      
      Immunochip 
      
    
   
  16002 
 
  
    EGAD00010001768 
   
  
    
    Association results from the Argentinian cohort one 
    
   
  
    
      
      Immunochip 
      
    
   
  741 
 
  
    EGAD00010001769 
   
  
    
    Association results from the Irish cohort 
    
   
  
    
      
      Immunochip 
      
    
   
  848 
 
  
    EGAD00010001770 
   
  
    
    Results of the celiac disease meta-analysis 
    
   
  
    
      
      Immunochip 
      
    
   
  27786 
 
  
    EGAD00010001771 
   
  
    
    Association results from the Italian cohort 
    
   
  
    
      
      Immunochip 
      
    
   
  2965 
 
  
    EGAD00010001775 
   
  
    
    Genotypes of Russian people from Ustyuzhna (Vologda Oblast, Russia) 
    
   
  
    
      
      Infinium OmniExpress-24v1-2_A1, iScan+ (Illumina) 
      
    
   
  46 
 
  
    EGAD00010001776 
   
  
    
    Genotypes of Nenets people from Yamalo-Nenets Autonomous Okrug (Russia) 
    
   
  
    
      
      Infinium OmniExpress-24v1-2_A1, iScan+ (Illumina) 
      
    
   
  41 
 
  
    EGAD00010001783 
   
  
    
    Western Mediterranean Illumina Infinium Omni 2.5 array data 
    
   
  
    
      
      Illumina Infinium Omni2.5M 
      
    
   
  142 
 
  
    EGAD00010001795 
   
  
    
    Array data for oesophageal and related samples – kno_paper_methyl_release 
    
   
  
    
      
      Illumina 
      
    
   
  78 
 
  
    EGAD00010001797 
   
  
    
    Methylation microarray profiling (Illumina Human Methylation 450k and EPIC platforms) of 60 adult glioblastomas. Tumours were subtyped using the approach from Sturm et al. (https://doi.org/10.1016/j.ccr.2012.08.024): 12 IDH, 18 MES, 12 RTK I, 18 RTK II. DNA was prepared, assayed on the microarrays, and raw data computationally processed as described in Capper et al., "DNA methylation-based classification of central nervous system tumours": https://www.nature.com/articles/nature26000 
    
   
  
    
   
  60 
 
  
    EGAD00010001799 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations: Sequenom MassArray genotypes 
    
   
  
    
      
      Sequenom MassArray (Agena Bioscience) 
      
    
   
  40256 
 
  
    EGAD00010001801 
   
  
    
    Sample genotyped with Axiom InCor BB (Affymetrix) with local ancestry masking of non-Native American ancestry 
    
   
  
    
      
      Axiom InCor BB (Affymetrix) 
      
    
   
  59 
 
  
    EGAD00010001802 
   
  
    
    Sample genotyped with Axiom InCor BB (Affymetrix) 
    
   
  
    
      
      Axiom InCor BB (Affymetrix) 
      
    
   
  83 
 
  
    EGAD00010001803 
   
  
    
    Sample genotyped with Axiom Human Origins (Affymetrix) 
    
   
  
    
      
      Axiom Human Origins (Affymetrix) 
      
    
   
  12 
 
  
    EGAD00010001805 
   
  
    
    CASE AND CONTROL SAMPLES USING Infinium MethylationEPIC 
    
   
  
    
      
      Infinium MethylationEPIC 
      
    
   
  24 
 
  
    EGAD00010001807 
   
  
    
    Genotyping of Y chromosome in Polish population 
    
   
  
    
      
      iScan, Illumina 
      
    
   
  2705 
 
  
    EGAD00010001811 
   
  
    
    h5 files from 15 single cell PDAC samples described in "Transcription phenotypes of pancreatic cancer are driven by genomic events during tumour evolution" 
    
   
  
    
   
  15 
 
  
    EGAD00010001813 
   
  
    
    Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array 
    
   
  
    
      
      Affymetrix SNP 6.0 array 
      
    
   
  91 
 
  
    EGAD00010001814 
   
  
    
    Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array 
    
   
  
    
      
      Affymetrix SNP 6.0 array 
      
    
   
  195 
 
  
    EGAD00010001816 
   
  
    
    The Jerusalem Perinatal Study (JPS) aimed to examine the developmental origins of cardiometabolic risk. 
    
   
  
    
      
      Affymetrix Biobank array 
      
    
   
  2714 
 
  
    EGAD00010001818 
   
  
    
    EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations: Gambian trio HLA typing 
    
   
  
    
      
      Sanger Sequencing 
      
    
   
  96 
 
  
    EGAD00010001822 
   
  
    
    Array data for oesophageal and related samples – sj_paper_methyl_tumour_release 
    
   
  
    
      
      Illumina 
      
    
   
  285 
 
  
    EGAD00010001825 
   
  
    
    Expression measurements in NK cells 
    
   
  
    
      
      Affymetrix Human Gene_1.0ST array 
      
    
   
  140 
 
  
    EGAD00010001826 
   
  
    
    Expression measurements in CD4 cells 
    
   
  
    
      
      Affymetrix Human Gene_1.0ST array 
      
    
   
  123 
 
  
    EGAD00010001827 
   
  
    
    Expression measurements in monocytes 
    
   
  
    
      
      Affymetrix Human Gene_1.0ST array 
      
    
   
  131 
 
  
    EGAD00010001828 
   
  
    
    Expression measurements in B cells 
    
   
  
    
      
      Affymetrix Human Gene_1.0ST array 
      
    
   
  124 
 
  
    EGAD00010001829 
   
  
    
    Expression measurements in CD8 cells 
    
   
  
    
      
      Affymetrix Human Gene_1.0ST array 
      
    
   
  146 
 
  
    EGAD00010001830 
   
  
    
    Illumina Immunochip genotypes 
    
   
  
    
      
      Illumina Immunochip 
      
    
   
  170 
 
  
    EGAD00010001834 
   
  
    
    Array data for oesophageal and related samples – sj_paper_methyl_normal_release 
    
   
  
    
      
      Illumina 
      
    
   
  100 
 
  
    EGAD00010001838 
   
  
    
    Array data for oesophageal and related samples – sj_paper_methyl_barretts_release 
    
   
  
    
      
      Illumina 
      
    
   
  150 
 
  
    EGAD00010001841 
   
  
    
    Copy Number Alterations arrays from 153 samples performed by Affymetrix 6.0 and Oncoscan CNV platforms 
    
   
  
    
      
      Affymetrix 6.0; Cytoscan 
      
    
   
  153 
 
  
    EGAD00010001842 
   
  
    
    Gene Expression Profiling from 44 Mantle Cell Lymphoma cases 
    
   
  
    
      
      Human Genome U219 array plate 
      
    
   
  44 
 
  
    EGAD00010001844 
   
  
    
    mtDNA variant positions vcf files for 86 human samples 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  86 
 
  
    EGAD00010001846 
   
  
    
    Whole-genome DNA methylation profiling of CD14+ monocytes obtained from CD-active, CD-remissive and non-CD individuals 
    
   
  
    
      
      Illumina Infinium HumanMethylation 450k BeadChip 
      
    
   
  25 
 
  
    EGAD00010001848 
   
  
    
    Genotype data obtained using the coreExome Illumina SNP chip array for all the individuals included in the study of gene expression regulation in human primary regulatory CD4+ T cells (Tregs) 
    
   
  
    
      
      coreExome Illumina SNP chip array 
      
    
   
  120 
 
  
    EGAD00010001850 
   
  
    
    Epigenome wide DNA methylation assay of OSCC-GB using Illumina methylation array 
    
   
  
    
      
      Illumina Infinium 450K BeadChip Array; Illumina Infinium EPIC Beadchip Array 
      
    
   
  174 
 
  
    EGAD00010001852 
   
  
    
    NanoString raw data for a neoadjuvant combination PD-L1 plus CTLA-4 blockade trial on patients with cisplatin-ineligible operable urothelial carcinoma. All samples were FFPE tumor samples. Raw probe count data (.RCC files) were generated from nCounter Digital Analyzer (4.0.0.3). 
    
   
  
    
      
      Nanostring expression immunology panel 
      
    
   
  34 
 
  
    EGAD00010001854 
   
  
    
    Genome-wide DNA methylation profiling of genomic DNA isolated from blood PBMCs of breast cancer patients before the start of therapy. 
    
   
  
    
      
      Infinium MethylationEPIC BeadChip array 
      
    
   
  8 
 
  
    EGAD00010001857 
   
  
    
    Circulating tumor cells for comprehensive and multiregional non-invasive genetic characterization of multiple myeloma (arrays set) 
    
   
  
    
      
      Affymetrix Cytoscan HD 
      
    
   
  71 
 
  
    EGAD00010001859 
   
  
    
    Epigenome of brainstem gliomas 
    
   
  
    
      
      Illumina Methylation Array 
      
    
   
  123 
 
  
    EGAD00010001861 
   
  
    
    Genome-wide DNA methylation profiling of Waldenstrom's macroglobulinemia (WM) patient samples 
    
   
  
    
      
      Infinium MethylationEPIC Kit (Illumina) 
      
    
   
  48 
 
  
    EGAD00010001863 
   
  
    
    DNA Copy Number. Milan samples. 
    
   
  
    
      
      Affymetrix SNP 6.0 
      
    
   
  27 
 
  
    EGAD00010001864 
   
  
    
    Gene expression. Milan samples. 
    
   
  
    
      
      Agilent Sureprint G3 Human Gene Expression 8x60K microarrays (G4851A) 
      
    
   
  33 
 
  
    EGAD00010001865 
   
  
    
    DNA Methylation. Milan samples. 
    
   
  
    
      
      Illumina Infinium HumanMethylation450K 
      
    
   
  32 
 
  
    EGAD00010001867 
   
  
    
    Yemen and Chad Genotyping 
    
   
  
    
      
      Unknown HumanOmni25-8v1-2 
      
    
   
  1 
 
  
    EGAD00010001868 
   
  
    
    Yemen and Chad Genotyping 
    
   
  
    
      
      Unknown HumanOmni25-8v1-2_A1 
      
    
   
  258 
 
  
    EGAD00010001870 
   
  
    
    Lebanon_Genotyping 
    
   
  
    
      
      Unknown HumanOmni25M-8v1-1 
      
    
   
  126 
 
  
    EGAD00010001872 
   
  
    
    EPIC methylation arrays on PT1-derived PDXs 
    
   
  
    
      
      EPIC arrays 
      
    
   
  32 
 
  
    EGAD00010001874 
   
  
    
    Patients with T1DM genotyped on Illumina HiScan using Illumina Infinium OmniExpress Exome-8 v1.4 arrays 
    
   
  
    
      
      Infinium HD Super Microarray 
      
    
   
  576 
 
  
    EGAD00010001877 
   
  
    
    DNA methylation profiling from 70 Mantle Cell Lymphoma cases 
    
   
  
    
      
      Infinium MethylationEPIC BeadChip 
      
    
   
  70 
 
  
    EGAD00010001879 
   
  
    
    Control human dermal fibroblasts from patient forearm 
    
   
  
    
      
      Illumina 450k 
      
    
   
  12 
 
  
    EGAD00010001880 
   
  
    
    Case human dermal fibroblasts from patient forearm 
    
   
  
    
      
      Illumina 450k 
      
    
   
  12 
 
  
    EGAD00010001886 
   
  
    
    This dataset contains PLINK processed (PED and MAP) genotype data, from 1000 samples from the UAE using the Illumina Omni5 Exome Bead Chip 
    
   
  
    
      
      Illumina 
      
    
   
  1000 
 
  
    EGAD00010001888 
   
  
    
    Illumina 450K DNA methylation profiles of 314 fresh-frozen colorectal mucosa, adenoma or adenocarcinoma samples. 
    
   
  
    
      
      Illumina 450k 
      
    
   
  314 
 
  
    EGAD00010001895 
   
  
    
    Multi-omics profiling of paired primary and recurrent glioblastoma patient tissues 
    
   
  
    
      
      AB GeneChip Scanner 3000 7G System Clariom S 
      
    
   
  22 
 
  
    EGAD00010001901 
   
  
    
    This dataset contains CEL files and RMA-normalized expression values for microarrays of stage I lung adenocarcinomas from Asian patients. In total, there are 69 patients and 138 samples, including 69 tumor samples and 69 adjacent normal samples. 
    
   
  
    
      
      Affy miRNA 3.0 array 
      
    
   
  138 
 
  
    EGAD00010001902 
   
  
    
    This dataset contains PAIR files and processed somatic copy number alteration values for array CGH of stage I lung adenocarcinomas from Asian patients. In total, there are 111 patients and 222 samples, including 111 tumor samples and 111 adjacent normal samples. 
    
   
  
    
      
      NimbleGen HG18 CGH 385K 
      
    
   
  222 
 
  
    EGAD00010001905 
   
  
    
    234 samples genotyped at 15 loci by LGC Genomics, Hoddesdon, UK, using the PCR-based KASP assay (Semagn et al. (2014). Single nucleotide polymorphism genotyping using Kompetitive Allele Specific PCR (KASP): overview of the technology and its application in crop improvement. Mol Breeding 33, 1–14.) 
    
   
  
    
      
      PCR-based KASP assay 
      
    
   
  233 
 
  
    EGAD00010001906 
   
  
    
    234 samples genotyped at 40 loci using the MassArrayiPLEX genotyping assay using the iPLEX Gold genotyping kit (Agena Biosciences, cat. 10148-2; Gabriel et al. (2009) [Gabriel, S., Ziaugra, L. and Tabbaa, D. (2009), SNP Genotyping Using the Sequenom MassARRAY iPLEX Platform. Current Protocols in Human Genetics, 60: 2.12.1-2.12.18. doi:10.1002/0471142905.hg0212s60]). Products were detected on a MassArray mass spectrophotometer and data were acquired in real time with MassArray RT software 4.0.0.2 (Agena Biosciences). SNP clustering and validation was carried out with Typer 4.0.26.75 software (Agena Biosciences). 
    
   
  
    
      
      MassArrayiPLEX genotyping assay using the iPLEX Gold genotyping kit 
      
    
   
  233 
 
  
    EGAD00010001911 
   
  
    
    Fresh frozen breast cancer H&E tissue images collected and annotated by the International Cancer Genome Consortium (ICGC), that included the BASIS collaboration. Associated with whole genome sequence data as originally described by Nik-Zainal et al, Nature, 2016 (DOI: 10.1038/nature17676) and deposited with ID EGAS00001001178 
    
   
  
    
      
      H and E image 
      
    
   
  151 
 
  
    EGAD00010001913 
   
  
    
    Raw data files for 94 Argentinean samples 
    
   
  
    
      
      Affymetrix Axiom LAT1 
      
    
   
  94 
 
  
    EGAD00010001917 
   
  
    
    GWAS genotype data for maternal and fetal (baby) cases of preeclampsia and controls from Uzbekistan. This dataset is a component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium members in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology 
    
   
  
    
      
      Unknown HumanOmni25M-8v1-1 
      
    
   
  180 
 
  
    EGAD00010001918 
   
  
    
    GWAS genotype data for maternal and fetal (baby) cases of preeclampsia and controls from Uzbekistan. This dataset is a component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium members in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology 
    
   
  
    
      
      Unknown HumanOmni2-5-8-v1-1-C 
      
    
   
  2658 
 
  
    EGAD00010001919 
   
  
    
    GWAS genotype data for maternal and fetal (baby) cases of preeclampsia and controls from Uzbekistan. This dataset is a component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium members in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology 
    
   
  
    
      
      Unknown HumanOmni25-8v1-1 
      
    
   
  58 
 
  
    EGAD00010001921 
   
  
    
    EPIC array data from 72 tumor samples with muscle invasive bladder cancer. 
    
   
  
    
      
      EPIC BeadChip (Illumina, San Diego, CA) 
      
    
   
  72 
 
  
    EGAD00010001923 
   
  
    
    ChIP-seq narrowPeaks. Software: MACS2 v2.1.2 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  23 
 
  
    EGAD00010001924 
   
  
    
    Gene expression gene-level count values from Stringtie processing including alignment to GRCh37 or GRCh38 genome version. Software: HISAT2 v2.1.0;StringTie v1.3.4d. 
    
   
  
    
      
      Illumina HiSeq 2500/NovaSeq 6000 
      
    
   
  438 
 
  
    EGAD00010001925 
   
  
    
    CpG methylation. Software: minimap2 v2.16;Nanopolish. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  202 
 
  
    EGAD00010001926 
   
  
    
    Allelic imbalance (AI) region calls: start and end positions and the measured mBAF and LRR mean of each region after the BAF segmentation algorithm. Software: Illumina GenomeStudio;PennCNV v. 1.0.4;BAF segmentation v1.2.0. 
    
   
  
    
      
      Illumina Infinium HumanCore-24/HumanOmni2.5-8 
      
    
   
  2186 
 
  
    EGAD00010001927 
   
  
    
    Haplotype expression counts. Software: phASER v1.1.1. 
    
   
  
    
      
      Illumina HiSeq 2500/NovaSeq 6000 
      
    
   
  438 
 
  
    EGAD00010001928 
   
  
    
    ATAC-seq non-overlapping fixed width peaks with score normalization. Software: MACS2 v2.1.2 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  31 
 
  
    EGAD00010001930 
   
  
    
    metastatic ccRCC 
    
   
  
    
      
      Affymetrix HTA 2.0 
      
    
   
  409 
 
  
    EGAD00010001932 
   
  
    
    Samples from Puno, Peru 
    
   
  
    
      
      Axiom LAT Array 
      
    
   
  61 
 
  
    EGAD00010001933 
   
  
    
    Samples from eastern Polynesia, Taiwan, and Vanuatu 
    
   
  
    
      
      Axiom LAT Array 
      
    
   
  354 
 
  
    EGAD00010001934 
   
  
    
    Samples from Magdalena de Cao, Peru 
    
   
  
    
      
      Illumina MEGA Array 
      
    
   
  20 
 
  
    EGAD00010001936 
   
  
    
    Gene expression of 12 colon cancer TSCs (sensitive or resistant) after 12h treatment with 3µM NCT02 or DMSO (control) was analyzed using Illumina microarrays (HumanHT-12 v4 BeadChip). 
    
   
  
    
      
      oligonucleotide beads of HumanHT-12 V4 R2 Expression BeadChips (Illumina) 
      
    
   
  24 
 
  
    EGAD00010001938 
   
  
    
    This dataset includes IDAT files from 2,790 blood samples. The samples were profiled using the Illumina Infinium HumanMethylation450 (450k) BeadChip. 
    
   
  
    
      
      Illumina Infinium HumanMethylation450 
      
    
   
  2790 
 
  
    EGAD00010001940 
   
  
    
    DNA methylation measures on neutrophils 
    
   
  
    
      
      Illumina MethylationEPIC BeadChip 
      
    
   
  31 
 
  
    EGAD00010001941 
   
  
    
    DNA methylation measures on CD34 cells 
    
   
  
    
      
      Illumina MethylationEPIC BeadChip 
      
    
   
  4 
 
  
    EGAD00010001943 
   
  
    
    SNP data from 49 paired samples (tumor/germline) with muscle invasive bladder cancer. 
    
   
  
    
      
      Illumina Infinium Human Global Screening Array GSAMD-24v2-0_20024620_a1 BeadChip 
      
    
   
  98 
 
  
    EGAD00010001945 
   
  
    
    Preeclampsia (PE) is a syndrome affecting pregnant mothers and fetus/babies characterised by hypertension and proteinuria, and is a leading cause of maternal and fetal death and of premature births worldwide. The InterPregGen Consortium was funded by a European Framework 7 (FP7) grant and grew out of the WTCCC3 GWAS comparing ~2000 UK PE mothers with ~6000 common UK controls. This dataset includes Illumina 2.5-8 genotyping of maternal and fetal PE cases and controls from Kazakhstan. This study is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator). 
    
   
  
    
      
      Illumina 2.5-8 
      
    
   
  3004 
 
  
    EGAD00010001947 
   
  
    
    Preeclampsia (PE) is a syndrome affecting pregnant mothers and fetus/babies characterised by hypertension and proteinuria, and is a leading cause of maternal and fetal death and of premature births worldwide. The InterPregGen Consortium was funded by a European Framework 7 (FP7) grant and grew out of the WTCCC3 GWAS comparing ~2000 UK PE mothers with ~6000 common UK controls. This dataset includes OmniExpress genotyping of maternal and fetal PE cases and controls from Kazakhstan. This study is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator). 
    
   
  
    
      
      OmniExpress 
      
    
   
  2305 
 
  
    EGAD00010001949 
   
  
    
    Preeclampsia (PE) is a syndrome affecting pregnant mothers and fetus/babies characterised by hypertension and proteinuria, and is a leading cause of maternal and fetal death and of premature births worldwide. The InterPregGen Consortium was funded by a European Framework 7 (FP7) grant and grew out of the WTCCC3 GWAS comparing ~2000 UK PE mothers with ~6000 common UK controls. This dataset includes Infinium GSA genotyping of maternal, paternal and fetal PE cases and controls from Kazakhstan. This study is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator). 
    
   
  
    
      
      Infinium GSA 
      
    
   
  2321 
 
  
    EGAD00010001951 
   
  
    
    Called genotypes of samples in batch 1 of CRU303 GWAS 
    
   
  
    
      
      Affymetrix Axiom UKB WCSG 
      
    
   
  682 
 
  
    EGAD00010001952 
   
  
    
    Raw data files of samples in batch 2 of CRU303 GWAS 
    
   
  
    
      
      Affymetrix Axiom UKB WCSG 
      
    
   
  190 
 
  
    EGAD00010001953 
   
  
    
    Raw data files of samples in batch 2 of CRU303 GWAS 
    
   
  
    
      
      Affymetrix Axiom UKB WCSG 
      
    
   
  36 
 
  
    EGAD00010001954 
   
  
    
    Raw data files of samples in batch 1 of CRU303 GWAS 
    
   
  
    
      
      Affymetrix Axiom UKB WCSG 
      
    
   
  692 
 
  
    EGAD00010001955 
   
  
    
    Raw data files of samples in batch 1 of CRU303 GWAS 
    
   
  
    
      
      Affymetrix Axiom UKB WCSG 
      
    
   
  397 
 
  
    EGAD00010001956 
   
  
    
    Called genotypes of samples in batch 2 of CRU303 GWAS 
    
   
  
    
      
      Affymetrix Axiom UKB WCSG 
      
    
   
  190 
 
  
    EGAD00010001958 
   
  
    
    Genome-wide data for 98 Native American individuals from Andes and Amazon 
    
   
  
    
      
      Illumina 2.5M Human Omni array 
      
    
   
  98 
 
  
    EGAD00010001960 
   
  
    
    Gene expression after cell culture, 12h Hyper-IL6 stimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001961 
   
  
    
    Gene expression after cell culture, 12h unstimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001962 
   
  
    
    Gene expression after cell culture, 24h unstimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001963 
   
  
    
    Gene expression after cell culture, 24h IL6+sgp130-Fc stimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001964 
   
  
    
    Gene expression after mammosphere culture, quiescent single cells 
    
   
  
    
   
  10 
 
  
    EGAD00010001965 
   
  
    
    Gene expression after cell culture, 24h IL6 stimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001966 
   
  
    
    Gene expression after mammosphere culture, non-label-retaining cells 
    
   
  
    
   
  5 
 
  
    EGAD00010001967 
   
  
    
    Gene expression after cell culture, 12h IL6 stimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001968 
   
  
    
    Gene expression after mammosphere culture, label-retaining cells 
    
   
  
    
   
  8 
 
  
    EGAD00010001969 
   
  
    
    Gene expression after cell culture, 24h Hyper-IL6 stimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001970 
   
  
    
    Gene expression after cell culture, 12h IL6+sgp130-Fc stimulated 
    
   
  
    
   
  3 
 
  
    EGAD00010001972 
   
  
    
    Array data for oesophageal and related samples - aks_paper_methyl_barretts_release 
    
   
  
    
      
      Illumina 
      
    
   
  107 
 
  
    EGAD00010001974 
   
  
    
    DLBCL DNA methylation data measured by 450k and EPIC Illumina arrays 
    
   
  
    
      
      450k and EPIC Illumina arrays 
      
    
   
  67 
 
  
    EGAD00010001975 
   
  
    
    DNA methylation of ICGC CLL patients measured by Illumina 450k array 
    
   
  
    
      
      Illumina 450k 
      
    
   
  490 
 
  
    EGAD00010001976 
   
  
    
    DLBCL gene expression data using 133.plus.2 Affymetrix array 
    
   
  
    
      
      133.plus.2 
      
    
   
  43 
 
  
    EGAD00010001978 
   
  
    
    1 cell line and 82 Oncoscan SNP tumor initial samples, zipped Affymetrix CEL file types, Oncoscan CNV FFPE Assay Kit, Thermo Fisher Scientific GeneChip™ Scanner 3000 7G 
    
   
  
    
      
      Affymetrix, Thermo Fisher Scientific 
      
    
   
  85 
 
  
    EGAD00010001980 
   
  
    
    Affymetrix SNP6.0 data for 341 DLBCL patients 
    
   
  
    
      
      Affymetrix SNP6.0 
      
    
   
  341 
 
  
    EGAD00010001983 
   
  
    
    Fixed effect meta-analysis summary statistics combining GWAS of fetal (baby) preeclampsia cases and controls from Europe (UK, Iceland, Norway, and Denmark) and Central Asia (Kazakhstan and Uzbekistan). 
    
   
  
    
   
  4 
 
  
    EGAD00010001984 
   
  
    
    Fixed effect meta-analysis summary statistics combining GWAS of maternal preeclampsia cases and controls from Europe (UK, Iceland, Norway, Denmark and Finland). 
    
   
  
    
   
  12 
 
  
    EGAD00010001985 
   
  
    
    Fixed effect meta-analysis summary statistics combining GWAS of maternal preeclampsia cases and controls from Central Asia (Kazakhstan and Uzbekistan). 
    
   
  
    
   
  4 
 
  
    EGAD00010001986 
   
  
    
    Fixed effect meta-analysis summary statistics combining GWAS of fetal (baby) preeclampsia cases and controls from Europe (UK, Iceland, Norway, and Denmark). 
    
   
  
    
   
  10 
 
  
    EGAD00010001987 
   
  
    
    Fixed effect meta-analysis summary statistics combining GWAS of fetal (baby) preeclampsia cases and controls from Central Asia (Kazakhstan and Uzbekistan). 
    
   
  
    
   
  4 
 
  
    EGAD00010001988 
   
  
    
    Fixed effect meta-analysis summary statistics combining GWAS of maternal preeclampsia cases and controls from Europe (UK, Iceland, Norway, Denmark and Finland) and Central Asia (Kazakhstan and Uzbekistan). 
    
   
  
    
   
  4 
 
  
    EGAD00010001990 
   
  
    
    Genome-wide data for 59 Native American individuals from Peru 
    
   
  
    
      
      Illumina 2.5M Human Omni array 
      
    
   
  59 
 
  
    EGAD00010001991 
   
  
    
    Genome-wide data for 71 Native American individuals from Peru 
    
   
  
    
      
      Illumina 2.5M Human Omni array 
      
    
   
  71 
 
  
    EGAD00010001992 
   
  
    
    Genome-wide data for 130 Native American individuals from Peru 
    
   
  
    
      
      Illumina 2.5M Human Omni array 
      
    
   
  130 
 
  
    EGAD00010001994 
   
  
    
    hormone receptor-positive early breast cancer by Nanostring BC360 panel 
    
   
  
    
      
      Nanostring panel 
      
    
   
  612 
 
  
    EGAD00010001996 
   
  
    
    The samples were genotyped on the H3Africa array (~2.3M SNPs) using the Illumina FastTrack Sequencing Service2. The default Illumina pipeline was used for the genotype calling (build GRCh37/hg19). The data was converted to PLINK using the h3abionet/h3agwas/call2plink pipeline and QC done using the h3abionet/h3agwas/qc pipeline 
    
   
  
    
   
  10776 
 
  
    EGAD00010001998 
   
  
    
    DNA methylation analysis of JMML patients from Europe, Japan and USA using EPIC arrays 
    
   
  
    
      
      Infinium HumanMethylation450K and EPIC BeadChip 
      
    
   
  32 
 
  
    EGAD00010001999 
   
  
    
    DNA methylation analysis of JMML patients from Europe, Japan and USA using 450k arrays 
    
   
  
    
      
      Infinium HumanMethylation450K and EPIC BeadChip 
      
    
   
  292 
 
  
    EGAD00010002000 
   
  
    
    DNA methylation analysis of JMML patients from Europe, Japan and USA using EPIC arrays 
    
   
  
    
      
      Infinium HumanMethylation450K and EPIC BeadChip 
      
    
   
  47 
 
  
    EGAD00010002002 
   
  
    
    4 oesophageal cancer derived organoid lines and 2 ovarian cancer derived organoid lines 
    
   
  
    
      
      GSA-MD V3 
      
    
   
  6 
 
  
    EGAD00010002004 
   
  
    
    PDAC primary cell lines methylation 
    
   
  
    
   
  7 
 
  
    EGAD00010002005 
   
  
    
    PDAC PDX methylation 
    
   
  
    
   
  18 
 
  
    EGAD00010002007 
   
  
    
    Mixed exocrine and purified ductal and de-differentiated acinar human cells obtained from normal pancreases. Ductal and de-differentiated acinar cells were isolated by FACS after 4 days of culture of the exocrine mixed population 
    
   
  
    
   
  9 
 
  
    EGAD00010002017 
   
  
    
    DNA methylation arrays were performed to molecularly subtype these samples based on Capper D, Jones DTW, Sill M, et al. DNA methylation-based classification of central nervous system tumours. Nature. 2018;555(7697):469-474. doi:10.1038/nature26000 
    
   
  
    
   
  3 
 
  
    EGAD00010002019 
   
  
    
    This dataset includes data from UK Multiple Sclerosis (MS) cases that were recruited through the University of Cambridge and included in the IMSGC exomechip experiment. Data from UK controls and additional UK cases that were recruited through other UK centres is available by direct application to those respective centres, as described in the original paper 
    
   
  
    
      
      Illumina 
      
    
   
  26067 
 
  
    EGAD00010002020 
   
  
    
    Finnish cases and controls 
    
   
  
    
      
      Illumina 
      
    
   
  2257 
 
  
    EGAD00010002021 
   
  
    
    Greek cases and controls 
    
   
  
    
      
      Illumina 
      
    
   
  195 
 
  
    EGAD00010002022 
   
  
    
    Belgian cases and controls 
    
   
  
    
      
      Illumina 
      
    
   
  896 
 
  
    EGAD00010002023 
   
  
    
    French cases and controls 
    
   
  
    
      
      Illumina 
      
    
   
  624 
 
  
    EGAD00010002024 
   
  
    
    Cases and controls from USA 
    
   
  
    
      
      Illumina 
      
    
   
  13632 
 
  
    EGAD00010002026 
   
  
    
    Clinical remission (ClinR) was defined as the absence of asthma symptoms and medication for at least 12 months, and complete remission (ComR) was defined as ClinR with normal lung function and absence of airway hyperresponsiveness. We analyzed differential DNA methylation of ClinR and ComR compared with persistent asthma (PersA) in whole blood samples (n=72) and nasal brushing samples (n=97) in a longitudinal cohort of well-characterized asthma patients. 
    
   
  
    
      
      Illumina 450K 
      
    
   
  169 
 
  
    EGAD00010002028 
   
  
    
    Array data from a family with high prevalence of psychosis 
    
   
  
    
      
      Infinium Global Screening Array-24 v1.0 (GSA) from Illumina 
      
    
   
  34 
 
  
    EGAD00010002030 
   
  
    
    Array data from a family with high prevalence of psychosis 
    
   
  
    
      
      Infinium Global Screening Array-24 v1.0 (GSA) from Illumina 
      
    
   
  12 
 
  
    EGAD00010002032 
   
  
    
    The genetic structure of Norway 
    
   
  
    
      
      Illumina OmniExpress 24 v 1.1 chip 
      
    
   
  6368 
 
  
    EGAD00010002034 
   
  
    
    CASE SAMPLES USING Affymetrix SNP6.0 technology (Thermo Fisher Scientific company): OncoScan FFPE Assay Kit was used for FFPE tissue samples (designed for degraded DNA) and the Cytoscan HD Array Kit was used for the fresh-frozen tissues 
    
   
  
    
      
      Affymetrix 
      
    
   
  710 
 
  
    EGAD00010002036 
   
  
    
    CUP samples using 850k 
    
   
  
    
      
      Illumina 850k 
      
    
   
  55 
 
  
    EGAD00010002038 
   
  
    
    SNP data for 473 tumor samples 
    
   
  
    
      
      Illumina Infinium Human Global Screening Array GSAMD-24v2-0_20024620_a1 BeadChip 
      
    
   
  473 
 
  
    EGAD00010002039 
   
  
    
    SNP data for 473 germline samples 
    
   
  
    
      
      Illumina Infinium OncoArray-500K 
      
    
   
  473 
 
  
    EGAD00010002041 
   
  
    
    Contains test sample 1-26 
    
   
  
    
      
      Illumina Iscan 
      
    
   
  26 
 
  
    EGAD00010002043 
   
  
    
    USA Multiple Sclerosis cases and controls 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  1830 
 
  
    EGAD00010002044 
   
  
    
    Germany Multiple Sclerosis cases and normal controls 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  1066 
 
  
    EGAD00010002045 
   
  
    
    Belgium Multiple Sclerosis cases and normal controls 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  635 
 
  
    EGAD00010002046 
   
  
    
    France Multiple Sclerosis cases and normal controls 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  741 
 
  
    EGAD00010002047 
   
  
    
    Australia and New Zealand Multiple Sclerosis case 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  1021 
 
  
    EGAD00010002048 
   
  
    
    Finland Multiple Sclerosis case 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  471 
 
  
    EGAD00010002049 
   
  
    
    This dataset includes data from UK Multiple Sclerosis (MS) cases that were recruited through the University of Cambridge and included in the IMSGC immunochip experiment. Data from UK controls and additional UK cases that were recruited through other UK centres is available by direct application to those respective centres, as described in the original paper. 
    
   
  
    
      
      Illumina HumanImmuno v1.0 
      
    
   
  4713 
 
  
    EGAD00010002051 
   
  
    
    Second batch of ChIP-seq narrowPeaks. Software: MACS2 v2.1.2 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD00010002053 
   
  
    
    Data from Infinium EPIC 850K DNA methylation beadchip 
    
   
  
    
      
      Infinium EPIC DNA methylation beadchip 
      
    
   
  139 
 
  
    EGAD00010002055 
   
  
    
    Illumina EPIC methylation array of frontal lobe tissue from post-mortem human brains of the RiMod-FTD project 
    
   
  
    
      
      Illumina Infinium MethylationEPIC BeadChip 
      
    
   
  47 
 
  
    EGAD00010002057 
   
  
    
    This fileset has 4607 Greenlanders scored on the Illumina MEGA array (1,622,813 sites), and has been put on the plus strand. The data is in PLINK bed/bim/fam format. The Greenlandic individuals originate from two population surveys, B99 and IHIT. 
    
   
  
    
      
      Illumina MEGA array 
      
    
   
  4607 
 
  
    EGAD00010002059 
   
  
    
    NIHR BioResource Common Disease Patients 2016. The dataset includes 13489 samples from blood donors who were not screened for any particular disease and are therefore representative of the general population. Genomic data includes 845487 SNPs collected using the UK BioBank V1 Affymetrix array. Phenotypic data includes gender, age, ethnicity and disease. According to our internal quality check there are 81 duplicates in this dataset. 
    
   
  
    
      
      Genotyped using UK Biobank Axiom Array (Applied Biosystems/Thermofisher), read on GeneTitan Multi Channel System (Affymetrix/ThermoFisher) and analysed with the Axiom Analysis Suite (Applied Biosystems/Thermofisher) 
      
    
   
  13490 
 
  
    EGAD00010002061 
   
  
    
    Sample genotyped with Axiom Human Origins (Affymetrix) 
    
   
  
    
      
      Axiom Human Origins (Affymetrix) 
      
    
   
  37 
 
  
    EGAD00010002063 
   
  
    
    Genome-wide Genotyping of 620 Arab individuals using Illumina iSelect platform with HumanOmniExpress bead chips 
    
   
  
    
      
      Illumina HumanOmniExpress BeadChip 
      
    
   
  1 
 
  
    EGAD00010002065 
   
  
    
    Liverpool Preterm Birth Biomarker Study Transcriptomics 
    
   
  
    
      
      Clariom™ D Human assay 
      
    
   
  114 
 
  
    EGAD00010002066 
   
  
    
    Liverpool Preterm Birth Biomarker Study Genomics 
    
   
  
    
      
      UK Biobank Axiom™ array 
      
    
   
  310 
 
  
    EGAD00010002068 
   
  
    
    Raw methylation from cervical samples of controls (both HPV+ and HPV-) 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  527 
 
  
    EGAD00010002069 
   
  
    
    Raw methylation from cervical samples of individuals who did not develop CIN. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  218 
 
  
    EGAD00010002070 
   
  
    
    Raw methylation from cervical samples of cases (CIN1-3+) 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  513 
 
  
    EGAD00010002071 
   
  
    
    Raw methylation from cervical samples of individuals who developed CIN 1-4 years after sampling. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  226 
 
  
    EGAD00010002073 
   
  
    
    Raw methylation from breast biopsies in BRCA mutation carriers or controls before and after 3 months of preventive mifepristone treatment. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  77 
 
  
    EGAD00010002074 
   
  
    
    Raw methylation data from normal breast tissue adjacent to a malignancy (TNBC) 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  14 
 
  
    EGAD00010002075 
   
  
    
    Raw methylation data from breast tissue collected during risk-reducing surgery in BRCA1/2 mutation carriers. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  14 
 
  
    EGAD00010002076 
   
  
    
    Raw methylation data from normal breast tissue. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  14 
 
  
    EGAD00010002077 
   
  
    
    Raw methylation data from triple negative breast cancer. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  14 
 
  
    EGAD00010002079 
   
  
    
    Raw methylation data from cervical samples in controls. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  1094 
 
  
    EGAD00010002080 
   
  
    
    Raw methylation data from cervical samples in controls. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  202 
 
  
    EGAD00010002081 
   
  
    
    Raw methylation data from cervical samples in individuals with breast cancer. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  442 
 
  
    EGAD00010002082 
   
  
    
    Raw methylation data from buccal samples in individuals with breast cancer. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  200 
 
  
    EGAD00010002084 
   
  
    
    Raw methylation data from cervical samples in individuals with endometrial cancer. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  281 
 
  
    EGAD00010002086 
   
  
    
    Raw methylation data from cervical samples in individuals with ovarian cancer. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  289 
 
  
    EGAD00010002088 
   
  
    
    1,094 genotyped Philippine samples 
    
   
  
    
   
  1094 
 
  
    EGAD00010002090 
   
  
    
    HumanCytoSNP 850K on tissue DNA 
    
   
  
    
      
      HumanCytoSNP 850K 
      
    
   
  2 
 
  
    EGAD00010002091 
   
  
    
    HumanCytoSNP-12 v2.1 on tissue DNA 
    
   
  
    
      
      HumanCytoSNP12-2-1 
      
    
   
  13 
 
  
    EGAD00010002093 
   
  
    
    Dublin Aspirin platelet response genomics cohort 
    
   
  
    
      
      UK Biobank Axiom array 
      
    
   
  91 
 
  
    EGAD00010002094 
   
  
    
    Liverpool Aspirin platelet response genomics cohort 
    
   
  
    
      
      UK Biobank Axiom array 
      
    
   
  91 
 
  
    EGAD00010002096 
   
  
    
    Genome-wide methylation analysis of upper urinary tract urothelial carcinoma using Infinium MethylationEPIC BeadChip Kit 
    
   
  
    
      
      Illumina 
      
    
   
  94 
 
  
    EGAD00010002098 
   
  
    
    Genome-wide copy number analysis of upper urinary tract urothelial carcinoma using GeneChip Human Mapping 250K NspI 
    
   
  
    
      
      Affymetrix 
      
    
   
  205 
 
  
    EGAD00010002100 
   
  
    
    Genotype data for new samples in Lopez et al 2021 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins 1 Array 
      
    
   
  1243 
 
  
    EGAD00010002102 
   
  
    
    Genome-Wide Human SNP Array 6.0 or the CytoScan HD array, according to the manufacturer's instructions (Affymetrix, Santa Clara, CA, USA) now part of Thermo Fisher Scientific (Thermo Fisher Scientific, Inc.) 
    
   
  
    
      
      Genome-Wide Human SNP Array 6.0 or the CytoScan HD array 
      
    
   
  42 
 
  
    EGAD00010002113 
   
  
    
    Genome-wide data for population genetics analyses 
    
   
  
    
      
      Illumina Infinium H3Africa_2017_20021485_A2 
      
    
   
  162 
 
  
    EGAD00010002115 
   
  
    
    Greek Multiple Sclerosis cases and controls 
    
   
  
    
      
      Illumina iSelect 
      
    
   
  195 
 
  
    EGAD00010002116 
   
  
    
    US Multiple Sclerosis cases and controls 
    
   
  
    
      
      Illumina iSelect 
      
    
   
  10584 
 
  
    EGAD00010002117 
   
  
    
    Australia Multiple Sclerosis cases and controls 
    
   
  
    
      
      Illumina iSelect 
      
    
   
  851 
 
  
    EGAD00010002118 
   
  
    
    This dataset includes data from UK Multiple Sclerosis (MS) cases that were recruited through the University of Cambridge and included in the IMSGC Replicationchip experiment. Data from UK controls and additional UK cases that were recruited through other UK centres is available by direct application to those respective centres, as described in the original paper. 
    
   
  
    
      
      Illumina iSelect 
      
    
   
  11711 
 
  
    EGAD00010002124 
   
  
    
    Genotypes generated for Puno cohort Batch 2. Case (PRE) and control (PUN) families were recruited in hospital. Raw genotypes no QC. Includes unrelated genotyping controls (HG). 
    
   
  
    
      
      Affymetrix Axiom LAT 
      
    
   
  467 
 
  
    EGAD00010002125 
   
  
    
    Phenotypes from case families listed in medical records and used in analyses. 
    
   
  
    
   
  558 
 
  
    EGAD00010002126 
   
  
    
    Genotypes generated for Puno cohort Batch 1. Case (PRE) and control (PUN) families were recruited in hospital, and additional unrelated controls were recruited in university (UNA). Raw genotypes no QC. 
    
   
  
    
      
      Affymetrix Axiom LAT 
      
    
   
  480 
 
  
    EGAD00010002127 
   
  
    
    Combined genotypes for Batch 1 and 2 after quality control. All individuals included in analyses. 
    
   
  
    
      
      Affymetrix Axiom LAT 
      
    
   
  877 
 
  
    EGAD00010002132 
   
  
    
    SOMAscan plasma proteome datasets generated from participants consuming the fiber blend snack prototype (study 2) 
    
   
  
    
      
      SOMAscan 1.3K Proteomic Assay 
      
    
   
  70 
 
  
    EGAD00010002133 
   
  
    
    SOMAscan plasma proteome datasets generated from participants consuming the pea fibre snack prototype (study 1) 
    
   
  
    
      
      SOMAscan 1.3K Proteomic Assay 
      
    
   
  72 
 
  
    EGAD00010002137 
   
  
    
    SNP measurement. Illumina BeadArray SNP arrays for the study "Molecular characteristics in Burkitt lymphoma over age groups" 
    
   
  
    
      
      Illumina InfiniumOmniExpressExome-8 
      
    
   
  93 
 
  
    EGAD00010002139 
   
  
    
    Genotype data for BaYaka hunter-gatherers Congo 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins 1 array 
      
    
   
  - 
 
  
    EGAD00010002140 
   
  
    
    Genotype data for Agta hunter-gatherers Philippines 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins 1 array 
      
    
   
  - 
 
  
    EGAD00010002141 
   
  
    
    Genotype data for Palanan farmers Philippines 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins 1 array 
      
    
   
  1 
 
  
    EGAD00010002143 
   
  
    
    Illumina HumanCytoSNP-12v2.1 BeadChip 
    
   
  
    
      
      BeadChip 
      
    
   
  7 
 
  
    EGAD00010002146 
   
  
    
    metabolite levels provided by UM platform (Creative Dynamics Inc, NY, USA) (the data is raw abundance. Mapping was applied on log10 transformed data) 
    
   
  
    
   
  482 
 
  
    EGAD00010002147 
   
  
    
    Covariate phenotypes, including gender (1=Female/0=Male), age and contraceptive 
    
   
  
    
   
  482 
 
  
    EGAD00010002148 
   
  
    
    metabolite levels measured by general metabolomics (Boston, USA) (the data is raw abundance. Mapping was applied on log10 transformed data) 
    
   
  
    
      
      flow-injection TOF-M spectrometry. 
      
    
   
  482 
 
  
    EGAD00010002149 
   
  
    
    Genotype data from healthy Dutch individuals measured by Illumina humanOmniExpress Exome-8v1.0 SNP chip. Calling by Opticall 7.0 
    
   
  
    
      
      Illumina humanOmniExpress Exome-8v1.0 SNP chip 
      
    
   
  482 
 
  
    EGAD00010002150 
   
  
    
    metabolite levels measured by Brainshake Metabolomics/Nightingale Health metabolic platform (log2) 
    
   
  
    
      
      Nightingale's technology 
      
    
   
  482 
 
  
    EGAD00010002152 
   
  
    
    This resource contains the SV annotations using the AnnotSV tool. The description of annotations can be found in AnnotSV web page https://lbgi.fr/AnnotSV/ or GCAT-BSC web page: http://cg.bsc.es/GCAT_BSC_iberianpanel 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  785 
 
  
    EGAD00010002153 
   
  
    
    This dataset includes the .hap, .legend and .sample files from the GCAT|Panel (Iberian reference panel), built from 785 samples, after QC, from the 808 WGS GCAT cohort, including 30.3M SNVs, 5M Indels and 89K SVs. This resource has been generated using Shapeit4 and WhatsHap software. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  785 
 
  
    EGAD00010002155 
   
  
    
    Third batch of ChIP-seq narrowPeaks. Software: MACS2 v2.1.2 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  5 
 
  
    EGAD00010002157 
   
  
    
    This dataset includes IDAT files from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using the Illumina Infinium MethylationEPIC Array. 
    
   
  
    
      
      Illumina Infinium MethylationEPIC BeadChip 
      
    
   
  11 
 
  
    EGAD00010002165 
   
  
    
    Genotyping using Global Screening Array 
    
   
  
    
      
      Global Screening Array 
      
    
   
  50 
 
  
    EGAD00010002166 
   
  
    
    Genotyping using Illumina OncoArray BeadChip 
    
   
  
    
      
      Illumina OncoArray BeadChip 
      
    
   
  332 
 
  
    EGAD00010002168 
   
  
    
    HapMap samples for haplotyping and copy-number profiling via SNP array 
    
   
  
    
      
      Illumina HumanCytoSNP-12 v2.1 
      
    
   
  11 
 
  
    EGAD00010002169 
   
  
    
    PGT samples for haplotyping and copy-number profiling via SNP array 
    
   
  
    
      
      Illumina HumanCytoSNP-12 v2.1 
      
    
   
  39 
 
  
    EGAD00010002171 
   
  
    
    Cohort: Raw genotype files for Hostage2 cohort. Genotype Chip: Illumina's (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37 
    
   
  
    
   
  306 
 
  
    EGAD00010002172 
   
  
    
    Cohort: Raw genotype files for BRACOVID cohort. Genotype Chip: Axiom_PMRA.r3 array genomic build: b37 
    
   
  
    
   
  348 
 
  
    EGAD00010002173 
   
  
    
    Cohort: Raw genotype files for Hostage3 cohort. Genotype Chip: Illumina's (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37 
    
   
  
    
   
  71 
 
  
    EGAD00010002174 
   
  
    
    Cohort: Raw genotype files for INMUNGEN_CoV2 cohort. Genotype Chip: HumanCore Exome Chip (Illumina) and Axiom Spanish Biobank Array (Thermofisher) genomic build: b37 
    
   
  
    
   
  367 
 
  
    EGAD00010002176 
   
  
    
    Cohort: Raw genotype files for SPGRX cohort. Genotype Chip: the Illumina Global Screening Array-24 v3.0 genomic build: b38 
    
   
  
    
   
  364 
 
  
    EGAD00010002177 
   
  
    
    Cohort: Raw genotype files for GEN_COVID cohort. Genotype Chip: Illumina Global Screening Array-24 v3.0 + Multi-Disease beadchip genomic build: b37 
    
   
  
    
   
  1141 
 
  
    EGAD00010002178 
   
  
    
    Cohort: Raw genotype files for Hostage4 cohort. Genotype Chip: Illumina's (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37 
    
   
  
    
   
  121 
 
  
    EGAD00010002179 
   
  
    
    Cohort: Raw genotype files for BelCovid2 cohort. Genotype Chip: Illumina Global Screening Array-24 v3.0 + Multi-Disease beadchip genomic build: b37 
    
   
  
    
   
  392 
 
  
    EGAD00010002180 
   
  
    
    Cohort: Raw genotype files for Hostage1 cohort. Genotype Chip: Illumina's (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37 
    
   
  
    
   
  847 
 
  
    EGAD00010002182 
   
  
    
    Tumor biopsies profiled by DNA methylation array 
    
   
  
    
      
      Illumina Human Methylation EPIC 
      
    
   
  133 
 
  
    EGAD00010002184 
   
  
    
    Genotype data of 7,281 individuals with colorectal cancer from the National Study of Colorectal Cancer Genetics (NSCCG) study. Individuals genotyped on the Illumina OncoArray. Data provided in plink format and has not been quality controlled. Control samples used were obtained from the PRACTICAL and BCAC consortia, and are available through the respective Data Access Coordination Committees (http://practical.icr.ac.uk and http://bcac.ccge.medschl.cam.ac.uk/) 
    
   
  
    
      
      Illumina OncoArray 
      
    
   
  7281 
 
  
    EGAD00010002186 
   
  
    
    Genotype data of 1,950 individuals from the COIN and COIN-B trials of advanced/metastatic colorectal cancer. Data provided in plink format, and has been quality controlled. Control data used was from the WTCCC2 project National Blood Donors (NBS) Cohort (EGAD00000000024). 
    
   
  
    
      
      Illumina 
      
    
   
  1950 
 
  
    EGAD00010002188 
   
  
    
    Genome-wide genotypes of women with misoprostol-induced high fever 
    
   
  
    
      
      Illumina Infinium Global Screening Array 
      
    
   
  50 
 
  
    EGAD00010002189 
   
  
    
    Genome-wide genotypes of women with misoprostol-induced high fever 
    
   
  
    
      
      Illumina Infinium Global Screening Array 
      
    
   
  46 
 
  
    EGAD00010002191 
   
  
    
    SOMAscan plasma proteome datasets generated from participants consuming the orange fiber snack prototype (study 2) 
    
   
  
    
      
      SOMAscan 1.3K Proteomic Assay 
      
    
   
  - 
 
  
    EGAD00010002192 
   
  
    
    SOMAscan plasma proteome datasets generated from participants consuming the pea fiber snack prototype (study 1) 
    
   
  
    
      
      SOMAscan 1.3K Proteomic Assay 
      
    
   
  - 
 
  
    EGAD00010002194 
   
  
    
    Raw idat files for 90 RS + DLBCL + CLL samples. 
    
   
  
    
      
      Illumina EPIC microarray 
      
    
   
  90 
 
  
    EGAD00010002198 
   
  
    
    (A)FAP Colon Crypt - EPIC Methylation Array 
    
   
  
    
   
  1 
 
  
    EGAD00010002199 
   
  
    
    Endometrium Gland - EPIC Methylation Array 
    
   
  
    
   
  1 
 
  
    EGAD00010002200 
   
  
    
    Normal Colon Crypt - EPIC Methylation Array 
    
   
  
    
   
  1 
 
  
    EGAD00010002201 
   
  
    
    Small Intestine Crypt - EPIC Methylation Array 
    
   
  
    
   
  1 
 
  
    EGAD00010002206 
   
  
    
    KIR gene content imputation from single-nucleotide polymorphisms in the Finnish population 
    
   
  
    
      
      SNP genotyping array 
      
    
   
  818 
 
  
    EGAD00010002209 
   
  
    
    Expression dataset for CD34 sorted primary CML bone marrow samples 
    
   
  
    
      
      Illumina Beadchip HT12v4 
      
    
   
  34 
 
  
    EGAD00010002210 
   
  
    
    Expression dataset for DAC+PTC209 treated CD34 sorted primary CML bone marrow samples 
    
   
  
    
      
      Illumina Beadchip HT12v4 
      
    
   
  44 
 
  
    EGAD00010002211 
   
  
    
    Expression dataset for DAC treated CD34 sorted primary CML bone marrow samples 
    
   
  
    
      
      Illumina Beadchip HT12v4 
      
    
   
  48 
 
  
    EGAD00010002213 
   
  
    
    96 genotyped Philippine samples 
    
   
  
    
      
      Illumina 
      
    
   
  96 
 
  
    EGAD00010002216 
   
  
    
    DNA methylation array from primary samples 
    
   
  
    
      
      Illumina 450K 
      
    
   
  65 
 
  
    EGAD00010002218 
   
  
    
    TIGER samples PISA genotyping array data 
    
   
  
    
      
      Illumina 
      
    
   
  127 
 
  
    EGAD00010002220 
   
  
    
    Single blastomeres from blastocyst and familial samples for haplotyping and copy-number profiling via SNP array 
    
   
  
    
      
      Illumina HumanCytoSNP-12 v2.1 
      
    
   
  21 
 
  
    EGAD00010002223 
   
  
    
    Real patient variability Benchmark Dataset for Optimization of DIA data analysis workflows 
    
   
  
    
      
      Orbitrap Eclipse 
      
    
   
  92 
 
  
    EGAD00010002225 
   
  
    
    SNP Array from CB1003 using Cytoscan HD, Thermo Fisher Scientific 
    
   
  
    
      
      Cytoscan HD 
      
    
   
  3 
 
  
    EGAD00010002229 
   
  
    
    ASD samples using Illumina Infinium Human Core-24 BeadChip platform 
    
   
  
    
      
      Illumina Infinium HumanCore-24 BeadChip platform 
      
    
   
  139 
 
  
    EGAD00010002231 
   
  
    
    Raw methylation data from buccal samples. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  227 
 
  
    EGAD00010002232 
   
  
    
    Raw methylation data from cervical samples. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  229 
 
  
    EGAD00010002233 
   
  
    
    Raw methylation data from blood samples. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  232 
 
  
    EGAD00010002235 
   
  
    
    CN Array samples from lymphoma patient tumours on Affymetrix platforms 
    
   
  
    
      
      Affymetrix Oncoscan, Affymetrix SNP6.0, Affymetrix CytoScan HD 
      
    
   
  95 
 
  
    EGAD00010002237 
   
  
    
    Proteome characterization in primary colorectal cancer and corresponding liver metastasis 
    
   
  
    
      
      Qexactive Plus 
      
    
   
  42 
 
  
    EGAD00010002239 
   
  
    
    Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_cytoscan) 
    
   
  
    
      
      Cytoscan 
      
    
   
  749 
 
  
    EGAD00010002241 
   
  
    
    Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_oncoscan) 
    
   
  
    
      
      Oncoscan 
      
    
   
  349 
 
  
    EGAD00010002243 
   
  
    
    Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_agilent) 
    
   
  
    
      
      Agilent 
      
    
   
  56 
 
  
    EGAD00010002248 
   
  
    
    Genotypes for 2 human skeletal muscle samples 
    
   
  
    
      
      Illumina Infinium multi-ethnic global-8 v1 kit 
      
    
   
  2 
 
  
    EGAD00010002250 
   
  
    
    Genotyping array data for normal mammary gland control samples 
    
   
  
    
      
      Illumina 
      
    
   
  50 
 
  
    EGAD00010002251 
   
  
    
    Genotyping array data for breast cancer and matched normal mammary gland samples 
    
   
  
    
      
      Illumina 
      
    
   
  100 
 
  
    EGAD00010002253 
   
  
    
    Methylation data of tumors using Illumina Infinium MethylationEPIC 
    
   
  
    
      
      Infinium MethylationEPIC 
      
    
   
  57 
 
  
    EGAD00010002255 
   
  
    
    A total of 87 microarrays from HCC patients treated with anti-PD1 inhibitors 
    
   
  
    
      
      Clariom S Array, human 
      
    
   
  87 
 
  
    EGAD00010002257 
   
  
    
    SNP Array Data for EGAS00001004666 
    
   
  
    
      
      Illumina Global Screening Array-24 V1 HTS GSA+Multi-Disease 
      
    
   
  100 
 
  
    EGAD00010002259 
   
  
    
    Myeloma methylation data 
    
   
  
    
      
      Illumina Infinium HumanMethylation450 (450k) 
      
    
   
  442 
 
  
    EGAD00010002261 
   
  
    
    Genotypes generated for study investigating signals of selection in Peruvians from three ecological regions. 96 genotypes in plink format after QC filtering (missingness per individual, per variant and minor allele freq). See publication for more details on QC filtering. 
    
   
  
    
      
      Illumina MEGA Array 
      
    
   
  95 
 
  
    EGAD00010002263 
   
  
    
    Nasal DNA methylation at three CpG sites predicts childhood allergic disease 
    
   
  
    
      
      Illumina 450K 
      
    
   
  696 
 
  
    EGAD00010002273 
   
  
    
    Polynesian genotypes 
    
   
  
    
      
      AxiomLAT 
      
    
   
  78 
 
  
    EGAD00010002275 
   
  
    
    Raw methylation array data for tumor samples from patients with newly diagnosed, recurrent intermediate or high-grade sarcoma. 
    
   
  
    
      
      Illumina MethylationEPIC BeadChip Array 
      
    
   
  48 
 
  
    EGAD00010002277 
   
  
    
    Whole-genome DNA methylation profiling of PBL obtained from male patients with PSC-UC, or UC alone, or healthy individuals. 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  47 
 
  
    EGAD00010002279 
   
  
    
    Illumina EPIC arrays of human osteoblastomas and their mimics 
    
   
  
    
      
      Illumina Infinium MethylationEPIC BeadChip array 
      
    
   
  50 
 
  
    EGAD00010002281 
   
  
    
    Assessment of methylation status of ~850,000 sites 
    
   
  
    
      
      Illumina HT12 
      
    
   
  20 
 
  
    EGAD00010002283 
   
  
    
    Genome-wide DNA Methylation Data from Illumina HumanMethylationEPIC arrays for whole blood samples from 403 healthy individuals. Additional raw data (IDAT files) and associated phenotype information are available for all individuals included in this study (n=570) directly from CIBMTR. Data are available under controlled access release upon reasonable request and execution of a data use agreement. Requests should be submitted to CIBMTR at info-request@mcw.edu and include the study reference IB17-04 
    
   
  
    
      
      EPIC BeadChip 
      
    
   
  403 
 
  
    EGAD00010002285 
   
  
    
    The compressed file contains plink format file for the Affymetrix Human Origins SNP array data of 452 individuals generated and analyzed in Kutanan, Liu et al 2021 study of 33 ethnolinguistic groups in Thailand and Laos. 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins array 
      
    
   
  452 
 
  
    EGAD00010002287 
   
  
    
    The compressed file contains plink format file for the Affymetrix Human Origins SNP array data of 260 individuals generated and analyzed in Liu et al 2020 study of 22 ethnolinguistic groups in Vietnam. 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins array 
      
    
   
  260 
 
  
    EGAD00010002289 
   
  
    
    DNA methylation profiles of samples included in the EORTC 26091 TAVAREC trial 
    
   
  
    
      
      Infinium MethylationEPIC BeadChip array 
      
    
   
  125 
 
  
    EGAD00010002291 
   
  
    
    Blood samples were obtained from 119 healthy individuals of British ancestry. Genomic DNA was isolated from a suspension of PBMCs from each individual using a DNA isolation kit (Qiagen). Genotyping was then performed using the Infinium CoreExome-24 (v1.3) chip (Illumina). 
    
   
  
    
      
      Infinium CoreExome-24 (v1.3) chip (Illumina) 
      
    
   
  127 
 
  
    EGAD00010002294 
   
  
    
    Single Nucleotide Polymorphisms in autosomes of Canary Islanders 
    
   
  
    
      
      Axiom® Genome-Wide Human CEU 1 Array 
      
    
   
  863 
 
  
    EGAD00010002296 
   
  
    
    nasopharyngeal carcinoma genome-wide human SNP array data for 4083 NPC cases and 4811 controls 
    
   
  
    
      
      Illumina 
      
    
   
  8894 
 
  
    EGAD00010002298 
   
  
    
    nasopharyngeal carcinoma genome-wide human SNP array data for 423 NPC cases and 573 controls 
    
   
  
    
      
      Illumina 
      
    
   
  996 
 
  
    EGAD00010002302 
   
  
    
    The tar archive contains unfiltered genotype data from Reich et al AJHG 2011 study in plink format 
    
   
  
    
      
      Affymetrix 6.0 array 
      
    
   
  262 
 
  
    EGAD00010002304 
   
  
    
    The tar archive contains a) the txt file with the genotypes, b) the Illumina annotation file with info on SNPs, c) the sample info file. Unfiltered Illumina data, autosomes only, from Pugach et al MBE 2016, "The Complex Admixture History and Recent Southern Origins of Siberian Populations" 
    
   
  
    
      
      Illumina 660W-Quad arrays 
      
    
   
  96 
 
  
    EGAD00010002306 
   
  
    
    The tar archive contains unfiltered genotype data from Pugach et al 2018 in plink format 
    
   
  
    
      
      Affymetrix Axiom Human Origins array 
      
    
   
  181 
 
  
    EGAD00010002308 
   
  
    
    Combined genotyping files from 13 PBMC samples 
    
   
  
    
      
      Illumina Infinium Omni2.5-8 
      
    
   
  13 
 
  
    EGAD00010002310 
   
  
    
    Renal cell carcinoma (RCC) cases comprised adult patients with histologically proven RCC, collected through two sources within the UK. First, 856 cases from SORCE, an MRC collection of surgically treated RCC cases ascertained through UK clinical oncology centres. Second, 189 RCC cases collected through the ICR and Royal Marsden NHS Hospitals Trust. Cases included 590 clear cell carcinomas (CCCs), 42 papillary carcinomas (PCs), 33 chromophobe carcinomas (CCs) and 19 mixed or other histological subtypes. DNA was extracted from EDTA-venous blood samples using conventional methods and quantified using PicoGreen (Invitrogen). Cases were genotyped using the Human OmniExpress-12 BeadChip according to the manufacturer's recommendations (Illumina Inc, San Diego, CA, USA). After strict QC, 944 cases were retained. Data provided in plink format. Controls used were data from the Wellcome Trust Case Control Consortium 2 (WTCCC2) 1958 birth cohort and the UK Blood Service Control Group (available as EGAS00000000028). 
    
   
  
    
      
      Illumina Omni Express BeadChip 
      
    
   
  944 
 
  
    EGAD00010002312 
   
  
    
    Illumina EPIC arrays Naevus Melanoma Spitz Case 
    
   
  
    
      
      Illumina EPIC array 
      
    
   
  24 
 
  
    EGAD00010002314 
   
  
    
    Comparative proteome-based analysis of different autologous bone entities used for alveolar onlay grafting 
    
   
  
    
      
      Qexactive Plus 
      
    
   
  75 
 
  
    EGAD00010002316 
   
  
    
    Column 1 rsid: SNP identifier; Column 2 chromosome: name of chromosome on which the SNP is located; Column 3 position: base pair position on the chromosome; Column 4 minor_test_allele: the base that constitutes the minor allele; Column 5 major_allele: the base that constitutes the major allele; Column 6 maf: the frequency of the minor allele, indicated as a fraction of 1; Column 7 allele_freq_cases: the minor allele frequency in cases; Column 8 allele_freq_controls: the minor allele frequency in controls; Column 9 regression_pvalue: the p-value for the difference in allele frequency between cases and controls; Column 10 odds_ratio: the odds ratio, as calculated using logistic regression under an additive model with adjustment for the first ten principal components of ancestry 
    
   
  
    
   
  1 
 
  
    EGAD00010002319 
   
  
    
    Japanese COVID-19 PLINK file 
    
   
  
    
      
      Infinium Asian Screening Array (Illumina, USA) 
      
    
   
  2393 
 
  
    EGAD00010002321 
   
  
    
    Methylation arrays (850K) 
    
   
  
    
      
      EPIC BeadChips (Illumina) 
      
    
   
  33 
 
  
    EGAD00010002323 
   
  
    
    GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) related to Reustle et al, Genome Med 12:2020 32. Preprocessing of microarray data was performed using Robust Multi-array Average (RMA). 
    
   
  
    
      
      GeneChip HTA 2.0 
      
    
   
  53 
 
  
    EGAD00010002325 
   
  
    
    high-risk localized ccRCC 
    
   
  
    
      
      Affymetrix HTA 2.0 
      
    
   
  236 
 
  
    EGAD00010002327 
   
  
    
    This study includes 1146 samples of host genotyping data (imputed) from Illumina Omni arrays, using https://imputation.sanger.ac.uk/ with the Haplotype Reference Consortium v1.1. Samples were collected from adult (>16 yrs) patients with CSF-confirmed bacterial meningitis in the Netherlands between 2006 and 2015. Metadata includes patient outcome, species of bacteria, and for 467 samples a link to an ENA run with the associated bacterial genome (S. pneumoniae only). 
    
   
  
    
      
      Illumina Human Omni1-Quad beadchip. 
      
    
   
  1149 
 
  
    EGAD00010002328 
   
  
    
    This study includes 1146 samples of host genotyping data (genotyped) from Illumina Omni arrays. Samples were collected from adult (>16 yrs) patients with CSF-confirmed bacterial meningitis in the Netherlands between 2006 and 2015. Metadata includes patient outcome, species of bacteria, and for 467 samples a link to an ENA run with the associated bacterial genome (S. pneumoniae only). 
    
   
  
    
      
      Illumina Human Omni1-Quad beadchip. 
      
    
   
  1149 
 
  
    EGAD00010002330 
   
  
    
    We performed a proteomic serum profiling of patients with non-metastasized breast cancer (BC) who received neoadjuvant chemotherapy (NACT). Samples were collected at three timepoints during NACT. Furthermore, we compared serum samples of BC patients pre-NACT to a control group of healthy volunteers. 
    
   
  
    
      
      Qexactive Plus 
      
    
   
  84 
 
  
    EGAD00010002336 
   
  
    
    K562 cells were treated with different HSP90 inhibitors (PuH71 and Coumermycin A1) and the CNV profile was compared to that of the parental K562 cells (untreated). In addition, the CNV profile of HSP90AB1 knockout K562 cells was analyzed. 
    
   
  
    
      
      Illumina NextSeq 550 
      
    
   
  4 
 
  
    EGAD00010002338 
   
  
    
    Chordoma tumors DNA methylation profiling by genome tiling array 
    
   
  
    
      
      Illumina 
      
    
   
  68 
 
  
    EGAD00010002340 
   
  
    
    PLINK file of Japanese controls 
    
   
  
    
      
      Infinium Asian Screening Array (Illumina, USA) 
      
    
   
  2380 
 
  
    EGAD00010002342 
   
  
    
    Diagnostic yield of the Affymetrix Optima microarray in patients with non-syndromic autism spectrum disorders in India. 
    
   
  
    
      
      Affymetrix CytoScan Optima 
      
    
   
  99 
 
  
    EGAD00010002344 
   
  
    
    Blood DNA samples from 1,433 contemporary ni-Vanuatu were genotyped on the Illumina Infinium Omni 2.5-8 array. Genotype calling was performed using the Illumina GenomeStudio software. 
    
   
  
    
      
      Infinium Omni2.5-8 BeadChip 
      
    
   
  1433 
 
  
    EGAD00010002346 
   
  
    
    Human islet samples genotype data 
    
   
  
    
      
      NA 
      
    
   
  128 
 
  
    EGAD00010002350 
   
  
    
    Shotgun Proteomics; Glioblastoma samples from 11 patients were obtained at initial and recurrent tumor stages. Proteins were extracted, identified and quantified via tandem mass spectrometry based on a TMT isobaric labelling approach. Quantitative proteomics reveals 146 differentially abundant proteins using patient-matched statistical modelling. Analysis of proteolytic processing reveals differential proteolytic patterns in recurrent tumors. Proteogenomics reveals the presence of 30 single-amino acid variants in glioblastoma tumors, 1 of which is increased in recurrent tumors. 
    
   
  
    
      
      Qexactive Plus 
      
    
   
  22 
 
  
    EGAD00010002352 
   
  
    
    GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) related to Reustle et al., Clin Transl Med 12:2022 e883. Microarrays were normalized individually using the SCAN method from the R package SCAN.UPC (version 2.26.0, R version 3.6.1). Probe sets were summarized on the Entrez GeneID level using the annotation provided by BrainArray (version 23). 
    
   
  
    
      
      GeneChip HTA 2.0 
      
    
   
  124 
 
  
    EGAD00010002353 
   
  
    
    GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) related to Buettner et al, Genome Med 2022. Microarrays were normalized individually using the SCAN method from the R package SCAN.UPC (version 2.26.0, R version 3.6.1). Probe sets were summarized on the Entrez GeneID level using the annotation provided by BrainArray (version 23). 
    
   
  
    
      
      GeneChip HTA 2.0 
      
    
   
  306 
 
  
    EGAD00010002355 
   
  
    
    Methylation microarray data (Illumina 850K) of 52 thymic epithelial tumors. 13 patients with thymoma A and B, 32 thymic carcinoma (TC) and 7 neuroendocrine tumors of the thymus (NET). 
    
   
  
    
      
      Illumina 850k 
      
    
   
  52 
 
  
    EGAD00010002357 
   
  
    
    Methylation files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina Infinium MethylationEPIC 
      
    
   
  12 
 
  
    EGAD00010002359 
   
  
    
    Tumor and matched normal DNA profiling by SNP array 
    
   
  
    
      
      Illumina Infinium OmniExpress-24 BeadChip array 
      
    
   
  111 
 
  
    EGAD00010002361 
   
  
    
    Samples from the Mexican Biobank 
    
   
  
    
      
      Illumina MEGA Array 
      
    
   
  6057 
 
  
    EGAD00010002363 
   
  
    
   
  
    
      
      Affymetrix 
      
    
   
  46 
 
  
    EGAD00010002365 
   
  
    
    The 6431 samples were genotyped on the H3Africa array. 
    
   
  
    
      
      Illumina 
      
    
   
  6431 
 
  
    EGAD00010002367 
   
  
    
    CONTROL_SAMPLES using platform ….. 
    
   
  
    
   
  155 
 
  
    EGAD00010002368 
   
  
    
    CASE_SAMPLES using platform ….. 
    
   
  
    
   
  113 
 
  
    EGAD00010002370 
   
  
    
    450k methylation arrays of primary and relapse tumor of a single case of sonic hedgehog medulloblastoma with Li-Fraumeni syndrome 
    
   
  
    
      
      HumanMethylation450 
      
    
   
  2 
 
  
    EGAD00010002372 
   
  
    
    This dataset includes IDAT files from 160 samples (57 primary prostate cancers, 95 prostate-derived brain metastases, and 7 normal tissues). The samples were profiled using the Illumina Infinium MethylationEPIC BeadChips (850K). 
    
   
  
    
      
      Illumina Infinium MethylationEPIC 850K 
      
    
   
  160 
 
  
    EGAD00010002374 
   
  
    
    DNA was extracted from saliva samples and genotyping was performed on Illumina Infinium Global Screening Array. 
    
   
  
    
      
      Global Screening Array 
      
    
   
  1880 
 
  
    EGAD00010002375 
   
  
    
    DNA was extracted from saliva samples and genotyping was performed on Illumina Infinium HumanCoreExome beadchips. 
    
   
  
    
      
      HumanCoreExome 
      
    
   
  3295 
 
  
    EGAD00010002377 
   
  
    
    Gene transcript data from ALI-cultured airway cells, acquired using microarrays. 
    
   
  
    
      
      Affymetrix Genetitan 
      
    
   
  19 
 
  
    EGAD00010002379 
   
  
    
    This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell Visium Experiments, as well as the corresponding imaging data (Analyses). 
    
   
  
    
      
      10x Genomics spatial transcriptomics (Visium) 
      
    
   
  8 
 
  
    EGAD00010002381 
   
  
    
    This dataset includes raw label-free mass spectrometry proteomics data of different sinonasal tumor entities as well as normal sinonasal tissue. 72 samples were processed on a Q Exactive HF-X instrument coupled to an easy nanoLC 1200 system using one microgram of peptides and a 110-minute gradient. 
    
   
  
    
      
      Q Exactive HF-X instrument 
      
    
   
  72 
 
  
    EGAD00010002383 
   
  
    
    Data from 59 whole blood samples from pregnant mothers, unexposed and exposed to the Rwandan genocide, were generated using the Infinium MethylationEPIC BeadChip Kit. 
    
   
  
    
      
      IlluminaEpic 
      
    
   
  59 
 
  
    EGAD00010002386 
   
  
    
    DNA methylation of PDAC precursors and normal pancreas cell populations 
    
   
  
    
      
      Illumina EPIC Array 
      
    
   
  108 
 
  
    EGAD00010002388 
   
  
    
    Shotgun Proteomics, Proteomic characterization of the residual PDAC tumor mass after neoadjuvant chemo or combined chemo-radiation therapy 
    
   
  
    
      
      Qexactive Plus 
      
    
   
  79 
 
  
    EGAD00010002390 
   
  
    
    Paediatric tumour cell models DNA methylation EPIC array 
    
   
  
    
      
      EPIC 
      
    
   
  151 
 
  
    EGAD00010002392 
   
  
    
    GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) and RCC metastases related to Guergen et al, Front Oncol 12:2022 889789. Microarrays were normalized individually using the SCAN method from the R package SCAN.UPC (version 2.34.0). Probe sets were summarized on the Entrez GeneID level using the annotation provided by BrainArray (version 25). 
    
   
  
    
      
      GeneChip HTA 2.0 
      
    
   
  24 
 
  
    EGAD00010002394 
   
  
    
    Genome-wide SNP data from 221 individuals from Northwestern Amazonia genotyped on the Affymetrix Human Origins Array 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins array 
      
    
   
  221 
 
  
    EGAD00010002396 
   
  
    
    Analysis of cocaine use disorder (CUD) associated epigenome-wide DNA methylation (DNAm) alterations in human postmortem brain tissue of Brodmann Area 9. Tissue samples from N=21 CUD cases and N=21 individuals without CUD originating from the Douglas Bell Canada Brain Bank (DBCBB) were included. Epigenome-wide DNAm was investigated using the Illumina Infinium MethylationEPIC array. 
    
   
  
    
      
      Infinium MethylationEPIC array 
      
    
   
  84 
 
  
    EGAD00010002398 
   
  
    
    Tumor and matched normal DNA profiling by SNP array 
    
   
  
    
      
      Illumina Infinium OmniExpress-24 BeadChip array 
      
    
   
  82 
 
  
    EGAD00010002400 
   
  
    
    We demonstrate that ATRT tumoroids retain subgroup-specific epigenetic and gene expression profiles 
    
   
  
    
      
      Illumina Infinium EPIC 
      
    
   
  8 
 
  
    EGAD00010002404 
   
  
    
    Methylation array 
    
   
  
    
      
      iScan 
      
    
   
  6 
 
  
    EGAD00010002406 
   
  
    
    Tertiary lymphoid structure signatures are associated with immune checkpoint inhibitor related acute interstitial nephritis 
    
   
  
    
      
      Nanostring 
      
    
   
  22 
 
  
    EGAD00010002408 
   
  
    
    Two primary tumor-derived PDAC organoids were subjected to SNP array, RNA-seq, and single-cell WGS 
    
   
  
    
      
      Illumina Infinium Global Screening Array-24 
      
    
   
  2 
 
  
    EGAD00010002410 
   
  
    
    Average methylation difference 12 months vs 0 months at Roadmap Epigenomics chromatin state annotations from different cell types using nanopolish. Data from 8 individuals. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002411 
   
  
    
    Average hypermethylation on transcription factor binding sites based on nanopolish calls; only positions showing higher methylation than sample’s average methylation at enhancers were included when defining the average methylation level. Data from 6 individuals at different time points. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002412 
   
  
    
    Average genome-wide methylation levels per sample at different time points using nanopolish calls. Data from 8 individuals. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002413 
   
  
    
    Average methylation levels based on nanopolish calls from Roadmap Epigenomics chromatin state annotations using different cell types. Data from 8 individuals at different time points. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002414 
   
  
    
    Average hydroxymethylation levels based on megalodon calls from Roadmap Epigenomics chromatin state annotations using different cell types. Data from 8 individuals at different time points. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002415 
   
  
    
    Average hydroxymethylation difference 12 months vs 0 months at Roadmap Epigenomics chromatin state annotations from different cell types. Data from 8 individuals. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002416 
   
  
    
    Proportion of hyper- and hypomethylated positions at Roadmap annotations. Data from 8 individuals. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002417 
   
  
    
    Average hydroxymethylation levels based on megalodon calls from Roadmap Epigenomics histone mark annotations using different cell types. Data from 8 individuals at different time points. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002418 
   
  
    
    CpG hydroxymethylation. Software: minimap2 v2.16; Megalodon. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  24 
 
  
    EGAD00010002419 
   
  
    
    Average genome-wide hydroxymethylation levels per sample at different time points using megalodon calls. Data from 8 individuals. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002420 
   
  
    
    CpG methylation. Software: minimap2 v2.16; Nanopolish. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  24 
 
  
    EGAD00010002421 
   
  
    
    Average hydroxymethylation levels on transcription factor binding sites obtained from ENCODE (ChIP-sequencing of GM12878 lymphoblastoid cell line). Data from 6 individuals at different time points. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002422 
   
  
    
    Average methylation levels based on nanopolish calls from Roadmap Epigenomics histone mark annotations using different cell types. Data from 8 individuals at different time points. 
    
   
  
    
      
      Oxford Nanopore 
      
    
   
  1 
 
  
    EGAD00010002424 
   
  
    
    The compressed file contains PLINK-format files for the Affymetrix Human Origins SNP array data of 55 individuals generated and analyzed in the Liu et al. 2023 study of Taiwanese groups. 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins array 
      
    
   
  55 
 
  
    EGAD00010002427 
   
  
    
    PLINK file of the Japanese population 
    
   
  
    
      
      Infinium Asian Screening Array (Illumina, USA) 
      
    
   
  142 
 
  
    EGAD00010002431 
   
  
    
    RCC files of 17 Cartridges' Panel Standards 
    
   
  
    
      
      NanoString nCounter® PanCancer IO 360™ 
      
    
   
  17 
 
  
    EGAD00010002432 
   
  
    
    RCC files of 17 Cartridges from metastatic melanoma 
    
   
  
    
      
      NanoString nCounter® PanCancer IO 360™ 
      
    
   
  185 
 
  
    EGAD00010002434 
   
  
    
    51 Ashaninka individuals from Peru (Pasco) genotyped with Axiom Human Origins (Affymetrix) 
    
   
  
    
      
      Axiom Human Origins (Affymetrix) 
      
    
   
  51 
 
  
    EGAD00010002436 
   
  
    
    postQC genotype data from the Affymetrix AxiomTM HGCoV2 1 array in plink binary format. QC was carried out using PLINK v1.9 
    
   
  
    
      
      Affymetrix AxiomTM HGCoV2 1 
      
    
   
  1192 
 
  
    EGAD00010002437 
   
  
    
    preQC genotype data from the Affymetrix AxiomTM HGCoV2 1 array in plink ped/map format 
    
   
  
    
   
  1226 
 
  
    EGAD00010002441 
   
  
    
    Methylation data on tumor (n=102) and normal nerve (n=7) DNA samples 
    
   
  
    
      
      Infinium HumanMethylationEPIC beadchip array 
      
    
   
  109 
 
  
    EGAD00010002443 
   
  
    
    Microarray data of 14 patient-derived PDAC cultures 
    
   
  
    
      
      Affymetrix Human Clariom S 
      
    
   
  14 
 
  
    EGAD00010002445 
   
  
    
    Genotyping data for ACE2 (rs2285666), MX1 (rs469390) and TMPRSS2 (rs2070788) variants. Patients are classified as mild (n=34) and severe (n=32). DNA genotyping was performed using the TaqMan® Genotyping Master Mix (Applied Biosystems). Allelic discrimination assays were performed on a 7900HT Fast Real-Time PCR System (Applied Biosystems). 
    
   
  
    
      
      7900HT Fast Real-Time PCR System 
      
    
   
  66 
 
  
    EGAD00010002447 
   
  
    
    SomaLogic data 
    
   
  
    
      
      SomaLogic 
      
    
   
  1188 
 
  
    EGAD00010002449 
   
  
    
    Genotype data for 343 Japanese subjects obtained with Infinium Asian Screening Array. 
    
   
  
    
      
      Infinium Asian Screening Array 
      
    
   
  1 
 
  
    EGAD00010002451 
   
  
    
    Methylation profiling of 345 sarcoma and TFCP2-rearranged rhabdomyosarcoma samples, using the approach described in "Genomic, transcriptomic, functional, and mechanistic characterization of rhabdomyosarcoma with FUS-TFCP2 or EWSR1-TFCP2 fusions" 
    
   
  
    
      
      Infinium Methylation EPIC BeadChip 
      
    
   
  345 
 
  
    EGAD00010002453 
   
  
    
    Synthetic dataset containing genome-wide genotypes of 500,000 individuals, generated using a hybrid approach that combines a coalescent approach with resampling-based methods.
 
    
   
  
    
   
  500000 
 
  
    EGAD00010002456 
   
  
    
    This dataset included 110 samples with high hyperdiploid acute lymphoblastic leukemia that were genotyped using Affymetrix SNP Array or Illumina's BeadArray platform. 
    
   
  
    
      
      Affymetrix CytoScan HD, Illumina Human1M-Duo v3.0, Illumina HumanOmni1-Quad v1.0 and Illumina HumanOmni5-4v1 
      
    
   
  110 
 
  
    EGAD00010002458 
   
  
    
    The compressed file contains plink format files for the Affymetrix Human Origins SNP array data of 208 Angolan individuals 
    
   
  
    
      
      Affymetrix Axiom Genome-Wide Human Origins array 
      
    
   
  209 
 
  
    EGAD00010002461 
   
  
    
    Methylation of peripheral blood leukocytes from patients with Li-Fraumeni syndrome 
    
   
  
    
      
      Illumina HumanMethylation450 BeadChip/Illumina HumanMethylationEPIC BeadChip 
      
    
   
  400 
 
  
    EGAD00010002463 
   
  
    
    EPIC Array data from human lung fibroblasts isolated from fresh and cryopreserved lung tissue (16 samples, 3 donors) 
    
   
  
    
      
      Illumina_EPIC 
      
    
   
  16 
 
  
    EGAD00010002465 
   
  
    
    The dataset includes IDAT raw files for 10 samples and the analyzed DMP file which describes the differential methylation positions based on Illumina Infinium MethylationEPIC BeadChip. All samples (5 lung cancer cases vs. 5 benign lung disease controls) were obtained from bronchial washings at the site of the lesion under bronchoscopy manipulation. The histological type of the five lung cancer cases is adenocarcinoma and squamous cell carcinoma. 
    
   
  
    
      
      Illumina Infinium MethylationEPIC BeadChip (850 K) 
      
    
   
  10 
 
  
    EGAD00010002467 
   
  
    
    Individuals genotyped on the Illumina Omni2.5. Autosome and X chromosome. 
    
   
  
    
      
      Illumina SNP Array, Omni2.5-8 v1.3 
      
    
   
  2 
 
  
    EGAD00010002468 
   
  
    
    Individuals genotyped on the Illumina GSA v2. Autosome and X chromosome. 
    
   
  
    
      
      Illumina SNP Array, Global Screening Array v2 
      
    
   
  30 
 
  
    EGAD00010002470 
   
  
    
    Raw methylation data from blood samples in breast cancer cases. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  105 
 
  
    EGAD00010002471 
   
  
    
    Raw methylation data from blood samples in controls. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  211 
 
  
    EGAD00010002473 
   
  
    
    84 Indigenous and admixed individuals from Panama genotyped with Axiom Human Origins (Affymetrix) 
    
   
  
    
      
      Axiom Human Origins (Affymetrix) 
      
    
   
  84 
 
  
    EGAD00010002475 
   
  
    
    HC genotyping data for lead SNPs using Illumina Global Array V2.0 
    
   
  
    
   
  1 
 
  
    EGAD00010002476 
   
  
    
    AS genotyping data for lead SNPs using Illumina Global Array V2.0 
    
   
  
    
      
      Illumina Global Array V2.0 
      
    
   
  40 
 
  
    EGAD00010002478 
   
  
    
    RNA-seq (Illumina HiSeq 2500) of 142 Human Breast Cancer samples 
    
   
  
    
      
      Illumina 
      
    
   
  142 
 
  
    EGAD00010002482 
   
  
    
    SNP array genotyping of multi-site HGSOC samples 
    
   
  
    
      
      InfiniumOmniExpress-24v1-2_A1 
      
    
   
  305 
 
  
    EGAD00010002484 
   
  
    
    Genotype and phenotype data on 301 MS patients from Germany, Mainz. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  301 
 
  
    EGAD00010002485 
   
  
    
    Genotype and phenotype data on 575 MS patients from the Netherlands. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  575 
 
  
    EGAD00010002486 
   
  
    
    Genotype and phenotype data on 538 MS patients from Italy, OSR. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  538 
 
  
    EGAD00010002487 
   
  
    
    Genotype and phenotype data on 246 MS patients from Austria. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  246 
 
  
    EGAD00010002488 
   
  
    
    Genotype and phenotype data on 209 MS patients from the Netherlands. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  209 
 
  
    EGAD00010002489 
   
  
    
    Genotype and phenotype data on 683 MS patients from Germany, TUM. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  683 
 
  
    EGAD00010002490 
   
  
    
    Genotype and phenotype data on 1067 MS patients from Italy, Piedmont. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  1067 
 
  
    EGAD00010002491 
   
  
    
    Genotype and phenotype data on 943 MS patients from the UK. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  943 
 
  
    EGAD00010002492 
   
  
    
    Genotype and phenotype data on 151 MS patients from Spain. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  151 
 
  
    EGAD00010002493 
   
  
    
    Genotype and phenotype data on 140 MS patients from Greece. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  140 
 
  
    EGAD00010002495 
   
  
    
    Additional Methylation files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities" 
    
   
  
    
      
      Illumina Infinium MethylationEPIC 
      
    
   
  3 
 
  
    EGAD00010002497 
   
  
    
    Methylation array data of cfDNA from plasma samples of individuals after running a marathon, after a 40 min run, and at rest 
    
   
  
    
      
      Illumina 850K EPIC methylation array 
      
    
   
  6 
 
  
    EGAD00010002499 
   
  
    
    Individuals of Native American ancestry from Southern Chile genotyped with the Human Origins SNP Chip 
    
   
  
    
      
      Human Origins Axiom 
      
    
   
  64 
 
  
    EGAD00010002501 
   
  
    
    Raw methylation data from technical replicates processed on EPIC v1.0. 
    
   
  
    
      
      Illumina MethylationEPIC Array 
      
    
   
  48 
 
  
    EGAD00010002503 
   
  
    
    TANDEM samples genotyped on the Illumina H3A array at the CGPR, South Africa. 
    
   
  
    
      
      Illumina 
      
    
   
  107 
 
  
    EGAD00010002505 
   
  
    
    The sys4MS cohort comprises 350 patients with Multiple Sclerosis (MS) and 9 controls, with 2 years of follow-up. Baseline data includes demographics, clinical scales, disease duration and subtype and use of disease-modifying drugs, brain MRI (volumetry and lesion load), retinal thickness by OCT, genomics (GWAS), cytomics, and phosphoproteomics. Data at the end of follow-up includes clinical scales, brain MRI and OCT. 
    
   
  
    
      
      Illumina HumanOmniExpress-24 v1.2 array 
      
    
   
  400 
 
  
    EGAD00010002507 
   
  
    
    Active TB patients (sputum smear-positive and GeneXpert-positive) recruited at the Temeke District Hospital in Dar es Salaam, Tanzania, as part of a prospective study that ran between November 2013 and June 2022. 
    
   
  
    
      
      Illumina Infinium H3Africa (V2) with custom add-ons 
      
    
   
  1409 
 
  
    EGAD00010002509 
   
  
    
    SNP Genotyping for Lassa Fever cases and population controls from Nigeria and Sierra Leone using Illumina Omni 2.5M and 5M 
    
   
  
    
      
      Illumina Omni 2.5M, Illumina Omni 5M 
      
    
   
  2667 
 
  
    EGAD00010002510 
   
  
    
    SNP Genotyping for Lassa Fever cases and population controls from Nigeria and Sierra Leone using Illumina H3Africa array version 1 
    
   
  
    
      
      Illumina H3Africa array version 1 
      
    
   
  1345 
 
  
    EGAD00010002512 
   
  
    
    Access to proteomic files (DIA) of MIBC patient-derived xenografts (N=8) 
    
   
  
    
      
      Orbitrap Lumos 
      
    
   
  12 
 
  
    EGAD00010002513 
   
  
    
    Access to proteomic files (DIA) of patients with treatment-naive MIBC (N=51), treatment-naive NMIBC (N=17) and neoadjuvant MIBC (N=11) 
    
   
  
    
      
      Orbitrap Lumos 
      
    
   
  86 
 
  
    EGAD00010002515 
   
  
    
    51 DNA methylation arrays of human samples initially diagnosed as mesenchymal chondrosarcoma. Microdissection of the cartilage and/or the small round cell component from the same sample may have occurred. As mentioned in the sample descriptions, the diagnoses of four samples have been revised. Additional molecular investigations were conducted for a subset of samples as described in the related publication. 
    
   
  
    
      
      EPIC array (Illumina) 
      
    
   
  51 
 
  
    EGAD00010002517 
   
  
    
    Chromosome 12 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002518 
   
  
    
    Chromosome 22 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002519 
   
  
    
    Chromosome 8 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002520 
   
  
    
    Chromosome 18 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002521 
   
  
    
    Chromosome 3 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002522 
   
  
    
    Chromosome X imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002523 
   
  
    
    Chromosome 2 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002524 
   
  
    
    Chromosome 19 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002525 
   
  
    
    Chromosome 1 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002526 
   
  
    
    Chromosome 17 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002527 
   
  
    
    Chromosome 14 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002528 
   
  
    
    Chromosome 7 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002529 
   
  
    
    Chromosome 4 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002530 
   
  
    
    Chromosome 10 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002531 
   
  
    
    Chromosome 13 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002532 
   
  
    
    Chromosome 21 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002533 
   
  
    
    Chromosome 15 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002534 
   
  
    
    Chromosome 16 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002535 
   
  
    
    Chromosome 9 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002536 
   
  
    
    Chromosome 5 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002537 
   
  
    
    Chromosome 20 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002538 
   
  
    
    Chromosome 11 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002539 
   
  
    
    Chromosome 6 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38) 
    
   
  
    
      
      Axiom Array 
      
    
   
  1195 
 
  
    EGAD00010002543 
   
  
    
    Gene Expression Profiles measured using Affymetrix HGU133plus2.0 Array 
    
   
  
    
      
      Affymetrix HGU133plus2.0 
      
    
   
  83 
 
  
    EGAD00010002544 
   
  
    
    Copy Number profiles measured using Affymetrix SNP Array 6.0 
    
   
  
    
      
      Affymetrix SNP Array 6.0 
      
    
   
  83 
 
  
    EGAD00010002546 
   
  
    
    bulk TCR-seq data IMCISION on the PBMCs of responding patients bulkTCR-seq data generated with the immunoSEQ platform (Adaptive Biotechnologies) on PBMCs of responding patients, pre- and post-treatment. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  18 
 
  
    EGAD00010002551 
   
  
    
     3421 samples from Nigeria and Ghana, sequenced with the Illumina NextSeq 500 
    
   
  
    
      
       Illumina NextSeq 500 
      
    
   
  3421 
 
  
    EGAD00010002553 
   
  
    
    Expression of immune related genes in 12 familial adenomatous polyposis patients. Expression assessed by analyzing whole blood-derived RNA samples using a Nanostring nCounter Immunology V2 panel (579 genes) 
    
   
  
    
      
      Nanostring nCounter 
      
    
   
  12 
 
  
    EGAD00010002554 
   
  
    
    Expression of immune related genes in 12 healthy donors. Expression assessed by analyzing whole blood-derived RNA samples using a Nanostring nCounter Immunology V2 panel (579 genes) 
    
   
  
    
      
      Nanostring nCounter 
      
    
   
  12 
 
  
    EGAD00010002556 
   
  
    
    SNP array ARID1B patients 
    
   
  
    
      
      Illumina Infinium PsychArray-24 BeadChip v1.3 
      
    
   
  5 
 
  
    EGAD00010002559 
   
  
    
    H5 files generated for each sample with Tapestri Pipeline 
    
   
  
    
      
      Tapestri 
      
    
   
  28 
 
  
    EGAD00010002560 
   
  
    
    Bed files as whitelist for Tapestri Insights analysis 
    
   
  
    
      
      Tapestri 
      
    
   
  2 
 
  
    EGAD00010002561 
   
  
    
    Loom files generated for each sample with Tapestri Pipeline 
    
   
  
    
      
      Tapestri 
      
    
   
  28 
 
  
    EGAD00010002562 
   
  
    
    Tap files from Tapestri Insights 
    
   
  
    
      
      Tapestri 
      
    
   
  28 
 
  
    EGAD00010002564 
   
  
    
    Array CGH derived copy number variations from dicentric chromosome dic(9;20) positive pediatric Acute lymphocytic leukemia B-lymphocyte samples, by utilizing an Agilent 400K SurePrint G3 Custom CGH Human Genome Microarray (e-Array design 84704) 
    
   
  
    
      
      SurePrint G3 CGH 
      
    
   
  58 
 
  
    EGAD00010002567 
   
  
    
     Unfiltered genotype data for a pilot study (Batch 1) of 1,140 DDD Study participants (and 12 "Empty" samples). Samples include 380 mothers, 382 fathers and 378 probands, and form 376 trios. Most of the probands have been previously genotyped on the Illumina HumanCoreExome BeadChip (EGAD00010001598) or the Illumina InfiniumCoreExome Beadchip (EGAD00010001600). All samples were genotyped on the Illumina Global Screening Array. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  1140 
 
  
    EGAD00010002568 
   
  
    
     QC-ed data of 9,534 DDD Study participants, including 8,879 individuals with inferred GBR ancestry. Details of genotype QC can be found in https://www.medrxiv.org/content/10.1101/2023.04.20.23288860v1.full.pdf. Genome builds are indicated in the file name. Related individuals have not been removed. Of the 9,534 samples there are 3,148 mothers, 3,138 fathers and 3,248 probands, which form 3,099 trios. Of the 8,879 GBR samples, there are 2,931 mothers, 2,937 fathers and 3,011 probands, which form 2,788 trios. Most of the probands have been previously genotyped on the Illumina HumanCoreExome BeadChip (EGAD00010001598) or the Illumina InfiniumCoreExome Beadchip (EGAD00010001600). All samples were genotyped on the Illumina Global Screening Array. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  9534 
 
  
    EGAD00010002569 
   
  
    
     Unfiltered genotype data for a larger batch (Batch 2) of 8,697 DDD Study participants (and 1 "Blank" sample). Samples include 2,858 mothers, 2,857 fathers and 2,982 probands, and form 2,918 trios. Most of the probands have been previously genotyped on the Illumina HumanCoreExome BeadChip (EGAD00010001598) or the Illumina InfiniumCoreExome Beadchip (EGAD00010001600). All samples were genotyped on the Illumina Global Screening Array. 
    
   
  
    
      
      Illumina Global Screening Array 
      
    
   
  9846 
 
  
    EGAD00010002571 
   
  
    
    Methylation profile (array data using EPIC_850K) from tumour samples (epithelioid sarcoma) 
    
   
  
    
      
      EPIC_850K 
      
    
   
  32 
 
  
    EGAD00010002575 
   
  
    
     This dataset contains the cleaned genotype data from 2173 African esophageal squamous cell cancer cases and population controls. The genotype data was generated using the H3Africa Illumina Custom microarray. 
    
   
  
    
      
      Illumina HiScan 
      
    
   
  2173 
 
  
    EGAD00010002577 
   
  
    
    VaccGene HLA imputation panel variants and HLA allele calls for all individuals across all tested sequence platforms and countries, and the high quality direct genotypes for these individuals (with genotype data available) across the MHC in PLINK format 
    
   
  
    
      
      Illumina HumanOmni25M-8v1-1 
      
    
   
  2499 
 
  
    EGAD00010002578 
   
  
    
    Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 1391 individuals from EMaBS in Entebbe, Uganda. 
    
   
  
    
      
      Illumina HumanOmni25-8v1-1 
      
    
   
  1391 
 
  
    EGAD00010002579 
   
  
    
    Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 355 individuals from the VAC050 trial performed in Banfora, Burkina Faso - X chromosome. 
    
   
  
    
      
      Illumina HumanOmni25M-8v1-1 
      
    
   
  353 
 
  
    EGAD00010002580 
   
  
    
     Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 750 individuals from Respiratory and Meningeal Pathogens Unit in Soweto, South Africa - X chromosome. 
    
   
  
    
      
      Illumina HumanOmni25M-8v1-1 
      
    
   
  755 
 
  
    EGAD00010002581 
   
  
    
    Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 355 individuals from the VAC050 trial performed in Banfora, Burkina Faso - autosomes. 
    
   
  
    
      
      Illumina HumanOmni25M-8v1-1 
      
    
   
  353 
 
  
    EGAD00010002582 
   
  
    
     Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 750 individuals from Respiratory and Meningeal Pathogens Unit in Soweto, South Africa - autosomes. 
    
   
  
    
      
      Illumina HumanOmni25M-8v1-1 
      
    
   
  755 
 
  
    EGAD00010002583 
   
  
    
    Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 1391 individuals from EMaBS in Entebbe, Uganda - X chromosome. 
    
   
  
    
      
      Illumina HumanOmni25-8v1-1 
      
    
   
  1391 
 
  
    EGAD00010002585 
   
  
    
    Genome-wide CpG methylation information of cell-free DNA samples from healthy controls 
    
   
  
    
      
      NovaSeq 6000 
      
    
   
  93 
 
  
    EGAD00010002586 
   
  
    
    Genome-wide CpG methylation information of genomic DNA samples from white blood cells 
    
   
  
    
      
      NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD00010002587 
   
  
    
    Genome-wide CpG methylation information of cell-free DNA samples from cancer patients 
    
   
  
    
      
      NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD00010002588 
   
  
    
    Genome-wide CpG methylation information of genomic DNA samples from tumor tissue 
    
   
  
    
      
      NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD00010002590 
   
  
    
    Longitudinal whole-genome DNA methylation profiling of PBL obtained from UC patients categorized as responders and non-responders 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  56 
 
  
    EGAD00010002592 
   
  
    
    122 unpaired initial tumor samples (122 FFPE samples) of sFL patients measured with OncoScan SNP microarrays, Affymetrix CEL intensity data file types (Thermo Fisher Scientific, Waltham, Massachusetts, USA) 
    
   
  
    
      
      Affymetrix 
      
    
   
  122 
 
  
    EGAD00010002593 
   
  
    
    149 unpaired initial tumor samples (133/149 FFPE and 16/149 fresh frozen samples) of lFL patients measured with OncoScan SNP microarrays, Affymetrix CEL intensity data file types (Thermo Fisher Scientific, Waltham, Massachusetts, USA) 
    
   
  
    
      
      Affymetrix 
      
    
   
  149 
 
  
    EGAD00010002596 
   
  
    
     The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed.
This dataset contains expression array data for 32 patients of the screening cohort with 8 of them having paired normal tissue plus an additional relapse tumor/normal pair of one of those 8 patients and a patient only with normal tissue. Data was generated on a HumanHT-12 v4 Bead Array (Illumina) and is stored in IDAT file format. 
    
   
  
    
      
      HumanHT-12 
      
    
   
  41 
 
  
    EGAD00010002597 
   
  
    
     The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed.
This dataset contains SNP array data for tumor/normal pairs for a subset of 36 patients from the screening cohort plus an additional relapse tumor of one of those 36 patients. Data was generated on an OmniExpress-24 v1.1 Bead Array (Illumina) and is stored in IDAT file format. 
    
   
  
    
      
      OmniExpress-24 v1.1 
      
    
   
  73 
 
  
    EGAD00010002599 
   
  
    
    In this study nanopore sequencing was applied to obtain sparse DNA methylation profiles from pediatric CNS tumor samples. A neural network was used to classify the tumor based on the obtained methylation profile. 
    
   
  
    
      
      Illumina Infinium EPIC 
      
    
   
  94 
 
  
    EGAD00010002608 
   
  
    
    DNA methylation arrays (850K, Illumina) 
    
   
  
    
      
      EPIC BeadChips (Illumina) 
      
    
   
  10 
 
  
    EGAD00010002610 
   
  
    
    Peripheral blood DNA methylome in adalimumab-treated patients with rheumatoid arthritis 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  93 
 
  
    EGAD00010002612 
   
  
    
    Raw .RCC files for NanoString. nCounter PanCancer IO 360 Panel was used 
    
   
  
    
      
      NanoString nCounter PanCancer IO 360 Panel 
      
    
   
  60 
 
  
    EGAD00010002613 
   
  
    
    Raw idat files for DNA methylation profiling for 12 CCAs and 7 normal bile duct tissues. DNA methylation profiling was performed using Infinium MethylationEPIC v2.0 Kit. 
    
   
  
    
      
      Infinium MethylationEPIC v2.0 Kit 
      
    
   
  19 
 
  
    EGAD00010002617 
   
  
    
    S3 genotype data wave 5 (QC+ SNPs) 
    
   
  
    
      
      Illumina OmniExpress 
      
    
   
  4411 
 
  
    EGAD00010002618 
   
  
    
    S3 genotype data wave 6 (phenotypes) 
    
   
  
    
   
  2287 
 
  
    EGAD00010002619 
   
  
    
    S3 genotype data wave 6 (all SNPs) 
    
   
  
    
      
      Illumina OmniExpress 
      
    
   
  2287 
 
  
    EGAD00010002620 
   
  
    
    S3 genotype data wave 6 (QC+ SNPs) 
    
   
  
    
      
      Illumina OmniExpress 
      
    
   
  2287 
 
  
    EGAD00010002621 
   
  
    
    S3 genotype data wave 2-4 (phenotypes) 
    
   
  
    
   
  4412 
 
  
    EGAD00010002622 
   
  
    
    S3 genotype data wave 5 (all SNPs) 
    
   
  
    
      
      Illumina OmniExpress 
      
    
   
  4411 
 
  
    EGAD00010002623 
   
  
    
    S3 genotype data wave 1 (phenotypes) 
    
   
  
    
   
  435 
 
  
    EGAD00010002624 
   
  
    
    S3 genotype data wave 2-4 (all SNPs) 
    
   
  
    
      
      Affymetrix 6.0 
      
    
   
  4412 
 
  
    EGAD00010002626 
   
  
    
    S3 genotype data wave 1 (all SNPs) 
    
   
  
    
      
      Affymetrix 5.0 
      
    
   
  435 
 
  
    EGAD00010002627 
   
  
    
    S3 genotype data wave 5 (phenotypes) 
    
   
  
    
   
  4411 
 
  
    EGAD00010002628 
   
  
    
    S3 genotype data wave 2-4 (QC+ SNPs) 
    
   
  
    
      
      Affymetrix 6.0 
      
    
   
  4412 
 
  
    EGAD00010002633 
   
  
    
    Genetic characterisation of primary sclerosing cholangitis 
    
   
  
    
      
      Illumina Omni2.5-8Exome BeadChip 
      
    
   
  1 
 
  
    EGAD00010002635 
   
  
    
    Buccal sample methylation from breast cancer cases 
    
   
  
    
      
      Illumina HumanMethylationEPIC v1 
      
    
   
  94 
 
  
    EGAD00010002636 
   
  
    
    Buccal sample methylation data was generated from healthy controls 
    
   
  
    
      
      Illumina HumanMethylationEPIC v1 
      
    
   
  93 
 
  
    EGAD00010002638 
   
  
    
    CONTROL SAMPLES methylation data using Illumina EPIC technology 
    
   
  
    
      
      Illumina EPIC 
      
    
   
  791 
 
  
    EGAD00010002639 
   
  
    
    CASE SAMPLES methylation data using Illumina EPIC technology 
    
   
  
    
      
      Illumina EPIC 
      
    
   
  320 
 
  
    EGAD00010002645 
   
  
    
    Samples genotyped using Illumina Infinium Global Screening Array v3 for assessing pharmacogenomic genes 
    
   
  
    
      
      Illumina Global Screening Array v3 
      
    
   
  74 
 
  
    EGAD00010002647 
   
  
    
    DNA-methylation data of samples included in the GLASS-NL cohort 
    
   
  
    
      
      Illumina Infinium MethylationEpic BeadChip array 
      
    
   
  231 
 
  
    EGAD00010002649 
   
  
    
    Longitudinal DNA methylation discovery data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the AmsterdamUMC prior to and during ustekinumab treatment 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  117 
 
  
    EGAD00010002650 
   
  
    
    DNA methylation validation data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the John Radcliffe Hospital, Oxford, UK prior to ustekinumab treatment 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  34 
 
  
    EGAD00010002651 
   
  
    
    Longitudinal DNA methylation discovery data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the AmsterdamUMC prior to and during vedolizumab treatment 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  124 
 
  
    EGAD00010002652 
   
  
    
    DNA methylation validation data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the John Radcliffe Hospital, Oxford, UK prior to vedolizumab treatment 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  25 
 
  
    EGAD00010002654 
   
  
    
    Two to four sections of 10 µm thickness each (depending on sample size) from the respective formalin-fixed paraffin-embedded (FFPE) tissue sample were used for RNA isolation on 46 resected tumors as well as 17 paired biopsies. Digital gene expression analysis was performed on the NanoString nCounter platform, utilizing the NanoString MAX/FLEX system, with the PanCancer Immune Profiling panel as well as the PanCancer Pathway panel provided by NanoString. 
    
   
  
    
      
      NanoString nCounter MAX/FLEX 
      
    
   
  93 
 
  
    EGAD00010002656 
   
  
    
    EPIC methylation data of pleural mesothelioma samples and healthy pleura samples 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  29 
 
  
    EGAD00010002657 
   
  
    
    EPIC methylation data of pleural mesothelioma samples and healthy pleura samples 
    
   
  
    
      
      Illumina Infinium HumanMethylation EPIC BeadChip 
      
    
   
  11 
 
  
    EGAD00010002660 
   
  
    
     PREGO individuals' birthplaces in epsg.io/2154 (RGF93 v1 / Lambert-93 -- France) geographic coordinates. 
    
   
  
    
   
  3234 
 
  
    EGAD00010002661 
   
  
    
    Core set of the PREGO biobank containing experimental genotypes at 209,706 autosomal sites measured on Affymetrix PMRA Axiom array plates for a group of 3,234 individuals from Western France. 
    
   
  
    
      
      Affymetrix PMRA Axiom array plates 
      
    
   
  3234 
 
  
    EGAD00010002663 
   
  
    
    86 samples from four human populations: two from Central Asia and two from Southeast Asia 
    
   
  
    
      
      Illumina 
      
    
   
  86 
 
  
    EGAD00010002667 
   
  
    
    Illumina Infinium MethylationEPIC Array profiling of 93 pheochromocytoma and paraganglioma tumours with and a germline SDHB mutation 
    
   
  
    
      
      Illumina Infinium MethylationEPIC Array 
      
    
   
  93 
 
  
    EGAD00010002671 
   
  
    
    Whole-skin DNA Methylation profiled using Illumina Infinium HumanMethylation450 BeadChip Arrays. Methylation was quantified in beta values, which were normalised using the regRCPqn algorithm. 
    
   
  
    
      
      Illumina Infinium HumanMethylation450 BeadChip Arrays 
      
    
   
  414 
 
  
    EGAD00010002674 
   
  
    
   
  
    
      
       Illumina HumanMethylationEPIC version 1 
      
    
   
  421 
 
  
    EGAD00010002675 
   
  
    
   
  
    
      
       Illumina HumanMethylationEPIC version 1 
      
    
   
  423 
 
  
    EGAD00010002676 
   
  
    
   
  
    
      
       Illumina HumanMethylationEPIC version 1 
      
    
   
  422 
 
  
    EGAD00010002678 
   
  
    
     Comprehensive data on lifestyle-related modifiable factors and on sociodemographic, anthropometric, economic, biochemical, and genetic markers related to the occurrence of cardiometabolic diseases, collected as part of the observational cross-sectional 2015 Health Survey of Sao Paulo with Focus on Nutrition (2015 ISA-Nutrition), a population-based study. Data for 805218 SNPs in 841 individuals were genotyped using the Axiom 2.0 Precision Medicine Research Array in the Thermo Fisher Scientific laboratory (Affymetrix Inc, Santa Clara, CA). 
    
   
  
    
      
       Affymetrix Axiom SNP Array 2.0 
      
    
   
  841 
 
  
    EGAD00010002684 
   
  
    
     GWAS data of the AlpeDPD trial cohort 
    
   
  
    
      
      Illumina GSA V1.0 
      
    
   
  1146 
 
  
    EGAD50000000001 
   
  
    
     Targeted sequencing with the Myeloid Solutions™ Panel (MYS) of SOPHiA Genetics for COVID-19 patients (n=241 deceased, n=239 survivors). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  480 
 
  
    EGAD50000000005 
   
  
    
    We used novel processing techniques to obtain whole genome data together with 3D anatomic and histomorphologic analysis in two men (GP5 and GP12) with high risk PrCa undergoing radical prostatectomy. A total of 22 whole genome-sequenced sites (16 primary cancer foci and 6 lymph node metastatic) were analyzed using evolutionary reconstruction tools and spatio-evolutionary models. Probability models were used to trace spatial and chronological origins of the primary tumor and metastases, chart their genetic drivers, and distinguish metastatic and non-metastatic subclones.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD50000000006 
   
  
    
    Dataset: AfricanNeo_B; genotyping batch: SE-2209_191110; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).
Dataset: AfricanNeo_B; genotyping batch: TE-2567_201023_2019; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1).
 
    
   
  
    
   
  156 
 
  
    EGAD50000000007 
   
  
    
    Dataset: AfricanNeo_F; genotyping batch: OE-0808_150625; array: Illumina Omni2.5-Octo BeadChip. 
    
   
  
    
   
  29 
 
  
    EGAD50000000008 
   
  
    
    Dataset: AfricanNeo_A; genotyping batch: SE-2209_191110; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).
Dataset: AfricanNeo_A; genotyping batch: TI-2658_201112; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1).
 
    
   
  
    
   
  1027 
 
  
    EGAD50000000009 
   
  
    
    Dataset: AfricanNeo_C; genotyping batch: TE-2567_201023_2017; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).
Dataset: AfricanNeo_C; genotyping batch: TE-2567_201023_2019; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1). 
    
   
  
    
   
  300 
 
  
    EGAD50000000010 
   
  
    
    Dataset: AfricanNeo_D; genotyping batch: RK-2011_190308; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).
Dataset: AfricanNeo_D; genotyping batch: SE-2209_191110; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).
 
    
   
  
    
   
  151 
 
  
    EGAD50000000011 
   
  
    
    Dataset: AfricanNeo_E; genotyping batch: TC-2508_200401_A; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).
Dataset: AfricanNeo_E; genotyping batch: TC-2508_200401_B; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1). 
    
   
  
    
   
  100 
 
  
    EGAD50000000012 
   
  
    
    QuantSeq 3'-mRNAseq. 5 donors, 4 stimuli, 2 time points 
    
   
  
    
      
      NextSeq 500 
      
    
   
  60 
 
  
    EGAD50000000013 
   
  
    
    Dataset contains text files that describe the chromosome, position, and read count for an amplicon using the RealSeqS assay. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  150 
 
  
    EGAD50000000014 
   
  
    
     ChIP-seq has been performed on 5 fresh-frozen primary breast cancer tumor tissues from female patients. Immunoprecipitation has been performed for ERa (SC-543, Santa Cruz). Raw single-end fastq data have been aligned with bwa-mem using the Hg19 genome assembly as reference. Unfiltered alignment bam files are provided.
 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  6 
 
  
    EGAD50000000015 
   
  
    
     An oligo-captured STARR-seq library was generated using as targets ERa binding regions identified by ChIP-seq in Ishikawa (endometrial cancer) and T47D (breast cancer) cell lines, and in breast cancer primary tissues. STARR-seq assays have been performed in MCF-7 and Ishikawa cell lines. Cells have been cultured for 3 days in hormone-deprived (phenol-red free) media and stimulated for 6 hours with 10nM estradiol (E2) or DMSO (negative control). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD50000000016 
   
  
    
    We evaluate the analytical performance of the PGDx™ elio™ tissue complete assay, a 505 gene next-generation sequencing (NGS) tissue-based assay, that has now been FDA-cleared for use by physicians to help guide treatment decisions for cancer patients, using a NSCLC cohort of 38 patients. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  38 
 
  
    EGAD50000000018 
   
  
    
    SLE serum stimulation induced a unique expression profile in human colon organoids marked by a reduction in goblet cell marker expression and mucus composition. Transcriptomic analysis of SLE human colon biopsies displayed a downregulation of epithelial secretory markers. Collection of raw sequencing data used in this publication https://doi.org/10.1101/2023.07.04.547690 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD50000000020 
   
  
    
    Here we performed single-cell RNA sequencing to characterize NK cell subsets. Sorted NK cell subsets representing discrete stages of NK cell differentiation as well as bulk NK cells were sequenced.  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD50000000021 
   
  
    
     Cholangiocarcinoma (CCA) is a type of cancer with few effective systemic therapies. Elucidation of the molecular landscape of the disease from genomic studies based on next generation sequencing (NGS) has contributed to the introduction of new targeted therapies. One of these treatments consists of a class of small molecules that target members of the FGFR family of receptor tyrosine kinases. These drugs are effective and have been approved for cholangiocarcinomas with fusions or rearrangements of FGFR genes. In contrast, the role of these inhibitors in cholangiocarcinomas with mutations in FGFR genes is less well defined. We report here a patient with a cholangiocarcinoma bearing an FGFR2 p.Ser252Trp mutation. The patient was treated with two different FGFR inhibitors, as the first caused ocular toxicity. She obtained clinical benefit from both. This case illustrates the efficacy of FGFR inhibitors on cholangiocarcinoma with specific point mutations. This is the first case to report the clinical benefit of these drugs in FGFR2 p.Ser252Trp mutation. Clinical benefit can be sustained, as seen in our patient. Our case also shows that FGFR inhibitor-induced adverse effects, such as ocular toxicities, may not recur after re-challenge with an alternative drug of the same class. 
    
   
  
    
      
      Ion Torrent S5 
      
    
   
  1 
 
  
    EGAD50000000022 
   
  
    
    7 Isoform sequencing or Long-read RNAseq mapped bam files. Reads were aligned to HG38 reference using Minimap2. 
    
   
  
    
      
      PacBio RS II 
      
    
   
  7 
 
  
    EGAD50000000023 
   
  
    
    19 H3K27ac HICHIP from T-ALL patient samples and one healthy normal control sample. Reads were aligned to HG38 reference. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD50000000024 
   
  
    
    19 ATAC-seq from T-ALL patient samples and one healthy normal control sample. Reads were aligned to HG38 reference 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD50000000025 
   
  
    
     FEGA workflow test 
    
   
  
    
   
  1 
 
  
    EGAD50000000026 
   
  
    
    Second test file 
    
   
  
    
   
  1 
 
  
    EGAD50000000027 
   
  
    
    RNA-seq data for 85 patients with acute lymphoblastic leukemia expressing the gamma delta T cell receptor (γδ T-ALL)  
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  85 
 
  
    EGAD50000000028 
   
  
    
    Whole genome sequencing data for 61 samples with acute lymphoblastic leukemia expressing the gamma delta T cell receptor (γδ T-ALL) and 29 germline samples 
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  90 
 
  
    EGAD50000000029 
   
  
    
     Single cells from primary cB-ALL samples were isolated using an inverted microscope coupled to a micromanipulator equipped with a glass capillary for cell collection. A minimum of 20 cells were isolated per sample in microdrops of 2.5 μl of phosphate buffered saline (PBS) with 0.5% polyvinylpyrrolidone. Cell lysis and DNA amplification were performed using the SurePlex DNA Amplification System (Illumina). Genomic DNA was subsequently fragmented and tagged with the VeriSeq PGS transposome and the TruSeq Index adapters by PCR for library preparation (VeriSeq PGS Library Prep Kit, Illumina). Equal volumes of normalized libraries were pooled and sequenced on an Illumina MiSeq platform with 1×75-bp single-end sequencing. Reads were subsequently aligned to the human reference genome (GRCh38/hg38) using Bowtie2 (version 2.2.4).  
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  9 
 
  
    EGAD50000000030 
   
  
    
    This dataset contains 224 paired fastq files (112 single cells) from the following samples: brains from two Multiple System Atrophy patients and one control, and non-brain controls (fibroblasts, NA12878) 
    
   
  
    
      
      unspecified 
      
    
   
  112 
 
  
    EGAD50000000031 
   
  
    
    This dataset comprises a genomic and a phenotypic excel sheet displaying 137 cases classified as suspected Lynch syndrome. Included are genotypic data such as tumour mutational burden, tumour mutational signatures and germline/somatic variant calls for each colorectal, endometrial or sebaceous skin tumour screened. Genomic data is derived from targeted multigene panel sequencing including ~300 hereditary cancer genes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  137 
 
  
    EGAD50000000032 
   
  
    
    RNAseq for #1049, #111, #1217, #206, COV362 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000033 
   
  
    
    WGS of tumour PDX and matched patient blood 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000034 
   
  
    
    Exome sequencing of PDX and matched patient blood 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD50000000035 
   
  
    
    BROCA panel sequencing using the BROCA-HR v8 and BROCA-GO v1 versions of the gene panel 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD50000000036 
   
  
    
    Deposited here are whole-genome sequencing bam or fastq files from the experimental isogenic cell lines used in the study "The chemotherapeutic CX-5461 is extremely mutagenic and may increase cancer risk, Koh, Gene (2023)". 
The bam files were aligned to GRCh38/hg38 using BWA-MEM. The corresponding variant call data have been deposited on Mendeley Data, V2, doi: 10.17632/d58cv549v6.2 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD50000000037 
   
  
    
    Genotype measured by Illumina GSA 
    
   
  
    
   
  1063 
 
  
    EGAD50000000038 
   
  
    
    Solitary fibrous tumor/Hemangiopericytoma (SFT/HPC) is a rare subtype of soft tissue sarcoma associated with NAB2-STAT6 gene fusions. In this study, a novel SFT/HPC was characterized using whole genome sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000039 
   
  
    
    Solitary fibrous tumor/Hemangiopericytoma (SFT/HPC) is a rare subtype of soft tissue sarcoma associated with NAB2-STAT6 gene fusions. This study established and characterized a novel SFT/HPC patient-derived cell line called SFT-S1. Potential drug candidates that could be repurposed for the treatment of SFT/HPC were screened. Screening was performed through RNA-Seq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD50000000040 
   
  
    
    Solitary fibrous tumor/Hemangiopericytoma (SFT/HPC) is a rare subtype of soft tissue sarcoma associated with NAB2-STAT6 gene fusions. This study established and characterized a novel SFT/HPC patient-derived cell line called SFT-S1 using the Twist Human Methylome Panel. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000042 
   
  
    
    sEVs were isolated from postmortem human brain tissue. RNA was extracted from the source brain tissue and the sEVs and prepared for RNA sequencing. cDNA was sequenced on the Illumina HiSeq using a 2x150 paired-end read configuration.  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  48 
 
  
    EGAD50000000043 
   
  
    
    Human brain sEVs were isolated from postmortem human brain tissue. RNA was extracted from both the source brain tissue and sEVs and prepared for long-read sequencing. cDNA was sequenced on the PacBio Sequel II. The data provided have been processed with the tool ccs (v6.0.0) to generate circular consensus reads. For further processing, it is recommended to follow the guidelines at https://isoseq.how/, starting with using lima to remove cDNA primers. 
    
   
  
    
      
      Sequel II 
      
    
   
  48 
 
  
    EGAD50000000044 
   
  
    
    Single-nucleus mRNA sequencing of prenatal and postnatal samples from the brain and its border regions. Most samples were multiplexed, with several samples run in one 10X reaction. Also included is a separate immune cell dataset that was combined with published data from Braun et al. 2023 and Yang et al. 2021 and integrated using Harmony. 
    
   
  
    
      
      NextSeq 1000 
      
      NextSeq 550 
      
    
   
  11 
 
  
    EGAD50000000045 
   
  
    
    Single-cell RNA-sequencing. Some samples were analyzed using CITE-Seq and samples were multiplexed and run on the same 10X reaction 
    
   
  
    
      
      Illumina HiSeq 1000 
      
      NextSeq 1000 
      
      NextSeq 550 
      
    
   
  6 
 
  
    EGAD50000000046 
   
  
    
    mCEL-Seq2 analysis with reference mapping of human brain tissues and border regions 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  17 
 
  
    EGAD50000000047 
   
  
    
    Single-nucleus fixed mRNA profiling of FFPE samples from patients after sex-mismatched blood stem cell transplantation and matched controls. 
    
   
  
    
      
      NextSeq 1000 
      
    
   
  5 
 
  
    EGAD50000000048 
   
  
    
    CGMH-OCCC-WES data (tumor-normal paired) from 104 patients with ovarian clear cell carcinoma.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  208 
 
  
    EGAD50000000049 
   
  
    
    This dataset contains exome sequencing data, phenotypic information, and somatic mutation analysis results for 44 diagnosis-relapse DLBCL pairs. Exon sequencing data were captured using the Agilent HaloPlex exon kit. BAM files were obtained from Illumina sequencing followed by BWA alignment. 
    
   
  
    
      
      unspecified 
      
    
   
  108 
 
  
    EGAD50000000050 
   
  
    
    The Papua New Guinean Lowlanders dataset includes 41 whole genome sequences for Papua New Guinean individuals sampled in Daru. DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides FASTQ files, mapped CRAM files (GRCh38) and phenotype measurements. 
    
   
  
    
      
      HiSeq X Five 
      
    
   
  41 
 
  
    EGAD50000000051 
   
  
    
    This dataset contains 84 paired fastq files of Bulk RNAseq collected from 14 patients with Extramedullary multiple myeloma, 14 patients with newly diagnosed multiple myeloma and 14 patients with Relapsed/refractory multiple myeloma 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  42 
 
  
    EGAD50000000052 
   
  
    
    This dataset contains 72 paired fastq files of raw Whole exome sequencing data from 14 patients with Extramedullary myeloma, 8 paired samples from the same patients at the time of Newly diagnosed myeloma, and the corresponding normal samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD50000000053 
   
  
    
    This dataset contains 16 fastq files of raw single-cell RNA sequencing data from 6 patients with Extramedullary multiple myeloma used for the study "Longitudinal Multi-Omics Study Reveals Molecular Drivers and Tumor Microenvironment in Extramedullary Multiple Myeloma" 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000054 
   
  
    
    We performed 10X Chromium 5' scRNA and scTCR sequencing on 4 on-treatment HNSCC PBMC samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000055 
   
  
    
    We performed 10X Chromium 3' scRNA sequencing of 23 pre- and on-treatment HNSCC biopsy samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD50000000056 
   
  
    
    We performed 10X Chromium 5' scRNA (n=50), scTCR (n=48) and scBCR (n=49) sequencing of pre- and on-treatment HNSCC biopsy samples.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  50 
 
  
    EGAD50000000058 
   
  
    
    This dataset was generated for EpiHK. It contains histone PTM ChIP-seq, WGBS and RNA-seq data from 7 human hepatocellular carcinoma samples and 7 tumor-adjacent normal tissue samples. 
    
   
  
    
      
      Illumina HiSeq 1500 
      
      Illumina HiSeq 4000 
      
      NextSeq 500 
      
    
   
  14 
 
  
    EGAD50000000059 
   
  
    
    Skeletal muscle of Inuit homozygous carriers of the common Greenlandic TBC1D4 p.Arg684Ter variant is severely insulin resistant but has normal metabolic responses during exercise
 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  58 
 
  
    EGAD50000000060 
   
  
    
    We aimed to identify somatic mutations and transcriptional differences that could explain the resistance to Doxorubicin. This dataset includes RNA-Seq of HCC biopsies and Organoids and WES of Organoids. 
 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
      unspecified 
      
    
   
  52 
 
  
    EGAD50000000062 
   
  
    
    This dataset includes combined single-cell RNA-Seq and T cell receptor profiling data for SARS-CoV-2 spike protein-reactive CD4+ T cells and NK cells from blood, liver, lungs, and bone marrow of human donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000063 
   
  
    
    The data set consists of fastq raw files from RNA-seq of seven mucosal biopsies of the colon from seven patients, among them three patients with irritable bowel syndrome with diarrhea-predominant symptoms. Paired end sequencing on Illumina NovaSeq 6000 was used. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000064 
   
  
    
    These data consist of transcriptome and chromatin accessibility data (RNAseq and ATACseq respectively) derived from five High-Grade Serous Ovarian Carcinoma (HGSC) cell lines. HGSC lines included PEO1, PEO4, PEA2, OVCAR5 and OVCAR8. Samples were taken pre- and post-treatment with a novel epigenetic compound, HKMTi-1-005, which targets the activity of two histone methyltransferases, EZH2 and G9a.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD50000000065 
   
  
    
    Fastq files of 18 pediatric sarcoma PDX WXS samples. Each sample was sequenced on 2 Lanes. 
    
   
  
    
      
      unspecified 
      
    
   
  36 
 
  
    EGAD50000000066 
   
  
    
    Heterozygous (HET) truncating mutations in the TTN gene (TTNtv) encoding the giant titin protein are the most common genetic cause of dilated cardiomyopathy (DCM). We investigated 127 clinically identified DCM human cardiac samples with targeted sequencing using the TruSight Cardio panel on an Illumina MiSeq system with a special focus on TTNtvs.
This dataset belongs to the publication of Kellermayer, D et al. Truncated titin is integrated into the human dilated cardiomyopathic sarcomere 
 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  127 
 
  
    EGAD50000000067 
   
  
    
    single-cell RNAseq dataset 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  53 
 
  
    EGAD50000000068 
   
  
    
    Bulk RNAseq PBMC 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD50000000069 
   
  
    
    Bulk CD14 RNAseq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD50000000070 
   
  
    
    PBMC RNAseq drug in vitro 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  47 
 
  
    EGAD50000000072 
   
  
    
    Agilent CNS cohort 
    
   
  
    
      
      NextSeq 500 
      
    
   
  97 
 
  
    EGAD50000000073 
   
  
    
    Agilent Sarcoma cohort 
    
   
  
    
      
      NextSeq 500 
      
    
   
  52 
 
  
    EGAD50000000074 
   
  
    
    Illumina cohort 
    
   
  
    
      
      NextSeq 500 
      
    
   
  40 
 
  
    EGAD50000000075 
   
  
    
    Enzymatic conversion-based methylation sequencing data (EM-seq) for colon cancer used in the MESA study 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  228 
 
  
    EGAD50000000076 
   
  
    
    This dataset contains long read transcriptome sequencing of 19 chronic lymphocytic leukemia (CLL) patients with or without mutation in SF3B1, as well as 6 B cell samples from healthy individuals. The BAM files are unaligned CCS reads. 
    
   
  
    
      
      Sequel II 
      
    
   
  25 
 
  
    EGAD50000000077 
   
  
    
    This dataset contains long read transcriptome sequencing of 25 myelodysplastic syndrome (MDS) patients with or without mutation in SF3B1. The BAM files are unaligned CCS reads. 
    
   
  
    
      
      Sequel II 
      
    
   
  25 
 
  
    EGAD50000000078 
   
  
    
    This dataset contains Illumina stranded RNA-seq from 19 chronic lymphocytic leukemia (CLL) patients with or without mutation in SF3B1, as well as 6 B cell samples from healthy individuals. RNA was extracted from CLL cells. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  27 
 
  
    EGAD50000000079 
   
  
    
    387 swab samples were sequenced. The bam files contain consensus reads. 
    
   
  
    
      
      DNBSEQ-G400 
      
    
   
  387 
 
  
    EGAD50000000080 
   
  
    
    40 buccal mucosa samples and a paired blood sample were whole-exome sequenced. 
    
   
  
    
      
      DNBSEQ-G400 
      
    
   
  41 
 
  
    EGAD50000000081 
   
  
    
    This dataset contains 10x scRNA sequencing of glioblastoma samples. Sequencing was performed on an Illumina HiSeq 4000. The sequencing was always paired. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  22 
 
  
    EGAD50000000082 
   
  
    
    Recordings of elevated calcium levels allowed selection of cells from heterogeneous populations for transcriptomic analysis. Paired RNA-Seq from S24 cells using the Takara SMARTer Ultra Low Input RNA v4 kit and sequencing on an Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD50000000084 
   
  
    
    This dataset consists of amplicon targeted NGS paired-end raw data (FASTQ R1 and R2) obtained from 148 colon cancer patients. Specifically, we have 148 primary tumor tissue samples, 148 white blood cell samples, and 118 plasma samples collected at baseline. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  414 
 
  
    EGAD50000000085 
   
  
    
    This dataset contains VCF files for 124 central nervous system glioma samples for the study "Cerebrospinal fluid cfDNA sequencing for classification of central nervous system glioma" 
    
   
  
    
   
  124 
 
  
    EGAD50000000086 
   
  
    
    Medullary thyroid cancer (MTC) is a rare malignant tumor that arises from parafollicular cells. Approximately 8% of thyroid cancer cases are MTC, and about 25% of these have a hereditary component. Incorporating molecular parameters into tumor classification is important. In addition, the presence of pathogenic germline variants can directly affect cancer prevention. Thus, the aim of this study was to perform whole exome sequencing (WES) on a consecutive series of hereditary RET wild-type MTC patients to identify genetic variants that may be involved in the carcinogenesis of this tumor. WES was performed on 28 patients negative for germline RET pathogenic variants using the NovaSeq 6000 platform. Variant classification followed American College of Medical Genetics and Genomics guidelines. Our study represents a significant advancement in gene discovery for MTC genetics. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  28 
 
  
    EGAD50000000087 
   
  
    
    The project concerns whole genome sequencing (short-read and long-read) and RNA-sequencing of lipomatous tumors with 12q-amplification. A subset of lipomatous tumors is driven by amplification of genes mapping to chromosome arm 12q, including the MDM2 gene. The goals of the study were to compare expression levels of genes mapping to 12q in tumors with amplification in rod-shaped or circularized chromosomes as well as to assess and compare the structural variants in those tumors. In total, 20 samples were analyzed, and the data were correlated with genomic data on bulk and single cell DNA from the same tumors. The fastq files from the tumors were uploaded to EGA. 
    
   
  
    
      
      BGISEQ-500 
      
      NextSeq 500 
      
      Sequel 
      
    
   
  21 
 
  
    EGAD50000000088 
   
  
    
    Human engineered CRC organoids (APC KO; KRAS G12D; TP53 KO) were grown in glucose, lactate or with DCA. Samples were collected and processed for bulk ATAC-seq. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD50000000089 
   
  
    
    Human engineered CRC organoids (APC KO; KRAS G12D; TP53 KO) were grown in either glucose or lactate, in the presence and absence of DCA or BRD4 inhibition (JQ1). Samples were collected and processed for bulk RNA-seq. 
    
   
  
    
      
      NextSeq 2000 
      
      NextSeq 500 
      
    
   
  21 
 
  
    EGAD50000000090 
   
  
    
    Human engineered CRC organoids (APC KO; KRAS G12D; TP53 KO)  were grown in glucose, lactate or with DCA. Samples were collected 7 hours after the treatments and processed for bulk CHIC-seq. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD50000000091 
   
  
    
    This dataset contains 2 BAM files sequenced with Illumina NextSeq 500 and NovaSeq 6000 and 20 variant-calling files sequenced with Illumina NextSeq 500 and NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  22 
 
  
    EGAD50000000092 
   
  
    
    This dataset contains scRNA sequencing data for CD8-positive lymphocyte samples. Sequencing was performed on an Illumina NextSeq 550. The sequencing was always paired. 
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD50000000093 
   
  
    
    This dataset contains WGS and RNA sequencing data for 4 melanoma samples. Sequencing was performed on Illumina NovaSeq 6000 and Illumina HiSeq X. The sequencing was always paired. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000095 
   
  
    
    RNA-seq data for 101 samples with B-cell acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  101 
 
  
    EGAD50000000096 
   
  
    
    Whole genome sequencing data for 45 samples with B-cell acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  45 
 
  
    EGAD50000000097 
   
  
    
    Whole exome sequencing data for 69 samples with B-cell acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  69 
 
  
    EGAD50000000098 
   
  
    
    CD45- Single Cell RNA Sequencing data on 8 High Grade Serous Carcinoma Primary Tumors  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000099 
   
  
    
    Whole genome bisulfite sequencing of prostate cancer samples upon oral pimonidazole administration 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD50000000100 
   
  
    
    Long-read (PacBio) sequencing of two retinal organoid samples. One sample was treated with cycloheximide. Files are ccs BAM format files generated by a Sequel IIe machine. 
    
   
  
    
      
      Sequel IIe 
      
    
   
  2 
 
  
    EGAD50000000101 
   
  
    
    Long-read (PacBio) RNA sequencing dataset of three neural retinal samples. Files are raw BAM format files generated by a Sequel II machine. Additionally, the ccs3 BAM format files are included. 
    
   
  
    
      
      Sequel II 
      
    
   
  4 
 
  
    EGAD50000000107 
   
  
    
    The dataset includes targeted paired-end DNA sequencing (2 FASTQ per sample) of a manually curated panel of 44 genes with a documented role in homologous recombination (HR), a pathway of DNA damage response. Sequencing was conducted in 69 tumors from patients with metastatic colorectal cancer after serial passaging in mice (patient-derived xenografts, PDXs). For each PDX model profiled for HR gene mutations, therapeutic annotation of response to FOLFIRI (a chemotherapeutic regimen consisting of the combination of 5-fluorouracil and irinotecan) is available.  
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  69 
 
  
    EGAD50000000112 
   
  
    
    WES data of 17 tumors from 9 individuals that have biallelic germline CHEK2 pathogenic variants. 
Shallow whole genome sequencing files from 16 tumor samples from 9 individuals with germline biallelic pathogenic variants in CHEK2. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  17 
 
  
    EGAD50000000113 
   
  
    
    Whole exome sequencing data from the tumors of individuals with constitutional mismatch repair deficiency. These are 16 tumors from the bigger study of 41 tumors in total (from 17 individuals in total). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000114 
   
  
    
    This study includes an open-label, phase 2 study to determine the activity of the anti-VEGF receptor tyrosine-kinase inhibitor, pazopanib, combined with the anti-PD-L1 immune checkpoint inhibitor, durvalumab, in unselected advanced sarcomas. We conducted whole exome and transcriptomic sequencing with pre-treatment tissue biopsy to correlate clinical outcomes with molecular and genomic biomarkers to identify patients who would most likely benefit from the combination treatment. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  83 
 
  
    EGAD50000000115 
   
  
    
    Intestinal organoids treated with interferon-gamma 1 ng/mL either for 6h or 18h.
 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD50000000116 
   
  
    
    RNAseq for #111, #1177, #206, #201, #29, #931, WO-19, WO-2 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000117 
   
  
    
    This study explores the evolution of tumor and blood immune microenvironment and related mechanisms that shape breast cancer progression. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD50000000119 
   
  
    
    Synthetic - This submission contains a subset of a synthetic dataset derived from the project Heilsa Tryggvedottir - a Nordic collaboration on sharing sensitive human data. Heilsa Tryggvedottir is funded by the Nordic e-Infrastructure Collaboration (NeIC), the ELIXIR nodes of Finland, Norway, and Sweden, Computerome in Denmark, and the Estonian Scientific Computing Infrastructure (ETAIS).
In the synthetic data creation process, it was attempted to strike a fine balance between the usability of the datasets (e.g. technical FEGA development, testing, user training, and basic bioinformatics) and compliance with GDPR. File names and file content (e.g. headers in fastq) are anonymized. Moreover, the X, Y, and mitochondrial sequences have been discarded from the original data since these data can be used for maternal, paternal, or ethnic origin tracing. The dataset does not follow natural haplotype distribution (inherent to imputation panels). The only inputs derived from real sequence data are variant distribution density per chromosome and learning sequencing error models.
The synthetic dataset consists of two fastq files, a cram file, a vcf file, and two index files.  
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD50000000120 
   
  
    
    We used RNA sequencing of 81 cervical cancer specimens to develop an immune-based gene expression signature to predict distant metastasis in cervical cancer patients treated with RT/cisplatin. Our 55-gene risk score, validated across independent cohorts, was strongly linked to higher rates of metastasis and lower survival. The score also correlated with higher copy-number alteration and a less immune-responsive tumor microenvironment, indicating its potential in identifying high-risk patients and informing targeted therapies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      NextSeq 500 
      
    
   
  81 
 
  
    EGAD50000000123 
   
  
    
    ATAC-seq dataset for the paper:
Title
Multi-omics analysis of human population variation in immune function and in vivo response to BCG vaccination 
Abstract
Immune responses are tightly regulated, yet highly variable between individuals. To investigate human population variation of trained immunity, we immunized healthy individuals with Bacillus Calmette-Guérin (BCG). This live attenuated vaccine induces not only an adaptive immune response against tuberculosis, but also triggers innate immune activation and memory. We established personal immune profiles and chromatin accessibility maps over a time course of BCG vaccination in 323 individuals. This large resource uncovered genetic and epigenetic predictors of baseline immunity and BCG vaccine response. We found that BCG vaccination enhances the innate immune response only in individuals with dormant immune states at baseline, suggesting that exogenous induction of trained immunity is not a universal booster of innate immunity, but specifically elevates weak innate immune responses. This study advances our understanding of BCG’s heterologous immune-stimulatory effects and trained immunity in humans. Moreover, our results highlight the value of epigenetic cell states as an “endophenotype” that connects immune function with genotype and the environment.  
    
   
  
    
      
      Illumina HiSeq 3000 
      
    
   
  861 
 
  
    EGAD50000000124 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner.
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 10 samples (5 female, 5 male of Opole Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  10 
 
  
    EGAD50000000125 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner.
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 12 samples (6 female, 6 male of Podlaskie Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  12 
 
  
    EGAD50000000127 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 19 samples (8 female, 11 male of Lubusz Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  19 
 
  
    EGAD50000000128 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 15 samples (8 female, 7 male of Warmian-Mazurian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  15 
 
  
    EGAD50000000129 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 17 samples (7 female, 10 male of West Pomeranian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  17 
 
  
    EGAD50000000130 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner.
The dataset includes BAM and BAM.BAI files from Whole Exome Sequencing of 450 samples (227 female, 223 male, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA Prep with Enrichment.
Reference Genome: GRCh37. 
    
   
  
    
   
  450 
 
  
    EGAD50000000131 
   
  
    
    Buccal epithelial cells of chimeric twins were isolated using laser-capture microdissection. Cells were pooled (13-60 cells) per batch to create a genomic DNA library using an NEB low-input kit.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000132 
   
  
    
    scRNAseq data generated with 10x genomics 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000134 
   
  
    
    ChIPseq using anti-EZH2 with a sheared input DNA control to assess EZH2 genomic binding in one longitudinal pair of samples (pre- and post-treatment) from one UP and one DOWN responder GBM patient 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD50000000135 
   
  
    
    Bulk RNA-seq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000136 
   
  
    
    Spatial transcriptomics 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000137 
   
  
    
    Single cell RNA and TCR sequencing of γδ T cells FACS sorted from peripheral blood of two Merkel cell carcinoma patients 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000138 
   
  
    
    The AVENIO ctDNA Expanded Kit is a next-generation sequencing (NGS) liquid biopsy assay with a 77 gene panel (192 kb) containing genes in U.S. National Comprehensive Cancer Network (NCCN) Guidelines and emerging cancer biomarkers. This pan-cancer assay was applied to 100 plasma samples from patients with lung cancer undergoing treatment in the OSCILLATE trial. After 150 bp paired-end sequencing, reads were aligned to the human genome reference with the AVENIO Oncology Analysis Software. These files are the sorted non-deduplicated alignments generated by the analysis software used for subsequent variant, indel and CNV calling. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  100 
 
  
    EGAD50000000139 
   
  
    
    Source data for the Figure 2A heatmap showing gene expression across samples. 
    
   
  
    
   
  1213 
 
  
    EGAD50000000140 
   
  
    
    Cell annotation for single T and NK cells of PBMC collected from 113 patients, including UMAP coordinates and annotation. 
    
   
  
    
   
  1213 
 
  
    EGAD50000000141 
   
  
    
    Raw count matrix for CITE-seq data of PBMC collected from 113 patients. 
    
   
  
    
   
  1213 
 
  
    EGAD50000000142 
   
  
    
    Raw count matrix for scRNA-seq data of PBMC collected from 113 patients. 
    
   
  
    
   
  1213 
 
  
    EGAD50000000143 
   
  
    
    Clinical data for the IMvigor130 cohort of patients, including treatment, response, survival time, and tumor PDL1-IC staining score. 
    
   
  
    
   
  1213 
 
  
    EGAD50000000144 
   
  
    
    Cell annotation for single CD8 T cells of PBMC collected from 113 patients, including UMAP coordinates and annotation	 
    
   
  
    
   
  1213 
 
  
    EGAD50000000145 
   
  
    
    Cell annotation for single cells of PBMC collected from 113 patients, including UMAP coordinates and annotation.	 
    
   
  
    
   
  1213 
 
  
    EGAD50000000146 
   
  
    
    A list of ctDNA samples included in the dataset. 
    
   
  
    
   
  171 
 
  
    EGAD50000000147 
   
  
    
    ctDNA sample data (one sample per line) for BFAST Cohort D including sample-level summaries like tumor fraction (cTF). 
    
   
  
    
   
  171 
 
  
    EGAD50000000148 
   
  
    
    ctDNA mutation calls (one mutation per sample per line) for BFAST Cohort D including mutation-level data like coding change, allele frequency, etc. 
    
   
  
    
   
  171 
 
  
    EGAD50000000149 
   
  
    
    Clinical data for BFAST Cohort D, including time-to-event, tumor response, and baseline prognostics data. 
    
   
  
    
   
  231 
 
  
    EGAD50000000150 
   
  
    
    Whole exome sequencing data for 425 samples with B-cell acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  425 
 
  
    EGAD50000000151 
   
  
    
    RNA-seq data for 14 samples with B-cell acute lymphoblastic leukemia 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  14 
 
  
    EGAD50000000152 
   
  
    
    We obtained different clones from early passage CRC tumoroids to study mutational signatures specific for truncal or private somatic alterations. These whole genome sequences are part of a larger study on the heterogeneity and evolution of DNA mutation rates in microsatellite-stable colorectal cancer. 
    
   
  
    
      
      unspecified 
      
    
   
  15 
 
  
    EGAD50000000153 
   
  
    
    Dataset of 46 mCRC tumoroids from different passages (23 early-passage 3, 23 late-passage 8-12). FASTQ files of paired sequencing of PolyA-enriched total RNA 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD50000000154 
   
  
    
    3 datasets included: 
- H3K27ac ChIP in ETS2-edited and unedited TPP macrophages from 3 biological replicates
- H3K27ac ChIP in ETS2-overexpressing and control M0 (resting) macrophages from 3 biological replicates
- ATAC-seq (deep sequencing) in ETS2-edited and unedited TPP macrophages from 3 biological replicates 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD50000000155 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 24 samples (14 female, 10 male of Lublin Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  24 
 
  
    EGAD50000000156 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 30 samples (15 female, 15 male of Lower Silesian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  30 
 
  
    EGAD50000000157 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 22 samples (11 female, 11 male of Subcarpathian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  22 
 
  
    EGAD50000000158 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 22 samples (11 female, 11 male of Kuyavian-Pomeranian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  22 
 
  
    EGAD50000000159 
   
  
    
    Whole genome/exome sequencing to detect spontaneous acquired mutations in mismatch repair-deficient human colon organoids. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      Illumina NovaSeq X 
      
    
   
  9 
 
  
    EGAD50000000160 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 25 samples (14 female, 11 male of Pomeranian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  25 
 
  
    EGAD50000000161 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 37 samples (19 female, 18 male of Greater Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  37 
 
  
    EGAD50000000162 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 6 samples (6 female of Holy Cross Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  6 
 
  
    EGAD50000000163 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 47 samples (23 female, 24 male of Silesian Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  47 
 
  
    EGAD50000000164 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 43 samples (16 female, 27 male of Mazovia Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  43 
 
  
    EGAD50000000165 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 41 samples (20 female, 21 male of Lodz Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
   
  41 
 
  
    EGAD50000000166 
   
  
    
    This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. 
The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 32 samples (14 female, 18 male of Lesser Voivodeship, Poland from POPULOUS collection).
Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation.
Reference Genome: GRCh37. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD50000000167 
   
  
    
    Fragle is a deep-learning-based two-stage model that quantifies ctDNA from blood plasma-derived cfDNA BAM files. Fragle was developed using some previously published datasets and some newly generated data, and was evaluated using validation cohorts and unseen cohorts. Some of these cohorts are newly created, consisting of a total of 365 low-pass (2-3X) whole genome sequencing BAM files mapped to hg19/GRCh37. This dataset contains these newly generated BAM files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  365 
 
  
    EGAD50000000168 
   
  
    
    Arcagen is an EORTC/SPECTA pan-European project that aims to recruit 1000 rare cancer patients from different tumour domains of EURACAN. This study collected samples of advanced or metastatic rare cancers from patients older than 12, and analysed them using Foundation Medicine next-generation sequencing (NGS) panels (FoundationOne CDx for FFPE samples or FoundationOne Liquid CDx for blood samples). This dataset contains the NGS files from rare thoracic malignancies (n=102). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  102 
 
  
    EGAD50000000169 
   
  
    
    This dataset consists of RNA-seq CRAM raw-data files of 1063 primary colorectal cancer and 120 adjacent normal tissue samples. The expression profiles for these samples can be found at the ArrayExpress with accession number E-MTAB-12862.  
    
   
  
    
      
      BGISEQ-500 
      
    
   
  1183 
 
  
    EGAD50000000171 
   
  
    
    This study investigates translocation renal cell carcinoma (tRCC) to identify mutations contributing to tRCC progression. The analysis relies on Whole Exome Sequencing (WES) data obtained from 11 patients treated at University of Texas Southwestern Medical Center (UTSW) affiliated hospitals, including Parkland Hospital and Children’s Medical Center. Tumor samples and their corresponding normal DNA were sequenced using the Illumina platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  34 
 
  
    EGAD50000000172 
   
  
    
    This study investigates differentially expressed genes associated with the progression of translocation renal cell carcinoma (tRCC) by analyzing RNA-Seq data from 23 tumor samples obtained from 12 patients. The samples were collected from affiliated hospitals of the University of Texas Southwestern Medical Center (UTSW), including Parkland Hospital and Children’s Medical Center. RNA isolated from tumor samples was sequenced using the Illumina platform. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD50000000173 
   
  
    
    This dataset encompasses single-cell RNA sequencing data derived from nasal swabs of a paired cohort of school-aged children with cystic fibrosis. The study includes samples both before (n=13) and after initiation (n=13) of elexacaftor/tezacaftor/ivacaftor (ETI) treatment. Additionally, age- and sex-matched controls were included (n=12). Detailed information about the study design and methodology can be found in the manuscript: “Pharmacological improvement of CFTR function rescues airway epithelial homeostasis and host defense in children with cystic fibrosis”. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD50000000174 
   
  
    
    T-ALL relapse usually occurs early but can occur much later, which has been suggested to represent a de novo leukemia. However, we conclusively demonstrate late relapse can evolve from a pre-leukemic subclone harbouring a non-coding mutation that evades initial chemotherapy.
Data include 19 WGS samples:
- 5 cases with presentation, relapse and remission (germline) samples
- 2 cases with presentation and relapse samples, but not remission (germline)
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD50000000175 
   
  
    
    This dataset contains RNA, ATAC and WGS sequencing data of 9 acute myeloid leukemia samples. Sequencing was performed on Illumina HiSeq 2000, HiSeq 4000, NovaSeq 6000 and HiSeq X Ten. The sequencing was always paired. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 4000 
      
      Illumina HiSeq X 
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD50000000176 
   
  
    
    This dataset includes FASTQ files of low coverage whole genome sequencing of normal tissue (n = 8), tumor tissue (n = 55) and cell free DNA from plasma samples (n = 101) from patients with metastatic colorectal cancer treated with bevacizumab. A total of 164 samples are present from two different cohorts. The first batch of samples (n = 139, sample names sAPD302T until sAPD502_P0) was collected from the AC-Angiopredict Phase 2 trial (NCT01822444); the second batch of samples (n = 25, sample names sAPD503_P0 until sAPD527_P0) was collected from the UMM cohort. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  164 
 
  
    EGAD50000000177 
   
  
    
    Whole genome sequencing data of 19 high-grade serous carcinoma (HGSC) patients (48 samples) sequenced with MGISEQ-2000 
    
   
  
    
      
      unspecified 
      
    
   
  48 
 
  
    EGAD50000000180 
   
  
    
    16S v3-v4 amplicon sequencing of milk, fecal and oral cavity samples, along with sequencing controls, of a subset of newborn infants from Lifelines NEXT cohort 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  98 
 
  
    EGAD50000000181 
   
  
    
    16S-ITS-23S amplicon long read PacBio sequencing of milk, fecal and oral cavity samples, along with sequencing controls, of a subset of newborn infants from Lifelines NEXT cohort 
    
   
  
    
      
      Sequel II 
      
    
   
  66 
 
  
    EGAD50000000183 
   
  
    
    This dataset consists of raw unimputed genotype data for 81 individuals with multiple sclerosis (n=33) and other neurological disease (n=48). 
    
   
  
    
   
  81 
 
  
    EGAD50000000184 
   
  
    
    This dataset contains fastq files from single-cell RNA sequencing of cerebrospinal fluid of 81 patients with multiple sclerosis (n=33) or other neurological disease (n=48) using the 10x Genomics Chromium single cell 3’ v2 chemistry. Sequencing was performed using an Illumina HiSeq4000 sequencer. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  71 
 
  
    EGAD50000000185 
   
  
    
    This dataset contains 10x scRNA sequencing data of 16 NSCLC samples. Sequencing was performed on Illumina HiSeq 4000. The sequencing was always paired. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  16 
 
  
    EGAD50000000186 
   
  
    
    COVID-19 whole blood bulk transcriptomics single-center samples (all Dexamethasone)
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  92 
 
  
    EGAD50000000187 
   
  
    
    COVID-19 whole blood bulk transcriptomics multi-center 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  90 
 
  
    EGAD50000000188 
   
  
    
    COVID-19 PBMC single-cell transcriptomics, multiplexed 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000189 
   
  
    
    exon 11 mutated UWB1.289 and COV362 cell lines 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000190 
   
  
    
    Spatial transcriptomic data of HGSOC patients before and after treatment 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD50000000191 
   
  
    
    scRNAseq data of HGSOC patients before and after treatment 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000192 
   
  
    
    Bulk RNAseq of cultured fibroblasts 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000193 
   
  
    
    Paired RNA-Seq data from 16 samples of different tumors CD8+ T cells added to the study "Proteogenomic analysis reveals RNA as a source for tumor-agnostic neoantigen identification (H021)". Sequencing was performed on Illumina NextSeq 500. The sequencing was always paired 
    
   
  
    
      
      NextSeq 500 
      
    
   
  16 
 
  
    EGAD50000000194 
   
  
    
    Human RNA-seq data for TRIM24-MET fusion tumor primary samples and patient/PDOX-derived cell lines. Cells were treated with 0.1% DMSO or EC90 of MET inhibitors (capmatinib, cabozantinib, crizotinib) for 4 hours. BAM files containing aligned (and unaligned) reads to hg19 are provided. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD50000000195 
   
  
    
    Relevant clinical data for IMpower133 (GO30081) including treatment arm, overall survival, progression-free survival, best overall response, PD-L1 IHC, molecular subtype, and baseline clinical features. 
    
   
  
    
   
  271 
 
  
    EGAD50000000196 
   
  
    
    This dataset contains log2(TPM + 1) for 271 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from IMpower133 (GO30081). 
    
   
  
    
   
  271 
 
  
    EGAD50000000197 
   
  
    
    We generated single-cell transcriptomics, single-nuclei chromatin accessibility, CITE-seq and Multiomics of patients with PML. 
Pools are genetically multiplexed across donors, genotypic variation is included in this dataset to enable demultiplexing.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD50000000198 
   
  
    
    We performed multi-omics profiling of 38 Crohn's disease and Ulcerative colitis patients across several stimulations (RPMI, LPS, Salmonella, in total 80 samples). Nuclei were profiled using the 10X Multiome protocol which offers paired RNA+ATAC from the same nucleus (e.g. shared barcodes). Per library, the ATAC (I1, R1, R2, R3 reads) and RNA (I1, I2, R1, R2 reads) are provided.
Pools are genetically multiplexed across donors. Genotype files are provided to allow genetic demultiplexing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD50000000199 
   
  
    
    In this study, we analyzed 28 primary cutaneous follicle center lymphoma samples from 20 patients and compared the copy number profiles to a cohort of diagnostic samples of 64 nodal follicular lymphoma patients using low-coverage whole genome sequencing (lcWGS). 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  92 
 
  
    EGAD50000000200 
   
  
    
    RNA was extracted with the RNeasy Mini Kit from around 30 mg of flash-frozen tissue derived from 3 healthy and 3 tumor endometrial tissues of post-menopausal patients. Raw paired-end fastq.gz files are provided.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000201 
   
  
    
    Single cell full transcriptome sequencing of CD19 CAR T-cell infusion products used for standard of care treatment for relapsed/refractory large B-cell lymphoma. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  59 
 
  
    EGAD50000000202 
   
  
    
    We generated 10X, droplet-based paired snRNAseq+snATACseq (Multiome) of patients suffering from Long Covid. In total, we included 31 patients.
Single nuclei libraries are genetically multiplexed across donors, genotype files are available to enable demultiplexing. Phenotype sheets provide information of pools/donors as well as donor phenotype. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD50000000203 
   
  
    
    We generated 10X, droplet-based paired snRNAseq+snATACseq (Multiome) of the response to P. aeruginosa or RPMI in patients suffering from Long Covid. In total, we included 15 patients.
Single nuclei libraries are genetically multiplexed across donors, genotype files are available to enable demultiplexing. Phenotype sheets provide information of pools/donors as well as donor phenotype. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  31 
 
  
    EGAD50000000205 
   
  
    
     78 bulk-RNAseq from HGSOC patients from SCANDARE MACARON 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  78 
 
  
    EGAD50000000206 
   
  
    
    These are 28 targeted exome sequencing fastq files (tumour and germline) generated from ovarian cancer samples which had been exposed to PARP inhibitors, for the described project. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD50000000207 
   
  
    
    snRNA-sequencing data from 3 FSHD and 1 control multinucleated myotube cell cultures 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000208 
   
  
    
    Dataset comprising 27 pairs of high-depth (300x) WES results obtained from 54 PDXs derived from primary CRCs resected synchronously with their corresponding metastases. These exome sequences are part of a larger study on the heterogeneity and evolution of DNA mutation rates in microsatellite-stable colorectal cancer. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  81 
 
  
    EGAD50000000209 
   
  
    
    Deposited here are whole-genome sequencing bam files from the experimental isogenic cell lines used in the study "Redefined indel taxonomy reveals insights into mutational signatures, Koh, Gene (2023)". 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  47 
 
  
    EGAD50000000210 
   
  
    
    Ultra-deep next-generation panel-sequencing on 59 FFPE-samples (20 LTR, 26 relapsed (rHL: 11 initial-diagnosis, 15 relapse) and 13 primary-refractory (prHL: 8 initial-diagnosis, 5 progression)) from 44 cHL-patients applying a hybrid-capture approach. Data were processed as described in the publication, and duplicate-marked BAM files were uploaded.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  84 
 
  
    EGAD50000000211 
   
  
    
    Whole genome sequencing data of 25 high-grade serous carcinoma (HGSC) patients (65 samples) sequenced with Illumina NovaSeq 6000 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  65 
 
  
    EGAD50000000212 
   
  
    
    Dataset contains raw fastq-files from RNA-sequencing analysis of total RNA extracted from Brodmann Area 9 human postmortem brain tissue samples.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD50000000213 
   
  
    
    The dataset contains 20 lung cancer and 20 healthy control cfDNA samples from plasma, sequenced using 150 bp PE sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  40 
 
  
    EGAD50000000215 
   
  
    
    Exome sequencing of tumour PDX and matched patient blood 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD50000000216 
   
  
    
    Examination of Sample Multiplexing Reagents for Single Cell RNA-Seq. Nine techniques applied to samples from four PDX models: #105, #177, #233, #264 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  9 
 
  
    EGAD50000000217 
   
  
    
    Whole genome sequencing of patient tumour and blood 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000218 
   
  
    
    Genome-wide NanoRCS of cfDNA from plasma of Granulosa cell tumor patients
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
      PromethION 
      
    
   
  8 
 
  
    EGAD50000000219 
   
  
    
    Genome-wide NanoRCS of cfDNA from ascites of Ovarian cancer patients
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
      PromethION 
      
    
   
  18 
 
  
    EGAD50000000220 
   
  
    
    Genome-wide NanoRCS of cfDNA from plasma of healthy individuals 
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
      PromethION 
      
    
   
  9 
 
  
    EGAD50000000221 
   
  
    
    Genome-wide NanoRCS of cfDNA from plasma of Esophageal cancer patients
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
    
   
  14 
 
  
    EGAD50000000222 
   
  
    
    Single-cell Cut&Tag of three histone modifications and chromatin accessibility over a timecourse of brain organoid development from pluripotency. We profiled 5 time points of brain organoid development and 2 time points of retinal organoid development. At each time point we used scCut&Tag to profile histone modifications (H3K27me3, H3K27ac, H3K4me3) as well as multiome-seq (10X) to jointly profile transcriptome and chromatin accessibility from the same cell suspension. Library preparation was carried out with the 10x Genomics platform.  
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD50000000223 
   
  
    
    Single-cell Cut&Tag of three histone modifications and chromatin accessibility over a timecourse of brain organoid development from pluripotency. We profiled 5 time points of brain organoid development and 2 time points of retinal organoid development. We used scRNA-seq to profile transcriptome. Library preparation was carried out with the 10x genomics platform.  
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD50000000224 
   
  
    
    Transcriptomics and epigenomics after chemical inhibition of EED during early human brain organoid development. To investigate the role of H3K27me3 inhibition during neuroectoderm induction, we treated brain organoids with 3 concentrations of EED inhibitor (and control) during the transition of pluripotency to neuroepithelium. At the neuroepithelium stage, we profiled the organoids using scRNA-seq (transcriptome) and bulk Cut&Tag (H3K27me3, H3K27ac). Library preparation for scRNA-seq was carried out with the 10x genomics platform. 
    
   
  
    
      
      unspecified 
      
    
   
  2 
 
  
    EGAD50000000225 
   
  
    
    scRNA-seq and Cut&Tag from a sample of the human fetal brain. To compare neurogenesis in the primary human brain with that in organoids, we profiled transcriptome and chromatin accessibility (10X multiome) as well as bulk Cut&Tag of histone modifications H3K27me3, H3K27ac and H3K4me3. We used a human brain sample from GW 19 for all experiments. 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD50000000226 
   
  
    
    Patients included in this study were over 18 years of age and had a histology-confirmed diagnosis of glioblastoma multiforme (GBM). Exclusion criteria were the previous administration of any anti-tumor therapy including radiation therapy. All patients gave written informed consent. The study was approved by the local ethics committee (TUM Medical school) and conducted following the Declaration of Helsinki. During resection of the tumors, tumor tissue and tissue from normal appearing brain within the operative channel was collected. Blood was drawn during the surgical procedure. Single cell suspensions were prepared from the tumor tissue, the normal appearing brain, and the blood. CD4+ T cells and CD8+ T cells were sorted by flow cytometry. Only patients with a complete set of specimens (CD4+ tumor infiltrating lymphocytes (TIL), CD8+ TIL, CD4+ T cells from normal appearing brain, CD8+ T cells from normal appearing brain, blood-derived CD4+ and CD8+ T cells) containing a minimum of 1000 cells in each sorted sample were further analyzed (n=9). Total RNA was isolated from sorted cell populations using the RNeasy Plus Micro Kit (Qiagen, 74034). Quality and integrity of total RNA was controlled on a Bioanalyzer 2100 (Agilent Technologies). Library preparation for bulk-sequencing of poly(A)-RNA was done as described previously (Parekh et al., 2016). Briefly, barcoded cDNA of each sample was generated with a Maxima RT polymerase (ThermoFisher Scientific, EP0742) using oligo-dT primer containing barcodes, unique molecular identifiers (UMIs) and an adaptor. Ends of the cDNAs were extended by a template switch oligo (TSO) and full-length cDNA was amplified with primers binding to the TSO-site and the adaptor. NEB UltraII FS kit was used to fragment cDNA. After end repair and A-tailing, a TruSeq adapter was ligated and 3'-end-fragments were finally amplified using primers with Illumina P5 and P7 overhangs. 
In comparison to previous descriptions (Parekh et al., 2016), the P5 and P7 sites were exchanged to allow sequencing of the cDNA in read 1 and barcodes and UMIs in read 2 to achieve better cluster recognition. The library was sequenced on a NextSeq 500 (Illumina) with 59 cycles for the cDNA in read 1 and 16 cycles for the barcodes and UMIs in read 2. Data were processed using the published Drop-seq pipeline (v1.0) to generate sample- and gene-wise UMI tables (Macosko et al., 2015). The reference genome (GRCh38) was used for alignment. Transcript and gene definitions were used according to the GENCODE Annotation Version 35. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  60 
 
  
    EGAD50000000227 
   
  
    
    This dataset contains single-end fastq files for human oocytes (n=12), zygotes (n=5), and early embryos (n=10) sequenced with Illumina NextSeq500. The sequencing libraries were prepared from single oocytes, zygotes, and embryos in two batches.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  27 
 
  
    EGAD50000000228 
   
  
    
    Deposited here are time series bulk RNA-seq data generated from isogenic wild-type (WT), XPA and XPG gene knockouts hiPSC cells during cortical neuronal differentiation. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  61 
 
  
    EGAD50000000229 
   
  
    
    ANCA-associated glomerulonephritis (AGN) is associated with a high risk of end-stage kidney disease. The role of kidney immune cells in local inflammation remains unclear. This study investigates kidney immune cell diversity and function. Kidney tissue from AGN patients (n=5) and a lupus nephritis (LN) patient (n=1) was acquired during a biopsy procedure for a clinical indication. Needle-core biopsies were obtained for histopathological examination, and an additional pass was performed to retrieve kidney tissue for scRNA-seq. Healthy kidney tissue (n=1) was obtained from a kidney that was surgically removed due to a (non-invasive) papillary urothelial carcinoma. Immediately after collection, kidney tissue was processed into a single-cell suspension and sorted using a 4-color flow cytometry panel to isolate living, CD45+ immune cells. To aid in the multi-omic characterization, surface markers and T and B cell repertoires were sequenced in 2 samples (1 AGN patient and the nephrectomy control). These samples were incubated with an oligo-antibody TotalSeq-C cocktail containing 130 unique cell surface antigens.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD50000000230 
   
  
    
    Deposited here are time series bulk RNA-seq data generated from XP patients and controls as well as rescued patient hiPSC cells during cortical neuronal differentiation. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  87 
 
  
    EGAD50000000231 
   
  
    
    Human datasets associated with paper 'Clonally heritable gene expression imparts a layer of diversity within cell types' published in Cell Systems 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  7092 
 
  
    EGAD50000000232 
   
  
    
    Deposited here are WGS data generated from differentiated neural stem cells (NSCs) of NXPG-32 patient (XPG patient with neurodegeneration) and a matched healthy control CTRL-33. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD50000000233 
   
  
    
    This dataset contains WGS data of head and neck cancer samples as well as blood plasma controls. Sequencing was performed on Illumina NovaSeq 6000 using KAPA HyperPrep Kit. The sequencing was always paired. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  191 
 
  
    EGAD50000000234 
   
  
    
    To explore how JAK2V617F disrupts cell-fate and the regulatory chromatin landscape of HSCs, we applied GoT-ChA to CD34+ sorted progenitor cells from 21 human primary samples, comprising 18 patients with JAK2V617F-mutated MF (no additional mutations, except for Pt-08), who either had no treatment (n = 12; including three longitudinal samples from a PV patient who progressed to MF) or who were treated with ruxolitinib (n = 6), a JAK1/2 inhibitor. We included a JAK2V617F CH sample (Pt-19) to explore early epigenetic changes, before the onset of overt hematological abnormality.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD50000000235 
   
  
    
    Data set for scRaCH-seq manuscript. We base call the fast5 using guppy.  
    
   
  
    
      
      PromethION 
      
    
   
  21 
 
  
    EGAD50000000236 
   
  
    
    This dataset contains data from nanopore amplicon sequencing of FLG exon 3 in 22 patients and Nanopore adaptive sampling sequencing of the EDC region in two patients.  
    
   
  
    
      
      GridION 
      
    
   
  22 
 
  
    EGAD50000000237 
   
  
    
    Whole Genome Sequencing (120X) of the two tumors and blood from 4 pediatric cases with second malignancies (12 WGS) 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  12 
 
  
    EGAD50000000238 
   
  
    
    Whole Genome Sequencing of normal tissues from case 3 (9 WGS 120X), including parents blood Whole Genome Sequencing (2 WGS 30X). 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  11 
 
  
    EGAD50000000239 
   
  
    
    Duplex Sequencing of normal and tumor tissues from case 2 and case 3 (10 DS) 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  10 
 
  
    EGAD50000000240 
   
  
    
    Whole Genome Sequencing of 2 expanded clones from a cell line derived from the rhabdoid tumor case 3 (2 WGS 20X) 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  2 
 
  
    EGAD50000000241 
   
  
    
    This set contains a total of 78 CRAM files with RNA sequencing data from 20 patients included in the PANDA study, treated with 1 cycle of atezolizumab monotherapy and 4 cycles of atezolizumab plus chemotherapy (docetaxel, oxaliplatin and capecitabine). RNA was isolated from fresh frozen material and sequenced at 4 timepoints: baseline, after atezolizumab monotherapy, after combination atezolizumab plus chemotherapy, and at resection (due to 2 missing samples, there is a total of 78 samples).
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  78 
 
  
    EGAD50000000242 
   
  
    
    This set contains 40 BAM files with whole exome sequencing data from 20 patients included in the PANDA study, treated with 1 cycle of atezolizumab monotherapy and 4 cycles of atezolizumab plus chemotherapy (docetaxel, oxaliplatin and capecitabine). Tumor DNA was isolated from tumor samples and germline DNA was isolated from PBMCs; both were used for whole exome sequencing.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  40 
 
  
    EGAD50000000243 
   
  
    
    We performed whole exome sequencing, whole genome sequencing and transcriptome sequencing of multiple tumor regions from 65 patients with SCLC.  
    
   
  
    
      
      unspecified 
      
    
   
  423 
 
  
    EGAD50000000244 
   
  
    
    1034 shotgun metagenomes from baseline FIT CRCbiome samples 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  1034 
 
  
    EGAD50000000245 
   
  
    
    Exome sequencing of FFPE and patient-derived cultures from patients enrolled in clinical study NCT03860376 
    
   
  
    
      
      DNBSEQ-G400 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD50000000246 
   
  
    
    Transcriptome sequencing of FFPE and patient-derived cultures from clinical study NCT03860376 
    
   
  
    
      
      DNBSEQ-G400 
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD50000000247 
   
  
    
    This dataset contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3' Reagent from 10x Genomics on peripheral blood mononuclear cells (PBMC) from patients with colorectal cancer (CRC) and peritoneal metastases (PM). Sequencing was performed in a paired-end fashion on the NovaSeq 6000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD50000000248 
   
  
    
    This dataset contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3' Reagent from 10x Genomics on peritoneal fluid (PF) from patients with colorectal cancer (CRC) and peritoneal metastases (PM). Sequencing was performed in a paired-end fashion on the NovaSeq 6000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD50000000249 
   
  
    
    This dataset contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3' Reagent from 10x Genomics on peritoneal metastases (PM) from patients with colorectal cancer (CRC) and PM. Sequencing was performed in a paired-end fashion on the NovaSeq 6000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000250 
   
  
    
    This dataset contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3' Reagent from 10x Genomics on peripheral blood mononuclear cells (PBMC) from patients with achalasia. Sequencing was performed in a paired-end fashion on the NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000251 
   
  
    
    This dataset contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3' Reagent from 10x Genomics on peritoneal fluid (PF) from patients with achalasia. Sequencing was performed in a paired-end fashion on the NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000252 
   
  
    
    This dataset contains cellular indexing of transcriptomes and epitopes by sequencing (CITEseq) data obtained using the oligo-tagged TotalSeq™-B Human Universal Cocktail, V1.0 from BioLegend on peritoneal fluid (PF) from patients with achalasia. Sequencing was performed in a paired-end fashion on the NovaSeq 6000.
Below is an overview of the epitope-to-oligo-tag mapping as obtained from BioLegend:
DNA_ID	Description	clone	Sequence	Ensembl ID
B0006	anti-human CD86	IT2.2	GTCTTTGTCAGTGCA	ENSG00000114013
B0007	anti-human CD274 (B7-H1, PD-L1)	29E.2A3	GTTGTCCGACAATAC	ENSG00000120217
B0020	anti-human CD270 (HVEM, TR2)	122	TGATAGAAACAGACC	ENSG00000157873
B0023	anti-human CD155 (PVR)	SKII.4	ATCACATCGTTGCCA	ENSG00000073008
B0024	anti-human CD112 (Nectin-2)	TX31	AACCTTCCGTCTAAG	ENSG00000130202
B0026	anti-human CD47	CC2C6	GCATTCTGTCACCTA	ENSG00000196776
B0029	anti-human CD48	BJ40	CTACGACGTAGAAGA	ENSG00000117091
B0031	anti-human CD40	5C3	CTCAGATGGAGTATG	ENSG00000101017
B0032	anti-human CD154	24-31	GCTAGATAGATGCAA	ENSG00000102245
B0033	anti-human CD52	HI186	CTTTGTACGAGCAAA	ENSG00000169442
B0034	anti-human CD3	UCHT1	CTCATTGTAACTCCT	ENSG00000167286
B0046	anti-human CD8	SK1	GCGCAACTTGATGAT	ENSG00000153563
B0047	anti-human CD56 (NCAM)	5.1H11	TCCTTTCCTGATAGG	ENSG00000149294
B0050	anti-human CD19	HIB19	CTGGGCAATTACTCG	ENSG00000177455
B0052	anti-human CD33	P67.6	TAACTCAGGGCCTAT	ENSG00000105383
B0053	anti-human CD11c	S-HCL-3	TACGCCTATAACTTG	ENSG00000140678
B0058	anti-human HLA-A,B,C	W6/32	TATGCGAGGCTTATC	ENSG00000206503
B0063	anti-human CD45RA	HI100	TCAATCCTTCCGCTT	ENSG00000081237
B0064	anti-human CD123	6H6	CTTCACTCTGTCAGG	ENSG00000185291
B0066	anti-human CD7	CD7-6B7	TGGATTCCCGGACTT	ENSG00000173762
B0068	anti-human CD105	43A3	ATCGTCGAGAGCTAG	ENSG00000106991
B0070	anti-human/mouse CD49f	GoH3	TTCCGAGGATGATCT	ENSG00000091409
B0071	anti-human CD194 (CCR4)	L291H4	AGCTTACCTGCACGA	ENSG00000183813
B0072	anti-human CD4	RPA-T4	TGTTCCCGCTCAACT	ENSG00000010610
B0073	anti-mouse/human CD44	IM7	TGGCTTCAGGTCCTA	ENSG00000026508
B0081	anti-human CD14	M5E2	TCTCAGACCTCCGTA	ENSG00000170458
B0083	anti-human CD16	3G8	AAGTTCACTCTTTGC	ENSG00000203747
B0085	anti-human CD25	BC96	TTTGTCCTGTACGCC	ENSG00000134460
B0087	anti-human CD45RO	UCHL1	CTCCGAATCATGTTG	ENSG00000081237
B0088	anti-human CD279 (PD-1)	EH12.2H7	ACAGCGCCGTATTTA	ENSG00000188389
B0089	anti-human TIGIT (VSTM3)	A15153G	TTGCTTACCGCCAGA	ENSG00000181847
B0090	Mouse IgG1, κ isotype 	MOPC-21	GCCGGACGACATTAA	
B0091	Mouse IgG2a, κ isotype 	MOPC-173	CTCCTACCTAAACTG	
B0092	Mouse IgG2b, κ isotype 	MPC-11	ATATGTATCACGCGA	
B0095	Rat IgG2b, κ isotype 	RTK4530	GATTCTTGACGACCT	
B0100	anti-human CD20	2H7	TTCTGGGTCCCTAGA	ENSG00000156738
B0101	anti-human CD335 (NKp46)	9E2	ACAATTTGAACAGCG	ENSG00000189430
B0124	anti-human CD31	WM59	ACCTTTATGCCACGG	ENSG00000261371
B0127	anti-Human Podoplanin	NC-08	GGTTACTCGTTGTGT	ENSG00000162493
B0134	anti-human CD146	P1H12	CCTTGGATAACATCA	ENSG00000076706
B0136	anti-human IgM	MHM-88	TAGCGAGCCCGTATA	ENSG00000211899
B0138	anti-human CD5	UCHT2	CATTAACGGGATGCC	ENSG00000110448
B0141	anti-human CD195 (CCR5)	J418F1	CCAAAGTAAGAGCCA	ENSG00000160791
B0142	anti-human CD32	FUN-2	GCTTCCGAATTACCG	ENSG00000143226
B0143	anti-human CD196 (CCR6)	G034E3	GATCCCTTTGTCACT	ENSG00000112486
B0144	anti-human CD185 (CXCR5)	J252D4	AATTCAACCGTCGCC	ENSG00000160683
B0145	anti-human CD103 (Integrin αE)	Ber-ACT8	GACCTCATTGTGAAT	ENSG00000083457
B0146	anti-human CD69	FN50	GTCTCTTGGCTTAAA	ENSG00000110848
B0147	anti-human CD62L	DREG-56	GTCCCTGCAACTTGA	ENSG00000188404
B0149	anti-human CD161	HP-3G10	GTACGCAGTCCTTCT	ENSG00000111796
B0151	anti-human CD152 (CTLA-4)	BNI3	ATGGTTCACGTAATC	ENSG00000163599
B0152	anti-human CD223 (LAG-3)	11C3C65	CATTTGTCTGCCGGT	ENSG00000089692
B0153	anti-human KLRG1 (MAFA)	SA231A2	CTTATTTCCTGCCCT	ENSG00000139187
B0154	anti-human CD27	O323	GCACTCCTGCATGTA	ENSG00000139193
B0155	anti-human CD107a (LAMP-1)	H4A3	CAGCCCACTGCAATA	ENSG00000185896
B0156	anti-human CD95 (Fas)	DX2	CCAGCTCATTAGAGC	ENSG00000026103
B0158	anti-human CD134 (OX40)	Ber-ACT35 (ACT35)	AACCCACCGTTGTTA	ENSG00000186827
B0159	anti-human HLA-DR	L243	AATAGCGAGCAAGTA	ENSG00000204287
B0160	anti-human CD1c	L161	GAGCTACTTCACTCG	ENSG00000158481
B0161	anti-human CD11b	ICRF44	GACAAGTGATCTGCA	ENSG00000169896
B0162	anti-human CD64	10.1	AAGTATGCCCTACGA	ENSG00000150337
B0163	anti-human CD141 (Thrombomodulin)	M80	GGATAACCGCGCTTT	ENSG00000178726
B0164	anti-human CD1d	51.1	TCGAGTCGCTTATCA	ENSG00000158473
B0165	anti-human CD314 (NKG2D)	1D11	CGTGTTTGTTCCTCA	ENSG00000213809
B0167	anti-human CD35	E11	ACTTCCGTCGATCTT	ENSG00000203710
B0168	anti-human CD57 Recombinant	QA17A04	AACTCCCTATGGAGG	ENSG00000109956
B0170	anti-human CD272 (BTLA)	MIH26	GTTATTGGACTAAGG	ENSG00000186265
B0171	anti-human/mouse/rat CD278 (ICOS)	C398.4A	CGCGCACCCATTAAA	ENSG00000163600
B0174	anti-human CD58 (LFA-3)	TS2/9	GTTCCTATGGACGAC	ENSG00000116815
B0176	anti-human CD39	A1	TTACCTGGTATCCGT	ENSG00000138185
B0179	anti-human CX3CR1	K0124E1	AGTATCGTCTCTGGG	ENSG00000168329
B0180	anti-human CD24	ML5	AGATTCCTTCGTGTT	ENSG00000272398
B0181	anti-human CD21	Bu32	AACCTAGTAGTTCGG	ENSG00000117322
B0185	anti-human CD11a	TS2/4	TATATCCTTGTGAGC	ENSG00000005844
B0187	anti-human CD79b (Igβ)	CB3-1	ATTCTTCAACCGAAG	ENSG00000007312
B0189	anti-human CD244 (2B4)	C1.7	TCGCTTGGATGGTAG	ENSG00000122223
B0206	anti-human CD169 	7-239	TACTCAGCGTGTTTG	ENSG00000088827
B0214	anti-human/mouse integrin β7	FIB504	TCCTTGGATGTACCG	ENSG00000139626
B0215	anti-human CD268 (BAFF-R)	11C1	CGAAGTCGATCCGTA	ENSG00000159958
B0216	anti-human CD42b	HIP1	TCCTAGTACCGAAGT	ENSG00000203618
B0217	anti-human CD54	HA58	CTGATAGACTTGAGT	ENSG00000090339
B0218	anti-human CD62P (P-Selectin)	AK4	CCTTCCGTATCCCTT	ENSG00000174175
B0219	anti-human CD119 (IFN-γ R α chain)	GIR-208	TGTGTATTCCCTTGT	ENSG00000027697
B0224	anti-human TCR α/β	IP26	CGTAACGTAGAGCGA	
B0236	Rat IgG1, κ isotype 	RTK2071	ATCAGATGCCCTCAT	
B0238	Rat IgG2a, κ Isotype 	RTK2758	AAGTCAGGTTCGTTT	
B0242	anti-human CD192 (CCR2)	K036C2	GAGTTCCCTTACCTG	ENSG00000121807
B0246	anti-human CD122 (IL-2Rβ)	TU27	TCATTTCCTCCGATT	ENSG00000100385
B0352	anti-human FcεRIα	AER-37 (CRA-1)	CTCGTTTCCGTATCG	ENSG00000179639
B0353	anti-human CD41	HIP8	ACGTTGTGGCCTTGT	ENSG00000005961
B0355	anti-human CD137 (4-1BB)	4B4-1	CAGTAAGTTCGGGAC	ENSG00000049249
B0358	anti-human CD163	GHI/61	GCTTCTCCTTCCTTA	ENSG00000177575
B0359	anti-human CD83	HB15e	CCACTCATTTCCGGT	ENSG00000112149
B0363	anti-human CD124 (IL-4Rα)	G077F6	CCGTCCTGATAGATG	ENSG00000077238
B0364	anti-human CD13	WM15	TTTCAACGCCCTTTC	ENSG00000166825
B0367	anti-human CD2	TS1/8	TACGATTTGTCAGGG	ENSG00000116824
B0368	anti-human CD226 (DNAM-1)	11A8	TCTCAGTGTTTGTGG	ENSG00000150637
B0369	anti-human CD29	TS2/16	GTATTCCCTCAGTCA	ENSG00000150093
B0370	anti-human CD303 (BDCA-2)	201A	GAGATGTCCGAATTT	ENSG00000198178
B0371	anti-human CD49b	P1E6-C5	GCTTTCTTCAGTATG	ENSG00000164171
B0373	anti-human CD81 (TAPA-1)	5A6	GTATCCTTCCTTGGC	ENSG00000110651
B0384	anti-human IgD	IA6-2	CAGTCTCCGTAGAGT	ENSG00000211898
B0385	anti-human CD18	TS1/18	TATTGGGACACTTCT	ENSG00000160255
B0386	anti-human CD28	CD28.2	TGAGAACGACCCTAA	ENSG00000178562
B0389	anti-human CD38	HIT2	TGTACCCGCTTGTGA	ENSG00000004468
B0390	anti-human CD127 (IL-7Rα)	A019D5	GTGTGTTGTCCTATG	ENSG00000168685
B0391	anti-human CD45	HI30	TGCAATTACCCGGAT	ENSG00000081237
B0393	anti-human CD22	S-HCL-1	GGGTTGTTGTCTTTG	ENSG00000012124
B0394	anti-human CD71	CY1G4	CCGTGTTCCTCATTA	ENSG00000072274
B0396	anti-human CD26	BA5b	GGTGGCTAGATAATG	ENSG00000197635
B0398	anti-human CD115 (CSF-1R)	9-4D2-1E4	AATCACGGTCCTTGT	ENSG00000182578
B0404	anti-human CD63	H5C6	GAGATGTCTGCAACT	ENSG00000135404
B0406	anti-human CD304 (Neuropilin-1)	12C2	GGACTAAGTTTCGTT	ENSG00000099250
B0407	anti-human CD36	5-271	TTCTTTGCCTTGCCA	ENSG00000135218
B0408	anti-human CD172a (SIRPα)	15-414	CGTGTTTAACTTGAG	ENSG00000198053
B0419	anti-human CD72	3F3	CAGTCGTGGTAGATA	ENSG00000137101
B0420	anti-human CD158 (KIR2DL1/S1/S3/S5)	HP-MA4	TATCAACCAACGCTT	ENSG00000125498
B0446	anti-human CD93	VIMD2	GCGCTACTTCCTTGA	ENSG00000125810
B0575	anti-human CD49a	TS2/7	ACTGATGGACTCAGA	ENSG00000213949
B0576	anti-human CD49d	9F10	CCATTCAACTTCCGG	ENSG00000115232
B0577	anti-human CD73 (Ecto-5'-nucleotidase)	AD2	CAGTTCCTCAGTTCG	ENSG00000135318
B0579	anti-human CD9	HI9a	GAGTCACCAATCTGC	ENSG00000010278
B0581	anti-human TCR Vα7.2	3C10	TACGAGCAGTATTCA	
B0582	anti-human TCR Vδ2	B6	TCAGTCAGATGGTAT	
B0591	anti-human LOX-1	15C4	ACCCTTTACCGAATA	ENSG00000173391
B0592	anti-human CD158b (KIR2DL2/L3, NKAT2)	DX27	GACCCGTAGTTTGAT	ENSG00000243772
B0599	anti-human CD158e1 (KIR3DL1, NKB1)	DX9	GGACGCTTTCCTTGA	ENSG00000167633
B0822	anti-human CD142	NY2	CACTGCCGTCGATTA	ENSG00000117525
B0830	anti-human CD319 (CRACC)	162.1	AGTATGCCATGTCTT	ENSG00000026751
B0864	anti-human CD352 (NTB-A)	NT-7	AGTTTCCACTCAGGC	ENSG00000162739
B0867	anti-human CD94	DX22	CTTTCCGGTCCTACA	ENSG00000134539
B0871	anti-human CD162	KPL-1	ATATGTCAGAGCACC	ENSG00000110876
B0896	anti-human CD85j (ILT2)	GHI/75	CCTTGTGAGGCTATG	ENSG00000104972
B0897	anti-human CD23	EBVCS-5	TCTGTATAACCGTCT	ENSG00000104921
B0902	anti-human CD328 (Siglec-7)	6-434	CTTAGCATTTCACTG	ENSG00000168995
B0918	anti-human HLA-E	3D12	GAGTCGAGAAATCAT	ENSG00000204592
B0920	anti-human CD82	ASL-24	TCCCACTTCCGCTTT	ENSG00000085117
B0944	anti-human CD101 (BB27)	BB27	CTACTTCCCTGTCAA	ENSG00000134256
B1046	anti-human CD88 (C5aR)	S5/1	GCCGCATGAGAAACA	ENSG00000197405
B1052	anti-human CD224	KF29	CTGATGAGATGTCAG	ENSG00000100031
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000253 
   
  
    
    This dataset is the second batch of WGS uploaded from FL GenomeCanada data. The other batch is in EGAD00001011343 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  78 
 
  
    EGAD50000000255 
   
  
    
    Bulk RNA sequencing of iPSCs and iPSC-derived pericytes from three cell lines: MNZTASi019-A, MNZTASi021-A and MNZTASi022-A. Data include one iPSC data set per cell line and iPSC-derived pericyte differentiations per cell line (3 differentiations each from MNZTASi019-A and MNZTASi021-A, 2 from MNZTASi022-A). RNA-seq data were generated using an Illumina Stranded mRNA 150 bp paired-end library preparation.  
    
   
  
    
      
      NextSeq 2000 
      
    
   
  11 
 
  
    EGAD50000000257 
   
  
    
    Raw paired-end whole-genome sequencing data of plasma cell free DNA on the NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  810 
 
  
    EGAD50000000258 
   
  
    
    Metagenomic sequencing of human fecal samples 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  239 
 
  
    EGAD50000000259 
   
  
    
    FASTQ files and VCF files called by Agilent SureCall software 
    
   
  
    
      
      NextSeq 550 
      
    
   
  20 
 
  
    EGAD50000000260 
   
  
    
    Single-cell sequencing of expanded regulatory T cells (Tregs) in 9 APS-1 patients and 9 age- and gender-matched controls.
Gene expression (GEX) libraries were generated using the Library Construction Kit from 10x Genomics.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD50000000261 
   
  
    
    Single-cell TCR sequencing of expanded regulatory T cells (Tregs) in 9 APS-1 patients and 9 age- and gender-matched controls.
T-cell receptor (TCR) libraries were generated using the Single Cell Human TCR Amplification Kit and the Library Construction Kit from 10x Genomics. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD50000000262 
   
  
    
    Single-cell sequencing of expanded regulatory T cells (Tregs) in 8 APS-1 patients and 8 age- and gender-matched controls (same patients and controls as for global gene expression and TCR sequencing, excluding one for each group (control sample and patient sample #13)).
Each sample has two technical replicates.
The 10x Genomics Target Hybridization Kit and Human Immunology Panel were used with GEX libraries. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000264 
   
  
    
    Despite major advances in linking single genetic variants to single causal genes, the significance of genetic variation on transcript-level regulation of expression, transcript-specific functions, and relevance to human disease has been poorly investigated. Strawberry notch homolog 2 (SBNO2) is a candidate gene in a susceptibility locus with different variants associated with Crohn’s disease and bone mineral density. The SBNO2 locus is also differentially methylated in Crohn’s disease but the functional mechanisms are unknown. Here we show that the isoforms of SBNO2 are differentially regulated by lipopolysaccharide and IL-10. We identify Crohn’s disease associated isoform quantitative trait loci that negatively regulate the expression of the noncanonical isoform 2 corresponding with the methylation signals at the isoform 2 promoter in IBD and CD. The two isoforms of SBNO2 drive differential gene networks with isoform 2 dominantly impacting antimicrobial activity in macrophages. Our data highlight the role of isoform quantitative trait loci to understand disease susceptibility and resolve underlying mechanisms of disease.
This dataset contains raw RNAseq data from CD14+ monocyte-derived macrophages and siRNA-mediated knockdown experiments, as well as raw RNAseq data from THP-1 monocyte-derived macrophages following ectopic expression of SBNO2 isoforms. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD50000000266 
   
  
    
    The breadth and depth at which cancer models are interrogated contribute to successful translation of drug discovery efforts to the clinic. In colorectal cancer (CRC), model availability is limited by a dearth of large-scale collections of patient-derived xenografts (PDXs) and paired tumoroids from metastatic disease, the setting where experimental therapies are typically tested. XENTURION is a unique open-science resource that combines a platform of 129 PDX models and a sister platform of 129 matched PDX-derived tumoroids (PDXTs) from patients with metastatic CRC, with accompanying multidimensional molecular and therapeutic characterization. In this specific dataset we focused our attention on early (passage 3) and late (passage 8-12) PDXTs with their matched PDXs and normal liver 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  92 
 
  
    EGAD50000000267 
   
  
    
    Amplicon sequencing 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  589 
 
  
    EGAD50000000269 
   
  
    
    This dataset contains 21 pediatric brain tumor cases which were obtained during surgery using cavitating ultrasonic aspiration and submitted to nanopore whole genome sequencing. 
    
   
  
    
      
      MinION 
      
    
   
  21 
 
  
    EGAD50000000270 
   
  
    
    Targeted sequencing using the TruSight Oncology 500 DNA probes panel, on samples 162 and 521 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD50000000271 
   
  
    
    TruSight Oncology 500 RNA probes panel on case 521 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD50000000272 
   
  
    
    RNA extracted from FFPE tumour samples 368, 455, 503, and 521, sequenced using the TruSight RNA Fusion Panel 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD50000000273 
   
  
    
    311 genes  
    
   
  
    
   
  1 
 
  
    EGAD50000000274 
   
  
    
    This data set contains 16 paired fastq files (WGS) and 4 paired fastq files (WES). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  4 
 
  
    EGAD50000000275 
   
  
    
    Whole genome sequencing data of 56 high-grade serous carcinoma (HGSC) patients (208 samples) sequenced with NovaSeq 6000 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  208 
 
  
    EGAD50000000276 
   
  
    
    The synthetic genomes were created to mimic real cancer data of 4 patients (named 185, 186, 187 and 188). Mutations are based on real CRC patients from the PCAWG dataset. For each patient, two tumor samples at different time points and one healthy sample have been simulated. Intra-tumor heterogeneity and evolution in each patient are modeled by simulating reads from tumor subclones separately and then mixing them according to their clonal proportions in each sample. For rapid use and transfer, only selected chromosomes have been generated for each patient.
Chromosomes per patient:
-185: chr4, chr5, chr7, chr17
-186: chr1, chr7, chr12, chr17
-187: chr1, chr2, chr5, chr12, chr17
-188: chr2, chr5, chr12, chr13, chr17
Workflows used to create BAM/BAI, VCF and MAF files from FASTQ (alignment with GRCh38):
- https://usegalaxy.eu/published/workflow?id=2c3d05023c02113e
- https://usegalaxy.eu/published/workflow?id=1da86d74f8535f4e 
    
   
  
    
      
      unspecified 
      
    
   
  8 
 
  
    EGAD50000000277 
   
  
    
    Chemotherapy is the standard-of-care treatment for metastatic colorectal cancer (mCRC) and benefits some patients, but what distinguishes responders from non-responders is unclear. In this study, we leveraged a comprehensive collection of 27 molecularly annotated patient-derived xenografts to uncover functional predictors of response to 5-FU and irinotecan combination therapy (FOLFIRI) in mCRC. Genetic analyses revealed that treatment sensitivity was marked by genomic scars indicative of BRCAness, suggesting homologous recombination (HR) deficiency as a key determinant. Accordingly, we surveyed a manually curated panel of 44 genes with a documented role in HR for the potential presence of pathogenic mutations. We did not observe a specific enrichment of HR gene mutations based on response to FOLFIRI. This result, combined with the absence of widespread biallelic inactivation of the analyzed genes and the predominance of mutations categorized as variants of unknown significance, suggests that FOLFIRI sensitivity is not primarily governed by underlying mutations in HR genes responsible for mitigating the genotoxic effects of this therapeutic regimen. 
    
   
  
    
      
      unspecified 
      
    
   
  27 
 
  
    EGAD50000000283 
   
  
    
    The dataset comprises data for n=329 participants who underwent saliva sampling. Shotgun metagenomic paired-end sequencing was conducted using the Illumina NovaSeq 6000 platform, and the resulting files are in FASTQ format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  331 
 
  
    EGAD50000000285 
   
  
    
    DM1 patient blood transcriptome samples, mix of 6 patients per sample 
    
   
  
    
      
      Sequel II 
      
    
   
  3 
 
  
    EGAD50000000286 
   
  
    
    The capture panel targets the functional methylome in human whole blood. Regions incorporated in the panel design included hypomethylated windows generated from merged WGBS data. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  527 
 
  
    EGAD50000000287 
   
  
    
    Gut microbiome 16S rRNA raw data for N=7174 FINRISK 2002 participants.
FINRISK fecal samples were mailed to the Knight laboratory at the University of California (San Diego, CA), for microbiota sequencing using the standard Earth Microbiome Project protocols (https://earthmicrobiome.org/protocols-and-standards/). DNA was extracted using a magnetic bead-based DNA extraction protocol. Amplicon sequence data for the V4 region of the 16S rRNA gene was generated using 515F (Parada) and 806R (Apprill) primers. For a total reaction volume of 25 µl, 13 µl PCR-grade water was combined with 10 µl PCR master mix (Platinum Hot Start PCR Master Mix, 2x, ThermoFisher), 0.5 µl forward primer (10 µM), 0.5 µl reverse primer (10 µM), and 1 µl template DNA. Amplification was performed in triplicate reactions, and triplicate PCR reactions were pooled afterwards. Expected products were visualized on agarose gels (300–350 bp) and quantified with Quant-iT PicoGreen dsDNA kit (Invitrogen). Equal amounts (240 ng) of amplicon were combined for each sample and cleaned (MoBio UltraClean PCR Clean-Up Kit). Cleaned amplicon pools were sequenced with 515F and 806R primers. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  7174 
 
  
    EGAD50000000291 
   
  
    
    iPSCs were generated from a single patient; a subset received a genome correction and is referred to as iCtrl, giving two batches (DJ1 and iCtrl). Three independent differentiations to microglia were performed for each batch, named R1, R2 and R3.
In addition, cells were either left untreated or treated with LPS, giving 12 samples in total. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD50000000292 
   
  
    
    Single-cell RNA-sequencing of bronchoalveolar lavage (BAL) samples from patients with severe COVID-19 with or without Dexamethasone treatment and for responders and non-responders was performed using 10x Genomics technology. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000293 
   
  
    
    This dataset consists of WES NGS paired-end raw data (FASTQ R1, R2 and UMI sequence) obtained from 25 localised colon cancer patients. Specifically, we have 25 primary tumor tissue samples, 17 metastatic tissues, 25 white blood cell samples, 25 plasma samples collected at relapse, 12 baseline plasma samples and 15 plasma samples post-surgery. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  119 
 
  
    EGAD50000000294 
   
  
    
    Ultra high-resolution chromatin capture data in CACO2, CL11, HT29, SW403, SW480, SW948 MSS CRC cell lines 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000295 
   
  
    
    ChIP-Seq data for CTCF, H3K4me1, H3K4me3, H3K27ac, H3K27me3, H3K36me3 in C32, CL11, HT29, SW403, SW480, SW948 MSS CRC cell lines 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000296 
   
  
    
    ATAC-Seq data for C32, CACO2, CL11, HT29, SW403, SW480, SW948 MSS CRC cell lines, and HCEC-1CT normal colon cell line 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000297 
   
  
    
    RNA-Seq data for C32, CACO2, CL11, HT29, SW403, SW948 MSS CRC cell lines and HCEC-1CT normal colon cell line 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD50000000298 
   
  
    
    The dataset represents a total of 58 DNA samples from 16 male and 12 female pediatric patients affected with embryonal central nervous system tumors. The samples were subject to whole genome sequencing, WGS, [48 samples, (representing 12 male and 11 female individuals)] and whole exome sequencing, WES, [10 samples, (representing 4 male and 1 female individuals)]. One tumor tissue sample and one peripheral blood sample were analyzed from each of 26 patients, whereas two tumor tissue samples and one peripheral blood sample were analyzed from two patients. The WGS samples were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument, and the WES samples were sequenced 2x100 bp paired-end on an Illumina HiSeq 2500 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates were performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 58 files in the CRAM format (lossless compression) with a total file size of ~8.8 TB. All CRAM files but one are derived from one sequence run and one sample. P4551_227N_P4552_112N is a CRAM file where 2 sequence runs (P4551_227N and P4552_112N) from peripheral blood samples from the same individual, P019, were aligned into one single CRAM file. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  58 
 
  
    EGAD50000000299 
   
  
    
    The dataset represents a total of 18 DNA samples from 6 male and 3 female pediatric patients affected with central or peripheral nervous system tumors not classified as embryonal central nervous system tumors, nor gliomas, glioneuronal, or neuronal tumors. One tumor tissue sample and one peripheral blood sample from each patient were subject to whole genome sequencing (WGS) and were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates were performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 18 files in the CRAM format (lossless compression) with a total file size of ~3.4 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  18 
 
  
    EGAD50000000300 
   
  
    
    The dataset represents a total of 85 DNA samples from 22 male and 20 female pediatric patients affected with gliomas, glioneuronal, and neuronal tumors. The samples were subject to whole genome sequencing, WGS, [71 samples, (representing 18 male and 17 female individuals)] and whole exome sequencing, WES, [14 samples, (representing 4 male and 3 female individuals)]. One tumor tissue sample and one peripheral blood sample were analyzed from each of 41 patients, whereas two tumor tissue samples and one peripheral blood sample were analyzed from one patient. The WGS samples were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument, and the WES samples were sequenced 2x100 bp paired-end on an Illumina HiSeq 2500 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates were performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 85 files in the CRAM format (lossless compression) with a total file size of ~13.3 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina HiSeq 2500 
      
    
   
  85 
 
  
    EGAD50000000301 
   
  
    
    EBNA2 ChIP-Re-ChIP in primary bulk B cells 4 days post EBV-infection with B95-8. DNA binding elements of the viral master regulator EBNA2, EBNA2-CBF transcription factor complex or EBNA2-EBF1 transcription factor complex were analyzed to identify virally targeted genes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD50000000302 
   
  
    
    PBMCs were collected pre-transplant and at one, three and six months post-transplant from six kidney transplant recipients who received autologous Tregs and donor bone marrow as part of the Trex001 study, and from six control patients who received neither treatment. Donor-reactive T-cells were identified by mixed lymphocyte reactions (MLR), and lineage-specific T-cell receptor (TCR) repertoires of native T-cells and of proliferating and non-proliferating T-cells from MLRs were determined by next-generation sequencing-based profiling of the TCR. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  181 
 
  
    EGAD50000000303 
   
  
    
    Targeted cfDNA and WBC sequencing data from patients profiled in the study "Prediction of plasma ctDNA fraction and prognostic implications of liquid biopsy in advanced prostate cancer". Note that 'Exome sequencing' is listed under Dataset Type since there is no option for targeted panel sequencing. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1228 
 
  
    EGAD50000000304 
   
  
    
    Organoid cultures were exposed to different E. coli strains and a dye control. In total, 25 organoid cultures were whole-genome sequenced on the NovaSeq 6000 platform. The data are deposited in BAM format. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD50000000305 
   
  
    
    Naïve (CD27-IgD+) B cells were isolated from buffy coat preparations of healthy donors using CD19 magnetic beads, followed by release of the CD19 beads and incubation with IgD-biotin and anti-biotin magnetic beads. B cells were infected with EBV by spinoculation or stimulated with heat-inactivated EBV, and control cells were left uninfected. RNA was extracted from uninfected B cells immediately after isolation. From EBV-infected B cells and B cells stimulated with heat-inactivated virus, RNA was extracted 24 and 96 hours after infection / stimulation. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  31 
 
  
    EGAD50000000306 
   
  
    
    B cells were isolated from buffy coat preparations of healthy donors using CD19 magnetic beads. B cells were then infected with EBV by spinoculation or activated with heat-inactivated EBV. Additionally, cells were treated with 5 ng/ml CpG or a BCR-crosslinking mixture in the presence or absence of 10 µM Linrodostat (IDO1 inhibitor). RNA samples were isolated at 2 days post-infection / post-activation, and one day 0 control with non-infected B cells was included. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD50000000307 
   
  
    
    This research project was a collaboration between the Karolinska Institute and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 5,876 bipolar case/control samples from collaborators in Sweden. Genomic DNA from each sample was sequenced to a mean depth of 20x. 
    
   
  
    
      
      HiSeq X Ten 
      
      Illumina Genome Analyzer IIx 
      
    
   
  5876 
 
  
    EGAD50000000308 
   
  
    
    The data set consists of unprocessed RNA-Seq data from 225 patients diagnosed with T cell acute lymphoblastic leukemia in fastq file format. Samples from bone marrow or peripheral blood were subjected to mRNA library prep using Poly-A selection and sequencing on a NovaSeq 6000 system yielding approximately 30 million reads per sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  225 
 
  
    EGAD50000000309 
   
  
    
    Whole exome sequencing data were mapped to GRCh38 using bwa-mem2 as implemented in the nf-core/sarek workflow. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  13 
 
  
    EGAD50000000310 
   
  
    
    Genomic data from diffuse large B-cell lymphomas at diagnosis and during treatment. The processed samples were:
   - Circulating tumor DNA at diagnosis and after 2 cycles of treatment
   - Tumour lymph node at diagnosis
   - Genomic DNA 
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  154 
 
  
    EGAD50000000311 
   
  
    
    We identified 2 germline mutations in the DCLRE1B gene encoding the Apollo protein by Whole Exome Sequencing (WES) in two families with inherited clear-cell Renal Cell Carcinoma. The raw data submitted here corresponds to WES performed on Genomic Platform at Gustave Roussy Institute. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000312 
   
  
    
    RNA sequencing data from 271 samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  271 
 
  
    EGAD50000000313 
   
  
    
    The dataset contains whole genome sequencing data from 23 high-grade serous carcinoma (HGSC) patients, sequenced on a NovaSeq 6000. The 89 samples are either fresh frozen tumour samples or blood samples. The files provided are paired FASTQ files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  89 
 
  
    EGAD50000000317 
   
  
    
    We included 16 fresh tumor samples of biopsy-proven invasive penile squamous cell carcinomas and 6 adult non-malignant inner prepuce samples. Single-cell RNA sequencing was performed (10x genomics). An HPV reference genome of 15 high-risk HPV types was generated and we mapped all single cell reads to the GRCh38 human and HPV reference genome using CellRanger. Targeted next-generation sequencing (tNGS) was performed for the detection of TP53 loss-of-function mutations.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD50000000318 
   
  
    
    Shallow whole genome sequencing of 170 samples from 24 esophageal adenocarcinomas. DNA was obtained from FFPE-stored material. An Illumina HiSeq 4000 was used for sequencing.
 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  170 
 
  
    EGAD50000000319 
   
  
    
    This project used NGS (next generation sequencing) in mismatch repair deficient colorectal cancer samples. The project investigated the role of secondary MMR (mismatch repair) gene mutations in tumor evolution. This dataset includes BAM files from multi-region Whole Exome Sequencing of 49 samples from 22 patient tumors.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  71 
 
  
    EGAD50000000320 
   
  
    
    Fastq files of bulk RNAseq data from DCIS, invasive and microinvasive breast cancer at diagnosis. This dataset covers 18 DCIS cases, 17 microinvasive and 20 primary invasive breast cancer.  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  55 
 
  
    EGAD50000000321 
   
  
    
    FASTQ files from scRNA-seq data of cancer-associated pericyte-like cells (CAP) from 3 patients. 
CAP were isolated from a total of 3 primary BC (surgical residues prior to any treatment) using a BD FACSAria III sorter (BD Biosciences). BC were collected directly from the operating room after macroscopic examination of the surgical specimen and selection of areas of interest by a pathologist. Samples were cut into small pieces (around 1 mm³) and digested in CO2-independent medium (Gibco #18045-054) supplemented with 150 μg/mL liberase (Roche #05401020001) and DNase I (Roche #11284932001) for 40 min at 37°C with shaking (180 rpm). After digestion, cells were processed and stained as described above (#Flow Cytometry analysis of BC samples). CAP fibroblasts were then gated on the Live/Dead negative fraction and defined as EPCAM- CD45- CD31- CD235a- FAPMed CD29High.
CAP scRNA-seq: Upon isolation, CAP cells were directly collected into RNase-free tubes (Thermo Fisher Scientific, #AM12450) precoated with DMEM (GE Life Sciences, #SH30243.01) supplemented with 10% FBS (Biosera, #1003/500). Single-cell capture, lysis, and cDNA library construction were performed using the Chromium system from 10X Genomics, with the following kits: Chromium Single Cell 3′ Library & Gel Bead Kit v2 (10X Genomics, #120237) and Chromium Single Cell A Chip Kits (10X Genomics, #1000009). Generation of gel beads in emulsion (GEM), barcoding, post GEM-reverse transcription cleanup and cDNA amplification were performed according to the manufacturer’s instructions. Cells were loaded accordingly on the Chromium Single Cell A chips, and 12 cycles were performed for cDNA amplification. cDNA quality and quantity were checked on an Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA Kit (Agilent, #5067-4626), and library construction followed according to the 10X Genomics protocol. Libraries were then run on the Illumina HiSeq (for patient P1) and NovaSeq (for patients P2–3) with a sequencing depth of 50,000 reads per cell. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000322 
   
  
    
    FASTQ files from spatial transcriptomics of 8 breast cancer (BC) sections. 
Sample preparation:  frozen BC samples were chosen based on tissue structure and RNA quality (RIN > 8). The “Visium Spatial Tissue Optimization Slide and Reagent Kit” (10X Genomics; #PN-1000193) was then used to optimize permeabilization conditions for BC tissues. Briefly, sections were fixed, stained and then permeabilized at different time points to capture mRNA, and the reverse transcription was performed to generate fluorescently labeled cDNA. The permeabilization time that resulted in the highest fluorescence signal with the lowest background diffusion was chosen. The best permeabilization time for BC tissue was 18 min. Cryostat sections of 10 μm of thickness were cut and placed on Visium Spatial Gene Expression slides (10X Genomics, PN-1000184). The slide was incubated for 1 min at 37°C, then fixed with methanol for 30 min at -20°C followed by Hematoxylin and Eosin (H&E) staining and images were taken under a high-resolution microscope. After imaging, the coverslip was detached by holding the slide in water and the slide was mounted in a plastic slide cassette. The spatial gene expression process, including tissue permeabilization, second strand synthesis and cDNA amplification, was performed according to the manufacturer’s instructions (10X Genomics; #CG000239). cDNA quality was next assessed using Agilent High sensitivity DNA Kit (Agilent, #5067-4626). The spatial gene libraries were constructed using Visium Spatial Library Construction Kit (10X Genomics, PN-1000184). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000323 
   
  
    
    FASTQ files from bulk RNA-seq of fibroblasts after culture and FACS sorting (N=9). 
RNA was extracted from sorted FAP+ CAF cells using the Qiagen miRNeasy Kit (Qiagen, #217004) according to the manufacturer's instructions. RNA integrity and quality were verified using the Agilent RNA 6000 Nano Kit (Agilent Technologies, #5067-1511). cDNA libraries were prepared using the TruSeq Stranded mRNA Kit (Illumina, #20020594), followed by sequencing on a NovaSeq (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD50000000326 
   
  
    
    Targeted glioma panel sequencing of pediatric hemispheric high-grade gliomas and diffuse midline gliomas. File type is paired-read FASTQ files (2 per sample). Sequencing was performed on Illumina NextSeq instruments. 
Genes covered: TP53, H3F3A, HIST1H3B, HIST1H3C, IDH1, IDH2, KRAS, PIK3CA, TERT promoter, PTPN11, FGFR1/2/3, MYB, BRAF, MYBL1, EGFR, PDGFRA, MYCN, MYC, CDKN2A.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  140 
 
  
    EGAD50000000327 
   
  
    
    This dataset contains aligned BAM files from whole-exome sequencing of 29 patients from the Oxel pilot study. Alignment to GRCh38 reference genome was performed using the BWA aligner. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  29 
 
  
    EGAD50000000328 
   
  
    
    Single-cell, single-nucleus and CITE-sequencing of neuroblastoma tumors with 10X Genomics 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD50000000329 
   
  
    
    CITE-seq was performed as outlined in BioLegend ‘TotalSeq™-A Antibodies and Cell Hashing with 10x Single Cell 3' Reagent Kit v3 3.1 Protocol’ with minor modifications, using BioLegend oligo-conjugated antibodies and streptavidin TotalSeq reagents. Briefly, ADAPT-NK cells were stained with CD56-biotin mAb (Miltenyi, clone REA196), followed by TotalSeq antibodies, streptavidin-PE and Live/Dead Aqua (Invitrogen). Cells were subsequently sorted for viable CD56+ cells by flow cytometry. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  2 
 
  
    EGAD50000000330 
   
  
    
    Repeated Sampling Experiment containing 135 fastq files of RNA sequencing. These represent time series of different organ areas sampled during autopsies. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  135 
 
  
    EGAD50000000331 
   
  
    
    20 FASTQ files from RNA sequencing of the 2 main histopathological growth patterns observed in liver metastases. 
    
   
  
    
      
      unspecified 
      
    
   
  20 
 
  
    EGAD50000000332 
   
  
    
    Dataset containing scRNA and scTCR sequencing of 7 patients with cutaneous T cell lymphoma. Sequencing was performed on Illumina NextSeq 550, HiSeq 4000 and NovaSeq 6000. All sequencing was paired-end.
 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 550 
      
    
   
  14 
 
  
    EGAD50000000333 
   
  
    
    RNA-seq data from nasal and bronchial tissues in 649 subjects, many with lung cancer.
Lung cancer is the leading cause of cancer-related death in the world. In contrast to many other cancers, a direct connection to lifestyle risk in the form of cigarette smoke has long been established. More than 50% of all smoking-related lung cancers occur in former smokers, often many years after smoking cessation. Despite extensive research, the molecular processes for persistent lung cancer risk are unclear. CT screening of current and former smokers has been shown to reduce lung cancer mortality by up to 26%. To examine whether clinical risk stratification can be improved upon by the addition of genetic data, and to explore the mechanisms of the persisting risk in former smokers, we have analyzed transcriptomic data from accessible airway tissues of 487 subjects. We developed a model to assess smoking associated gene expression changes and their reversibility after smoking is stopped, in both healthy subjects and clinic patients. We find persistent smoking associated immune alterations to be a hallmark of the clinic patients. Integrating previous GWAS data using a transcriptional network approach, we demonstrate that the same immune and interferon related pathways are strongly enriched for genes linked to known genetic risk factors, demonstrating a causal relationship between immune alteration and lung cancer risk. Finally, we used accessible airway transcriptomic data to derive a non-invasive lung cancer risk classifier. Our results provide initial evidence for germline-mediated personalised smoke injury response and risk in the general population, with potential implications for managing long-term lung cancer incidence and mortality.
 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
    
   
  649 
 
  
    EGAD50000000334 
   
  
    
    This data set contains the raw FASTQ files and processed files of scRNA-seq experiments conducted on 3 EMD samples. The libraries were generated using 10x Genomics Dual Index Single Cell 3' v3.1. The processed files are TSV files of features, barcodes, and the gene expression matrix. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  3 
 
  
    EGAD50000000335 
   
  
    
    This object contains raw FASTQ files and processed files of the 43 sections of an entire EMD lesion of one patient, PT01A. The raw files represent bulk RNA sequencing of each of the sections. The processed file is a CSV file that contains gene expression. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  1 
 
  
    EGAD50000000336 
   
  
    
    This data set contains raw FASTQ files and processed files of spatial transcriptomics of 6 EMD samples collected from MM patients. The processed files contain the h5 expression matrices, an image of the Visium slide, and a TSV of spatial coordinates. The patients included in this data set are PT01A, PT01B, PT02, PT03, PT07, and PT08. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  8 
 
  
    EGAD50000000337 
   
  
    
    This dataset includes WES and RNA-seq for 7 patients with metastatic melanoma (4), non-small cell lung adenocarcinoma (1), cervix adenocarcinoma (1), and epidermoid nasopharyngeal carcinoma (1), enrolled in phase I clinical trials (NCT03475134, NCT04643574 and NCT05195619). WES was performed on matched cancer and healthy tissues, whereas RNA-seq was performed on cancer tissues, using Illumina HiSeq 4000, NovaSeq 6000 and NextSeq 500/550 systems. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
      NextSeq 550 
      
    
   
  33 
 
  
    EGAD50000000338 
   
  
    
    The dataset contains an scRNA-seq gene expression matrix (csv file) and metadata (csv file) after quality control filtering for data generated with the Smart-seq2 protocol. Transcript expression was quantified with Salmon v0.11.3 using cDNA sequences from GRCh38.94 and k-mer length 25, and was aggregated to gene level and transcript-length-corrected using tximport v1.8.0. The dataset comprises 355 IgA+ B cells (transglutaminase 2-specific and other) from the peripheral blood of two untreated celiac disease patients. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD50000000339 
   
  
    
    The dataset contains processed sequencing data from Chromium Single Cell 5’ gene expression, human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. The raw sequencing data has been processed with Cell Ranger v.6.0.2 with the multi and aggr functions using the pre-built Cell Ranger references GRCh38 version 2020-A for gene expression and GRCh38-alts-ensembl-5.0.0 for V(D)J analysis. The dataset consists of a gene expression and antibody capture expression matrix (cell barcodes and feature names in tsv.gz file, expression matrix in mtx.gz file) and VDJ sequences in AIRR format (csv file). A metadata file (csv file) details cells passing our custom quality control based on number of detected genes, UMIs, mitochondrial genes, immunoglobulin genes and a productively rearranged immunoglobulin heavy chain of the IgA isotype. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000340 
   
  
    
    Raw scRNA-seq data from 355 IgA+ peripheral blood B-lineage cells of two untreated celiac disease patients. The data was generated with the Smart-seq2 protocol and sequenced on a NextSeq500 instrument (Illumina) with 75 bp paired-end reads in high-output mode. The dataset contains R1 and R2 reads for each single cell (fastq.gz files) for cells passing quality control based on number of detected genes, reads, mitochondrial genes, reads mapping to the reference transcriptome and a productively rearranged immunoglobulin heavy chain IgA isotype reconstructed by the computational tool BraCeR. Metadata for the cells is provided in a csv file. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD50000000341 
   
  
    
    The dataset contains reconstructed VDJ sequences (fasta files) and accompanying metadata for each cell (csv file) from scRNA-seq data generated with the Smart-seq2 protocol. The VDJ sequences were reconstructed with the computational tool BraCeR using raw fastq files as input. The dataset contains sequences from 355 IgA+ peripheral blood B-lineage cells of two untreated celiac disease patients. The sequences comprise both IgA+ transglutaminase 2-specific and other IgA+ B cells. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  2 
 
  
    EGAD50000000342 
   
  
    
    The dataset contains raw fastq files (fastq.gz) for Chromium Single Cell 5’ gene expression (GEX), human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. Single cell 5’ gene expression, V(D)J-enriched and cell surface protein libraries were generated using Chromium single cell kits, and barcoded cDNA from a total of 5,000-10,000 cells per sample was generated using the 10x Genomics Chromium Controller. The libraries were pooled prior to sequencing on a NovaSeq 6000 instrument (Illumina) using the following configuration: read 1: 26 cycles, read 2: 89 cycles, index read 1: 8 cycles. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000343 
   
  
    
    The dataset contains VDJ sequences in FASTA format, the same sequences run through IMGT/HighV-QUEST (tsv file) and accompanying metadata (csv file) including antigen specificity (transglutaminase 2-specific or other). The data was generated from cultured single B-cell clones from the peripheral blood of four untreated celiac disease patients. Sequences were obtained by a nested RT-PCR approach targeting the immunoglobulin chains followed by Sanger sequencing. 
    
   
  
    
   
  4 
 
  
    EGAD50000000344 
   
  
    
    The data published here contain bulk RNA-sequencing (RNA-seq) data obtained from monocyte-derived dendritic cells treated with or without LPS and with or without CESi (WWL113). Sequencing was performed in paired-end mode on the NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD50000000345 
   
  
    
    Type 1 diabetes mellitus (T1DM) is a prototypic endocrine autoimmune disease resulting from an immune-mediated destruction of pancreatic insulin-secreting beta-cells. A comprehensive immune cell phenotype evaluation in T1DM has not been performed thus far at the single-cell level. In this cross-sectional analysis, we generated a single-cell transcriptomic dataset of peripheral blood mononuclear cells (PBMCs) from 46 manifest T1DM (Stage 3) cases and 31 matched controls. Our study reveals a surprisingly strong systemic dimension at the level of the immune cell network in T1DM, defines disease-relevant molecular subtypes and has the potential to guide non-invasive test development and patient stratification. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      unspecified 
      
    
   
  22 
 
  
    EGAD50000000346 
   
  
    
    We sequenced the genomes of 141 Korean never-smoker lung adenocarcinoma patients, excluding EGFR and ALK alterations. Libraries were prepared with the TruSeq DNA Library Prep Kit and sequenced on an Illumina NovaSeq 6000 instrument, yielding paired-end reads of approximately 150 bp. Afterward, we processed the FASTQ files using the GATK Best Practices pipeline. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  281 
 
  
    EGAD50000000347 
   
  
    
    RNA-seq data from CD20+ sorted cells obtained from peripheral blood and lymph nodes of follicular lymphoma patients. These samples were unstimulated after thawing; additionally, peripheral blood samples were stimulated for 7 days in culture as described in Dobaño-López C et al. Stranded RNA-seq libraries were prepared using the TruSeq library kit (Illumina, San Diego, CA, USA). Libraries were sequenced on a NextSeq 2000 (Illumina) with 2x50 bp reads. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  12 
 
  
    EGAD50000000348 
   
  
    
    Sixty-eight patients with advanced prostate cancer in castration-resistant or castration-sensitive settings undergoing treatment at the University Hospital Basel or the St. Claraspital Basel (Switzerland) were selected for targeted parallel sequencing analysis on liquid biopsy (plasma cfDNA) and matched formalin-fixed, paraffin-embedded (FFPE) tumor tissue samples.
The liquid biopsy sequencing (plasma cfDNA) was performed using a custom-designed targeted AmpliSeq HD Prostate Cancer cfDNA panel on all 68 patients. The sequencing on 42 matched FFPE tumor biopsy samples was performed using a custom-designed or an alternative commercial panel (ThermoFisher).
Raw data underwent automated processing on the Ion Torrent Server v5.16.1 (ThermoFisher) and were aligned to the hg19 reference genome using the Torrent Alignment Software (ThermoFisher).
 
    
   
  
    
      
      Ion GeneStudio S5 Prime 
      
    
   
  129 
 
  
    EGAD50000000350 
   
  
    
    Patients with endocrine-resistant breast cancer in Stockholm, Sweden. DNA was obtained from the patients' primary and relapse tumors, with tumor-free lymph nodes used as germline control. DNA was extracted from formalin-fixed paraffin-embedded tissue and sequenced by 370-gene panel-based sequencing with Kapa HyperPlus library preparation and Twist Bioscience hybrid capture, using custom bait sets (panels) from Twist Bioscience. Sequencing was paired-end 2x150 bp on a NovaSeq X. The data are provided as FASTQ files. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  54 
 
  
    EGAD50000000351 
   
  
    
    Fresh peripheral blood mononuclear cells of four human donors were cultured together with either lung adenocarcinoma A549 cancer cells or H1N1 sialidase-expressing A549 cancer cells. These treatments induced the differentiation of donor cells into immunosuppressive MDSC-like cells, which were then subjected to single-cell RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000352 
   
  
    
    This dataset contains 241 samples sequenced with an immunogene panel (2533 genes). The samples are sorted CD4+ or CD8+ T cells, skin, or fibroblast samples from patients with various hematological disorders (n=90) and healthy blood donors (n=21).
The detailed description of sample processing, sequencing, and read alignment can be found in the publication (Somatic mutations associate with clonal expansion of CD8+ T cells, PMID: [will be updated]) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  241 
 
  
    EGAD50000000353 
   
  
    
    Sequencing libraries were prepared and barcoded using unique molecular identifier and index tagging following the VariantPlex Somatic Protocol (ArcherDx). The pooled library was loaded at 1.2 pM concentration with 20% PhiX, and paired-end sequencing was performed on an Illumina NextSeq 500 sequencer using a 300-cycle high-output reagent kit. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  28 
 
  
    EGAD50000000354 
   
  
    
    Multi-region tumor samples were cut from frozen sections or FFPEs and reviewed from microdissection and H&E staining in order to select ones with high cellularity. DNA extraction was done using DNeasy Blood & Tissue Kits for frozen samples following the manufacturer’s guidelines. Extracted DNA was sequenced on an Illumina HiSeq 2500 in paired-end mode (100x100) using a custom targeted panel based on the list of all unique somatic mutations from the original WES data by the Integrated Genomics Operation (IGO) at Memorial Sloan Kettering Cancer Center (New York, NY). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  244 
 
  
    EGAD50000000355 
   
  
    
    70 BAM files generated from deep whole exome sequencing of oesophageal adenocarcinoma samples from 17 patients. 17 BAM files generated from deep whole exome sequencing of matching blood (germline control) from patients with oesophageal adenocarcinoma. Samples were collected within the clinical MEMORI trial. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  97 
 
  
    EGAD50000000356 
   
  
    
    Bulk B Cell Receptor high-throughput sequencing data across 25 serial breast tumour biopsies obtained from 10 patients during neoadjuvant therapy. The samples were sequenced on an Illumina MiSeq instrument and their raw FastQ files deposited here. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  25 
 
  
    EGAD50000000357 
   
  
    
    80 bam files generated from 3'RNAseq of tumour biopsies from oesophageal adenocarcinoma. Samples were collected within the clinical MEMORI trial.  
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  80 
 
  
    EGAD50000000358 
   
  
    
    Single patient case of HER2-Positive Metastatic Extramammary Paget’s Disease 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000359 
   
  
    
    Pig-to-human xenotransplantation is rapidly approaching the clinical arena; however, it is unclear which immunomodulatory regimens will effectively control human immune responses to pig xenografts. We transplanted a gene-edited pig kidney into a brain-dead human recipient on pharmacologic immunosuppression and studied the human immune response to the xenograft using spatial transcriptomics and single-cell RNA sequencing. Human immune cells were uncommon in the porcine kidney cortex early after xenotransplantation and consisted of primarily myeloid cells. Both the porcine resident macrophages and human infiltrating macrophages expressed genes consistent with an alternatively activated, anti-inflammatory phenotype. No significant infiltration of human B or T cells into the porcine kidney xenograft was detected. Altogether, these findings provide proof of concept that conventional pharmacologic immunosuppression is sufficient to restrict infiltration of human immune cells into the xenograft early after compatible pig-to-human kidney xenotransplantation.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD50000000360 
   
  
    
    We analyzed 264 plasma samples collected between June 2016 and September 2021 from 63 epithelial ovarian cancer patients using tumor-guided plasma cell-free DNA analysis to detect residual disease after treatment. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000361 
   
  
    
    RNA sequencing was performed on 108 NSCLC tumor samples and their paired adjacent normal tissues (n=21) to identify associations with clinical and immune characteristics. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  129 
 
  
    EGAD50000000362 
   
  
    
    CYLD cutaneous syndrome (CCS) is a rare autosomal dominant disorder characterized by germline CYLD mutations and by multiple benign skin tumors dependent on the NF-κB pathway. We assembled a large cohort of rare CCS skin tumors that was profiled with whole exome or genome sequencing, RNA sequencing and methylation arrays to better understand the genetic mechanisms of CCS tumorigenesis. 
    
   
  
    
      
      BGISEQ-500 
      
    
   
  39 
 
  
    EGAD50000000364 
   
  
    
    Single nuclei RNAseq data from 14 HGSOC primary tumour samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD50000000366 
   
  
    
    Source data of clinical study data corresponding to figures reported in the paper titled: Anti-TIGIT antibody improves PD-L1 blockade through myeloid and Treg cells. PMID: 38418879 DOI: 10.1038/s41586-024-07121-9 
    
   
  
    
   
  293 
 
  
    EGAD50000000367 
   
  
    
    Source data of clinical study data corresponding to figures reported in the paper titled: Anti-TIGIT antibody improves PD-L1 blockade through myeloid and Treg cells. PMID: 38418879 DOI: 10.1038/s41586-024-07121-9 
    
   
  
    
   
  293 
 
  
    EGAD50000000368 
   
  
    
    Source data of clinical study data corresponding to figures reported in the paper titled: Anti-TIGIT antibody improves PD-L1 blockade through myeloid and Treg cells. PMID: 38418879 DOI: 10.1038/s41586-024-07121-9 
    
   
  
    
   
  293 
 
  
    EGAD50000000369 
   
  
    
    Matrix of counts from the serum peptide mass spectrometry data from patients enrolled in the CITYSCAPE trial, and the sample annotation. Specifically, serum samples at C1D1, C2D1, C3D1, SCRN were collected from a total of 132 patients and subjected to mass spectrometry at Biognosys. The current study focused on patients who have samples at both C1D1 and C2D1 (n = 64 pairs). Associated metadata are also included. 
    
   
  
    
   
  293 
 
  
    EGAD50000000370 
   
  
    
    Matrices of counts from single-cell RNA-seq data and single-cell CITE-seq data collected from 16 patients enrolled in GO30103 trial, and the cell level annotation. Specifically, PBMCs at C1D1, C1D15 (2 weeks after treatment), C2D1 (3 weeks after treatment) and C4D1 (9 weeks after treatment) were collected and subject to 10x Genomics protocol. Associated metadata also included. 
    
   
  
    
   
  293 
 
  
    EGAD50000000371 
   
  
    
    This dataset is related to the NeoBCC trial and includes 12 human tissue samples 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000373 
   
  
    
    Genomic alterations accumulate in the somatic cells throughout an individual’s lifetime. Recent sequencing studies have documented widespread mutations in the nuclear genome and the frequent clonal competition of normal cells carrying mutations. However, the landscape of mitochondrial DNA (mtDNA) heteroplasmy in normal human tissues is poorly understood. This study investigated the whole genome sequences (WGSs) of 2,096 clones established from non-neoplastic healthy single cells obtained from 31 donors. In addition, we analyzed 31 WGSs of neoplastic cells, including 12 clones established from adenomatous polyps from one individual with MUTYH-associated polyposis and 19 matched colorectal carcinomas from individuals who donated normal colorectal clones. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  823 
 
  
    EGAD50000000375 
   
  
    
    We performed a systematic, genome-wide investigation of enhancer regions in colorectal cancer (CRC). We identified 12,117 putative enhancer regions using H3K27ac and H3Kme1 ChIP-seq and ATAC-seq. We performed scRNA-seq in HT29 and SW480 (MSS CRC cell lines) using the Parse Biosciences WT-mega kit with CRISPRi/dCas9 inhibition of these regions (Perturb-seq). The Parse split-pipe pipeline was used to demultiplex the raw fastq files into the processed files (mtx files for genes and gRNA) for each cell line. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000376 
   
  
    
    Single-cell RNA-seq profiling of effector (KLRG1+ PD1-), transitional (KLRG1+ PD1-intermediate), dysfunctional (CD39+ PD1+) and memory (IL7Rα+) CD8 T cells isolated from 4 human tumors (1 melanoma, 1 renal carcinoma and 2 ovarian carcinoma). Respective cell populations were identified and isolated using FACS. FASTQ files contain gene expression data, feature barcode antibody capture or TCR data on individual CD8 T cell subsets across the 4 different samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000377 
   
  
    
    Single-cell RNA-seq profiling of immune cells from human ovarian carcinoma, renal carcinoma and melanoma samples after 48h of ex vivo culture using the patient-derived tumor fragment platform (5 samples total). Samples were cultured in various conditions, including: untreated, CD8-IL2v-treated, CD8-IL2v + LCKi-treated, aPD1-treated, CD8-IL2v + aPD1-treated, untargeted IL2-treated, aCD3-treated and aCD3 + CD8-IL2-treated. FASTQ files contain gene expression data, feature barcode antibody capture or TCR data on immune cells from all conditions combined per patient. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD50000000378 
   
  
    
    FFPE tissue and liquid biopsy (blood, pleural and peritoneal effusions) samples, processed with the cfRRBS (cell-free reduced representation bisulfite sequencing) protocol. 
    
   
  
    
      
      unspecified 
      
    
   
  181 
 
  
    EGAD50000000379 
   
  
    
    TCRseq data from Lauss et al Nat Comm 2024: Molecular patterns of resistance to immune checkpoint blockade in melanoma. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  31 
 
  
    EGAD50000000380 
   
  
    
    Whole Exome Sequencing data from Lauss et al Nat Comm 2024. Molecular patterns of resistance to immune checkpoint blockade in melanoma. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  74 
 
  
    EGAD50000000382 
   
  
    
    The dataset consists of DNA and RNA sequencing results and metadata of the samples. All sample numbers starting with 6716 are tumor samples which have been sequenced using WES (see BAM files). These are biopsies of metastatic lesions from patients with BRAFV600 mutated melanoma, obtained before, during and after the study treatment (see samples metadata) and in some cases blood for germline mutation analysis. Sequencing was performed using the Illumina NovaSeq 6000 system. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  41 
 
  
    EGAD50000000383 
   
  
    
    For RNA sequencing: All sample numbers starting with 6717 are tumor samples which have been sequenced using transcriptomics (see BAM files). These are biopsies of metastatic lesions from patients with BRAFV600 mutated melanoma, obtained before, during and after the study treatment (see samples metadata). RNA sequencing was performed using the Illumina NovaSeq 6000 system. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD50000000384 
   
  
    
    This dataset contains tumor + normal DNA sequence data and tumor RNA-seq data for a medulloblastoma patient. 
    
   
  
    
      
      unspecified 
      
    
   
  2 
 
  
    EGAD50000000385 
   
  
    
    The data pertain to longitudinal transcriptomic data measured from blood obtained from patients with Crohn's disease who were starting treatment with vedolizumab. Samples were obtained prior to treatment and approximately 26 weeks into treatment during response assessment. At response assessment, patients were classified as responders (R) or non-responders (NR) based on a strict combination of endoscopic, biochemical and clinical criteria: ≥50% reduction in the endoscopic SES-CD score, corticosteroid-free clinical remission (≥3 point drop in HBI or HBI ≤4 and no systemic steroids) and/or biochemical response (C-reactive protein (CRP) and fecal calprotectin reduction ≥50% or ≤5 mg/L and fecal calprotectin ≤250 µg/g).
Modified response was defined as a combination of corticosteroid-free clinical (HBI ≤4) and biochemical (CRP ≤5 mg/L and/or fecal calprotectin ≤250 µg/g) remission between weeks 26 and 52 without treatment change through week 52. 
Transcriptomic analysis was conducted through RNA sequencing, wherein mRNA was extracted using the QIAsymphony system, converted into cDNA and sequenced in a paired-end format on the Illumina NovaSeq 6000 at the Amsterdam UMC Core Facility Genomics, generating a dataset comprising 40 million 150-bp reads.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD50000000386 
   
  
    
    The data pertain to longitudinal transcriptomic data measured from blood obtained from patients with Crohn's disease who were starting treatment with ustekinumab. Samples were obtained prior to treatment and approximately 26 weeks into treatment during response assessment. At response assessment, patients were classified as responders (R) or non-responders (NR) based on a strict combination of endoscopic, biochemical and clinical criteria: ≥50% reduction in the endoscopic SES-CD score, corticosteroid-free clinical remission (≥3 point drop in HBI or HBI ≤4 and no systemic steroids) and/or biochemical response (C-reactive protein (CRP) and fecal calprotectin reduction ≥50% or ≤5 mg/L and fecal calprotectin ≤250 µg/g).
Modified response was defined as a combination of corticosteroid-free clinical (HBI ≤4) and biochemical (CRP ≤5 mg/L and/or fecal calprotectin ≤250 µg/g) remission between weeks 26 and 52 without treatment change through week 52. 
Transcriptomic analysis was conducted through RNA sequencing, wherein mRNA was extracted using the QIAsymphony system, converted into cDNA and sequenced in a paired-end format on the Illumina NovaSeq 6000 at the Amsterdam UMC Core Facility Genomics, generating a dataset comprising 40 million 150-bp reads.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  47 
 
  
    EGAD50000000387 
   
  
    
    The current data pertain to RNA-sequencing reads obtained from thyroid samples acquired from fetuses with Down syndrome and fetuses with no genetic/developmental abnormality. Total RNA was isolated from the left lobe of thyroid samples using a hand-held homogenizer and the Promega ReliaPrep RNA Miniprep System (Thermo Fisher Scientific). RNA yield was determined with the NanoDrop Microvolume Spectrophotometer (Thermo Fisher Scientific). Fragmentation and mRNA library preparation were performed using the Kapa mRNA Hyperprep Kit (Roche, Basel, Switzerland). Libraries were pooled in equimolar amounts and quality was checked on a TapeStation system using the DNA1000 ScreenTape (Agilent Technologies, Santa Clara, CA, USA). Libraries were sequenced with poly(A) selection to sequence all messenger RNA for gene expression analysis on the NovaSeq6000 PE150 (Illumina, San Diego, CA, USA), producing at least 40M 150-bp paired-end reads per library. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000388 
   
  
    
    These are genomic and transcriptomic high risk molecular segments for 167 samples collected at baseline from deidentified patients enrolled in cohorts A, B or D in the CC-220-MM-001 clinical study.  Data are provided in a patient call table in csv format.  Calls are generated from WGS data, with mutations called by the Mutect2 best practices pipeline and CNV calls from Battenberg, and from RNA-seq data aligned with the STAR aligner and quantified by Salmon. 
    
   
  
    
      
      Illumina Genome Analyzer 
      
    
   
  167 
 
  
    EGAD50000000389 
   
  
    
    This dataset contains 228 paired fastq files sequenced with Illumina Novaseq 6000, and the file with the sample and clinical data. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  114 
 
  
    EGAD50000000391 
   
  
    
    This dataset contains unpaired FASTQ files of 316 endometrial cancer cases and 316 matched controls representing circulating RNAs.  RNA count files are provided as the output of sncRNA pipeline and represent 12 RNA types (isomiR, lncRNA, precursor miRNA, miRNA, miscRNA, mRNA, piRNA, scaRNA, snoRNA, snRNA, tRF, and tRNA). We also provide metadata of the samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  632 
 
  
    EGAD50000000392 
   
  
    
    This research project was a collaboration between Trinity College Dublin, Ireland and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 191 Bipolar case/control samples from collaborators in Ireland. Genomic DNA from each sample was sequenced to a mean depth of 20x.  The project used Illumina WXS sequencing of DNA and the file type is cram. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  191 
 
  
    EGAD50000000393 
   
  
    
    The dataset includes WES sequencing data on pre-treatment biopsies of lymph node metastasis (n=79). The technology used for sequencing is Illumina HiSeq. The PRADO trial tested a personalized response-directed treatment approach based on the pathologic response after neoadjuvant ipilimumab plus nivolumab in stage III melanoma patients. In patients achieving a major pathologic response in their index lymph node (the largest lymph node metastasis at baseline), therapeutic lymph node dissection (TLND) and adjuvant therapy were omitted. Patients with partial response underwent TLND only, whereas patients with pathologic non-response underwent TLND and adjuvant systemic therapy ± synchronous radiotherapy. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  158 
 
  
    EGAD50000000394 
   
  
    
    The data set is composed of single cell transcriptomics data for adrenal glands from 11 deceased organ donors (10x 3') and spatial transcriptomic data for adrenal glands from 4 deceased organ donors (10x Visium). 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      NextSeq 2000 
      
    
   
  14 
 
  
    EGAD50000000395 
   
  
    
    RNA sequencing of 168 pulmonary samples including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=38), adenocarcinoma in situ (AIS, N=22), minimally invasive adenocarcinoma (MIA, N=19) and invasive lung adenocarcinoma (ADC, N=38) and adjacent lung tissues (Normal, N=62). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  168 
 
  
    EGAD50000000396 
   
  
    
    Whole genome sequencing of 42 pulmonary samples including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=5), adenocarcinoma in situ (AIS, N=7), minimally invasive adenocarcinoma (MIA, N=6) and invasive lung adenocarcinoma (ADC, N=8) and adjacent lung tissues (Normal, N=16). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  42 
 
  
    EGAD50000000397 
   
  
    
    Multi-region exome sequencing of 271 pulmonary nodules and matched adjacent lung tissue including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=49), adenocarcinoma in situ (AIS, N=42), minimally invasive adenocarcinoma (MIA, N=37), invasive lung adenocarcinoma (ADC, N=86) and adjacent lung tissues (Normal, N=57). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  271 
 
  
    EGAD50000000400 
   
  
    
    This dataset contains single-cell BCR sequencing data generated by the Cellranger pipeline (v3.1.0, 10X Genomics) from 5 LN and 1 PB samples of 5 patients with T follicular helper cell lymphomas (TFHLs) as well as 7 homeostatic LN (HLN) samples. 
    
   
  
    
   
  13 
 
  
    EGAD50000000401 
   
  
    
    This dataset contains WES data analyzed using the Genomon2 pipeline (v.2.6.2, https://github.com/Genomon-Project) from 14 patients with T follicular helper cell lymphomas (TFHLs). A list of somatic mutations called by the Genomon2 pipeline for each sample is provided as a txt file. 
    
   
  
    
   
  32 
 
  
    EGAD50000000402 
   
  
    
    This dataset contains single-cell count data generated by the Cellranger pipeline (v3.1.0, 10X Genomics) from 9 LN and 16 PB samples of 14 patients with T follicular helper cell lymphomas (TFHLs) as well as 7 homeostatic LN (HLN) samples. 
    
   
  
    
   
  32 
 
  
    EGAD50000000403 
   
  
    
    This dataset contains single-cell TCR sequencing data generated by the Cellranger pipeline (v3.1.0, 10X Genomics) from 9 LN and 16 PB samples of 14 patients with T follicular helper cell lymphomas (TFHLs) as well as 7 homeostatic LN (HLN) samples. 
    
   
  
    
   
  32 
 
  
    EGAD50000000404 
   
  
    
    The abscopal effects of radiation may sensitize immunologically “cold” tumors to immune checkpoint inhibition (ICI). We investigated the immunostimulatory effects of radiotherapy leveraging multi-omic analyses of serial tissue and blood biospecimens (n=293) from a phase 2 clinical trial of stereotactic body radiation therapy (SBRT) followed by pembrolizumab in metastatic non-small cell lung cancer (NSCLC; NCT02492568). Patients with immunologically-cold tumors (low tumor mutation burden, null PD-L1 expression, WNT-pathway mutated) in the SBRT arm had significantly longer progression-free survival compared to ICI alone (P<0.05). Induction of interferon-gamma, interferon-alpha, and antigen processing and presentation gene sets was significantly enriched post SBRT in non-irradiated tumor sites (FDR adjusted P<0.01). Significant on-therapy expansions of new and pre-existing TCR clones in both the tumor and blood compartments were noted in the SBRT arm (P<0.05). These findings support the systemic anti-tumor effects of immuno-radiotherapy and may open a therapeutic window of opportunity to overcome resistance to ICI. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina NovaSeq 6000 
      
    
   
  218 
 
  
    EGAD50000000405 
   
  
    
    Sequencing results of single-cell transcriptome and antibody libraries from two biological experiments of cord blood progenitor cells. Four samples from different time points were processed using 10X Genomics Chromium Next GEM Single Cell 3’ Reagent Kits v3.1. Gene expression and antibody-derived libraries were sequenced separately. 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  16 
 
  
    EGAD50000000406 
   
  
    
    scRNA sequencing and scTCR sequencing on three tumor lesions derived from one patient receiving adoptive tumor-infiltrating lymphocyte (TIL) therapy. Expanded metastasis was used for TIL expansion and yielded the TIL product which also was used for scTCR sequencing. 10 weeks after transfer of the TIL product the regressing metastasis was resected. After the patient progressed the progressing metastasis was removed (61 weeks).  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000408 
   
  
    
    This dataset contains RNA-sequencing data of 169 IDH-mutant astrocytoma samples included in the GLASS-NL cohort 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  169 
 
  
    EGAD50000000411 
   
  
    
    A mutation accumulation experiment in colorectal cancer (CRC) derived tumoroids. A sequential single-cell cloning approach was adopted to measure the mutation rate in eight tumoroids obtained from five patients. WGS was also performed on their matched normal tissue and on standard tumoroids cultures without any cloning step. 
    
   
  
    
      
      unspecified 
      
    
   
  188 
 
  
    EGAD50000000413 
   
  
    
      The dataset consists of three samples: two controls and one case of juvenile Parkinsonism. Whole Exome Sequencing (WXS) was performed on these samples using hybrid selection for library preparation. Sequencing was carried out on the Illumina NextSeq 500 platform. For each sample, two FASTQ files containing paired-end reads (R1 and R2) were generated. The data deposited consists of the corresponding FASTQ files. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  3 
 
  
    EGAD50000000414 
   
  
    
    RNA sequencing (fastq files) of white blood cells (WBCs) from healthy donors (n=376) and cancer patients (n=421) with different diagnoses, stages of disease and previously administered treatments, was performed. Samples from cancer patients were collected from the BostonGene clinical program; all patients provided written consent per IRB-approved protocols. Blood samples from healthy donors were purchased from multiple collection centers throughout the United States.
Whole blood samples (3 ml) in K2-EDTA tubes received within 24 hours of collection at RT underwent red blood cell (RBC) lysis to isolate WBCs. Isolated WBCs for RNA sequencing were centrifuged at 300 x g for 5 minutes with a maximum of 10^6 cells per vial. The supernatant was removed, and the cells were resuspended in cold Homogenization Buffer (2% 1-Thioglycerol, Promega). Samples were then frozen at -80°C until extraction. RNA extraction was performed from frozen samples with Maxwell RSC simplyRNA Cells Kit (Promega) using the benchtop automated Maxwell RSC Instrument (Promega).
Libraries were prepared with Illumina TruSeq® Stranded mRNA Library Prep (Poly-A mRNA; stranded). Libraries were sequenced on NovaSeq 6000 as Paired-End Reads (2x150) with targeted coverage of 50 million reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  797 
 
  
    EGAD50000000415 
   
  
    
    The ChRCC study WES dataset contains raw whole exome sequencing data of 17 tumor and 7 adjacent normal samples from 7 UTSW patients who have consented to depositing their genomic data in a public repository. WES was performed using 75bp paired-end fragments at an average read depth > 100x on a HiSeq2500 platform (Illumina, San Diego, CA, USA). The raw data is in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  24 
 
  
    EGAD50000000416 
   
  
    
    The ChRCC study RNA-Seq dataset contains raw whole transcriptome sequencing data of 12 tumor and 6 adjacent normal samples from 7 UTSW patients who have consented to depositing their genomic data in a public repository. RNA-Seq was performed using 50bp single-end reads on a HiSeq2500 platform (Illumina, San Diego, CA, USA), with 50M reads per sample on average. The raw data is in fastq format. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  18 
 
  
    EGAD50000000417 
   
  
    
    This dataset contains 125 BAM files of cord blood hematopoietic stem and progenitor cell DNA, sequenced with Illumina Novaseq 6000, and 125 files with the variant calling performed to the sequencing data. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  125 
 
  
    EGAD50000000418 
   
  
    
    In order to perform comprehensive SNV and CNA analyses of a cohort of BCP-LBL patients, we performed whole exome sequencing of 41 tissue samples from BCP-LBL patients. Because the material was available as FFPE, the target coverage of the samples was >200x. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  41 
 
  
    EGAD50000000419 
   
  
    
    In order to perform comprehensive transcriptomics and gene fusion analyses of a cohort of BCP-LBL patients, we performed RNA sequencing of 49 tissue samples from BCP-LBL patients. Because the material was available as FFPE and was of relatively low quality, we used a capture-based approach, where NGS libraries obtained from total RNA were captured in 4-plex using a whole exome capture panel. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  49 
 
  
    EGAD50000000420 
   
  
    
    Cancer samples for neo-open reading frame peptides that comprise the tumor framome are a rich source of neoantigens for cancer immunotherapy. 
    
   
  
    
      
      HiSeq X Ten 
      
      PromethION 
      
    
   
  61 
 
  
    EGAD50000000421 
   
  
    
    Clinicopathologic features of the ten patients with ovarian immature teratomas studied by whole-exome sequencing 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  70 
 
  
    EGAD50000000422 
   
  
    
    This is a meta-analysis of myeloma datasets, both with and without the UK Biobank cohort included. 
    
   
  
    
   
  1 
 
  
    EGAD50000000424 
   
  
    
    This dataset contains fastq files for paired blood-tumor scRNA-seq samples from 5 NSCLC patients and paired blood-tumor scATAC-seq samples from 2 NSCLC patients, both sequenced with NovaSeq or Illumina HiSeq - Rapid Run. 
    
   
  
    
      
      Illumina HiSeq X 
      
      Illumina NovaSeq X 
      
    
   
  14 
 
  
    EGAD50000000426 
   
  
    
    All files for TIX individuals 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  67 
 
  
    EGAD50000000427 
   
  
    
    The dataset includes bam files from WGS of 26 monoclonal patient derived organoid (PDO) lines isolated from 6 independent tumor subclones of a dMMR colorectal tumor. These organoids were grown as part of an in vitro timecourse experiment for the duration of 9 weeks; bam files represent the mutational load at the start of the timecourse (t0) and the end of the timecourse (t1). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  53 
 
  
    EGAD50000000428 
   
  
    
    RNAseq aligned to hg38.p14 using nfcore/RNA-seq 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  4 
 
  
    EGAD50000000429 
   
  
    
    Research study of genetic susceptibility and analyses of polygenic risk scores in allergic diseases. This dataset consists of variant data derived from whole-genome sequencing in reference samples from one of the parental populations (i.e., North Africans). The dataset consists of 15 VCF files (and corresponding index files) with a total of 1.58 million variants called using GATK v4 and following the Broad Variant Calling Best Practices, from a set of unrelated individuals sequenced in paired-read mode 2x150 bp using an Illumina HiSeq 4000 sequencer.
 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  15 
 
  
    EGAD50000000430 
   
  
    
    Dataset corresponding to the bulk RNA seq dataset from prefrontal cortex for a total of N=44 samples. Tissue samples correspond to patients with different alpha synucleinopathies, specifically idiopathic Parkinson's disease (PD, N=20), monogenic PD caused by LRRK2 mutations (LRRK2-PD, N=7) and multiple system atrophy (MSA, N=6), and to neurologically healthy controls (N=11). RNA seq was carried out using ribosomal depletion.
All of these samples were also sequenced using single nucleus RNA seq; those data are available within the same study as a separate dataset. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  90 
 
  
    EGAD50000000431 
   
  
    
    Single nucleus RNA seq dataset from prefrontal cortex of a total of N=46 individuals with alpha synucleinopathies and healthy controls. Tissue samples correspond to patients with Parkinson's disease (PD, N=20), monogenic PD caused by LRRK2 mutations (LRRK2-PD, N=7) and multiple system atrophy (MSA, N=6), and to neurologically healthy controls (N=13). RNA seq was carried out employing 10X Genomics Chromium.
These samples were also sequenced using conventional bulk tissue RNA seq; those data are available within the same study as a separate dataset. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  90 
 
  
    EGAD50000000432 
   
  
    
    Single nucleus RNA seq dataset for CI stratification of Parkinson's disease
Single nucleus RNA seq dataset of prefrontal cortex tissue from a total of N=18 samples. Tissue samples correspond to patients with idiopathic Parkinson's disease (PD) with varying levels of Complex-I activity (N=12) and neurologically healthy controls (N=6). Single nucleus RNA seq was carried out using 10X Genomics Chromium.
This dataset corresponds to a subset of the samples sequenced using conventional bulk tissue RNA seq and available within this Study as a separate Dataset. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  116 
 
  
    EGAD50000000433 
   
  
    
    Dataset corresponding to the bulk RNA seq dataset from prefrontal cortex for a total of N=98 samples. Tissue samples correspond to patients with idiopathic Parkinson's disease (PD) with varying levels of Complex-I activity (N=79) and neurologically healthy controls (N=19). RNA seq was carried out using ribosomal depletion.
A subset of these samples was additionally sequenced using single nucleus RNA seq and is available in the same Study as an additional Dataset. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  115 
 
  
    EGAD50000000434 
   
  
    
    Raw sequencing files from scRNA-seq dataset used in Schmassmann et al. 2023. Single-cell characterization of human GBM reveals regional differences in tumor-infiltrating leukocyte activation. Elife 12 (https://elifesciences.org/articles/92678)
Dataset content: 14 samples from 5 donors 
File type: paired-end fastq files 
Technology: Illumina sequencing
Experimentation used: scRNA-seq using the 10X technology 
 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  14 
 
  
    EGAD50000000435 
   
  
    
    Long-read transcriptomic data of control and FECD exp positive CECs 
    
   
  
    
      
      Sequel 
      
    
   
  9 
 
  
    EGAD50000000436 
   
  
    
    Short-read transcriptomic data of unaffected control, expansion negative, and expansion positive FECD CECs 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  10 
 
  
    EGAD50000000439 
   
  
    
    18 patients were biopsied at resistance to an FGFR inhibitor with tumor WES and/or WTS performed to identify resistance mechanisms to selective FGFR2 inhibitors across FGFR2-driven malignancies.
 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  63 
 
  
    EGAD50000000441 
   
  
    
    The dataset contains whole genome sequencing data of 17 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 46 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  46 
 
  
    EGAD50000000442 
   
  
    
    The dataset contains whole genome sequencing data of 32 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 82 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  82 
 
  
    EGAD50000000443 
   
  
    
    This dataset brings together RNA-seq data from paired primary colorectal cancers and liver metastases from 113 patients at the Paris and Besançon University Hospitals. 
216 FASTQ files from single-end 3' PolyA (QuantSeq 3′mRNA-Seq Kit FWD for Illumina (Lexogen)) RNA-seq samples were generated on the NovaSeq 6000 (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  216 
 
  
    EGAD50000000446 
   
  
    
    The current dataset represents bulk RNA-Seq gene expression profiling (with an exome library preparation kit) of muscle-invasive bladder cancer tissue samples obtained before and after platinum-based chemotherapy. 89 samples are pre-treatment transurethral resection of the bladder tumor (TUR-BT) tissue at diagnosis (baseline), 86 are post-treatment cystectomy tissue (resected tumor bulk), comprising 76 pairs of samples from the same patients. FASTQ files contain gene expression data. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  175 
 
  
    EGAD50000000447 
   
  
    
    Sanger sequencing data of cell lines derived from bulk MV4-11 cells, targeting the R248 locus of TP53, consisting of either wildtype or homozygous mutant lines. 
    
   
  
    
      
      AB 3730xL Genetic Analyzer 
      
    
   
  4 
 
  
    EGAD50000000448 
   
  
    
    Targeted-capture sequencing data from 168 ENKTCL patients 
    
   
  
    
      
      unspecified 
      
    
   
  168 
 
  
    EGAD50000000449 
   
  
    
    snRNA-seq and spatial transcriptomic data and analysis of healthy (CTRL) and inflamed (immune-mediated necrotizing myopathy (IMNM) and inclusion body myositis (IBM)) human quadriceps muscle. This data set includes 19 snRNA-seq (7 CTRL, 4 IMNM, 8 IBM) and 8 ST samples (3 CTRL, 2 IMNM, 3 IBM). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  27 
 
  
    EGAD50000000450 
   
  
    
    To study monocyte and macrophage activation in ANCA-associated vasculitis (AAV), we performed bulk RNA sequencing of bead-selected monocytes and in vitro cultured monocyte-derived macrophages from AAV patients and healthy controls. 
Overview patients included for sequencing monocytes: 
- AAV active disease, n=4, MPO-AAV=4 
- AAV remission, n=10, PR3-AAV=5, MPO-AAV=5 
- Healthy controls, n=6 
Overview patients included for sequencing monocyte-derived macrophages: 
- AAV active, n=1, PR3-AAV=1 
- AAV remission, n=3, PR3-AAV=3 
- Healthy controls, n=3 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  48 
 
  
    EGAD50000000451 
   
  
    
    This dataset contains whole exome sequencing (WES) of 29 HIV- EBV- primary central nervous system lymphoma (PCNSL) tumors and 5 activated B-cell-like PCNSL (ABC-PCNSL) and activated B-cell-like diffuse large B-cell lymphoma (ABC-DLBCL) cell lines (TK, HKBML, OCI-Ly3, HBL-1, TMD-8). Cases with secondary involvement of the CNS were excluded. DNA was extracted with AllPrep DNA/RNA FFPE kit and libraries were prepared using the Agilent SureSelect Human All Exon v6 + UTR kit. Paired-end reads were aligned to GRCh37 using bwa mem v.0.7.17; reads were sorted and duplicates were marked in the final BAM files.
Access to this data is controlled. There are a number of steps that a researcher must take to obtain access to this data, including execution of a Data Access Agreement between the institutions. The process is overseen by the Technology Development Office; please contact our general email address TDOadmin@phsa.ca.
Please only click the "request data" button on the EGA website after a Data Access Agreement is fully executed. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  34 
 
  
    EGAD50000000453 
   
  
    
    Pulmonary pleomorphic carcinoma (PPC) is an aggressive and highly heterogeneous non-small-cell lung carcinoma whose underlying biology is still poorly understood. Forty-two tumor areas including 39 primary tumors and 3 metastases from 20 PPC patients were microdissected and the histologically distinct components were subjected to whole exome sequencing (WES) separately. Twist Human Core Exome + RefSeq + Mito-Panel kit (Twist Bioscience) was used for the whole exome capturing according to manufacturer’s guidelines. Paired-end 100-bp reads were generated on the Illumina NovaSeq 6000. 
After sequencing, reads were aligned to the human reference genome GRCh38 using the Burrows-Wheeler Aligner (BWA, v0.7.12). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  62 
 
  
    EGAD50000000454 
   
  
    
    ctDNA biomarker and relevant clinical data for divarasib phase I GO42144 study, including tumor type, KRAS G12C mutation detectability in plasma at baseline, baseline SLD, sites of metastasis, lines of therapy, best response, confirmed best response, PFS, ctDNA tumor fraction and KRAS G12C VAF at baseline/C1D15/C3D1.  
    
   
  
    
   
  308 
 
  
    EGAD50000000458 
   
  
    
    Universal targeted haplotyping by droplet digital PCR sequencing (amplicon sequencing) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  22 
 
  
    EGAD50000000459 
   
  
    
    Universal targeted haplotyping by droplet digital PCR sequencing (Target Capture sequencing) 
    
   
  
    
      
      NextSeq 500 
      
    
   
  25 
 
  
    EGAD50000000460 
   
  
    
    Our study sought to resolve, with single-molecule fidelity, the mismatches and damage events that precede DNA mutations. Using a novel single-molecule, long-read sequencing method (HiDEF-seq) we detect base substitutions when present in either one or both DNA strands. We also detect cytosine deamination, a common type of DNA damage, with single-molecule fidelity. This study profiled 134 samples from diverse tissues, including from individuals with cancer predisposition syndromes. These samples revealed single-strand mismatch and damage signatures. Since double-strand DNA mutations are only the endpoint of the mutation process, our approach enables new studies of how mutations arise in a variety of contexts, especially in cancer and aging. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      Illumina NovaSeq X 
      
    
   
  21 
 
  
    EGAD50000000461 
   
  
    
    H3K27ac ChIP-seq datasets in human insulinoma samples 
    
   
  
    
      
      unspecified 
      
    
   
  12 
 
  
    EGAD50000000462 
   
  
    
    RNA-seq datasets in human insulinoma samples 
    
   
  
    
      
      unspecified 
      
    
   
  11 
 
  
    EGAD50000000463 
   
  
    
    H3K27me3 Cut&Tag datasets in human pancreatic islets and the EndoC-bH1 cell line 
    
   
  
    
      
      unspecified 
      
    
   
  2 
 
  
    EGAD50000000464 
   
  
    
    Whole-Genome Sequencing datasets of insulinoma samples and paired blood controls 
    
   
  
    
      
      unspecified 
      
    
   
  26 
 
  
    EGAD50000000467 
   
  
    
    Fastq files and BAM files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  101 
 
  
    EGAD50000000468 
   
  
    
    For mature miRNAs and hairpins, BAM and FASTQ files 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  101 
 
  
    EGAD50000000469 
   
  
    
    Raw fastq file for bulk RNA-seq of checkpoint-blockade treated lung cancer cohorts 
    
   
  
    
      
      unspecified 
      
    
   
  355 
 
  
    EGAD50000000470 
   
  
    
    Whole Exome Sequencing of 924 Bipolar cases and matched controls performed at the Broad Institute on a cohort from Umea, Sweden.  The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  924 
 
  
    EGAD50000000471 
   
  
    
    The impact of MSC(UC) on peripheral B cells from Systemic Lupus Erythematosus (SLE) patients was studied by 10X scRNAseq. This scRNAseq study encompassed 3 SLE patients at 3 time points: before MSC injection, and 1 month and 3 months after injection, in order to analyze B cell subsets and their DEG. The aim of this study was to observe the potential changes of B cell subsets after MSC(UC) injection in SLE patients. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  3 
 
  
    EGAD50000000473 
   
  
    
    This project contains the 16 WGS tumor samples and the corresponding 15 WGS normal tissue samples (germline control) not yet deposited in the public domain (e.g., EGA, dbGaP) at the time of submission of the manuscript (Zhu et al.) 
    
   
  
    
      
      unspecified 
      
    
   
  31 
 
  
    EGAD50000000474 
   
  
    
    Whole genome sequencing (WGS) data of 12 tumours was generated on an Illumina NovaSeq or HiSeqX instrument. For frozen samples, libraries were constructed using a PCR-free library construction method. For tumours preserved by formalin fixation and paraffin embedding (FFPE), libraries were constructed with a method that included S1 nuclease treatment. Reads were aligned to the GRCh37 reference with bwa-mem 0.7.17.  
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  12 
 
  
    EGAD50000000475 
   
  
    
    Osteosarcoma is a primary bone tumor that exhibits a complex genome characterized by gross chromosomal abnormalities. Osteosarcoma patients often develop metastatic disease, resulting in limited therapeutic options and poor survival rates. To gain knowledge on the mechanisms underlying osteosarcoma heterogeneity and metastatic process, it is important to obtain a detailed profile of the genomic alterations that accompany osteosarcoma progression. Therefore, in this study we performed WGS on multiple tissue samples from six patients with osteosarcoma, including the treatment naïve biopsy of the primary tumor, resection of the primary tumor after neoadjuvant chemotherapy, local recurrence and distant metastases.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD50000000476 
   
  
    
    Data corresponding to the fastq of the 654 individuals included in the study. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  654 
 
  
    EGAD50000000477 
   
  
    
    The cohort included 98 mild and 75 severe cases with a median age of 53 years. We amplified and sequenced the T cell receptor β (TCRβ) chain complementarity-determining region 3 (CDR3b) and performed bioinformatic analyses to assess repertoire diversity, clonality, allelic usage, and epitope affinity CDR3b clustering. CDR3b sequences were amplified by multiplex PCR and sequenced by Illumina. The resulting raw files were processed for clonotype assembly, filtering for coding clonotypes, removal of non-TRB alleles, and downsampling. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  173 
 
  
    EGAD50000000479 
   
  
    
    Mixture of 5 unrelated individuals sequenced by 10x as a scATAC-seq with low cell count. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile using the De-goulash pipeline. The separation file is submitted as the processed file. The BAM files are submitted as unprocessed files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000480 
   
  
    
    Mixture of 5 unrelated individuals sequenced by 10x as a scATAC-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile using the De-goulash pipeline. The separation file is submitted as the processed file. The BAM files are submitted as unprocessed files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000481 
   
  
    
    Reference whole exome sequence serving as a reference for the individuals. Includes the raw GSa files and the called variants in VCF format, merged for all individuals (S1-S5). 
    
   
  
    
      
      unspecified 
      
    
   
  1 
 
  
    EGAD50000000482 
   
  
    
    The dataset contains RNAseq profiles of 1040 patients from the ROBUST clinical trial (NCT02285062). The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. RNAseq libraries (75PE, 50M) were constructed using Illumina TruSeq RNA Access method. Fastq files are included. Also includes processed WGS mutation calls output. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  1095 
 
  
    EGAD50000000483 
   
  
    
    Metadata associated with 16SV4 ribosomal RNA gene sequencing and Long read microbiome whole metagenome sequencing of pediatric GI samples. 
    
   
  
    
   
  20 
 
  
    EGAD50000000484 
   
  
    
    Sample sheet for aligning samples with patients.  
    
   
  
    
   
  20 
 
  
    EGAD50000000485 
   
  
    
    16SV4 ribosomal RNA gene sequencing data of GI samples and associated mixed community cultures collected from pediatric patients at risk for IBD. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  11 
 
  
    EGAD50000000486 
   
  
    
    Long read microbiome whole metagenome sequencing of pediatric GI samples. 
    
   
  
    
      
      unspecified 
      
    
   
  9 
 
  
    EGAD50000000488 
   
  
    
    ScRNA-seq FASTQ files from myocarditis and control cardiac muscle samples; and myositis and control skeletal muscle samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD50000000489 
   
  
    
    Targeted capture sequencing of 364 samples representing a mix of Burkitt lymphoma, diffuse large B-cell lymphoma, follicular lymphoma, and high-grade B-cell lymphomas with MYC and BCL2 or BCL6 rearrangements. Capture space covers 3.5 Mb of the human genome including regions around MYC, BCL2, BCL6, PAX5, and IGH/K/L.  
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  364 
 
  
    EGAD50000000490 
   
  
    
    This dataset contains 190 fastq files sequenced with Illumina HiSeq500.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  190 
 
  
    EGAD50000000491 
   
  
    
    Samples, in the form of PAXgene-fixed and paraffin-embedded biopsies, were collected from a multi-site, double-blind, randomized, placebo-controlled trial aimed at dose-finding and at assessing the efficacy and tolerability of a 6-week treatment with ZED1227 capsules vs. placebo in subjects with well-controlled celiac disease undergoing gluten challenge.
Total RNA was extracted from the PaxFPE biopsy specimens (n = 116) using additional cuttings from the samples on which histomorphometry was previously assessed. For the extraction, an RNeasy Kit (Qiagen, Hilden, Germany) was used according to the manufacturer’s instructions. Library preparation and next-generation sequencing (NGS) were performed by the Qiagen NGS Service. A total of 10 ng of purified RNA was converted into cDNA NGS libraries. Library preparation was quality controlled using capillary electrophoresis. Based on the quality of the inserts and the concentration measurements, the libraries were pooled in equimolar ratios and then sequenced on a NextSeq (Illumina Inc., San Diego, USA) sequencing instrument according to the manufacturer’s instructions, with 100 bp read length for read 1 and 27bp for read 2. The raw data were de-multiplexed, and FASTQ files for each sample were generated using bcl2fastq2 software (Illumina Inc., San Diego, USA).  
    
   
  
    
      
      NextSeq 550 
      
    
   
  116 
 
  
    EGAD50000000492 
   
  
    
    Human duodenal tissues for establishing organoid cultures used in this study were sourced from de-identified surgical specimens (n = 3) of the duodenum obtained from patients who had undergone biopsy procedures unrelated to CeD at Tampere University Hospital. The protocol was approved by the Ethics Committee of Tampere University Hospital, Tampere, Finland (ETL code R18082). 
RNA from the duodenal organoids was isolated using an RNeasy Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. RNA purity and concentration were measured using a NanoDrop One spectrophotometer (NanoDrop Technologies, Wilmington, Delaware, USA). Preparation of the RNA library and transcriptome sequencing was conducted by Novogene Co., LTD (Cambridge, UK). Messenger RNA was purified from total RNA using polyA selection and subjected to library construction. Sequencing was performed on an Illumina platform, and 150 bp paired-end reads were generated.   
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  11 
 
  
    EGAD50000000493 
   
  
    
    Single cell RNA libraries were prepared using 10x Genomics Chromium Next GEM Single Cell 3’ v2 reagents. The samples were barcoded and each library was pooled with two samples at equimolar concentrations. The pooled libraries (n=4) were sequenced on the NextSeq 500 machine (Illumina) with paired-end sequencing and dual indexing as recommended in the manufacturer’s protocol: 26 and 98 cycles for Read 1 and Read 2, respectively, and 8 cycles for the i7 index.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD50000000494 
   
  
    
    This dataset includes 19 BAM files of tumor, relapsed tumor and normal samples derived from 7 patients. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD50000000496 
   
  
    
    Paired-end ribodepletion RNAseq performed on 257 samples of Burkitt lymphoma, diffuse large B cell lymphoma, follicular lymphoma, and high-grade B cell lymphoma with MYC and BCL2 or BCL6 rearrangements. RNA was derived from either fresh frozen or formalin fixed/paraffin embedded (FFPE) tissues. Sequencing was performed on Illumina HiSeqX or NovaSeqX instruments.  
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  257 
 
  
    EGAD50000000497 
   
  
    
    CITE-Seq (cellular indexing of transcriptomes and epitopes) of 51 lymph node (LN) samples, including mantle cell lymphoma (MCL, n = 8), follicular lymphoma (FL, n = 12), germinal center (GCB, n = 5) or activated B-cell (non-GCB/ABC, n = 7) diffuse large B-cell lymphoma (DLBCL), and marginal zone lymphomas (MZL, n = 11), in addition to non-malignant reactive lymph nodes (rLN, n = 8). Of the malignant LN samples, 20 were collected at the time of initial diagnosis and 23 were from patients who had previously undergone one or more lines of systemic treatment. Relapse samples were collected at least 3 months after cessation of systemic treatment. This dataset also includes 5 prime RNA sequencing and immune receptor sequencing (10X Genomics) for 11 of these samples (2 rLN, 2 MCL, 3 FL, 2 DLBCL (GCB) and 2 MZL). 
    
   
  
    
      
      NextSeq 2000 
      
    
   
  51 
 
  
    EGAD50000000498 
   
  
    
    This dataset contains scRNAseq data of human telencephalic organoids, at day 120, from 4 different sequencing runs (libraries), as follows (please see cell line clone nomenclature in original publication): 
1. Library 177136: Pat.2 ARID1B+/+ clone 2c (2 organoids, E11rep_1, E11rep_2), Pat.2 ARID1B+/- clone 2a (2 organoids: B002orig_1, B002orig_2) 
2. Library 178119: Pat.2 ARID1B+/+ clone 2c (3 organoids: E11rep_1, E11rep_2, E11rep_3), Pat.2 ARID1B+/- clone 2a (3 organoids: B002orig_1, B002orig_2, B002orig_3) 
3. Library 178120: Pat.2 ARID1B+/+ clone 2d (3 organoids: A3rep_1, A3rep_2, A3rep_3), Pat.2 ARID1B+/- clone 2b (3 organoids: B002_F7_1, B002_F7_2, B002_F7_3) 
4. Library 184337: Pat.1 ARID1B+/- clone 1a (3 organoids: B001orig_1, B001orig_2, B001orig_3), HD.1 ARID1B+/+ clone 3a (3 organoids: 176_1, 176_2, 176_3), HD.1 ARID1B+/- clone 3b (3 organoids: F10_1, F10_2, F10_3) 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  25 
 
  
    EGAD50000000499 
   
  
    
    This dataset contains scMultiomics (scRNAseq + scATACseq) data of human telencephalic organoids, at day 120, from 3 different sequencing runs (libraries), as follows (please see cell line clone nomenclature in original publication): 
1. Library 233269(RNA)/235293(ATAC): Pat.1 ARID1B+/- clone 1a (9 pooled organoids).
2. Library 233270(RNA)/235294(ATAC): HD.1 ARID1B+/+ clone 3a (9 pooled organoids).
3. Library 233271(RNA)/235295(ATAC): HD.1 ARID1B+/- clone 3b (8 pooled organoids). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000501 
   
  
    
    Single-end 75 basepair small-RNA sequencing of 90 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. Samples were prepared using the NEXTFLEX® Small RNA-Seq Kit v3 (Bioo Scientific ). Samples were sequenced at the molecular genomics core (Peter MacCallum Cancer Centre) using 50bp single end sequencing on the Illumina NextSeq 500 (Illumina, USA).   This dataset contains raw sequencing reads in FASTQ format.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  90 
 
  
    EGAD50000000502 
   
  
    
    Paired-end 150 basepair whole genome sequencing of 94 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. Libraries were prepared using the Illumina® TruSeq™ DNA Nano library preparation method according to the manufacturer’s instructions at The University of Melbourne Centre for Cancer Research (UMCCR) using 200ng input DNA and a 550 base pair insert size. Samples were sequenced in separate batches on the Illumina® Nova-Seq 6000 according to manufacturer’s instructions (Illumina, USA).  Included in this dataset are FASTQ format files containing raw read data, CRAM format files containing reads aligned to GRCh38, and VCF format files containing germline variant calls.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  173 
 
  
    EGAD50000000503 
   
  
    
    10x genomics single-nuclei ATAC sequencing of 7 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma, and 1 normal adrenal medulla sample. snATAC-seq was conducted using the "Van Helsing" protocol (dx.doi.org/10.17504/protocols.io.bw52pg8e). Once processed, snATAC-seq libraries were sequenced on the Illumina NextSeq 500 (Illumina, USA) using 50bp paired-end sequencing.  Raw sequencing data in BCL format was demultiplexed using cellranger-atac mkfastq (V2.0.0). This dataset contains sequencing reads in FASTQ format. R1/R3 files contain the forward and reverse sequencing reads, respectively, R2 contains the 10x Barcode.   
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000504 
   
  
    
    Paired-end 150 basepair whole transcriptome sequencing of 91 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. Samples were prepared using the NEB-Next directional RNA-Seq kit (NEB, USA) and underwent 150bp paired-end sequencing on the Illumina NovaSeq 6000 (Illumina, USA) according to manufacturer’s instructions.  This dataset contains raw sequencing reads in FASTQ format.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  91 
 
  
    EGAD50000000505 
   
  
    
    10x genomics single-nuclei RNA sequencing of 9 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma.  snRNA-seq was performed using the ‘Frankenstein’ protocol (dx.doi.org/10.17504/protocols.io.bqxymxpw). Briefly, nuclei were extracted from frozen tissues and subjected to fluorescence-activated nuclei sorting (FANS) using 4′,6-diamidino-2-phenylindole (DAPI).  Sorting was performed using a BD FACSaria 2 instrument, sorting between 3000 and 10,000 nuclei per sample, capturing both diploid and tetraploid populations. FAN-sorted nuclei were immediately processed using either the 10x Chromium Single Cell 5’ (PN-1000006, 4 samples) or 3’ (PN-1000075, 4 samples) Library & Gel Bead Kit (10x Genomics, USA). Once processed, snRNA-seq libraries were sequenced on the Illumina Nova-Seq 6000 (Illumina, USA) using 150bp paired-end sequencing. This dataset contains raw sequencing reads in FASTQ format.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD50000000506 
   
  
    
    Raw NGS data of primary Acute Myeloid Leukemia samples.
The libraries were obtained with Sophia Genetics Myeloid Solution kit, targeting 30 genes involved in AML. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  33 
 
  
    EGAD50000000507 
   
  
    
    The dataset contains whole genome sequencing data of 42 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 100 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  100 
 
  
    EGAD50000000508 
   
  
    
    Shallow WGS long-read sequencing of primary neuroblastoma samples. 
    
   
  
    
      
      MinION 
      
    
   
  13 
 
  
    EGAD50000000509 
   
  
    
    Shallow long-read nanopore WGS of ecDNA containing neuroblastoma cell lines. 
    
   
  
    
      
      MinION 
      
    
   
  5 
 
  
    EGAD50000000510 
   
  
    
    scATAC-seq by 10xGenomics Droplet Sequencing from two human donors. scATAC sequences are available for FACS-sorted CD45+ cells from blood, skin and VAT (visceral adipose tissue). 
    
   
  
    
      
      NextSeq 500 
      
    
   
  12 
 
  
    EGAD50000000512 
   
  
    
    To investigate the influence of lifelong exercise training on the response of skeletal muscle to a bout of acute exercise we generated targeted epigenomic data from long-term endurance (8 men) and strength (8 men) trained individuals and healthy age-matched untrained controls (8 men). Skeletal muscle biopsies were taken from M. vastus lateralis before, directly after, and 3hrs following acute exercise. Control subjects completed one bout of acute endurance exercise and one bout of acute resistance exercise, separated by 4-8 weeks; athletes completed one bout in their respective form of sports. All 96 samples were used for DNA extraction and targeted library construction using a custom Twist Biosciences panel and, following EM-methylation transformation, were sequenced (2x150bp paired end) on the Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  96 
 
  
    EGAD50000000513 
   
  
    
    Genotype array data from 152 South African Coloured individuals. Typed on Illumina H3Africa array. 
    
   
  
    
   
  152 
 
  
    EGAD50000000514 
   
  
    
    For two donors, we collected CD34 plus and CD34 minus hematopoietic cells for the single-cell RNA-seq analysis. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  4 
 
  
    EGAD50000000515 
   
  
    
    Genome-wide studies have uncovered multiple independent signals at the RREB1 locus associated with altered type 2 diabetes risk and related glycaemic traits. However, little is known about the function of the zinc finger transcription factor Ras-responsive element binding protein 1 (RREB1) in glucose homeostasis or how changes in its expression and/or function influence diabetes risk. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  62 
 
  
    EGAD50000000516 
   
  
    
    The coding variant (p.Arg192His) in the transcription factor PAX4 is associated with an altered risk for type 2 diabetes (T2D) in East Asian populations. In mice, Pax4 is essential for beta cell formation but its role in human beta cell development and/or function is unknown. Participants carrying the PAX4 p.His192 allele exhibited decreased pancreatic beta cell function compared to homozygotes for the p.192Arg allele in a cross-sectional study in which we carried out an intravenous glucose tolerance test and an oral glucose tolerance test. In a pedigree of a patient with young onset diabetes, several members carry a newly identified p.Tyr186X allele. In the human beta cell model, EndoC-βH1, PAX4 knockdown led to impaired insulin secretion, reduced total insulin content, and altered hormone gene expression. Deletion of PAX4 in human induced pluripotent stem cell (hiPSC)-derived islet-like cells resulted in derepression of alpha cell gene expression. In vitro differentiation of hiPSCs carrying PAX4 p.His192 and p.X186 risk alleles exhibited increased polyhormonal endocrine cell formation and reduced insulin content that can be reversed with gene correction. Together, we demonstrate the role of PAX4 in human endocrine cell development, beta cell function, and its contribution to T2D-risk. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  64 
 
  
    EGAD50000000517 
   
  
    
    Resolving causal genes for type 2 diabetes at loci implicated by genome-wide association studies (GWAS) requires integrating functional genomic data from relevant cell types. Chromatin features in endocrine cells of the pancreatic islet are particularly informative and recent studies leveraging chromosome conformation capture (3C) with Hi-C based methods have elucidated regulatory mechanisms in human islets. However, these genome-wide approaches are less sensitive and afford lower resolution than methods that target specific loci. Methods: To gauge the extent to which targeted 3C further resolves chromatin-mediated regulatory mechanisms at GWAS loci, we generated interaction profiles at 23 loci using next-generation (NG) capture-C in a human beta cell model (EndoC-βH1) and contrasted these maps with Hi-C maps in EndoC-βH1 cells and human islets and a promoter capture Hi-C map in human islets. Results: We found improvements in assay sensitivity of up to 33-fold and resolved ~3.6X more chromatin interactions. At a subset of 18 loci with 25 co-localised GWAS and eQTL signals, NG Capture-C interactions implicated effector transcripts at five additional genetic signals relative to promoter capture Hi-C through physical contact with gene promoters. Conclusions: High resolution chromatin interaction profiles at selectively targeted loci can complement genome- and promoter-wide maps. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD50000000518 
   
  
    
    Identification of the genes and processes mediating genetic association signals for complex diseases represents a major challenge. As many of the genetic signals for type 2 diabetes (T2D) exert their effects through pancreatic islet-cell dysfunction, we performed a genome-wide pooled CRISPR loss-of-function screen in a human pancreatic beta cell line. We assessed the regulation of insulin content as a disease-relevant readout of beta cell function and identified 580 genes influencing this phenotype. Integration with genetic and genomic data provided experimental support for 20 candidate T2D effector transcripts including the autophagy receptor CALCOCO2. Loss of CALCOCO2 was associated with distorted mitochondria, less proinsulin-containing immature granules and accumulation of autophagosomes upon inhibition of late-stage autophagy. Carriers of T2D-associated variants at the CALCOCO2 locus further displayed altered insulin secretion. Our study highlights how cellular screens can augment existing multi-omic efforts to support mechanistic understanding and provide evidence for causal effects at genome-wide association studies loci. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  6 
 
  
    EGAD50000000519 
   
  
    
    Population level variation and molecular mechanisms behind insulin secretion in response to carbohydrate, protein, and fat remain uncharacterized despite ramifications for personalized nutrition. We now define prototypical insulin secretion dynamics in response to the three macronutrients in islets from 140 cadaveric donors, including those diagnosed with type 2 diabetes. We leverage the insulin response heterogeneity and use transcriptomics and proteomics to identify molecular pathways of specific nutrient responsiveness. Surprisingly, we find robust insulin secretion to fatty acid stimulus in ~8% of donors, challenging the idea that fat has negligible effects on insulin release. Distinct islet proteomes with differences in metabolic signalling networks convey this hyper-responsiveness to fat relative to carbohydrate. By comparing human islets to human embryonic stem cell-derived islet clusters, we show that, unlike glucose-responsiveness, fat hyper-responsiveness is equivalent and may be a hallmark of functionally immature cells. Our study represents the first comparison of dynamic responses to nutrients and multi-omics analysis in human insulin secreting cells. Responses of different people’s islets to carbohydrate, protein, and fat lay the groundwork for personalized nutrition. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  96 
 
  
    EGAD50000000520 
   
  
    
    This dataset contains the FASTQ files, the corresponding H&E images with the fiducial frames, and the JSON files that were used for our paper. The JSON file names encode the slide name (V11M111-111) and the capture area (A1) for each sample.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD50000000521 
   
  
    
    Dataset including FASTQ files for snRNA-seq samples from subcortical Multiple Sclerosis (MS) lesions, together with the curated snRNA-seq atlas. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000522 
   
  
    
    Curated characterization of cell subtypes derived from the 9 principal cell types within our subcortical Multiple Sclerosis (MS) lesions snRNA-seq atlas. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000523 
   
  
    
    The dataset contains samples of 6 organotypic co-cultures, assembled with patient-derived material from ovarian cancer (OC) patients. Tumor cells, both as bulk and as cancer stem cell-enriched (OCSC) populations, are cultured with or without in vitro peritoneal TME (for details see Battistini C et al, Tumor microenvironment-induced FOXM1 regulates ovarian cancer stemness, CDDis 2024).
The dataset consists of paired-end FASTQ files from bulk RNA-Seq. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD50000000524 
   
  
    
    BAM and VCF files from WES of an affected proband with a novel immunodeficiency phenotype caused by homozygous mutation of SLC19A1. 
    
   
  
    
      
      Ion Torrent Proton 
      
    
   
  1 
 
  
    EGAD50000000525 
   
  
    
    This dataset contains count data (from 10X CellRanger) and metadata from 7 acute myeloid leukemia subjects treated with chemotherapy, as well as data from 3 healthy donors. Additionally, the dataset includes VDJ data for all subjects except for 31_base and the 3 healthy donors (TCRseq not performed). 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  15 
 
  
    EGAD50000000526 
   
  
    
    The dataset contains whole genome sequencing data of 8 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 16 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  16 
 
  
    EGAD50000000527 
   
  
    
    Raw 16SV4-sequence data from bronchial brushing DNA obtained from healthy volunteers prior to and four weeks after ICS treatment. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1 
 
  
    EGAD50000000528 
   
  
    
    Associated metadata for sequencing data. 
    
   
  
    
   
  63 
 
  
    EGAD50000000529 
   
  
    
    Barcodes associated with 16SV4 sequencing. 
    
   
  
    
   
  63 
 
  
    EGAD50000000530 
   
  
    
    Raw data of ctDNA profiling using the PredicineWES+ or PredicineBEACON assay in patients enrolled in the divarasib phase I GO42144 study. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  303 
 
  
    EGAD50000000531 
   
  
    
    The dataset contains HMO measurements for 1542 samples from the Lifelines NEXT cohort. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1502 
 
  
    EGAD50000000532 
   
  
    
    This dataset contains 16S sequencing of fecal and milk samples for the HMO-microbiome Lifelines NEXT study. 
    
   
  
    
      
      Illumina MiSeq 
      
    
   
  1501 
 
  
    EGAD50000000533 
   
  
    
    10x Genomics immune profiling including sequenced libraries of: 
5'-end mRNA transcriptomics
Beta and alpha chain VDJ transcripts
Cell surface expression by 204 DNA barcoded antibodies 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  143 
 
  
    EGAD50000000535 
   
  
    
    This dataset comprises shallow whole genome sequencing data from the STIC project. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  38 
 
  
    EGAD50000000536 
   
  
    
    This is the dataset for Ampliseq sequencing 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  11 
 
  
    EGAD50000000537 
   
  
    
    Bulk RNAseq analysis of 84 PDAC samples: normalized read counts. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000540 
   
  
    
    Smart-seq2 single cell RNA sequencing of human BCC, SCC, melanoma (ALM) and healthy control skin samples. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  15 
 
  
    EGAD50000000545 
   
  
    
    This dataset includes DNA methylation profiles from 112 young individuals with Type 1 Diabetes (T1D) at T1D diagnosis, who were longitudinally monitored for hyperglycemia over an average duration of 3 years. The data were generated using whole-genome bisulfite sequencing. The dataset includes 1872 FASTQ files (i.e. 936 paired-end file pairs) generated through 150 bp paired-end sequencing on Illumina HiSeq X. 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  224 
 
  
    EGAD50000000547 
   
  
    
    This research project was a collaboration between Cardiff University, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 2,458 Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2512 
 
  
    EGAD50000000548 
   
  
    
    DNA sequencing of 864 genes from the KiCS cancer panel. Multiple family members affected with multifocal GIST underwent whole genome sequencing of the germline and tumor. Affected individuals with GIST harbored a germline variant within exon 13 of the KIT gene (c.1965T>G; p.Asn655Lys, p.N655K) and a variant in the MSR1 gene (c.877C>T; p.Arg293*, p.R293X). 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  7 
 
  
    EGAD50000000549 
   
  
    
    Whole genome sequencing of tumour and normal samples from familial GIST. Multiple family members affected with multifocal GIST underwent whole genome sequencing of the germline and tumor. Affected individuals with GIST harbored a germline variant within exon 13 of the KIT gene (c.1965T>G; p.Asn655Lys, p.N655K) and a variant in the MSR1 gene (c.877C>T; p.Arg293*, p.R293X). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  7 
 
  
    EGAD50000000550 
   
  
    
    Data generated from three prospective institution-wide tumor sequencing studies (SHIVA01 [NCT01771458], MOSCATO-01 [NCT02613962], and MATCH-R [NCT02517892]) to analyze the genomic landscape of metastases from 97 patients with mUC by performing both whole exome sequencing (WES) and whole-transcriptome sequencing (RNA-seq)  
    
   
  
    
      
      unspecified 
      
    
   
  211 
 
  
    EGAD50000000553 
   
  
    
    cfRRBS data produced from the cfDNA in plasma isolated from healthy, adult donors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  44 
 
  
    EGAD50000000554 
   
  
    
    In this study, we explore the potential of classifying pediatric brain tumors based on methylation profiling of the cell-free DNA in cerebrospinal fluid (CSF). For this proof-of-concept study, we collected 20 cerebrospinal fluid samples from pediatric brain cancer patients via a ventricular drain placed for reasons of increased intracranial pressure. For 11 patients in this study we collected matched tumor DNA. This cohort contains FASTQ files of cfRRBS data from these samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  43 
 
  
    EGAD50000000555 
   
  
    
    BaTwa genotype data using the H3Africa array.  
    
   
  
    
   
  80 
 
  
    EGAD50000000558 
   
  
    
    Raw fastq files obtained by RNA sequencing 138 IDH-mutant astrocytomas included in the CATNON trial. RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue blocks using the RNeasy FFPE kit. RNA sequencing was performed on an Illumina NovaSeq 6000 (GenomeScan BV, Leiden, The Netherlands) with 150bp paired-end reads including UMI tags.  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  138 
 
  
    EGAD50000000559 
   
  
    
    This dataset contains a gene-cell matrix derived from single-cell RNA sequencing (scRNA-seq) data of ileal tissue from Crohn's disease (CD) patients and colorectal cancer (CRC) patients. It includes:
Crohn's Disease Patients: A trio of transmural lesions (stenotic, inflamed, and non-inflamed) from each patient.
Colorectal Cancer Patients: Unaffected ileal tissue used as external non-inflamed control.
Cell Level Metadata:
The dataset includes relevant cell-level metadata such as cell type annotations used in the study.
Experimental Details:
Platform: 10x Genomics Chromium Single Cell 3' GEX
Sequencing: Illumina NovaSeq
Processing: Data processed with Cell Ranger software. Resulting count matrices were merged for downstream analysis, including integration and dimensionality reduction.
Dataset Composition:
Crohn's Disease Patients: 10 patients with 3 samples each (non-inflamed, inflamed, stenotic), totaling 30 samples.
Colorectal Cancer Patients: 5 patients with 1 sample each of unaffected tissue, totaling 5 samples.
Data Provided:
Merged Raw Count Matrix: The final merged raw count matrix used for downstream analysis.
Cell Metadata File: Contains details of sample, tissue, and patient for each cell in the count matrix.
Barcodes File: Lists each cell barcode, which also encodes the sample, tissue, and patient details for each cell. 
CD.S_Infl: Stenotic CD patient, inflamed sample
CD.S_Sten: Stenotic CD patient, stenosis sample
CD.S_Prox: Stenotic CD patient, proximal non-inflamed sample
CC.C_Prox: CRC patient, proximal unaffected sample
e.g. the barcode 'CC.C_1_Prox_AAGTCGTAGACCCTTA' indicates the unaffected proximal sample from CRC patient no. 1, and the nucleic acid sequence identifies a unique cell from this sample.
Total Samples:
Crohn's Disease (CD) Patients: 30 samples
Colorectal Cancer (CRC) Patients: 5 samples
   Patient_no       Sample Sample_type
1      CC.C_1  CC.C_1_Prox   CC.C_Prox
2      CD.S_1  CD.S_1_Prox   CD.S_Prox
3      CD.S_1  CD.S_1_Infl   CD.S_Infl
4      CD.S_1  CD.S_1_Sten   CD.S_Sten
5      CC.C_2  CC.C_2_Prox   CC.C_Prox
6      CD.S_2  CD.S_2_Prox   CD.S_Prox
7      CD.S_2  CD.S_2_Infl   CD.S_Infl
8      CD.S_2  CD.S_2_Sten   CD.S_Sten
9      CC.C_3  CC.C_3_Prox   CC.C_Prox
10     CC.C_4  CC.C_4_Prox   CC.C_Prox
11     CD.S_3  CD.S_3_Prox   CD.S_Prox
12     CD.S_3  CD.S_3_Infl   CD.S_Infl
13     CD.S_3  CD.S_3_Sten   CD.S_Sten
14     CD.S_4  CD.S_4_Prox   CD.S_Prox
15     CD.S_4  CD.S_4_Infl   CD.S_Infl
16     CD.S_4  CD.S_4_Sten   CD.S_Sten
17     CC.C_5  CC.C_5_Prox   CC.C_Prox
18     CD.S_5  CD.S_5_Prox   CD.S_Prox
19     CD.S_5  CD.S_5_Infl   CD.S_Infl
20     CD.S_5  CD.S_5_Sten   CD.S_Sten
21     CD.S_6  CD.S_6_Prox   CD.S_Prox
22     CD.S_6  CD.S_6_Infl   CD.S_Infl
23     CD.S_6  CD.S_6_Sten   CD.S_Sten
24     CD.S_7  CD.S_7_Prox   CD.S_Prox
25     CD.S_7  CD.S_7_Infl   CD.S_Infl
26     CD.S_7  CD.S_7_Sten   CD.S_Sten
27     CD.S_8  CD.S_8_Prox   CD.S_Prox
28     CD.S_8  CD.S_8_Infl   CD.S_Infl
29     CD.S_8  CD.S_8_Sten   CD.S_Sten
30     CD.S_9  CD.S_9_Prox   CD.S_Prox
31     CD.S_9  CD.S_9_Infl   CD.S_Infl
32     CD.S_9  CD.S_9_Sten   CD.S_Sten
33    CD.S_10 CD.S_10_Prox   CD.S_Prox
34    CD.S_10 CD.S_10_Infl   CD.S_Infl
35    CD.S_10 CD.S_10_Sten   CD.S_Sten 
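
The barcode convention described above can be decoded with a few lines of Python. This is a minimal sketch, assuming every barcode has exactly four underscore-separated fields (patient group, patient number, tissue, cell sequence) as in the example given; the function name `parse_barcode` and the dictionary keys are illustrative and not part of the deposited files.

```python
def parse_barcode(barcode: str) -> dict:
    """Split a prefixed cell barcode into patient, sample, and cell fields.

    Assumes the layout <group>_<patient number>_<tissue>_<sequence>,
    e.g. 'CC.C_1_Prox_AAGTCGTAGACCCTTA'.
    """
    group, patient_idx, tissue, sequence = barcode.split("_")
    return {
        "patient_no": f"{group}_{patient_idx}",        # e.g. CC.C_1
        "sample": f"{group}_{patient_idx}_{tissue}",   # e.g. CC.C_1_Prox
        "sample_type": f"{group}_{tissue}",            # e.g. CC.C_Prox
        "cell_barcode": sequence,                      # nucleotide sequence of the cell
    }


if __name__ == "__main__":
    print(parse_barcode("CC.C_1_Prox_AAGTCGTAGACCCTTA"))
```

The three derived fields correspond to the Patient_no, Sample, and Sample_type columns of the table above.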
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  35 
 
  
    EGAD50000000561 
   
  
    
    Metagenomic characterization of tracheal aspirates from non-pulmonary sepsis patients. This dataset consists of non-human read data from shotgun sequencing of lung aspirates from sepsis case samples. The dataset consists of 32 paired FASTQ files sequenced in paired-read mode (2x150 bp) using an Illumina HiSeq 4000 sequencer. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  32 
 
  
    EGAD50000000562 
   
  
    
    This dataset contains the FASTQ and BAM files for the TOTHER3 study. Targeted DNA-Seq experiments were performed on 49 samples. Since these data were obtained with a panel of genes protected by Intellectual Property (IP), the files only contain the information needed to validate the published results and, in consequence, sensitive data had to be specifically removed. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  49 
 
  
    EGAD50000000563 
   
  
    
    This research project was a collaboration between University Hospital Frankfurt and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 823 Bipolar case/control samples from collaborators in Germany. Genomic DNA from each sample was sequenced to a mean depth of 20x.  The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  823 
 
  
    EGAD50000000564 
   
  
    
    This dataset contains synthetic WGS data for 10 tumor-normal pairs of colorectal cancer, simulated in the standard format of Illumina paired-end reads. The NEAT read simulator (version 3.0, https://github.com/zstephens/neat-genreads) was used to synthesize these 10 pairs of tumor and normal WGS data. During data generation, simulation parameters (i.e., sequencing error statistics, read fragment length distribution and GC% coverage bias) were learned from data models provided by NEAT. The target average sequencing depth was around 110X for tumor samples and 60X for normal samples.
 
For generation of synthetic normal WGS data per each sample, a germline variant profile from a real patient was down-sampled randomly, representing 50% germline variants of a given patient. These were mixed with the other 50% in silico germline variants that were modelled randomly using an average mutation rate (0.001), finally constituting a full germline profile for normal synthetic WGS data.
 
For generation of synthetic tumor WGS data per each sample, a pre-defined somatic short variant profile (SNVs+Indels) learnt from a real CRC patient was added to the germline variant profile used for creating the normal synthetic WGS data of the same patient; together these constitute the variants for the tumor sample. Neither a copy number profile nor a structural variation profile was introduced into the tumor synthetic WGS data. Tumor content and ploidy were assumed to be 100% and 2, respectively.
 
For mapping/variant detection, the Sarek pipeline v3.1.2 (https://nf-co.re/sarek/3.1.2) was used, specifically:
1. BWA v0.7.17-r1188 for read mapping
2. GATK v4.3.0.0 for pre-processing BAM files (including MarkDuplicates and recalibration)
3. Mutect2 (GATK v4.3.0.0) for somatic variant calling
4. Strelka2 v2.9.10 for germline and somatic variant calling
 
 
Metadata information of 10 CRC patients used for the generation of synthetic normal and tumor WGS data:
 
Patient_id  Tumor_barcode  Normal_barcode  Age  Sex  Tissue        Cancer
SIM007      SIM007_T       SIM007_N        71   F    Rectal        Primary CRC
SIM008      SIM008_T       SIM008_N        45   F    Colon         Neuroendocrine Metastasis CRC
SIM010      SIM010_T       SIM010_N        62   M    Colon         Metastasis CRC
SIM011      SIM011_T       SIM011_N        55   M    Colon         Neuroendocrine Metastasis CRC
SIM012      SIM012_T       SIM012_N        57   M    Rectal        Metastasis CRC
SIM013      SIM013_T       SIM013_N        69   M    Colon         Metastasis CRC
SIM014      SIM014_T       SIM014_N        68   M    Colon         Neuroendocrine primary CRC
SIM015      SIM015_T       SIM015_N        58   F    Colon         Primary CRC
SIM016      SIM016_T       SIM016_N        49   M    Colon/Rectal  Primary CRC
SIM017      SIM017_T       SIM017_N        78   M    Colon         Neuroendocrine primary CRC 
    
   
  
    
      
      unspecified 
      
    
   
  20 
 
  
    EGAD50000000565 
   
  
    
    Dataset contains bulk RNA sequencing reads from 44 NERD patients under dupilumab therapy before and after aspirin provocation at baseline and after 24 weeks of treatment. Dataset is a multiplexed .bam file containing all sequencing reads with tagged identifiers.  
    
   
  
    
      
      NextSeq 550 
      
    
   
  1 
 
  
    EGAD50000000566 
   
  
    
    scRNA-seq dataset for the study "Surgery in combination with immune checkpoint therapy as an effective treatment for patients with metastatic cancer." The dataset comprises 38 samples from 19 patients, with each pair including a baseline (BL) sample and a post-biopsy or post-surgery sample. Each sample corresponds to an R1 (read 1) and an R2 (read 2) file (.fastq.gz) for paired reads. Besides procedure (biopsy or surgery), another important phenotype is response (PR and PD in this case, no SD for these patients), included in the description of the samples. The sequencing was performed using the Illumina NovaSeq 6000 platform, as seen in the experiment description. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  38 
 
  
    EGAD50000000567 
   
  
    
    Musculoskeletal diseases affect up to 20% of adults worldwide. The gut microbiome has been implicated in inflammatory conditions, but large-scale metagenomic evaluations have not yet traced the routes by which immunity in the gut affects inflammatory arthritis. To characterize the community structure and associated functional processes driving gut microbial involvement in arthritis, the Inflammatory Arthritis Microbiome Consortium investigated 440 stool shotgun metagenomes comprising 221 adults diagnosed with rheumatoid arthritis, ankylosing spondylitis, or psoriatic arthritis and 219 healthy controls and individuals with joint pain without an underlying inflammatory cause. Diagnosis explained about 2% of gut taxonomic variability, which is comparable in magnitude to inflammatory bowel disease. We identified several candidate microbes with differential carriage patterns in patients with elevated blood markers for inflammation. Our results confirm and extend previous findings of increased carriage of typically oral and inflammatory taxa and decreased abundance and prevalence of typical gut clades, indicating that distal inflammatory conditions, as well as local conditions, correspond to alterations to the gut microbial composition. We identified several differentially encoded pathways in the gut microbiome of patients with inflammatory arthritis, including changes in vitamin B salvage and biosynthesis and enrichment of iron sequestration. Although several of these changes characteristic of inflammation could have causal roles, we hypothesize that they are mainly positive feedback responses to changes in host physiology and immune homeostasis. By connecting taxonomic alterations to functional alterations, this work expands our understanding of the shifts in the gut ecosystem that occur in response to systemic inflammation during arthritis. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  440 
 
  
    EGAD50000000573 
   
  
    
    A total of 20 individuals were analyzed using Oxford Nanopore Technologies long-read sequencing. Each individual was sequenced on one PromethION flow cell, providing approximately 25-30X coverage of the genome. The data are deposited as nanopore BAM files with methylation information included. 
    
   
  
    
      
      PromethION 
      
    
   
  20 
 
  
    EGAD50000000574 
   
  
    
    This dataset contains the raw sequencing data (FASTQ files) of 10 sample libraries. These are 10x Genomics single-cell transcriptome libraries of human peripheral blood mononuclear cells (PBMCs). The samples were taken from healthy, acute decompensated (AD) and acute-on-chronic liver failure (ACLF) patients. The sequencing was performed on an Illumina NovaSeq 6000. 
 
These data are related to the paper: "Distinct immunometabolic signatures in circulating immune cells define disease outcome in acute-on-chronic liver failure" 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD50000000578 
   
  
    
    Here, we provide mapped cram files from 35 multiple myeloma patients. Sequencing was performed employing Illumina's short read technology. Resulting sequencing data was mapped against GRCh38 using nfcore/sarek v3.2.3. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  35 
 
  
    EGAD50000000581 
   
  
    
    Shallow whole genome sequencing of tumor samples included in the GLASS cohort 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  227 
 
  
    EGAD50000000582 
   
  
    
    Whole exome sequencing from normal samples in the GLASS trial 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  45 
 
  
    EGAD50000000583 
   
  
    
    Whole exome sequencing from tumor samples in the GLASS trial 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  219 
 
  
    EGAD50000000585 
   
  
    
    The dataset comprises RNAseq data of human testicular tissue of fertile men with full spermatogenesis and infertile men with biallelic variants in genes of the piRNA pathway. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD50000000591 
   
  
    
    We sequenced 34 prDLBCL samples using whole exome sequencing (WES) to evaluate possible mutational signatures and driver mutations associated with the patients' clinical and cytogenetic characteristics. 
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  34 
 
  
    EGAD50000000592 
   
  
    
    We sequenced 30 prDLBCL samples using RNA-seq to evaluate possible fusion events and to obtain expression profiles.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  30 
 
  
    EGAD50000000594 
   
  
    
    Our cohort represents infants with ALL in whom KMT2Ar was not detected by FISH or by standard cytogenetics. Whole-genome sequencing (WGS) was performed using Illumina’s HiSeqX to a depth of >30X.  
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  23 
 
  
    EGAD50000000595 
   
  
    
    Our cohort represents infants with ALL in whom KMT2Ar was not detected by FISH or by standard cytogenetics.  Whole-transcriptome sequencing (WTS) libraries were prepared using the NEBNext Ultra II RNA Directional library kit for samples with at least 100ng RNA (n=19), the Clontech double stranded cDNA conversion kit plus the Nextera XT library protocol for samples with less than 100ng RNA (n=4), or ribodepletion using NEBNext rRNA Depletion Kit v2 (Human/Mouse/Rat) for Total RNA (n=1). Libraries were paired-end sequenced using the Illumina HiSeq 2500.  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  28 
 
  
    EGAD50000000596 
   
  
    
    Massively Parallel Reporter Assays (MPRA) of colorectal cell lines HCEC-1CT (normal colon) and HT29 and SW403 (MSS cancer). Probes identified using the CRC GWAS. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000599 
   
  
    
    These are the metagenomic datasets of 29 FMT triads with corresponding plasma metabolomics.
Data for each triad consists of metagenomics for baseline, post FMT and corresponding donor.
Clinical data concerning the glucose rate of disappearance and blood pressure are available as phenotypes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  69 
 
  
    EGAD50000000601 
   
  
    
    Data generated through single nuclei ATAC sequencing on whole ganglionic eminences from 3 human fetuses (two of 16 and one of 17 gestational weeks). Tissue was acquired from the MRC-Wellcome Trust Human Developmental Biology Resource with ethical approval.  
snATAC-Seq libraries were prepared from ~8,000 nuclei per sample using Chromium Next GEM Single Cell ATAC (v1.1) reagents (10X Genomics). Quality control of libraries was performed using the Agilent 5200 Fragment Analyzer before sequencing on an Illumina NovaSeq 6000 to a depth of at least 617 million read pairs per library. Raw sequencing data were converted into FASTQ files.
For a full description of data generation, please see Cameron et al, Schizophrenia Bulletin 2024, https://doi.org/10.1093/schbul/sbae083.
Please note that 10X-generated BAM files, rather than FASTQ files, have been uploaded. FASTQ files can be regenerated using the 10X Genomics bamtofastq tool. 
https://support.10xgenomics.com/docs/bamtofastq 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000602 
   
  
    
    Sequencing dataset of 290 DLBCL and HGBL cfDNA samples of paired end 150bp sequencing runs. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  290 
 
  
    EGAD50000000603 
   
  
    
    Spatial transcriptomics data (ST) from 32 human prostate tissue samples originating from 8 prostate cancer patients (5 patients with post-surgery relapse). The ST data was acquired using the Visium Spatial Gene Expression kit which resulted in over 20 000 spatially defined spots for the 32 tissue samples. The raw transcriptomics data is RNA-seq. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Each ST spot has metadata such as sample origin, histology class (stroma, normal epithelium, cancer of various grading etc), number of cells and estimated cell type fractions. Patient metadata include information of age at surgery, time (months) until reported relapse, total follow-up time, pre-surgery PSA, post-surgery T-stage and metastasis status.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  32 
 
  
    EGAD50000000604 
   
  
    
    Bulk transcriptomics data from 176 prostate tissue samples (37 patients, 27 with post-surgery relapse) that were acquired using the SENSE mRNA-Seq Library Prep Kit V2 and Illumina NextSeq 500 instrument. The raw transcriptomics data is single-read RNA-seq. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Patient metadata include information of age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  176 
 
  
    EGAD50000000605 
   
  
    
    Bulk methylation array data from 64 prostate tissue samples from 16 patients (5 with post-surgery relapse). Methylation data were acquired using the microarray kit Illumina Infinium MethylationEPIC BeadChip (n=64). The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Patient metadata include information of age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage. 
    
   
  
    
      
      Infinium MethylationEPIC BeadChip 
      
    
   
  64 
 
  
    EGAD50000000606 
   
  
    
    Bulk methylation array data from 32 prostate tissue samples from 8 patients (3 with post-surgery relapse). Methylation data were acquired using the microarray assay Illumina Infinium MethylationEPIC v2.0 Kit. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Patient metadata include information of age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage.  
    
   
  
    
      
      Infinium MethylationEPIC v2.0 BeadChip 
      
    
   
  32 
 
  
    EGAD50000000607 
   
  
    
    Tumours with high and ultrahigh rates of somatic retrotransposition studied in the publication by Zumalave et al. titled Synchronous L1 retrotransposition events promote chromosomal crossover early in human tumorigenesis 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      MinION 
      
      Sequel II 
      
    
   
  156 
 
  
    EGAD50000000608 
   
  
    
    This dataset contains all fecal shotgun metagenomics and metabolomics data from the study, which investigated the effects of the probiotic strain Anaerobutyricum soehngenii. Shotgun data are from three time points: baseline and two follow-ups. Metabolomics sets are pre- and post-intervention, both fasted and postprandial. Data on randomization and metformin use can be found in the analysis. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  74 
 
  
    EGAD50000000609 
   
  
    
    The dataset contains RNAseq profiles of 57 patients from the CheckMate-142 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. Baseline tumor tissue samples were processed using the Illumina TruSeq RNA Access in-solution hybrid capture panel and underwent subsequent NGS on the Illumina NovaSeq platform. Fastq files are included. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  57 
 
  
    EGAD50000000610 
   
  
    
    The dataset contains WES profiles of 59 patients from the CheckMate-142 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and RNA from formalin-fixed, paraffin-embedded (FFPE) tissue sections. Baseline tumor tissue and matched whole-blood samples were processed using the Agilent SureSelect Human All Exon V5 in-solution hybrid capture panel and underwent subsequent next-generation sequencing (NGS) on the Illumina NovaSeq platform. Fastq files are included. 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  59 
 
  
    EGAD50000000611 
   
  
    
    MeD-seq data (fastq files) from gynecological cancers and associated healthy tissues.
In total 292 fastq files generated by MeD-seq are deposited consisting of: healthy tissues (vulva n=11, cervix n=15, endometrium n=13, fallopian tube n=18 and ovary n=13), precursor lesions of cancer (vulva n=23 and cervix n=46) and cancer (vulva n=21, cervix n=45, endometrium n=26, fallopian tube n=8 and ovary n=33) 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  292 
 
  
    EGAD50000000614 
   
  
    
    VCF file with autosomal genotypes from 22 Eivissan and 20 Menorcan samples. The Affymetrix Human Origins array was used to genotype the samples, and the variants were lifted over to the hg38 human genome reference. 
    
   
  
    
   
  42 
 
  
    EGAD50000000615 
   
  
    
    We report single-cell RNA sequencing data from myeloid cells of the human visceral adipose tissue in non-alcoholic fatty liver disease patients
 
    
   
  
    
      
      Illumina NovaSeq X 
      
    
   
  16 
 
  
    EGAD50000000616 
   
  
    
    A mutation accumulation experiment in colorectal cancer (CRC)-derived tumoroids. A sequential single-cell cloning approach was adopted to measure the mutation rate in eight tumoroids obtained from five patients. WGS was also performed on their matched normal tissue and on standard tumoroid cultures without any cloning step. This dataset contains 150x depth sequencing for 7 samples. 
    
   
  
    
      
      unspecified 
      
    
   
  7 
 
  
    EGAD50000000617 
   
  
    
    Whole Genome Sequencing raw paired-end reads for 20 matched colorectal patient-derived organoids (normal and tumor). Tumor samples were obtained from patients treated at Niguarda Cancer Center (Milano, Italy) and Candiolo Cancer Institute (Candiolo, Turin, Italy). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  40 
 
  
    EGAD50000000618 
   
  
    
    In this study, formalin-fixed paraffin-embedded targeted locus capture (FFPE-TLC) sequencing is used as a novel technology for targeted detection of tumor-specific genomic structural variants (SVs) in the primary tumor of 29 colorectal cancer patients with metastatic disease. The tumor region was macrodissected and sequenced for 29 patients, and the "normal" region of the same slide was macrodissected and sequenced for 8 of these patients. 
SVs were found in the common fragile site (CFS)-associated genes MACROD2, PRKN, FHIT and WWOX as well as SVs caused by three LINE transposable elements. Tumor-specificity of selected SVs was independently verified by droplet digital PCR of tumor tissue DNA and their applicability as plasma circulating tumor DNA biomarkers was demonstrated.
This dataset contains the hg19 reference alignment of the FFPE-TLC sequences for all 29 colorectal cancers and 8 adjacent normals. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  37 
 
  
    EGAD50000000619 
   
  
    
    This research project was a collaboration between VU and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 943 Bipolar case/control samples from collaborators in the Netherlands. Genomic DNA from each sample was sequenced to a mean depth of 20x.  The exome used Twist capture and the samples were sequenced on Illumina HiSeqX machines producing cram files 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  947 
 
  
    EGAD50000000620 
   
  
    
    30 and 21 whole exome sequences for Eivissan and Menorcan healthy unrelated volunteers, respectively. The exomes were captured with the Agilent SureSelect Human All Exon V6 capture kit and paired-end sequenced on Illumina platforms. For each individual, a pair of ".fastq" files containing the raw reads is provided. 
    
   
  
    
      
      unspecified 
      
    
   
  51 
 
  
    EGAD50000000621 
   
  
    
    This dataset contains the bam files of 60 paired samples of AML and remission samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  120 
 
  
    EGAD50000000622 
   
  
    
    We performed whole exome sequencing of paired fresh frozen tumor from 4 ER+ advanced breast cancer patients. Samples were obtained for each patient prior to combined CDK4/6 inhibitor and endocrine therapy and at disease progression (N=8). DNA was extracted using the Maxwell® RSC FFPE and Tissue DNA Kit (Promega). WES was performed at the Department of Molecular Medicine, Aarhus University Hospital on matched tumor DNA (derived from primary fresh frozen and FFPE tissue) and buffy coat DNA. Libraries of tumors and matching germline DNA were prepared using 50 ng DNA and captured by Twist Comprehensive Exome with custom spike-ins, sequenced on the Illumina NovaSeq 6000 platform to an average coverage of 413x (range: 148-515x). The dataset includes the analysis data files (vcf files). 
    
   
  
    
   
  8 
 
  
    EGAD50000000623 
   
  
    
    Whole Exome Sequencing of 1260 Bipolar cases and matched controls performed at the Broad Institute on a cohort from Edinburgh, Scotland, UK.  The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  1242 
 
  
    EGAD50000000626 
   
  
    
    Here, we applied single-cell RNA-sequencing (scRNA-seq) on isolated HRS cells and the immune cells from the same pediatric classical Hodgkin Lymphoma (cHL) tumors. Specifically, 13 cHL patients and 3 reactive lymph node control samples were included in this cohort. This allowed us to identify genes of cell surface proteins that are consistently overexpressed in HRS cells and can potentially be used as targets for antibody-drug conjugates or CAR T cells. Finally, we identify potential interactions by which HRS cells inhibit T cells, including the Galectin-1/CD69 and HLA-DRA/LAG3 interactions. However, high levels of inter-patient heterogeneity of the interaction strength were observed. In conclusion, this study identifies new potential therapeutic targets for cHL and highlights the importance of studying heterogeneity when identifying therapy targets. 
    
   
  
    
      
      NextSeq 2000 
      
      NextSeq 500 
      
    
   
  15 
 
  
    EGAD50000000627 
   
  
    
    This research project was a collaboration between Cambridge University, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 2,873 Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x.  The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  2851 
 
  
    EGAD50000000629 
   
  
    
    The cohort consisted of individuals with low-count MBL, high-count MBL as well as patients with CLL. Whole genome sequencing was initially performed in a smaller cohort of selected, representative samples, while targeted re-sequencing of CLL putative driver genes was performed in all samples of the cohort. Besides CLL cell samples, the sequencing process was performed in paired control samples including both buccal and polymorphonuclear cell samples. 
    
   
  
    
      
      unspecified 
      
    
   
  52 
 
  
    EGAD50000000630 
   
  
    
    The dataset "DELFI low-coverage WGS of plasma cfDNA" includes paired FASTQ files produced by WGS of 689 plasma cfDNA samples from 153 patients with colorectal cancer. WGS (100bp PE) was performed on a NovaSeq 6000 with a target depth of coverage of 8x. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  689 
 
  
    EGAD50000000631 
   
  
    
    A total of 99 bulk RNA-seq lymphoma samples. 
    
   
  
    
      
      Illumina HiSeq X 
      
    
   
  99 
 
  
    EGAD50000000632 
   
  
    
    A total of 203 targeted DNA sequencing lymphoma samples. 
    
   
  
    
      
      Illumina HiSeq 1000 
      
    
   
  203 
 
  
    EGAD50000000633 
   
  
    
    We performed cellular indexing of transcriptomes and epitopes (CITE-seq) of six primary leukemia samples from CK-AML patients. The dataset contains BAM files for each individual sample (D1922 and R0836) or for the hashed pool (HIAML47-HIAML85, P9D-P9R) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000634 
   
  
    
    We performed strand-specific single-cell sequencing of six primary leukemia samples from four CK-AML patients. We also performed strand-specific single-cell sequencing of three matching patient-derived xenografts (PDXs). The dataset contains BAM files for each cell. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  491 
 
  
    EGAD50000000635 
   
  
    
    PacBio HiFi whole genome sequencing data from six individuals carrying cytogenetically visible inversions. The HiFi data was produced on the PacBio Revio machine, using 1 flowcell per sample. The resulting data was aligned to hg19 using minimap2 and converted to BAM using samtools.
 
    
   
  
    
      
      unspecified 
      
    
   
  6 
 
  
    EGAD50000000638 
   
  
    
    Analysis of skewed X inactivation for X-linked disorders 
    
   
  
    
      
      PromethION 
      
    
   
  46 
 
  
    EGAD50000000640 
   
  
    
    In our single-center study, we have launched a pilot program for pediatric patients with undiagnosed diseases in the second-largest university hospital in the Czech Republic. WES was implemented as a first-line test after inclusion in the study as part of the diagnostic workflow. This study was prospectively conducted at the Department of Pediatrics at University Hospital Brno between 2020 and 2023.  
    
   
  
    
      
      NextSeq 500 
      
    
   
  58 
 
  
    EGAD50000000641 
   
  
    
    An additional 320 swab samples were sequenced. The bam files contain consensus reads. 
    
   
  
    
      
      DNBSEQ-G400 
      
    
   
  320 
 
  
    EGAD50000000642 
   
  
    
    ChIP-seq was performed on fresh-frozen tissue derived from healthy breast (HB), primary breast cancer (PB), and BCa-derived liver metastasis (LM). Immunoprecipitation was performed for ERa (sc-542, SantaCruz). Raw paired-end fastq.gz files are provided for both immunoprecipitated DNA and input samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  22 
 
  
    EGAD50000000643 
   
  
    
    Hi-C libraries were prepared by enzymatic digestion with MboI restriction enzyme and sonication (Covaris). Illumina single-indexing primers were used to amplify ligated fragments. Hi-C experiments were performed on 10 slices (50um thick) of fresh-frozen tissues or 10x10e6 Pleural Effusion cells. Raw paired-end fastq.gz files are provided.
 
    
   
  
    
      
      NextSeq 500 
      
    
   
  39 
 
  
    EGAD50000000644 
   
  
    
    Raw WES sequencing data obtained from precancerous samples, such as adenomas and MMR-deficient crypts, and paired healthy tissue from MSI CRC patients, of whom 10 were diagnosed with Lynch syndrome and 3 had sporadic cancer.
These data were analyzed to evaluate microsatellite instability through tumorigenesis and its association with splicing deregulation in MSI tumor RNA. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  36 
 
  
    EGAD50000000646 
   
  
    
    The detection of circulating tumor DNA, which allows non-invasive tumor molecular profiling and disease follow-up, promises optimal and individualized management of patients with cancer. However, detecting small fractions of tumor DNA released when the tumor burden is reduced remains a challenge. We developed a PCR-based targeted bisulfite method coupled to deep sequencing to detect methylation patterns of L1PA elements, which we named DIAMOND (for Detection of Long Interspersed Nuclear Element Altered Methylation ON plasma DNA). We used sodium bisulfite chemical conversion to achieve base-pair resolution analysis and designed a multiplexed PCR based on 8 amplicons covering L1PAs. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina MiSeq 
      
      Illumina NovaSeq X 
      
    
   
  729 
 
  
    EGAD50000000648 
   
  
    
    Total RNA sequencing of olfactory mucosa (OM) cells derived from individuals diagnosed with Alzheimer's disease exposed to traffic-related ultrafine particles (UFPs) for 24-h and 72-h in submerged cultures. The UFPs used for exposures were: A0, A20 and Euro6. Exposures were compared to the corresponding blank samples. 
    
   
  
    
      
      unspecified 
      
    
   
  59 
 
  
    EGAD50000000649 
   
  
    
    Bulk RNA-Sequencing of 18 primary breast cancers from Wu et al. (2021) study.  
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  24 
 
  
    EGAD50000000650 
   
  
    
    Tumors from 173 GBM patients were analysed for somatic mutations to generate a personalized peptide vaccine targeting tumor-specific neoantigens. Exome libraries for 173 glioblastoma tumors and matched normal DNA were sequenced on Illumina platform, alongside total RNA from the tumors. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  346 
 
  
    EGAD50000000651 
   
  
    
    This dataset comprises 406 raw fastq files derived from 203 plasma cfDNA samples from 90 patients diagnosed with hepatocellular carcinoma (HCC). The samples include 90 baseline HCC (b-HCC) samples, collected during liver transplant or resection surgeries, and 113 postoperative follow-up (f-HCC) samples. For each sample, 10ng of cfDNA was used for library preparation utilizing cfMeDIP-seq technology based on the 5mC antibody-immunoprecipitation strategy. Libraries were validated via Bioanalyzer trace analysis and sequenced on the Illumina NovaSeq 6000 or HiSeq 2500 platform in paired-end 150-bp (NovaSeq) or 125-bp (HiSeq) mode for ~100 million reads per sample. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  203 
 
  
    EGAD50000000652 
   
  
    
    This dataset comprises 46 raw fastq files derived from 23 plasma cfDNA samples from 23 healthy (CTL) cancer-free donors. For each sample, 10ng of cfDNA was used for library preparation utilizing the cfMeDIP-seq technology based on the 5mC antibody-immunoprecipitation strategy. Libraries were validated via Bioanalyzer trace analysis and sequenced on the Illumina NovaSeq 6000 platform in paired-end 150-bp mode for ~100 million reads per sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD50000000655 
   
  
    
    This dataset includes comprehensive cancer panel (CCP) data on paired tumour and germline DNA from cancer of unknown primary samples, all of which are matched to WGS data under this study. This data was used to compare biomarker yield against what was achieved with WGS. This dataset contains n=34 tumour-germline paired samples. All samples are in BAM format aligned with GRCh38 reference genome. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  68 
 
  
    EGAD50000000656 
   
  
    
    Whole genome and transcriptome sequencing of cancer of unknown primary tumours was used to determine yield of clinical biomarkers for a molecular guided trial or for resolving cancer type of origin. This study includes profiling of germline DNA and tumour DNA and RNA by whole genome and transcriptome sequencing. All samples are in BAM format aligned with GRCh38 reference genome.
This dataset includes:
1. Whole genome sequencing (WGS) of 78 cancer of unknown primary tumour samples and 73 matched germline DNA.
2. Whole transcriptome sequencing (WTS) of 69 cancer of unknown primary tumour samples (matched to WGS cases)
3. Whole genome sequencing of 22 cell-free DNA samples from cancer of unknown primary patients and matched germline DNA (8 samples matched to tumour WGS) 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  264 
 
  
    EGAD50000000657 
   
  
    
    TruSight Oncology 500 (TSO500) cancer panel sequencing data on paired tumour and germline DNA from cancer of unknown primary samples. All of the samples in this dataset are matched to WGS data. This data was used to compare biomarker yield against what was achieved with WGS and contains n=51 tumour-only samples. All samples are in BAM format aligned with GRCh38 reference genome. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  51 
 
  
    EGAD50000000661 
   
  
    
    Bulk RNA sequencing of flow cytometry sorted human CD4+ regulatory T (Treg), CD4+ conventional T (Tcon), CD8+ T, and CD19+ B cells from systemic lupus erythematosus patients collected at baseline (day 0, before interleukin-2 immunotherapy), day 5 (after 1 treatment cycle of interleukin-2 immunotherapy), and day 68 (after 4 treatment cycles of interleukin-2 immunotherapy). The dataset comprises files from the above-mentioned 4 immune cell subsets, collected at 3 time points (day 0, day 5, and day 68) from 12 systemic lupus erythematosus patients. For technical reasons, day 5 samples of patient SLE_012 could not be processed and are thus missing. The complete dataset totals 140 raw sequencing files (fastq format). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  140 
 
  
    EGAD50000000662 
   
  
    
    Single-cell RNA sequencing of flow cytometry sorted human CD45+ immune cells from 3 systemic lupus erythematosus patients (SLE_002, SLE_004, and SLE_006) collected at baseline (day 0, before interleukin-2 immunotherapy) and day 5 (after 1 treatment cycle of interleukin-2 immunotherapy). Sorted immune cells were hash tagged and then stained with TotalSeq Human Universal Cocktail (Biolegend) before further processing for single-cell RNA sequencing using the 10X-Genomics platform. Samples from all 3 patients were pooled and loaded on 3 lanes of the chromium controller, resulting in 1 raw sequencing file for each lane.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000663 
   
  
    
    Single-cell RNA sequencing of flow cytometry enriched CD4+ CD127- CD25+ regulatory T cells from 8 systemic lupus erythematosus patients (SLE_001 to SLE_007, and SLE_009) collected at baseline (day 0, before interleukin-2 immunotherapy) and day 5 (after 1 treatment cycle of interleukin-2 immunotherapy). Sorted immune cells were hash tagged and then stained with TotalSeq Human Universal Cocktail (Biolegend) before further processing for single-cell RNA sequencing using the 10X-Genomics platform (incl. generation of TCR libraries). Samples from patients SLE_001, SLE_002, SLE_004, SLE_006 were collected and processed on the first day of experiment (Day 1), and samples from patients SLE_003, SLE_005, SLE_007, SLE_009 on the second day of experiment (Day 2). On each day, samples were pooled and distributed to four lanes on the chromium controller. Run ID 245269 and ID 245270 identify samples sequenced on separate flow cells. Provided are raw sequencing files.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000664 
   
  
    
    Assay for Transposase-Accessible Chromatin using sequencing (ATAC) sequencing of human CD38+, HLA-DR+, CD38+ HLA-DR+, and CD38- HLA-DR- regulatory T (Treg) cell subsets sorted from peripheral blood pooled from 4 healthy donors. After isolation and purification of Treg cell subsets with flow cytometry, samples were frozen and transferred to Active Motif for ATAC sequencing. The dataset contains one raw sequencing file (fastq) of accessible regions for each of the above-mentioned Treg cell subsets. 
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  4 
 
  
    EGAD50000000668 
   
  
    
    Simulated (based on real-world data) tumor-normal pair designed for benchmarking and optimisation of structural variation callers. The calls are derived from 12 patients in the Hartwig database requested by the EUCANCan project. Paired-end sequencing data was imputed into the normal reference data to simulate real-world variation such as purity, as well as the technical impact of sequencing technologies. The dataset contains a tumor sequenced to approximately 90x depth and a normal control sequenced to approximately 30x depth of these imputed events. The break junctions of all detected SVs in the original cases are imputed in this data from diagnostic WGS sequencing data. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  1 
 
  
    EGAD50000000676 
   
  
    
    PacBio HiFi Revio WGS data from 16 individuals sequenced in the Genomic Medicine Sweden (GMS) Long read project. Each individual was sequenced on one PacBio Revio SMRT cell. The DNA was extracted from blood. The HiFi data was aligned to GRCh38 using minimap2 or the SMRT-link software. 
    
   
  
    
      
      unspecified 
      
    
   
  16 
 
  
    EGAD50000000678 
   
  
    
    Zephir trial gene expression 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  24 
 
  
    EGAD50000000679 
   
  
    
    Colorectal cancer (CRC) is the second leading cause of cancer death worldwide. Early detection of precursor lesions or early-stage cancer could hamper cancer development or improve survival rates. Liquid biopsy, which detects tumor biomarkers, such as mutations, in blood, is a promising avenue for cancer screening. This study aimed to assess the presence of genetic variants in plasma cell-free tumor DNA from patients with precursor lesions and colorectal cancer using the commercial Oncomine Colon cfDNA Assay. Cell-free DNA (cfDNA) samples from the blood plasma of 52 Brazilian patients were analyzed. Eight patients did not have any significant lesions (five normal colonoscopies and three hyperplastic polyps), 24 exhibited precursor lesions (13 nonadvanced adenomas, ten advanced adenomas, and one sessile serrated lesion), and 20 had cancer (CRC). The mutation profile of 14 CRC-associated genes was determined by next-generation sequencing (NGS) using the Oncomine Colon cfDNA Assay on the Ion Torrent PGM/S5 sequencer. 
    
   
  
    
      
      Ion Torrent S5 
      
    
   
  52 
 
  
    EGAD50000000684 
   
  
    
    Little is known about the transcriptomic profile of individuals who are exposed to SARS-CoV-2 yet resist becoming PCR positive.  To investigate this, longitudinal whole-blood samples were taken (0, 7, 14, and 28 days after enrolment) from PCR positive and PCR negative SARS-CoV-2-naïve household contacts who were recently exposed to a COVID-19 index. Samples were also taken from pre- and post-pandemic unexposed controls. Total RNA was extracted from PAXgene tubes before undergoing poly(A) selection followed by globin and rRNA depletion. DNA libraries were constructed using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina. All samples were then sequenced across 2 flowcells of an Illumina HiSeq 4000. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  144 
 
  
    EGAD50000000686 
   
  
    
    Spatial transcriptomics analysis of triple negative breast cancers.
Both bulk sequencing and ST were performed.
Counts, images, etc. are available at https://zenodo.org/doi/10.5281/zenodo.8135721 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      NextSeq 500 
      
    
   
  94 
 
  
    EGAD50000000687 
   
  
    
    This is a Next Generation Sequencing approach based on whole Usher Syndrome genes sequencing with the aim of diagnosing USH patients and USH2A-associated RP patients 
    
   
  
    
      
      Illumina MiSeq 
      
      NextSeq 500 
      
    
   
  44 
 
  
    EGAD50000000689 
   
  
    
    The data set contains FASTQ files (filetype) of a NEXTSEQ550DX run (instrument). The FASTQ files are from DNA and RNA samples. The library prep is an NGS TSO500 library prep (Illumina) (technology). 
A Study to Examine the Clinical Value of Comprehensive Genomic Profiling Performed by Belgian NGS Laboratories: a Belgian Precision Study of the BSMO in Collaboration With the Cancer Centre (BALLETT)
This 2-year study involves the consortium of 9 cooperating Belgian NGS laboratories and will enroll 936 metastatic or locally advanced cancer patients coming from 13 different Belgian hospitals and cancer centers. Upon inclusion, all cancer patients will be offered 'comprehensive genomic profiling' (CGP) using Illumina's TSO500 NGS panel. This targeted NGS panel of 523 genes allows for the detection of single nucleotide variants, small indels, copy number variations and fusions, as well as for the determination of the 'tumor mutational burden' (TMB) and the 'microsatellite-instability' status (MSI). Both the wet lab execution of the CGP as well as the biological and clinical classification of the variants will be performed in a fully standardized way among the 9 participating Belgian local NGS laboratories.
 
    
   
  
    
      
      NextSeq 550 
      
    
   
  8 
 
  
    EGAD50000000691 
   
  
    
    The dataset contains raw data from 10x Multiome derived single-nuclei RNA and single-nuclei ATAC sequencing in the same cells 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD50000000692 
   
  
    
    This dataset includes FASTQ files from FGF14 alleles sequenced by targeted nanopore in 67 patients, 64 control individuals and three unaffected relatives.  
    
   
  
    
      
      MinION 
      
    
   
  134 
 
  
    EGAD50000000694 
   
  
    
    Tumours from MMRd EC patients treated with 2 cycles neoadjuvant ICI. TruSight Oncology 500 (TSO 500) panel for all 10 tumour samples from patients included in the trial. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  10 
 
  
    EGAD50000000695 
   
  
    
    The dataset for “Early detection of ovarian cancer using cell-free DNA fragmentomes and protein biomarkers” includes 409 cram files from whole genome next-generation sequencing on the Illumina HiSeq2500.  The samples analyzed include plasma samples from healthy individuals and patients with cancer.    
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  409 
 
  
    EGAD50000000696 
   
  
    
    We subjected to whole genome sequencing (WGS) one ILC case lacking CDH1 biallelic mutations. Both tumor and normal samples were sequenced. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  2 
 
  
    EGAD50000000697 
   
  
    
    Patients were biopsied at progression on molecular targeted agents and WES and/or WTS were performed to identify resistance mechanisms.
271 patients underwent single or sequential tissue biopsies. 
    
   
  
    
      
      Illumina HiSeq 2500 
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq X 
      
      NextSeq 500 
      
    
   
  2071 
 
  
    EGAD50000000698 
   
  
    
    In this study we aim to gain insight into the mechanisms of immune evasion after initial pathologic response to neoadjuvant immune checkpoint inhibition in macroscopic stage III melanoma by pooling data of the neoadjuvant OpACIN, OpACIN-neo and PRADO trials. Here we used paired-end RNA-seq data from 21 patients.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000699 
   
  
    
    In this study we aim to gain insight into the mechanisms of immune evasion after initial pathologic response to neoadjuvant immune checkpoint inhibition in macroscopic stage III melanoma by pooling data of the neoadjuvant OpACIN, OpACIN-neo and PRADO trials. We therefore analyzed paired baseline and recurrent tumor samples in depth through, among other methods, DNA sequencing analyses. 
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  19 
 
  
    EGAD50000000700 
   
  
    
    In this study we aim to gain insight into the mechanisms of immune evasion after initial pathologic response to neoadjuvant immune checkpoint inhibition in macroscopic stage III melanoma by pooling data of the neoadjuvant OpACIN, OpACIN-neo and PRADO trials. We therefore analyzed paired baseline (n=10) and recurrent tumor samples of the lymph nodes (n=10) and brain metastasis (n=2) in depth through Whole Exome Sequencing (WES) analyses. 
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  32 
 
  
    EGAD50000000701 
   
  
    
    Samples were from 34 mothers, 6 with gestational diabetes, 14 with type 1 diabetes and 14 without diabetes history, with either vaginal (N=17) or cesarean section (N=17) delivery, the latter comprising primary planned section (N=9) or secondary emergency section (N=8). Anesthesia methods during delivery included spinal (N=11), epidural (N=11), or local anesthesia (N=6), with unknown anesthesia information for six cases. Samples from each mother were taken from the villous and the decidual parts of the placenta. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  68 
 
  
    EGAD50000000704 
   
  
    
    Ischemia reperfusion is an unavoidable step of organ transplantation. Development of therapeutics for lung injury during transplantation has proved challenging; understanding lung injury from human data at the single cell resolution is required to accelerate the development of therapeutics. Donor lung biopsies from six human lung transplant cases were collected at the end of cold preservation and 2-hour reperfusion and underwent single cell RNA sequencing. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000706 
   
  
    
    The aim of this project is to assess differences in intratumoral immune composition in pregnant melanoma patients versus non-pregnant melanoma controls. Samples (N=25) were obtained from a local patient database. We isolated RNA from archived FFPE tissue and performed NGS and bioinformatic data analysis.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  25 
 
  
    EGAD50000000707 
   
  
    
    Further investigation and characterisation of 12q-amplified low- and high-grade osteosarcomas with MDM2 and/or CDK4 amplification focusing on SV, copy number and gene fusion analyses. In total, 25 cases (33 samples total due to multi-sampling) were included, with some form of sequencing data available for 27 samples. Mate-pair whole genome sequencing (Illumina) is available for 19 samples, long-read whole genome sequencing (PacBio HiFi) for 10 samples and RNA-sequencing (Illumina TruSeq) for 21 samples. Data is available as BAM files. 
    
   
  
    
      
      NextSeq 500 
      
      unspecified 
      
    
   
  25 
 
  
    EGAD50000000708 
   
  
    
    Mid-pass whole genome sequencing was performed for 264 Malagasy individuals across three geographic regions across the island of Madagascar (west coast, central highlands, and southern highlands). This dataset includes the VCF from joint variant calling with reference populations as specified in the associated publication. 
    
   
  
    
   
  264 
 
  
    EGAD50000000715 
   
  
    
    Whole exome sequencing of 66 gastric samples and whole transcriptome sequencing of 191 gastric samples. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
    
   
  257 
 
  
    EGAD50000000716 
   
  
    
    This research project was a collaboration between University of Bristol, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 2,969 Control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2969 
 
  
    EGAD50000000720 
   
  
    
    Long-read (PacBio) RNA sequencing dataset of three neural retinal samples. Three PacBio libraries were prepared according to the 'standard workflow' optimized for sequencing transcripts centered around 2kb. Additionally, one library (input HNR_S2) was prepared according to an optimized 'long workflow' to enrich for larger transcripts up to >10kb. To further enhance the capture of full-length transcripts for USH2A and ADGRV1 (USH2C) genes, we also employed a targeted enrichment approach using the Samplix Xdrop System, followed by PacBio long-read sequencing. USH2A and ADGRV1 enrichment was performed by targeting 5', mid and 3' targets of the respective genes. 
Finally, we also added ONT (Oxford Nanopore technology) long-read mRNA sequencing of three independent neural retina samples. These data were used for the validation of events observed from the Iso-Seq and Samplix data. The ONT datasets contain sequencing data of the Usher-associated genes. 
Files are raw BAM, and CRAM format files generated by a Sequel II machine. Additionally, the ccs3 BAM format files are included. 
    
   
  
    
      
      Sequel II 
      
      unspecified 
      
    
   
  8 
 
  
    EGAD50000000722 
   
  
    
    Exome sequencing data from fourteen phenotypically abnormal human fetal samples.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD50000000724 
   
  
    
    Methylation profiles of 33 patients with small cell lung cancer (SCLC) for both tumour and normal lung samples using MeDIP-seq. 
    
   
  
    
      
      Illumina HiSeq 2000 
      
    
   
  66 
 
  
    EGAD50000000726 
   
  
    
    Using DNA extracted from peripheral blood, Cas9-targeted nanopore DNA sequencing was used to analyze the MAGEL2 gene, including its entire regulatory construct (chr15:23639316-23651466), for sequence variation and 5-methyl-cytosine (5mC) modification in a cohort of adults with HFA compared to sex- and age-matched NC. 
    
   
  
    
      
      PromethION 
      
    
   
  40 
 
  
    EGAD50000000727 
   
  
    
    Single-nuclei sequencing data from four neuroblastoma patients. Each patient was run on two lanes, resulting in two runs per patient. Data is provided in paired-end fastq files. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  8 
 
  
    EGAD50000000728 
   
  
    
    Genomic and Transcriptomic single-cell sequencing of a neuroblastoma patient. Data represents one 96-well plate that was processed with G&T sequencing, resulting in genomic and transcriptomic data from the same single cells. Dataset contains 95 bam files containing the DNA sequencing data and 95 bam files containing the RNA sequencing data. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  190 
 
  
    EGAD50000000729 
   
  
    
    Genomic and Transcriptomic sequencing of neuroblastoma HSR and ecDNA cell lines. Data represents five 96-well plates that were processed with G&T sequencing, resulting in genomic and transcriptomic data from the same single cells. Dataset contains 95 bam files containing the DNA sequencing data and 95 bam files containing the RNA sequencing data of CHP212 cells, 380 bam files (190 DNA and 190 RNA) for TR14 cells, 188 bam files (94 DNA and 94 RNA) for Kelly cells and 192 bam files (96 DNA and 96 RNA) for IMR5/75 cells. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  950 
 
  
    EGAD50000000741 
   
  
    
    Structural variants assessed in 6 patient samples with varying phenotypes, through the use of sniffles2 with a support parameter of 1 and at a length of 30 bp 
    
   
  
    
      
      PromethION 
      
    
   
  6 
 
  
    EGAD50000000743 
   
  
    
    This dataset contains 138 .bam files sequenced with Illumina NovaSeq 6000. It also includes the files from the variant calling performed on the sequencing data, as well as the clinical data (phenotype) collected during sample extraction. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  138 
 
  
    EGAD50000000746 
   
  
    
    Transcriptomics for the ALTTO study by high-throughput sequencing. 386 HER2+ breast cancers treated by trastuzumab or lapatinib + trastuzumab were sequenced. Two fastq files are given for each sample. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  386 
 
  
    EGAD50000000749 
   
  
    
    Single-cell RNA and TCR-sequencing of three patients with treatment-refractory immune-mediated arthritis. The dataset consists of synovial tissue CD4+ and CD8+ T cells and peripheral blood CD45+ leukocytes. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000751 
   
  
    
    This dataset contains single cell protein tag sequencing of EAC samples by 10x Genomics CITE-seq. Total number of files - 24. File format - FASTQ. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000752 
   
  
    
    This dataset contains single cell RNA-sequencing of EAC samples by 10x Genomics. Total number of files - 24. File format - FASTQ. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  12 
 
  
    EGAD50000000753 
   
  
    
    This dataset contains single cell RNA-sequencing of BE samples by 10x Genomics. Total number of files - 28. File format - FASTQ. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  14 
 
  
    EGAD50000000754 
   
  
    
    The NFKBIE gene, which encodes the NF-κB inhibitor IκBε, is mutated in 3-7% of patients with chronic lymphocytic leukemia (CLL). The most recurrent alteration is a 4-bp frameshift deletion associated with NF-κB activation in leukemic B cells and poor clinical outcome. To study the functional consequences of NFKBIE gene inactivation, both in vitro and in vivo, we engineered CLL B cells and CLL-prone mice to stably down-regulate NFKBIE expression and investigated its role in controlling NF-κB activity and disease expansion. We found that IκBε loss leads to NF-κB pathway activation and promotes both migration and proliferation of CLL cells in a dose-dependent manner. Importantly, NFKBIE inactivation was sufficient to induce a more rapid expansion of the CLL clone in lymphoid organs and contributed to the development of an aggressive disease with a shortened survival in both xenografts and genetically modified mice. IκBε deficiency was associated with an alteration of the MAPK pathway, also confirmed by RNA-sequencing in NFKBIE-mutated patient samples, and resistance to the BTK inhibitor ibrutinib. In summary, our work underscores the multimodal relevance of the NF-κB pathway in CLL and paves the way to translate these findings into novel therapeutic options.  
    
   
  
    
      
      Illumina HiSeq 2500 
      
    
   
  8 
 
  
    EGAD50000000755 
   
  
    
    Oxford Nanopore Technologies based long-read RNA sequencing data from 5 patients with stereotyped subset CLL. The BAM files were aligned against hg19 reference genome.
 
    
   
  
    
      
      MinION 
      
    
   
  5 
 
  
    EGAD50000000760 
   
  
    
    9 DHG-H3G34 patient samples were sequenced by paired-end scRNA-sequencing using the Smart-Seq2 protocol on a NextSeq 500 sequencer (Illumina). Illumina bcl2fastq 1.5 was used for demultiplexing. This dataset contains the resulting 9,408 fastq files of 4,704 single cells/nuclei sequenced. 
    
   
  
    
      
      NextSeq 500 
      
    
   
  9 
 
  
    EGAD50000000763 
   
  
    
    Total RNA sequencing on 9 uveal melanomas. Libraries were prepared using the TruSeq Stranded Total RNA Library Prep Gold (Illumina, 20020599). Paired-end libraries (2 x 100 bp) were sequenced on a NovaSeq 6000 instrument (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  9 
 
  
    EGAD50000000764 
   
  
    
    Whole genome sequencing on 1 uveal melanoma and corresponding germline sample. Libraries were prepared using the Kapa HyperPrep kit (Roche, 07962363001). Paired-end libraries (2 x 150 bp) were sequenced on a NovaSeq 6000 instrument (Illumina).
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000765 
   
  
    
    Whole genome sequencing on HAP-1 clones wild-type or knockout for MBD4 and/or TDG. Libraries were prepared using the Kapa HyperPrep kit (Roche, 07962363001). Paired-end libraries (2 x 100 bp) were sequenced on a NovaSeq 6000 instrument (Illumina). 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  20 
 
  
    EGAD50000000766 
   
  
    
    Whole Genome Bisulfite Sequencing on normal primary human uveal melanocytes. WGBS libraries were prepared using the Accel-NGS Methyl-Seq DNA Library Kit (Swift Biosciences 30024), the EZ DNA Methylation-Gold Kit (Zymo D5005), and DNA Clean & Concentrator-5 (Zymo D4013), following the instruction manual Accel-NGS Methyl-Seq DNA Library (Revision 160510). Paired-end (2 x 150 bp) libraries were sequenced on a DNBSEQ-T7 instrument (MGI) after library conversion.  
    
   
  
    
      
      DNBSEQ-T7 
      
    
   
  1 
 
  
    EGAD50000000767 
   
  
    
    Whole exome sequencing on 11 uveal melanoma samples and corresponding germline samples. Libraries were prepared using the SureSelectXT2 Clinical Research Exome V2 kit (Agilent, 5190-9500 and G9621B). Paired-end libraries (2 x 100 bp) were sequenced on HiSeq 2000/2500 or NovaSeq 6000 instruments (Illumina). 
    
   
  
    
      
      Illumina HiSeq 2000 
      
      Illumina HiSeq 2500 
      
      Illumina NovaSeq 6000 
      
    
   
  21 
 
  
    EGAD50000000768 
   
  
    
    Dataset for the manuscript Scywalker: scalable end-to-end data analysis workflow for nanopore single-cell transcriptome sequencing. Contains fastq files for 4 brain samples obtained from short-read NovaSeq 6000 v1.5 Illumina and long-read Oxford Nanopore PromethION sequencing. Single-cell suspensions were generated using 10x Genomics Chromium Next GEM Single Cell 3' Kit v3.1. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
      PromethION 
      
    
   
  8 
 
  
    EGAD50000000769 
   
  
    
    Investigation of post-zygotic and germline variants using whole exome and ultra-deep duplex sequencing in paired uninvolved margin and primary tumor samples from 126 breast cancer patients with differing survival outcomes, with skin or blood samples as reference. Pairs of uninvolved margin and blood samples were also collected for 15 reduction mammoplasty patients without personal or familial history of cancer, serving as controls. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  22 
 
  
    EGAD50000000770 
   
  
    
    Investigation of post-zygotic and germline variants using whole exome and ultra-deep duplex sequencing in paired uninvolved margin and primary tumor samples from 126 breast cancer patients with differing survival outcomes, with skin or blood samples as reference. Pairs of uninvolved margin and blood samples were also collected for 15 reduction mammoplasty patients without personal or familial history of cancer, serving as controls. 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  408 
 
  
    EGAD50000000771 
   
  
    
    29 Hi-C datasets [8 G3 (MB3667, MB3687, MB3690, MB3692, MB3693, MB4010, MB4037, MB4141), 13 G4 (MB0558, MB3510, MB3670, MB3689, MB3716, MB3760, MB3761, MB3807, MB4079, MB4132, MB4174, MBSF7, SNC_2_5), 7 SHH (MB3612, MB3661, MB3662, MB3695, MB3697, MB3724, MB4143), and 1 WNT (MB4036)]. Fresh tissue samples were obtained from The Hospital for Sick Children (Toronto, ON). In situ Hi-C libraries were generated using approximately 2.5 million dissociated cells as input. All Hi-C libraries were sequenced at 150 bp PE with a HiSeq X instrument (Illumina) at McGill Genome Centre (Montreal, QC). 
    
   
  
    
      
      HiSeq X Ten 
      
    
   
  29 
 
  
    EGAD50000000787 
   
  
    
    This dataset contains long-read whole-genome sequencing (lrWGS) data from 12 samples. Three lrWGS data are from single-cell (sc), multi-cell (mc, 10 cells), and bulk samples of HG002. The remaining nine lrWGS data are from two preimplantation genetic testing (PGT) families, including four from blood bulk DNA of the parental pairs and five from trophectoderm biopsies of two embryos from one family and three embryos from another family. The data are provided in raw FASTQ format and were generated using the PromethION device from Oxford Nanopore Technologies. 
    
   
  
    
      
      PromethION 
      
    
   
  12 
 
  
    EGAD50000000790 
   
  
    
    Low-coverage whole genome sequencing and targeted (30 gene panel) deep sequencing of oral cancer. Note that the targeted deep sequencing is not actually amplicon sequencing but hybrid capture sequencing, which was not available as an option. 
    
   
  
    
      
      unspecified 
      
    
   
  554 
 
  
    EGAD50000000792 
   
  
    
    The dataset contains WES profiles of 487 patients from the CA209-274 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections.  Normalized WES libraries were pooled and sequenced on Illumina NovaSeq 6000 at a plex level appropriate to the coverage of tumor: 2 × 100 bp PE 100 M reads, germline: 2 × 100 bp PE 25 M reads. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  487 
 
  
    EGAD50000000793 
   
  
    
    The dataset contains RNAseq profiles of 370 patients from the CA209-274 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. RNAseq libraries (75PE, 50M) were sequenced on Illumina NovaSeq 6000 at a plex-level appropriate to the coverage of 2 × 50 base pair (bp) paired end (PE) 50M reads. Fastq files are included. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  370 
 
  
    EGAD50000000794 
   
  
    
    10X Genomics single-cell transcriptomics of hepatoblastoma tissues using Chromium Next GEM Single-Cell 3’ Reagent Kits v3.1. Single-cell transcriptomics data (bam files) of PT9, post-chemotherapy tumor sample, and PT13, treatment naive tumor sample. Single-cell solutions were obtained from viably frozen tissue samples using enzymatic digestion.
 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  2 
 
  
    EGAD50000000795 
   
  
    
    10X Genomics single-cell multiome profiling of hepatoblastoma tumor organoids using Chromium Next GEM Single-Cell Multiome ATAC + Gene Expression Kit. Single-cell transcriptomics data (bam file) of multiplexed sample of hepatoblastoma tumor organoids: 3E, 8F1, 10F2, 13F2, 13E, 17E and 96F1. Single-cell solutions were obtained from fresh organoid samples using enzymatic digestion. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  7 
 
  
    EGAD50000000796 
   
  
    
    10X Genomics single-cell transcriptomic profiling of hepatoblastoma tumor organoids using Chromium Next GEM Single-Cell 3’ Reagent Kits v3.1 or 3’ CellPlex Multiplexing Kit. Single-cell transcriptomics data (bam files) of hepatoblastoma tumor organoids: 3E, 8F1, 10F2, 13F2, 13F2 late, 13E, 17E, 17F1, 22E, 27F1, 28F1, 31E, 96F1, 121E and 135. Single-cell solutions were obtained from fresh organoid samples using enzymatic digestion. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  15 
 
  
    EGAD50000000797 
   
  
    
    10X Genomics Visium Spatial transcriptomics analysis of hepatoblastoma and adjacent normal liver. Spatial data of normal liver and PT2 (fastq files), and PT13, PT14 and PT16 (bam files), using the Visium Spatial Gene Expression Solution. All samples, except PT13, were collected post-chemotherapy. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  5 
 
  
    EGAD50000000804 
   
  
    
    Plasma samples from healthy individuals were subjected to low-coverage whole-genome sequencing (less than 10x average depth). This dataset contains raw fastq files from 18 healthy control plasma samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  18 
 
  
    EGAD50000000805 
   
  
    
    This dataset consists of genomic WES paired-end raw data (FASTQ R1, R2 and UMI sequence) obtained from plasma and HapMap control samples. Specifically, it consists of two HapMap samples (NA12877, NA12878) and 2 plasma standards (HD780 and HD816) with VAF mutations at 0.0%, 0.1%, 1.0%, and 5.0%, for a total of 21 plasma samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  23 
 
  
    EGAD50000000806 
   
  
    
    This study explores the cell-free transcriptome in a humanized DLBCL patient-derived tumor xenograft (PDTX) model. Blood plasma samples (n=171) derived from a DLBCL PDTX model including 27 humanized (HIS) PDTX, 8 HIS non-PDTX and 21 non-HIS PDTX non-obese diabetic (NOD)-scid IL2Rgnull (NSG) mice were collected during humanization, xenografting, treatment, and sacrifice. The mice were treated with either rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP), CD20-targeted human IFNα2-based AcTaferon combined with CHOP (huCD20-Fc-AFN-CHOP), or phosphate-buffered saline (PBS). RNA was extracted using the miRNeasy serum/plasma kit and sequenced on the NovaSeq 6000 platform using the SMARTer Stranded Total RNA-Seq Kit v3. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  171 
 
  
    EGAD50000000807 
   
  
    
    Single-cell RNA-Seq data and TCR sequencing data (both by 10X Genomics) of 51 TNBC primary tumors obtained from 29 unique patients from BELLINI clinical trial. The data includes pre- and post-treatment samples. Patients in cohort A (16 patients, 28 samples) received nivolumab for 4 weeks, patients in cohort B (13 patients, 23 samples) received nivolumab + ipilimumab for 4 weeks. The included sequencing data was generated from frozen material for cohort A and from fresh material for cohort B. 
    
   
  
    
      
      Illumina HiSeq 4000 
      
      Illumina NovaSeq 6000 
      
    
   
  116 
 
  
    EGAD50000000808 
   
  
    
    RNA-Seq data of 78 breast cancer primary tumors obtained from 45 unique patients from BELLINI clinical trial. The data includes pre- and post-treatment samples. Patients in cohort A (15 patients, 25 samples) received nivolumab for 4 weeks, patients in cohort B (15 patients, 28 samples) received nivolumab + ipilimumab for 4 weeks, and patients in cohort C (15 patients, 25 samples) received nivolumab + ipilimumab for 6 weeks. The included raw transcriptome sequencing data in fastq format was generated using Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  78 
 
  
    EGAD50000000809 
   
  
    
    WES data of 30 breast cancer primary tumors obtained from 30 unique patients from BELLINI clinical trial (cohorts A & B) and 30 matched blood samples. The data includes pretreatment samples. The included raw WES sequencing data in fastq format was generated using Illumina NovaSeq 6000. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  60 
 
  
    EGAD50000000810 
   
  
    
    Plasma samples from patients with breast cancer (stage IV) who had confirmed BRCA mutations were subjected to low-coverage whole-genome sequencing. This dataset contains raw fastq files from 6 breast cancer plasma samples. 
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  6 
 
  
    EGAD50000000811 
   
  
    
    This data set contains the fastq files from whole-genome sequencing of temporally matched tumour (fresh frozen biopsies), blood germline and plasma samples collected from a BRCA1-mutant breast cancer patient to directly compare mutation signature analysis using gold-standard tumour-germline paired variant calling with a novel ctDNA-based method (MisMatchFinder).  
    
   
  
    
      
      Illumina NovaSeq 6000 
      
    
   
  3 
 
  
    EGAD50000000812 
   
  
    
    This dataset includes FASTQ files from MARCHF6 alleles sequenced by targeted nanopore in 8 patients. 
    
   
  
    
      
      MinION 
      
    
   
  8