Integrative Life Science (ILS) Doctoral Programme
DETECTING NOVEL CANCER PREDISPOSING MUTATIONS BY UTILIZING THE FINNISH CANCER REGISTRY AND ARCHIVAL TISSUE MATERIAL
Iikki Donner
Department of Medical and Clinical Genetics, Medicum
Applied Tumor Genomics Research Program
Faculty of Medicine
University of Helsinki, Finland
ACADEMIC DISSERTATION
To be publicly discussed, with the permission of the Faculty of Biological and Environmental Sciences, in Metsätalo, Room 2, Unioninkatu 40, Helsinki, on June 5th, 2020, at 12 noon.
Helsinki 2020
Supervised by Academy Professor Lauri A. Aaltonen, MD, PhD
Department of Medical and Clinical Genetics, Medicum
Applied Tumor Genomics Research Program
Faculty of Medicine
University of Helsinki, Finland
Docent Auli Karhu, PhD
Department of Medical and Clinical Genetics, Medicum
Applied Tumor Genomics Research Program
Faculty of Medicine
University of Helsinki, Finland
Reviewed by Docent Hellevi Peltoketo, PhD
Biocenter Oulu
Cancer and Translational Medicine Research Unit
Faculty of Medicine
University of Oulu, Finland
Docent Katri Pylkäs, PhD
Biocenter Oulu
Cancer and Translational Medicine Research Unit
Faculty of Medicine
University of Oulu, Finland
Official opponent Docent Jaana Hartikainen, PhD
Faculty of Health Sciences
School of Medicine
Institute of Clinical Medicine
Clinical Pathology and Forensic Medicine
University of Eastern Finland, Finland
ISBN 978-951-51-6169-7 (paperback)
ISBN 978-951-51-6170-3 (PDF)
http://urn.fi/URN:ISBN:978-951-51-6170-3
Unigrafia Oy
Helsinki 2020
CONTENTS
ORIGINAL PUBLICATIONS ... 5
ABBREVIATIONS ... 6
General ... 6
Genes ... 7
Amino acids ... 10
ABSTRACT ... 11
1 REVIEW OF THE LITERATURE ... 14
1.1 Cancer as an inherited disease ... 14
1.1.1 The hallmarks of cancer ... 14
1.1.2 A disease of the genome ... 16
1.1.3 Genetic cancer susceptibility ... 18
1.2 The Finnish population – a well-studied genetic isolate ... 20
1.2.1 Finnish population history ... 21
1.2.2 The Finnish disease heritage ... 21
1.2.3 Recording population and disease history ... 21
1.3 Next-generation sequencing ... 22
1.3.1 Reference genomes and population controls ... 22
1.3.2 ... 23
1.3.3 ... 25
1.3.4 ... 26
1.4 ... 27
1.4.1 ... 27
1.4.2 ... 29
1.4.3 ... 30
1.4.4 ... 31
2 AIMS OF THE STUDY ... 33
3 MATERIALS AND METHODS ... 34
3.1 ... 34
3.2 ... 34
3.3 ... 34
3.3.1 ... 34
3.3.2 ... 35
3.3.3 ... 37
3.3.4 ... 37
3.4 ... 37
3.5 ... 38
3.6 ... 40
4 RESULTS ... 42
4.1 INSR, FBXO24, DOT1L CDH1 ... 42
4.2 ... 42
4.2.1 ... 42
4.2.2 DNAH9 15% ... 43
4.2.3 EP300 ... 43
4.3 ... 45
4.3.1 ... 45
4.3.2 COL6A1 ... 45
4.3.3 ... 49
4.4 POLK PRKCB ... 49
5 DISCUSSION ... 51
5.1 ... 51
5.2 ... 53
5.3 ... 53
5.3.1 INSR, FBXO24, DOT1L ... 54
5.3.2 DNAH9 EP300 ... 55
5.3.3 TP53, BRCA1, BRCA2 ... 56
5.3.4 SCN7A ... 59
5.3.5 POLK PRKCB ... 60
5.4 ... 61
CONCLUDING REMARKS AND FUTURE PROSPECTS ... 62
ACKNOWLEDGEMENTS ... 63
REFERENCES ... 65
ORIGINAL PUBLICATIONS
ǣ
I Donner Iǡ ǡ ¡ ǡ ǡ Ǥ
ǤFamilial CancerǡʹͲͳͷǤ
II Donner Iǡ ǡ ǡ ǡ ǡ ǡ
ǡǡǤ
ǤGenes Chromosomes CancerǡʹͲͳǤ III Donner Iǡ ǡ ¡ ǡ ǡ ǡ Ǥ
Ǧ
ǤLung Cancer, ʹͲͳͺǤ
IV Donner Iǡ ǡ ǡ ǡ ¡ ǡ ǡ
Ǥ Ǧ ǤFamilial CancerǡʹͲͳͻǤ
Ǥ
Dz ǦdzȋǡʹͲʹͲȌǤ
ABBREVIATIONS
General
AITL
BAM
CGC
COSMIC
E
EBV
ESCC
FCR
FF
FFPE
FIMM
FUSION
gnomAD
GWAS
HDGC
HDR
GATK
IGCLC
INDEL
IPF
LFS
LOH
LUAD
MAF
MN
NGS
NSCLC
O
O/E
PARP
PCR
PIS
PTCL
SCLC
SEM1
SISu
SNP
SNV
TFH
TSG
UIP
VCF
WT
Genes
ABCC10
ADH1B
ALDH2
ALK
ANK2
ATM
ATP7B
BAG1
BAP1
BRCA1
BRCA2
CABIN1
CACNA1S
CD28
CDH1
CDKN1B
CDKN2A
CFTR
CHEK2
CHRNA3
CHRNA5
CHRNB4
CLIP4
CLPTM1L
CREBBP
COL6A1
COL6A6
CTNNA1
CTNNB1
CYP1A1
CYP21A2
DCDC2B
DDOST
DICER1
DNAH9
DNMT3A
DOT1L
EGF
EGFR
EML4
EP300
ER
ERBB2
ERCC4
EXT1
FANCA
FANCD2
FANCE
FANCL
FAT1
FBXO24
FBXW7
FCSK
FGFR1
FYN
GCN1
GJB6
GKAP1
GNL3L
GTF2I
HEATR3
HOTAIR
HNF1A
IDH2
INSR
KDM6A
KMT2A
KMT2D
MMP2
NFE2L2
NFX1
NOTCH1
OSGEPL1
OTOG
PCDHGB6
PDE4D
PIK3CA
PLCE1
PLCG1
POLK
PRKCB
PRKN
PRRC2B
PTCH1
RASSF9
RB1
RGS17
RHBDF2
RHOA
RUNX1
RYR1
SCN7A
SEC24A
SMARCB1
SP100
ST6GAL1
TERT
TET2
TP53
TTC36
TTN
USH2A
XBP1
ZNF676
ZNHIT3
Amino acids
ABSTRACT
ǡ
Ǥ ǡ
Ǥ ǡ
ȋ Ȍ
ȋȌ
Ǥ
Ǧ Ǧ ȋ Ȍ
ǡ
Ǥ
ǡ
Ǥ
Ǧ
ǣ ȋȌǡ
Ǧ ȋȌǤ ǡ
ȋȌ
ȋȌ Ǥ
RHBDF2 CDH1ǡ Ǥ
Ǥ
Ǥ
ǣ ǡ
ǡ
Ǥ
CDH1ǡ Ǧ Ǥ
ǡǡ Ǥ
Ǧ
CDH1 Ǥ
ǣ Ǥͳ͵ͳ͵ ǡ Ǥͺͳ ʹͶ
ǤͳͳͶͳǤINSR
Ǧ Ǥ
ǡ
Ǥ
ǡ
Ǥ
Ǥǡ
Ǥ
Ǥ
Ǥ ͵Ͳ Ǥ ǡ
DNAH9 Ǥ
̵DNAH9ǡǦ
Ǥ GKAP1ǡBAG1ǡNFX1ǡFCSKǡ
DDOST Ǥ EP300ǡ
ǡ
Ǥ
Ǥ
ǡǡ
ͳͷΨ ͳͲǦ ʹͷΨ Ǥ
Ǥ
ǡ Ǧ
Ǥ
ǡǤ
ʹͳǦ
ͶͷǤ
: BRCA1, BRCA2, ERCC4, EXT1, HNF1A, PTCH1, SMARCB1, TP53.
TP53, BRCA1, BRCA2
Ǧ Ǥ
Ǥǡ
: ABCC10, ATP7B, CACNA1S, CFTR, CLIP4, COL6A1, COL6A6, GCN1, GJB6, RYR1, SCN7A, SEC24A, SP100, TTN, USH2A.
Ǥ
Ǧ Ǥ
TET2, IDH2, DNMT3A, RHOA, FYN, PLCG1, CD28,
ǤǦ
ʹ͵ Ǥ ǡ
Ǥ
: POLK, PRKCB, ZNF676, PRRC2B, PCDHGB6, GNL3L, TTC36, OTOG, OSGEPL1, RASSF9.
ǤͶͻ
ǤͷͺͺǤ ǡPolk
PRKCB
͵͵ΨǦ ǡ
Ǧ Ǥ
Ǧ
Ǥ
Ǥ ǡ ǡ
Ǥ
1 REVIEW OF THE LITERATURE
1.1 Cancer as an inherited disease
Ǥ
ǡ ǡ ȋ Ǥǡ ʹͲͲͷǢ Ƭ
ǡ ʹͲͳͷȌǤ
Ǥ
ǡ
Ǧ
ȋǡʹͲͳͶȌǤ
ǤCDH1 ǡǡͶʹΨ ȋȌ ͵͵Ψ ȋȌ ͺͲ ȋǤǡʹͲͳͻȌǡ
ͳǤͺΨͲǤͻΨȋǤǡʹͲͳͺȌǤ
Ǥ
ȋǤǡʹͲͲͶȌǤ
ͳͻͺǡRB1ȋ
ǤǡͳͻͺȌǤ
ȋǡͳͻͳȌǤ ǡRB1
Ǥ
ȋ Ȍǡ
ȋ ȌȋFigure 1ȌǤ
Ǧ BRCA1
ͳͻͻͲǡ ȋ
ǤǡͳͻͻͲǢǤǡͳͻͻͶȌǤ
ȋǡ ʹͲͳͶȌǤ
ȋȌ ʹ͵ ǡ Ǧ
ȋȌȋǤǡʹͲͳͺǢǤǡʹͲͳͻȌǤǡͳͲ͵
Ǥ
1.1.1 The hallmarks of cancer
Ǥ
ȋȌ
Ǥ
ǤʹͲͲͲ
Dz dz
ȋƬǡʹͲͲͲȌǤ
ʹͲͳͳȋƬǡʹͲͳͳȌǤ
Box 1.
ǡ
ǡ ǦDz
dzǡ Ǧ
ȋ Ƭ ǡ ʹͲͳͺǢ Ƭ ǡ ʹͲͳͳȌǤ
Dzdz
Figure 1. Somatic versus germline mutations. Somatic mutations occur after conception, throughout an individual’s lifetime. All descendants of the affected cell carry the mutation. By definition, a somatic mutation does not affect the germ cells and is thus not passed on to the offspring. Germline mutations occur in the germ cells, which develop into sperm and ova. If one of these germ cells forms a zygote, the mutation will be present in every cell of the developing offspring, since they all descend from the mutated germ cell. Adapted from Dizon and Monk 2017.
ǡ
Ǥ
1.1.2 A disease of the genome
ȋ Ǥǡ ʹͲͳ͵ȌǤ
ǣ
ʹͲ ͵Ͳ
ǡ
Ǥ
ǡ͵͵Ǧ
Ǥ ǡ
ȋ ǤǡʹͲͳ͵ȌǤ
ȋ ͲǤͳȀȌǡ
ȋ
ͳͲͲȀȌǡ ȋ
ȌǤ
ǡǤǤ
ȋ Ȍ
ȋƬ¡ǡͳͻͻͶȌǤ ǡ
ǡ
ǡ
ȋ ǡʹͲͳͲȌǤ ǡ
ȋƬǡʹͲͳͷȌǤ
Ǧ ǦǤǦ
( & , 2004). - .
Box 1. The ten hallmarks of cancer
Sustaining proliferative signaling
Evading growth suppressors
Resisting cell death
Enabling replicative immortality
Inducing angiogenesis
Reprogramming cellular metabolism
Avoiding immune destruction
Activating invasion and metastasis
Genomic instability and mutation
Tumor-promoting inflammation
ǦǦ
Ǥ
ǤǦ
ȋȌǡ ǡ
Ǥ ǡ
ǡ Ǥ ǡ
ǤTP53 ǡ
Ǧ
ȋƬ ǡͳͻͻͳȌǤǡ
ǡ CDKN1B Ǧ
ȋ ǤǡͳͻͻͺȌǤ
ǡ ǡ
ȋ Ƭ ǡ ͳͻͻǢ Ƭ ǡ ͳͻͻͺȌǤ
ǡ
Ǥ
Ǥ
Ǥ
Ǧ ǡ Ǥ
Ǥ
Ǧ Ǥ
Ǧ ǡ
ǡǦ
ȋ ǡʹͲͲͺȌǤ
ǡ
ǡǦȋǤǡʹͲͳͳȌǤ
ǡ
ǡ ȋȌǤ
ȋƬǡʹͲͳȌǤ
ͺͲΨ Ǥ ǡ ͶͲΨȂͲΨǤ
Ǧ ȋƬǡʹͲͳǢǤǡʹͲͲ͵ȌǤ
ǡ
ǡ
ȋ Ƭ ǡ ʹͲͳͷȌǤ ǡ ǡ ǡ ǡ ǤǤ
ǡǡ ǡ
Ǥ
ǡ ǡ
ȋǤǡʹͲͳ͵Ǣ ǤǡʹͲͳͺǢ ǤǡʹͲͳͻȌǤ
ȋǤǡʹͲͳȌǤ
1.1.3 Genetic cancer susceptibility
ǡ
Ǥ
Ǧ ǡ
ȋǤǡʹͲͲͶȌǤ
Dz dzȋǡͳͻͳȌǤ
ǡ Ǥ
ǡǡ ǡ
ǣ ǡ ȋ
Ȍ ȋ
ǤǡʹͲͲͶǢǤǡͳͻͻͶȌǤ
Ǥ
ȋȌ
ǡ ǡ
ȋǤǡʹͲͳȌǤ ǡDzdz
ǡ Ǥ
Dzdz ǡ
Ǧ Ǥǡ
Ǧ ǡ
ȋǤǡʹͲͲͻȌǤ
Figure 2.
ǡǤǤǡ
ȋ Ǥǡ ʹͲͲͺȌǤ
Ǧ
ȋ Ȍ
ǡ Ǥ
ȋDzȀdzȌ ǡ
ȋ Ǥǡ ʹͲͳͶȌǤ
ȋ ǤǡʹͲͳȌǤ
ǡ ǡ
ǡ Ǧ
ȋȌ ȋǤǡʹͲͳͷȌǤ
ǡ ȋ Ȍ
ǡ
ȋǤǡʹͲͳͻȌǤ
Figure 2. The relationship between effect size and allele frequency of pathogenic variants. The genetic basis of Mendelian syndromes, which occupy the upper left corner of the diagram, has been fairly well uncovered by linkage studies, and lately there has been success in discovering common variants of low effect by GWAS (lower right). Common variants of high effect are usually removed from the gene pool by natural selection, whereas rare variants of low effect are currently extremely hard to detect. Low-frequency variants of intermediate effect are not well studied and may explain a part of the missing heritability. Adapted from Manolio et al., 2009.
ȋǤǡʹͲͳǢ
Ǥǡ ʹͲͳͻȌǤ ǡ
ǡ
ǡ ȋȌ
ȋǤǡʹͲͳȌǤ
ǡ
Ǥ
Ǧ ǡ
Ǥ
ǡ
ǡ
ȋǤǡʹͲͳȌǤ ǡ
ǡ ȋǤǡ ʹͲͲͻȌǤ
ȋ Ȍ
ǡ ǡ ǡ Ǧ ǡ
ȋ
Ȍȋ×ǦƬ
ǡʹͲͳͻȌǤ
1.2 The Finnish population – a well-studied genetic isolate
ǡ ǡ
ȋ ǦƬǡʹͲͲʹǢǤǡͳͻͻͷȌǤ
ǡ
ȋ Ǧ Ƭ ʹͲͲʹǢ
ʹͲͳͷȌǤ ǡ
ǣ
ǡǡ
Ǥ
ȋƬʹͲͳȌǡ
ȋ Ǧ ʹͲͲʹǢ ʹͲͳͷȌǤ
ǡ ǡ
(ǤǡͳͻͻͷǢʹͲͳͷȌǤ
1.2.2 The Finnish disease heritage
ǡ
ȋ¡¡¡Ǥǡ ʹͲͳȌǤ ǡ ȋǤǤ
ǡ ǡ ǡ Ȍ
( ǡ ͳͻͻ͵ȌǤ
ȋǤǡͳͻ͵ȌǤǡ
͵ȋ¡¡¡ǤǡʹͲͳǢǤǡʹͲͳ͵ȌǤ
Ǧ ȋ
Ȍǡ Ǥ
ǡ
ȋǤǡʹͲͲͺȌǤ
ǡ ǡ Ȅ
ǡ ȋ
ǤǡʹͲͳ͵ȌǤ
ǡ ZNHIT3ǡ ǡ
ȋȌȋǤǡʹͲͳȌǤ
1.2.1 Finnish population history
ͻͲͲͲ ȋǤǡʹͲͳͺȌǤ
ʹͲͲͲ
ȋǡͳͻͻ͵ȌǤ
ǡ ǡ
ǤǤ
ȋǤǡʹͲͳͶȌǤ ǡ
ǡ
ͶͲͲͲ ȋ Ǥǡ ͳͻͻͺȌǤ ǡ
͵ͷͲͲȋǤǡʹͲͳͺȌǤ
ǡ
ǡ
ȋ
Ǥǡ ͳͻͻͷȌǤ ǣ ǡ ǡ
ȋ¡¡¡ǤǡʹͲͳȌǤ
ͳͻǦͳͻͺȄ
ǡ
ͶͲͲͲͲͲ ȋ Ǥǡ ͳͻͻͻȌǤ
ǡʹͷͲͲͲͲ
ͳͺ ǡ ͷǤͷǤ
ǡ ȋǤǡͳͻͻͷȌǤ
ͳ ǡ
ǡ
ȋ Ǥǡ ͳͻͻͻȌǤ
ǡ
ǡ Ǥ
1.3 Next-generation sequencing
ǡ
Ǧ ȋȌǡ ǡ
ͶͷͶ ʹͲͲͷȋǤǡʹͲͲͷȌǤ
Ǥ ǡ
ȋǤǡʹͲͳʹȌǤ
ǡ
Ǥ Ǥ
ǣ Ǧ Ǧ ǡ
ͳΨǡ
ȋǤǡʹͲͲͻǢǤǡʹͲͳʹȌǤ
Ǧ ǡ
ȋǤǡʹͲͳʹȌǤ
1.3.1 Reference genomes and population controls
ǡ Ǥ
ʹͲͲ͵ ȋ
ǡʹͲͲͶȌǤ
Ǧ ȋ Ȍǡ
Ǥ
ǡ
Ǥ ǡ
Ǥ
ǡǡ Ȁ
ǡ
Ǧ ȋ Ǥǡ ʹͲͳͻȌǤ ͳʹͷǡͶͺͳͷǡͲͺ
Ǧ
ȋǣȀȀǤǤȀȌǤ ǡ ǡ
ȋȌǤ ȋȌ
ȋ Ȍ ȋ Ǥǡ ʹͲͳͶȌǤ
ͳͲǡͶͻͲ ȋǣȀȀǤ ǤȀȌǤ
1.2.3 Recording population and disease history
ǡ
ȋǤǡͳͻͻͻȌǤ
ͳͺͲͻǡ ȋǡ ʹͲͲʹȌǤ
ͳͷʹ͵ǦͳͷͲǡ
Ȅ ͳͺǡ Ǥ
ǡ
ǡȋǣȀȀǤȀȌǤ
ǤǤǡ ǡ
Ǥ
Ǥ
ǡ
ȋȌǡͳͻͻ
ȋǣȀȀǤȀȌǤ
Ǥ
ǡ ǡ ǡ ǡ
ǡǡǡǡ Ǥ
ȋ Ȍ ͳͻͷʹǡ
ȋǣȀȀ ǤȀȌǤ ͳͻͷ͵ǡ