- Structural variation
- Human genetic sequence variation
- Pathways involved in diabetes
- Tracking genes involved in coronary heart disease after GWAS
Human genetic sequence variation
During the past decade there have been significant advances in technologies for DNA sequencing that have facilitated studies of variation in ancient and modern humans.
Studies in ancient fossil remains
In 2010, Green, et al., published data on four billion nucleotides of DNA sequence from three different Neanderthal individuals. They noted that DNA extracted from these late Pleistocene remains had degraded to segments less than 200 nucleotides in length and that it had been chemically modified. In addition, they found substantial contamination from microorganisms. To enrich the Neanderthal DNA, samples were digested with restriction endonucleases that selectively cleave microbial DNA.
Green, et al., examined the DNA sequence at loci with specific alleles that are known to differ among modern human populations. They determined that Neanderthals shared 1% to 4% of genotypes at these sequences with Europeans and Asians. At these loci, Neanderthals did not share alleles with Sub-Saharan African populations.
Sequence analyses also led to the identification of genes that apparently underwent positive selection in modern humans. Specific sequences in these genes impact protein function. Green, et al., identified specific functional sequence changes that occurred in modern humans but were absent from Neanderthals, and the Neanderthal sequence matched the sequence present in chimpanzees.
Studies on an ancient Saqqaq individual
In the past decade, we have seen the confluence of paleontological analyses of bone fossils and cultural artifacts with DNA analyses. Rasmussen, et al. (2010), examined DNA recovered from the hair roots of an individual from Greenland, estimated to have lived 4,000 years ago, who was a member of the Saqqaq culture. Analysis of DNA polymorphisms from the hair roots indicated that the closest match was with individuals from eastern Siberia.
One advantage of analyses from hair roots is that they are less contaminated with fungi and bacteria than samples isolated from bone fossils. The high quality of the DNA isolated from hair roots of the Saqqaq individual enabled the analysis of 350,000 SNPs. Earlier studies by Rasmussen's group generated information on the complete mitochondrial DNA genome from permafrost-preserved Saqqaq individuals.
Given the length of the tracts of homozygosity they found, Rasmussen, et al., concluded that the inbreeding coefficient was high. Rasmussen, et al., also studied DNA sequence at functional polymorphic sites. The combination of SNPs at the HERC2 and OCA2 (oculocutaneous albinism gene 2) loci indicated that the individual most likely had brown eyes and dark hair. Analyses also revealed that the Saqqaq individual was most closely related to three Northern Old World Arctic populations and was more distantly related to New World Amerinds. Researchers were not able to detect evidence of West Eurasian population admixture. Results of nuclear DNA SNP analyses and of studies on the mitochondrial and Y chromosome haplotypes of the Saqqaq individuals matched most closely those of North East Asian populations.
The Saqqaq culture is a component of the Arctic small tool transition and is estimated to have existed between 4,750 and 2,500 years ago.
Sequence variation in different populations and regions
In an analysis of 650,000 common SNPs, Li, et al. (2008), collected samples from populations in 51 geographic regions. Populations studied were drawn from Sub-Saharan Africa, North Africa, the Middle East, Europe, East, South, and Central Asia, Oceania, and the Americas. They carried out haplotype analysis to identify linked alleles at specific loci. They detected finer haplotype substructure in different regions. They noted, for example, that Palestinians, Druze, and Bedouins have haplotype contributions from the Middle East, Europe, and South and Central Asia.
Li, et al., concluded that nonrandom differences between populations have accumulated at a number of different loci. However, they also concluded that within-population differences accounted for most of the genetic diversity. Results of their analyses revealed that heterozygosity was greatest in Africa and decreased as geographic distance from Addis Ababa increased.
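The heterozygosity comparison described above can be made concrete with a small calculation. The following is a minimal sketch, not the method used by Li, et al.: it assumes the standard definition of expected heterozygosity at a locus, H = 1 - sum(p_i^2), and the allele frequencies and population names are hypothetical values chosen only for illustration.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: H = 1 - sum(p_i^2)."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_heterozygosity(loci):
    """Average expected heterozygosity across a list of loci."""
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Hypothetical allele frequencies at three biallelic SNPs (not real data).
population_a = [(0.5, 0.5), (0.6, 0.4), (0.7, 0.3)]
population_b = [(0.9, 0.1), (0.8, 0.2), (1.0, 0.0)]

h_a = mean_heterozygosity(population_a)
h_b = mean_heterozygosity(population_b)
print(f"population A: {h_a:.3f}, population B: {h_b:.3f}")
```

In this toy example the population with more evenly balanced allele frequencies has the higher mean heterozygosity, which is the quantity that Li, et al., found to decline with distance from Addis Ababa.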
Tishkoff, et al. (2009), studied genotypes in 121 African populations, in 60 non-African populations, and in the African-American population. They studied microsatellite repeat polymorphisms and insertion/deletion polymorphisms. They obtained evidence for regional differences in the allele frequency of markers; however, their analyses also revealed evidence for substantial population admixture.
Homozygosity mapping
Because recombination occurs between homologous chromosomes during meiosis, the presence of identical alleles over long stretches of genomic DNA on homologous chromosomes (homozygosity) was thought to occur only in consanguineous pedigrees or inbred populations. Gibson, et al. (2006), studied 262 individuals in four different populations and identified 20 different genomic regions where homozygosity extended over 1 megabase or more. They noted that the lowest number of homozygous tracts occurs in the Yoruba population and that this reflects the more ancient roots of the population; over longer time periods, segments of chromosomes have broken up. Gibson, et al., noted that, in modern populations, tracts of homozygosity often occurred in similar genomic regions, indicating regions with a lower frequency of recombination. Examples of blocks of homozygosity are illustrated in Figure 1.2.
Figure 1.2 Blocks of homozygosity that are identical in twins (rows 1 and 2). Strikingly large blocks of homozygosity are present in the individual illustrated in row 3, likely due to consanguinity of parents. Rows 4 and 5 indicate positions of genes on chromosome 16.
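The idea of a homozygous tract extending over 1 megabase or more can be sketched in code. The example below is an illustrative simplification, not the procedure used by Gibson, et al.: it assumes error-free genotypes on a single chromosome, and the function names, marker positions, and genotypes are hypothetical. Published analyses typically also allow for genotyping errors and marker density, which this sketch ignores.

```python
def runs_of_homozygosity(positions, genotypes, min_length_bp=1_000_000):
    """Return (start, end) spans where consecutive markers are all homozygous.

    positions: sorted base-pair coordinates of markers on one chromosome.
    genotypes: one (allele1, allele2) pair per marker.
    """
    runs = []
    start = None  # position of the first marker in the current run
    last = None   # position of the most recent homozygous marker
    for pos, (a1, a2) in zip(positions, genotypes):
        if a1 == a2:
            if start is None:
                start = pos
            last = pos
        else:
            # A heterozygous marker ends the current run.
            if start is not None and last - start >= min_length_bp:
                runs.append((start, last))
            start = None
    if start is not None and last - start >= min_length_bp:
        runs.append((start, last))
    return runs


def fraction_in_runs(runs, chromosome_length_bp):
    """Fraction of the chromosome covered by the reported runs."""
    return sum(end - begin for begin, end in runs) / chromosome_length_bp


# Hypothetical markers spaced 400 kb apart (not real data).
positions = [0, 400_000, 800_000, 1_200_000, 1_600_000, 2_000_000, 2_400_000]
genotypes = [("A", "A"), ("G", "G"), ("T", "T"), ("C", "C"),
             ("A", "G"), ("C", "C"), ("T", "T")]

runs = runs_of_homozygosity(positions, genotypes)
coverage = fraction_in_runs(runs, chromosome_length_bp=2_400_000)
print(runs, coverage)
```

Here the first four markers form a 1.2 megabase homozygous tract, while the two homozygous markers after the heterozygous site span only 0.4 megabase and fall below the threshold. The fraction of the genome lying in long tracts is one common way to summarize the degree of inbreeding.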
Studies on hereditary disorders and population history
Currently, 36 disorders are considered to comprise the Finnish disease heritage. Norio (2003) reviewed the history and studies of these disorders and related them to the historical origins, migrations, and settlements of the Finnish population. Thirty-two of these disorders have autosomal recessive inheritance patterns, two are autosomal dominant, and two are X-linked disorders.
Norio noted that the Finnish population has been relatively stable for many years, without evidence of continuous migration into Finland. Internal migration of families from the southeastern parts of the country around Savo into middle and northern regions of the country occurred around 1600. The migrant families settled in small clusters. Each cluster was often located at some distance from other clusters, and because there was little admixture between clusters, mating occurred within clusters. In later generations, couples who married often shared founders from six or seven generations earlier.
In 1999, Peltonen, et al., reported that gene loci for 32 of the Finnish hereditary diseases were mapped to specific chromosomes and causative genes for 17 of the disorders were isolated. As expected, marked linkage disequilibrium occurred for markers in the vicinity of the disease alleles, and homogeneity of the disease-causing alleles was observed.
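Linkage disequilibrium between a marker and a disease allele can be quantified from haplotype counts. The sketch below assumes the standard two-locus measures D = p_AB - p_A * p_B and r^2 = D^2 / (p_A(1 - p_A) p_B(1 - p_B)); it is not taken from Peltonen, et al., and the haplotype samples are hypothetical.

```python
def linkage_disequilibrium(haplotypes):
    """Compute D and r^2 for two biallelic loci from phased haplotypes.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2), coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n   # frequency of allele 1 at locus 1
    p_b = sum(h[1] for h in haplotypes) / n   # frequency of allele 1 at locus 2
    p_ab = sum(1 for h in haplotypes if tuple(h) == (1, 1)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical founder-like sample: the marker allele always travels
# with the disease allele (not real data).
founder_sample = [(1, 1)] * 4 + [(0, 0)] * 6
d_founder, r2_founder = linkage_disequilibrium(founder_sample)

# Hypothetical sample in which the two loci are independent.
mixed_sample = [(1, 1), (1, 0), (0, 1), (0, 0)]
d_mixed, r2_mixed = linkage_disequilibrium(mixed_sample)

print(d_founder, r2_founder, d_mixed, r2_mixed)
```

In the founder-like sample r^2 is at its maximum of 1, the pattern expected near a disease allele inherited from a small number of recent founders. In the independent sample both D and r^2 are 0.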
Peltonen and colleagues studied molecular mechanisms in these genetic disorders. Molecular analyses of products encoded by genes in regions where the diseases mapped resulted in the discovery of new proteins and enzymes. Peltonen emphasized that analysis of the disease genes facilitated disease diagnosis, and, importantly, health care was available following diagnosis.
Peltonen and coworkers also reported that analyzing linkage disequilibrium is useful in identifying gene loci that contribute to the risk of complex common diseases. Kilpinen, et al. (2009), studied regions of linkage disequilibrium in a unique pedigree with multiple cases of autism. Individuals in this pedigree shared ancestors in the 17th century. Analyses revealed areas of linkage disequilibrium in three chromosomal regions at 15q11-q13, 19p13, and 1q23.
Genetic variations, single nucleotide polymorphisms (SNPs) and genome wide association studies (GWAS)
The design of genome wide association studies (GWAS) is predicated on the hypothesis that common DNA sequence variants contribute to the etiology of common disease. Results indicate that even when statistically significant associations between disease and a specific SNP are determined, the overall contribution of specific SNPs to disease risk is often low.
Genome wide association studies and insight into etiology of type 2 diabetes and obesity
In type 2 diabetes, the insulin-secreting capacity of pancreatic beta cells becomes inadequate to overcome progressive peripheral insulin resistance. Factors that play roles in the development of peripheral insulin resistance include age, inactivity, and weight gain. McCarthy (2010) reviewed the discovery of genes that impact susceptibility to diabetes and obesity. He considered three waves of discovery. The first included family-based linkage studies. These studies led to the identification of genes involved in a number of Mendelian forms of early-onset diabetes, including neonatal diabetes and maturity-onset diabetes of the young (MODY). Genes that were found to play a role in MODY included NEUROD1 (neurogenic differentiation 1); GCK (glucokinase); hepatocyte nuclear factor genes HNF1A, HNF1B, and HNF4A; and IPF1 (insulin promoter factor 1). Family studies also led to the discovery of a mitochondrial DNA mutation that predisposes carriers to diabetes and deafness.
McCarthy noted that family studies of childhood obesity led to the discovery of rare forms of this condition due to mutations in any one of three genes: leptin, leptin receptor, and pro-opiomelanocortin.
The second wave of investigation into diabetes and obesity involved searching for variants in candidate genes. These studies led to the identification of common variants of modest effect in PPARG (peroxisome proliferator-activated receptor gamma) and KCNJ11 (potassium inwardly-rectifying channel, subfamily J, member 11). Resequencing of the melanocortin 4 receptor gene (MC4R) led to the identification of associated variants in 2% to 3% of cases of obesity.
A third wave of studies involved large-scale analysis of common DNA sequence variants (SNPs). McCarthy considered this to be the most successful wave of studies. Important diabetes-associated loci identified in these studies include TCF7L2, which encodes a transcription factor that modulates pancreatic function; CDKAL1, CDKN2A, and CDKN2B, genes related to the regulation of cyclin-dependent kinases; and HHEX, a gene involved in beta cell development. Each copy of a susceptibility allele at one of these loci leads to a 15% to 20% increase in the risk for diabetes.
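The size of a 15% to 20% per-allele effect is easier to judge with a short calculation. The sketch below assumes that the effect of each copy multiplies the risk (a common modeling convention, not a claim made in McCarthy's review), and the function name is hypothetical.

```python
def relative_risk(risk_allele_copies, per_allele_increase):
    """Relative risk versus zero copies, assuming effects multiply per copy."""
    if risk_allele_copies not in (0, 1, 2):
        raise ValueError("a diploid genotype carries 0, 1, or 2 copies")
    return (1.0 + per_allele_increase) ** risk_allele_copies


# Relative risk for 0, 1, and 2 copies at the two ends of the quoted range.
low = [relative_risk(n, 0.15) for n in (0, 1, 2)]
high = [relative_risk(n, 0.20) for n in (0, 1, 2)]

for copies, (lo_rr, hi_rr) in enumerate(zip(low, high)):
    print(f"{copies} copies: relative risk {lo_rr:.2f} to {hi_rr:.2f}")
```

Under this assumption, a person carrying two copies of a susceptibility allele at one locus has a relative risk of roughly 1.32 to 1.44 compared with a person carrying none, which illustrates why the contribution of any single SNP to overall disease risk is modest.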
McCarthy reported at least 40 known loci with alleles associated with increased risk of diabetes. Of interest is the fact that five of the loci with common variant alleles associated with increased risk of diabetes also harbor rare variants involved in familial or syndromic diabetes. These five loci are wolframin (WFS1); hepatocyte nuclear factors HNF1A and HNF1B; the melatonin receptor MTNR1B; and insulin receptor substrate 1 (IRS1), which impacts insulin action.