Purdue University Graduate School

Quantitatively Assessing the Genetics of Hair Color in Addition to Identifying Regulatory Elements Impacting Body-Mass Index in the FTO Gene

posted on 2021-10-12, 13:09 authored by Racquel LeShawn Hopkins

Obesity is a medical condition whose prevalence has risen in both the United States and worldwide in recent decades. Numerous studies have sought to understand obesity, and through genome-wide association studies (GWAS) researchers have identified multiple genetic factors that can contribute to obesity in mammals. One proposed cause of obesity is a genetic impact on cilia formation in the central nervous system (CNS), which has downstream effects on food intake and energy expenditure, leading to obesity via overeating and decreased activity.

In the first half of this thesis, I describe a study, in collaboration with the Berbari Lab at IUPUI, that explored the human region chromosome 16:53801550-53808600 (GRCh37/hg19), an intron of the FTO alpha-ketoglutarate dependent dioxygenase (FTO) gene, for transcriptional regulators that impact BMI and obesity. First, using control DNA, PCR, and gel electrophoresis, we developed an assay of 44 primer sets (forward and reverse) covering the genomic region. After optimizing the assay, we selected 111 human DNA samples across three weight groups (underweight, normal weight, and obese) to sequence with it. The samples were selected from subjects enrolled in the Walsh Lab FDP study. Sequencing was completed on the Illumina MiSeq System, and the sequencing results were viewed in the Integrative Genomics Viewer (IGV). Variants that appeared in the results were compared across and within the weight groups, and their locations were checked for previously reported BMI associations or enhancer activity using the online genome browsers Ensembl and UCSC Genome Browser.

The results of this study revealed two single-nucleotide polymorphisms (SNPs), rs8055197 and rs11642015, that correlated best with the weight categories among the samples. These results were consistent with literature that previously linked these SNPs to obesity, particularly in relation to genes that are regulated by FTO (CUX1, POMC, and IRX3/5). Both SNPs lie within areas that show high enhancer activity in neural crest cells, which are important for cilia formation. Although there were other SNPs in high linkage disequilibrium (LD) within both regions, these two SNPs were chosen because they have homologous variant locations in the mouse genome (rs8055197 - GRCm38/mm10 8:91376305; rs11642015 - GRCm38/mm10 8:91375651), which provide a means of testing this obesity correlation, with a proposed enhancer relationship through FTO, in mouse models.

In the second half of this thesis, I explored new methods for quantitatively defining natural hair color categories, and attempted to find novel SNPs impacting hair color in a GWAS using the quantitative values as phenotypes. In previous publications, the HIrisPlex-S Prediction Tool for hair color prediction was developed and validated using categorical hair colors, which were defined and classified by individual researchers or lab personnel. Using spectrophotometer measurements and HSV color values, we used a machine-learning tool to objectively classify sample hair photos into natural hair color classes. We then used these quantitative data as the input phenotype for a GWAS, using both linear regression and linear mixed model regression, to search for new genetic associations with these objectively defined hair color classes. Lastly, we measured correlations between these hair color phenotypes and a SNP array consisting of all currently known pigment SNPs cited in recent literature.
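As a rough illustration of the two ideas in the paragraph above (expressing a hair color as HSV values, then regressing a quantitative phenotype on genotype), here is a minimal sketch. It is not the thesis pipeline: the function names, the use of the HSV value channel as the phenotype, and the toy data are all assumptions made for illustration, and the machine-learning classifier and the linear mixed model are not reproduced here.

```python
import colorsys

def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0-1, value 0-1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

def snp_linear_regression(dosages, phenotypes):
    """Ordinary least squares fit of a quantitative phenotype on allele
    dosage (0, 1, or 2 copies of the effect allele) for a single SNP.
    Returns (slope, intercept). Hypothetical helper, no covariates."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy example (made-up numbers): an average RGB color per hair photo,
# reduced to the HSV value channel and used as the phenotype.
toy_colors = [(60, 40, 20), (120, 80, 40), (200, 160, 100)]
toy_phenotypes = [rgb_to_hsv_degrees(*c)[2] for c in toy_colors]
toy_dosages = [0, 1, 2]
slope, intercept = snp_linear_regression(toy_dosages, toy_phenotypes)
```

In a real GWAS this per-SNP fit would be repeated across all genotyped variants, with covariates and a test statistic for each slope; dedicated software handles that, and the sketch only shows the shape of the per-SNP model.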

The results of this study showed that quantitative values can be used as a means of classifying human hair colors. Both models used in the GWAS highlighted previously known SNPs that contribute to quantitative hair color. The linear mixed model approach, which can provide more power because it accounts for hidden population structure, identified one near genome-wide significant SNP that is not currently linked with hair color, rs2037697 (IQUB), which showed a strong association with light brown hair (p-value = 1.83192E-07). However, this association would need to be confirmed in a larger sample.
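To put "near genome-wide significant" in context, the short sketch below compares the reported p-value with the conventional genome-wide threshold of 5E-08. That threshold is an assumption here, since the abstract does not state which cutoff was used, and the variable names are illustrative.

```python
import math

# Conventional genome-wide significance cutoff (assumed; the abstract
# does not state the threshold that was applied).
GENOME_WIDE_THRESHOLD = 5e-8

# p-value reported in the abstract for rs2037697 (IQUB).
p_value = 1.83192e-07

def neg_log10(p):
    """Return -log10(p), the scale commonly used on Manhattan plots."""
    return -math.log10(p)

is_significant = p_value < GENOME_WIDE_THRESHOLD
gap = neg_log10(GENOME_WIDE_THRESHOLD) - neg_log10(p_value)
```

On the -log10 scale the reported p-value is about 6.74 against a cutoff of about 7.30, so under this assumed threshold the SNP falls just short of genome-wide significance.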

The results of the correlation analysis showed that SNPs cited as having impacts on pigmentation (eye, skin, and hair) also show strong associations with these objectively defined quantitative hair color classes, and these rankings may prove very useful as the field moves towards quantitative hair color prediction.




Degree Type

  • Master of Science


  • Biological Sciences

Campus location

  • Indianapolis

Advisor/Supervisor/Committee Chair

Susan Walsh

Additional Committee Member 2

Kathleen Marrs

Additional Committee Member 3

Nicolas Berbari
