Purdue University Graduate School

Optimization of marker sets and tools for phenotype, ancestry, and identity using genetics and proteomics

posted on 2021-10-12, 12:25 authored by Bailey Mae Wills
In the forensic science community, there is a strong need for tools to assist investigations when standard DNA profiling methods are uninformative. Methods such as Forensic DNA Phenotyping (FDP) and proteomics aim to address this problem and aid investigations when other methods have been exhausted. FDP provides physical appearance information, while proteomics allows for the examination of difficult samples, such as hair, to infer human identity and ancestry. To create a “biological eyewitness” or develop informative probability of identity match statistics from proteomically inferred genetic profiles, these methods must be continually improved.

Currently, two developmentally validated FDP prediction assays, ‘HIrisPlex’ and ‘HIrisPlex-S’, are run on capillary electrophoresis (CE) to predict eye, hair, and skin color from 41 variants. Although highly useful, these assays are limited on the CE platform by a cap of 25 variants per assay. To overcome this limitation and expand the capacity of FDP, we designed and validated a massively parallel sequencing (MPS) assay for both the ThermoFisher Scientific Ion Torrent and Illumina MiSeq systems that incorporates all HIrisPlex-S variants into one sensitive assay. With the migration of this assay to an MPS platform, we were able to create both a semi-automated pipeline that extracts SNP-specific sequencing data for easy upload to the freely accessible online phenotypic prediction tool (found at https://hirisplex.erasmusmc.nl) and a mixture deconvolution tool with built-in read count thresholds. Based on sequencing read counts, this tool can assist in separating difficult two-person mixture samples and indicate the confidence in each genotype call.
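
As a rough illustration of how read count thresholds can gate a genotype call at a single SNP, the Python sketch below classifies a site from its reference and alternate read counts. It is not the tool described in this thesis: the function name and every threshold value (`min_depth`, `hom_max_minor`, `het_min_minor`) are hypothetical and chosen only for the example.

```python
def call_genotype(ref_reads, alt_reads, min_depth=20,
                  hom_max_minor=0.10, het_min_minor=0.30):
    """Classify one biallelic SNP from allele read counts.

    All thresholds are illustrative assumptions, not values from the thesis.
    """
    depth = ref_reads + alt_reads
    # Too few reads to support any call.
    if depth < min_depth:
        return "no-call"
    # Fraction of reads carried by the less frequent allele.
    minor_fraction = min(ref_reads, alt_reads) / depth
    if minor_fraction <= hom_max_minor:
        # Nearly all reads support one allele: homozygous call.
        return "ref/ref" if ref_reads >= alt_reads else "alt/alt"
    if minor_fraction >= het_min_minor:
        # Reads are roughly balanced: heterozygous call.
        return "ref/alt"
    # Imbalanced reads that fit neither pattern, as may occur in a mixture.
    return "ambiguous"
```

Under these assumed thresholds, a site with 48 reference and 52 alternate reads would be called heterozygous, while a site with 80 and 20 would be flagged as ambiguous and could be examined as a possible contribution from a second person.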

In addition to FDP, proteomic methods, specifically hair protein analysis, open new possibilities for forensic investigations when standard DNA profiling methods come up short. Here, we analyzed 233 genetically variant peptides (GVPs) within hair-associated proteins and genes for 66 individuals. We assessed the proteomic method's ability to accurately detect and infer genotypes at each of the 233 SNPs and generated statistics for the probability of identity (PID). Of these markers, 32 passed all quality control and population genetics criteria and displayed an average PID of 3.58 × 10⁻⁴. A population genetics assessment was also conducted to identify any SNPs that could be used to infer ancestry and/or identity. This information is valuable for the future use of this marker set for human identification in forensic science settings.
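
For readers unfamiliar with PID, the Python sketch below shows one common textbook way to compute it for biallelic SNPs: the per-locus PID is the sum of squared genotype frequencies, and the combined PID is the product across loci. It assumes Hardy-Weinberg equilibrium and independent loci, and it is not necessarily the calculation behind the average PID reported above. The function names and the allele frequencies in the usage note are hypothetical.

```python
from math import prod  # requires Python 3.8 or later


def locus_pid(p):
    """PID for one biallelic SNP with allele frequency p.

    Assumes Hardy-Weinberg genotype frequencies (an assumption of this sketch).
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    # Chance that two unrelated individuals share the same genotype.
    return sum(g * g for g in genotype_freqs)


def combined_pid(allele_freqs):
    """Multiply per-locus PIDs, assuming the loci are independent."""
    return prod(locus_pid(p) for p in allele_freqs)
```

For example, `locus_pid(0.5)` gives 0.375, the minimum for a single biallelic SNP, and `combined_pid([0.5, 0.5])` gives 0.140625. The combined value shrinks as more informative, independent markers are added.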




Degree Type

  • Master of Science


  • Forensic and Investigative Sciences

Campus location

  • Indianapolis

Advisor/Supervisor/Committee Chair

Susan Walsh

Additional Committee Member 2

Christine Picard

Additional Committee Member 3

David Skalnik