Purdue University Graduate School
Text_Mining_for_social_harm_and_criminal_justice_applications.pdf (1.26 MB)

Text mining for social harm and criminal justice applications

posted on 2020-07-30, 12:28 authored by Ritika Pandey
Increasing rates of social harm events and a plethora of text data call for text mining techniques, not only to better understand the causes of these events but also to develop optimal prevention strategies. In this work, we study three social harm issues: crime topic models, transitions into drug addiction, and homicide investigation chronologies. Topic modeling for the categorization and analysis of crime report text allows for more nuanced categories of crime than the official Uniform Crime Reporting (UCR) categorizations. This study has important implications for hotspot policing. We investigate the extent to which topic models that improve coherence lead to higher levels of crime concentration. We further explore transitions into drug addiction using Reddit data. We propose a prediction model that classifies users' transitions from casual drug discussion forums to recovery drug discussion forums and estimates the likelihood of such transitions. Through this study we offer insights into modern drug culture and provide tools with potential applications in combating the opioid crisis. Lastly, we present a knowledge-graph-based framework for homicide investigation chronologies that may aid investigators in analyzing homicide case data and also allow for post hoc analysis of the key features that determine whether a homicide is ultimately solved. For this purpose, we perform named entity recognition to identify witnesses, detectives, and suspects in the chronology, use keyword expansion to identify various evidence types, and finally link these entities and evidence to construct a homicide investigation knowledge graph. We compare performance across several choices of methodology for these sub-tasks and analyze the association between network statistics of the knowledge graph and homicide solvability.
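The linking step and the network statistics mentioned above can be illustrated with a minimal sketch. It is not the thesis implementation: the entries are hypothetical, hand-written stand-ins for the output of the named entity recognition and keyword expansion steps, the co-occurrence linking rule is an assumption, and the function names `build_graph` and `network_stats` are invented for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical chronology entries. Each one lists the entities and evidence
# types assumed to have been extracted from a single entry.
entries = [
    {"detective": ["Det. A"], "witness": ["Witness 1"], "evidence": ["video"]},
    {"detective": ["Det. A"], "suspect": ["Suspect X"], "evidence": ["firearm"]},
    {"detective": ["Det. B"], "witness": ["Witness 1"], "suspect": ["Suspect X"]},
]


def build_graph(entries):
    """Link every pair of nodes that co-occur in the same entry (assumed rule)."""
    adjacency = defaultdict(set)
    for entry in entries:
        # A node is a (kind, name) pair, e.g. ("witness", "Witness 1").
        nodes = [(kind, name) for kind, names in entry.items() for name in names]
        for u, v in combinations(nodes, 2):
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def network_stats(adjacency):
    """Compute node count, edge count, and density of the undirected graph."""
    n = len(adjacency)
    # Each undirected edge is stored twice, once per endpoint.
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": m, "density": density}


graph = build_graph(entries)
stats = network_stats(graph)
```

Statistics such as these could then serve as per-case features when studying solvability; which statistics the thesis actually uses is not stated in this abstract.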


NSF SCC-1737585

NSF ATD-1737996


Degree Type

  • Master of Science


Department

  • Computer Science

Campus location

  • Indianapolis

Advisor/Supervisor/Committee Chair

George Mohler

Additional Committee Member 2

Mohammad Al Hasan

Additional Committee Member 3

Snehasis Mukhopadhyay