Discovering Location Patterns in iOS Users
Discovering Location Patterns in iOS Users Utilizing Machine Learning Methods For Purposes of Digital Forensics Investigations
The proliferation of mobile devices and big data has put digital forensic investigators at a disadvantage. Despite technological advances, the tools and methods used in investigations have yet to catch up. With smartphones now integral to crime scenes, which often involve multiple devices, courts and law enforcement offices depend heavily on their data. In addition to traditional smartphone data, such as call logs, text messages, and emails, sensor data can drastically increase the chances of reconstructing the complete picture of events required for a successful investigation. Although sensor data are collected frequently, the sheer number of entries accumulated over time makes them noisy. In attempting to decipher the data and link them to the relevant events, digital forensic investigators are prone to missing or simply disregarding data extracted from smartphones. Interpreting sensor data that have already been collected and extracted, such as location and various phone activities, can reveal the two main links an investigation requires: time and location. Knowing where an individual was, and when, can significantly improve the investigation process and aid the final outcome. Although smartphones can collect sensor data from which these two variables can be derived, the interpretation of the data and the correlation between them still need improvement. This is particularly true for smartphones running newer operating system versions. Because extracting the data requires specialized forensic software and interpreting them requires further expertise, digital forensic investigators are either pressed for time or unequipped to process them.
To close this gap, the process must be automated so that it can handle large amounts of data while classifying the time and location relevant to the investigation. Reducing investigation times and increasing prediction accuracy will allow faster case resolution while freeing up desperately needed resources for digital forensic investigators. This study therefore presents a novel approach to identifying and predicting user locations using machine learning based on various sensor data collected from multiple smartphones. As the first step toward this goal, a user study was conducted to collect real-world data for training and testing the machine learning models. The process includes engineering the procedures and methodologies required to extract raw data and process them for successful model training. The results showed that the models can differentiate between three locations, with XGBoost achieving a test accuracy above 0.88. Random Forest Entropy and Random Forest Gini each achieved a test accuracy above 0.85. When only two locations were predicted, Random Forest Entropy and Random Forest Gini each achieved a test accuracy above 0.97.
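The abstract distinguishes "Random Forest Entropy" from "Random Forest Gini". Assuming these names refer to the two standard split criteria for decision trees, the following minimal sketch shows how each criterion scores the purity of a set of location labels, along with the test accuracy metric the results are reported in. The label names (`location_a`, `location_b`, `location_c`) and all function names are hypothetical and chosen only for illustration; this is not the study's code or data.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class."""
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k^2). Zero when all labels agree."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: sum(p_k * log2(1 / p_k)). Zero when pure."""
    return sum(p * math.log2(1.0 / p) for p in class_proportions(labels))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Hypothetical labels: a node holding an even mix of three locations,
# and a node holding samples from a single location.
mixed = ["location_a", "location_b", "location_c"] * 2
pure = ["location_a"] * 6

print("mixed node:", gini_impurity(mixed), entropy(mixed))
print("pure node: ", gini_impurity(pure), entropy(pure))
```

Both criteria are largest for an even mix of classes and fall to zero for a pure node, so a tree built with either one prefers splits that separate the locations cleanly. They differ in how strongly they penalize intermediate mixtures, which is one reason the two variants can reach slightly different accuracies on the same data.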
Degree Type
- Doctor of Philosophy
Department
- Computer and Information Technology
Campus location
- West Lafayette