Purdue University Graduate School

9 months and 26 days until the files become available

Discovering Location Patterns in iOS Users Utilizing Machine Learning Methods For Purposes of Digital Forensics Investigations

thesis
posted on 2024-08-06, 18:39, authored by Milos Stankovic

The proliferation of mobile devices and big data has put digital forensic investigators at a disadvantage. Despite all the technological advances, the tools and methods used during investigations still need to catch up. With smartphones becoming integral to crime scenes, which often contain several of them, courts and law enforcement offices greatly depend on their data. In addition to traditional smartphone data, such as call logs, text messages, and emails, sensor data can drastically increase the chances of resolving a case and painting the complete picture of events required for a successful investigation. While sensor data are collected frequently, they often create a lot of noise because of the number of entries accumulated over time. When attempting to decipher the data and link them to the relevant events, digital forensic investigators are prone to missing or simply disregarding data extracted from smartphones. Interpreting sensor data that have already been collected and extracted, such as location and various phone activities, can reveal the two main links required for the investigation: time and location. Knowing an individual's time and location can significantly improve the investigation process and aid the final outcome. Although smartphones are capable of collecting sensor data from which these two variables can be discovered, data interpretation and the correlation between the two still need to be improved. This is particularly true for smartphones with newer operating system versions. Because special forensic software is required to extract the data and expertise is required to interpret them, digital forensic investigators are either strained for time or unequipped to process them.

To close this gap, the process must be automated so that it can handle large amounts of data while classifying the time and location relevant to the investigation. Reducing investigation times and increasing prediction accuracy will allow faster case resolution while freeing up desperately needed resources for digital forensic investigators. Therefore, this study presents a novel approach to identifying and predicting user locations using machine learning based on various sensor data collected from multiple smartphones. As the first step toward that goal, a user study was conducted to collect real-world data for training and testing the machine learning models. The process includes engineering the procedures and methodologies required to extract the raw data and process them for successful model training. The results showed that the models are capable of differentiating between three different locations, with XGBoost reaching a test accuracy above 0.88. Additionally, Random Forest Entropy and Random Forest Gini achieved accuracy above 0.85. When only two locations were predicted, Random Forest Entropy and Random Forest Gini each achieved a test accuracy above 0.97.
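The abstract distinguishes two Random Forest variants by their split criterion, entropy and Gini. As a rough illustration only, and not the thesis's implementation, the sketch below computes both impurity measures for a set of location labels and shows how a candidate split lowers them. The label names ("home", "work", "gym"), the sample counts, and all function names are hypothetical, chosen for the example; the record does not list the actual locations or features.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: minus the sum of p * log2(p)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels) if p > 0)


def weighted_impurity(groups, impurity):
    """Impurity of a split: child impurities weighted by child size."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups)


# Hypothetical location labels (assumed for illustration, not from the thesis).
parent = ["home"] * 4 + ["work"] * 4 + ["gym"] * 4
# A hypothetical split of the parent node into two child nodes.
left = ["home"] * 4 + ["work"] * 1
right = ["work"] * 3 + ["gym"] * 4

for name, criterion in (("gini", gini_impurity), ("entropy", entropy)):
    before = criterion(parent)
    after = weighted_impurity([left, right], criterion)
    print(f"{name}: before={before:.3f} after={after:.3f} gain={before - after:.3f}")
```

A decision tree picks, at each node, the split with the largest impurity reduction under the chosen criterion; a Random Forest trains many such trees on resampled data and combines their votes. The two criteria usually rank splits similarly, which is consistent with the close accuracy figures reported for the two variants.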

Degree Type

  • Doctor of Philosophy

Department

  • Computer and Information Technology

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Umit Karabiyik

Additional Committee Member 2

Jin Wei-Kocsis

Additional Committee Member 3

Baijian Yang

Additional Committee Member 4

Smriti Bhatt

Additional Committee Member 5

Esra Akbas
