Purdue University Graduate School

LEAFRAG - OBJECT RECOGNITION EXPLANATION THROUGH MULTIMODAL RETRIEVAL-AUGMENTED GENERATION

thesis
posted on 2025-05-12, 18:17 authored by Zhuoyang Sun

Traditional object recognition systems often struggle with real-world complexities despite achieving high performance in controlled environments. This thesis presents LeafRAG, a novel approach that integrates multimodal Retrieval-Augmented Generation (RAG) with botanical knowledge to enhance object recognition, using leaf identification as the primary application domain.

The implemented system features a scalable architecture for processing multimodal data sources including text, images, and PDF documents, providing not only accurate identification results but also explanatory details that enhance educational value. Its modular design incorporates advanced embedding technologies and LLMs, with vector storage and retrieval mechanisms using FAISS.
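The retrieval step described above can be illustrated with a minimal sketch. It is not the thesis implementation: it replaces FAISS with a brute-force cosine-similarity search in plain Python so that it runs without dependencies, and the names `retrieve` and `store`, the three-dimensional vectors, and the document IDs are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, k=25):
    """Return the IDs of the k stored vectors most similar to the query.

    Brute-force stand-in for a vector index such as FAISS (assumption:
    the real system embeds text, images, and PDF chunks into one space).
    """
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings for chunks from different source types.
store = {
    "maple_text": [0.9, 0.1, 0.0],
    "maple_image": [0.8, 0.2, 0.1],
    "oak_text": [0.1, 0.9, 0.0],
    "pine_pdf": [0.0, 0.1, 0.9],
}

top = retrieve([1.0, 0.0, 0.0], store, k=2)
print(top)
```

In a RAG pipeline of this kind, the retrieved chunks would then be passed to the LLM as context for generating the identification and its explanation.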

Experimental evaluation across 13 tree species yielded promising results, with the reasoning-optimized model achieving 82.14% Top-1 accuracy and 89.29% Top-3 accuracy. Key findings include the substantial impact of the image comparison features (+10.71% accuracy improvement), the value of textbook-derived knowledge (+8.93% precision), and an optimal retrieval parameter of k = 25 that balances precision and response time.
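For readers unfamiliar with the Top-1 and Top-3 metrics reported above, the following sketch shows one common way to compute Top-k accuracy from ranked predictions. The function name `top_k_accuracy` and the toy predictions and labels are hypothetical and are not taken from the thesis data.

```python
def top_k_accuracy(ranked_predictions, labels, k):
    """Fraction of samples whose true label appears in the first k predictions."""
    if not labels:
        raise ValueError("labels must not be empty")
    hits = sum(
        1
        for ranked, label in zip(ranked_predictions, labels)
        if label in ranked[:k]
    )
    return hits / len(labels)


# Hypothetical ranked species predictions for four leaf images.
ranked_predictions = [
    ["maple", "oak", "elm"],
    ["oak", "maple", "elm"],
    ["elm", "oak", "maple"],
    ["birch", "elm", "oak"],
]
labels = ["maple", "maple", "elm", "ash"]

top1 = top_k_accuracy(ranked_predictions, labels, k=1)
top3 = top_k_accuracy(ranked_predictions, labels, k=3)
print(f"Top-1: {top1:.2%}, Top-3: {top3:.2%}")
```

Top-3 accuracy is always at least as high as Top-1, since any sample counted as a Top-1 hit is also a Top-3 hit.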

These results confirm that integrating scientific knowledge with modern AI techniques offers substantial capabilities for complex recognition tasks. Future research directions include hierarchical retrieval and validation mechanisms, GraphRAG incorporation, optimized embedding strategies, and multi-image processing capabilities to address more complex identification scenarios.

Degree Type

  • Master of Science

Department

  • Computer Graphics Technology

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Yingjie Chen

Additional Committee Member 2

Songlin Fei

Additional Committee Member 3

Hannah Yanhua Zong

Additional Committee Member 4

Dongfang Liu
