Purdue University Graduate School

<b>CEDARSAM: ENHANCING INSTANCE SEGMENTATION FOR CLASSIFICATION OF EASTERN RED CEDAR TREES FROM UAV IMAGERY</b>

thesis
posted on 2025-08-04, 19:42 authored by Minji Lee
<p dir="ltr">Artificial Intelligence (AI), especially computer vision techniques within Machine Learning (ML) and Deep Learning (DL), has played an important role in agriculture and forestry. Applications such as automated irrigation systems, agricultural drones for field analysis, and crop monitoring systems help address a range of challenges. One critical issue is the spread of invasive plant species such as Eastern Red Cedar (ERC) trees. Managing and controlling the growth and structure of ERC is a real-world concern whose scale and cost often make traditional theory-oriented approaches impractical. Many vision models, such as the Segment Anything Model (SAM), produce fragmented masks that fail to align with expert-defined boundaries in complex natural environments, so alternative approaches are required. One promising direction is to create multimodal solutions that combine vision and text information to enhance segmentation performance and contextual understanding.</p>
<p dir="ltr">To overcome these limitations, this research proposes CedarSAM, a fine-tuned model trained on ERC datasets collected via Unmanned Aerial Vehicle (UAV) imagery. The image data is enriched with spatial and contextual metadata and is further extended with high-resolution segmentation masks and aligned textual descriptions. Only the mask decoder is fine-tuned; the image encoder and prompt encoder remain frozen, which enables efficient domain adaptation under limited data conditions. The modeling pipeline begins with Convolutional Neural Network (CNN)-based classification, progresses through Faster R-CNN (both using ResNet-50 backbones), and culminates in SAM, which employs a Vision Transformer (ViT)-based architecture for advanced segmentation.</p>
<p dir="ltr">Despite the limited number of training samples, CedarSAM achieves notable improvements across key image segmentation metrics, including Intersection over Union (IoU), Dice score, precision, recall, and inference speed, demonstrating its robustness under data-scarce conditions. Beyond segmentation performance, CedarSAM enhances interpretability and field applicability through a rule-based metadata extraction pipeline that parses spatial information and structured descriptions from image-level annotations. This approach enables context-aware ecological recommendations, such as targeted removal for small trees, mechanical removal for mature specimens, and systematic strategies for clustered tree formations. The proposed methodology demonstrates both high segmentation accuracy and practical usability through structured post-processing, offering an accessible interface for non-expert users and field practitioners. This research lays the groundwork for real-time decision support systems in ecological management and provides structured image-text outputs that serve as foundational data for future Vision-Language Model (VLM) development.</p>
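<p dir="ltr">As an illustration of the metrics and the rule-based recommendations named in the abstract, the following is a minimal sketch and not the thesis implementation. The function names, the nested-list mask format, the crown-area attribute, the 4.0 m² threshold, and the choice to check clustering before size are all assumptions made for this example; the thesis does not state them here.</p>

```python
def segmentation_metrics(pred, truth):
    """Compute IoU, Dice, precision, and recall for two binary masks.

    Masks are equally sized nested lists of 0/1 values (an assumed format).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted tree, actually tree
            elif p:
                fp += 1          # predicted tree, actually background
            elif t:
                fn += 1          # missed tree pixel

    def ratio(num, den):
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


def recommend(crown_area_m2, is_clustered, small_tree_max_m2=4.0):
    """Map tree attributes to one of the strategies listed in the abstract.

    The 4.0 m^2 threshold and the rule order are hypothetical.
    """
    if is_clustered:
        return "systematic strategy"
    if crown_area_m2 <= small_tree_max_m2:
        return "targeted removal"
    return "mechanical removal"


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
metrics = segmentation_metrics(pred, truth)
print(metrics)
print(recommend(2.0, False), "|", recommend(10.0, False), "|", recommend(2.0, True))
```

<p dir="ltr">On the toy masks the IoU is 0.5 (2 overlapping pixels out of 4 in the union), and Dice, precision, and recall each equal 2/3. In practice the recommendation thresholds would come from domain experts rather than fixed constants.</p>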

Degree Type

  • Doctor of Philosophy

Department

  • Technology

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Eric T. Matson

Additional Committee Member 2

J. Eric Dietz

Additional Committee Member 3

John A. Springer

Additional Committee Member 4

Marcin Woźniak

Additional Committee Member 5

John C. Gallagher
