Reducing Wide-Area Satellite Data to Concise Sets for More Efficient Training and Testing of Land-Cover Classifiers

Chang, Tommy Y.

doi:10.25394/PGS.7857782.v1

Tommy_Chang_Dissertation_v3.pdf (7.49 MB)

thesis

posted on 2019-06-10, 17:27 authored by Tommy Y. Chang

Obtaining an accurate estimate of a land-cover classifier's performance over a wide geographic area is challenging because ground truth must be generated for the entire area, which may be thousands of square kilometers in size. The current best approach constructs a testing dataset by drawing samples randomly from the entire area, with a human supplying the true label for each sample, in the hope that the selections statistically capture all of the data diversity in the area. A major shortcoming of this approach is that it is difficult for a human to ensure that the information provided by the next data element chosen by the random sampler is non-redundant with respect to the data already collected. To reduce the annotation burden, it makes sense to remove redundancies from the entire dataset before presenting its samples to a human for annotation. This dissertation presents a framework that uses a combination of clustering and compression to create a concise-set representation of the land-cover data for a large geographic area. Clustering is achieved by applying Locality Sensitive Hashing (LSH) to the data elements, and compression is achieved by choosing a single data element to represent each cluster. This framework reduces the annotation burden on the human and makes it more likely that the human will persevere through the annotation stage. We validate the framework experimentally by comparing it with the traditional random sampling approach using WorldView-2 satellite imagery.
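The cluster-then-compress idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which LSH family the dissertation uses or how the representative of each cluster is chosen, so everything below is an assumption made for illustration: random-hyperplane (sign) hashing stands in for the LSH step, the member nearest the bucket centroid stands in for the representative, and the names `make_hyperplanes`, `lsh_key`, and `concise_set` are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vec, hyperplanes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the
    sign of the dot product (assumed LSH family, not taken from the source)."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def concise_set(vectors, n_bits=8, seed=0):
    """Bucket vectors by LSH key, then keep one index per bucket.

    Returns (representatives, buckets), where representatives is a sorted
    list of indices into vectors and buckets maps each key to its members.
    """
    if not vectors:
        return [], {}
    dim = len(vectors[0])
    planes = make_hyperplanes(dim, n_bits, seed)

    # Clustering step: vectors with the same key share a bucket.
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[lsh_key(vec, planes)].append(idx)

    # Compression step: keep the member closest to the bucket centroid
    # (an assumed selection rule; any single member would also qualify).
    representatives = []
    for members in buckets.values():
        centroid = [
            sum(vectors[i][d] for i in members) / len(members)
            for d in range(dim)
        ]
        best = min(
            members,
            key=lambda i: sum(
                (vectors[i][d] - centroid[d]) ** 2 for d in range(dim)
            ),
        )
        representatives.append(best)
    return sorted(representatives), dict(buckets)
```

Calling `concise_set(features)` on a list of per-sample feature vectors returns the indices that would be shown to the annotator, one per bucket, so the number of labels requested is the number of buckets rather than the number of samples. Increasing `n_bits` makes buckets finer and the concise set larger.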

Funding

IARPA FA8650-12-C-7214

Degree Type

Doctor of Philosophy

Department

Electrical and Computer Engineering

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Avinash Kak

Additional Committee Member 2

Charles Bouman

Additional Committee Member 3

Alexander Quinn

Additional Committee Member 4

Tanmay Prakash

Keywords

satellite data sets; big database mining; Human factors engineering; Unsupervised learning; image analysis techniques; landcover map; groundtruthing; Computer Engineering

Licence

CC BY 4.0
