Purdue University Graduate School

SENTIMENT ANALYSIS USING BERT ON YELP RESTAURANT REVIEWS

thesis
posted on 2022-08-03, 15:23 authored by Sunmin Lee

Yelp is a platform where users leave text-based reviews of products or services, along with photos and ratings from one to five stars. This study addresses two distinct problems with the platform. First, the sheer volume of text-based reviews often makes it impractical for a user to read every review. Second, Yelp's one-to-five-star rating system lacks specificity: when two customers give the same rating, the rating alone does not reveal the reasons behind each one.

To address these problems, the study focused on the initial stage of an algorithm by answering the research question, "Can the BERT model determine whether a customer's review on Yelp is positive or negative, and the degree of said positivity or negativity, based on the review's content?" To answer this question, the study followed a four-step research approach: (1) tokenization and stop-word removal, (2) keyword analysis, (3) preparation for the BERT model, and (4) training the BERT model.
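
Steps (1) and (2) of this approach can be illustrated with a minimal sketch. It is not the thesis's implementation: the stop-word list, the function names, and the sample reviews are all hypothetical, and only the Python standard library is used. BERT applies its own WordPiece tokenizer at training time, so word-level tokenization like this would serve the keyword analysis rather than the model input.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list chosen for this illustration.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "it", "but", "to", "of"}


def tokenize(review):
    """Lowercase a review and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", review.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def top_keywords(reviews, n=3):
    """Count the remaining tokens across all reviews and return the n most frequent."""
    counts = Counter()
    for review in reviews:
        counts.update(remove_stop_words(tokenize(review)))
    return counts.most_common(n)


# Made-up reviews, not taken from the Yelp data used in the study.
sample_reviews = [
    "The pizza was great and the service was great.",
    "Service was slow, but the pizza is worth it.",
]

keywords = top_keywords(sample_reviews)
print(keywords)
```

In this sketch the most frequent remaining words ("pizza", "great", "service") act as candidate keywords; a real analysis would likely use a fuller stop-word list and a much larger review sample.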

The results obtained from this approach supported an affirmative answer to the research question. The study concludes by summarizing its limitations and outlining the future development of the algorithm, which would build on this initial stage to assign a one-to-five ranking to each pre-defined category based on the content of the text-based reviews.

Degree Type

  • Master of Science

Department

  • Computer and Information Technology

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Eric T. Matson

Additional Committee Member 2

Julia M. Rayz

Additional Committee Member 3

Baijian Yang
