Purdue University Graduate School
Towards Improving Software Reliability through Modern Natural Language Processing Techniques

thesis
posted on 2025-07-25, 12:08, authored by Danning Xie
Software reliability is of vital importance given software's increasingly critical role in areas like healthcare and transportation, where failures can lead to significant monetary losses and even loss of human life. Unreliability can arise at any of the four primary stages of the Software Development Life Cycle (SDLC): Planning and Designing, Implementation, Quality Assurance, and Deployment. Each stage presents unique challenges, including bridging the gap between informal documentation and formal specifications, generating reliable code, producing effective test cases, and securely auditing binaries. Traditional approaches to these challenges often struggle with scalability, generalizability, and practical effectiveness, particularly when tackling inherently complex or undecidable problems.

In response to these persistent challenges, advances in modern Natural Language Processing (NLP) techniques, including Large Language Models (LLMs), offer promising new directions. These models have shown strong capabilities in understanding and generating natural language and other structured artifacts. This thesis investigates the application of modern NLP techniques to improve software reliability. It presents novel applications of NLP techniques across key software engineering tasks throughout the SDLC and provides empirical evaluations to assess their capabilities and feasibility.

The thesis explores the application of NLP techniques through five core projects. It begins by demonstrating how NLP can bridge the gap between informal documentation and formal specifications through DocTer, a pattern-based technique that extracts input constraints from natural language to guide the testing of deep learning libraries. This approach is then extended into CEDAR, a continuous testing framework designed to support the reliability of rapidly evolving deep learning libraries through scalable and automated testing across versions. The work also conducts the first empirical study evaluating the effectiveness of LLMs in generating formal specifications from software documentation and comments, comparing them to traditional methods. Furthermore, it introduces ReSym, a hybrid approach combining LLMs with program analysis to recover symbolic information from stripped binaries for reverse engineering and security purposes. Finally, the thesis presents CoRe, a benchmark designed to evaluate LLMs' code reasoning capabilities through fundamental static analysis tasks.

Together, these projects demonstrate how modern NLP techniques, encompassing both traditional pattern-based methods and LLMs, can be effectively leveraged across various stages of the SDLC to improve software reliability. The research shows that these approaches can be applied individually or combined with traditional techniques for more reliable and scalable solutions. The empirical evaluations provide valuable insights into the feasibility, strengths, and current limitations of LLM-based methods for tasks like specification generation and code reasoning. By highlighting both the practical applications and the existing challenges for LLMs, this work guides future research toward developing more robust and dependable NLP-driven software engineering practices.
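To make the pattern-based idea behind DocTer concrete, the following is a minimal, hypothetical sketch of extracting input constraints from API documentation text with regular expressions. The docstring text, parameter names, and patterns here are invented for illustration and are not DocTer's actual patterns or any real library's documentation; DocTer itself is considerably more sophisticated.

```python
import re

# Toy docstring in the style of a deep learning API reference
# (hypothetical text; not taken from any real library).
DOC = """
x: A Tensor. Must be one of the following types: float32, float64.
axis: An int. Must be in the range [-rank(x), rank(x)).
"""

# Simple phrase patterns mapped to constraint categories, loosely
# inspired by the pattern-based extraction style described above.
PATTERNS = {
    "dtype": re.compile(r"Must be one of the following types: ([\w, ]+)\."),
    "range": re.compile(r"Must be in the range (\[[^\]]*[\)\]])"),
}

def extract_constraints(doc: str) -> dict:
    """Return {parameter: {constraint_kind: value}} parsed from a docstring."""
    constraints = {}
    for line in doc.strip().splitlines():
        # Each line has the form "param: description".
        param, _, rest = line.partition(":")
        found = {}
        for kind, pattern in PATTERNS.items():
            match = pattern.search(rest)
            if match:
                found[kind] = match.group(1).strip()
        if found:
            constraints[param.strip()] = found
    return constraints

print(extract_constraints(DOC))
# → {'x': {'dtype': 'float32, float64'}, 'axis': {'range': '[-rank(x), rank(x))'}}
```

Constraints extracted this way (valid dtypes, value ranges) can then seed a fuzzer with both conforming inputs, to exercise normal behavior, and violating inputs, to probe error handling.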

History

Degree Type

  • Doctor of Philosophy

Department

  • Computer Science

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Lin Tan

Advisor/Supervisor/Committee co-chair

Xiangyu Zhang

Additional Committee Member 2

Dan Goldwasser

Additional Committee Member 3

Tianyi Zhang
