Interpretable natural language processing models with deep hierarchical structures and effective statistical training
The research focuses on improving natural language processing (NLP) models by integrating the hierarchical structure of language, which is essential for understanding and generating human language. The main contributions of the study are:
- Hierarchical RNN Model: Development of a deep Recurrent Neural Network model that captures both explicit and implicit hierarchical structures in language.
- Hierarchical Attention Mechanism: Use of a multi-level attention mechanism to help the model prioritize relevant information at different levels of the hierarchy.
- Latent Indicators and Efficient Training: Integration of latent indicators using the Expectation-Maximization algorithm and reduction of computational complexity with Bootstrap sampling and layered training strategies.
- Sequence-to-Sequence Model for Translation: Extension of the model to translation tasks, including a novel pre-training technique and a hierarchical decoding strategy to stabilize latent indicators during generation.
The study claims enhanced performance in various NLP tasks with results comparable to larger models, with the added benefit of increased interpretability.
- Doctor of Philosophy
- West Lafayette