File(s) under embargo: 4 month(s) and 15 day(s) until file(s) become available

Multiple Learning for Generalized Linear Models in Big Data

Thesis posted on 19.12.2021, 17:27 by Xiang Liu
Big data is an enabling technology in digital transformation. It complements ordinary linear models and generalized linear models well, because training these models to perform well requires large amounts of data. With the help of big data, ordinary and generalized linear models can be well trained and thus offer better services. However, many challenges remain in training ordinary and generalized linear models on big data. One of the most prominent is computational: memory inflation and training inefficiency that occur when processing data and training models. Hundreds of algorithms have been proposed to alleviate or overcome the memory inflation issue, but the solutions they obtain are only locally optimal. Additionally, most of these algorithms require loading the dataset into RAM many times while updating the model parameters, and when multiple model hyper-parameters must be computed and compared, e.g. in ridge regression, parallel computing techniques are applied in practice. This thesis therefore proposes multiple learning with sufficient statistics arrays to tackle the memory inflation and training inefficiency issues.
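The sufficient-statistics idea for linear models can be pictured as follows: a single pass over the data accumulates the arrays X'X and X'y, whose sizes depend only on the number of features, after which any number of ridge hyper-parameters can be fitted and compared without reading the raw data again. The sketch below is illustrative only, assuming simulated chunked data; it is not the thesis's actual algorithm or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data stream: in practice each chunk would be read from disk,
# so the full dataset never needs to fit in RAM at once.
n_chunks, chunk_size, p = 10, 100, 5
beta_true = np.arange(1.0, p + 1.0)  # illustrative ground-truth coefficients

# One pass over the data accumulates the sufficient statistics
# XtX = X'X (p x p) and Xty = X'y (length p); their sizes do not grow with n.
XtX = np.zeros((p, p))
Xty = np.zeros(p)
for _ in range(n_chunks):
    X = rng.standard_normal((chunk_size, p))
    y = X @ beta_true + 0.1 * rng.standard_normal(chunk_size)
    XtX += X.T @ X
    Xty += X.T @ y

# Multiple ridge hyper-parameters are now compared without touching the raw
# data again: solve (X'X + lam * I) beta = X'y for each candidate lam.
lambdas = [0.0, 0.1, 1.0, 10.0]
betas = {lam: np.linalg.solve(XtX + lam * np.eye(p), Xty) for lam in lambdas}
```

With lam = 0 this reduces to ordinary least squares; larger lam values shrink the coefficients, and the per-lam cost is a p-by-p solve rather than another pass over the data.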

History

Degree Type

Doctor of Philosophy

Department

Computer and Information Technology

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Baijian Yang

Advisor/Supervisor/Committee co-chair

Tonglin Zhang

Additional Committee Member 2

Jin Wei-Kocsis

Additional Committee Member 3

Abdul Salam