Zhiwen_Dissertation.pdf
In this work, we presented a novel approach to the mathematical representation of facial pose, followed by the design of a neural network (NN) capable of leveraging these representations to solve the task of facial pose estimation. Our core contribution lay in the development of advanced mathematical representations for face orientation, which included 1) a three-column-vector-based representation, 2) an Anisotropic Spherical Gaussian (ASG)-based Label Distribution Learning (LDL) representation, and 3) an SO(3) Hopf coordinate-based LDL representation. These representations provided continuous and unique descriptions of facial orientation and avoided both the gimbal lock issue of Euler angles and the antipodal issue of quaternions. Building upon these mathematical representations, we designed neural network architectures specifically to utilize them. Key components of our NN design included 1) an orthogonal loss function for the column-vector-based representation, which encouraged the orthogonality of the predicted vectors, and 2) dynamic distribution parameter learning for the ASG- and SO(3)-based LDL representations, which allowed the NN to adaptively adjust the contributions of adjacent labels. Our proposed mathematical representations of rotations, combined with our NN architectures, provided a powerful framework for robust and accurate facial pose estimation.
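Two of the ideas above can be illustrated with a short NumPy sketch. It is not taken from the dissertation: the function names `orthogonality_penalty` and `quat_to_matrix` are hypothetical, and the penalty shown (the squared Frobenius norm of MᵀM − I for the matrix M whose columns are the three predicted vectors) is only one common way to encourage orthogonality, assumed here for illustration. The second function shows the antipodal issue of quaternions: q and −q map to the same rotation matrix, so a quaternion label is not unique.

```python
import numpy as np


def orthogonality_penalty(v1, v2, v3):
    """Assumed form of an orthogonality loss: ||M^T M - I||_F^2,
    where M has the three predicted vectors as its columns."""
    m = np.stack([v1, v2, v3], axis=1)  # shape (3, 3), vectors as columns
    gram = m.T @ m                      # pairwise dot products
    return float(np.sum((gram - np.eye(3)) ** 2))


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)  # normalize to a unit quaternion
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


# Arbitrary example quaternion (illustrative values only).
q = np.array([0.9, 0.1, -0.3, 0.2])
r = quat_to_matrix(q)

# Columns of a true rotation matrix are orthonormal: penalty is ~0.
clean = orthogonality_penalty(r[:, 0], r[:, 1], r[:, 2])

# A skewed, non-orthogonal prediction is penalized.
skewed = orthogonality_penalty(
    np.array([1.0, 0.0, 0.0]),
    np.array([0.1, 1.0, 0.0]),
    np.array([0.0, 0.0, 1.0]),
)

# Antipodal ambiguity: q and -q describe the same rotation.
same_rotation = np.allclose(quat_to_matrix(q), quat_to_matrix(-q))
```

In this sketch the penalty vanishes only when the three vectors are mutually orthogonal and of unit length, so the predicted columns can be read directly as a rotation matrix, which has a single representation per orientation, unlike the quaternion pair q and −q.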
Degree Type
- Doctor of Philosophy
Department
- Technology
Campus location
- West Lafayette