Purdue University Graduate School

Optimal Transport Theory and Applications to Unsupervised Learning and Asset Pricing

thesis
posted on 2024-07-30, 12:25 authored by Marcelo Cruz de Souza

This thesis presents results in Optimal Transport theory and applications to unsupervised statistical learning and robust asset pricing. In unsupervised learning applications, we assume that we observe the distribution of some data of interest, which might be too large, have a high-dimensional structure, or be polluted with noise. We investigate the construction of an optimal distribution that precedes the given data distribution in convex order, which means that the given distribution is a dispersion of it. The intention is to use this construction to estimate a concise, lower-dimensional, or unpolluted version of the given data. We provide existence and convergence results and show that popular methods, including k-means and principal curves, can be unified under this model. We further investigate a relaxation of the order relation that leads to similar existence and convergence results and broadens the range of applications to include, e.g., Principal Component Analysis and the Factor model. We relate the two versions and show that the relaxed problem can be described as a bilinear optimization with a tractable computational method. As examples, we apply our method to generate fixed-weight k-means, principal curves with bounded curvature that are actual generalizations of PCA, and a latent factor structure in a classical Gaussian setting.
In robust finance applications, we investigate the Vectorial Martingale Optimal Transport problem, the geometry of its solutions, and its application to model-free asset pricing. We consider a multi-asset, two-period contract pricing model and show that the solution to this problem with a sub- or supermodular payoff function reduces to a single factor in the first period in the case of two underlying assets (d = 2), but not in general for a greater number of assets. This result for d = 2 enables the construction of a joint distribution of prices at the first period from market data, which adds information to the model-free pricing method and reduces the computational dimensionality. We provide an improved version of an existing pricing method and show numerical evidence of increased accuracy.
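
The convex-order relation mentioned in the abstract can be illustrated with a small numerical sketch. This is not the thesis's method, only a toy check of a standard fact: if data are partitioned into cells and each cell is replaced by its mean (as in a k-means step), then by Jensen's inequality the resulting weighted distribution of means precedes the empirical data distribution in convex order. The one-dimensional Gaussian sample, the fixed thresholds defining the cells, and the helper names `cluster_means` and `expectation` are all assumptions made for this illustration.

```python
import random

def cluster_means(points, labels, k):
    """Return the means and probability weights of the non-empty cells."""
    sums = [0.0] * k
    counts = [0] * k
    for x, j in zip(points, labels):
        sums[j] += x
        counts[j] += 1
    used = [j for j in range(k) if counts[j] > 0]
    means = [sums[j] / counts[j] for j in used]
    weights = [counts[j] / len(points) for j in used]
    return means, weights

def expectation(f, atoms, weights):
    """Expectation of f under a discrete distribution (atoms, weights)."""
    return sum(w * f(a) for a, w in zip(atoms, weights))

random.seed(0)
# Assumed toy data: 1000 draws from a standard Gaussian.
data = [random.gauss(0.0, 1.0) for _ in range(1000)]
uniform = [1.0 / len(data)] * len(data)

# Assumed partition into three cells by fixed thresholds; any partition works.
labels = [0 if x < -0.5 else (1 if x < 0.5 else 2) for x in data]
means, weights = cluster_means(data, labels, 3)

# A few convex test functions; convex order requires E[f(means)] <= E[f(data)].
convex_tests = {
    "square": lambda x: x * x,
    "abs": abs,
    "call": lambda x: max(x - 0.3, 0.0),
}
gaps = {
    name: expectation(f, data, uniform) - expectation(f, means, weights)
    for name, f in convex_tests.items()
}

for name, gap in gaps.items():
    print(f"{name}: E[f(data)] - E[f(means)] = {gap:.6f}")
```

Each printed gap should be nonnegative, and the two distributions share the same mean, which is what "the given distribution is a dispersion of it" amounts to in this toy setting. Checking a handful of convex functions is only a sanity check, not a proof of convex order.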

Degree Type

  • Doctor of Philosophy

Department

  • Management

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Tongseok Lim

Additional Committee Member 2

Phillip Thompson

Additional Committee Member 3

Yichen Zhang

Additional Committee Member 4

Alex Liheng Wang
