Mechanical and Civil Engineering Seminar: PhD Thesis Defense
Abstract:
This talk studies operator learning from a statistical perspective. Operator learning uses observed data to estimate mappings between infinite-dimensional spaces. It does so conceptually at the continuum level, leading to discretization-independent machine learning methods when implemented in practice. Although this framework shows promise for accelerating and discovering physical models, the mathematical theory of operator learning lags behind its empirical success. Motivated by scientific computing and inverse problems, where the available data are often scarce, this talk develops scalable algorithms for operator learning and theoretical insights into their data efficiency. The talk begins by introducing the function-valued random features method and applying it to learn nonlinear solution operators of parametric partial differential equations. A statistical analysis derives state-of-the-art error bounds for the trained method and establishes its robustness to errors stemming from noisy observations and model misspecification. Next, the talk tackles fundamental statistical questions about how problem structure, data quality, and prior information influence learning accuracy. Specializing to a linear setting, a sharp Bayesian nonparametric analysis shows that continuum linear operators, such as the integration or differentiation of spatially varying functions, are provably learnable from noisy input-output data pairs. When only specific linear functionals of the operator's output are the primary quantities of interest, the final part of the talk proves that the smoothness of the functionals determines whether learning directly from these finite-dimensional observations carries a statistical advantage over plug-in estimators based on learning the entire operator.
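To make the linear setting concrete, the following is a minimal sketch (not the speaker's code, and a deliberately simplified setup) of learning a continuum linear operator from noisy input-output pairs: the integration operator is discretized on a grid, random smooth inputs are paired with noisy integrated outputs, and the operator matrix is estimated by ridge-regularized least squares. The grid size, input distribution, noise level, and regularization strength are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: estimate the integration operator
# G: u -> v,  v(x) = integral of u from 0 to x,
# from noisy (input, output) function pairs discretized on a grid.

rng = np.random.default_rng(0)
n_grid = 64                          # grid resolution (assumed)
x = np.linspace(0.0, 1.0, n_grid)
dx = x[1] - x[0]

def integrate(u):
    """Cumulative trapezoidal integral of u on the grid (ground truth)."""
    return np.concatenate([[0.0], np.cumsum((u[1:] + u[:-1]) * dx / 2.0)])

# Random smooth inputs: truncated Fourier series with decaying coefficients.
n_train, noise = 200, 1e-3
U = np.zeros((n_train, n_grid))
for k in range(1, 9):
    U += (rng.standard_normal((n_train, 1)) / k) * np.sin(np.pi * k * x)
V = np.array([integrate(u) for u in U])
V += noise * rng.standard_normal(V.shape)   # noisy observed outputs

# Ridge regression for the operator matrix A, so that V ≈ U @ A.T
lam = 1e-6
A = np.linalg.solve(U.T @ U + lam * np.eye(n_grid), U.T @ V).T

# Evaluate on a held-out input function.
u_test = np.sin(2 * np.pi * x)
v_hat = A @ u_test
v_true = integrate(u_test)
rel_err = np.linalg.norm(v_hat - v_true) / np.linalg.norm(v_true)
print(f"relative test error: {rel_err:.3e}")
```

The estimated matrix plays the role of the learned operator; refining the grid while keeping the estimator well regularized is what a discretization-independent formulation makes principled.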
To validate the findings beyond linear problems, the talk develops practical deep operator learning architectures for nonlinear maps that send functions to vectors, or vice versa, and establishes their universal approximation properties. Altogether, the theoretical and methodological contributions of this talk advance the reliability and efficiency of operator learning for continuum problems in the physical and data sciences.