From a35c5aaa85c66e1b72099eb97b140d49ee47697a Mon Sep 17 00:00:00 2001
From: Yimin Zhong
Date: Sun, 23 Jun 2024 04:12:09 +0000
Subject: [PATCH] Update Awards

---
 .../Awards-Computational-Mathematics-2024.csv | 5 +++--
 Statistics/Awards-Statistics-2024.csv | 9 ++++++---
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/Computational-Mathematics/Awards-Computational-Mathematics-2024.csv b/Computational-Mathematics/Awards-Computational-Mathematics-2024.csv
index 5c5ca05..f21404c 100644
--- a/Computational-Mathematics/Awards-Computational-Mathematics-2024.csv
+++ b/Computational-Mathematics/Awards-Computational-Mathematics-2024.csv
@@ -1,8 +1,10 @@
"AwardNumber","Title","NSFOrganization","Program(s)","StartDate","LastAmendmentDate","PrincipalInvestigator","State","Organization","AwardInstrument","ProgramManager","EndDate","AwardedAmountToDate","Co-PIName(s)","PIEmailAddress","OrganizationStreet","OrganizationCity","OrganizationState","OrganizationZip","OrganizationPhone","NSFDirectorate","ProgramElementCode(s)","ProgramReferenceCode(s)","ARRAAmount","Abstract"
+"2411264","NSF-BSF: Scalable Graph Neural Network Algorithms and Applications to PDEs","DMS","COMPUTATIONAL MATHEMATICS","08/01/2024","06/21/2024","Lars Ruthotto","GA","Emory University","Continuing Grant","Troy D. Butler","07/31/2027","$121,190.00","","lruthotto@emory.edu","201 DOWMAN DR NE","ATLANTA","GA","303221061","4047272503","MPS","127100","079Z, 9263","$0.00","This project will advance the fields of geometric machine learning and numerical partial differential equations and strengthen the connections between them. Geometric machine learning provides an effective approach for analyzing unstructured data and has become indispensable for computer graphics and vision, bioinformatics, social network analysis, protein folding, and many other areas. Partial differential equations (PDEs) are ubiquitous in mathematical modeling, and their numerical solution enables the simulation of real-world phenomena in engineering design, medical analysis, and material sciences, to name a few. A unified study of both fields exposes many potential synergies, which the project will seize to improve the efficiency of algorithms in both areas. The first goal is to improve the scalability of geometric machine learning approaches based on graph neural networks (GNNs) to accommodate growing datasets with millions of nodes using insights and ideas from numerical PDEs. The second goal is to accelerate numerical PDE simulations by enhancing numerical solvers on unstructured meshes with GNN components. Through these improvements in computational efficiency, the project will enable more accurate data analysis and PDE simulations for high-impact applications across the sciences, engineering, and industry. Graduate students and postdoctoral researchers will be integrated into this research as part of their professional training.

This project will develop computational algorithms that improve the efficiency and scalability of GNNs and create new approaches for GNNs for solving nonlinear PDEs on unstructured meshes. To improve the scalability of GNNs to graphs with millions of nodes, the research team will develop spatial smoothing operators, coarsening operators, and multilevel training schemes. To accelerate PDE simulations on unstructured meshes, the team will train GNNs to produce effective prolongation, restriction, and coarse mesh operators in multigrid methods and preconditioners in Krylov methods. The team will demonstrate that the resulting hybrid schemes accelerate computations and are provably convergent. To show the broad applicability of the schemes, the team will consider challenging PDE problems in computational fluid dynamics and test the scalable GNNs on established geometric learning benchmark tasks such as shape and node classification. The mathematical backbone of these developments is algebraic multigrid techniques, which motivate GNN design and training and are used in the PDE solvers.
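To make the hybrid-solver idea above concrete, here is a minimal two-grid sketch for a 1D Poisson problem, in numpy-only Python. The prolongation operator `P` below is classical linear interpolation standing in for the GNN-produced prolongation, restriction, and coarse operators the abstract describes; the grid sizes, smoother, and right-hand side are illustrative choices, not project code.

```python
import numpy as np

def poisson_1d(n):
    """Second-order finite-difference Laplacian on n interior points."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def linear_prolongation(n_coarse):
    """Classical linear interpolation; a trained GNN would supply this map."""
    n_fine = 2 * n_coarse + 1
    P = np.zeros((n_fine, n_coarse))
    for j in range(n_coarse):
        P[2 * j, j] = 0.5
        P[2 * j + 1, j] = 1.0
        P[2 * j + 2, j] = 0.5
    return P

def two_grid(A, b, P, x, nu=3, omega=2.0 / 3.0):
    """One V-cycle on two levels: smooth, coarse-grid correction, smooth."""
    D = np.diag(A)
    for _ in range(nu):                      # weighted-Jacobi pre-smoothing
        x = x + omega * (b - A @ x) / D
    Ac = P.T @ A @ P                         # Galerkin coarse operator
    x = x + P @ np.linalg.solve(Ac, P.T @ (b - A @ x))
    for _ in range(nu):                      # post-smoothing
        x = x + omega * (b - A @ x) / D
    return x

n_c = 31
A = poisson_1d(2 * n_c + 1)
b = np.ones(2 * n_c + 1)
P = linear_prolongation(n_c)
x = np.zeros_like(b)
for k in range(8):
    x = two_grid(A, b, P, x)
    print(f"cycle {k}: residual {np.linalg.norm(b - A @ x):.3e}")
```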

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2414705","CAREER: Mathematical Modeling from Data to Insights and Beyond","DMS","COMPUTATIONAL MATHEMATICS","01/15/2024","01/22/2024","Yifei Lou","NC","University of North Carolina at Chapel Hill","Continuing Grant","Yuliya Gorb","05/31/2025","$141,540.00","","yflou@unc.edu","104 AIRPORT DR STE 2200","CHAPEL HILL","NC","275995023","9199663411","MPS","127100","1045, 9263","$0.00","This project will develop both analytical and computational tools for data-driven applications. In particular, analytical tools will hold great promise to provide theoretical guidance on how to acquire data more efficiently than current practices. To retrieve useful information from data, numerical methods will be investigated with emphasis on guaranteed convergence and algorithmic acceleration. Thanks to close interactions with collaborators in data science and information technology, the investigator will ensure the practicability of the proposed research, leading to a real impact. The investigator will also devote herself to various outreach activities in the field of data science. For example, she will initiate a local network of students, faculty members, and domain experts to develop close ties between mathematics and industry as well as to broaden career opportunities for mathematics students. This initiative will have a positive impact on the entire mathematical sciences community. In addition, she will advocate for the integration of mathematical modeling into K-16 education by collaborating with The University of Texas at Dallas Diversity Scholarship Program to reach out to mathematics/sciences teachers.

This project addresses important issues in extracting insights from data and training the next generation in the ""big data"" era. The research focuses on signal/image recovery from a limited number of measurements, in which ""limited"" refers to the fact that the amount of data that can be taken or transmitted is limited by technical or economic constraints. When data is insufficient, one often requires additional information from the application domain to build a mathematical model, followed by numerical methods. Questions to be explored in this project include: (1) how difficult is the process of extracting insights from data? (2) how should reasonable assumptions be taken into account to build a mathematical model? (3) how should an efficient algorithm be designed to find a model solution? More importantly, a feedback loop from insights to data will be introduced, i.e., (4) how to improve upon data acquisition so that information becomes easier to retrieve? As these questions mimic the standard procedure in mathematical modeling, the proposed research provides a plethora of illustrative examples to enrich the education of mathematical modeling. In fact, one of this CAREER award's educational objectives is to advocate the integration of mathematical modeling into K-16 education so that students will develop problem-solving skills in early ages. In addition, the proposed research requires close interactions with domain experts in business, industry, and government (BIG), where real-world problems come from. This requirement helps to fulfill another educational objective, that is, to promote BIG employment by providing adequate training for students in successful approaches to BIG problems together with BIG workforce skills.
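The recovery-from-limited-measurements setting described here has a textbook instance: sparse recovery by l1-regularized least squares. The sketch below, a plain ISTA (iterative soft-thresholding) loop on a synthetic compressed-sensing problem, is our own illustration of that setting; the award does not prescribe this algorithm, and all dimensions and weights are demo choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 200, 60, 5                      # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)    # "limited" random measurements
b = A @ x_true

lam = 0.01                                # l1 weight, hand-tuned for the demo
step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L, L = Lipschitz constant of grad
x = np.zeros(n)
for _ in range(500):
    g = x - step * (A.T @ (A @ x - b))    # gradient step on the data-fit term
    x = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # prox of lam*||.||_1
print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```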

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2410671","Robust Algorithms Based on Domain Decomposition and Microlocal-Analysis for Wave propagation","DMS","COMPUTATIONAL MATHEMATICS","07/01/2024","06/14/2024","Yassine Boubendir","NJ","New Jersey Institute of Technology","Standard Grant","Ludmil T. Zikatanov","06/30/2027","$200,000.00","","boubendi@njit.edu","323 DR MARTIN LUTHER KING JR BLV","NEWARK","NJ","071021824","9735965275","MPS","127100","9263","$0.00","More than ever, technological advances in industries such as aerospace, microchips, telecommunications, and renewable energy rely on advanced numerical solvers for wave propagation. The aim of this project is the development of efficient and accurate algorithms for acoustic and electromagnetic wave propagation in complex domains containing, for example, inlets, cavities, or a multilayer structure. These geometrical features continue to pose challenges for numerical computation. The numerical methods developed in this project will have application to radar, communications, remote sensing, stealth technology, satellites, and many others. Fundamental theoretical and computational issues as well as realistic complex geometries such as those occurring in aircraft and submarines will be addressed in this project. The obtained algorithms will facilitate the use of powerful computers when simulating industrial high-frequency wave problems. The numerical solvers obtained through this research will be made readily available to scientists in aerospace and other industries, which will contribute to enhancing the U.S. leadership in this field. Several aspects in this project will benefit the education of both undergraduate and graduate students. Graduate students will gain expertise in both scientific computing and mathematical analysis. This will reinforce their preparation to face future challenges in science and technology.

The aim of this project is the development of efficient and accurate algorithms for acoustic and electromagnetic wave propagation in complex domains. One of the main goals of this project resides in the design of robust algorithms based on high-frequency integral equations, microlocal and numerical analysis, asymptotic methods, and finite element techniques. The investigator plans to derive rigorous asymptotic expansions for incidences more general than plane waves in order to support the high-frequency integral equation multiple scattering iterative procedure. The investigator will introduce Ray-stabilized Galerkin boundary element methods, based on a new theoretical development on ray tracing, to significantly reduce the computational cost at each iteration and limit the exponentially increasing cost of multiple scattering iterations to a fixed number. Using the theoretical findings in conjunction with the stationary phase lemma, frequency-independent quadratures for approximating the multiple scattering amplitude will also be designed. These new methods will be beneficial for industrial applications involving multi-component radar and antenna design. In addition, this project includes development of new non-overlapping domain decomposition methods with considerably enhanced convergence characteristics. The main idea resides in a novel treatment of the continuity conditions in the neighborhood of the so called cross-points. Analysis of the convergence and stability will be included in parallel to numerical simulations in the two and three dimensional cases using high performance computing.
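As a side illustration of why the stationary phase lemma enables frequency-independent quadrature, compare brute-force quadrature of a model oscillatory integral with its one-term stationary-phase approximation. The integral is a toy of our choosing, unrelated to the project's scattering kernels:

```python
import numpy as np

def brute(k, n):
    """Trapezoid rule for I(k) = integral of exp(i*k*x^2) over [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n)
    y = np.exp(1j * k * x**2)
    return (0.5 * (y[0] + y[-1]) + y[1:-1].sum()) * (x[1] - x[0])

def stationary_phase(k):
    """Leading term from the stationary point x* = 0, where phi''(0) = 2."""
    return np.sqrt(np.pi / k) * np.exp(1j * np.pi / 4)

for k in [50, 200, 800]:
    ref = brute(k, 200_001)               # heavily resolved reference value
    err = abs(stationary_phase(k) - ref)
    print(f"k = {k:4d}: one-term approximation error {err:.2e}")
# the error decays like 1/k while the cost of brute() grows with k
```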

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2411396","Interacting particle system for nonconvex optimization","DMS","COMPUTATIONAL MATHEMATICS","07/01/2024","06/13/2024","Yuhua Zhu","CA","University of California-San Diego","Continuing Grant","Troy D. Butler","06/30/2027","$80,048.00","","yuz244@ucsd.edu","9500 GILMAN DR","LA JOLLA","CA","920930021","8585344896","MPS","127100","9263","$0.00","Collective Intelligence offers profound insights into how groups, whether they be cells, animals, or even machines, can work together to accomplish tasks more effectively than individuals alone. Originating in biology and now influencing fields as varied as management science, artificial intelligence, and robotics, this concept underscores the potential of collaborative efforts in solving complex challenges. On the other hand, the quest for finding global minimizers of nonconvex optimization problems arises in physics and chemistry, as well as in machine learning due to the widespread adoption of deep learning. Building the bridge between these two seemingly disparate realms, this project will utilize Collective Intelligence to leverage the interacting particle systems as a means to address the formidable challenge of finding global minimizers in nonconvex optimization problems. Graduate students will also be integrated within the research team as part of their professional training.

This project will focus on a gradient-free optimization method inspired by a consensus-based interacting particle system to solve different types of nonconvex optimization problems. Effective communication and cooperation among particles within the system play pivotal roles in efficiently exploring the landscape and converging to the global minimizer. Aim 1 targets nonconvex optimization with equality constraints; and Aim 2 addresses nonconvex optimization on convex sets; while Aim 3 applies to Clustered Federated Learning. Additionally, convergence guarantees will be provided for nonconvex and nonsmooth objective functions. Theoretical analyses, alongside practical implementations, will provide valuable insights and tools for addressing different types of nonconvex optimization challenges.
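For readers unfamiliar with consensus-based particle methods, a minimal sketch of the plain, unconstrained consensus-based optimization dynamics follows. All parameters and the Rastrigin test objective are illustrative choices; the project's constrained and federated variants go well beyond this.

```python
import numpy as np

def f(x):
    """Rastrigin function: many local minima, global minimum at the origin."""
    return 10.0 * x.shape[-1] + np.sum(x**2 - 10.0 * np.cos(2 * np.pi * x), axis=-1)

rng = np.random.default_rng(1)
N, d = 200, 2                                  # particles and dimension
X = rng.uniform(-5.0, 5.0, size=(N, d))
lam, sigma, beta, dt = 1.0, 0.7, 30.0, 0.05    # illustrative parameters

for _ in range(400):
    w = np.exp(-beta * (f(X) - f(X).min()))    # shifted Gibbs weights
    x_bar = (w[:, None] * X).sum(axis=0) / w.sum()   # weighted consensus point
    drift = X - x_bar                          # pull particles toward consensus
    X = X - lam * dt * drift \
        + sigma * np.sqrt(dt) * np.abs(drift) * rng.standard_normal((N, d))

print("consensus point:", x_bar, " f(consensus):", f(x_bar))
```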

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." -"2410678","Collaborative Research: Data-driven Realization of State-space Dynamical Systems via Low-complexity Algorithms","DMS","COMPUTATIONAL MATHEMATICS","08/01/2024","06/07/2024","Aaron Welters","FL","Florida Institute of Technology","Standard Grant","Jodi Mead","07/31/2027","$125,000.00","Xianqi Li","awelters@fit.edu","150 W UNIVERSITY BLVD","MELBOURNE","FL","329018995","3216748000","MPS","127100","079Z, 9263","$0.00","Data science is evolving rapidly and places a new perspective on realizing state-space dynamical systems. Predicting time-advanced states of dynamical systems is a challenging problem in STEM disciplines due to their nonlinear and complex nature. This project will utilize data-driven methods and analyze state-space dynamical systems to predict and understand future states, surpassing classical techniques. In addition, the PI team will (i) guide students to obtain cross-discipline PhD/Master's degrees, (ii) guide students to work in a peer-learning environment, and (iii) educate a diverse group of undergraduates.

In more detail, this project will utilize state-of-the-art machine learning (ML) algorithms to efficiently analyze and predict information within data matrices and tensor computations with low-complexity algorithms. Single-dimensional ML models are not efficient at extracting hidden semantic information in the time and space domains. As a result, it becomes challenging to simultaneously capture multi-dimensional spatiotemporal data in state-space dynamical systems. Using efficient ML algorithms to recover multi-dimensional spatiotemporal data simultaneously offers a breakthrough in understanding the chaotic behavior of dynamical systems. This project will (i) utilize ML to predict future states of dynamical systems based on high-dimensional data matrices captured at different time stamps, (ii) realize state-space controllable and observable systems via low-complexity algorithms to simultaneously analyze multiple states of the systems, (iii) analyze noise in state-space systems for uncertainty quantification, predict patterns in real-time states, generate counter-resonance states to suppress them, and optimize performance and stability, (iv) study system resilience via multiple state predictors and perturbations to assess performance and adaptation to disturbances and anomalies, and finally (v) optimize spacecraft trajectories, avoid impact, and use low-complexity algorithms to understand spacecraft launch dynamics on the space coast and support ERAU's mission in aeronautical research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2416250","Theory and algorithms for a new class of computationally amenable nonconvex functions","DMS","COMPUTATIONAL MATHEMATICS","03/01/2024","03/12/2024","Ying Cui","CA","University of California-Berkeley","Standard Grant","Jodi Mead","06/30/2026","$240,330.00","","yingcui@berkeley.edu","1608 4TH ST STE 201","BERKELEY","CA","947101749","5106433891","MPS","127100","079Z, 9263","$0.00","As the significance of data science continues to expand, nonconvex optimization models become increasingly prevalent in various scientific and engineering applications. Despite the field's rapid development, there are still a host of theoretical and applied problems that so far are left open and void of rigorous analysis and efficient methods for solution. Driven by practicality and reinforced by rigor, this project aims to conduct a comprehensive investigation of composite nonconvex optimization problems and games. The technologies developed will offer valuable tools for fundamental science and engineering research, positively impacting the environment and fostering societal integration with the big-data world. Additionally, the project will educate undergraduate and graduate students, cultivating the next generation of experts in the field.

This project seeks to advance state-of-the-art techniques for solving nonconvex optimization problems and games through both theoretical and computational approaches. At its core is the innovative concept of ""approachable difference-of-convex functions,"" which uncovers a hidden, asymptotically decomposable structure within the multi-composition of nonconvex and non-smooth functions. The project will tackle three main tasks: (i) establishing fundamental properties for a novel class of computationally amenable nonconvex and non-smooth composite functions; (ii) designing and analyzing computational schemes for single-agent optimization problems, with objective and constrained functions belonging to the aforementioned class; and (iii) extending these approaches to address nonconvex games.
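The classical difference-of-convex algorithm (DCA), which the "approachable difference-of-convex" notion generalizes, fits in a few lines. The split f = g - h below is a toy quadratic of our own construction, not a problem from the project:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
mu = 0.5                                  # small enough that f is bounded below

# f(x) = 0.5*||Ax - b||^2 - 0.5*mu*||x||^2 = g(x) - h(x), with g and h convex
G = A.T @ A
x = np.zeros(10)
for k in range(100):
    # DCA step: linearize h at x_k and solve the convex subproblem exactly:
    # x_{k+1} = argmin_y g(y) - <grad h(x_k), y>  =>  G y = A^T b + mu * x_k
    x_new = np.linalg.solve(G, A.T @ b + mu * x)
    if np.linalg.norm(x_new - x) < 1e-10:
        break
    x = x_new
print(f"critical point reached after {k} iterations")
```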

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2410678","Collaborative Research: Data-driven Realization of State-space Dynamical Systems via Low-complexity Algorithms","DMS","COMPUTATIONAL MATHEMATICS","08/01/2024","06/07/2024","Aaron Welters","FL","Florida Institute of Technology","Standard Grant","Jodi Mead","07/31/2027","$125,000.00","Xianqi Li","awelters@fit.edu","150 W UNIVERSITY BLVD","MELBOURNE","FL","329018995","3216748000","MPS","127100","079Z, 9263","$0.00","Data science is evolving rapidly and places a new perspective on realizing state-space dynamical systems. Predicting time-advanced states of dynamical systems is a challenging problem in STEM disciplines due to their nonlinear and complex nature. This project will utilize data-driven methods and analyze state-space dynamical systems to predict and understand future states, surpassing classical techniques. In addition, the PI team will (i) guide students to obtain cross-discipline PhD/Master's degrees, (ii) guide students to work in a peer-learning environment, and (iii) educate a diverse group of undergraduates.

In more detail, this project will utilize state-of-the-art machine learning (ML) algorithms to efficiently analyze and predict information within data matrices and tensor computations with low-complexity algorithms. Single-dimensional ML models are not efficient at extracting hidden semantic information in the time and space domains. As a result, it becomes challenging to simultaneously capture multi-dimensional spatiotemporal data in state-space dynamical systems. Using efficient ML algorithms to recover multi-dimensional spatiotemporal data simultaneously offers a breakthrough in understanding the chaotic behavior of dynamical systems. This project will (i) utilize ML to predict future states of dynamical systems based on high-dimensional data matrices captured at different time stamps, (ii) realize state-space controllable and observable systems via low-complexity algorithms to simultaneously analyze multiple states of the systems, (iii) analyze noise in state-space systems for uncertainty quantification, predict patterns in real-time states, generate counter-resonance states to suppress them, and optimize performance and stability, (iv) study system resilience via multiple state predictors and perturbations to assess performance and adaptation to disturbances and anomalies, and finally (v) optimize spacecraft trajectories, avoid impact, and use low-complexity algorithms to understand spacecraft launch dynamics on the space coast and support ERAU's mission in aeronautical research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2409903","Development of novel numerical methods for forward and inverse problems in mean field games","DMS","COMPUTATIONAL MATHEMATICS","07/01/2024","06/11/2024","Yat Tin Chow","CA","University of California-Riverside","Continuing Grant","Troy D. Butler","06/30/2027","$95,280.00","","yattinc@ucr.edu","200 UNIVERSTY OFC BUILDING","RIVERSIDE","CA","925210001","9518275535","MPS","127100","9263","$0.00","Mean field games is the study of strategic decision making in large populations where individual players interact through a certain quantity in the mean field. Mean field games have strong descriptive power in socioeconomics and biology, e.g. in the understanding of social cooperation, stock markets, trading and economics, biological systems, election dynamics, population games, robotic control, machine learning, dynamics of multiple populations, pandemic modeling and control as well as vaccination distribution. It is therefore essential to develop accurate numerical methods for large-scale mean field games and their model recovery. However, current computational approaches for the recovery problem are impractical in high dimensions. This project will comprehensively study new computational methods for both large-scale mean field games and their model recovery. The comprehensive plans will cover algorithmic development, theoretical analysis, numerical implementation and practical applications. The project will also involve research on speeding up the forward and inverse problem computations to speed up the computation for mean field game modeling and turn real life mean field game model recovery problems from computationally unaffordable to affordable. The research team will disseminate results through publications, professional presentations, the training of graduate students at the University of California, Riverside as well as through public outreach events that involve public talks and engagement with high school math fairs. The goals of these outreach events are to increase public literacy and public engagement in mathematics, improve STEM education and educator development, and broaden participation of women and underrepresented minorities.

The project will provide novel computational methods for both forward and inverse problems of mean field games. The team will (1) develop two new numerical methods for forward problems in mean field games, namely monotone inclusion with Benamou-Brenier's formulation and extragradient algorithm with moving anchoring; (2) develop three new numerical methods for inverse problems in mean field games with only boundary measurements, namely a three-operator splitting scheme, a semi-smooth Newton acceleration method, and a direct sampling method. Both theoretical analysis and practical implementations will be emphasized. In particular, numerical methods for inverse problems for mean field games, which is a main target of the project, will be designed to work with only boundary measurements. This represents a brand new field in inverse problems and optimization. The project will also seek the simultaneous reconstruction of coefficients in the severely ill-posed case when only noisy boundary measurements from one or two measurement events are available.
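For orientation, here is the extragradient idea referenced above in its plain form, on a bilinear toy saddle problem of our choosing; the project's variant adds the moving anchoring and targets mean field game systems rather than this two-by-two example.

```python
import numpy as np

M = np.array([[0.0, 1.0], [-1.0, 0.0]])

def F(z):
    """Monotone saddle operator for L(x, y) = x . (M y): F = (M y, -M^T x)."""
    x, y = z[:2], z[2:]
    return np.concatenate([M @ y, -M.T @ x])

z = np.array([1.0, 1.0, 1.0, -1.0])
tau = 0.5
for _ in range(200):
    z_half = z - tau * F(z)               # predictor step (extrapolation)
    z = z - tau * F(z_half)               # corrector evaluated at the midpoint
print("iterate:", z)                      # approaches the saddle point at 0
# plain descent-ascent, z = z - tau*F(z), diverges on this same problem
```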

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2409918","Structure preservation in nonlinear, degenerate, evolution","DMS","COMPUTATIONAL MATHEMATICS","08/01/2024","06/03/2024","Abner Salgado","TN","University of Tennessee Knoxville","Standard Grant","Ludmil T. Zikatanov","07/31/2027","$204,533.00","","asalgad1@utk.edu","201 ANDY HOLT TOWER","KNOXVILLE","TN","379960001","8659743466","MPS","127100","9263","$0.00","A thorough treatment is feasible for the classical linear problems in the numerical approximation of partial differential equations. The continuous problem is well-posed. The numerical schemes are well-posed, parameter-robust, and convergent. It is even possible to prove convergence rates. However, the situation is more precarious for modern, complex systems of equations. Oftentimes, the uniqueness of solutions is not known. Even when there is uniqueness, the theory is far from complete, and so besides (weak) convergence of numerical solutions, little can be said about their behavior. In these scenarios, one must settle for simpler yet still relevant goals. An important goal in this front is that of structure preservation. The study of structure preservation in numerical methods is not new. Geometric numerical integration, many methods for electromagnetism, the finite element exterior calculus, and some novel approaches to hyperbolic systems of conservation laws, have this goal in mind: geometric, algebraic, or differential constraints must be preserved. This project does not focus on the problems mentioned above. Instead, it studies structure preservation in some evolution problems that have, possibly degenerate, diffusive behavior. This class of problems remains a largely unexplored topic when it comes to numerical discretizations. Bridging this gap will enhance modeling and prediction capabilities since diffusive models can be found in every aspect of scientific inquiry.

This project is focused on a class of diffusive problems in which stability of the solution cannot be obtained by standard energy arguments, in other words, by testing the equation with the solution to assert that certain space-time norms are under control. Norms are always convex. Structure preservation may then be a generalization of the approach given above. Instead of norms being under control, a (family of) convex functional(s) evaluated at the solution behave predictably during the evolution. The project aims to develop numerical schemes that mimic this in the discrete setting. While this is a largely unexplored topic, at the same time, many of the problems under consideration can be used to describe a wide range of phenomena. In particular, the project will develop new numerical schemes for an emerging theory of non-equilibrium thermodynamics, active scalar equations, and a class of problems in hyperbolic geometry. These models have a very rich intrinsic structure and a wide range of applications, and the developments of this project will serve as a stepping stone to bring these tools to the numerical treatment of more general problems. The students involved in the project will be trained in exciting, mathematically and computationally challenging, and practically relevant areas of research.
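A small worked example of the structure-preservation theme, assembled for this write-up rather than taken from the project: an explicit conservative-difference scheme for the 1D heat equation with no-flux boundaries conserves mass exactly and, because each update is a doubly stochastic averaging of the old values, decreases every convex functional of the solution, including the entropy tracked below. The parameters are illustrative and satisfy the CFL bound.

```python
import numpy as np

n = 100
dx = 1.0 / n
dt = 0.4 * dx * dx                        # satisfies the CFL bound dt <= dx^2/2
u = 1.0 + 0.5 * np.sin(2 * np.pi * dx * np.arange(n))   # positive initial data
mass0 = u.sum() * dx

for step in range(2001):
    q = np.diff(u) / dx                   # fluxes at interior cell interfaces
    du = np.zeros_like(u)
    du[:-1] += q / dx                     # conservative update; zero wall flux
    du[1:] -= q / dx
    u = u + dt * du
    if step % 500 == 0:
        H = dx * np.sum(u * np.log(u))    # convex entropy, decays monotonically
        print(f"step {step:4d}: mass drift {u.sum()*dx - mass0:+.1e}, entropy {H:.6f}")
```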

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2410676","Collaborative Research: Data-driven Realization of State-space Dynamical Systems via Low-complexity Algorithms","DMS","COMPUTATIONAL MATHEMATICS","08/01/2024","06/07/2024","Sirani Mututhanthrige-Perera","FL","Embry-Riddle Aeronautical University","Standard Grant","Jodi Mead","07/31/2027","$175,000.00","","pereras2@erau.edu","1 AEROSPACE BLVD","DAYTONA BEACH","FL","321143910","3862267695","MPS","127100","079Z, 9263","$0.00","Data science is evolving rapidly and places a new perspective on realizing state-space dynamical systems. Predicting time-advanced states of dynamical systems is a challenging problem in STEM disciplines due to their nonlinear and complex nature. This project will utilize data-driven methods and analyze state-space dynamical systems to predict and understand future states, surpassing classical techniques. In addition, the PI team will (i) guide students to obtain cross-discipline PhD/Master's degrees, (ii) guide students to work in a peer-learning environment, and (iii) educate a diverse group of undergraduates.

In more detail, this project will utilize state-of-the-art machine learning (ML) algorithms to efficiently analyze and predict information within data matrices and tensor computations with low-complexity algorithms. Single-dimensional ML models are not efficient at extracting hidden semantic information in the time and space domains. As a result, it becomes challenging to simultaneously capture multi-dimensional spatiotemporal data in state-space dynamical systems. Using efficient ML algorithms to recover multi-dimensional spatiotemporal data simultaneously offers a breakthrough in understanding the chaotic behavior of dynamical systems. This project will (i) utilize ML to predict future states of dynamical systems based on high-dimensional data matrices captured at different time stamps, (ii) realize state-space controllable and observable systems via low-complexity algorithms to simultaneously analyze multiple states of the systems, (iii) analyze noise in state-space systems for uncertainty quantification, predict patterns in real-time states, generate counter-resonance states to suppress them, and optimize performance and stability, (iv) study system resilience via multiple state predictors and perturbations to assess performance and adaptation to disturbances and anomalies, and finally (v) optimize spacecraft trajectories, avoid impact, and use low-complexity algorithms to understand spacecraft launch dynamics on the space coast and support ERAU's mission in aeronautical research.
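A standard baseline for the data-driven realization task shared by this collaborative pair (awards 2410676/2410678) is exact dynamic mode decomposition, which recovers a reduced linear state-space operator from snapshot pairs via a truncated SVD. The sketch below is that textbook baseline on synthetic data, not the PIs' low-complexity algorithms:

```python
import numpy as np

rng = np.random.default_rng(3)
A_true = np.array([[0.95, 0.10], [-0.10, 0.95]])   # hidden low-dim dynamics
C = rng.standard_normal((20, 2))                   # lift to 20-dim observations
x = np.array([1.0, 0.0])
snaps = []
for _ in range(60):
    snaps.append(C @ x)
    x = A_true @ x
Y = np.array(snaps).T                              # 20 x 60 data matrix

X0, X1 = Y[:, :-1], Y[:, 1:]                       # time-shifted snapshot pairs
U, s, Vt = np.linalg.svd(X0, full_matrices=False)
r = 2                                              # truncation rank
Ur, sr, Vr = U[:, :r], s[:r], Vt[:r].T
A_red = (Ur.T @ X1 @ Vr) / sr                      # reduced operator, r x r
print("recovered eigenvalues:", np.sort_complex(np.linalg.eigvals(A_red)))
print("true eigenvalues:     ", np.sort_complex(np.linalg.eigvals(A_true)))
```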

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria."
@@ -35,7 +37,6 @@
"2409868","On Iteratively Regularized Alternating Minimization under Nonlinear Dynamics Constraints with Applications to Epidemiology","DMS","COMPUTATIONAL MATHEMATICS","09/01/2024","05/29/2024","Alexandra Smirnova","GA","Georgia State University Research Foundation, Inc.","Standard Grant","Troy D. Butler","08/31/2027","$200,000.00","Xiaojing Ye","asmirnova@gsu.edu","58 EDGEWOOD AVE NE","ATLANTA","GA","303032921","4044133570","MPS","127100","9263","$0.00","How widely has the virus spread? This important and often overlooked question was brought to light by the recent COVID-19 outbreak. Several techniques have been used to account for silent spreaders along with varying testing and healthcare seeking habits as the main reasons for under-reporting of incidence cases. It has been observed that silent spreaders play a more significant role in disease progression than previously understood, highlighting the need for policymakers to incorporate these hidden figures into their strategic responses. Unlike other disease parameters, i.e., incubation and recovery rates, the case reporting rate and the time-dependent effective reproduction number are directly influenced by a large number of factors making it impossible to directly quantify these parameters in any meaningful way. This project will advance iteratively regularized numerical algorithms, which have emerged as a powerful tool for reliable estimation (from noise-contaminated data) of infectious disease parameters that are crucial for future projections, prevention, and control. Apart from epidemiology, the project will benefit all real-world applications involving massive amounts of observation data for multiple stages of the inversion process with a shared model parameter. In the course of their theoretical and numerical studies, the PIs will continue to create research opportunities for undergraduate and graduate students, including women and students from groups traditionally underrepresented in STEM disciplines. A number of project topics are particularly suitable for student research and will be used to train some of the next generation of computational mathematicians.

In the framework of this project, the PIs will develop new regularized alternating minimization algorithms for solving ill-posed parameter-estimation problems constrained by nonlinear dynamics. While significant computational challenges are shared by both deterministic trust-region and Bayesian methods (such as numerical solutions requiring solutions to possibly complex ODE or PDE systems at every step of the iterative process), the team will address these challenges by constructing a family of fast and stable iteratively regularized optimization algorithms, which carefully alternate between updating model parameters and state variables.
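Stripped of the ODE constraint, the alternating-minimization-with-decaying-regularization template reads as follows on a toy bilinear inverse problem of our construction; the project's schemes instead alternate between model parameters and state variables of an epidemic model.

```python
import numpy as np

rng = np.random.default_rng(4)
u_true = rng.standard_normal(15)
v_true = rng.standard_normal(12)
Y = np.outer(u_true, v_true) + 0.01 * rng.standard_normal((15, 12))

u, v = np.ones(15), np.ones(12)
alpha = 1.0                               # regularization weight, driven to zero
for k in range(50):
    # each half-step is ridge-regularized least squares, solved in closed form
    u = (Y @ v) / (v @ v + alpha)
    v = (Y.T @ u) / (u @ u + alpha)
    alpha *= 0.7                          # iterative regularization schedule
print("relative misfit:", np.linalg.norm(Y - np.outer(u, v)) / np.linalg.norm(Y))
```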

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2409855","Metric-Dependent Strategies for Inverse Problem Analysis and Computation","DMS","COMPUTATIONAL MATHEMATICS","07/01/2024","05/29/2024","Yunan Yang","NY","Cornell University","Standard Grant","Troy D. Butler","06/30/2027","$275,000.00","","yunan.yang@cornell.edu","341 PINE TREE RD","ITHACA","NY","148502820","6072555014","MPS","127100","9263","$0.00","This project will develop novel approaches to solving inverse problems, which are pivotal in many scientific fields, including biology, geophysics, and medical imaging. Inverse problems often involve deducing unknown parameters from observed data, a task complicated by issues such as sensitivity to measurement noise and complex modeling procedures. The broader significance of this research lies in its potential to significantly enhance the accuracy and efficiency of computational methods used in critical applications such as electrical impedance tomography (EIT), inverse scattering, and cryo-electron microscopy (cryo-EM). For instance, improvements in cryo-EM computation will accelerate breakthroughs in molecular biology and aid in rapid drug development, directly benefiting medical research and public health. Additionally, this project will also (1) engage undergraduate and graduate students in research to foster a new generation of computational mathematicians, and (2) promote STEM careers among K-12 students through outreach activities.

The technical focus of this project will be on the development of metric-dependent strategies to improve the stability and computational efficiency of solving inverse problems. Lipschitz-type stability will be established by selecting metrics tailored to the data and unknown parameters to facilitate more robust algorithmic solutions. A key highlight of the project will be the investigation of the stochastic inverse problem's well-posedness. Sampling methods inspired by metric-dependent gradient flows will serve as the novel computational tool for the practical solution of stochastic inverse problems. These analytical and computational strategies will be designed to handle the randomness inherent in many practical scenarios, shifting the traditional deterministic approach for solving inverse problems to a probabilistic framework that better captures the intricacies of real-world data. This research has the promise to not only advance theoretical knowledge in studying inverse problems but also to develop practical, efficient tools for a wide range of applications in science and engineering.
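One familiar special case of a metric-dependent gradient-flow sampler is unadjusted Langevin dynamics, the Wasserstein-2 gradient flow of the KL divergence to the target. A minimal sketch on a synthetic 1D Gaussian posterior follows; the target, step size, and ensemble size are our toy choices, not the project's method.

```python
import numpy as np

rng = np.random.default_rng(5)
# target posterior: N(2, 0.5^2), so V(x) = (x - 2)^2 / (2 * 0.25)
grad_V = lambda x: (x - 2.0) / 0.25

x = rng.standard_normal(5000)             # an ensemble evolved in parallel
h = 1e-3                                  # step size; discretization bias ~ h
for _ in range(5000):
    x = x - h * grad_V(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.size)
print("ensemble mean, std:", x.mean(), x.std())   # approx 2 and 0.5
```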

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2432134","Collaborative Research: Computational Methods for Optimal Transport via Fluid Flows","DMS","COMPUTATIONAL MATHEMATICS","05/15/2024","05/17/2024","Yangwen Zhang","LA","University of Louisiana at Lafayette","Continuing Grant","Yuliya Gorb","06/30/2025","$56,877.00","","yangwen.zhang@louisiana.edu","104 E UNIVERSITY AVE","LAFAYETTE","LA","705032014","3374825811","MPS","127100","9150, 9263","$0.00","Transport and mixing in fluids is a topic of fundamental interest in engineering and natural sciences, with broad applications ranging from industrial and chemical mixing on small and large scales, to preventing the spreading of pollutants in geophysical flows. This project focuses on computational methods for control of optimal transport and mixing of some quantity of interest in fluid flows. The question of what fluid flow maximizes mixing rate, slows it down, or even steers a quantity of interest toward a desired target distribution draws great attention from a broad range of scientists and engineers in the area of complex dynamical systems. The goal of this project is to place these problems within a flexible computational framework, and to develop a solution strategy based on optimal control tools, data compression strategies, and methods to reduce the complexity of the mathematical models. This project will also help the training and development of graduate students across different disciplines to conduct collaborative research in optimal transport and mixing, flow control, and computational methods for solving these problems.


The project is concerned with the development and analysis of numerical methods for optimal control for mixing in fluid flows. More precisely, the transport equation is used to describe the non-dissipative scalar field advected by the incompressible Stokes and Navier-Stokes flows. The research aims at achieving optimal mixing via an active control of the flow velocity and constructing efficient numerical schemes for solving this problem. Various control designs will be investigated to steer the fluid flows. Sparsity of the optimal boundary control will be promoted via a non-smooth penalty term in the objective functional. This essentially leads to a highly challenging nonlinear non-smooth control problem for a coupled parabolic and hyperbolic system, or a semi-dissipative system. The project will establish a novel and rigorous mathematical framework and also new accurate and efficient computational techniques for these difficult optimal control problems. Compatible discretization methods for coupled flow and transport will be employed to discretize the controlled system and implement the optimal control designs numerically. Numerical schemes for the highly complicated optimality system will be constructed and analyzed in a systematic fashion. New incremental data compression techniques will be utilized to avoid storing extremely large solution data sets in the iterative solvers, and new model order reduction techniques specifically designed for the optimal mixing problem will be developed to increase efficiency. The synthesis of optimal control and numerical approximation will enable the study of similar phenomena arising in many other complex and real-world flow dynamics.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." -"2414705","CAREER: Mathematical Modeling from Data to Insights and Beyond","DMS","COMPUTATIONAL MATHEMATICS","01/15/2024","01/22/2024","Yifei Lou","NC","University of North Carolina at Chapel Hill","Continuing Grant","Yuliya Gorb","05/31/2025","$141,540.00","","yflou@unc.edu","104 AIRPORT DR STE 2200","CHAPEL HILL","NC","275995023","9199663411","MPS","127100","1045, 9263","$0.00","This project will develop both analytical and computational tools for data-driven applications. In particular, analytical tools will hold great promise to provide theoretical guidance on how to acquire data more efficiently than current practices. To retrieve useful information from data, numerical methods will be investigated with emphasis on guaranteed convergence and algorithmic acceleration. Thanks to close interactions with collaborators in data science and information technology, the investigator will ensure the practicability of the proposed research, leading to a real impact. The investigator will also devote herself to various outreach activities in the field of data science. For example, she will initiate a local network of students, faculty members, and domain experts to develop close ties between mathematics and industry as well as to broaden career opportunities for mathematics students. This initiative will have a positive impact on the entire mathematical sciences community. In addition, she will advocate for the integration of mathematical modeling into K-16 education by collaborating with The University of Texas at Dallas Diversity Scholarship Program to reach out to mathematics/sciences teachers.

This project addresses important issues in extracting insights from data and training the next generation in the ""big data"" era. The research focuses on signal/image recovery from a limited number of measurements, in which ""limited"" refers to the fact that the amount of data that can be taken or transmitted is limited by technical or economic constraints. When data is insufficient, one often requires additional information from the application domain to build a mathematical model, followed by numerical methods. Questions to be explored in this project include: (1) how difficult is the process of extracting insights from data? (2) how should reasonable assumptions be taken into account to build a mathematical model? (3) how should an efficient algorithm be designed to find a model solution? More importantly, a feedback loop from insights to data will be introduced, i.e., (4) how to improve upon data acquisition so that information becomes easier to retrieve? As these questions mimic the standard procedure in mathematical modeling, the proposed research provides a plethora of illustrative examples to enrich the education of mathematical modeling. In fact, one of this CAREER award's educational objectives is to advocate the integration of mathematical modeling into K-16 education so that students will develop problem-solving skills in early ages. In addition, the proposed research requires close interactions with domain experts in business, industry, and government (BIG), where real-world problems come from. This requirement helps to fulfill another educational objective, that is, to promote BIG employment by providing adequate training for students in successful approaches to BIG problems together with BIG workforce skills.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2347546","Conference: Mathematical models and numerical methods for multiphysics problems","DMS","COMPUTATIONAL MATHEMATICS","01/15/2024","01/10/2024","Ivan Yotov","PA","University of Pittsburgh","Standard Grant","Troy D. Butler","12/31/2024","$30,000.00","","yotov@math.pitt.edu","4200 FIFTH AVENUE","PITTSBURGH","PA","152600001","4126247400","MPS","127100","7556, 9263","$0.00","The Mathematical Models and Numerical Methods for Multiphysics Problems Conference is held May 1-3, 2024 at the University of Pittsburgh in Pittsburgh, PA. The conference aims to bring together experts from different communities in which computational models for multiphysics systems are employed. Multiphysics systems model the physical interactions between two or more media, such as couplings of fluid flows, rigid or deformable porous media, and elastic structures. Typical examples are coupling of free fluid and porous media flows, fluid-structure interaction, and fluid-poroelastic structure interaction. Applications of interest include climate modeling, interaction of surface and subsurface hydrological systems, fluid flows through fractured or deformable aquifers or reservoirs, evolution of soil structures, arterial flows, perfusion of living tissues, and organ modeling, such as the heart, lungs, and brain. The work presented at the conference will cover both rigorous mathematical and numerical analysis and applications to cutting-edge problems.

The mathematical models describing the multiphysics systems of interest consist of couplings of complex systems of partial differential equations. Examples include the Stokes/Navier-Stokes equations for free fluid flows, the linear or nonlinear elasticity equations for structure mechanics, the Darcy equations for porous media flows, and the Biot equations for poroelasticity. Physical phenomena occurring in different regions are coupled through kinematic and dynamic interface conditions. The modeling and simulation process involves well-posedness analysis of the mathematical models, design and analysis of stable, accurate, and robust numerical methods, and development of efficient solution strategies. Despite significant progress in recent years, many challenges remain in all three areas. Examples include, on the mathematical modeling side, the nonlinear advection term in the Navier Stokes equations in coupled settings, nonlinear fully-coupled flow-transport models, nonlinear diffusion, mobility, and elastic parameters, and non-isothermal effects; on the numerical side, structure preserving and parameter robust discretization methods, a posteriori error estimation and mesh adaptivity in both space and time, multiscale and reduced order models; on the solution side, stable and higher-order loosely-coupled time splitting methods, domain decomposition methods, and parameter-robust monolithic solvers and preconditioners. The conference will bring together experts in the field who are actively working to address these challenges. It will provide an environment for them to discuss state-of-the-art results and trends and encourage future collaborations and research directions. The conference website is https://www.mathematics.pitt.edu/events/mathematical-models-and-numerical-methods-multiphysics-systems

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2333724","Conference: North American High Order Methods Con (NAHOMCon)","DMS","COMPUTATIONAL MATHEMATICS","03/01/2024","01/18/2024","Anne Gelb","NH","Dartmouth College","Standard Grant","Troy D. Butler","02/28/2025","$30,000.00","","annegelb@math.dartmouth.edu","7 LEBANON ST","HANOVER","NH","037552170","6036463007","MPS","127100","7556, 9150, 9263","$0.00","The North American High Order Methods conference series (NAHOMCon) is held June 17-19, 2024 at Dartmouth College in Hanover, New Hampshire. This conference brings together researchers and practitioners with an interest in the theoretical, computational and applied aspects of high-order methods for the solution of differential equations impacting increasingly diverse and important problems in biology, climate studies, earth science, engineering, geology and medicine. The conference engages a broad group of researchers at all career stages representing the mathematical sciences and engineering, as well as practitioners from multiple industrial and government communities, allowing participants to gain new insights and perspectives from experts in various application domains. The conference format includes both invited speakers as well as submitted contributions to encourage broad participation especially from young scientists and under-represented minorities. The conference will be held in conjunction with the New England Numerical Analysis Day (NENAD), which provides the opportunity for faculty and students from local universities to convene and discuss new research trends in numerical analysis, and enable people from the many smaller institutions in New England to establish future collaborative partners.

This conference brings together researchers and practitioners with an interest in the theoretical, computational and applied aspects of high-order and spectral methods for the solution of differential equations impacting increasingly diverse and important applications. Subjects of the meeting include, but are not limited to, high-order finite difference methods, p- and h-p type finite element methods, discontinuous Galerkin methods, spectral methods, efficient solvers and preconditioners, efficient time-stepping method, and computational aspects for modern hardware environments. There will also be an emphasis on high order methods in the context of data-driven models, including machine learning and data assimilation approaches. The format of the meeting includes both invited papers as well as submitted contributions in order to encourage wider participation especially from young scientists and under-represented minorities. Two career development panel discussions are scheduled. The first will be aimed at undergraduates who are interested in graduate school, while the second will focus on careers in industry. The conference website is https://math.dartmouth.edu/~nahomcon2024/

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2403506","Conference: Power of Diversity in Uncertainty Quantification (PoD UQ)","DMS","COMPUTATIONAL MATHEMATICS","02/01/2024","01/30/2024","Annalisa Quaini","TX","University of Houston","Standard Grant","Troy D. Butler","01/31/2025","$23,500.00","","quaini@math.uh.edu","4300 MARTIN LUTHER KING BLVD","HOUSTON","TX","772043067","7137435773","MPS","127100","7556, 9263","$0.00","The Power of Diversity in Uncertainty Quantification (PoD UQ) workshop will be a one-day meeting hosted by the International School for Advanced Studies in Trieste, Italy, on February 26, 2024. This is the day prior to the beginning of the next SIAM Conference on Uncertainty Quantification (SIAM UQ24), which will be held in Trieste from February 27 to March 1, 2024. SIAM UQ24 is dedicated to recognizing the natural synergy between computational models and statistical methods, strengthened by the emergence of machine learning as a practical tool. Thus, for the first time there will be a large international meeting on the whole UQ ecosystem viewed through a unifying lens. For graduate students and young researchers entering the field of UQ, SIAM UQ24 offers a unique opportunity to learn from and exchange ideas with diverse groups of UQ professionals from academia, industry, and government laboratories. As attractive as this opportunity is, the size and breadth of the conference could be daunting. PoD UQ is targeted to graduate students and young researchers to ease their approach to SIAM UQ24 and ensure they make the most out of it.

The name of the event highlights the central role played by diversity in UQ, a field whose advancement requires the integration of mathematics and statistics, theory and computations. Diversity refers also to the community of under-represented groups that the event targets for greater inclusivity. The goals of PoD UQ are to: (i) Introduce students and early-career researchers to the state-of-the-art and current trends in modeling, sampling, and analyzing uncertainties. The meeting will feature three talks meant to give an overview of the different areas in UQ and two introductory talks on current hot topics. All the talks will be delivered by internationally renowned leaders in the field. In addition, a panel of established UQ researchers will discuss the opportunities and challenges of working in an application-driven and inherently interdisciplinary field that relies on a broad range of mathematical and statistical foundations, domain knowledge, and algorithmic and computational tools. (ii) Offer an excellent chance for networking with both peers and more established researchers. Recognizing the importance of a supportive network in building a career, especially for people from minority groups, the schedule of PoD UQ includes ample time to connect and interact. The participants will have two coffee breaks and a generous lunch break to interact among themselves. A poster session and seated dinner with assigned seats will ensure that the participants get to interact with the speakers and the panelists. More details can be found at: http://go.sissa.it/siamuq24satelliteevent

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria."
diff --git a/Statistics/Awards-Statistics-2024.csv b/Statistics/Awards-Statistics-2024.csv
index 00faaa4..a23f9bf 100644
--- a/Statistics/Awards-Statistics-2024.csv
+++ b/Statistics/Awards-Statistics-2024.csv
@@ -1,4 +1,7 @@
"AwardNumber","Title","NSFOrganization","Program(s)","StartDate","LastAmendmentDate","PrincipalInvestigator","State","Organization","AwardInstrument","ProgramManager","EndDate","AwardedAmountToDate","Co-PIName(s)","PIEmailAddress","OrganizationStreet","OrganizationCity","OrganizationState","OrganizationZip","OrganizationPhone","NSFDirectorate","ProgramElementCode(s)","ProgramReferenceCode(s)","ARRAAmount","Abstract"
+"2413953","Collaborative Research: Statistical Inference for High Dimensional and High Frequency Data: Contiguity, Matrix Decompositions, Uncertainty Quantification","DMS","STATISTICS","07/01/2024","06/21/2024","Lan Zhang","IL","University of Illinois at Chicago","Standard Grant","Jun Zhu","06/30/2027","$155,372.00","","lanzhang@uic.edu","809 S MARSHFIELD AVE M/C 551","CHICAGO","IL","606124305","3129962862","MPS","126900","","$0.00","To pursue the promise of the big data revolution, the current project is concerned with a particular form of such data, high dimensional high frequency data (HD2), where series of high-dimensional observations can see new data updates in fractions of milliseconds. With technological advances in data collection, HD2 data occurs in medicine (from neuroscience to patient care), finance and economics, geosciences (such as earthquake data), marine science (fishing and shipping), and, of course, in internet data. This research project focuses on how to extract information from HD2 data, and how to turn this data into knowledge. As part of the process, the project develops cutting-edge mathematics and statistical methodology to uncover the dependence structure governing HD2 data. It interfaces with concepts of artificial intelligence. In addition to developing a general theory, the project is concerned with applications to financial data, including risk management, forecasting, and portfolio management. More precise estimators, with improved margins of error, will be useful in all these areas of finance. The results are of interest to main-street investors, regulators and policymakers, and the results are entirely in the public domain. The project will also provide research training opportunities for students.

In more detail, the project will focus on four linked questions for HD2 data: contiguity, matrix decompositions, uncertainty quantification, and the estimation of spot quantities. The investigators will extend their contiguity theory to the common case where observations have noise, which also permits the use of longer local intervals. Under a contiguous probability, the structure of the observations is often more accessible (frequently Gaussian) in local neighborhoods, facilitating statistical analysis. This is achieved without altering the underlying models. Because the effect of the probability change is quite transparent, this approach also enables more direct uncertainty quantification. To model HD2 data, the investigators will explore time-varying matrix decompositions, including the development of a singular value decomposition (SVD) for high frequency data, as a more direct path to a factor model. Both SVD and principal component analysis (PCA) benefit from contiguity, which eases both the time-varying construction, and uncertainty quantification. The latter is of particular importance not only to set standard errors, but also to determine the trade-offs involved in estimation under longitudinal variation: for example, how many minutes or days are required to estimate a covariance matrix, or singular vectors? The investigators also plan to develop volatility matrices for the drift part of a financial process, and their PCAs. The work on matrix decompositions will also benefit from projected results on spot estimation, which also ties in with contiguity. It is expected that the consequences of the contiguity and the HD2 inference will be transformational, leading to more efficient estimators and better prediction, and that this approach will form a new paradigm for high frequency data.
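To fix ideas, the basic objects mentioned above, a realized covariance matrix and its PCA, can be computed from synthetic high-frequency returns in a few lines. Microstructure noise, asynchronous observation, and time variation, the hard parts this project studies, are deliberately omitted from this toy:

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 50, 23400                          # 50 assets, one day of 1-second returns
beta = rng.standard_normal(d)             # one-factor loadings
f = 1e-4 * rng.standard_normal(n)         # common factor returns
eps = 5e-5 * rng.standard_normal((n, d))  # idiosyncratic returns
R = f[:, None] * beta[None, :] + eps      # n x d return matrix

RCov = R.T @ R                            # realized covariance over the day
evals, evecs = np.linalg.eigh(RCov)
top = evecs[:, -1]                        # leading principal component (unit norm)
cos = abs(top @ beta) / np.linalg.norm(beta)
print("|cos(top PC, loadings)|:", cos)    # close to 1: the factor is recovered
```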

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2413301","Next-Generation Functional Data Analysis via Machine Learning","DMS","STATISTICS","07/01/2024","06/21/2024","Guanqun Cao","MI","Michigan State University","Standard Grant","Yulia Gel","06/30/2027","$170,000.00","","caoguanq@msu.edu","426 AUDITORIUM RD RM 2","EAST LANSING","MI","488242600","5173555040","MPS","126900","079Z, 1269","$0.00","Classical functional data refer to curves or functions, i.e., the data for each variable are viewed as smooth curves, surfaces, or hypersurfaces evaluated at a finite subset of some interval in one-, two- or three-dimensional Euclidean spaces (for example, some period of time, some range of pixels or voxels, and so on). The independent and identically distributed functional data are sometimes referred to as first-generation functional data. Modern studies from a variety of fields record multiple functional observations according to either multivariate, high-dimensional, multilevel, or time series designs. Such data are called next-generation functional data. This project will elevate the focus on developing machine learning (ML) and artificial intelligence-based methodologies tailored for the next-generation of functional data analysis (FDA). The project will bridge the gap between theoretical knowledge and practical application in ML and FDA. While there have been efforts to integrate ML into the FDA field, these initiatives have predominantly concentrated on handling relatively straightforward formats of functional data. In addition, multiple student research training opportunities will be offered, and high-performance statistics software packages will be developed. These packages will enable researchers from various disciplines to investigate complex relationships that exist among modern functional data.

The widespread utilization of resilient digital devices has led to a notable increase in dependent, high-dimensional, and multi-way functional data. Consequently, the existing toolkit's efficiency diminishes when tasked with addressing emerging FDA challenges. The PI will introduce: (i) deep neural network-based Lasso for dependent FDA; (ii) optimal multi-way FDA; and (iii) transfer learning for FDA, and will develop flexible and intelligent ML-based estimators, classifiers, and clustering procedures, and investigate their statistical properties, including bounds on the prediction errors, convergence rates, and minimax excess risk. The proposed methodology will be particularly useful for modeling complex functional data whose underlying structure cannot be properly captured by existing statistical methods.
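One ingredient of such sparse functional estimators can be sketched by projecting curves onto a Fourier basis and applying an l1-penalized fit to the resulting scores; this is a simplified stand-in (the deep-network and dependence components of the proposed methods are not reproduced, and all sizes are illustrative):

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(1)
    n, T, K = 200, 100, 11                   # curves, grid points, basis functions
    t = np.linspace(0.0, 1.0, T)
    cols = [np.ones(T)]                      # Fourier basis: constant, then sine/cosine pairs
    for j in range(1, (K - 1) // 2 + 1):
        cols += [np.sin(2 * np.pi * j * t), np.cos(2 * np.pi * j * t)]
    basis = np.column_stack(cols)            # shape (T, K)

    coef_true = np.zeros(K); coef_true[1] = 1.0; coef_true[2] = -0.5
    X = rng.normal(size=(n, K)) @ basis.T    # observed functional predictors on the grid
    y = X @ (basis @ coef_true) / T + 0.1 * rng.normal(size=n)   # scalar response

    scores = X @ basis / T                   # project each curve onto the basis
    fit = Lasso(alpha=0.01).fit(scores, y)   # l1 penalty selects the active basis functions
    print("basis functions selected by the l1 penalty:", np.nonzero(fit.coef_)[0])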

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2413747","Collaborative Research: NSF MPS/DMS-EPSRC: Stochastic Shape Processes and Inference","DMS","STATISTICS","08/01/2024","06/20/2024","Sebastian Kurtek","OH","Ohio State University","Standard Grant","Yulia Gel","07/31/2027","$199,555.00","","kurtek.1@osu.edu","1960 KENNY RD","COLUMBUS","OH","432101016","6146888735","MPS","126900","1269, 7929","$0.00","The intimate link between form, or shape, and function is ubiquitous in science. In biology, for instance, the shapes of biological components are pivotal in understanding patterns of normal behavior and growth; a notable example is protein shape, which contributes to our understanding of protein function and classification. This project, led by a team of investigators from the USA and the UK, will develop ways of modeling how biological and other shapes change with time, using formal statistical frameworks that capture not only the changes themselves, but how these changes vary across objects and populations. This will enable the study of the link between form and function in all its variability. As example applications, the project will develop models for changes in cell morphology and topology during motility and division, and changes in human posture during various activities, facilitating the exploration of scientific questions such as how and why cell division fails, or how to improve human postures in factory tasks. These are proofs of concept, but the methods themselves will have much wider applicability. This project will thus not only progress the science of shape analysis and the specific applications studied; it will have broader downstream impacts on a range of scientific application domains, providing practitioners with general and useful tools.

While there are several approaches for representing and analyzing static shapes, encompassing curves, surfaces, and complex structures like trees and shape graphs, the statistical modeling and analysis of dynamic shapes has received limited attention. Mathematically, shapes are elements of quotient spaces of nonlinear manifolds, and shape changes can be modeled as stochastic processes, termed shape processes, on these complex spaces. The primary challenges lie in adapting classical modeling concepts to the nonlinear geometry of shape spaces and in developing efficient statistical tools for computation and inference in such very high-dimensional, nonlinear settings. The project consists of three thrust areas, dealing with combinations of discrete and continuous time, and discrete and continuous representations of shape, with a particular emphasis on the issues raised by topology changes. The key idea is to integrate spatiotemporal registration of objects and their evolution into the statistical formulation, rather than treating them as pre-processing steps. This project will specifically add to the current state-of-the-art in topic areas such as stochastic differential equations on shape manifolds, time series models for shapes, shape-based functional data analysis, and modeling and inference on infinite-dimensional shape spaces.
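The notion of a stochastic process on a curved space can be illustrated with the simplest manifold example, a geodesic random walk on the unit sphere built from its exponential map (a toy stand-in only: actual shape spaces are quotients of far higher-dimensional manifolds, and the step size is an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(2)

    def geodesic_step(p, step=0.1):
        # Move from p along a random tangent direction via the sphere's exponential map
        v = rng.normal(size=p.shape)
        v -= (v @ p) * p                     # project onto the tangent space at p
        v /= np.linalg.norm(v)               # unit tangent vector
        return np.cos(step) * p + np.sin(step) * v

    path = [np.array([0.0, 0.0, 1.0])]
    for _ in range(200):
        path.append(geodesic_step(path[-1]))
    path = np.array(path)
    print("stays on the sphere:", np.allclose(np.linalg.norm(path, axis=1), 1.0))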

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413833","Collaborative Research: Nonparametric Learning in High-Dimensional Survival Analysis for causal inference and sequential decision making","DMS","STATISTICS","07/01/2024","06/18/2024","Shanshan Ding","DE","University of Delaware","Standard Grant","Jun Zhu","06/30/2027","$200,000.00","Wei Qian","sding@udel.edu","220 HULLIHEN HALL","NEWARK","DE","197160099","3028312136","MPS","126900","9150","$0.00","Data with survival outcomes are commonly encountered in real-world applications to capture the time duration until a specific event of interest occurs. Nonparametric learning for high dimensional survival data offers promising avenues in practice because of its ability to capture complex relationships and provide comprehensive insights for diverse problems in medical and business services, where vast covariates and individual metrics are prevalent. This project will significantly advance the methods and theory for nonparametric learning in high-dimensional survival data analysis, with a specific focus on causal inference and sequential decision making problems. The study will be of interest to practitioners in various fields, particularly providing useful methods for medical researchers to discover relevant risk factors, assess causal treatment effects, and utilize personalized treatment strategies in contemporary health sciences. It will also provide useful analytics tools beneficial to financial and related institutions for assessing user credit risks and facilitating informed decisions through personalized services. The theoretical and empirical studies to incorporate complex nonparametric structures in high-dimensional survival analysis, together with their interdisciplinary applications, will create valuable training and research opportunities for graduate and undergraduate students, including those from underrepresented minority groups.

Under flexible nonparametric learning frameworks, new embedding methods and learning algorithms will be developed for high dimensional survival analysis. First, the investigators will develop supervised doubly robust linear embedding and supervised nonlinear manifold learning methods for supervised dimension reduction of high dimensional survival data, without imposing stringent model or distributional assumptions. Second, a robust nonparametric learning framework will be established for estimating causal treatment effects for high dimensional survival data that allows the covariate dimension to grow much faster than the sample size. Third, motivated by applications in personalized service, the investigators will develop a new nonparametric multi-stage algorithm for high dimensional censored bandit problems that allows flexibility for potentially non-linear decision boundaries while retaining optimal regret guarantees.
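A textbook building block for causal effects with censored outcomes is inverse-probability-of-censoring weighting; the sketch below assumes randomized treatment, independent censoring, and a hand-rolled Kaplan-Meier estimate of the censoring distribution (the project's doubly robust, high-dimensional estimators go well beyond this):

    import numpy as np

    def km_survival(times, events, t0):
        # Kaplan-Meier estimate of P(time > t0); events == 1 marks an observed event
        order = np.argsort(times)
        times, events = times[order], events[order]
        at_risk, surv = len(times), 1.0
        for x, d in zip(times, events):
            if x > t0:
                break
            if d:
                surv *= 1.0 - 1.0 / at_risk
            at_risk -= 1
        return surv

    rng = np.random.default_rng(3)
    n, t0 = 500, 1.0
    arm = rng.integers(0, 2, n)                   # randomized treatment assignment
    T = rng.exponential(1.0 + 0.5 * arm)          # latent survival times (treatment helps)
    C = rng.exponential(2.0, n)                   # independent censoring times
    X, delta = np.minimum(T, C), (T <= C).astype(int)

    est = {}
    for a in (0, 1):
        m = arm == a
        G = km_survival(X[m], 1 - delta[m], t0)   # censoring survival P(C > t0)
        est[a] = np.mean((X[m] > t0) / G)         # IPCW estimate of P(T > t0)
    print(f"estimated effect on P(T > {t0}): {est[1] - est[0]:+.3f}")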

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413721","New Directions in Bayesian Heterogeneous Data Integration: Methods, Theory and Applications","DMS","STATISTICS","07/01/2024","06/17/2024","Sharmistha Guha","TX","Texas A&M University","Continuing Grant","Tapabrata Maiti","06/30/2027","$49,899.00","","sharmistha@tamu.edu","400 HARVEY MITCHELL PKY S STE 30","COLLEGE STATION","TX","778454375","9798626777","MPS","126900","1269","$0.00","As the scientific community is moving into a data-driven era, there is an unprecedented opportunity for the integrative analysis of network and functional data from multiple sources to uncover important scientific insights which might be missing when these data sources are analyzed in isolation. To this end, this project plans to transform the current landscape of integrating network and functional data, leveraging their combined strength for scientific advancements through the development of innovative hierarchical Bayesian statistical models. The proposed work holds transformative promise in vital scientific domains, such as cognitive and motor aging, and neurodegenerative diseases. It will enhance scientific collaborations with neuroscientists using multi-source image data for targeted investigations of key brain regions significant in the study of motor and cognitive aging. Moreover, the proposed research will facilitate the prediction of images, traditionally acquired via costly imaging modalities, utilizing images from more cost-effective alternatives, which is poised to bring about transformative changes in the healthcare economy. The open-source software and educational materials created will be maintained and accessible to a wider audience of statisticians and domain experts. This accessibility is anticipated to foster widespread adoption of these techniques among statisticians and domain scientists. The PI's involvement in conference presentations, specialized course development, curriculum expansion, graduate student mentoring, undergraduate research engagement with a focus on under-represented backgrounds, and provision of short courses will enhance dissemination efforts and encourage diverse utilization of the developed methods.

The proposed project aims to address the urgent need for principled statistical approaches to seamlessly merge information from diverse sources, including modern network and functional data. It challenges the prevailing trend of analyzing individual data sources, which inherently limits the potential for uncovering innovative scientific insights that could arise from integrating multiple sources. Hierarchical Bayesian models are an effective way to capture the complex structures in network and functional data. These models naturally share information among heterogeneous objects, providing comprehensive uncertainty in inference through science-driven joint posterior distributions. Despite the potential advantages of Bayesian perspectives, their widespread adoption is hindered by the lack of theoretical guarantees, computational challenges, and difficulties in specifying robust priors for high-dimensional problems. This proposal will address these limitations by integrating network and functional data, leveraging their combined strength for scientific advancements through the development of innovative hierarchical Bayesian models. Specifically, the project will develop a semi-parametric joint regression framework with network and functional responses, deep network regression with multiple network responses, and Bayesian interpretable deep neural network regression with functional response on network and functional predictors. Besides offering a novel toolbox for multi-source object data integration, the proposed approach will advance the emerging field of interpretable deep learning for object regression by formulating novel and interpretable deep neural networks that combine predictive power with statistical model interpretability. The project will develop Bayesian asymptotic results to guarantee accurate parametric and predictive inference from these models as a function of network and functional features and sample size, an unexplored domain in the Bayesian integration of multi-object data. The proposed methodology will significantly enhance the seamless integration of multimodal neuroimaging data, leading to principled inferences and deeper comprehension of brain structure and function in the study of Alzheimer's disease and aging.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413552","Collaborative Research: Statistical Network Integration","DMS","STATISTICS","07/01/2024","06/17/2024","Joshua Cape","WI","University of Wisconsin-Madison","Continuing Grant","Yulia Gel","06/30/2027","$37,235.00","","jrcape@wisc.edu","21 N PARK ST STE 6301","MADISON","WI","537151218","6082623822","MPS","126900","1269","$0.00","This project pursues the contemporary problem of statistical network integration facing scientists, practitioners, and theoreticians. The study of networks and graph-structured data has received growing attention in recent years, motivated by investigations of complex systems throughout the biological and social sciences. Models and methods have been developed to analyze network data objects, often focused on single networks or homogeneous data settings, yet modern available data are increasingly heterogeneous, multi-sample, and multi-modal. Consequently, there is a growing need to leverage data arising from different sources that result in multiple network observations with attributes. This project will develop statistically principled data integration methodologies for neuroimaging studies, which routinely collect multiple subject data across different groups (strains, conditions, age groups), modalities (functional and diffusion MRI), and brain covariate information (phenotypes, health status, gene expression data from brain tissue). The investigators will offer interdisciplinary mentoring opportunities to students participating in the research project and co-teach a workshop based on the proposed research.

The goals of this project are to establish flexible, parsimonious latent space models for network integration and to develop efficient, theoretically justified inference procedures for such models. More specifically, this project will develop latent space models to disentangle common and individual local and global latent features in samples of networks, propose efficient spectral matrix-based methods for data integration, provide high-dimensional structured penalties for dimensionality reduction and regularization in network data, and develop cross-validation methods for multiple network data integration. New theoretical developments spanning concentration inequalities, eigenvector perturbation analysis, and distributional asymptotic results will elucidate the advantages and limitations of these methods in terms of signal aggregation, heterogeneity, and flexibility. Applications of these methodologies to the analysis of multi-subject brain network data will be studied. Emphasis will be on interpretability, computation, and theoretical justification.
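The simplest instance of a spectral matrix-based integration method is adjacency spectral embedding of the mean of several networks that share latent positions; the sketch below assumes a random dot product model, and the rank k and sizes are illustrative choices rather than the project's methodology:

    import numpy as np

    rng = np.random.default_rng(4)
    n, m, k = 60, 5, 2                                      # nodes, networks, latent dimension
    Z = rng.uniform(0.2, 0.8, size=(n, k))                  # shared latent positions
    P = np.clip(Z @ Z.T, 0.0, 1.0)                          # edge probabilities
    adjs = [(rng.uniform(size=(n, n)) < P).astype(float) for _ in range(m)]
    adjs = [np.triu(A, 1) + np.triu(A, 1).T for A in adjs]  # symmetric, no self-loops

    Abar = np.mean(adjs, axis=0)                            # aggregate the sample of networks
    vals, vecs = np.linalg.eigh(Abar)
    top = np.argsort(vals)[-k:]
    Zhat = vecs[:, top] * np.sqrt(np.abs(vals[top]))        # spectral embedding of the mean
    print("estimated latent position matrix shape:", Zhat.shape)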

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." @@ -6,13 +9,14 @@ "2412832","Collaborative Research: Statistical Modeling and Inference for Object-valued Time Series","DMS","STATISTICS","07/01/2024","06/17/2024","Changbo Zhu","IN","University of Notre Dame","Continuing Grant","Jun Zhu","06/30/2027","$56,755.00","","czhu4@nd.edu","836 GRACE HALL","NOTRE DAME","IN","465566031","5746317432","MPS","126900","","$0.00","Random objects in general metric spaces have become increasingly common in many fields. For example, the intraday return path of a financial asset, the age-at-death distributions, the annual composition of energy sources, social networks, phylogenetic trees, and EEG scans or MRI fiber tracts of patients can all be viewed as random objects in certain metric spaces. For many endeavors in this area, the data being analyzed is collected with a natural ordering, i.e., the data can be viewed as an object-valued time series. Despite its prevalence in many applied problems, statistical analysis for such time series is still in its early development. A fundamental difficulty of developing statistical techniques is that the spaces where these objects live are nonlinear and commonly used algebraic operations are not applicable. This research project aims to develop new models, methodology and theory for the analysis of object-valued time series. Research results from the project will be disseminated to the relevant scientific communities via publications, conference and seminar presentations. The investigators will jointly mentor a Ph.D. student and involve undergraduate students in the research, as well as offering advanced topic courses to introduce the state-of-the-art techniques in object-valued time series analysis.

The project will develop a systematic body of methods and theory on modeling and inference for object-valued time series. Specifically, the investigators propose to (1) develop a new autoregressive model for distributional time series in Wasserstein geometry and a suite of tools for model estimation, selection, and diagnostic checking; (2) develop new specification testing procedures for distributional time series in the one-dimensional Euclidean space; and (3) develop new change-point detection methods to detect distribution shifts in a sequence of object-valued time series. The above three projects tackle several important modeling and inference issues in the analysis of object-valued time series, the investigation of which will lead to innovative methodological and theoretical developments, and lay the groundwork for this emerging field.
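For distributions on the real line, Wasserstein geometry reduces to the L2 geometry of quantile functions, so the flavor of an autoregressive distributional model can be sketched on a quantile grid; the Gaussian simulation and single scalar coefficient below are illustrative assumptions, not the proposed model:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(5)
    levels = np.linspace(0.01, 0.99, 99)        # quantile levels
    T_len = 300
    mu = np.zeros(T_len)                        # AR(1) location driving the distributions
    for t in range(1, T_len):
        mu[t] = 0.8 * mu[t - 1] + 0.3 * rng.normal()
    Q = mu[:, None] + norm.ppf(levels)[None, :] # quantile function of N(mu_t, 1) at each t

    Qc = Q - Q.mean(axis=0)                     # center at the Wasserstein (quantile-wise) mean
    phi = np.sum(Qc[1:] * Qc[:-1]) / np.sum(Qc[:-1] ** 2)
    print(f"fitted autoregressive coefficient: {phi:.3f} (truth 0.8)")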

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2412403","Robust Extensions to Bayesian Regression Trees for Complex Data","DMS","STATISTICS","08/01/2024","06/17/2024","HENGRUI LUO","TX","William Marsh Rice University","Continuing Grant","Tapabrata Maiti","07/31/2027","$58,710.00","","hl180@rice.edu","6100 MAIN ST","Houston","TX","770051827","7133484820","MPS","126900","","$0.00","This project is designed to extend the capabilities of tree-based models within the context of machine learning. Tree-based models allow for decision-making based on clear, interpretable rules and are widely adopted in diagnostic and learning tasks. This project will develop novel methodologies to enhance their robustness. Specifically, the research will integrate deep learning techniques with tree-based statistical methods to create models capable of processing complex, high-dimensional data from medical imaging, healthcare, and AI sectors. These advancements aim to significantly improve prediction and decision-making processes, enhancing efficiency and accuracy across a broad range of applications. The project also prioritizes inclusivity and education by integrating training components, thereby advancing scientific knowledge and disseminating results through publications and presentations.

The proposed research leverages Bayesian hierarchies and transformation techniques on trees to develop models capable of managing complex transformations of input data. These models will be tailored to improve interpretability, scalability, and robustness, overcoming current limitations in non-parametric machine learning applications. The project will utilize hierarchical layered structures, where outputs from one tree serve as inputs to subsequent trees, forming network architectures that enhance precision in modeling complex data patterns and relationships. Bayesian techniques will be employed to effectively quantify uncertainty and create ensembles, providing reliable predictions essential for critical offline prediction and real-time decision-making processes. This initiative aims to develop pipelines and set benchmarks for the application of tree-based models across diverse scientific and engineering disciplines.
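The layered architecture, in which one tree's outputs feed the next, can be sketched with off-the-shelf regression trees; a bootstrap ensemble stands in here as a rough frequentist surrogate for the Bayesian uncertainty machinery, and depths and sizes are arbitrary choices:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(6)
    X = rng.uniform(-2, 2, size=(400, 3))
    y = np.sin(X[:, 0]) * X[:, 1] + 0.1 * rng.normal(size=400)

    # Layer 1: an ensemble of shallow trees fit to bootstrap resamples of the raw inputs
    layer1 = [DecisionTreeRegressor(max_depth=3, random_state=i).fit(X[idx], y[idx])
              for i, idx in enumerate(rng.integers(0, 400, size=(10, 400)))]
    H = np.column_stack([t.predict(X) for t in layer1])   # outputs become features

    # Layer 2: a tree fit on layer-1 outputs, forming a two-layer tree network
    layer2 = DecisionTreeRegressor(max_depth=3, random_state=0).fit(H, y)
    resid = y - layer2.predict(H)
    print(f"layer-2 residual std: {resid.std():.3f} (raw response std {y.std():.3f})")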

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2412015","Statistical methods for point-process time series","DMS","STATISTICS","07/01/2024","06/17/2024","Daniel Gervini","WI","University of Wisconsin-Milwaukee","Standard Grant","Jun Zhu","06/30/2027","$149,989.00","","gervini@uwm.edu","3203 N DOWNER AVE # 273","MILWAUKEE","WI","532113188","4142294853","MPS","126900","","$0.00","This research project will develop statistical models and inference methods for the analysis of random point processes. Random point processes are events that occur at random in time or space according to certain patterns; this project will provide methods for the discovery and analysis of such patterns. Examples of events that can be modelled as random point processes include cyberattacks on a computer network, earthquakes, crimes in a city, spikes of neural activity in humans and animals, car crashes in a highway, and many others. Therefore, the methods to be developed under this project will find applications in many fields, such as national security, economy, neuroscience and geosciences, among others. The project will also provide training opportunities for graduate and undergraduate students in the field of Data Science.

This project will specifically develop statistical tools for the analysis of time series of point processes, that is, for point processes that are observed repeatedly over time; for example, when the spatial distribution of crime in a city is observed for several days. These tools will include trend estimation methods, autocorrelation estimation methods, and autoregressive models. Research activities in this project include the development of parameter estimation procedures, their implementation in computer programs, the study of theoretical large sample properties of these methods, the study of small sample properties by simulation, and their application to real-data problems. Other activities in this project include educational activities, such as the supervision of Ph.D. and Master's students, and the development of graduate and undergraduate courses in Statistics and Data Science.
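A minimal sketch of a time series of point processes and its autocorrelation, assuming daily Poisson counts driven by a latent AR(1) log-intensity (one-day counts stand in for estimated intensities; the project's trend and autocorrelation estimators for full temporal or spatial patterns are of course richer):

    import numpy as np

    rng = np.random.default_rng(7)
    days = 300
    log_lam = np.zeros(days)                    # latent AR(1) log-intensity
    for t in range(1, days):
        log_lam[t] = 0.7 * log_lam[t - 1] + 0.4 * rng.normal()
    counts = rng.poisson(np.exp(log_lam + 3.0)) # daily event counts of the point process

    x = counts - counts.mean()                  # crude daily intensity estimates, centered
    acf = [float(np.sum(x[h:] * x[:-h]) / np.sum(x * x)) for h in range(1, 6)]
    print("estimated ACF of daily intensities at lags 1-5:", np.round(acf, 3))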

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2412408","Monitoring time series in structured function spaces","DMS","STATISTICS","07/01/2024","06/14/2024","Piotr Kokoszka","CO","Colorado State University","Standard Grant","Yulia Gel","06/30/2027","$292,362.00","","Piotr.Kokoszka@colostate.edu","601 S HOWES ST","FORT COLLINS","CO","805212807","9704916355","MPS","126900","1269","$0.00","This project aims to develop new mathematical theory and statistical tools that will enable monitoring for changes in complex systems, for example global trade networks. Comprehensive databases containing details of trade between almost all countries are available. Detecting in real time a change in the typical pattern of trade and identifying countries where this change takes place is an important problem. This project will provide statistical methods that will allow making decisions about an emergence of an atypical pattern in a complex system in real time with certain theoretical guarantees. The project will also offer multiple interdisciplinary training opportunities for the next generation of statisticians and data scientists.

The methodology that will be developed is related to sequential change point detection, but is different because the in-control state is estimated rather than assumed. This requires new theoretical developments because it deals with complex infinite dimensional systems, whereas existing mathematical tools apply only to finite-dimensional systems. Panels of structured functions will be considered and methods for on-line identification of components undergoing change will be devised. All methods will be inferential with controlled probabilities of type I errors. Some of the key aspects of the project can be summarized in the following points. First, statistical theory leading to change point monitoring schemes in infinite dimensional function spaces will be developed. Second, strong approximations valid in Banach spaces will lead to assumptions not encountered in scalar settings and potentially to different threshold functions. Third, for monitoring of random density functions, the above challenges will be addressed in custom metric spaces. Fourth, since random densities are not observable, the effect of estimation will be incorporated. The new methodology will be applied to viral load measurements, investment portfolios, and global trade data.
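The flavor of monitoring with an estimated (rather than assumed) in-control state can be sketched for curves arriving sequentially; the baseline mean function is estimated from a training sample, and the CUSUM threshold below is an arbitrary illustration, not a calibrated type I error bound:

    import numpy as np

    rng = np.random.default_rng(8)
    grid = np.linspace(0, 1, 50)

    def draw_curve(shift=0.0):
        # smooth random curve: two random Fourier coefficients plus a possible mean shift
        return (shift + 0.5 * rng.normal() * np.sin(2 * np.pi * grid)
                      + 0.5 * rng.normal() * np.cos(2 * np.pi * grid))

    train = np.array([draw_curve() for _ in range(100)])   # historical, in-control sample
    mu_hat = train.mean(axis=0)                             # ESTIMATED in-control state
    scale = np.sqrt(np.mean((train - mu_hat) ** 2))

    cusum, threshold = 0.0, 5.0                             # threshold: illustrative only
    for i in range(1, 200):
        curve = draw_curve(shift=0.3 if i > 120 else 0.0)   # change occurs at time 121
        stat = np.sqrt(np.mean((curve - mu_hat) ** 2)) / scale
        cusum = max(0.0, cusum + stat - 1.0)                # one-sided CUSUM recursion
        if cusum > threshold:
            print("alarm raised at monitoring time", i)
            break
    else:
        print("no alarm raised")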

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2412628","Collaborative Research: Partial Priors, Regularization, and Valid & Efficient Probabilistic Structure Learning","DMS","STATISTICS","07/01/2024","06/17/2024","Ryan Martin","NC","North Carolina State University","Standard Grant","Yulia Gel","06/30/2027","$160,000.00","","rgmarti3@ncsu.edu","2601 WOLF VILLAGE WAY","RALEIGH","NC","276950001","9195152444","MPS","126900","1269","$0.00","Modern applications of statistics aim to solve complex scientific problems involving high-dimensional unknowns. One feature that these applications often share is that the high-dimensional unknown is believed to satisfy a complexity-limiting, low-dimensional structure. Specifics of the posited low-dimensional structure are mostly unknown, so a statistically interesting and scientifically relevant problem is structure learning, i.e., using data to learn the latent low-dimensional structure. Because structure learning problems are ubiquitous and reliable uncertainty quantification is imperative, results from this project will have an impact across the biomedical, physical, and social sciences. In addition, the project will offer multiple opportunities for career development of new generations of statisticians and data scientists.

Frequentist methods focus on data-driven estimation or selection of a candidate structure, but currently there are no general strategies for reliable uncertainty quantification concerning the unknown structure. Bayesian methods produce a data-dependent probability distribution over the space of structures that can be used for uncertainty quantification, but that distribution comes with no reliability guarantees. A barrier to progress in reliable uncertainty quantification is a pair of oppositely extreme perspectives: frequentists treat modeling structural/parametric uncertainty as anathema, while Bayesians insist that such uncertainty always be modeled precisely and probabilistically. Overcoming this barrier requires a new perspective falling between these two extremes, and this project will develop a new framework that features a more general and flexible perspective on probability, namely, imprecise probability. Most importantly, this framework will resolve the aforementioned issues by offering new and powerful methods boasting provably reliable uncertainty quantification in structure learning applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." -"2413747","Collaborative Research: NSF MPS/DMS-EPSRC: Stochastic Shape Processes and Inference","DMS","STATISTICS","08/01/2024","06/20/2024","Sebastian Kurtek","OH","Ohio State University","Standard Grant","Yulia Gel","07/31/2027","$199,555.00","","kurtek.1@osu.edu","1960 KENNY RD","COLUMBUS","OH","432101016","6146888735","MPS","126900","1269, 7929","$0.00","The intimate link between form, or shape, and function is ubiquitous in science. In biology, for instance, the shapes of biological components are pivotal in understanding patterns of normal behavior and growth; a notable example is protein shape, which contributes to our understanding of protein function and classification. This project, led by a team of investigators from the USA and the UK, will develop ways of modeling how biological and other shapes change with time, using formal statistical frameworks that capture not only the changes themselves, but how these changes vary across objects and populations. This will enable the study of the link between form and function in all its variability. As example applications, the project will develop models for changes in cell morphology and topology during motility and division, and changes in human posture during various activities, facilitating the exploration of scientific questions such as how and why cell division fails, or how to improve human postures in factory tasks. These are proofs of concept, but the methods themselves will have much wider applicability. This project will thus not only progress the science of shape analysis and the specific applications studied; it will have broader downstream impacts on a range of scientific application domains, providing practitioners with general and useful tools.

While there are several approaches for representing and analyzing static shapes, encompassing curves, surfaces, and complex structures like trees and shape graphs, the statistical modeling and analysis of dynamic shapes has received limited attention. Mathematically, shapes are elements of quotient spaces of nonlinear manifolds, and shape changes can be modeled as stochastic processes, termed shape processes, on these complex spaces. The primary challenges lie in adapting classical modeling concepts to the nonlinear geometry of shape spaces and in developing efficient statistical tools for computation and inference in such very high-dimensional, nonlinear settings. The project consists of three thrust areas, dealing with combinations of discrete and continuous time, and discrete and continuous representations of shape, with a particular emphasis on the issues raised by topology changes. The key idea is to integrate spatiotemporal registration of objects and their evolution into the statistical formulation, rather than treating them as pre-processing steps. This project will specifically add to the current state-of-the-art in topic areas such as stochastic differential equations on shape manifolds, time series models for shapes, shape-based functional data analysis, and modeling and inference on infinite-dimensional shape spaces.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2413952","Collaborative Research: Statistical Inference for High Dimensional and High Frequency Data: Contiguity, Matrix Decompositions, Uncertainty Quantification","DMS","STATISTICS","07/01/2024","06/21/2024","Per Mykland","IL","University of Chicago","Standard Grant","Jun Zhu","06/30/2027","$219,268.00","","mykland@galton.uchicago.edu","5801 S ELLIS AVE","CHICAGO","IL","606375418","7737028669","MPS","126900","","$0.00","To pursue the promise of the big data revolution, the current project is concerned with a particular form of such data, high dimensional high frequency data (HD2), where series of high-dimensional observations can see new data updates in fractions of milliseconds. With technological advances in data collection, HD2 data occurs in medicine (from neuroscience to patient care), finance and economics, geosciences (such as earthquake data), marine science (fishing and shipping), and, of course, in internet data. This research project focuses on how to extract information from HD2 data, and how to turn this data into knowledge. As part of the process, the project develops cutting-edge mathematics and statistical methodology to uncover the dependence structure governing HD2 data. In addition to developing a general theory, the project is concerned with applications to financial data, including risk management, forecasting, and portfolio management. More precise estimators, with improved margins of error, will be useful in all these areas of finance. The results will be of interest to main-street investors, regulators and policymakers, and the results will be entirely in the public domain. The project will also provide research training opportunities for students.

In more detail, the project will focus on four linked questions for HD2 data: contiguity, matrix decompositions, uncertainty quantification, and the estimation of spot quantities. The investigators will extend their contiguity theory to the common case where observations have noise, which also permits the use of longer local intervals. Under a contiguous probability, the structure of the observations is often more accessible (frequently Gaussian) in local neighborhoods, facilitating statistical analysis. This is achieved without altering the underlying models. Because the effect of the probability change is quite transparent, this approach also enables more direct uncertainty quantification. To model HD2 data, the investigators will explore time-varying matrix decompositions, including the development of a singular value decomposition (SVD) for high frequency data, as a more direct path to a factor model. Both SVD and principal component analysis (PCA) benefit from contiguity, which eases both the time-varying construction and uncertainty quantification. The latter is of particular importance not only to set standard errors, but also to determine the trade-offs involved in estimation under longitudinal variation: for example, how many minutes or days are required to estimate a covariance matrix or singular vectors? The investigators also plan to develop volatility matrices for the drift part of a financial process, and their PCAs. The work on matrix decompositions will also benefit from projected results on spot estimation, which also ties in with contiguity. It is expected that the consequences of the contiguity and the HD2 inference will be transformational, leading to more efficient estimators and better prediction, and that this approach will form a new paradigm for high frequency data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413864","Statistical Properties of Neural Networks","DMS","STATISTICS","07/01/2024","06/18/2024","Sourav Chatterjee","CA","Stanford University","Standard Grant","Tapabrata Maiti","06/30/2027","$225,000.00","","souravc@stanford.edu","450 JANE STANFORD WAY","STANFORD","CA","943052004","6507232300","MPS","126900","1269","$0.00","Neural networks have revolutionized science and engineering in recent years, but their theoretical properties are still poorly understood. The proposed projects aim to gain a deeper understanding of these theoretical properties, especially the statistical ones. It is a matter of intense debate whether neural networks can ""think"" like humans do, by recognizing logical patterns. The project aims to take a small step towards showing that under ideal conditions, perhaps they can. If successful, this will have impact in a vast range of applications of neural networks. This award includes support and mentoring for graduate students.

In one direction, it is proposed to study features of deep neural networks that distinguish them from classical statistical parametric models. Preliminary results suggest that the lack of identifiability is the differentiating factor. In a second direction, it is proposed to investigate the extent to which neural networks may be seen as algorithm approximators, going beyond the classical literature on universal function approximation for neural networks. This perspective may shed light on recent empirical phenomena in neural networks, including the surprising emergent behavior of transformers and large language models.
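The non-identifiability that separates neural networks from classical parametric models is easy to exhibit: permuting a network's hidden units changes the parameter vector but not the function it computes (a toy illustration, not the proposed research itself):

    import numpy as np

    rng = np.random.default_rng(9)
    W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # layer 1: 3 inputs -> 4 hidden
    W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)   # layer 2: 4 hidden -> 1 output

    def net(x, W1, b1, W2, b2):
        return W2 @ np.tanh(W1 @ x + b1) + b2

    perm = np.array([2, 0, 3, 1])                          # relabel the hidden units
    W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

    x = rng.normal(size=3)
    # Two distinct parameter vectors realize the same function: the model is not identifiable
    print(net(x, W1, b1, W2, b2), net(x, W1p, b1p, W2p, b2))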

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413834","Collaborative Research: Nonparametric Learning in High-Dimensional Survival Analysis for causal inference and sequential decision making","DMS","STATISTICS","07/01/2024","06/18/2024","Zhezhen Jin","NY","Columbia University","Standard Grant","Jun Zhu","06/30/2027","$100,000.00","","zj7@columbia.edu","615 W 131ST ST","NEW YORK","NY","100277922","2128546851","MPS","126900","","$0.00","Data with survival outcomes are commonly encountered in real-world applications to capture the time duration until a specific event of interest occurs. Nonparametric learning for high dimensional survival data offers promising avenues in practice because of its ability to capture complex relationships and provide comprehensive insights for diverse problems in medical and business services, where vast covariates and individual metrics are prevalent. This project will significantly advance the methods and theory for nonparametric learning in high-dimensional survival data analysis, with a specific focus on causal inference and sequential decision making problems. The study will be of interest to practitioners in various fields, particularly providing useful methods for medical researchers to discover relevant risk factors, assess causal treatment effects, and utilize personalized treatment strategies in contemporary health sciences. It will also provide useful analytics tools beneficial to financial and related institutions for assessing user credit risks and facilitating informed decisions through personalized services. The theoretical and empirical studies to incorporate complex nonparametric structures in high-dimensional survival analysis, together with their interdisciplinary applications, will create valuable training and research opportunities for graduate and undergraduate students, including those from underrepresented minority groups.

Under flexible nonparametric learning frameworks, new embedding methods and learning algorithms will be developed for high dimensional survival analysis. First, the investigators will develop supervised doubly robust linear embedding and supervised nonlinear manifold learning methods for supervised dimension reduction of high dimensional survival data, without imposing stringent model or distributional assumptions. Second, a robust nonparametric learning framework will be established for estimating causal treatment effects for high dimensional survival data that allows the covariate dimension to grow much faster than the sample size. Third, motivated by applications in personalized service, the investigators will develop a new nonparametric multi-stage algorithm for high dimensional censored bandit problems that allows flexibility for potentially non-linear decision boundaries while retaining optimal regret guarantees.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413405","Collaborative Research: Statistical Optimal Transport: Foundation, Computation and Applications","DMS","STATISTICS","07/01/2024","06/18/2024","Kengo Kato","NY","Cornell University","Standard Grant","Yong Zeng","06/30/2027","$160,000.00","","kk976@cornell.edu","341 PINE TREE RD","ITHACA","NY","148502820","6072555014","MPS","126900","079Z","$0.00","Comparing probability models is a fundamental task in almost every data-enabled problem, and Optimal Transport (OT) offers a powerful and versatile framework to do so. Recent years have witnessed a rapid development of computational OT, which has expanded applications of OT to statistics, including clustering, generative modeling, domain adaptation, distribution-to-distribution regression, dimension reduction, and sampling. Still, understanding the fundamental strengths and limitations of OT as a statistical tool is much to be desired. This research project aims to fill this important gap by advancing statistical analysis (estimation and inference) and practical approximation of two fundamental notions (average and quantiles) in statistics and machine learning, demonstrated through modern applications for measure-valued data. The project also provides research training opportunities for graduate students.

The award contains three main research projects. The first project will develop a new regularized formulation of the Wasserstein barycenter based on the multi-marginal OT and conduct an in-depth statistical analysis, encompassing sample complexity, limiting distributions, and bootstrap consistency. The second project will establish asymptotic distribution and bootstrap consistency results for linear functionals of OT maps and will study sharp asymptotics for entropically regularized OT maps when regularization parameters tend to zero. Building on the first two projects, the third project explores applications of the OT methodology to two important statistical tasks: dimension reduction and vector quantile regression. The research agenda will develop a novel and computationally efficient principal component method for measure-valued data and a statistically valid duality-based estimator for quantile regression with multivariate responses. The three projects will produce novel technical tools integrated from OT theory, empirical process theory, and partial differential equations, which are essential for OT-based inferential methods and will inspire new applications of OT to measure-valued and multivariate data.
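Entropically regularized OT of the kind referenced above is typically computed with Sinkhorn iterations; a minimal sketch for two histograms on a grid follows (the regularization strength eps and iteration count are arbitrary choices):

    import numpy as np

    def sinkhorn(a, b, C, eps=0.05, iters=500):
        # Entropically regularized OT between histograms a, b with cost matrix C
        K = np.exp(-C / eps)
        u = np.ones_like(a)
        for _ in range(iters):
            v = b / (K.T @ u)                 # alternate scaling updates
            u = a / (K @ v)
        P = u[:, None] * K * v[None, :]       # regularized transport plan
        return np.sum(P * C)                  # transport cost under the plan

    x = np.linspace(0, 1, 50)
    a = np.exp(-((x - 0.3) ** 2) / 0.01); a /= a.sum()
    b = np.exp(-((x - 0.7) ** 2) / 0.01); b /= b.sum()
    C = (x[:, None] - x[None, :]) ** 2        # squared-distance cost
    print(f"entropic OT cost: {sinkhorn(a, b, C):.4f}")   # near (0.7 - 0.3)^2 = 0.16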

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." +"2413748","Collaborative Research: NSF MPS/DMS-EPSRC: Stochastic Shape Processes and Inference","DMS","STATISTICS","08/01/2024","06/20/2024","Anuj Srivastava","FL","Florida State University","Standard Grant","Yulia Gel","07/31/2027","$200,000.00","","anuj@stat.fsu.edu","874 TRADITIONS WAY","TALLAHASSEE","FL","323060001","8506445260","MPS","126900","1269, 7929","$0.00","The intimate link between form, or shape, and function is ubiquitous in science. In biology, for instance, the shapes of biological components are pivotal in understanding patterns of normal behavior and growth; a notable example is protein shape, which contributes to our understanding of protein function and classification. This project, led by a team of investigators from the USA and the UK, will develop ways of modeling how biological and other shapes change with time, using formal statistical frameworks that capture not only the changes themselves, but how these changes vary across objects and populations. This will enable the study of the link between form and function in all its variability. As example applications, the project will develop models for changes in cell morphology and topology during motility and division, and changes in human posture during various activities, facilitating the exploration of scientific questions such as how and why cell division fails, or how to improve human postures in factory tasks. These are proofs of concept, but the methods themselves will have much wider applicability. This project will thus not only progress the science of shape analysis and the specific applications studied; it will have broader downstream impacts on a range of scientific application domains, providing practitioners with general and useful tools.

While there are several approaches for representing and analyzing static shapes, encompassing curves, surfaces, and complex structures like trees and shape graphs, the statistical modeling and analysis of dynamic shapes has received limited attention. Mathematically, shapes are elements of quotient spaces of nonlinear manifolds, and shape changes can be modeled as stochastic processes, termed shape processes, on these complex spaces. The primary challenges lie in adapting classical modeling concepts to the nonlinear geometry of shape spaces and in developing efficient statistical tools for computation and inference in such very high-dimensional, nonlinear settings. The project consists of three thrust areas, dealing with combinations of discrete and continuous time, and discrete and continuous representations of shape, with a particular emphasis on the issues raised by topology changes. The key idea is to integrate spatiotemporal registration of objects and their evolution into the statistical formulation, rather than treating them as pre-processing steps. This project will specifically add to the current state-of-the-art in topic areas such as stochastic differential equations on shape manifolds, time series models for shapes, shape-based functional data analysis, and modeling and inference on infinite-dimensional shape spaces.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413426","Collaborative Research: Synergies between Steins Identities and Reproducing Kernels: Modern Tools for Nonparametric Statistics","DMS","STATISTICS","07/01/2024","06/17/2024","Krishnakumar Balasubramanian","CA","University of California-Davis","Standard Grant","Yong Zeng","06/30/2027","$169,999.00","","kbala@ucdavis.edu","1850 RESEARCH PARK DR STE 300","DAVIS","CA","956186153","5307547700","MPS","126900","079Z","$0.00","The project aims to conduct comprehensive statistical and computational analyses, with the overarching objective of advancing innovative nonparametric data analysis techniques. The methodologies and theories developed are anticipated to push the boundaries of modern nonparametric statistical inference and find applicability in other statistical domains such as nonparametric latent variable models, time series analysis, and sequential nonparametric multiple testing. This project will enhance the interconnections among statistics, machine learning, and computation and provide training opportunities for postdoctoral fellows, graduate students, and undergraduates.

More specifically, the project covers key problems in nonparametric hypothesis testing, intending to establish a robust framework for goodness-of-fit testing for distributions on non-Euclidean domains with unknown normalization constants. The research also delves into nonparametric variational inference, aiming to create a particle-based algorithmic framework with discrete-time guarantees. Furthermore, the project focuses on nonparametric functional regression, with an emphasis on designing minimax optimal estimators using infinite-dimensional Stein's identities. The study also examines the trade-offs between statistics and computation in all the aforementioned methods. The common thread weaving through these endeavors is the synergy between various versions of Stein's identities and reproducing kernels, contributing substantially to the advancement of models, methods, and theories in contemporary nonparametric statistics.
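The synergy of a Stein identity with a reproducing kernel is concrete in the kernel Stein discrepancy, which measures fit to a density known only up to its normalization constant through its score function; below is a V-statistic sketch with an RBF kernel (bandwidth and sample sizes are illustrative):

    import numpy as np

    def ksd(samples, score, h=1.0):
        # V-statistic kernel Stein discrepancy; score(x) = (log p)'(x), so p is
        # only needed up to its normalization constant
        x, y = samples[:, None], samples[None, :]
        d = x - y
        k = np.exp(-d ** 2 / (2 * h ** 2))
        sx, sy = score(samples)[:, None], score(samples)[None, :]
        up = (sx * sy * k
              + sx * (d / h ** 2) * k                 # s(x) * dk/dy
              - sy * (d / h ** 2) * k                 # s(y) * dk/dx
              + (1 / h ** 2 - d ** 2 / h ** 4) * k)   # d^2 k / dx dy
        return up.mean()

    rng = np.random.default_rng(10)
    score_std_normal = lambda x: -x           # score of N(0,1)
    print(f"KSD, correct samples: {ksd(rng.normal(size=300), score_std_normal):.4f}")
    print(f"KSD, shifted samples: {ksd(rng.normal(size=300) + 1.0, score_std_normal):.4f}")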

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." -"2412408","Monitoring time series in structured function spaces","DMS","STATISTICS","07/01/2024","06/14/2024","Piotr Kokoszka","CO","Colorado State University","Standard Grant","Yulia Gel","06/30/2027","$292,362.00","","Piotr.Kokoszka@colostate.edu","601 S HOWES ST","FORT COLLINS","CO","805212807","9704916355","MPS","126900","1269","$0.00","This project aims to develop new mathematical theory and statistical tools that will enable monitoring for changes in complex systems, for example global trade networks. Comprehensive databases containing details of trade between almost all countries are available. Detecting in real time a change in the typical pattern of trade and identifying countries where this change takes place is an important problem. This project will provide statistical methods that will allow making decisions about an emergence of an atypical pattern in a complex system in real time with certain theoretical guarantees. The project will also offer multiple interdisciplinary training opportunities for the next generation of statisticians and data scientists.

The methodology that will be developed is related to sequential change point detection, but is different because the in-control state is estimated rather than assumed. This requires new theoretical developments because it deals with complex infinite dimensional systems, whereas existing mathematical tools apply only to finite-dimensional systems. Panels of structured functions will be considered and methods for on-line identification of components undergoing change will be devised. All methods will be inferential with controlled probabilities of type I errors. Some of the key aspects of the project can be summarized in the following points. First, statistical theory leading to change point monitoring schemes in infinite dimensional function spaces will be developed. Second, strong approximations valid in Banach spaces will lead to assumptions not encountered in scalar settings and potentially to different threshold functions. Third, for monitoring of random density functions, the above challenges will be addressed in custom metric spaces. Fourth, since random densities are not observable, the effect of estimation will be incorporated. The new methodology will be applied to viral load measurements, investment portfolios, and global trade data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413823","Robust and efficient Bayesian inference for misspecified and underspecified models","DMS","STATISTICS","07/01/2024","06/18/2024","Steven MacEachern","OH","Ohio State University","Standard Grant","Tapabrata Maiti","06/30/2027","$300,000.00","Ju Hee Lee, Hang Joon Kim","snm@stat.osu.edu","1960 KENNY RD","COLUMBUS","OH","432101016","6146888735","MPS","126900","","$0.00","This research project aims to improve data-driven modelling and decision-making. Its focus is on the development of Bayesian methods for low-information settings. Bayesian methods have proven to be tremendously successful in high-information settings where data are of high quality, the scientific/business background that has generated the data is well-understood, and clear questions are asked. This project will develop a suite of Bayesian methods designed for low-information settings, including those where (i) the data show particular types of deficiencies, such as a preponderance of outlying or ""bad data"", (ii) a limited conceptual understanding of the phenomenon under study leads to a model that leaves a substantial gap between model and reality, producing a misspecified model or a model that is not fully specified, and (iii) there is a shortage of data, so that the model captures only a very simplified version of reality. The new methods will expand the scope of Bayesian applications, with attention to problems in biomedical applications and psychology. The project will provide training for the next generation of data scientists.

This project has two main threads. For the first, the project will develop diagnostics that allow the analyst to assess the adequacy of portions of a posited model. Such assessments point the way toward elaborations that will bring the model closer to reality, improving the full collection of inferences. These assessments will also highlight limitations of the model, enabling the analyst to know when to make a decision and when to refrain from making one. The second thread will explore the use of sample-size adaptive loss functions for modelling and for inference. Adaptive loss functions have been used by classical statisticians to improve inference by exploiting the bias-variance tradeoff. This thread will blend adaptivity with Bayesian methods. This will robustify inference by providing smoother likelihoods for small and moderate sample sizes and by relying on smoother inference functions when the sample size is limited.
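A minimal example of a sample-size adaptive loss in the classical (non-Bayesian) setting is a Huber loss whose clipping threshold grows with n, trading robustness against efficiency; the tuning rate below is an illustrative choice, not the project's proposal:

    import numpy as np

    def huber_mean(x, tau, iters=100):
        # M-estimate of location under Huber loss with threshold tau (fixed-point iteration)
        mu = np.median(x)
        for _ in range(iters):
            mu = mu + np.clip(x - mu, -tau, tau).mean()   # mean of clipped residuals = Huber psi
        return mu

    rng = np.random.default_rng(11)
    n = 200
    x = rng.standard_t(df=2, size=n)            # heavy-tailed sample, true center 0
    tau_n = np.sqrt(n / np.log(n))              # threshold grows with n: the adaptive step
    print(f"adaptive Huber mean: {huber_mean(x, tau_n):+.3f}   sample mean: {x.mean():+.3f}")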

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413404","Collaborative Research: Statistical Optimal Transport: Foundation, Computation and Applications","DMS","STATISTICS","07/01/2024","06/18/2024","Xiaohui Chen","CA","University of Southern California","Standard Grant","Yong Zeng","06/30/2027","$180,000.00","","xiaohuic@usc.edu","3720 S FLOWER ST FL 3","LOS ANGELES","CA","900890701","2137407762","MPS","126900","079Z","$0.00","Comparing probability models is a fundamental task in almost every data-enabled problem, and Optimal Transport (OT) offers a powerful and versatile framework to do so. Recent years have witnessed a rapid development of computational OT, which has expanded applications of OT to statistics, including clustering, generative modeling, domain adaptation, distribution-to-distribution regression, dimension reduction, and sampling. Still, understanding the fundamental strengths and limitations of OT as a statistical tool is much to be desired. This research project aims to fill this important gap by advancing statistical analysis (estimation and inference) and practical approximation of two fundamental notions (average and quantiles) in statistics and machine learning, demonstrated through modern applications for measure-valued data. The project also provides research training opportunities for graduate students.

The award contains three main research projects. The first project will develop a new regularized formulation of the Wasserstein barycenter based on the multi-marginal OT and conduct an in-depth statistical analysis, encompassing sample complexity, limiting distributions, and bootstrap consistency. The second project will establish asymptotic distribution and bootstrap consistency results for linear functionals of OT maps and will study sharp asymptotics for entropically regularized OT maps when regularization parameters tend to zero. Building on the first two projects, the third project explores applications of the OT methodology to two important statistical tasks: dimension reduction and vector quantile regression. The research agenda will develop a novel and computationally efficient principal component method for measure-valued data and a statistically valid duality-based estimator for quantile regression with multivariate responses. The three projects will produce novel technical tools integrated from OT theory, empirical process theory, and partial differential equations, which are essential for OT-based inferential methods and will inspire new applications of OT to measure-valued and multivariate data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413553","Collaborative Research: Statistical Network Integration","DMS","STATISTICS","07/01/2024","06/17/2024","Jesús Arroyo","TX","Texas A&M University","Continuing Grant","Yulia Gel","06/30/2027","$37,118.00","","jarroyo@tamu.edu","400 HARVEY MITCHELL PKY S STE 30","COLLEGE STATION","TX","778454375","9798626777","MPS","126900","1269","$0.00","This project pursues the contemporary problem of statistical network integration facing scientists, practitioners, and theoreticians. The study of networks and graph-structured data has received growing attention in recent years, motivated by investigations of complex systems throughout the biological and social sciences. Models and methods have been developed to analyze network data objects, often focused on single networks or homogeneous data settings, yet modern available data are increasingly heterogeneous, multi-sample, and multi-modal. Consequently, there is a growing need to leverage data arising from different sources that result in multiple network observations with attributes. This project will develop statistically principled data integration methodologies for neuroimaging studies, which routinely collect multiple subject data across different groups (strains, conditions, age groups), modalities (functional and diffusion MRI), and brain covariate information (phenotypes, health status, gene expression data from brain tissue). The investigators will offer interdisciplinary mentoring opportunities to students participating in the research project and co-teach a workshop based on the proposed research.

The goals of this project are to establish flexible, parsimonious latent space models for network integration and to develop efficient, theoretically justified inference procedures for such models. More specifically, this project will develop latent space models to disentangle common and individual local and global latent features in samples of networks, propose efficient spectral matrix-based methods for data integration, provide high-dimensional structured penalties for dimensionality reduction and regularization in network data, and develop cross-validation methods for multiple network data integration. New theoretical developments spanning concentration inequalities, eigenvector perturbation analysis, and distributional asymptotic results will elucidate the advantages and limitations of these methods in terms of signal aggregation, heterogeneity, and flexibility. Applications of these methodologies to the analysis of multi-subject brain network data will be studied. Emphasis will be on interpretability, computation, and theoretical justification.
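
A simple spectral baseline conveys the flavor of such matrix-based integration. The sketch below (an assumed illustration, not the investigators' estimator) embeds several networks on shared nodes jointly through the eigendecomposition of their averaged adjacency matrix.

    import numpy as np

    def mean_ase(adjs, d):
        # adjs: list of symmetric (n x n) adjacency matrices on shared nodes
        # d: embedding dimension; returns an n x d joint embedding
        Abar = np.mean(adjs, axis=0)                  # pool the networks
        vals, vecs = np.linalg.eigh(Abar)             # symmetric eigendecomposition
        idx = np.argsort(np.abs(vals))[::-1][:d]      # top d by magnitude
        return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

    # toy usage: three noisy two-block networks
    rng = np.random.default_rng(0)
    z = np.repeat([0, 1], 50)
    P = np.where(z[:, None] == z[None, :], 0.6, 0.1)  # block probabilities
    adjs = [rng.binomial(1, P) for _ in range(3)]
    adjs = [np.triu(A, 1) + np.triu(A, 1).T for A in adjs]  # symmetrize
    X = mean_ase(adjs, d=2)                           # estimated latent positions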

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria."
@@ -21,7 +25,6 @@
"2412629","Collaborative Research: Partial Priors, Regularization, and Valid & Efficient Probabilistic Structure Learning","DMS","STATISTICS","07/01/2024","06/17/2024","Chuanhai Liu","IN","Purdue University","Standard Grant","Yulia Gel","06/30/2027","$160,000.00","","chuanhai@purdue.edu","2550 NORTHWESTERN AVE # 1100","WEST LAFAYETTE","IN","479061332","7654941055","MPS","126900","","$0.00","Modern applications of statistics aim to solve complex scientific problems involving high-dimensional unknowns. One feature that these applications often share is that the high-dimensional unknown is believed to satisfy a complexity-limiting, low-dimensional structure. Specifics of the posited low-dimensional structure are mostly unknown, so a statistically interesting and scientifically relevant problem is structure learning, i.e., using data to learn the latent low-dimensional structure. Because structure learning problems are ubiquitous and reliable uncertainty quantification is imperative, results from this project will have an impact across the biomedical, physical, and social sciences. In addition, the project will offer multiple opportunities for career development of new generations of statisticians and data scientists.

Frequentist methods focus on data-driven estimation or selection of a candidate structure, but currently there are no general strategies for reliable uncertainty quantification concerning the unknown structure. Bayesian methods produce a data-dependent probability distribution over the space of structures that can be used for uncertainty quantification, but this distribution comes with no reliability guarantees. A barrier to progress in reliable uncertainty quantification is the pair of oppositely extreme perspectives: frequentists' aversion to modeling structural/parametric uncertainty versus Bayesians' insistence that such uncertainty always be modeled precisely and probabilistically. Overcoming this barrier requires a new perspective falling between these two extremes, and this project will develop a new framework that features a more general and flexible perspective on probability, namely, imprecise probability. Most importantly, this framework will resolve the aforementioned issues by offering new and powerful methods boasting provably reliable uncertainty quantification in structure learning applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413891","Nonparametric estimation in causal inference: optimality in traditional models and newer ones","DMS","STATISTICS","08/01/2024","06/14/2024","Matteo Bonvini","NJ","Rutgers University New Brunswick","Continuing Grant","Yong Zeng","07/31/2027","$59,393.00","","mb1662@stat.rutgers.edu","3 RUTGERS PLZ","NEW BRUNSWICK","NJ","089018559","8489320150","MPS","126900","075Z","$0.00","This project provides new methods for estimating causal effects from non-randomized studies. Quantifying the causal effect of a variable on another one is of fundamental importance in science because it allows for the understanding of what happens if a certain action is taken, e.g., if a drug is prescribed to a patient. When randomized experiments are not feasible, e.g., because of costs or ethical concerns, quantifying the effect of a treatment on an outcome can be very challenging. Roughly, this is because the analysis must ensure that the treated and untreated units are “comparable,” a condition implied by proper randomization. In these settings, the analyst typically proceeds in two steps: 1) they introduce the key assumptions needed to identify the causal effect, and 2) they specify a model for the distribution of the data, often nonparametric, to accommodate modern, complex datasets, as well as the appropriate estimation strategy. One key difficulty in non-randomized studies is that estimating causal effects typically requires estimating nuisance components of the data distribution that are not of direct interest and that can be potentially quite hard to estimate. Focused on the second part of the analysis, this project aims to design optimal methods for estimating causal effects in different settings. Informally, an optimal estimator converges to the true causal effect “as quickly as possible” as a function of the sample size and thus leads to the most precise inferences. Establishing optimality thus has two fundamental benefits: 1) it leads to procedures that make the most efficient use of the available data, and 2) it serves as a benchmark against which future methods can be evaluated. In this respect, the theoretical and methodological contributions of this project are expected to lead to substantial improvements in the analysis of data from many domains, such as medicine and the social sciences. The project also aims to offer opportunities for training and mentoring graduate and undergraduate students.

For certain estimands and data structures, the principles of semiparametric efficiency theory can be used to derive optimal estimators. However, they are not directly applicable to causal parameters that are “non-smooth” or for which the nuisance parts of the data distribution can only be estimated at such slow rates that root-n convergence of the causal effect estimator is not attainable. As part of this project, the Principal Investigator aims to study the optimal estimation of prominent examples of non-smooth parameters, such as causal effects defined by continuous treatments. Furthermore, this project will consider optimal estimation of “smooth” parameters, such as certain average causal effects, in newer nonparametric models for which relatively fast rates of convergence are possible, even if certain components of the data distribution can only be estimated at very slow rates. In doing so, the project aims to propose new techniques for reducing the detrimental effect of the nuisance estimators’ bias on the quality of the causal effect estimator. It also aims to design and implement inferential procedures for the challenging settings considered, thereby enhancing the adoption of the methods proposed in practice.
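
For the “smooth” case of an average treatment effect, one widely used template is the augmented inverse-probability-weighted (AIPW) estimator, whose influence-function form makes first-order nuisance bias cancel. The sketch below is that generic textbook construction (included for illustration only; the award develops refinements beyond it), taking pre-fitted nuisance estimates as inputs.

    import numpy as np

    def aipw_ate(y, a, mu1, mu0, e):
        # y: outcomes; a: binary treatment indicators
        # mu1, mu0: fitted outcome regressions E[Y|A=1,X], E[Y|A=0,X]
        # e: fitted propensity scores P(A=1|X); all arrays of equal length
        psi = (mu1 - mu0
               + a * (y - mu1) / e
               - (1 - a) * (y - mu0) / (1 - e))   # efficient influence function
        est = psi.mean()
        se = psi.std(ddof=1) / np.sqrt(len(y))    # plug-in standard error
        return est, se

Because the estimator is a sample average of the influence function, errors in the two nuisance fits enter only through their product, which is what permits root-n inference under slow nuisance rates.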

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413557","Collaborative Research: Systemic Shock Inference for High-Frequency Data","DMS","STATISTICS","07/01/2024","06/14/2024","Jose Figueroa-Lopez","MO","Washington University","Standard Grant","Jun Zhu","06/30/2027","$99,957.00","","figueroa@math.wustl.edu","ONE BROOKINGS DR","SAINT LOUIS","MO","63110","3147474134","MPS","126900","","$0.00","Unexpected “shocks,” or abrupt deviations from periods of stability, naturally occur in time-dependent data-generating mechanisms across a variety of disciplines. Examples include crashes in stock markets, flurries of activity on social media following news events, and changes in animal migratory patterns during global weather events, among countless others. Reliable detection and statistical analysis of shock events are crucial in applications, as shock inference can provide scientists with a deeper understanding of large systems of time-dependent variables, helping to mitigate risk and manage uncertainty. When large systems of time-dependent variables are observed at high sampling frequencies, information at fine timescales can reveal hidden connections and provide insights into the collective uncertainty shared by an entire system. High-frequency observations of such systems appear in econometrics, climatology, statistical physics, and many other areas of empirical science that can benefit from reliable inference of shock events. This project will develop new statistical techniques for both the detection and analysis of shocks in large systems of time-dependent variables observed at high temporal sampling frequencies. The project will also involve mentoring students, organizing workshops, and promoting diversity in STEM.

The investigators will study shock inference problems in a variety of settings in high dimensions. Special focus will be placed on semi-parametric high-frequency models that display a factor structure. Detection based on time-localized principal component analysis and related techniques will be explored, with the goal of accounting for shock events that impact a large number of component series in a possibly asynchronous manner. Time-localized bootstrapping methods will also be considered for feasible testing frameworks for quantifying the system-level impact of shocks. Complementary lines of inquiry will concern estimation of jump behavior in high-frequency models in multivariate contexts and time-localized clustering methods.
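
A toy version of time-localized principal component analysis conveys the idea: track the variance share of the leading eigenvalue over sliding windows and flag abrupt jumps as candidate system-wide shocks. The sketch below is an editor's illustration under assumed window and threshold choices, not the project's detector.

    import numpy as np

    def leading_share(X, window):
        # X: (T x d) array of high-frequency observations; returns, per window,
        # the fraction of local variance carried by the top principal component
        T, _ = X.shape
        shares = []
        for t in range(T - window + 1):
            C = np.cov(X[t:t + window], rowvar=False)   # local covariance
            vals = np.linalg.eigvalsh(C)                # ascending eigenvalues
            shares.append(vals[-1] / vals.sum())        # top-eigenvalue share
        return np.array(shares)

    # flag times where the local factor strength jumps abruptly
    # s = leading_share(X, window=60); shocks = np.where(np.diff(s) > 0.2)[0]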

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." -"2413748","Collaborative Research: NSF MPS/DMS-EPSRC: Stochastic Shape Processes and Inference","DMS","STATISTICS","08/01/2024","06/20/2024","Anuj Srivastava","FL","Florida State University","Standard Grant","Yulia Gel","07/31/2027","$200,000.00","","anuj@stat.fsu.edu","874 TRADITIONS WAY","TALLAHASSEE","FL","323060001","8506445260","MPS","126900","1269, 7929","$0.00","The intimate link between form, or shape, and function is ubiquitous in science. In biology, for instance, the shapes of biological components are pivotal in understanding patterns of normal behavior and growth; a notable example is protein shape, which contributes to our understanding of protein function and classification. This project, led by a team of investigators from the USA and the UK, will develop ways of modeling how biological and other shapes change with time, using formal statistical frameworks that capture not only the changes themselves, but how these changes vary across objects and populations. This will enable the study of the link between form and function in all its variability. As example applications, the project will develop models for changes in cell morphology and topology during motility and division, and changes in human posture during various activities, facilitating the exploration of scientific questions such as how and why cell division fails, or how to improve human postures in factory tasks. These are proofs of concept, but the methods themselves will have much wider applicability. This project will thus not only progress the science of shape analysis and the specific applications studied; it will have broader downstream impacts on a range of scientific application domains, providing practitioners with general and useful tools.

While there are several approaches for representing and analyzing static shapes, encompassing curves, surfaces, and complex structures like trees and shape graphs, the statistical modeling and analysis of dynamic shapes has received limited attention. Mathematically, shapes are elements of quotient spaces of nonlinear manifolds, and shape changes can be modeled as stochastic processes, termed shape processes, on these complex spaces. The primary challenges lie in adapting classical modeling concepts to the nonlinear geometry of shape spaces and in developing efficient statistical tools for computation and inference in such very high-dimensional, nonlinear settings. The project consists of three thrust areas, dealing with combinations of discrete and continuous time, and discrete and continuous representations of shape, with a particular emphasis on the issues raised by topology changes. The key idea is to integrate spatiotemporal registration of objects and their evolution into the statistical formulation, rather than treating them as pre-processing steps. This project will specifically add to the current state-of-the-art in topic areas such as stochastic differential equations on shape manifolds, time series models for shapes, shape-based functional data analysis, and modeling and inference on infinite-dimensional shape spaces.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413425","Collaborative Research: Synergies between Stein's Identities and Reproducing Kernels: Modern Tools for Nonparametric Statistics","DMS","STATISTICS","07/01/2024","06/17/2024","Bharath Sriperumbudur","PA","Pennsylvania State Univ University Park","Standard Grant","Yong Zeng","06/30/2027","$179,999.00","","bks18@psu.edu","201 OLD MAIN","UNIVERSITY PARK","PA","168021503","8148651372","MPS","126900","079Z","$0.00","The project aims to conduct comprehensive statistical and computational analyses, with the overarching objective of advancing innovative nonparametric data analysis techniques. The methodologies and theories developed are anticipated to push the boundaries of modern nonparametric statistical inference and find applicability in other statistical domains such as nonparametric latent variable models, time series analysis, and sequential nonparametric multiple testing. This project will enhance the interconnections among statistics, machine learning, and computation and provide training opportunities for postdoctoral fellows, graduate students, and undergraduates.

More specifically, the project covers key problems in nonparametric hypothesis testing, intending to establish a robust framework for goodness-of-fit testing for distributions on non-Euclidean domains with unknown normalization constants. The research also delves into nonparametric variational inference, aiming to create a particle-based algorithmic framework with discrete-time guarantees. Furthermore, the project focuses on nonparametric functional regression, with an emphasis on designing minimax optimal estimators using infinite-dimensional Stein's identities. The study also examines the trade-offs between statistics and computation in all the aforementioned methods. The common thread weaving through these endeavors is the synergy between various versions of Stein's identities and reproducing kernels, contributing substantially to the advancement of models, methods, and theories in contemporary nonparametric statistics.
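
The synergy between Stein's identities and reproducing kernels can be made concrete with the classical kernel Stein discrepancy (KSD), which requires only the score of the possibly unnormalized density, exactly the setting of unknown normalization constants. The one-dimensional Gaussian-kernel sketch below is a standard construction offered for illustration, not the project's methodology.

    import numpy as np

    def ksd2(x, score, h=1.0):
        # x: 1-D sample; score: function returning d log p(t) / dt, which does
        # not depend on the normalization constant of p
        # returns the V-statistic estimate of the squared KSD
        d = x[:, None] - x[None, :]
        k = np.exp(-d**2 / (2 * h**2))            # Gaussian kernel k(x, y)
        dkx = -d / h**2 * k                       # dk/dx
        dky = d / h**2 * k                        # dk/dy
        dkxy = (1 / h**2 - d**2 / h**4) * k       # d2k/dxdy
        s = score(x)
        u = (s[:, None] * s[None, :] * k          # Stein kernel u_p(x, y)
             + s[:, None] * dky + s[None, :] * dkx + dkxy)
        return u.mean()

    # e.g., testing fit to N(0,1), whose score is -t:
    # rng = np.random.default_rng(1); x = rng.normal(size=500)
    # ksd2(x, lambda t: -t)   # close to 0 under the null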

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2412833","Collaborative Research: Statistical Modeling and Inference for Object-valued Time Series","DMS","STATISTICS","07/01/2024","06/17/2024","Xiaofeng Shao","IL","University of Illinois at Urbana-Champaign","Standard Grant","Jun Zhu","06/30/2027","$174,997.00","","xshao@illinois.edu","506 S WRIGHT ST","URBANA","IL","618013620","2173332187","MPS","126900","","$0.00","Random objects in general metric spaces have become increasingly common in many fields. For example, the intraday return path of a financial asset, the age-at-death distributions, the annual composition of energy sources, social networks, phylogenetic trees, and EEG scans or MRI fiber tracts of patients can all be viewed as random objects in certain metric spaces. For many endeavors in this area, the data being analyzed is collected with a natural ordering, i.e., the data can be viewed as an object-valued time series. Despite its prevalence in many applied problems, statistical analysis for such time series is still in its early development. A fundamental difficulty of developing statistical techniques is that the spaces where these objects live are nonlinear and commonly used algebraic operations are not applicable. This research project aims to develop new models, methodology and theory for the analysis of object-valued time series. Research results from the project will be disseminated to the relevant scientific communities via publications, conference and seminar presentations. The investigators will jointly mentor a Ph.D. student and involve undergraduate students in the research, as well as offering advanced topic courses to introduce the state-of-the-art techniques in object-valued time series analysis.

The project will develop a systematic body of methods and theory on modeling and inference for object-valued time series. Specifically, the investigators propose to (1) develop a new autoregressive model for distributional time series in Wasserstein geometry and a suite of tools for model estimation, selection, and diagnostic checking; (2) develop new specification testing procedures for distributional time series in the one-dimensional Euclidean space; and (3) develop new change-point detection methods to detect distribution shifts in a sequence of object-valued time series. The above three projects tackle several important modeling and inference issues in the analysis of object-valued time series, the investigation of which will lead to innovative methodological and theoretical developments, and lay the groundwork for this emerging field.
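
In one dimension, Wasserstein geometry acts linearly on quantile functions, so a stripped-down distributional autoregression can be phrased as AR(1) dynamics on quantile functions centered at their Fréchet mean. The sketch below is a simplified illustration with an assumed scalar coefficient, not the model the investigators propose; a final sort restores monotonicity of the forecast quantile function.

    import numpy as np

    def fit_war1(Q):
        # Q: (T x m) matrix; row t is the quantile function of distribution t
        # evaluated on a common grid of m probability levels
        qbar = Q.mean(axis=0)                 # Frechet mean in 1-D Wasserstein
        D = Q - qbar                          # tangent-space deviations
        beta = np.sum(D[1:] * D[:-1]) / np.sum(D[:-1] ** 2)  # least squares
        return qbar, beta

    def predict_next(Q, qbar, beta):
        q_hat = qbar + beta * (Q[-1] - qbar)  # one-step-ahead forecast
        return np.sort(q_hat)                 # enforce monotone quantiles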

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria." "2413558","Collaborative Research: Systemic Shock Inference for High-Frequency Data","DMS","STATISTICS","07/01/2024","06/14/2024","Benjamin Boniece","PA","Drexel University","Continuing Grant","Jun Zhu","06/30/2027","$26,626.00","","cooper.boniece@drexel.edu","3141 CHESTNUT ST","PHILADELPHIA","PA","191042875","2158956342","MPS","126900","","$0.00","Unexpected “shocks,” or abrupt deviations from periods of stability, naturally occur in time-dependent data-generating mechanisms across a variety of disciplines. Examples include crashes in stock markets, flurries of activity on social media following news events, and changes in animal migratory patterns during global weather events, among countless others. Reliable detection and statistical analysis of shock events are crucial in applications, as shock inference can provide scientists with a deeper understanding of large systems of time-dependent variables, helping to mitigate risk and manage uncertainty. When large systems of time-dependent variables are observed at high sampling frequencies, information at fine timescales can reveal hidden connections and provide insights into the collective uncertainty shared by an entire system. High-frequency observations of such systems appear in econometrics, climatology, statistical physics, and many other areas of empirical science that can benefit from reliable inference of shock events. This project will develop new statistical techniques for both the detection and analysis of shocks in large systems of time-dependent variables observed at high temporal sampling frequencies. The project will also involve mentoring students, organizing workshops, and promoting diversity in STEM.

The investigators will study shock inference problems in a variety of settings in high dimensions. Special focus will be placed on semi-parametric high-frequency models that display a factor structure. Detection based on time-localized principal component analysis and related techniques will be explored, with the goal of accounting for shock events that impact a large number of component series in a possibly asynchronous manner. Time-localized bootstrapping methods will also be considered for feasible testing frameworks for quantifying the system-level impact of shocks. Complementary lines of inquiry will concern estimation of jump behavior in high-frequency models in multivariate contexts and time-localized clustering methods.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria."