-
Improved Algorithm and Bounds for Successive Projection
Authors:
Jiashun Jin,
Zheng Tracy Ke,
Gabriel Moryoussef,
Jiajun Tang,
Jingming Wang
Abstract:
Given a $K$-vertex simplex in a $d$-dimensional space, suppose we measure $n$ points on the simplex with noise (hence, some of the observed points fall outside the simplex). Vertex hunting is the problem of estimating the $K$ vertices of the simplex. A popular vertex hunting algorithm is successive projection algorithm (SPA). However, SPA is observed to perform unsatisfactorily under strong noise…
▽ More
Given a $K$-vertex simplex in a $d$-dimensional space, suppose we measure $n$ points on the simplex with noise (hence, some of the observed points fall outside the simplex). Vertex hunting is the problem of estimating the $K$ vertices of the simplex. A popular vertex hunting algorithm is successive projection algorithm (SPA). However, SPA is observed to perform unsatisfactorily under strong noise or outliers. We propose pseudo-point SPA (pp-SPA). It uses a projection step and a denoise step to generate pseudo-points and feed them into SPA for vertex hunting. We derive error bounds for pp-SPA, leveraging on extreme value theory of (possibly) high-dimensional random vectors. The results suggest that pp-SPA has faster rates and better numerical performances than SPA. Our analysis includes an improved non-asymptotic bound for the original SPA, which is of independent interest.
△ Less
Submitted 16 March, 2024;
originally announced March 2024.
-
The Fallacy of Minimizing Local Regret in the Sequential Task Setting
Authors:
Ziping Xu,
Kelly W. Zhang,
Susan A. Murphy
Abstract:
In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret. In a stationary setting, strong theoretical guarantees, like a sublinear ($\sqrt{T}$) regret bound, can be obtained, which typically implies the convergence to an optimal policy and the cessation of explor…
▽ More
In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret. In a stationary setting, strong theoretical guarantees, like a sublinear ($\sqrt{T}$) regret bound, can be obtained, which typically implies the convergence to an optimal policy and the cessation of exploration. However, these theoretical setups often oversimplify the complexities encountered in real-world RL implementations, where tasks arrive sequentially with substantial changes between tasks and the algorithm may not be allowed to adaptively learn within certain tasks. We study the changes beyond the outcome distributions, encompassing changes in the reward designs (mappings from outcomes to rewards) and the permissible policy spaces. Our results reveal the fallacy of myopically minimizing regret within each task: obtaining optimal regret rates in the early tasks may lead to worse rates in the subsequent ones, even when the outcome distributions stay the same. To realize the optimal cumulative regret bound across all the tasks, the algorithm has to overly explore in the earlier tasks. This theoretical insight is practically significant, suggesting that due to unanticipated changes (e.g., rapid technological development or human-in-the-loop involvement) between tasks, the algorithm needs to explore more than it would in the usual stationary setting within each task. Such implication resonates with the common practice of using clipped policies in mobile health clinical trials and maintaining a fixed rate of $ε$-greedy exploration in robotic learning.
△ Less
Submitted 16 March, 2024;
originally announced March 2024.
-
Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning
Authors:
Zana Buçinca,
Siddharth Swaroop,
Amanda E. Paluch,
Susan A. Murphy,
Krzysztof Z. Gajos
Abstract:
Imagine if AI decision-support tools not only complemented our ability to make accurate decisions, but also improved our skills, boosted collaboration, and elevated the joy we derive from our tasks. Despite the potential to optimize a broad spectrum of such human-centric objectives, the design of current AI tools remains focused on decision accuracy alone. We propose offline reinforcement learning…
▽ More
Imagine if AI decision-support tools not only complemented our ability to make accurate decisions, but also improved our skills, boosted collaboration, and elevated the joy we derive from our tasks. Despite the potential to optimize a broad spectrum of such human-centric objectives, the design of current AI tools remains focused on decision accuracy alone. We propose offline reinforcement learning (RL) as a general approach for modeling human-AI decision-making to optimize human-AI interaction for diverse objectives. RL can optimize such objectives by tailoring decision support, providing the right type of assistance to the right person at the right time. We instantiated our approach with two objectives: human-AI accuracy on the decision-making task and human learning about the task and learned decision support policies from previous human-AI interaction data. We compared the optimized policies against several baselines in AI-assisted decision-making. Across two experiments (N=316 and N=964), our results demonstrated that people interacting with policies optimized for accuracy achieve significantly better accuracy -- and even human-AI complementarity -- compared to those interacting with any other type of AI support. Our results further indicated that human learning was more difficult to optimize than accuracy, with participants who interacted with learning-optimized policies showing significant learning improvement only at times. Our research (1) demonstrates offline RL to be a promising approach to model human-AI decision-making, leading to policies that may optimize human-centric objectives and provide novel insights about the AI-assisted decision-making space, and (2) emphasizes the importance of considering human-centric objectives beyond decision accuracy in AI-assisted decision-making, opening up the novel research challenge of optimizing human-AI interaction for such objectives.
△ Less
Submitted 14 April, 2024; v1 submitted 9 March, 2024;
originally announced March 2024.
-
reBandit: Random Effects based Online RL algorithm for Reducing Cannabis Use
Authors:
Susobhan Ghosh,
Yongyi Guo,
Pei-Yao Hung,
Lara Coughlin,
Erin Bonar,
Inbal Nahum-Shani,
Maureen Walton,
Susan Murphy
Abstract:
The escalating prevalence of cannabis use, and associated cannabis-use disorder (CUD), poses a significant public health challenge globally. With a notably wide treatment gap, especially among emerging adults (EAs; ages 18-25), addressing cannabis use and CUD remains a pivotal objective within the 2030 United Nations Agenda for Sustainable Development Goals (SDG). In this work, we develop an onlin…
▽ More
The escalating prevalence of cannabis use, and associated cannabis-use disorder (CUD), poses a significant public health challenge globally. With a notably wide treatment gap, especially among emerging adults (EAs; ages 18-25), addressing cannabis use and CUD remains a pivotal objective within the 2030 United Nations Agenda for Sustainable Development Goals (SDG). In this work, we develop an online reinforcement learning (RL) algorithm called reBandit which will be utilized in a mobile health study to deliver personalized mobile health interventions aimed at reducing cannabis use among EAs. reBandit utilizes random effects and informative Bayesian priors to learn quickly and efficiently in noisy mobile health environments. Moreover, reBandit employs Empirical Bayes and optimization techniques to autonomously update its hyper-parameters online. To evaluate the performance of our algorithm, we construct a simulation testbed using data from a prior study, and compare against commonly used algorithms in mobile health studies. We show that reBandit performs equally well or better than all the baseline algorithms, and the performance gap widens as population heterogeneity increases in the simulation environment, proving its adeptness to adapt to diverse population of study participants.
△ Less
Submitted 27 February, 2024;
originally announced February 2024.
-
Monitoring Fidelity of Online Reinforcement Learning Algorithms in Clinical Trials
Authors:
Anna L. Trella,
Kelly W. Zhang,
Inbal Nahum-Shani,
Vivek Shetty,
Iris Yan,
Finale Doshi-Velez,
Susan A. Murphy
Abstract:
Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in cli…
▽ More
Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in clinical trials. It emphasizes the responsibility of the algorithm to (1) safeguard participants and (2) preserve the scientific utility of the data for post-trial analyses. We also present a framework for pre-deployment planning and real-time monitoring to help algorithm developers and clinical researchers ensure algorithm fidelity. To illustrate our framework's practical application, we present real-world examples from the Oralytics clinical trial. Since Spring 2023, this trial successfully deployed an autonomous, online RL algorithm to personalize behavioral interventions for participants at risk for dental disease.
△ Less
Submitted 26 February, 2024;
originally announced February 2024.
-
Evaluating the Effectiveness of Index-Based Treatment Allocation
Authors:
Niclas Boehmer,
Yash Nair,
Sanket Shah,
Lucas Janson,
Aparna Taneja,
Milind Tambe
Abstract:
When resources are scarce, an allocation policy is needed to decide who receives a resource. This problem occurs, for instance, when allocating scarce medical resources and is often solved using modern ML methods. This paper introduces methods to evaluate index-based allocation policies -- that allocate a fixed number of resources to those who need them the most -- by using data from a randomized…
▽ More
When resources are scarce, an allocation policy is needed to decide who receives a resource. This problem occurs, for instance, when allocating scarce medical resources and is often solved using modern ML methods. This paper introduces methods to evaluate index-based allocation policies -- that allocate a fixed number of resources to those who need them the most -- by using data from a randomized control trial. Such policies create dependencies between agents, which render the assumptions behind standard statistical tests invalid and limit the effectiveness of estimators. Addressing these challenges, we translate and extend recent ideas from the statistics literature to present an efficient estimator and methods for computing asymptotically correct confidence intervals. This enables us to effectively draw valid statistical conclusions, a critical gap in previous work. Our extensive experiments validate our methodology in practical settings, while also showcasing its statistical power. We conclude by proposing and empirically verifying extensions of our methodology that enable us to reevaluate a past randomized control trial to evaluate different ML allocation policies in the context of a mHealth program, drawing previously invisible conclusions.
△ Less
Submitted 18 February, 2024;
originally announced February 2024.
-
A Bayesian Approach to Online Learning for Contextual Restless Bandits with Applications to Public Health
Authors:
Biyonka Liang,
Lily Xu,
Aparna Taneja,
Milind Tambe,
Lucas Janson
Abstract:
Restless multi-armed bandits (RMABs) are used to model sequential resource allocation in public health intervention programs. In these settings, the underlying transition dynamics are often unknown a priori, requiring online reinforcement learning (RL). However, existing methods in online RL for RMABs cannot incorporate properties often present in real-world public health applications, such as con…
▽ More
Restless multi-armed bandits (RMABs) are used to model sequential resource allocation in public health intervention programs. In these settings, the underlying transition dynamics are often unknown a priori, requiring online reinforcement learning (RL). However, existing methods in online RL for RMABs cannot incorporate properties often present in real-world public health applications, such as contextual information and non-stationarity. We present Bayesian Learning for Contextual RMABs (BCoR), an online RL approach for RMABs that novelly combines techniques in Bayesian modeling with Thompson sampling to flexibly model a wide range of complex RMAB settings, such as contextual and non-stationary RMABs. A key contribution of our approach is its ability to leverage shared information within and between arms to learn unknown RMAB transition dynamics quickly in budget-constrained settings with relatively short time horizons. Empirically, we show that BCoR achieves substantially higher finite-sample performance than existing approaches over a range of experimental settings, including one constructed from a real-world public health campaign in India.
△ Less
Submitted 7 February, 2024;
originally announced February 2024.
-
Non-Stationary Latent Auto-Regressive Bandits
Authors:
Anna L. Trella,
Walter Dempsey,
Finale Doshi-Velez,
Susan A. Murphy
Abstract:
We consider the stochastic multi-armed bandit problem with non-stationary rewards. We present a novel formulation of non-stationarity in the environment where changes in the mean reward of the arms over time are due to some unknown, latent, auto-regressive (AR) state of order $k$. We call this new environment the latent AR bandit. Different forms of the latent AR bandit appear in many real-world s…
▽ More
We consider the stochastic multi-armed bandit problem with non-stationary rewards. We present a novel formulation of non-stationarity in the environment where changes in the mean reward of the arms over time are due to some unknown, latent, auto-regressive (AR) state of order $k$. We call this new environment the latent AR bandit. Different forms of the latent AR bandit appear in many real-world settings, especially in emerging scientific fields such as behavioral health or education where there are few mechanistic models of the environment. If the AR order $k$ is known, we propose an algorithm that achieves $\tilde{O}(k\sqrt{T})$ regret in this setting. Empirically, our algorithm outperforms standard UCB across multiple non-stationary environments, even if $k$ is mis-specified.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Online Uniform Risk Times Sampling: First Approximation Algorithms, Learning Augmentation with Full Confidence Interval Integration
Authors:
Xueqing Liu,
Kyra Gan,
Esmaeil Keyvanshokooh,
Susan Murphy
Abstract:
In digital health, the strategy of allocating a limited treatment budget across risk times is crucial to reduce user fatigue. This strategy, however, encounters a significant obstacle due to the unknown actual number of risk times, a factor not adequately addressed by existing methods lacking theoretical guarantees. This paper introduces, for the first time, the online uniform risk times sampling…
▽ More
In digital health, the strategy of allocating a limited treatment budget across risk times is crucial to reduce user fatigue. This strategy, however, encounters a significant obstacle due to the unknown actual number of risk times, a factor not adequately addressed by existing methods lacking theoretical guarantees. This paper introduces, for the first time, the online uniform risk times sampling problem within the approximation algorithm framework. We propose two online approximation algorithms for this problem, one with and one without learning augmentation, and provide rigorous theoretical performance guarantees for them using competitive ratio analysis. We assess the performance of our algorithms using both synthetic experiments and a real-world case study on HeartSteps mobile applications.
△ Less
Submitted 1 April, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks
Authors:
Eura Nofshin,
Siddharth Swaroop,
Weiwei Pan,
Susan Murphy,
Finale Doshi-Velez
Abstract:
Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us unders…
▽ More
Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us understand the behavioral interventions. In this paper, we introduce Behavior Model Reinforcement Learning (BMRL), a framework in which an AI agent intervenes on the parameters of a Markov Decision Process (MDP) belonging to a boundedly rational human agent. Our formulation of the human decision-maker as a planning agent allows us to attribute undesirable human policies (ones that do not lead to the goal) to their maladapted MDP parameters, such as an extremely low discount factor. Furthermore, we propose a class of tractable human models that captures fundamental behaviors in frictionful tasks. Introducing a notion of MDP equivalence specific to BMRL, we theoretically and empirically show that AI planning with our human models can lead to helpful policies on a wide range of more complex, ground-truth humans.
△ Less
Submitted 26 January, 2024;
originally announced January 2024.
-
Fundamental limits of community detection from multi-view data: multi-layer, dynamic and partially labeled block models
Authors:
Xiaodong Yang,
Buyu Lin,
Subhabrata Sen
Abstract:
Multi-view data arises frequently in modern network analysis e.g. relations of multiple types among individuals in social network analysis, longitudinal measurements of interactions among observational units, annotated networks with noisy partial labeling of vertices etc. We study community detection in these disparate settings via a unified theoretical framework, and investigate the fundamental t…
▽ More
Multi-view data arises frequently in modern network analysis e.g. relations of multiple types among individuals in social network analysis, longitudinal measurements of interactions among observational units, annotated networks with noisy partial labeling of vertices etc. We study community detection in these disparate settings via a unified theoretical framework, and investigate the fundamental thresholds for community recovery. We characterize the mutual information between the data and the latent parameters, provided the degrees are sufficiently large. Based on this general result, (i) we derive a sharp threshold for community detection in an inhomogeneous multilayer block model \citep{chen2022global}, (ii) characterize a sharp threshold for weak recovery in a dynamic stochastic block model \citep{matias2017statistical}, and (iii) identify the limiting mutual information in an unbalanced partially labeled block model. Our first two results are derived modulo coordinate-wise convexity assumptions on specific functions -- we provide extensive numerical evidence for their correctness. Finally, we introduce iterative algorithms based on Approximate Message Passing for community detection in these problems.
△ Less
Submitted 16 January, 2024;
originally announced January 2024.
-
AI and Generative AI for Research Discovery and Summarization
Authors:
Mark Glickman,
Yi Zhang
Abstract:
AI and generative AI tools, including chatbots like ChatGPT that rely on large language models (LLMs), have burst onto the scene this year, creating incredible opportunities to increase work productivity and improve our lives. Statisticians and data scientists have begun experiencing the benefits from the availability of these tools in numerous ways, such as the generation of programming code from…
▽ More
AI and generative AI tools, including chatbots like ChatGPT that rely on large language models (LLMs), have burst onto the scene this year, creating incredible opportunities to increase work productivity and improve our lives. Statisticians and data scientists have begun experiencing the benefits from the availability of these tools in numerous ways, such as the generation of programming code from text prompts to analyze data or fit statistical models. One area that these tools can make a substantial impact is in research discovery and summarization. Standalone tools and plugins to chatbots are being developed that allow researchers to more quickly find relevant literature than pre-2023 search tools. Furthermore, generative AI tools have improved to the point where they can summarize and extract the key points from research articles in succinct language. Finally, chatbots based on highly parameterized LLMs can be used to simulate abductive reasoning, which provides researchers the ability to make connections among related technical topics, which can also be used for research discovery. We review the developments in AI and generative AI for research discovery and summarization, and propose directions where these types of tools are likely to head in the future that may be of interest to statistician and data scientists.
△ Less
Submitted 26 March, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Decomposition with Monotone B-splines: Fitting and Testing
Authors:
Lijun Wang,
Xiaodan Fan,
Hongyu Zhao,
Jun S. Liu
Abstract:
A univariate continuous function can always be decomposed as the sum of a non-increasing function and a non-decreasing one. Based on this property, we propose a non-parametric regression method that combines two spline-fitted monotone curves. We demonstrate by extensive simulations that, compared to standard spline-fitting methods, the proposed approach is particularly advantageous in high-noise s…
▽ More
A univariate continuous function can always be decomposed as the sum of a non-increasing function and a non-decreasing one. Based on this property, we propose a non-parametric regression method that combines two spline-fitted monotone curves. We demonstrate by extensive simulations that, compared to standard spline-fitting methods, the proposed approach is particularly advantageous in high-noise scenarios. Several theoretical guarantees are established for the proposed approach. Additionally, we present statistics to test the monotonicity of a function based on monotone decomposition, which can better control Type I error and achieve comparable (if not always higher) power compared to existing methods. Finally, we apply the proposed fitting and testing approaches to analyze the single-cell pseudotime trajectory datasets, identifying significant biological insights for non-monotonically expressed genes through Gene Ontology enrichment analysis. The source code implementing the methodology and producing all results is accessible at https://github.com/szcf-weiya/MonotoneDecomposition.jl.
△ Less
Submitted 9 April, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
-
Recent Advances in Text Analysis
Authors:
Zheng Tracy Ke,
Pengsheng Ji,
Jiashun Jin,
Wanshan Li
Abstract:
Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze MADSta…
▽ More
Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze MADStat - a dataset on statistical publications that we collected and cleaned.
The application of Topic-SCORE and other methods on MADStat leads to interesting findings. For example, $11$ representative topics in statistics are identified. For each journal, the evolution of topic weights over time can be visualized, and these results are used to analyze the trends in statistical research. In particular, we propose a new statistical model for ranking the citation impacts of $11$ topics, and we also build a cross-topic citation graph to illustrate how research results on different topics spread to one another.
The results on MADStat provide a data-driven picture of the statistical research in $1975$--$2015$, from a text analysis perspective.
△ Less
Submitted 7 February, 2024; v1 submitted 1 January, 2024;
originally announced January 2024.
-
Ensemble methods for testing a global null
Authors:
Yaowu Liu,
Zhonghua Liu,
Xihong Lin
Abstract:
Testing a global null is a canonical problem in statistics and has a wide range of applications. In view of the fact that no uniformly most powerful test exists, prior and/or domain knowledge are commonly used to focus on a certain class of alternatives to improve the testing power. However, it is generally challenging to develop tests that are particularly powerful against a certain class of alte…
▽ More
Testing a global null is a canonical problem in statistics and has a wide range of applications. In view of the fact that no uniformly most powerful test exists, prior and/or domain knowledge are commonly used to focus on a certain class of alternatives to improve the testing power. However, it is generally challenging to develop tests that are particularly powerful against a certain class of alternatives. In this paper, motivated by the success of ensemble learning methods for prediction or classification, we propose an ensemble framework for testing that mimics the spirit of random forests to deal with the challenges. Our ensemble testing framework aggregates a collection of weak base tests to form a final ensemble test that maintains strong and robust power for global nulls. We apply the framework to four problems about global testing in different classes of alternatives arising from Whole Genome Sequencing (WGS) association studies. Specific ensemble tests are proposed for each of these problems, and their theoretical optimality is established in terms of Bahadur efficiency. Extensive simulations and an analysis of a real WGS dataset are conducted to demonstrate the type I error control and/or power gain of the proposed ensemble tests.
△ Less
Submitted 16 October, 2023;
originally announced October 2023.
-
A Variational Spike-and-Slab Approach for Group Variable Selection
Authors:
Buyu Lin,
Changhao Ge,
Jun S. Liu
Abstract:
We introduce a class of generic spike-and-slab priors for high-dimensional linear regression with grouped variables and present a Coordinate-ascent Variational Inference (CAVI) algorithm for obtaining an optimal variational Bayes approximation. Using parameter expansion for a specific, yet comprehensive, family of slab distributions, we obtain a further gain in computational efficiency. The method…
▽ More
We introduce a class of generic spike-and-slab priors for high-dimensional linear regression with grouped variables and present a Coordinate-ascent Variational Inference (CAVI) algorithm for obtaining an optimal variational Bayes approximation. Using parameter expansion for a specific, yet comprehensive, family of slab distributions, we obtain a further gain in computational efficiency. The method can be easily extended to fitting additive models. Theoretically, we present general conditions on the generic spike-and-slab priors that enable us to derive the contraction rates for both the true posterior and the VB posterior for linear regression and additive models, of which some previous theoretical results can be viewed as special cases. Our simulation studies and real data application demonstrate that the proposed method is superior to existing methods in both variable selection and parameter estimation. Our algorithm is implemented in the R package GVSSB.
△ Less
Submitted 28 September, 2023;
originally announced September 2023.
-
A Mean Field Approach to Empirical Bayes Estimation in High-dimensional Linear Regression
Authors:
Sumit Mukherjee,
Bodhisattva Sen,
Subhabrata Sen
Abstract:
We study empirical Bayes estimation in high-dimensional linear regression. To facilitate computationally efficient estimation of the underlying prior, we adopt a variational empirical Bayes approach, introduced originally in Carbonetto and Stephens (2012) and Kim et al. (2022). We establish asymptotic consistency of the nonparametric maximum likelihood estimator (NPMLE) and its (computable) naive…
▽ More
We study empirical Bayes estimation in high-dimensional linear regression. To facilitate computationally efficient estimation of the underlying prior, we adopt a variational empirical Bayes approach, introduced originally in Carbonetto and Stephens (2012) and Kim et al. (2022). We establish asymptotic consistency of the nonparametric maximum likelihood estimator (NPMLE) and its (computable) naive mean field variational surrogate under mild assumptions on the design and the prior. Assuming, in addition, that the naive mean field approximation has a dominant optimizer, we develop a computationally efficient approximation to the oracle posterior distribution, and establish its accuracy under the 1-Wasserstein metric. This enables computationally feasible Bayesian inference; e.g., construction of posterior credible intervals with an average coverage guarantee, Bayes optimal estimation for the regression coefficients, estimation of the proportion of non-nulls, etc. Our analysis covers both deterministic and random designs, and accommodates correlations among the features. To the best of our knowledge, this provides the first rigorous nonparametric empirical Bayes method in a high-dimensional regression setting without sparsity.
△ Less
Submitted 25 October, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Testing a Large Number of Composite Null Hypotheses Using Conditionally Symmetric Multidimensional Gaussian Mixtures in Genome-Wide Studies
Authors:
Ryan Sun,
Zachary McCaw,
Xihong Lin
Abstract:
Causal mediation analysis, pleiotropy analysis, and replication analysis are three highly popular genetic study designs. Although these analyses address different scientific questions, the underlying inference problems all involve large-scale testing of composite null hypotheses. The goal is to determine whether all null hypotheses - as opposed to at least one - in a set of individual tests should…
▽ More
Causal mediation analysis, pleiotropy analysis, and replication analysis are three highly popular genetic study designs. Although these analyses address different scientific questions, the underlying inference problems all involve large-scale testing of composite null hypotheses. The goal is to determine whether all null hypotheses - as opposed to at least one - in a set of individual tests should simultaneously be rejected. Various recent methodology has been proposed for the aforementioned situations, and an appealing empirical Bayes strategy is to apply the popular two-group model, calculating local false discovery rates (lfdr) for each set of hypotheses. However, such a strategy is difficult due to the need for multivariate density estimation. Furthermore, the multiple testing rules for the empirical Bayes lfdr approach and conventional frequentist z-statistics can disagree, which is troubling for a field that ubiquitously utilizes summary statistics. This work proposes a framework to unify two-group testing in genetic association composite null settings, the conditionally symmetric multidimensional Gaussian mixture model (csmGmm). The csmGmm is shown to demonstrate more robust operating characteristics than recently-proposed alternatives. Crucially, the csmGmm also offers strong interpretability guarantees by harmonizing lfdr and z-statistic testing rules. We extend the base csmGmm to cover each of the mediation, pleiotropy, and replication settings, and we prove that the lfdr z-statistic agreement holds in each situation. We apply the model to a collection of translational lung cancer genetic association studies that motivated this work.
△ Less
Submitted 21 September, 2023;
originally announced September 2023.
-
Total Variation Floodgate for Variable Importance Inference in Classification
Authors:
Wenshuo Wang,
Lucas Janson,
Lihua Lei,
Aaditya Ramdas
Abstract:
Inferring variable importance is the key problem of many scientific studies, where researchers seek to learn the effect of a feature $X$ on the outcome $Y$ in the presence of confounding variables $Z$. Focusing on classification problems, we define the expected total variation (ETV), which is an intuitive and deterministic measure of variable importance that does not rely on any model context. We…
▽ More
Inferring variable importance is the key problem of many scientific studies, where researchers seek to learn the effect of a feature $X$ on the outcome $Y$ in the presence of confounding variables $Z$. Focusing on classification problems, we define the expected total variation (ETV), which is an intuitive and deterministic measure of variable importance that does not rely on any model context. We then introduce algorithms for statistical inference on the ETV under design-based/model-X assumptions. These algorithms build on the floodgate notion for regression problems (Zhang and Janson 2020). The algorithms we introduce can leverage any user-specified regression function and produce asymptotic lower confidence bounds for the ETV. We show the effectiveness of our algorithms with simulations and a case study in conjoint analysis on the US general election.
△ Less
Submitted 7 September, 2023;
originally announced September 2023.
-
Multi-Response Heteroscedastic Gaussian Process Models and Their Inference
Authors:
Taehee Lee,
Jun S. Liu
Abstract:
Despite the widespread utilization of Gaussian process models for versatile nonparametric modeling, they exhibit limitations in effectively capturing abrupt changes in function smoothness and accommodating relationships with heteroscedastic errors. Addressing these shortcomings, the heteroscedastic Gaussian process (HeGP) regression seeks to introduce flexibility by acknowledging the variability o…
▽ More
Despite the widespread utilization of Gaussian process models for versatile nonparametric modeling, they exhibit limitations in effectively capturing abrupt changes in function smoothness and accommodating relationships with heteroscedastic errors. Addressing these shortcomings, the heteroscedastic Gaussian process (HeGP) regression seeks to introduce flexibility by acknowledging the variability of residual variances across covariates in the regression model. In this work, we extend the HeGP concept, expanding its scope beyond regression tasks to encompass classification and state-space models. To achieve this, we propose a novel framework where the Gaussian process is coupled with a covariate-induced precision matrix process, adopting a mixture formulation. This approach enables the modeling of heteroscedastic covariance functions across covariates. To mitigate the computational challenges posed by sampling, we employ variational inference to approximate the posterior and facilitate posterior predictive modeling. Additionally, our training process leverages an EM algorithm featuring closed-form M-step updates to efficiently evaluate the heteroscedastic covariance function. A notable feature of our model is its consistent performance on multivariate responses, accommodating various types (continuous or categorical) seamlessly. Through a combination of simulations and real-world applications in climatology, we illustrate the model's prowess and advantages. By overcoming the limitations of traditional Gaussian process models, our proposed framework offers a robust and versatile tool for a wide array of applications.
△ Less
Submitted 30 August, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Dyadic Reinforcement Learning
Authors:
Shuangning Li,
Lluis Salvat Niell,
Sung Won Choi,
Inbal Nahum-Shani,
Guy Shani,
Susan Murphy
Abstract:
Mobile health aims to enhance health outcomes by delivering interventions to individuals as they go about their daily life. The involvement of care partners and social support networks often proves crucial in helping individuals managing burdensome medical conditions. This presents opportunities in mobile health to design interventions that target the dyadic relationship -- the relationship betwee…
▽ More
Mobile health aims to enhance health outcomes by delivering interventions to individuals as they go about their daily life. The involvement of care partners and social support networks often proves crucial in helping individuals managing burdensome medical conditions. This presents opportunities in mobile health to design interventions that target the dyadic relationship -- the relationship between a target person and their care partner -- with the aim of enhancing social support. In this paper, we develop dyadic RL, an online reinforcement learning algorithm designed to personalize intervention delivery based on contextual factors and past responses of a target person and their care partner. Here, multiple sets of interventions impact the dyad across multiple time intervals. The developed dyadic RL is Bayesian and hierarchical. We formally introduce the problem setup, develop dyadic RL and establish a regret bound. We demonstrate dyadic RL's empirical performance through simulation studies on both toy scenarios and on a realistic test bed constructed from data collected in a mobile health study.
△ Less
Submitted 1 November, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Online learning in bandits with predicted context
Authors:
Yongyi Guo,
Ziping Xu,
Susan Murphy
Abstract:
We consider the contextual bandit problem where at each time, the agent only has access to a noisy version of the context and the error variance (or an estimator of this variance). This setting is motivated by a wide range of applications where the true context for decision-making is unobserved, and only a prediction of the context by a potentially complex machine learning algorithm is available.…
▽ More
We consider the contextual bandit problem where at each time, the agent only has access to a noisy version of the context and the error variance (or an estimator of this variance). This setting is motivated by a wide range of applications where the true context for decision-making is unobserved, and only a prediction of the context by a potentially complex machine learning algorithm is available. When the context error is non-vanishing, classical bandit algorithms fail to achieve sublinear regret. We propose the first online algorithm in this setting with sublinear regret guarantees under mild conditions. The key idea is to extend the measurement error model in classical statistics to the online decision-making setting, which is nontrivial due to the policy being dependent on the noisy context observations. We further demonstrate the benefits of the proposed approach in simulation environments based on synthetic and real digital intervention datasets.
△ Less
Submitted 17 March, 2024; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Monotone Cubic B-Splines with a Neural-Network Generator
Authors:
Lijun Wang,
Xiaodan Fan,
Huabai Li,
Jun S. Liu
Abstract:
We present a method for fitting monotone curves using cubic B-splines, which is equivalent to putting a monotonicity constraint on the coefficients. We explore different ways of enforcing this constraint and analyze their theoretical and empirical properties. We propose two algorithms for solving the spline fitting problem: one that uses standard optimization techniques and one that trains a Multi…
▽ More
We present a method for fitting monotone curves using cubic B-splines, which is equivalent to putting a monotonicity constraint on the coefficients. We explore different ways of enforcing this constraint and analyze their theoretical and empirical properties. We propose two algorithms for solving the spline fitting problem: one that uses standard optimization techniques and one that trains a Multi-Layer Perceptrons (MLP) generator to approximate the solutions under various settings and perturbations. The generator approach can speed up the fitting process when we need to solve the problem repeatedly, such as when constructing confidence bands using bootstrap. We evaluate our method against several existing methods, some of which do not use the monotonicity constraint, on some monotone curves with varying noise levels. We demonstrate that our method outperforms the other methods, especially in high-noise scenarios. We also apply our method to analyze the polarization-hole phenomenon during star formation in astrophysics. The source code is accessible at \texttt{\url{https://github.com/szcf-weiya/MonotoneSplines.jl}}.
△ Less
Submitted 17 November, 2023; v1 submitted 4 July, 2023;
originally announced July 2023.
-
The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning
Authors:
Sarah Rathnam,
Sonali Parbhoo,
Weiwei Pan,
Susan A. Murphy,
Finale Doshi-Velez
Abstract:
Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to restrict planning to a less complex set of policies when estimating an MDP from sparse or noisy data (Jiang et al., 2015). It is commonly understood that discount regularization functions by de-emphasizing or ignoring delayed effects. In this paper, we reveal an alternate view of d…
▽ More
Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to restrict planning to a less complex set of policies when estimating an MDP from sparse or noisy data (Jiang et al., 2015). It is commonly understood that discount regularization functions by de-emphasizing or ignoring delayed effects. In this paper, we reveal an alternate view of discount regularization that exposes unintended consequences. We demonstrate that planning under a lower discount factor produces an identical optimal policy to planning using any prior on the transition matrix that has the same distribution for all states and actions. In fact, it functions like a prior with stronger regularization on state-action pairs with more transition data. This leads to poor performance when the transition matrix is estimated from data sets with uneven amounts of data across state-action pairs. Our equivalence theorem leads to an explicit formula to set regularization parameters locally for individual state-action pairs rather than globally. We demonstrate the failures of discount regularization and how we remedy them using our state-action-specific method across simple empirical examples as well as a medical cancer simulator.
△ Less
Submitted 19 June, 2023;
originally announced June 2023.
-
Effect-Invariant Mechanisms for Policy Generalization
Authors:
Sorawit Saengkyongam,
Niklas Pfister,
Predrag Klasnja,
Susan Murphy,
Jonas Peters
Abstract:
Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call f…
▽ More
Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong of an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop testing procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.
△ Less
Submitted 27 June, 2023; v1 submitted 19 June, 2023;
originally announced June 2023.