https://arxiv.org/api/XnFAReHlGO4LFYnaBLNQZh/cGd4 2026-03-20T14:25:39Z 5183 45 15 http://arxiv.org/abs/2603.11381v1 On the Use of Design-Based Simulations 2026-03-11T23:50:22Z Design-based simulations - procedures that hold realized outcomes fixed and generate variation by resampling treatment assignment or shocks - are widely used in both methodological and applied work to assess inference procedures. This paper studies the extent to which such simulations are informative about inference validity. Focusing on shift-share designs, we show that standard simulations that fix outcomes and resample shocks may rely on a data-generating process that is not aligned with the true one. In particular, these simulations confound true treatment effects with error dependence, potentially overstating inference distortions due to spatial correlation. We propose alternative simulation designs that circumvent this problem and illustrate their use in prominent empirical applications. Our results highlight that the usefulness of design-based simulations depends critically on how closely the simulated data-generating process aligns with the true one. 2026-03-11T23:50:22Z Bruno Ferman http://arxiv.org/abs/2603.11368v1 Spatially Robust Inference with Predicted and Missing at Random Labels 2026-03-11T23:14:21Z When outcome data are expensive or onerous to collect, scientists increasingly substitute predictions from machine learning and AI models for unlabeled cases, a process which has consequences for downstream statistical inference. While recent methods provide valid uncertainty quantification under independent sampling, real-world applications involve missing at random (MAR) labeling and spatial dependence. For inference in this setting, we propose a doubly robust estimator with cross-fit nuisances. We show that cross-fitting induces fold-level correlation that distorts spatial variance estimators, producing unstable or overly conservative confidence intervals. 
To address this, we propose a jackknife spatial heteroscedasticity and autocorrelation consistent (HAC) variance correction that separates spatial dependence from fold-induced noise. Under standard identification and dependence conditions, the resulting intervals are asymptotically valid. Simulations and benchmark datasets show substantial improvement in finite-sample calibration, particularly under MAR labeling and clustered sampling. 2026-03-11T23:14:21Z Stephen Salerno Zhenke Wu Tyler McCormick http://arxiv.org/abs/2603.10999v1 Double Machine Learning for Time Series 2026-03-11T17:22:57Z We modify the Double Machine Learning estimator to broaden its applicability to macroeconomic time-series settings. A deterministic cross-fitting step, termed Reverse Cross-Fitting, leverages the time-reversibility of stationary series to improve sample utilization and efficiency. We detail and prove the conditions under which the estimator is asymptotically valid. We then demonstrate, through simulations, that its performance remains valid in realistic finite samples and is robust to model misspecification and violations of assumptions, such as heteroskedasticity. In high dimensions, predictive metrics for tuning nuisance learners do not generally minimize bias in the causal score. We propose a calibration rule targeting a "Goldilocks zone", a region of tuning parameters that delivers stable, partialled-out signals and reduced small-sample bias. Finally, we apply our procedure to residualized Local Projections to estimate the dynamic effects of a rise in Tier 1 regulatory capital. The results underscore the usefulness of the methodology for inference in macroeconomic applications. 
2026-03-11T17:22:57Z Milos Ciganovic Federico D'Amario Massimiliano Tancioni http://arxiv.org/abs/2602.09382v3 Initial-Condition-Robust Inference in Autoregressive Models 2026-03-11T08:14:42Z This paper considers confidence intervals (CIs) for the autoregressive (AR) parameter in an AR model with an AR parameter that may be close or equal to one. Existing CIs rely on the assumption of a stationary or fixed initial condition to obtain correct asymptotic coverage and good finite sample coverage. When this assumption fails, their coverage can be quite poor. In this paper, we introduce a new CI for the AR parameter whose coverage probability is completely robust to the initial condition, both asymptotically and in finite samples. This CI pays only a small price in terms of its length when the initial condition is stationary or fixed. The new CI also is robust to conditional heteroskedasticity of the errors. 2026-02-10T03:50:59Z Donald W. K. Andrews Ming Li Yapeng Zheng http://arxiv.org/abs/2603.10382v1 Gimbal Regression: Orientation-Adaptive Local Linear Regression under Spatial Heterogeneity 2026-03-11T03:51:57Z Local regression is widely used to explore spatial heterogeneity, but anisotropic or effectively low-dimensional neighborhoods can produce ill-conditioned local solves, causing coefficient variation driven by numerical artifacts rather than substantive structure. Such instability is often hidden when estimation relies on implicit tuning or optimization without exposing local diagnostics. This paper proposes Gimbal Regression (GR), a deterministic, geometry-aware local regression framework for stable and auditable estimation. GR constructs directional weights from neighborhood geometry using explicit orientation objects and deterministic safeguards, and computes local coefficients by a closed-form solve. 
Theoretical results are stated conditional on the realized neighborhood configuration, under which the estimator is a deterministic linear operator with finite-perturbation stability bounds. Simulations and empirical examples demonstrate predictable computation, transparent diagnostics, and improved numerical stability relative to common local regression baselines. 2026-03-11T03:51:57Z Yuichiro Otani http://arxiv.org/abs/2505.02327v4 Slope Consistency of Quasi-Maximum Likelihood Estimator for Binary Choice Models 2026-03-11T02:13:28Z Although QMLE is generally inconsistent, logistic regression relying on the binary choice model (BCM) with logistic errors is widely used, especially in machine learning contexts with many covariates. This paper revisits the slope consistency of QMLE for BCMs. Ruud (1983) introduced a set of conditions under which QMLE may yield a constant multiple of the slope coefficient of BCMs asymptotically. However, he did not fully establish the slope consistency of QMLE, which requires the existence of a positive multiple of the true slope that maximizes the population QMLE likelihood over an appropriately restricted parameter space. We close this gap by providing a formal proof of slope consistency under the same set of conditions for BCMs identified as in Manski (1975, 1985). Our result implies that, under suitable conditions, logistic regression yields a consistent estimate of the slope coefficient for BCMs. 2025-05-05T02:48:44Z Yoosoon Chang Joon Y. Park Guo Yan http://arxiv.org/abs/2207.02943v2 Degrees of Freedom and Information Criteria for the Synthetic Control Method 2026-03-11T01:03:20Z We provide an analytical characterization of the model flexibility of the synthetic control method (SCM) in the familiar form of degrees of freedom. 
We obtain estimable information criteria, which may be used to circumvent cross-validation when selecting either the tuning parameter in penalized variants of SCM or the weighting matrix in the SCM with covariates. We assess the impact of car license rationing in Tianjin; while a natural match is available, both it and other donors are noisy, inviting the use of SCM to average over approximately matching donors. The very large number of candidate donors calls for penalized variants of SCM, and we observe that model selection using information criteria outperforms that based on cross-validation. 2022-07-06T19:52:03Z Guillaume Allaire Pouliot Zhen Xie Ziyi Liu http://arxiv.org/abs/2603.10152v1 Shrinkage Regularization for (Non)Linear Serial Dependence Test 2026-03-10T18:42:11Z This paper introduces a regularized test of the null hypothesis of the absence of linear and nonlinear serial dependence for high-dimensional non-Gaussian time series. Our approach extends the portmanteau test introduced in Jasiak and Neyazi (2023) to the high-dimensional setting. 2026-03-10T18:42:11Z 10 pages Francesco Giancaterini Alain Hecq Joann Jasiak Aryan Manafi Neyazi http://arxiv.org/abs/1904.11060v8 Normal Approximation in Large Network Models 2026-03-10T17:15:58Z We prove a central limit theorem for network formation models with strategic interactions and homophilous agents. Since data often consist of observations on a single large network, we consider an asymptotic framework in which the network size diverges. We argue that a modification of "stabilization" conditions from the literature on geometric graphs provides a useful high-level formulation of weak dependence, which we use to establish an abstract central limit theorem. Using results in branching process theory, we derive interpretable primitive conditions for stabilization. The main conditions restrict the strength of strategic interactions and the equilibrium selection mechanism. 
We discuss practical inference procedures justified by our results. 2019-04-24T20:42:11Z Michael P. Leung Hyungsik Roger Moon http://arxiv.org/abs/2410.05861v2 Persistence-Robust Break Detection in Predictive CoVaR Regressions 2026-03-10T09:43:44Z Forecasting risk (as measured by quantiles) and systemic risk (as measured by Adrian and Brunnermeier's (2016) CoVaR) is important in economics and finance. However, past research has shown that predictive relationships may be unstable over time. Therefore, this paper develops structural break tests in predictive quantile and CoVaR regressions. These tests can detect changes in the forecasting power of covariates, and are based on the principle of self-normalization. We show that our tests are valid irrespective of whether the predictors are stationary or near-stationary, rendering the tests suitable for a range of practical applications. Simulations illustrate the good finite-sample properties of our tests. Two empirical applications concerning equity premium and systemic risk forecasting models show the usefulness of the tests. 2024-10-08T09:52:34Z Yannick Hoga http://arxiv.org/abs/2509.15169v3 Trade Dynamics with Heterogeneous Fluctuations 2026-03-10T04:32:46Z In this paper, we present two chapters on trade dynamics with heterogeneous fluctuations, contributing new insights to macroeconomic issues related to international trade. In the first chapter, we model general exchange rate fluctuations through stochastic processes and analyze the impact of heterogeneous price shocks on export competitiveness. We find that monetary policy and innovation both have positive effects on export trade: monetary policy stabilizes exchange rate fluctuations to comprehensively boost provincial export competitiveness, whereas innovation reduces its reliance on exchange rate mechanisms. 
The optimal policy under exchange rate fluctuations addresses the wealth distribution of exporters, and it suggests that policy should promote dynamic transitions in trade patterns rather than maintain existing comparative advantages in heterogeneous trade structures. In the second chapter, we model labor market fluctuations and the ability to utilize production factors through stochastic processes, and we analyze the impact of heterogeneous aggregate production shocks on general international trade. We find that labor market fluctuations benefit international trade only under the cooperation policy. Moreover, for both sanction and cooperation policy scenarios, positive shocks (i.e., shocks where average wage growth in the labor market exceeds unemployment) strengthen their impact on import trade while weakening their impact on export trade, and vice versa. We corroborate the theories proposed in these two chapters through empirical analyses using provincial data from China. 2025-09-18T17:21:32Z 95 pages, 47 figures Yongheng Hu http://arxiv.org/abs/2307.14282v4 Causal Effects in Matching Mechanisms with Strategically Reported Preferences 2026-03-09T19:13:17Z A growing number of authorities use mechanisms to allocate students to schools in a way that reflects student preferences and school priorities. However, most real-world mechanisms incentivize students to strategically misreport their preferences. Misreporting complicates the identification of causal parameters that depend on true preferences, which are necessary inputs for a broad class of counterfactual analyses. We provide an identification approach robust to misreporting and derive sharp bounds on causal effects of school assignment. Our approach applies to allocation rules characterized by placement scores and cutoffs. We use data from a deferred acceptance mechanism that assigns students to university programs in Chile. 
Matching theory predicts and empirical evidence shows that students behave strategically in Chile because they face constraints on preference submission and have good prior information about school accessibility. Our bounds are informative enough to reveal significant heterogeneity in graduation success with respect to preferences and school assignment. 2023-07-26T16:35:42Z Marinho Bertanha Margaux Luflade Ismael Mourifié http://arxiv.org/abs/2503.00290v3 GMM and M Estimation under Network Dependence 2026-03-09T17:17:24Z This paper presents GMM and M estimators and their asymptotic properties for network-dependent data. To this end, I build on Kojevnikov, Marmer, and Song (KMS, 2021) and develop a novel uniform law of large numbers (ULLN), which is essential to ensure desired asymptotic behaviors of nonlinear estimators (e.g., Newey and McFadden, 1994, Section 2). Using this ULLN, I establish the consistency and asymptotic normality of both GMM and M estimators. For practical convenience, complete estimation and inference procedures are also provided. 2025-03-01T01:46:53Z Yuya Sasaki http://arxiv.org/abs/2603.08614v1 Online Learning in Semiparametric Econometric Models 2026-03-09T16:58:03Z Data in modern economic and financial applications often arrive as a stream, requiring models and inference to be updated in real time -- yet most semiparametric methods remain batch-based and computationally impractical in large-scale streaming settings. We develop an online learning framework for semiparametric monotone index models with an unknown monotone link function. Our approach uses a two-phase learning paradigm. In a warm-start phase, we introduce a new online algorithm for the finite-dimensional parameter that is globally stable, yielding consistent estimation from arbitrary initialization. 
In a subsequent rate-optimal phase, we update the finite-dimensional parameter using an orthogonalized score while learning the unknown link via an online sieve method; this phase achieves optimal convergence rates for both components. The procedure processes only the most recent data batch, making it suitable when data cannot be stored (e.g., due to memory, privacy, or security constraints), and its resulting parameter trajectories enable online inference (such as confidence regions) on parameters, including policy-effect analysis, with negligible additional computation. Monte Carlo experiments on both simulated and real data show adequate performance, especially relative to full-sample methods. 2026-03-09T16:58:03Z Xiaohong Chen Elie Tamer Qingsong Yao http://arxiv.org/abs/2506.05996v3 Statistical significance in choice modelling: computation, usage and reporting 2026-03-09T16:12:29Z This paper offers a commentary on the use of notions of statistical significance in choice modelling. We review the reasons for uncertainty in parameter estimates, provide a precise discussion on the computation of measures of uncertainty and confidence intervals, and discuss the use of statistical tests. We argue that, as in many other areas of science, there is an over-reliance on 95% confidence levels, and misunderstandings of the meaning of significance. We also observe a lack of precision in the reporting of measures of uncertainty in many studies, especially when using p-values and even more so with "star" measures. The paper also stresses the importance of considering behavioural or policy significance in addition to statistical significance. Finally, we stress a number of points that are specific to choice modelling and that require special attention, notably in relation to derived measures such as willingness-to-pay, the treatment of random heterogeneity, and the use of repeated choice data. 
2025-06-06T11:35:06Z Stephane Hess Andrew Daly Michiel Bliemer Angelo Guevara Ricardo Daziano Thijs Dekker