http://arxiv.org/abs/2607.12219v1 Partial Identification with Multiple Nonlinear Measurements of a Latent Regressor 2026-07-13T23:33:54Z

We study linear regression when the regressor is latent and observed only through multiple noisy measurements, each a smooth but possibly nonlinear function of the latent variable. The problem is acute in the measurement of occupational exposure to artificial intelligence, where competing scores yield downstream estimates that differ by a factor of eleven. A regression on any single measurement recovers a source-specific coefficient rather than the structural one. We fix the latent scale by requiring the consensus measurement function to be linear and bound the remaining curvature heterogeneity across sources relative to slope. Under this bound, the structural coefficient lies in a closed-form interval centered at a symmetric cross-source estimator. The interval is invariant to unknown source loadings, and its half-width is second order in the curvature bound and sharp to the same order. With at least four measurements, the bound is estimable from the joint distribution of the sources through a split-instrument auxiliary regression, and Imbens-Manski confidence intervals with the Stoye critical value attain uniform coverage over the curvature class, including at the point-identified boundary. The application matches six exposure measures to an American Community Survey panel of 8.88 million person-year observations for 2015 to 2024. The post-2022 employment coefficient changes sign between the language-model measures and the Webb patent-text measure, and an ex ante factor-analytic rule separates the Webb measure as a distinct construct. The five retained sources yield a loading-invariant consensus coefficient of -0.239, with a partial-identification half-width of 1.23 percent of the point estimate, or 1.88 percent at the one-sided 95 percent upper bound on the curvature. We read the application as measurement reconciliation rather than as a causal estimate of AI displacement.

2026-07-13T23:33:54Z Burhan Ogut Michelle Yin

http://arxiv.org/abs/2405.07860v5 Order-Explicit Linearization of High-Dimensional $U$-Statistics 2026-07-13T19:52:34Z

We give an order-explicit large deviation bound for the difference between a high-dimensional $U$-statistic and its Hájek projection. In particular, we show that any $U$-statistic of order $b$ on $n$ observations, with a $d$-dimensional kernel whose coordinates have $\psi_1$-Orlicz norm at most $\phi$, has a maximum deviation from its Hájek projection of order $O_p(\phi b n^{-1} \log^2(dn))$. The proof relies on the development of novel order-explicit moment inequalities for higher-order Hoeffding components. We show that this rate is unimprovable, up to the polynomial factor on the logarithmic term. As corollaries, we obtain new Bernstein-type concentration and Gaussian approximation results for high-dimensional $U$-statistics. We apply these results to establish the consistency of a set of resampling-based simultaneous confidence intervals built around a class of nonparametric regression estimators constructed with subsampled kernels. This class encompasses several forms of random forest regression, including Generalized Random Forests.
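
For readers who want the objects in the abstract above spelled out, the following is a sketch of the textbook definitions, assuming i.i.d. observations $X_1, \dots, X_n$ and a symmetric kernel. The symbols $h$, $\theta$, $h_1$, and $H_n^{(c)}$ are notation chosen here for illustration and are not taken from the paper.

```latex
% U-statistic of order b with kernel h, and its mean
U_n = \binom{n}{b}^{-1} \sum_{1 \le i_1 < \dots < i_b \le n} h(X_{i_1}, \dots, X_{i_b}),
\qquad \theta = \mathbb{E}\, h(X_1, \dots, X_b).

% First-order projection of the kernel
h_1(x) = \mathbb{E}\, h(x, X_2, \dots, X_b) - \theta.

% Hajek projection: the best approximation of U_n by a sum of
% functions of single observations
\hat{U}_n = \theta + \frac{b}{n} \sum_{i=1}^{n} h_1(X_i).

% Hoeffding decomposition: the linearization error collects the
% components of order 2 through b
U_n - \hat{U}_n = \sum_{c=2}^{b} \binom{b}{c} H_n^{(c)}.
```

Here $H_n^{(c)}$ denotes the $c$-th order Hoeffding component, a degenerate $U$-statistic of order $c$. The bound quoted in the abstract controls the maximum over the $d$ coordinates of the left-hand side of the last display.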

2024-05-13T15:46:11Z David M. Ritzwoller Vasilis Syrgkanis

http://arxiv.org/abs/2607.11694v1 Calibrated Horizon-Weighted Local Projection Designs for Markov Switchbacks 2026-07-13T15:27:48Z

We study temporal assignment design for Markov switchback experiments when the reported object is a dynamic local-projection target. We develop a calibrated selector that chooses the feasible persistence minimizing the covariance, HAC, residual-bootstrap, or realized-schedule risk of the estimator and reporting object specified before the experiment. A balanced homoskedastic Markov benchmark yields a closed form because the lagged-assignment information matrix is AR(1)-Toeplitz with a tridiagonal inverse. The benchmark maps local-projection reporting weights into persistence recommendations within a prespecified first-order Markov class. Field recommendations replace the benchmark covariance with residualized, serially dependent, pilot-calibrated, or randomization-based risk. A semi-synthetic Low Carbon London evaluation uses observed half-hourly baseline dynamics and known injected responses to assess design risk. It evaluates the covariance calculations under realistic load autocovariance and identifies when calibrated covariance selection should replace the homoskedastic Markov formula. Near-boundary designs use randomization-first inference when many-spell normal approximations are unsupported.
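
The closed form mentioned above rests on a standard linear-algebra fact: a matrix with entries $\rho^{|i-j|}$ has a tridiagonal inverse. The sketch below checks that fact numerically in plain Python. The function names and the value of `rho` are hypothetical choices for illustration, and the code does not reproduce the paper's selector or its information matrix.

```python
def ar1_toeplitz(n, rho):
    """n x n matrix with entries rho ** |i - j| (AR(1) correlation structure)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def ar1_tridiagonal_inverse(n, rho):
    """Closed-form inverse of ar1_toeplitz(n, rho); requires n >= 2 and |rho| < 1."""
    if n < 2 or abs(rho) >= 1:
        raise ValueError("need n >= 2 and |rho| < 1")
    scale = 1.0 / (1.0 - rho ** 2)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Corner entries are 1, interior diagonal entries are 1 + rho^2.
        inv[i][i] = scale * (1.0 if i in (0, n - 1) else 1.0 + rho ** 2)
        if i + 1 < n:
            # Only the first off-diagonals are nonzero.
            inv[i][i + 1] = inv[i + 1][i] = -rho * scale
    return inv


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_deviation_from_identity(m):
    """Largest absolute entry of m - I."""
    n = len(m)
    return max(abs(m[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    sigma = ar1_toeplitz(6, 0.7)
    sigma_inv = ar1_tridiagonal_inverse(6, 0.7)
    # Should be zero up to floating-point rounding.
    print(max_deviation_from_identity(matmul(sigma, sigma_inv)))
```

Because the inverse has only three nonzero bands, quadratic forms in it reduce to sums over adjacent lags, which is what makes a closed-form benchmark plausible in this setting.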

2026-07-13T15:27:48Z Makoto Nakakita Teruo Nakatsuma

http://arxiv.org/abs/2405.16547v2 Estimating Dyadic Treatment Effects with Unknown Confounders 2026-07-13T12:17:42Z

This paper proposes estimation and inference methods for assessing treatment effects with dyadic data. Under the assumption that the treatments follow an exchangeable distribution, our approach allows for the presence of any unobserved confounding factors that potentially cause endogeneity of treatment choice without requiring additional information other than the treatments and outcomes. Building on the literature of graphon estimation in network data analysis, we propose a neighbourhood kernel smoothing method for estimating dyadic average treatment effects, and derive the rate of convergence of the proposed estimator under certain regularity conditions. We also develop conformal inference methods for predicting outcomes conditional on treatment status. We apply our methods to international trade data to assess the impact of free trade agreements on bilateral trade flows.

2024-05-26T12:32:14Z Tadao Hoshino Takahide Yanagi

http://arxiv.org/abs/2312.01162v4 High-dimensional inference on jumps in nonparametric time series regression models 2026-07-13T10:47:38Z

We study simultaneous inference on jumps in the conditional mean functions of a high-dimensional collection of heterogeneous nonparametric time series, where the number of series may exceed the sample size and the data may exhibit strong cross-sectional dependence. The jump depends on one specific covariate, and we allow the regression function to vary with additional latent variables. We propose two uniform tests: one for the existence of jumps and one for their homogeneity across series. We derive a simple closed-form approximation to the covariance structure of the jump estimators and establish a high-dimensional Gaussian approximation showing that, owing to the localized construction of the statistics, the maximum of the studentized jumps is approximated by the maximum of independent Gaussians. The cross-sectional dependence is thus asymptotically negligible for critical values, even under strong (e.g., factor) dependence, and the approximation requires estimating only the variance for each series. For pronounced cross-sectional dependence, a dependence-aware refinement restores the off-diagonal covariances, improving finite-sample size and power. Simulations show accurate size and reasonable power under both cross-sectional and serial dependence, and two empirical applications reveal significant non-smooth effects.

2023-12-02T15:52:24Z Likai Chen Georg Keilbar Liangjun Su Weining Wang

http://arxiv.org/abs/2607.11268v1 Can looser ties sustain marriage? A dynamic matching model of specialisation and divorce 2026-07-13T08:51:31Z

Durable marriages are presumed to foster the household specialisation that marriage enables. We exploit a recent Dutch reform that temporarily lowered the cost of divorce while leaving consent requirements unchanged. We embed divorce hazards obtained from population-level administrative data into a dynamic structural matching model in which individuals repeatedly match and choose marital roles. We identify the structural parameters by fitting the model to the equilibrium matching distribution over time, using a novel computational approach. Compared to the high-cost counterfactual, we find that more couples choose marriage when divorce costs are lower, as higher rates of marriage entry outweigh the rise in divorce. Because specialisation is preserved, aggregate welfare rises.

2026-07-13T08:51:31Z Stefan Hubner Jan Kabatek

http://arxiv.org/abs/2601.04663v4 Quantile Vector Autoregression without Crossing 2026-07-13T07:40:31Z

This paper considers estimation and model selection of quantile vector autoregression (QVAR). Conventional quantile regression often yields undesirable crossing quantile curves, violating the monotonicity of quantiles. To address this issue, we propose a simplex quantile vector autoregression (SQVAR) framework, which transforms the autoregressive (AR) structure of the original QVAR model into a simplex, ensuring that the estimated quantile curves remain monotonic across all quantile levels. In addition, we impose the smoothly clipped absolute deviation (SCAD) penalty on the SQVAR model to mitigate the explosive nature of the parameter space. We further develop a Bayesian information criterion (BIC)-based procedure for selecting the optimal penalty parameter and introduce new frameworks for impulse response analysis of QVAR models. Finally, we establish asymptotic properties of the proposed method, including the convergence rate and asymptotic normality of the estimator, the consistency of AR order selection, and the validity of the BIC-based penalty selection. For illustration, we apply the proposed method to U.S. stock market data, highlighting the usefulness of our SQVAR method.

2026-01-08T07:19:19Z Tomohiro Ando Tadao Hoshino Ruey Tsay

http://arxiv.org/abs/2607.10943v1 It Takes Two to Tango, but More to Assess Systemic Risk: Credit Networks Through the Lens of Hypergraphs 2026-07-12T22:18:59Z

This paper provides the first analysis of credit relationships between financial institutions and firms through the lens of hypergraphs. Unlike traditional network approaches, which rely on pairwise connections, this framework explicitly represents the shared exposure of multiple financial institutions to the same firm as a simultaneous multilateral relationship. The approach is applied empirically to Credit Registry data from the Central Bank of Argentina, covering the period from August 2023 to December 2025 and focusing on commercial loans between banks and firms. Traditional centrality metrics are compared with hypergraph-specific measures to identify systemically relevant institutions. The paper also proposes an adjusted version of H-eigenvector centrality that nonlinearly weights both the centrality of neighboring institutions and each creditor's lending amount, in order to assess the relevance of a bank within the network. The systemic impact of shocking the top-ranked institutions according to each centrality metric is then estimated through an adaptation of the DebtRank algorithm. The results show that the proposed framework identifies institutions with greater shock-amplification capacity, providing a complementary tool for financial supervision and regulation.

2026-07-12T22:18:59Z 26 pages, 11 figures, 2 tables Federico D. Forte

http://arxiv.org/abs/2501.07550v3 disco: Distributional Synthetic Controls 2026-07-12T16:56:28Z

The method of synthetic controls is widely used for evaluating causal effects of policy changes in settings with observational data. Often, researchers aim to estimate the causal impact of policy interventions on a treated unit at an aggregate level while also possessing data at a finer granularity. In this article, we introduce the new disco command, which implements the Distributional Synthetic Controls method introduced in Gunsilius (2023, Econometrica 91: 1105-1117). This command allows researchers to construct entire synthetic distributions for the treated unit based on an optimally weighted average of the distributions of the control units. Several aggregation schemes are provided to facilitate clear reporting of the distributional effects of the treatment. The package offers both quantile-based and cumulative distribution function-based approaches, comprehensive inference procedures via bootstrap and permutation methods, and visualization capabilities. We empirically illustrate the use of the package by replicating the results in Van Dijcke, Gunsilius, and Wright (2026, Review of Economics and Statistics, forthcoming).

2025-01-13T18:36:38Z 19 pages, 4 figures, replication code available at https://tinyurl.com/msz9ct2e, software available at https://github.com/Davidvandijcke/DiSCos_stata Florian Gunsilius David Van Dijcke

http://arxiv.org/abs/2607.11961v1 KRAFT: A Transaction-Level Dataset for Korean Apartment Sales Integrated with Contextual Indicators 2026-07-12T10:19:19Z

Apartment transaction records are useful for studying housing markets, household finance, regional economics, and macro-financial transmission, but transaction data are often distributed separately from contextual socioeconomic indicators. We present KRAFT, a nationwide transaction-level dataset of South Korean apartment sales from January 2015 to December 2024. The dataset contains 5,320,379 apartment sale transactions across all 17 Sido regions and includes transaction timing, administrative location, exclusive residential area, reported transaction price, floor level, and construction year. KRAFT also provides auxiliary indicators covering macro-financial conditions, demographic structure, education infrastructure, private education expenditure, housing price indices, consumer sentiment, and economic policy uncertainty. The released files are organized as year-specific transaction files and separate auxiliary data tables to preserve the original temporal and spatial resolution of each source. KRAFT supports reproducible research on apartment price modeling, regional housing-market comparison, housing-demand analysis, and links between housing transactions and socioeconomic context.

2026-07-12T10:19:19Z Sejin Myung Hyungjoon Kim

http://arxiv.org/abs/2604.09858v2 Coupling Designs for Randomized Experiments with Complex Treatments 2026-07-12T08:55:32Z

We describe a new family of experimental designs that extends the principle of stratified randomization to settings with continuous, constrained multivariate, and other irregular treatment spaces. Our approach is to first match units into homogeneous groups, then use Monte Carlo couplings to assign within-group treatments to be highly dispersed over the treatment space. We show that ensuring similar units receive dissimilar treatments improves estimation efficiency. The efficiency gains are proportional to the product of dispersion and match quality, where dispersion measures how spread out the assignments are relative to independent randomization. We develop a new spectral analysis showing how efficiency depends on alignment between the smoothness and shape of the estimator's influence function and the coupling's principal directions. We illustrate these designs with examples from development, behavioral, and labor economics. In particular, our empirical application uses data from a real experiment allocating savings monitors using their position within village social networks.

2026-04-10T19:39:32Z Max Cytrynbaum Fredrik Sävje

http://arxiv.org/abs/2607.10613v1 Network-Adjusted GMM Estimation under Network Uncertainty 2026-07-12T07:18:02Z

This paper proposes a network-adjusted generalized method of moments (NA-GMM) estimator for social interaction models when the observed network may differ from the true interaction network. NA-GMM is a novel penalized GMM approach that allows the elements of the observed interaction matrix to be modified to improve the fit of the moment conditions. To avoid unrestricted network adjustments, the NA-GMM criterion introduces a penalty on the amount of adjustment. Since NA-GMM does not aim to estimate the true interaction network itself, the estimator generally converges to a pseudo-true parameter. For a linear spatial autoregressive model, we prove that the NA-GMM estimator is consistent for the pseudo-true parameter and is asymptotically normally distributed under general moment misspecification. We also prove that a fixed-weight version of the NA-GMM estimator has a desirable bias reduction property relative to conventional GMM without network adjustment. An empirical application to U.S. county-level COVID-19 infection data demonstrates the usefulness of the proposed method.

2026-07-12T07:18:02Z Tadao Hoshino

http://arxiv.org/abs/2607.10558v1 Local Asymptotic Power of Honest Confidence Intervals 2026-07-12T04:17:16Z

Confidence intervals that are conservative against an untestable bias, called bias-aware or honest, are now standard in DiD, IV, RD, and factor-model settings. This paper characterises the local power of the tests they induce. Power is governed by the rate of the bias bound relative to the sampling rate, giving three regimes: when the bound vanishes faster than the standard error, conservatism is asymptotically free; when the two are of the same order it costs a bounded, explicit amount; and when the bound dominates, the typical case at the parametric rate, the honest test has zero local power, failing to reject local alternatives with probability approaching one. A minimax argument shows this loss is intrinsic to honesty itself, not a property of any particular construction. No honest procedure recovers it, and the standard bias-aware interval is rate-optimal. Broadly, any confidence interval whose width fails to shrink fast enough has no local power in the interior of the set it traces out, and at best one-sided power at the boundary. Partial identification is the limiting case of this argument. Simulations and two empirical applications illustrate the three regimes. The practical recommendation is to report the half-width of the power "dead zone" alongside bias-aware intervals.
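
The three regimes can be made concrete with a small numerical sketch. It assumes that the "standard bias-aware interval" takes the common fixed-length form, estimate ± standard error × critical value, where the critical value is the $1-\alpha$ quantile of $|N(t, 1)|$ and $t$ is the ratio of the bias bound to the standard error. The abstract does not state the construction, so this form and the function names below are assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bias_aware_critical_value(bias_ratio, alpha=0.05, tol=1e-10):
    """Return c such that P(|Z + bias_ratio| <= c) = 1 - alpha for Z ~ N(0, 1).

    bias_ratio is the worst-case bias divided by the standard error
    (assumed nonnegative). Solved by bisection using the standard library.
    """
    if bias_ratio < 0:
        raise ValueError("bias_ratio must be nonnegative")

    def coverage(c):
        return _STD_NORMAL.cdf(c - bias_ratio) - _STD_NORMAL.cdf(-c - bias_ratio)

    lo, hi = 0.0, bias_ratio + 10.0  # coverage(hi) is numerically 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def honest_interval(estimate, std_error, bias_bound, alpha=0.05):
    """Fixed-length bias-aware interval under the assumed construction."""
    cv = bias_aware_critical_value(bias_bound / std_error, alpha)
    half_width = cv * std_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    # Illustrative ratios only: negligible bias, comparable bias, dominant bias.
    for ratio in (0.0, 1.0, 5.0):
        print(ratio, round(bias_aware_critical_value(ratio), 4))
```

With a ratio of zero the critical value equals the usual two-sided normal value, so conservatism costs nothing. With a large ratio it approaches the ratio plus the one-sided normal value, so the half-width is dominated by the bias bound rather than by sampling error, which corresponds to the regime the abstract describes as having zero local power.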

2026-07-12T04:17:16Z Hugo Freeman

http://arxiv.org/abs/2607.10519v1 Dynamically Consistent Statistical Decisions 2026-07-12T00:45:36Z

A large literature in econometrics proposes decision rules with optimality guarantees based on ex ante criteria, such as minimax regret. We develop a framework for analyzing the dynamic consistency of such rules and show that, in many empirically relevant settings, the researcher may wish to deviate from the interim prescription of ex ante optimal rules after observing the data realization. To address this problem, we propose and axiomatize two classes of optimality criteria that yield dynamically consistent decision rules.

2026-07-12T00:45:36Z 45 pages, 3 figures Cheaheon Lim Yechan Park

http://arxiv.org/abs/2607.10276v1 Bayesian Robustness Values for Modern Causal Panel Estimators via Riesz Representations 2026-07-11T12:15:21Z

We develop a sensitivity-analysis workflow for causal panel estimators, covering synthetic difference-in-differences, matrix completion, fixed-effect imputation, and group-time average treatment effects. The workflow combines Riesz-representation omitted-variable-bias bounds with partial-$R^2$ robustness values and separates two reporting routes. Route A gives a direct sensitivity profile for additive or projected confounding summarized by outcome-side and Riesz-side partial $R^2$ values. Route B treats observed-covariate benchmarks as auxiliary data only when benchmark-count, alpha-side alignment, model-check, dependence, and dominance diagnostics are credible; otherwise its main role is demotion. We derive estimator-specific Riesz diagnostics and clarify which are fixed-weight, target-level, or first-stage-conditional rather than full derivatives of regularized training maps. Monte Carlo stress tests distinguish calibrated benchmark settings from dominance failure, coarse alpha-side benchmarks, benchmark dependence, noisy covariates, and concentrated SDID weights. In the California tobacco-control panel, the SDID estimate is $-15.60$ packs per capita; corrected finite-donor placebo inference gives standard error 9.49 and add-one $p=0.051$. A refit-weight finite-difference audit changes the Route A nullification robustness value from 0.054 to 0.045, leaving the low-single-digit conclusion unchanged. A county-level minimum-wage application applies the same profile to a multi-cohort staggered panel.

2026-07-11T12:15:21Z Makoto Nakakita Takahiro Hoshino