http://arxiv.org/abs/2605.06420v1 Beyond Object-Level Alignment: Do Brains and DNNs Preserve the Same Transformations? 2026-05-07T15:27:31Z

Brain-DNN alignment is usually assessed through stimulus-level correspondence or stimulus-set geometry. Inspired by category theory, we operationalize a different question: do brain and model preserve the same candidate transformations among stimuli? We formalize this as approximate naturality: if a proxy-defined stimulus change is propagated through the brain side and then translated to the model side, the result should match translating first and then propagating, so that the naturality square approximately commutes. We quantify deviations from commutativity by a Naturality Violation Score (NVS) normalized to a permutation null, shifting alignment from per-stimulus sameness to preservation of structure under an explicitly chosen comparison map. As a proof of concept, a controlled five-factor synthetic setting shows that NVS separates complementary alignment failures that aggregate object- and geometry-level scalars cannot resolve. Applied to fMRI responses from the GOD dataset (5 subjects), 3 vision DNNs, and 3 World-Model proxy embeddings, the axis-resolved analysis reveals a hierarchy crossover: semantic axes align most strongly toward HVC and deeper DNN layers (NVS^animacy = 0.39 vs 0.52 for the next-best axis and 1.0 for the permutation-null baseline), whereas low- and mid-level visual axes align toward earlier visual cortex and shallower layers. Supporting analyses (a 15-axis appendix atlas, dissociation tests against RSA/CKA and encoding/decoding accuracy, and a W-less anchor-ablation control) confirm that the alignment is selective over candidate morphism families rather than uniform. NVS thereby turns brain-DNN comparison into a test of jointly preserved candidate transformations, relative to an explicit proxy space and permutation null, and opens a path to richer proxy spaces and controlled world-side transformations.
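
The commuting-square idea in this abstract can be illustrated with a small sketch. Everything specific here is an assumption, since the abstract does not give the construction: the function name `nvs`, the linear least-squares comparison map `W`, the use of random stimulus pairs in place of proxy-defined changes, and a permutation null that shuffles which model response is matched to which stimulus. The score is the mean gap between the two paths around the square, divided by its mean under that null, so values near 1 mean no better than chance.

```python
import numpy as np

def nvs(B, M, pairs, n_perm=200, seed=0):
    """Hypothetical naturality violation score (not the paper's exact definition).

    B: (n_stimuli, p) brain responses; M: (n_stimuli, q) model responses;
    pairs: (n_pairs, 2) integer array of (source, target) stimulus indices.
    """
    rng = np.random.default_rng(seed)
    # Assumed comparison map: linear least-squares translation from brain to model space.
    W, *_ = np.linalg.lstsq(B, M, rcond=None)
    src, dst = pairs[:, 0], pairs[:, 1]
    # Path 1: propagate the stimulus change on the brain side, then translate.
    translated = (B[dst] - B[src]) @ W

    def violation(M_side):
        # Path 2: the same stimulus change as seen directly on the model side.
        propagated = M_side[dst] - M_side[src]
        return np.mean(np.linalg.norm(translated - propagated, axis=1))

    observed = violation(M)
    # Assumed null: break the stimulus correspondence by shuffling model rows.
    null = np.mean([violation(M[rng.permutation(len(M))]) for _ in range(n_perm)])
    return observed / null

# Synthetic check: a model that is a noisy linear image of the brain data
# versus a model that is unrelated to it.
rng = np.random.default_rng(1)
B = rng.normal(size=(200, 10))
A = rng.normal(size=(10, 6))
M_aligned = B @ A + 0.1 * rng.normal(size=(200, 6))
M_unrelated = rng.normal(size=(200, 6))
pairs = rng.integers(0, 200, size=(100, 2))

score_aligned = nvs(B, M_aligned, pairs)
score_unrelated = nvs(B, M_unrelated, pairs)
print(score_aligned, score_unrelated)
```

In this toy setting the aligned model scores far below 1 and the unrelated model stays close to 1. In practice `W` would be fitted on held-out data and the pairs would come from a chosen proxy axis.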

2026-05-07T15:27:31Z Yukiyasu Kamitani http://arxiv.org/abs/2605.18789v1 Features have life history. And we should care 2026-05-07T15:12:42Z

Features in language models have life history: they emerge, persist, and die during training, yet the importance of that history remains largely unexplored. We find evidence of a persistent representational backbone, which we identify in Pythia-160M and -410M as the carrier scaffold: ${\sim}50$ sparse features with stable life histories, around which the model's representational structure organises. It has four properties. \emph{(i)}~\emph{It assembles early:} features emerge, die, and reorganise ${\sim}40\!\times$ faster in the first $1\%$ of training than afterwards, and the scaffold is already largely fixed by then. \emph{(ii)}~\emph{It is load-bearing:} joint cross-layer ablation identifies the carriers as far more load-bearing than any count-matched non-scaffold population, a gap invisible to per-firing single-feature methods. \emph{(iii)}~\emph{Function precedes direction:} which features will become carriers is already predictable from training-onset firing patterns alone, correctly distinguishing future carriers from non-carriers in $4$ of $5$ cases, before the geometry has settled. \emph{(iv)}~\emph{It seeds subsequent development:} by the end of training, scaffold carriers have recruited $64\%$ of all active features into the scaffold hierarchy. Life history is consistent with a two-phase account of training: selection appears to largely determine the scaffold in the first $1\%$; the remaining $99\%$ appears to calibrate geometry around a substrate already set.

2026-05-07T15:12:42Z 21 pages, 7 figures Philipp Stecher Sandro Radovanović Vlasta Sikimić Reinhard Kahle http://arxiv.org/abs/2605.06304v1 A multi-scale information geometry reveals the structure of mutual information in neural populations 2026-05-07T14:07:43Z

Understanding how neural population responses represent sensory information is a central problem in systems neuroscience. One approach is to define a representational geometry on stimulus space in which distances reflect how reliably stimuli can be distinguished from neural activity. However, different constructions of these distances can lead to qualitatively different conclusions about the neural code. Here, we show that a unique Riemannian representational geometry emerges from first principles governing how distances contract as stimulus resolution is lost through coarse-graining. This results in a multi-scale extension of the Fisher information metric, capturing encoding structure from fine stimulus details to coarse global distinctions. The resulting geometry is exactly related to the mutual information encoded by the population: well encoded stimulus directions - those contributing more to mutual information - are expanded, whereas poorly encoded directions are contracted. The metric tensor can be estimated using diffusion models, making the framework practical for large neural populations and high-dimensional stimuli. Applied to visual cortical responses to natural images, the eigenvectors of the metric tensor identify stimulus variations that contribute most to information transmission, yielding interpretable features that are robust to modelling choices. Together, these results provide a principled, information-theoretic framework for characterising neural population codes.

2026-05-07T14:07:43Z Simone Azeglio Steeve Laquitaine Ulisse Ferrari Matthew Chalk http://arxiv.org/abs/2605.05907v1 Decoding Alignment without Encoding Alignment: A critique of similarity analysis in neuroscience 2026-05-07T09:17:47Z

Decoding approaches are widely used in neuroscience and machine learning to compare stimulus representations across neural systems, such as different brain regions, organisms, and deep learning models. Popular methods include decoding (perceptual) manifolds and alignment metrics such as Representational Similarity Analysis (RSA) and Dynamic Similarity Analysis (DSA), where similarity in decoding representations is interpreted as evidence for similar computation. This paper demonstrates a fundamental weakness behind this approach: it is misleading to assume that representational geometry is representative of a neuronal population as a whole, when such representations may actually be shaped by a very small subset of neurons. We show that the complementary encoding paradigm addresses this issue directly: it characterizes how neurons are organized globally in terms of their responses to a set of data, providing insight into how the decoding representation is implemented by neurons within a population. We demonstrate across experiments in biological systems and deep learning models that (i) surprisingly, similar decoding behavior and high representational alignment can arise from small, non-representative subpopulations of neurons; and critically, (ii) alignment metrics are insensitive to encoding manifold topology (how function is distributed across neurons), despite this being a key signature of differentiation across biological systems. A controlled MNIST experiment provides causal evidence: decoding metrics remain unchanged even when encoding topology is causally manipulated via the training loss. Overall, similarity in decoding behavior, as measured by classic alignment metrics, does not imply similarity in function or computation, motivating the use of encoding manifolds as a complementary tool for comparing neural systems.
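
The claim that representational geometry can be shaped by a small subset of neurons can be illustrated with a toy example. This is a sketch under assumptions, not the paper's experiments: the population, the gain values, and the helper names `rdm` and `rsa_score` are hypothetical, and RSA is implemented in its common form as a rank correlation between correlation-distance dissimilarity matrices.

```python
import numpy as np

def rdm(R):
    """Correlation-distance RDM: 1 - Pearson r between stimulus response patterns (rows)."""
    return 1.0 - np.corrcoef(R)

def rsa_score(R1, R2):
    """Spearman-style correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices(R1.shape[0], k=1)
    a, b = rdm(R1)[iu], rdm(R2)[iu]
    # Double argsort converts values to ranks (ties are negligible for continuous data).
    ra, rb = a.argsort().argsort(), b.argsort().argsort()
    return np.corrcoef(ra, rb)[0, 1]

rng = np.random.default_rng(0)
n_stim, n_neurons, n_driver = 40, 500, 20

# Hypothetical population: 20 high-gain neurons and 480 low-gain neurons.
drivers = 10.0 * rng.normal(size=(n_stim, n_driver))
background = rng.normal(size=(n_stim, n_neurons - n_driver))
population = np.hstack([drivers, background])

score_subset = rsa_score(population, drivers)   # geometry of 4% of the neurons
score_rest = rsa_score(population, background)  # geometry of the other 96%
print(score_subset, score_rest)
```

The full-population geometry here matches the 20 high-gain neurons closely and is nearly uncorrelated with the geometry of the remaining 480, so two populations could score as aligned while most of their neurons behave differently.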

2026-05-07T09:17:47Z 40 pages, 27 figures Johannes Bertram Luciano Dyballa T. Anderson Keller Savik Kinger Steven W. Zucker http://arxiv.org/abs/2603.16281v2 Laya: A LeJEPA Approach to EEG via Latent Prediction over Reconstruction 2026-05-07T06:34:57Z

Electroencephalography (EEG) is a widely used tool for studying brain function, with applications in clinical neuroscience, diagnosis, and brain-computer interfaces (BCIs). Recent EEG foundation models trained on large unlabeled corpora aim to learn transferable representations, but their effectiveness remains unclear; reported improvements over smaller task-specific models are often modest, sensitive to downstream adaptation and fine-tuning strategies, and limited under linear probing. We hypothesize that one contributing factor is the reliance on signal reconstruction as the primary self-supervised learning (SSL) objective, which biases representations toward high-variance artifacts rather than task-relevant neural structure. To address this limitation, we explore an SSL paradigm based on Joint Embedding Predictive Architectures (JEPA), which learn by predicting latent representations instead of reconstructing raw signals. We introduce Laya, the first EEG foundation model based on LeJEPA. We show that latent prediction yields representations that encode semantic structure in EEG: Laya embeddings track clinically meaningful state changes such as seizure onset, are resilient to noise, and achieve the strongest mean clinical accuracy under frozen linear probing, with particular gains on tasks where relevant neural patterns are subtle and easily obscured by artifacts. Controlled ablations against matched MAE variants confirm that the choice of pretraining objective, rather than architecture or data, is the primary driver of these gains.

2026-03-17T09:13:29Z Saarang Panchavati Uddhav Panchavati Hiroki Nariai Corey Arnold William Speier http://arxiv.org/abs/2509.15832v3 Overcoming Output Dimension Collapse: When Sparsity Enables Zero-shot Brain-to-Image Reconstruction at Small Data Scales 2026-05-07T06:31:31Z

Advances in brain-to-image reconstruction are enabling us to externalize the subjective visual experiences encoded in the brain as images. A key challenge in this task is data scarcity: a translator that maps brain activity to latent image features is trained on a limited number of brain-image pairs, making the translator a bottleneck for zero-shot reconstruction beyond the training stimuli. In this paper, we mathematically analyze the behavior of two translators commonly used in recent reconstruction pipelines: naive multivariate linear regression and sparse multivariate linear regression. We define the data scale as the ratio of the number of training samples to the latent feature dimensionality and characterize the behavior of each model across data scales. Building on a standard structural property of naive multivariate regression, we first show that the resulting ``output dimension collapse'' can become a practical generalization bottleneck in brain-to-image reconstruction. We introduce the best prediction diagnostic, which is computable without brain activity, to quantify the practical impact of this collapse. We then analyze sparse linear regression models in a student--teacher framework and derive expressions for the prediction error in terms of data scale and other sparsity-related parameters. Our analysis clarifies when variable selection can reduce prediction error at small data scales by exploiting the sparsity of the brain-to-feature mapping. Our findings provide quantitative guidelines for diagnosing output dimension collapse and for designing effective translators and feature representations for zero-shot reconstruction.
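
One reading of the structural property behind "output dimension collapse" is that a linear regression fitted on n samples can only predict within the span of its n training targets, however large the latent dimensionality. The sketch below checks this numerically. It is an illustration under assumptions: ridge regression stands in for the "naive" translator, and the sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_latent = 20, 50, 100, 64  # data scale = 20 / 64 < 1

X_train = rng.normal(size=(n_train, n_voxels))   # brain activity (training)
Y_train = rng.normal(size=(n_train, n_latent))   # latent image features (training)
X_test = rng.normal(size=(n_test, n_voxels))     # brain activity for novel stimuli

# Ridge solution W = (X^T X + lam I)^{-1} X^T Y, fitted jointly for all outputs.
lam = 1.0
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(n_voxels), X_train.T @ Y_train)
Y_pred = X_test @ W

# Predictions have the form K @ Y_train, so their rank cannot exceed n_train.
rank_pred = np.linalg.matrix_rank(Y_pred, tol=1e-8)

# Project predictions onto the span of the training targets; the residual should vanish.
coef, *_ = np.linalg.lstsq(Y_train.T, Y_pred.T, rcond=None)
residual = np.linalg.norm(Y_train.T @ coef - Y_pred.T)
print(rank_pred, residual)
```

With 20 training samples and 64 latent dimensions, all 50 test predictions fall in a subspace of dimension at most 20, which is why a target feature vector outside that subspace cannot be reached regardless of the brain data.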

2025-09-19T10:01:43Z Transactions on Machine Learning Research, 2026 Kenya Otsuka Yoshihiro Nagano Yukiyasu Kamitani http://arxiv.org/abs/2508.11659v2 Toward Practical Equilibrium Propagation: Brain-inspired Recurrent Neural Network with Feedback Regulation and Residual Connections 2026-05-07T04:26:28Z

Brain-like intelligent systems need brain-like learning methods. Equilibrium Propagation (EP) is a biologically plausible learning framework with strong potential for brain-inspired computing hardware. However, existing implementations of EP suffer from instability and prohibitively high computational costs. Inspired by the structure and dynamics of the brain, we propose a biologically plausible Feedback-regulated REsidual recurrent neural network (FRE-RNN) and study its learning performance in the EP framework. Feedback regulation enables rapid convergence by reducing the spectral radius. The improvement in convergence property reduces the computational cost and training time of EP by orders of magnitude, delivering performance on par with backpropagation (BP) in benchmark tasks. Meanwhile, residual connections with brain-inspired topologies help alleviate the vanishing gradient problem that arises when feedback pathways are weak in deep RNNs. Our approach substantially enhances the applicability and practicality of EP in large-scale networks that underpin artificial intelligence. The techniques developed here also offer guidance for implementing in-situ learning in physical neural networks.

2025-08-05T15:07:50Z Zhuo Liu Tao Chen http://arxiv.org/abs/2605.04088v2 Noise-accelerated Kramers Escape and Coherence Resonance in a 5D Neural Manifold 2026-05-07T02:45:06Z

Intrinsic channel noise is fundamental to neural processing, yet its state-dependent nature, when constrained by strict Feller boundary conditions, is often overlooked. Here, we demonstrate that this bounded multiplicative noise is not merely a source of jitter but an active dynamical force that fundamentally reshapes neural excitability. Investigating a 5D Hodgkin-Huxley-type cortical pacemaker model, we utilize a full-truncation semi-implicit Euler scheme to ensure rigorous probability conservation and domain-preserving integration. Through comprehensive parameter sweeps, we uncover a rich triphasic landscape of noise-induced transitions dictated by the underlying bifurcation structure. Deep in the subthreshold regime, multiplicative noise acts as a constructive force, triggering stochastic awakening via Kramers escape. Near the subcritical Hopf bifurcation, this evolves into highly robust coherence resonance (CR). Crucially, in the supra-threshold oscillatory regime, our framework reveals a striking dynamical shift: a generalized, noise-accelerated Kramers escape. Under extreme multiplicative noise - characteristic of sparse channel populations - strictly bounded fluctuations actively amplify escape rates from the hyperpolarized slow manifold, transforming regular pacing into high-frequency, irregular bursting. Conductance perturbation experiments confirm the profound biological robustness of this transition. These findings establish a physically rigorous mechanism for how boundary-constrained noise drives high-dimensional oscillators toward states of pathological hyperexcitability.

2026-04-23T21:06:52Z 12 pages, 7 figures, revised version with more rigorous stability derivations. Currently under review at Physical Review E Yefan Wu http://arxiv.org/abs/2507.02304v2 Overcoming the Curse of Dimensionality: Structural Connectivity Reconstruction via Pairwise Information Flow in Nonlinear Networks 2026-05-06T21:40:06Z

Inferring structural connectivity from observed dynamics remains a fundamental open problem in complex systems, particularly for nonlinear networks where direct measurements are unavailable, and existing methodological approaches each incur characteristic limitations. Model-based methods require prior knowledge of the mechanistic form of the underlying dynamics, while model-free approaches often lack quantitative correspondence to network structural connectivity, and suffer from the curse of dimensionality as the size and complexity of the system increases. Here we show that pairwise time-delayed information flow is sufficient to recover, without high-dimensional conditioning, structural connectivity in general nonlinear networks. We introduce a pairwise delayed information flow (PDIF) as an information-theoretic framework and derive a theoretical quadratic relationship between PDIF and coupling strength, establishing a direct correspondence between information flow and network architecture. We further show that indirect interaction contributions are suppressed at leading order, enabling accurate reconstruction solely from pairwise measurements. Combining binary state representations, pairwise inference, and time-delayed statistics, PDIF overcomes the dimensionality barrier while remaining model-agnostic and scalable. Validated across nonlinear dynamical systems, neuronal network models, and large-scale electrophysiological recordings, PDIF achieves high reconstruction accuracy and robustness to noise, outperforming existing methods. These results establish a principled, efficient and model-agnostic framework for connectivity reconstruction, and reveal a general mechanism by which pairwise observable statistics encode network structure in nonlinear systems.
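
The combination of binary states, pairwise inference, and time-delayed statistics can be sketched with a plain delayed mutual information estimate. This is a stand-in under assumptions: the abstract does not define PDIF, so the function `delayed_mutual_information`, the three-node toy network, and the copy probability are all hypothetical.

```python
import numpy as np

def delayed_mutual_information(x, y, delay=1):
    """Mutual information (nats) between binary x(t) and y(t + delay), delay >= 1."""
    a, b = x[:-delay], y[delay:]
    mi = 0.0
    for u in (0, 1):
        for v in (0, 1):
            p_uv = np.mean((a == u) & (b == v))
            p_u, p_v = np.mean(a == u), np.mean(b == v)
            if p_uv > 0:
                mi += p_uv * np.log(p_uv / (p_u * p_v))
    return mi

rng = np.random.default_rng(0)
T = 20000
x0 = rng.integers(0, 2, size=T)   # driver node
x2 = rng.integers(0, 2, size=T)   # unconnected node
# Node 1 copies node 0's previous state with probability 0.8, otherwise it is random.
copy = rng.random(T) < 0.8
x1 = np.where(copy, np.roll(x0, 1), rng.integers(0, 2, size=T))
states = np.stack([x0, x1, x2])

# flow[i, j] estimates information passing from node i to node j at delay 1.
flow = np.array([[delayed_mutual_information(states[i], states[j]) if i != j else 0.0
                  for j in range(3)] for i in range(3)])
print(np.round(flow, 3))
```

Only the entry for the true 0 to 1 connection is clearly nonzero, and each entry uses just two time series, so no conditioning on the remaining nodes is needed in this toy case.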

2025-07-03T04:21:49Z 27 pages, 13 figures Kai Chen Zhong-qi K. Tian Yifei Chen Shouwei Luo Songting Li Douglas Zhou http://arxiv.org/abs/2409.14566v4 n:m Phase-Locking of Coupled Oscillators with Nonlinearities in Coupling Strength and Heterogeneity 2026-05-06T17:59:21Z

We introduce a scalar reduction method for forced or coupled systems with nonlinearities in both heterogeneity and coupling strength. Heterogeneity is formulated as a relatively weak but nonlinear alteration of the vector field(s). The method can be used to determine the existence and stability of $n{:}m$ phase-locked states in a variety of forced or coupled biological oscillator models, including the nonradial isochron clock, a thalamic neural oscillator, and the Van der Pol oscillator. The proposed scalar reduction successfully captures the emergence and disappearance of phase-locked states as a function of nonlinear coupling strength and nonlinear heterogeneity. We find that even small amounts of heterogeneity can significantly alter phase-locked states in ways that cannot be captured by assuming identical oscillators. The proposed method enables a reduction and analysis of high-dimensional systems of coupled oscillators in more biologically realistic settings.

2024-09-22T19:04:33Z 58 pages, 28 figures Youngmin Park http://arxiv.org/abs/2511.20179v5 Human-computer interactions predict mental health 2026-05-06T16:31:21Z

Scalable assessments of mental illness remain a critical roadblock toward accessible and equitable care. Here, we show that everyday human-computer interactions encode mental health with biomarker accuracy. We introduce MAILA, a MAchine-learning framework for Inferring Latent mental states from digital Activity. We trained MAILA on 18,200 cursor and touchscreen recordings labeled with 1.3 million mental-health self-reports collected from 9,500 participants. MAILA tracks dynamic mental states along 13 clinically relevant dimensions, resolves circadian fluctuations and experimental manipulations of arousal and valence, achieves near-ceiling accuracy at the group level, captures information that is only partially reflected in verbal self-report, and improves the ability of large language models to infer user mental health. By extracting signatures of psychological function that have so far remained untapped, MAILA establishes human-computer interactions as a new modality for scalable digital phenotyping and a foundation for context-aware artificial intelligence.

2025-11-25T11:00:39Z Veith Weilnhammer Jefferson Ortega David Whitney http://arxiv.org/abs/2605.05091v1 Think-Aloud Reshapes Automated Cognitive Model Discovery Beyond Behavior 2026-05-06T16:29:35Z

Computational cognitive models discovered using large language models have so far relied solely on behavioral data. However, it is well known that models produced from the behavioral trajectory alone are typically under-determined. In this work, we explore the use of think-aloud traces as an additional form of data constraint during automated model discovery. When applied to the domain of risky decision-making, we find that the models discovered with think-aloud achieve significantly improved predictive performance on held-out data. Additionally, we find that the discovered models belong to different structural classes than those discovered from behavior alone for the majority of participants (69.4\%); specifically, the discovered structure shifts from Explicit comparator towards Integrated utility. These results suggest that process-level language data not only improve model fit, but also systematically reshape the structure of the discovered cognitive models, enabling the identification of mechanisms that are not recoverable from behavior alone.

2026-05-06T16:29:35Z Hanbo Xie Akshay K. Jagadish Lan Pan Robert C. Wilson http://arxiv.org/abs/2605.04636v1 A Generalized Framework of Antisymmetric Polyspectral Indices for Identifying High-Order Neural Interactions 2026-05-06T08:30:35Z

Cross-frequency interactions are fundamental brain mechanisms for integrating information across temporal scales. However, accurate identification of these couplings is hindered by complex multi-frequency nonlinearities and by spurious, zero-lag artifacts caused by volume conduction. To our knowledge, conventional metrics lack a robust framework to characterize genuine interactions among multiple time series where a frequency of interest $f_N$ arises from the combination of $N-1$ components such that $f_N = \sum_{i=1}^{N-1} f_i$. We introduce a general family of antisymmetric cross-polyspectral indices designed to quantify these harmonic dependencies while being intrinsically robust to instantaneous mixing. We derive the theoretical properties of these quantities and validate them through simulations of cubic nonlinearities. As a proof of concept, we apply the indices to empirical EEG recordings; the results reveal significant higher-order dependencies that elude standard analytical approaches. We further discuss how these indices can inform novel, personalized multi-site transcranial magnetic stimulation (mTMS) protocols by enabling the selective monitoring and modulation of specific multi-frequency network interactions.

2026-05-06T08:30:35Z Alessio Basti Rikkert Hindriks Ruggero Freddi Gian Luca Romani Vittorio Pizzella Guido Nolte Laura Marzetti http://arxiv.org/abs/2605.04443v1 Dissociating spatial frequency reliance from adversarial robustness advantages in neurally guided deep convolutional neural networks 2026-05-06T03:14:06Z

Deep convolutional neural networks (DCNNs) have rivaled humans on many visual tasks, yet they remain vulnerable to near-imperceptible perturbations generated by adversarial attacks. Recent work shows that aligning DCNN representations with human visual cortex activity improves adversarial robustness, but the mechanisms driving this advantage are unclear. One hypothesis suggests that neural alignment confers robustness by biasing models away from brittle high-frequency details and towards the low spatial frequencies (LSF). However, recent work shows that human object recognition critically depends on a narrow, mid-frequency "human channel". Interestingly, this band was partially preserved in prior LSF-focused studies. Here, we investigate whether a spectral bias towards the LSF or the human channel is the primary driver of the adversarial robustness observed in neurally aligned DCNNs. We first show that DCNNs aligned to higher-order regions of the human ventral visual stream systematically increase reliance on both LSF and the human channel. However, directly steering DCNNs towards these bands revealed a clear dissociation. Biasing models towards the human channel, either alone or together with LSF, does not improve robustness and even impairs it. LSF bias produced some robustness gains, but such improvements are modest despite inducing much larger shifts in spatial-frequency reliance than neurally aligned models. Spatial-frequency-biased models overall show little, if any, increase in similarity to human neural representational geometry. Together, our results suggest that altered spatial-frequency reliance is likely an emergent property of learning more human-like representations rather than the primary mechanism by which neural alignment confers adversarial robustness, and motivate the need for future research examining representational properties beyond spatial-frequency profiles.

2026-05-06T03:14:06Z Zhenan Shao Tianyu Ren Chengxiao Wang Leyla Isik Diane M. Beck http://arxiv.org/abs/2605.04326v1 A foundation model of vision, audition, and language for in-silico neuroscience 2026-05-05T22:13:48Z

Cognitive neuroscience is fragmented into specialized models, each tailored to specific experimental paradigms, hence preventing a unified model of cognition in the human brain. Here, we introduce TRIBE v2, a tri-modal (video, audio and language) foundation model capable of predicting human brain activity in a variety of naturalistic and experimental conditions. Leveraging a unified dataset of over 1,000 hours of fMRI across 720 subjects, we demonstrate that our model accurately predicts high-resolution brain responses for novel stimuli, tasks and subjects, superseding traditional linear encoding models, delivering several-fold improvements in accuracy. Critically, TRIBE v2 enables in silico experimentation: tested on seminal visual and neuro-linguistic paradigms, it recovers a variety of results established by decades of empirical research. Finally, by extracting interpretable latent features, TRIBE v2 reveals the fine-grained topography of multisensory integration. These results establish artificial intelligence as a unifying framework for exploring the functional organization of the human brain.

2026-05-05T22:13:48Z Stéphane d'Ascoli Jérémy Rapin Yohann Benchetrit Teon Brooks Katelyn Begany Joséphine Raugel Hubert Banville Jean-Rémi King