https://arxiv.org/api/K3crogWN6kSAKlgvTh4Kvpy74hE
Updated: 2026-06-21T16:34:33Z
1218113515

http://arxiv.org/abs/2606.00226v1
Consciousness, AI, and the Limits of Scientific Explanation
Updated: 2026-05-29T18:01:05Z
Science is constitutively third-personal: its findings are in principle reproducible by any observer, independent of perspective, and answerable to measurement. This is the source of its power and also its limit when it comes to phenomena that are first-personal. While it is obvious that a science of the Meaning of Life is unattainable, researchers have not drawn the same conclusion for consciousness -- in its phenomenal dimension, the qualia of seeing red, of feeling pain, of being anything at all. I argue they should. The hard problem of consciousness is not a scientific problem awaiting better tools or a more ambitious theory, but a category error. The same structural problem applies to machine consciousness: neither attribution nor denial is scientifically adjudicable. I situate science within a broader ecology of understanding and argue that a unified framework that addresses both the objective and the subjective may be unattainable.
Published: 2026-05-29T18:01:05Z
Comment: 18 pages, no figures
Authors: Bradley C. Love

http://arxiv.org/abs/2605.31473v1
The Metastable Mind: Neural Underpinnings of Naturalistic Cognition Through the Synthesis of Event Segmentation and Metastable Neural States
Updated: 2026-05-29T16:02:51Z
A multitude of findings and theories from cognitive, behavioural and computational neuroscience show that neural activity unfolds in a variety of meaningful temporal units. Behavioural research on event segmentation (ES) has shown that continuous experience is segmented into discrete events and sub-events, which aid real-time comprehension, memory, and decision-making. Computational neuroscience research observes and models ongoing brain activity as a series of stable population activity states that occur across wide spatial and temporal scales, referred to as metastable neural activity (MNA).
Through this review, we show that these isolated branches of literature, the cognitive theory of Event Segmentation (ES) and the mechanistic approach of metastability (MNA), actually study the same metastable neural states from different perspectives. While the behavioural branch offers a theory for the cognitive and behavioural utility of segmentation, the metastability literature provides the mechanistic account at the implementational level. We describe how metastable neural states act as the fundamental computational units of cognition and identify a number of core principles of how they operate. One is the spatio-temporally nested hierarchy of states, where longer-duration states in higher-order regions both constrain and are shaped by states in faster-operating regions. Another is that neural states reflect underlying predictive models which shape perception, decision making, memory encoding and recall. Finally, neural states are periods of more modular processing, interspersed with boundaries at which connectivity is reconfigured. Understanding how neural states emerge, interact, and shape cognition brings us closer to understanding the brain in its natural mode of operation.
Published: 2026-05-29T16:02:51Z
Comment: 24 pages
Authors: Dora Gozukara, Nasir Ahmad, Djamari Oetringer, Linda Geerligs

http://arxiv.org/abs/2603.17306v3
Evidence for systematic semantic structure in individual phonemes
Updated: 2026-05-29T15:14:51Z
A foundational assumption in linguistics holds that sound-meaning relations are largely arbitrary. Here we show that this assumption fails at the level of individual phonemes: each English phoneme carries a structured, multidimensional semantic profile that is recoverable from text, perceived across languages, and grounded in articulation. Three large language models independently detected consistent semantic structure across nine perceptual dimensions in 220 pairwise letter contrasts.
Native English speakers (N = 93) confirmed these associations in a preregistered forced-choice task (85.3% agreement with model predictions), and listeners of five typologically diverse languages (N = 155) replicated the effect under audio presentation (73.2%-81.9% accuracy). Articulatory features predicted the structure with cross-validated R^2 of 0.56-0.98, indicating that the bodily act of producing a sound systematically shapes the meaning it conveys. These findings reframe phoneme-level iconicity as a pervasive, embodied property of the phonological system.
Published: 2026-03-18T03:02:10Z
Comment: 31 pages, 4 figures
Authors: Gexin Zhao

http://arxiv.org/abs/2603.26506v2
Identifying Connectivity Distributions from Neural Dynamics Using Flows
Updated: 2026-05-29T07:03:37Z
Connectivity structure shapes neural computation, but inferring this structure from population recordings is degenerate: multiple connectivity structures can generate identical dynamics. Recent work uses low-rank recurrent neural networks (lrRNNs) to infer low-dimensional latent dynamics and connectivity from observed activity, enabling a mechanistic interpretation of the dynamics. However, standard approaches for training lrRNNs can recover spurious structures irrelevant to the underlying dynamics. We first characterize the identifiability of connectivity structures in lrRNNs and determine conditions under which a unique solution exists. To find such solutions, we develop an inference framework based on maximum entropy and continuous normalizing flows (CNFs), trained via flow matching. Instead of estimating a single connectivity matrix, our method learns a distribution over connection weights that is maximally unbiased over unidentifiable components while matching the observed dynamics. This approach captures complex yet necessary distributions such as heavy-tailed connectivity found in empirical data.
We validate our method on synthetic datasets with connectivity structures that generate multistable attractors, limit cycles, and ring attractors, and demonstrate its applicability in recordings from rat frontal cortex during decision-making. Our framework shifts circuit inference from recovering connectivity to identifying which connectivity structures are computationally required, and which are artifacts of underconstrained inference.
Published: 2026-03-27T15:14:55Z
Authors: Timothy Doyeon Kim, Ulises Pereira-Obilinovic, Yiliu Wang, Eric Shea-Brown, Uygar Sümbül

http://arxiv.org/abs/2605.30882v1
Extended predictive coding framework as variational free-energy minimisation under exponential-family assumption
Updated: 2026-05-29T06:12:55Z
The sensory cortices of the brain perform perceptual inference efficiently through their complex networks of neurons. One of the theoretical accounts of this process is the free-energy principle (FEP), which postulates that the brain performs variational Bayesian inference. Pioneering studies have shown that FEP can correspond to the predictive coding (PC) hypothesis under the Gaussian assumption and Laplace approximation. However, PC-based implementations of FEP within such a limited Gaussian regime have failed to capture several properties of biological neural networks, such as nonlinearity and heterogeneity of input--output properties within a network, and the biological implausibility of negative firing rates. This study shows that, when a broader class of probability distributions, namely the exponential family of distributions (EFD), is assumed for the variational posterior and prior, these missing characteristics are exhibited within the network, maintaining the FEP--PC correspondence up to the second cumulant of the posterior. We also show that the proposed model can be trained by biologically plausible local plasticity rules.
Our results enrich the explanatory power of FEP regarding neural dynamics involved in perception as variational inference.
Published: 2026-05-29T06:12:55Z
Authors: Asaki Kataoka, Kenji Doya

http://arxiv.org/abs/2605.30864v1
What makes an action sequence enjoyable to watch?
Updated: 2026-05-29T05:46:12Z
People often seek out ways to watch others perform complex action sequences (e.g., sports). What makes some sequences more enjoyable to watch than others? We generated 24 video clips of gameplay from a Flappy Bird-style video game. Clips varied in difficulty (how often players succeeded on average) and in moment-to-moment uncertainty (how likely the player was to crash at any given step). Participants (N=864) rated each video on one of three dimensions: how much they enjoyed it, how difficult the level appeared, or how dangerous the player's trajectory appeared. We found that participants preferred videos where the player seemed to be completing more difficult obstacle courses, but dangerousness did not predict enjoyment ratings. These findings show how procedurally generated stimuli can isolate the factors that affect how enjoyable an action sequence is to watch.
Published: 2026-05-29T05:46:12Z
Comment: 6 pages, 4 figures, CogSci 2026
Authors: Jean-Peïc Chou, Kristine Zheng, Junyi Chu, Maneesh Agrawala, Judith E. Fan

http://arxiv.org/abs/2602.03896v2
A hitchhiker's guide to Poisson gradient estimation
Updated: 2026-05-29T01:08:34Z
Poisson-distributed latent variable models are widely used in computational neuroscience, but differentiating through discrete stochastic samples remains challenging. Two approaches address this: *Exponential Arrival Time* (EAT) simulation and *Gumbel-SoftMax* (GSM) relaxation. We provide the first systematic comparison of these methods, along with practical guidance for practitioners. Our main technical contribution is a modification to the EAT method that theoretically guarantees an unbiased first moment (exactly matching the firing rate), and reduces second-moment bias.
We evaluate these methods on their distributional fidelity, gradient quality, and performance on two tasks: (1) variational autoencoders with Poisson latents, and (2) partially observable generalized linear models, where latent neural connectivity must be inferred from observed spike trains. Across all metrics, our modified EAT method exhibits better overall performance (often comparable to exact gradients), and substantially higher robustness to hyperparameter choices. These results extend to over-dispersed Negative Binomial latents, where modified EAT again performs best. However, only GSM generalizes to arbitrary non-Poisson distributions, including the under-dispersed regime. Together, our results clarify the trade-offs between these methods and offer concrete recommendations for practitioners working with Poisson latent variable models.
Published: 2026-02-03T08:47:30Z
Comment: Published at ICML 2026; code: https://github.com/hadivafaii/PoissonGradientEstimation
Authors: Michael Ibrahim, Hanqi Zhao, Eli Sennesh, Zhi Li, Anqi Wu, Jacob L. Yates, Chengrui Li, Hadi Vafaii

http://arxiv.org/abs/2509.23195v2
The relative strength of hierarchical structure and statistics differs across the measures in naturalistic reading
Updated: 2026-05-28T21:59:07Z
Hierarchical syntactic structure and non-hierarchical (statistical or sequential) factors have long been framed as rival accounts of online comprehension. Substantial evidence shows that both hierarchical and non-hierarchical factors can shape comprehension; the more open question is when, and how strongly, hierarchy exerts its influence. We addressed this question with co-registered EEG and eye-tracking, treating syntactic depth as the variable for operationalizing hierarchical structure. For the timing question, hierarchical syntactic structure is shown to influence reading before reading a sentence and can emerge as early as 108 ms before reading.
This is supported by both transitional probability analysis and regression on fixation-related potentials. Analyses of fixation transitions showed that readers preferentially moved between syntactically central words rather than according to serial word order, suggesting that scanpaths are driven by deep syntactic structure rather than by pure statistics. For the strength question, we combined Bayesian network modeling and regression analysis to show that the strength of a variable depends on the phenomenon to be explained. Bayesian network analysis showed that hierarchical syntactic structure carried more predictive weight than statistical features. Regression on fixation-related potentials demonstrated that hierarchical syntactic structure significantly predicted word-level neural activity in the front-right region, but was generally weaker than lexical surprisal. Taken together, our analyses suggest that hierarchical structure can anticipatorily guide subjects' online comprehension at both the behavioral and neural levels, with its strength varying across different facets of reading behavior.
Published: 2025-09-27T08:56:12Z
Authors: Nan Wang, Hanlin Wu, Jiaxuan Li

http://arxiv.org/abs/2605.04200v2
Neural Manifolds as Crystallized Embeddings: A Synthesis of the Free Energy Principle, Generalized Synchronization, and Hebbian Plasticity
Updated: 2026-05-28T21:38:41Z
The free energy principle casts perception as variational inference, but its biological implementation is underspecified. The generalized-coordinate formalism is not a literal claim that neurons compute arbitrary Taylor expansions. This paper argues that generalized synchronization (GS) provides the missing bottom-up mechanism. Certain recurrent circuits satisfy a contraction property: nearby trajectories converge exponentially. A contracting circuit driven by structured sensory input synchronizes to driving dynamics.
Under generic embedding conditions, the resulting synchronization map embeds the low-dimensional sensory manifold into neural state space. The geometry predicted by the free energy principle is not imposed from above by an explicitly Bayesian neural calculus. It arises from ordinary recurrent dynamics.
I then propose a developmental extension. Hebbian plasticity acting on the correlations generated by sensory-driven synchronization shapes the embedded manifold into recurrent connectivity, producing a continuous attractor network that approximates the embedded sensory manifold. Prediction-separation results bound the representational fidelity of the resulting circuit by prediction accuracy: where the network predicts future observations well, the synchronization map separates underlying states; where prediction fails, the representation collapses. The collapses are observable as categorical perception, metameric equivalence, and discrimination thresholds. On this view, mature head-direction, grid-cell, and stimulus-driven visual manifolds are developmental products of three interacting processes: dynamical contraction, generalized synchronization, and correlation-based plasticity. The central open problems are whether the Hebbian fixed point exists and whether Hebbian dynamics produce a sufficiently accurate predictor on the relevant input distribution.
Published: 2026-05-05T18:42:08Z
Comment: Updated to expand open mathematical problems and incorporate prediction-separation link as specific predictions of the synthesis
Authors: Vikas N. O'Reilly-Shah

http://arxiv.org/abs/2605.30556v1
Supervised Training Rapidly Degrades Early Visual Cortex Alignment Across Biologically Plausible Learning Rules
Updated: 2026-05-28T20:39:52Z
Random, untrained neural networks consistently match or exceed trained networks in representational similarity to early visual cortex. This puzzling finding challenges the assumption that learning improves brain alignment. We investigate it by tracking representational similarity analysis (RSA) alignment to human fMRI data across training for four learning rules: backpropagation (BP), feedback alignment (FA), predictive coding (PC), and spike-timing-dependent plasticity (STDP).
Using 720 object images from the THINGS database and fMRI data from three subjects across six visual ROIs, we measure Spearman correlations between model and brain representational dissimilarity matrices at eight training checkpoints (epochs 0-40). We find that (1) a single epoch of training reduces V1 alignment by 25-90%, depending on the learning rule; (2) backpropagation reduces V1 alignment most severely (delta r = -0.080), while predictive coding and STDP preserve substantially more (delta r ~ -0.04); and (3) a weaker, opposite tendency appears in object-selective cortex (LOC), where BP shows the largest increase in alignment during training, although the absolute change is small. These results suggest that untrained architectures capture low-level visual statistics through inductive biases alone, and that global error signals (BP) reshape early representations more aggressively than local learning rules (PC, STDP), which better preserve brain-like structure.
Published: 2026-05-28T20:39:52Z
Comment: 7 pages, 4 figures
Authors: Nils Leutenegger

http://arxiv.org/abs/2605.30552v1
High-Fidelity 3D Simulator for Synthetic fNIRS Data Generation
Updated: 2026-05-28T20:34:27Z
Functional near-infrared spectroscopy (fNIRS) provides a noninvasive window into brain activity by measuring task-related changes in oxygenated and deoxygenated hemoglobin in the cortex. A key advantage of fNIRS is its promise of use with mobile participants in complex, real-world environments, such as walking, sports, classroom learning, driving simulations, or social interactions. However, analyzing fNIRS data is challenging because of motion artifacts, physiological noise, and other confounding factors. This challenge is further compounded by the limited availability of annotated datasets, which hinders the development and validation of new analysis pipelines, particularly given the growing use of AI methods.
Recognizing these challenges, we introduce a 3D fNIRS simulator that uses mesh-based Monte Carlo simulations to create physiologically realistic, full-head synthetic recordings with high spatiotemporal fidelity. Our simulator combines anatomically accurate sensitivity profiles with parameterized models of hemodynamic responses, systemic physiology, and nonsystematic artifacts. As a result, users can generate virtually unlimited labeled datasets for testing denoising algorithms, data augmentation, mechanistic modeling, or in silico experimentation. We validate the simulator using experimental fNIRS data from open-source finger-tapping, pain-assessment, and surgical-skill datasets and provide an open-source implementation to support reproducibility and broad adoption.
Published: 2026-05-28T20:34:27Z
Authors: Condell Eastmond, Niels Bracher, Xavier Intes, Stefan T. Radev

http://arxiv.org/abs/2605.30522v1
Private Noise and Public Error in Collective Information Acquisition
Updated: 2026-05-28T19:57:36Z
Collective information acquisition requires groups to combine personal evidence with social information while remaining coupled to the external state. Communication noise can affect this process, but the role of noise remains unclear. In an online experiment, 600 participants worked in four-person human groups estimating a room temperature across 25 rounds while receiving either faithful social information, comprehension noise in which each receiver saw independently perturbed social information, or production noise in which perturbations were stored before display and could be seen by multiple receivers. The thermometer cue was objectively veridical, but its reliability was subjectively uncertain and the unitless 50--250 room-temperature range created a task-induced conflict between displayed evidence and everyday temperature expectations. Production-noise groups spent more rounds tightly clustered around a wrong value than comprehension-noise groups (p = 0.016, group-level permutation).
Production noise more often created a wrong common signal (p = 0.025, Fisher's exact test) and made that signal persist across more rounds (p = 0.004, permutation). Dynamic update models showed that production noise was not more harmful because people followed peers more strongly, but because the same peer influence acted on more correlated production-noise perturbations. Exploratory human analyses linked the mechanism to psychological patterns while a GPT-agent experiment clarified a boundary condition: GPT agents registered uncertainty through reduced confidence without reproducing human-scale production-noise vulnerability. Overall, noise did not simply degrade collective information acquisition. Comprehension noise could sometimes improve correction relative to the faithful control, whereas production noise could turn perturbations into common evidence and stabilize consensus on error.
Published: 2026-05-28T19:57:36Z
Comment: 48 pages, 8 figures
Authors: Mohammad Salahshour, Sumanth Bhargava, Kajal Kumari, Niccolo Pescetelli, Yasser Roudi, Bahador Bahrami, Iain D. Couzin

http://arxiv.org/abs/2512.13517v2
A Deep Learning Model of Mental Rotation Informed by Interactive VR Experiments
Updated: 2026-05-28T15:09:34Z
Mental rotation -- the ability to compare objects seen from different viewpoints -- is a fundamental example of mental simulation and spatial world modeling in humans. Here we propose a mechanistic model of human mental rotation, leveraging recent advances in deep, equivariant, and neuro-symbolic learning. Our model consists of three stacked components: (1) an equivariant neural encoder, producing 3D spatial representations of objects from images, (2) a neuro-symbolic object encoder, deriving symbolic object descriptions from these spatial representations, and (3) a neural decision agent, comparing these symbolic descriptions to prescribe rotation simulations in 3D latent space via a recurrent pathway.
Our model design is guided by the existing experimental literature on mental rotation, which we complemented with experiments in VR where participants could at times manipulate the objects to compare. Our model captures well the performance, response times and behavior of participants in our and others' experiments, and through ablation studies we demonstrate the necessity of each component. Our work adds to a recent collection of deep neural models of human spatial reasoning, further demonstrating the potency of integrating deep, equivariant, and symbolic representations to model the human mind.
Published: 2025-12-15T16:43:50Z
Comment: Version accepted at ICML 2026
Authors: Raymond Khazoum, Daniela Fernandes, Aleksandr Krylov, Qin Li, Stephane Deny

http://arxiv.org/abs/2411.14107v3
Inward rectifier potassium channels interact with calcium channels to promote robust and physiological bistability
Updated: 2026-05-28T14:12:18Z
Projection neurons in the dorsal horn relay nociceptive input to supraspinal centers. During central sensitization, a subset of them switches from tonic firing to plateau potentials with sustained afterdischarges, a change that requires intrinsic bistability between a resting and a spiking state. Voltage-gated L-type calcium (CaL) channels can produce bistability, but reach physiological resting states only when paired with voltage-gated potassium channels, most of which simultaneously shrink the bistability window. How robust, physiological bistability arises has therefore remained unclear. Using a minimal conductance-based model, we show that inward rectifier potassium (Kir) channels enlarge the bistability window when combined with CaL channels, while M-type potassium (KM) channels slightly reduce it. Within the parameter region where bistability is both robust and physiological, both channel types can sustain bistability, but the CaL+Kir combination produces a substantially larger window and is more robust to noise and intrinsic variability.
This window-enlarging effect traces to a shape feature of the outward Kir steady-state current: like the CaL current, it has a region of negative differential conductance around the spike threshold, a feature absent from KM and from most other voltage-gated potassium currents. Bifurcation analysis further shows that the two pairs support qualitatively distinct excitability: plateau-generating bistability for CaL+Kir and resonator-like dynamics for CaL+KM. These conclusions hold in a two-compartment model of deep projection neurons with realistic ion channel complements, and identify the CaL+Kir pair as a candidate intrinsic mechanism for central sensitization.
Published: 2024-11-21T13:15:14Z
Authors: Anaëlle De Worm, Guillaume Drion, Pierre Sacré

http://arxiv.org/abs/2502.01360v4
A Quotient Homology Theory of Representation in Neural Networks
Updated: 2026-05-28T13:22:52Z
Previous research has proven that the set of maps implemented by neural networks with a ReLU activation function is identical to the set of piecewise linear continuous maps. Furthermore, such networks induce a hyperplane arrangement splitting the input domain of the network into convex polyhedra $G_J$ over which a network $Φ$ operates in an affine manner.
In this work, we leverage these properties to define an equivalence relation $\sim_Φ$ on top of an input dataset, which defines a quotient space that can be split into two sets related to the local rank of $Φ_J$ and the intersections $\cap \text{Im}Φ_{J_i}$. We refer to the latter as the overlap decomposition $\mathcal{O}_Φ$ and prove that if the intersections between each polyhedron and an input manifold are convex, the homology groups of neural representations are isomorphic to quotient homology groups $H_k(Φ(\mathcal{M})) \simeq H_k(\mathcal{M}/\mathcal{O}_Φ)$. This lets us intrinsically calculate the Betti numbers of neural representations without the choice of an external metric. We develop methods to numerically compute the overlap decomposition through linear programming and a union-find algorithm.
Using this framework, we perform several experiments on toy datasets showing that, compared to standard persistent homology, our overlap homology-based computation of Betti numbers tracks purely topological rather than geometric features. Finally, we study the evolution of the overlap decomposition during training on several classification problems and discuss some shortcomings of our method.
Published: 2025-02-03T13:52:17Z
Comment: Transactions on Machine Learning Research, 05/2026, https://openreview.net/forum?id=RluspxztzS
Authors: Kosio Beshkov
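
The last abstract above relies on the fact that a ReLU network acts as an affine map on each region of input space where the set of active hidden units is fixed. The following is a minimal pure-Python sketch of that property only, not the paper's overlap decomposition or its linear-programming and union-find procedure. The tiny network, its weights, the sample points, and the helper names (`forward`, `local_affine`, `apply_affine`) are all hypothetical and chosen for illustration.

```python
def forward(x, layers):
    """Run a small ReLU network; return (output, activation pattern)."""
    h, pattern = list(x), []
    for i, (W, b) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:          # hidden layer: apply ReLU
            pattern.append(tuple(1 if zi > 0 else 0 for zi in z))
            h = [max(zi, 0.0) for zi in z]
        else:                            # output layer: linear
            h = z
    return h, tuple(pattern)


def local_affine(pattern, layers):
    """Compose the affine map (A, c) the network applies on one activation region."""
    n_in = len(layers[0][0][0])
    A = [[1.0 if r == k else 0.0 for k in range(n_in)] for r in range(n_in)]
    c = [0.0] * n_in
    for i, (W, b) in enumerate(layers):
        A = [[sum(w * A[j][k] for j, w in enumerate(row)) for k in range(n_in)]
             for row in W]
        c = [sum(w * cj for w, cj in zip(row, c)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:          # zero out rows of inactive units
            A = [[m * a for a in row] for m, row in zip(pattern[i], A)]
            c = [m * ci for m, ci in zip(pattern[i], c)]
    return A, c


def apply_affine(A, c, x):
    """Evaluate A x + c."""
    return [sum(a * v for a, v in zip(row, x)) + ci for row, ci in zip(A, c)]


# Hypothetical 2-3-1 network: (weights, biases) per layer.
layers = [
    ([[1.0, -1.0], [0.5, 1.0], [-1.0, 0.5]], [0.0, -0.5, 0.25]),
    ([[1.0, 2.0, -1.0]], [0.5]),
]

x1, x2, x3 = [1.0, 0.5], [2.0, 1.0], [-1.0, 1.0]
y1, p1 = forward(x1, layers)
y2, p2 = forward(x2, layers)
y3, p3 = forward(x3, layers)

# The affine map read off at x1 is valid for every point sharing x1's pattern.
A, c = local_affine(p1, layers)
print(p1 == p2, apply_affine(A, c, x2) == y2)   # x2 lies in the same region
print(p1 == p3, apply_affine(A, c, x3) == y3)   # x3 lies in a different region
```

Points that share an activation pattern lie in the same polyhedron, so the map computed at one of them reproduces the network exactly at the others; for a point in a different region the same map generally disagrees with the network.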