Towards Practical Automatic Piano Reduction using BERT with Semi-supervised Learning

2026-01-05T13:25:08Z

In this study, we present a novel automatic piano reduction method based on semi-supervised machine learning. Piano reduction is an important music transformation process that gives musicians and composers a musical sketch for performance and analysis. Automating it is a highly challenging research problem, but doing so could be a great convenience, since manual piano reduction takes considerable time and effort. While supervised machine learning is often a useful tool for learning input-output mappings, it is difficult to obtain a large quantity of labelled data. We aim to solve this problem by utilizing semi-supervised learning, so that the abundant available data in classical music can be leveraged to perform the task with little or no labelling effort. In this regard, we formulate a two-step approach of music simplification followed by harmonization. We further propose and implement two possible solutions making use of an existing machine learning framework, MidiBERT. We show that our solutions can output practical and realistic samples with an accurate reduction that needs only small adjustments in post-processing. Our study forms the groundwork for the use of semi-supervised learning in automatic piano reduction, which future researchers can build on to produce further state-of-the-art results.

f4ncgb: High Performance Gröbner Basis Computations in Free Algebras

2026-01-01T22:11:41Z

We present f4ncgb, a new open-source C++ library for Gröbner basis computations in free algebras, which transfers recent advancements in commutative Gröbner basis software to the noncommutative setting. As our experiments show, f4ncgb establishes a new state-of-the-art for noncommutative Gröbner basis computations. We also discuss implementation details and design choices.

Conditions for eigenvalue configurations of two real symmetric matrices (symmetric polynomial approach)

2026-01-01T04:11:10Z

Given two real symmetric matrices, their eigenvalue configuration is the relative arrangement of their eigenvalues on the real line. In this paper, we consider the following problem: given two parametric real symmetric matrices and an eigenvalue configuration, find a simple condition on the parameters such that their eigenvalues have the given configuration. We consider the problem under the mild condition that the two matrices do not share any eigenvalues. We give an algorithm that expresses the eigenvalue configuration problem as a real root counting problem for certain symmetric polynomials, whose roots can be counted using the Fundamental Theorem of Symmetric Polynomials and Descartes' rule of signs.
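
As a small illustration of the root-counting ingredient named above, the following sketch applies Descartes' rule of signs to a polynomial given by its coefficients. It relies on the standard fact that, for a polynomial whose roots are all real, the number of sign variations in the coefficient sequence equals the number of positive roots counted with multiplicity. The function names are hypothetical, and this is not the paper's algorithm, only the counting rule it refers to.

```python
from typing import List, Sequence


def sign_variations(coeffs: Sequence[float]) -> int:
    """Count sign changes in a coefficient sequence, skipping zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def negate_argument(coeffs: Sequence[float]) -> List[float]:
    """Coefficients of p(-x), given those of p(x) in descending degree order."""
    degree = len(coeffs) - 1
    return [c if (degree - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]


# p(x) = (x - 1)(x - 2)(x + 3) = x^3 - 7x + 6, all roots real.
p = [1, 0, -7, 6]
positive_roots = sign_variations(p)                   # 2 (roots 1 and 2)
negative_roots = sign_variations(negate_argument(p))  # 1 (root -3)
```

For polynomials with non-real roots the rule gives only an upper bound with matching parity, so the exact count above depends on the real-rootedness assumption.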

Fast Symbolic Integer-Linear Spectra

2025-12-31T22:14:03Z

Here we contribute a fast symbolic eigenvalue solver for matrices whose eigenvalues are $\mathbb{Z}$-linear combinations of their entries, alongside efficient general and stochastic $M^{X}$ generators. Users can interact with a few degrees of freedom to create linear operators, making high-dimensional symbolic analysis feasible when numerical analyses are insufficient.

Conditions for eigenvalue configurations of two real symmetric matrices (signature approach)

2025-12-31T01:54:04Z

For two real symmetric matrices, their eigenvalue configuration is the relative arrangement of their eigenvalues on the real line. We consider the following problem: given two parametric real symmetric matrices and an eigenvalue configuration, find a simple condition on the parameters such that the two matrices have the given eigenvalue configuration. In this paper, we develop theory and give an algorithm for this problem. The output of the algorithm is a condition written in terms of the signatures of certain related symmetric matrices.
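
To make the notion of an eigenvalue configuration concrete, here is a minimal numeric sketch for two 2x2 real symmetric matrices. It computes the eigenvalues in closed form and records their interleaving on the real line as a string of labels. The labels "F" and "G", the function names, and the assumption that the two matrices share no eigenvalue are choices made for this illustration; the paper itself works symbolically with parametric matrices and signatures, which this sketch does not attempt.

```python
import math
from typing import List, Sequence


def eig_sym_2x2(a: float, b: float, c: float) -> List[float]:
    """Eigenvalues of [[a, b], [b, c]] in ascending order (closed form)."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return [mean - radius, mean + radius]


def configuration(eigs_f: Sequence[float], eigs_g: Sequence[float]) -> str:
    """Relative arrangement of two eigenvalue lists, read left to right."""
    labelled = [(x, "F") for x in eigs_f] + [(x, "G") for x in eigs_g]
    labelled.sort()
    return "".join(label for _, label in labelled)


# F = diag(2, 6) has eigenvalues 2 and 6; G = [[4, 1], [1, 4]] has 3 and 5.
nested = configuration(eig_sym_2x2(2, 0, 6), eig_sym_2x2(4, 1, 4))       # "FGGF"
# G' = [[5, 2], [2, 5]] has eigenvalues 3 and 7, which interlace with F's.
interlaced = configuration(eig_sym_2x2(2, 0, 6), eig_sym_2x2(5, 2, 5))   # "FGFG"
```

A condition on the parameters, in the sense of the abstract, would describe exactly which parameter values produce a given string such as "FGGF".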

Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge

2025-12-25T12:16:53Z

Recent advances in Hierarchical Multi-label Classification (HMC), particularly neurosymbolic-based approaches, have demonstrated improved consistency and accuracy by enforcing constraints on a neural model during training. However, this work assumes that such constraints exist a priori. In this paper, we relax this strong assumption and present an approach based on Error Detection Rules (EDR) that allows for learning explainable rules about the failure modes of machine learning models. We show that these rules are not only effective in detecting when a machine learning classifier has made an error but also can be leveraged as constraints for HMC, thereby allowing the recovery of explainable constraints even if they are not provided. We show that our approach is effective in detecting machine learning errors and recovering constraints, is noise tolerant, and can function as a source of knowledge for neurosymbolic models on multiple datasets, including a newly introduced military vehicle recognition dataset.

Quantitative Verification of Omega-regular Properties in Probabilistic Programming

2025-12-25T09:26:29Z

Probabilistic programming provides a high-level framework for specifying statistical models as executable programs with built-in randomness and conditioning. Existing inference techniques, however, typically compute posterior distributions over program states at fixed time points, most often at termination, thereby failing to capture the temporal evolution of probabilistic behaviors. We introduce temporal posterior inference (TPI), a new framework that unifies probabilistic programming with temporal logic by computing posterior distributions over execution traces that satisfy omega-regular specifications, conditioned on possibly temporal observations. To obtain rigorous quantitative guarantees, we develop a new method for computing upper and lower bounds on the satisfaction probabilities of omega-regular properties. Our approach decomposes Rabin acceptance conditions into persistence and recurrence components and constructs stochastic barrier certificates that soundly bound each component. We implement our approach in a prototype tool, TPInfer, and evaluate it on a suite of benchmarks, demonstrating effective and efficient inference over rich temporal properties in probabilistic models.

OGRe: An Object-Oriented General Relativity Package for Mathematica

2025-12-25T02:09:03Z

We present OGRe, a modern Mathematica package for tensor calculus, designed to be both powerful and user-friendly. The package can be used in a variety of contexts where tensor calculations are needed, in both mathematics and physics, but it is especially suitable for general relativity. By implementing an object-oriented design paradigm, OGRe allows calculating arbitrarily complicated tensor formulas easily, and automatically transforms between index configurations and coordinate systems behind the scenes as needed, eliminating user errors by making it impossible for the user to combine tensors in inconsistent ways. Other features include displaying tensors in various forms, automatic calculation of curvature tensors and geodesic equations, easy importing and exporting of tensors between sessions, optimized algorithms and parallelization for improved performance, and more.

OGRePy: An Object-Oriented General Relativity Package for Python

2025-12-24T23:02:19Z

OGRePy is a modern, open-source Python package designed to perform symbolic tensor calculations, with a particular focus on applications in general relativity. Built on an object-oriented architecture, OGRePy encapsulates tensors, metrics, and coordinate systems as self-contained objects, automatically handling raising and lowering of indices, coordinate transformations, contractions, partial or covariant derivatives, and all tensor operations. By leveraging the capabilities of SymPy and Jupyter Notebook, OGRePy provides a robust, user-friendly environment that facilitates both research and teaching in general relativity and differential geometry. This Python package reproduces the functionality of the popular Mathematica package OGRe, while greatly improving upon it by making use of Python's native object-oriented syntax. In this paper, we describe OGRePy's design and implementation, and discuss its potential for reuse across research and education in mathematics and physics.

Logic Sketch Prompting (LSP): A Deterministic and Interpretable Prompting Method

2025-12-24T09:20:35Z

Large language models (LLMs) excel at natural language reasoning but remain unreliable on tasks requiring strict rule adherence, determinism, and auditability. Logic Sketch Prompting (LSP) is a lightweight prompting framework that introduces typed variables, deterministic condition evaluators, and a rule-based validator that produces traceable and repeatable outputs. Using two pharmacologic logic compliance tasks, we benchmark LSP against zero-shot prompting, chain-of-thought prompting, and concise prompting across three open-weight models: Gemma 2, Mistral, and Llama 3. Across both tasks and all models, LSP consistently achieves the highest accuracy (0.83 to 0.89) and F1 score (0.83 to 0.89), substantially outperforming zero-shot prompting (0.24 to 0.60), concise prompts (0.16 to 0.30), and chain-of-thought prompting (0.56 to 0.75). McNemar tests show statistically significant gains for LSP across nearly all comparisons (p < 0.01). These results demonstrate that LSP improves determinism, interpretability, and consistency without sacrificing performance, supporting its use in clinical, regulated, and safety-critical decision support systems.

Memory as Resonance: A Biomimetic Architecture for Infinite Context Memory on Ergodic Phonetic Manifolds

2025-12-23T10:55:32Z

The memory of contemporary Large Language Models is bound by a physical paradox: as they learn, they fill up. The linear accumulation (O(N)) of Key-Value states treats context as a warehouse of static artifacts, eventually forcing a destructive choice between amnesia and latency. We challenge this discrete orthodoxy, proposing that long-term memory is not the storage of items, but the persistence of a trajectory. We introduce Phonetic Trajectory Memory (PTM), a neuro-symbolic architecture that encodes language not as a sequence of tensors, but as a continuous path on an ergodic manifold governed by irrational rotation matrices. By decoupling the navigation (an invariant O(1) geometric signal) from the reconstruction (a probabilistic generative act), PTM achieves a compression magnitude of greater than 3,000x relative to dense caches. We demonstrate that retrieval becomes a process of resonance: the phonetic trace stabilizes the model against hallucination via a "Signal Consensus" mechanism, securing up to approximately 92% factual accuracy. While this aggressive abstraction alters generative texture, it unlocks immediate access latency (approximately 34ms) independent of depth. Our results suggest that infinite context does not require infinite silicon; it requires treating memory not as data to be stored, but as a reconstructive process acting on a conserved, undying physical signal.

Bridging the Gap Between Scientific Laws Derived by AI Systems and Canonical Knowledge via Abductive Inference with AI-Noether

2025-12-22T18:45:53Z

Advances in AI have shown great potential in contributing to the acceleration of scientific discovery. Symbolic regression can fit interpretable models to data, but these models are not necessarily derivable from established theory. Recent systems (e.g., AI-Descartes, AI-Hilbert) enforce derivability from prior knowledge. However, when existing theories are incomplete or incorrect, these machine-generated hypotheses may fall outside the theoretical scope. Automatically finding corrections to axiom systems to close this gap remains a central challenge in scientific discovery. We propose a solution: an open-source algebraic geometry-based system that, given an incomplete axiom system expressible as polynomials and a hypothesis that the axioms cannot derive, generates a minimal set of candidate axioms that, when added to the theory, provably derive the (possibly noisy) hypothesis. We illustrate the efficacy of our approach by showing that it can reconstruct key axioms required to derive the carrier-resolved photo-Hall effect, Einstein's relativistic laws, and several other laws.

Parallel Heuristic Exploration for Additive Complexity Reduction in Fast Matrix Multiplication

2025-12-21T18:36:59Z

This paper presents a parallel random-search method for reducing additive complexity in fast matrix multiplication algorithms with ternary coefficients $\{-1,0,1\}$. The approach replaces expensive exact evaluation with fast heuristic scoring, including the new Greedy-Intersections strategy. The method runs many independent common subexpression elimination processes in parallel, exploring the search space through random pair substitutions and diverse selection strategies while sharing promising partial solutions. Tested on 149 ternary-coefficient schemes, the method achieves lower addition counts than the state-of-the-art Greedy-Potential on 102 schemes (including 57 new best-known results for optimal-rank schemes), matches it on 45, and is outperformed on only 2. For most schemes, it provides equal or better results while being significantly faster, making it practical for algorithm exploration. All software and results are open source.
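
The following sketch illustrates what reducing additive complexity by common subexpression elimination means for linear forms with coefficients in {-1, 0, 1}. It is a plain single-threaded greedy pair substitution, not the paper's parallel random search and not the Greedy-Intersections or Greedy-Potential strategies. The data layout (one dict per linear form, mapping a variable name to +1 or -1), the function names, and the temporary names `t0`, `t1`, ... are assumptions made for this example; input variable names are assumed not to collide with the temporaries.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List

LinearForm = Dict[str, int]  # variable name -> coefficient in {-1, +1}


def naive_additions(exprs: List[LinearForm]) -> int:
    """Additions/subtractions needed when every form is evaluated directly."""
    return sum(len(e) - 1 for e in exprs if e)


def greedy_cse(exprs: List[LinearForm]) -> int:
    """Total addition count after repeatedly factoring out the most frequent pair."""
    exprs = [dict(e) for e in exprs]  # work on copies
    spent_on_temporaries = 0
    while True:
        # Count each (u, v, relative sign) pair; u + v and -u - v share a pattern.
        counts: Counter = Counter()
        for e in exprs:
            for u, v in combinations(sorted(e), 2):
                counts[(u, v, e[u] * e[v])] += 1
        if not counts:
            break
        (u, v, rel), freq = counts.most_common(1)[0]
        if freq < 2:
            break  # no pair is shared, so substitution cannot help
        temp = f"t{spent_on_temporaries}"  # temp = u + rel * v costs one addition
        spent_on_temporaries += 1
        for e in exprs:
            if u in e and v in e and e[u] * e[v] == rel:
                sign = e.pop(u)
                del e[v]
                e[temp] = sign  # sign*u + sign*rel*v == sign*temp
    return spent_on_temporaries + naive_additions(exprs)


forms = [
    {"a": 1, "b": 1, "c": 1},    #  a + b + c
    {"a": 1, "b": 1, "d": -1},   #  a + b - d
    {"a": -1, "b": -1, "c": 1},  # -a - b + c
]
before = naive_additions(forms)  # 6
after = greedy_cse(forms)        # 4: t0 = a + b, then one operation per form
```

The sketch counts only additions and subtractions and ignores the cost of negating a single term, which is a simplification chosen for this example.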

NL2CA: Auto-formalizing Cognitive Decision-Making from Natural Language Using an Unsupervised CriticNL2LTL Framework

2025-12-20T03:10:04Z

Cognitive computing models offer a formal and interpretable way to characterize human deliberation and decision-making, yet their development remains labor-intensive. In this paper, we propose NL2CA, a novel method for auto-formalizing cognitive decision-making rules from natural language descriptions of human experience. Unlike most related work, which relies on either purely manual or human-guided interactive modeling, our method is fully automated and requires no human intervention. The approach first translates text into Linear Temporal Logic (LTL) using a fine-tuned large language model (LLM), then refines the logic via an unsupervised Critic Tree, and finally transforms the output into executable production rules compatible with symbolic cognitive frameworks. Based on the resulting rules, a cognitive agent is then constructed and optimized through cognitive reinforcement learning according to real-world behavioral data. Our method is validated in two domains: (1) NL-to-LTL translation, where our CriticNL2LTL module achieves consistent performance across both expert and large-scale benchmarks without human-in-the-loop feedback, and (2) cognitive driving simulation, where agents automatically constructed from human interviews successfully learn the diverse decision patterns of about 70 trials in different critical scenarios. Experimental results demonstrate that NL2CA enables scalable, interpretable, and human-aligned cognitive modeling from unstructured textual data, offering a novel paradigm for automatically designing symbolic cognitive agents.

Certified bounds on optimization problems in quantum theory

2025-12-19T15:44:15Z

Semidefinite relaxations of polynomial optimization have become a central tool for addressing the non-convex optimization problems over non-commutative operators that are ubiquitous in quantum information theory and, more generally, quantum physics. Yet, as these global relaxation methods rely on floating-point methods, the bounds issued by the semidefinite solver can, and often do, exceed the global optimum, undermining their certifiability. To counter this issue, we introduce a rigorous framework for extracting exact rational bounds on non-commutative optimization problems from numerical data, and apply it to several paradigmatic problems in quantum information theory. An extension to sparsity and symmetry-adapted semidefinite relaxations is also provided and compared to the general dense scheme. Our results establish rational post-processing as a practical route to reliable certification, pushing semidefinite optimization toward a certifiable standard for quantum information science.
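
The abstract does not spell out the steps of the framework, so the sketch below shows only a generic ingredient that rational post-processing of this kind plausibly needs: rounding a floating-point symmetric matrix to nearby rationals and then checking positive definiteness in exact arithmetic, so the answer carries no rounding error. The function names, the denominator cap of 10**6, and the pivot-based test are assumptions of this example, not the paper's procedure.

```python
from fractions import Fraction
from typing import List, Sequence


def round_to_rationals(
    matrix: Sequence[Sequence[float]], max_denominator: int = 10**6
) -> List[List[Fraction]]:
    """Replace each float entry by the closest fraction with a bounded denominator."""
    return [
        [Fraction(x).limit_denominator(max_denominator) for x in row]
        for row in matrix
    ]


def is_positive_definite(matrix: Sequence[Sequence[Fraction]]) -> bool:
    """Exact test for a symmetric matrix: every elimination pivot must be positive."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    for k in range(n):
        if a[k][k] <= 0:
            return False
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return True


# A hypothetical solver output with small floating-point noise.
numerical = [[2.0000001, -0.9999999], [-0.9999999, 1.0000002]]
exact = round_to_rationals(numerical)    # [[2, -1], [-1, 1]] as Fractions
certified = is_positive_definite(exact)  # True: pivots are 2 and 1/2
```

In a full certification pipeline the rounded matrix would also have to satisfy the problem's equality constraints exactly; that step depends on the specific relaxation and is left out here.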