https://arxiv.org/api/EZXrgTed0nHzunAuIjDMgDfJSDc
2026-06-11T06:54:53Z

http://arxiv.org/abs/2512.11445v2
The Complexity of One or Many Faces in the Overlay of Many Arrangements
2025-12-30T17:22:22Z
We present an extension of the Combination Lemma of [GSS89] that expresses the complexity of one or several faces in the overlay of many arrangements, as a function of the number of arrangements, the number of faces, and the complexities of these faces in the separate arrangements. Several applications of the new Combination Lemma are presented: We first show that the complexity of a single face in an arrangement of $k$ simple polygons with a total of $n$ sides is $Θ(n α(k) )$, where $α(\cdot)$ is the inverse of Ackermann's function. We also give a new and simpler proof of the bound $O \left( \sqrt{m} λ_{s+2}( n ) \right)$ on the total number of edges of $m$ faces in an arrangement of $n$ Jordan arcs, each pair of which intersect in at most $s$ points, where $λ_{s}(n)$ is the maximum length of a Davenport-Schinzel sequence of order $s$ with $n$ symbols. We extend this result, showing that the total number of edges of $m$ faces in a sparse arrangement of $n$ Jordan arcs is $O \left( (n + \sqrt{m}\sqrt{w}) \frac{λ_{s+2}(n)}{n} \right)$, where $w$ is the total complexity of the arrangement. Several other applications and variants of the Combination Lemma are also presented.
2025-12-12T10:32:56Z
Article based on MS Thesis
Sariel Har-Peled
10.1016/S0925-7721(98)00042-X

http://arxiv.org/abs/2512.24078v1
High-dimensional Regret Minimization
2025-12-30T08:40:48Z
Multi-criteria decision making in large databases is very important in real world applications. Recently, an interactive query has been studied extensively in the database literature with the advantage of both the top-k query (with limited output size) and the skyline query (which does not require users to explicitly specify their preference function).
This approach iteratively asks the user to select the one preferred within a set of options. Based on rounds of feedback, the query learns the implicit preference and returns the most favorable as a recommendation.
However, many modern applications in areas like housing or financial product markets feature datasets with hundreds of attributes. Existing interactive algorithms either fail to scale or require excessive user interactions (often exceeding 1000 rounds). Motivated by this, we propose FHDR (Fast High-Dimensional Reduction), a novel framework that takes less than 0.01s with fewer than 30 rounds of interaction. It is considered a breakthrough in the field of interactive queries since most, if not all, existing studies are not scalable to high-dimensional datasets.
Extensive experiments demonstrate that FHDR outperforms the best-known algorithms by at least an order of magnitude in execution time and up to several orders of magnitude in terms of the number of interactions required, establishing a new state of the art for scalable interactive regret minimization.
2025-12-30T08:40:48Z
Junyu Liao, Ashwin Lall, Mitsunori Ogihara, Raymond Wong

http://arxiv.org/abs/2512.24061v1
Notes on the 33-point Erdős--Szekeres problem
2025-12-30T08:10:49Z
The determination of $ES(7)$ is the first open case of the planar Erdős--Szekeres problem, where the general conjecture predicts $ES(7)=33$. We present a SAT encoding for the 33-point case based on triple-orientation variables and a 4-set convexity criterion for excluding convex 7-gons, together with convex-layer anchoring constraints. The framework yields UNSAT certificates for a collection of anchored subfamilies. We also report pronounced runtime variability across configurations, including heavy-tailed behavior that currently dominates the computational effort and motivates further encoding refinements.
2025-12-30T08:10:49Z
Bogdan Dumitru

http://arxiv.org/abs/2108.04798v4
Pointwise Distance Distributions for detecting near-duplicates in large materials databases
2025-12-29T22:26:33Z
Many real objects are modeled as discrete sets of points, such as corners or other salient features. For our main applications in chemistry, points represent atomic centers in a molecule or a solid material. We study the problem of classifying discrete (finite and periodic) sets of unordered points under isometry, which is any transformation preserving distances in a metric space.
Experimental noise motivates the new practical requirement to make such invariants Lipschitz continuous, so that perturbing every point within its epsilon-neighborhood changes the invariant by at most a constant multiple of epsilon in a suitable distance satisfying all metric axioms. Since the given points are unordered, the key challenge is to compute all invariants and metrics in time near-linear in the input size.
We define the Pointwise Distance Distribution (PDD) for any discrete set and prove, in addition to the properties above, the completeness of PDD for all periodic sets in general position. The PDD can compare nearly 2 million crystals from the world's five largest databases within 2 hours on a modest desktop computer. The impact is upholding data integrity in crystallography because the PDD will not allow anyone to claim a `new' material as a noisy disguise of a known crystal.
2021-08-10T17:15:10Z
38 pages, 15 figures. This version extended Lemma 5.3 to Theorem 5.3 in high dimensions, updated bibliographic details of published references, and corrected a few typos. The official version will appear in SIAM Journal on Applied Mathematics. The authors' version is maintained at https://kurlin.org/projects/periodic-geometry/near-duplicates-materials-databases.pdf
Daniel Widdowson, Vitaliy Kurlin
10.1137/25M1736657

http://arxiv.org/abs/2410.08203v3
Complete and bi-continuous invariant of protein backbones under rigid motion
2025-12-29T18:22:04Z
Proteins are large biomolecules that regulate all living organisms and consist of one or several chains. The primary structure of a protein chain is a sequence of amino acid residues whose three main atoms (alpha-carbon, nitrogen, and carbonyl carbon) form a protein backbone. The tertiary structure is the rigid shape of a protein chain represented by atomic positions in 3-dimensional space. Because different geometric structures often have distinct functional properties, it is important to continuously quantify differences in rigid shapes of protein backbones. Unfortunately, many widely used similarities of proteins fail axioms of a distance metric and discontinuously change under tiny perturbations of atoms.
This paper develops a complete invariant that identifies any protein backbone in 3-dimensional space, uniquely under rigid motion. This invariant is Lipschitz bi-continuous in the sense that it changes up to a constant multiple of a maximum perturbation of atoms, and vice versa. The new invariant has been used to detect thousands of (near-)duplicates in the Protein Data Bank, whose presence inevitably skews machine learning predictions. The resulting invariant space allows low-dimensional maps with analytically defined coordinates that reveal substantial variability in the protein universe.
2024-10-10T17:59:26Z
This version updated bibliographic details of published references and corrected a few typos. The official version appeared in the journal MATCH Communications in Mathematical and in Computer Chemistry, v.94(1), p.97-134 (2025). The latest version is maintained at https://kurlin.org/projects/geometry-proteins/complete-invariants-proteins.pdf
Olga Anosova, Alexey Gorelov, William Jeffcott, Ziqiu Jiang, Vitaliy Kurlin
10.46793/match.94-1.097A

http://arxiv.org/abs/2512.23766v1
A Granular Grassmannian Clustering Framework via the Schubert Variety of Best Fit
2025-12-29T02:28:45Z
In many classification and clustering tasks, it is useful to compute a geometric representative for a dataset or a cluster, such as a mean or median. When datasets are represented by subspaces, these representatives become points on the Grassmann or flag manifold, with distances induced by their geometry, often via principal angles. We introduce a subspace clustering algorithm that replaces subspace means with a trainable prototype defined as a Schubert Variety of Best Fit (SVBF) - a subspace that comes as close as possible to intersecting each cluster member in at least one fixed direction.
Integrated in the Linde-Buzo-Gray (LBG) pipeline, this SVBF-LBG scheme yields improved cluster purity on synthetic, image, spectral, and video action data, while retaining the mathematical structure required for downstream analysis.
2025-12-29T02:28:45Z
Karim Salta, Michael Kirby, Chris Peterson

http://arxiv.org/abs/2512.09170v4
Magic Gems: A Polyhedral Framework for Magic Squares
2025-12-28T07:38:43Z
We introduce Magic Gems, a geometric representation of magic squares as three-dimensional polyhedra. By mapping an n times n magic square onto a centered coordinate grid with cell values as vertical displacements, we construct a point cloud whose convex hull defines the Magic Gem. Building on prior work connecting magic squares to physical properties such as moment of inertia, this construction reveals an explicit statistical structure: we show that magic squares have vanishing covariances between position and value. We develop a covariance energy functional (the sum of squared covariances with individual row, column, and diagonal indicator variables) and prove that for all orders of n greater than or equal to three, an arrangement is a magic square if and only if this complete energy vanishes. This characterization transforms the classical line-sum definition into a statistical orthogonality condition. We also study a simpler low-mode relaxation using only four aggregate position indicators; this coincides with the complete characterization for n equals three (verified exhaustively) but defines a strictly larger class for n greater than or equal to four (explicit counterexamples computed). Perturbation analysis demonstrates that magic squares are isolated local minima in the energy landscape. The representation is invariant under dihedral symmetry D4, yielding canonical geometric objects for equivalence classes.
2025-12-09T22:33:47Z
Connecting Combinatorics, Geometry, and Linear Algebra. 8 figures, ancillary code included.
Interactive visualization: https://magicgemweb.netlify.app/
Kyle Elliott Mathewson

http://arxiv.org/abs/2508.19582v2
Approximating mixed volumes to arbitrary accuracy
2025-12-27T03:26:34Z
We study the problem of approximating the mixed volume $V(P_1^{(α_1)}, \dots, P_k^{(α_k)})$ of a $k$-tuple of convex polytopes $(P_1, \dots, P_k)$, each of which is defined as the convex hull of at most $m_0$ points in $\mathbb{Z}^n$. We design an algorithm that produces an estimate that is within a multiplicative $1 \pm ε$ factor of the true mixed volume with a probability greater than $1 - δ.$ Let the constant $ \prod_{i=2}^{k} \frac{(α_{i}+1)^{α_{i}+1}}{α_{i}^{\,α_{i}}}$ be denoted by $\tilde{A}$. When each $P_i \subseteq B_\infty(2^L)$, we show in this paper that the time complexity of the algorithm is bounded above by a polynomial in $n, m_0, L, \tilde{A}, ε^{-1}$ and $\log δ^{-1}$. In fact, a stronger result is proved in this paper, with slightly more involved terminology.
In particular, we provide the first randomized polynomial time algorithm for computing mixed volumes of such polytopes when $k$ is an absolute constant, but $α_1, \dots, α_k$ are arbitrary. Our approach synthesizes tools from convex optimization, the theory of Lorentzian polynomials, and polytope subdivision.
2025-08-27T05:31:30Z
Hariharan Narayanan, Sourav Roy

http://arxiv.org/abs/2512.21901v1
Graph Drawing Stress Model with Resistance Distances
2025-12-26T07:27:10Z
This paper challenges the convention of using graph-theoretic shortest distance in stress-based graph drawing. We propose a new paradigm based on resistance distance, derived from the graph Laplacian's spectrum, which better captures global graph structure. This approach overcomes theoretical and computational limitations of traditional methods, as resistance distance admits a natural isometric embedding in Euclidean space. Our experiments demonstrate improved neighborhood preservation and cluster faithfulness. We introduce Omega, a linear-time graph drawing algorithm that integrates a fast resistance distance embedding with random node-pair sampling for Stochastic Gradient Descent (SGD). This comprehensive random sampling strategy, enabled by efficient pre-computation of resistance distance embeddings, is more effective and robust than pivot-based sampling used in prior algorithms, consistently achieving lower and more stable stress values. The algorithm maintains $O(|E|)$ complexity for both weighted and unweighted graphs.
Our work establishes a connection between spectral graph theory and stress-based layouts, providing a practical and scalable solution for network visualization.
2025-12-26T07:27:10Z
Accepted by PacificVis 2026 (TVCG Journal Track)
Yosuke Onoue

http://arxiv.org/abs/2512.20926v2
Uncovering Hierarchical Structure in LLM Embeddings with $δ$-Hyperbolicity, Ultrametricity, and Neighbor Joining
2025-12-25T17:36:52Z
The rapid advancement of large language models (LLMs) has enabled significant strides in various fields. This paper introduces a novel approach to evaluate the effectiveness of LLM embeddings in the context of inherent geometric properties. We investigate the structural properties of these embeddings through three complementary metrics: $δ$-hyperbolicity, ultrametricity, and Neighbor Joining. $δ$-hyperbolicity, a measure derived from geometric group theory, quantifies how much a metric space deviates from being a tree-like structure. In contrast, ultrametricity characterizes strictly hierarchical structures where distances obey a strong triangle inequality. While Neighbor Joining quantifies how tree-like the distance relationships are, it does so specifically with respect to the tree reconstructed by the Neighbor Joining algorithm. By analyzing the embeddings generated by LLMs using these metrics, we uncover to what extent the embedding space reflects an underlying hierarchical or tree-like organization. Our findings reveal that LLM embeddings exhibit varying degrees of hyperbolicity and ultrametricity, which correlate with their performance in the underlying machine learning tasks.
2025-12-24T04:15:36Z
Prakash Chourasia, Sarwan Ali, Murray Patterson

http://arxiv.org/abs/2409.19464v7
Harmonious loci of Poncelet triangles about the incircle and their degeneracies
2025-12-25T11:24:15Z
We tour several Euclidean properties of Poncelet triangles inscribed in an ellipse and circumscribing the incircle, including loci of triangle centers and envelopes of key objects.
We also show that a number of degenerate behaviors are triggered by the presence of an equilateral triangle in the family.
2024-09-28T21:54:47Z
24 pages, 24 figures, 4 tables
Mark Helman, Ronaldo A. Garcia, Dan Reznik

http://arxiv.org/abs/2601.03275v1
Sufficient Conditions for the Shrinking Wellness Lemma
2025-12-25T07:06:53Z
The well groups were introduced by Edelsbrunner, Morozov, and Patel to measure the robustness of geometric features of a function with respect to perturbations. Roughly speaking, the $r$-th well group measures the number of features that cannot be removed by perturbing the function by at most $r$. The Shrinking Wellness Lemma states that the rank of these groups decreases as $r$ increases. In the generality originally stated, it is wrong. We present a counterexample and give conditions under which the result holds. These conditions are general enough to cover most cases in which the well groups have been applied.
2025-12-25T07:06:53Z
8 pages
Clemens Bannwart

http://arxiv.org/abs/2512.20325v1
Top-K Exterior Power Persistent Homology: Algorithm, Structure, and Stability
2025-12-23T12:49:44Z
Exterior powers play important roles in persistent homology in computational geometry. In the present paper we study the problem of extracting the $K$ longest intervals of the exterior-power layers of a tame persistence module. We prove a structural decomposition theorem that organizes the exterior-power layers into monotone per-anchor streams with explicit multiplicities, enabling a best-first algorithm. We also show that the Top-$K$ length vector is $2$-Lipschitz under bottleneck perturbations of the input barcode, and prove a comparison-model lower bound. Our experiments confirm the theory, showing speedups over full enumeration in high overlap cases.
By enabling efficient extraction of the most prominent features, our approach makes higher-order persistence feasible for large datasets and thus broadly applicable to machine learning, data science, and scientific computing.
2025-12-23T12:49:44Z
Yoshihiro Maruyama

http://arxiv.org/abs/2512.20311v1
Algorithm for Interpretable Graph Features via Motivic Persistent Cohomology
2025-12-23T12:29:58Z
We present the Chromatic Persistence Algorithm (CPA), an event-driven method for computing persistent cohomological features of weighted graphs via graphic arrangements, a classical object in computational geometry. We establish rigorous complexity results: CPA is exponential in the worst case, fixed-parameter tractable in treewidth, and nearly linear for common graph families such as trees, cycles, and series-parallel graphs. Finally, we demonstrate its practical applicability through a controlled experiment on molecular-like graph structures.
2025-12-23T12:29:58Z
Yoshihiro Maruyama

http://arxiv.org/abs/2512.20239v1
Hierarchical Rectangle Packing Solved by Multi-Level Recursive Logic-based Benders Decomposition
2025-12-23T10:50:33Z
We study the two-dimensional hierarchical rectangle packing problem, motivated by applications in analog integrated circuit layout, facility layout, and logistics. Unlike classical strip or bin packing, the dimensions of the container are not fixed, and the packing is inherently hierarchical: each item is either a rectangle or a block occurrence, whose dimensions are a solution of another packing problem. This recursive structure reflects real-world scenarios in which components, boxes, or modules must be packed within higher-level containers. We formally define the problem and propose exact formulations in Mixed-Integer Linear Programming and Constraint Programming. Given the computational difficulty of solving complex packing instances directly, we propose decomposition heuristics.
First, we implement an existing Bottom-Up baseline method that solves subblocks before combining them at higher levels. Building upon this, we introduce a novel multilevel Logic-based Benders Decomposition method. This heuristic method dynamically refines block dimension constraints, eliminating the need for manual selection of candidate widths or aspect ratios. Experiments on synthetic instances with up to seven hierarchy levels, 80 items per block, and limited computation time show that the proposed decomposition significantly outperforms both monolithic formulations and the Bottom-Up method in terms of solution quality and scalability.
2025-12-23T10:50:33Z
Preprint submitted to Computers and Operations Research, 55 pages
Josef Grus, Zdeněk Hanzálek, Christian Artigues, Cyrille Briand, Emmanuel Hebrard