Harmonic numbers as the summation of integrals

2021-12-01T03:28:08Z

Harmonic numbers arise from the truncation of the harmonic series. The $n^\text{th}$ harmonic number is the sum of the reciprocals of each positive integer up to $n$. In addition to briefly introducing the properties of harmonic numbers, we cover harmonic numbers as the summation of integrals that involve the product of exponential and hyperbolic secant functions. The proof is relatively simple since it only comprises the Principle of Mathematical Induction and integration by parts.
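As a small illustration of the definition above (not of the integral representation, which the abstract does not spell out), the following sketch computes harmonic numbers exactly with rational arithmetic. The function name `harmonic` is a hypothetical choice made for this example.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """Return the n-th harmonic number H_n = 1 + 1/2 + ... + 1/n as an exact fraction.

    By convention in this sketch, H_0 is the empty sum, i.e. 0.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Sum the reciprocals of the positive integers up to n, with no rounding error.
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, harmonic(n))
```

Exact fractions are used here so that the values can be compared against a closed-form identity without floating-point noise; for large n a floating-point sum would be faster.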

A Taxonomy of Anomalies in Log Data

2021-11-26T12:23:06Z

Log data anomaly detection is a core component in the area of artificial intelligence for IT operations. However, the large number of existing methods makes it hard to choose the right approach for a specific system. A better understanding of different kinds of anomalies, and of which algorithms are suitable for detecting them, would support researchers and IT operators. Although a common taxonomy for anomalies already exists, it has not yet been applied specifically to log data in a way that points out the characteristics and peculiarities of this domain. In this paper, we present a taxonomy for different kinds of log data anomalies and introduce a method for analyzing such anomalies in labeled datasets. We applied our taxonomy to the three common benchmark datasets Thunderbird, Spirit, and BGL, and trained five state-of-the-art unsupervised anomaly detection algorithms to evaluate their performance in detecting different kinds of anomalies. Our results show that the most common anomaly type is also the easiest to predict. Moreover, deep learning-based approaches outperform data mining-based approaches for all anomaly types, especially when it comes to detecting contextual anomalies.

Towards a Theory of Bullshit Visualization

2021-09-23T17:20:37Z

In this unhinged rant, I lay out my suspicion that a lot of visualizations are bullshit: charts that do not have even the common decency to intentionally lie but are totally unconcerned about the state of the world or any practical utility. I suspect that bullshit charts take up a large fraction of the time and attention of actual visualization producers and consumers, and yet are seemingly absent from academic research into visualization design.

Towards the Classification of Error-Related Potentials using Riemannian Geometry

2021-09-21T06:42:47Z

The error-related potential (ErrP) is an event-related potential (ERP) evoked by an experimental participant's recognition of an error during task performance. ErrPs, originally described by cognitive psychologists, have been adopted for use in brain-computer interfaces (BCIs) for the detection and correction of errors, and the online refinement of decoding algorithms. Riemannian geometry-based feature extraction and classification is a new approach to BCI which shows good performance in a range of experimental paradigms, but has yet to be applied to the classification of ErrPs. Here, we describe an experiment that elicited ErrPs in seven normal participants performing a visual discrimination task. Audio feedback was provided on each trial. We used multi-channel electroencephalogram (EEG) recordings to classify ErrPs (success/failure), comparing a Riemannian geometry-based method to a traditional approach that computes time-point features. Overall, the Riemannian approach outperformed the traditional approach (78.2% versus 75.9% accuracy, p < 0.05); this difference was statistically significant (p < 0.05) in three of seven participants. These results indicate that the Riemannian approach better captured the features from feedback-elicited ErrPs, and may have application in BCI for error detection and correction.

Integrated Random Projection and Dimensionality Reduction by Propagating Light in Photonic Lattices

2021-08-19T12:37:42Z

It is proposed that the propagation of light in disordered photonic lattices can be harnessed as a random projection that preserves distances between a set of projected vectors. This mapping is enabled by the complex evolution matrix of a photonic lattice with diagonal disorder, which turns out to be a random complex Gaussian matrix. Thus, by collecting the output light from a subset of the waveguide channels, one can perform an embedding from a higher-dimensional to a lower-dimensional space that respects the Johnson-Lindenstrauss lemma and nearly preserves the Euclidean distances. It is argued that distance-preserving random projection through photonic lattices requires intermediate disorder levels that allow diffusive spreading of light from a single channel excitation, as opposed to strong disorder, which initiates the localization regime. The proposed scheme can be utilized as a simple and powerful integrated dimension reduction stage that can greatly reduce the burden of a subsequent neural computing stage.
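The distance-preserving property described above can be sketched numerically. The example below is a software stand-in only: it assumes the projection can be modeled as a k x d matrix of independent complex Gaussian entries (the photonic lattice itself is not simulated), and the function names, dimensions, and scaling are hypothetical choices made for illustration.

```python
import math
import random


def random_projection_matrix(k, d, rng):
    """Build a k x d matrix of i.i.d. complex Gaussian entries.

    Assumption: real and imaginary parts are each N(0, 1/2), so every entry has
    unit mean squared magnitude; dividing by sqrt(k) makes the projection
    preserve squared Euclidean norms in expectation.
    """
    sigma = math.sqrt(0.5)
    scale = 1.0 / math.sqrt(k)
    return [
        [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) * scale for _ in range(d)]
        for _ in range(k)
    ]


def project(matrix, x):
    """Map a length-d vector to a length-k complex vector (matrix-vector product)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]


def distance(u, v):
    """Euclidean distance; abs() handles both real and complex coordinates."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    rng = random.Random(0)
    d, k = 400, 200  # hypothetical input and output dimensions
    A = random_projection_matrix(k, d, rng)
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [rng.gauss(0.0, 1.0) for _ in range(d)]
    ratio = distance(project(A, x), project(A, y)) / distance(x, y)
    print(f"projected / original distance: {ratio:.3f}")
```

With these assumptions the ratio of projected to original distance concentrates around 1 as k grows, which is the behavior the Johnson-Lindenstrauss lemma describes.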

Who Owns the Data? A Systematic Review at the Boundary of Information Systems and Marketing

2021-07-29T14:31:44Z

This paper gives a systematic research review at the boundary of the information systems (IS) and marketing disciplines. First, a historical overview of these disciplines is given to put the review into context. This is followed by a bibliographic analysis to select articles at the boundary of IS and marketing. Text analysis is then performed on the selected articles to group them into homogeneous research clusters, which are refined by selecting "distinct" articles that best represent the clusters. The citation asymmetries between IS and marketing are noted, and an overall conceptual model is created that describes the "areas of collaboration" between IS and marketing. Forward-looking suggestions are made on how academic researchers can better interface with industry and how academic research at the boundary of IS and marketing can be further developed.

The Graph Neural Networking Challenge: A Worldwide Competition for Education in AI/ML for Networks

2021-07-26T18:52:00Z

During the last decade, Machine Learning (ML) has increasingly become a hot topic in the field of Computer Networks and is expected to be gradually adopted for a plethora of control, monitoring and management tasks in real-world deployments. This creates a need for new generations of students, researchers and practitioners with a solid background in ML applied to networks. In 2020, the International Telecommunication Union (ITU) organized the "ITU AI/ML in 5G challenge", an open global competition that introduced a broad audience to some of the current main challenges in ML for networks. This large-scale initiative gathered 23 different challenges proposed by network operators, equipment manufacturers and academia, and attracted a total of 1300+ participants from 60+ countries. This paper narrates our experience organizing one of the proposed challenges: the "Graph Neural Networking Challenge 2020". We describe the problem presented to participants, the tools and resources provided, some organization aspects and participation statistics, an outline of the top-3 awarded solutions, and a summary of lessons learned along the way. As a result, this challenge leaves a curated set of educational resources openly available to anyone interested in the topic.

The Factors of Code Reviewing Process to Ensure Software Quality

2021-07-21T22:17:11Z

In this era of revolution, software development is increasing daily. Quality has the greatest impact in software development. To ensure its quality, software needs to be reviewed and updated. Code review is effective because it ensures the quality of the software and keeps it up to date. Code review is the best process for helping developers build an error-free system. This report evaluates two code review papers to find the influences that can affect the code reviewing process. The reader can easily understand the factors of the code review process that are directly associated with software quality assurance.

Special Purpose Computers for Statistical Physics: achievements and lessons

2021-07-16T08:47:27Z

In the late 80s and 90s, theoretical physicists of the Landau Institute for Theoretical Physics designed and developed several specialized computers for challenging computational problems in the physics of phase transitions. These computers did not have a central processing unit. They optimized algorithms to handle elementary operations on integers -- read, write, compare, and count. The approach allowed them to achieve record run times. The computers performed calculations three orders of magnitude faster than similar calculations on the world's best supercomputers. The approach made it possible to obtain fundamentally new results, some of which have not yet been surpassed in the accuracy of calculations. The report will present the main ideas for the development of specialized computers and the scientific results obtained with their help. The lessons of planning and execution of long-term complex scientific projects will also be discussed.

A Guide for New Program Committee Members at Theoretical Computer Science Conferences

2021-05-04T19:40:57Z

In theoretical computer science, conferences play an important role in the scientific process. The decision whether to accept or reject an article is made by the program committee (PC) members. Serving on a PC for the first time can be a daunting experience. This guide will help new program committee members understand how the system works, and provides useful tips and guidelines. It discusses every phase of the paper-selection process and the tasks associated with it.

Edsger W. Dijkstra: a Commemoration

2021-04-07T21:00:38Z

This article is a multiauthored portrait of Edsger Wybe Dijkstra that consists of testimonials written by several friends, colleagues, and students of his. It provides unique insights into his personality, working style and habits, and his influence on other computer scientists, as a researcher, teacher, and mentor.

What Kind of Person Wins the Turing Award?

2021-04-04T00:38:26Z

Computer science has grown rapidly since its inception in the 1950s, and the pioneers in the field are celebrated annually by the A.M. Turing Award. In this paper, we attempt to shed light on the path to becoming an influential computer scientist by examining the characteristics of the 72 Turing Award laureates. To achieve this goal, we build a comprehensive dataset of the Turing Award laureates and analyze their characteristics, including their personal information, family background, academic background, and industry experience. The FP-Growth algorithm is used for frequent feature mining. Logistic regression plots, pie charts, word clouds, and maps are generated accordingly for each of the interesting features to uncover insights regarding personal factors that drive influential work in the field of computer science. In particular, we show that the Turing Award laureates are most commonly white, male, married, United States citizens, and holders of a PhD degree. Our results also show that the age at which laureates won the award increases over the years; most of the Turing Award laureates did not major in computer science; birth order is strongly related to the winners' success; and the number of citations is not as important as one would expect.

The AI Index 2021 Annual Report

2021-03-09T02:29:44Z

Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.

Empirical Standards for Software Engineering Research

2021-03-04T16:34:34Z

Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair.

The Slodderwetenschap (Sloppy Science) of Stochastic Parrots -- A Plea for Science to NOT take the Route Advocated by Gebru and Bender

2021-01-11T19:55:09Z

This article is a position paper written in reaction to the now-infamous paper titled "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Timnit Gebru, Emily Bender, and others who were, as of the date of this writing, still unnamed. I find the ethics of the Parrot Paper lacking, and in that lack, I worry about the direction in which computer science, machine learning, and artificial intelligence are heading. At best, I would describe the argumentation and evidentiary practices embodied in the Parrot Paper as Slodderwetenschap (Dutch for Sloppy Science) -- a word which the academic world last widely used in conjunction with the Diederik Stapel affair in psychology [2]. What is missing in the Parrot Paper are three critical elements: 1) acknowledgment that it is a position paper/advocacy piece rather than research, 2) explicit articulation of the critical presuppositions, and 3) explicit consideration of cost/benefit trade-offs rather than a mere recitation of potential "harms" as if benefits did not matter. To leave out these three elements is not good practice for either science or research.