Data Structures and Algorithms

Theoretical Structures as Networks from the Perspective of Automated Creation and Proving of Theorems

We envisage theoretical structures (especially in pure mathematics and theoretical physics) as networks made of elementary propositions (representing nodes) interconnected through deductive relationships (representing throats). This vision can be exploited as a basis for employing traditional network modeling techniques in the automated search for new theorems as well as for automated proving of proposed theorems and conjectures. This deductive, deterministic and intuitive approach can replace some of the conventional approaches (which are generally more sophisticated and elaborate and hence they are more expensive) in certain areas of automated and assisted theorem proving in addition to its benefit in the automated search for novel theorems. However, we admit that it has a number of limitations and shortcomings although this similarly applies to other methods in this field; moreover some of these limitations and shortcomings can be overcome by the reformulation of certain theoretical structures where we rely for the viability of this reformulation on our perception of theoretical structures as elaborate high-level linguistic systems.

[2] vixra:2407.0093 [pdf]

A Replication Study of the TA-Prioritized Algorithm with a Modified Deadlock Avoidance Method

Pieter Cawood

This paper proposes a modified deadlock avoidance method for the TA-Prioritized algorithm by Liu et al. [1]. Their algorithms were developed for the offline Multi-Agent Pickup and Delivery (MAPD) problems where a team of agents have delivery tasks with known release times (when the tasks are ready for pickup.) Offline MAPD problems exist in settings such as warehouses and factories where the release times of tasks are known in advance and Liu et al. [1] make use of this information to compute a good task sequence for each agent using a travelling salesman problem (TSP) solver. The task sequences are then used to plan agent paths accordingly and the deadlock avoidance method proposed attempts holding the pickup locations (keeping an agent stationed at a vertex) if an agent has reached it before the release time and retrying a 1 timestep delayed path from the agent’s initial parking location if any of the agent’s tasks’ path finding fails.

[3] vixra:2402.0156 [pdf]

Designing Performance Testing Software for Cloud Based Enterprise Applications

Pranshu Mishra, Surbhi Agarwal

The paper presents the design and implementation of performance testing software designed forcloud-based enterprise applications. With the increasing adoption of cloud computing in enterprise environments, ensuring the performance and scalability of these cloud-based applications is crucial. Traditional performance testing tools may not fully address the unique challenges posed by cloud environments as these tools are, generally speaking, usually designed as load generators. Therefore, we propose a comprehensive approach to designing performance testing software specifically for cloud-based enterprise applications in which load generation is one component of it’s architecture.

[4] vixra:2312.0116 [pdf]

Authentication Link: A Novel Authentication Architecture in IoT/IoB Environment

Gi-Yoon Jeon

The authentication is the process of determining whether someone or something is, and there are many authentication methods for digital environment. The digital authentication is divided into three main categories, 'What you have', 'What you know', and 'Who you are'. Furthermore, there are multi-factor authentications using a combination of two or more of these. However, these methods are always exposed to the risk of forgery, tampering, and stealing. This paper proposes a novel authentication architecture that is suitable for Internet of Things (IoT) and Internet of Behaviors (IoB) environment. In the aspect of technology, the proposed architecture is token based authentication method. However, this architecture is continuous, mimics real analog world, and has the advantage of being immediately recognizable in counterfeiting.

[5] vixra:2309.0146 [pdf]

A Polynomial Solution for the 3-Sat Problem

Geraldine Reichard

In this paper, an algorithm is presented, solving the 3-SAT problem in a polynomial runtime of O(n^3), which implies P=NP. The 3-SAT problem is about determining, whether or not a logical expression, consisting of clauses with up to 3 literals connected by OR-expressions, that are interconnected by AND-expressions, is satisfiable. For the solution a new data structure, the 3-SAT graph, is introduced. It groups the clauses from a 3-SAT expression into coalitions, that contain all clauses with literals consisting of the same variables. The nodes of the graph represent the variables connecting the corresponding coalitions. An algorithm R will be introduced, that identifies relations between clauses by transmitting markers, called upgrades, through the graph, making use of implications. The algorithm will start sequentially for every variable and create start upgrades, one for the variables negated and one for its non-negated literals. It will be shown, that the start upgrades have to be within a specific clause pattern, called edge pattern, to mark a beginning or ending of an unsatisfiable sequence. The algorithm will eventually identify other kinds of pattern within the upgraded clauses. Depending on the pattern, the algorithm either sends the upgrades on through the graph, creates new following upgrades to extend the upgrade path, a subgraph storing all previous upgrades, or if all connector literals of a pattern have received upgrades of the same path or two corresponding following upgrades circle, marks the upgrade as circling. If an upgrade circles, then it is unsatisfiable on its path. It will be proven, that if after several execution steps of algorithm R, two corresponding start upgrades circle, then the expression is unsatisfiable and if no upgrade steps are possible anymore and the algorithm did not return unsatisfiable, the expression is satisfiable. The algorithm R is similar to already existing solutions solving 2-SAT polynomial, also making use of implications using a graph.

[6] vixra:2308.0141 [pdf]

On a Solution of "P vs NP" Millennium Prize Problem Based on the Subset Sum Problem

B. Sinchev, A. B. Sinchev, A. М. Mukhanova

Given a set of distinct non-negative integers X^n and a target certificate S parametrized in: ∃X^k⊆X^n,∑_(x_i∈X^k)[]x_i =S (k=|X^k |,n=|X^n |). We present a polynomial solution of the subset sum problem with time complexity T≤O(n^2) and space complexity S≤O(n^2 ), so that P = NP.

[7] vixra:2307.0148 [pdf]

A Differential Datalog Interpreter

Matthew Stephenson

The core reasoning task for datalog engines is materialization, the evaluation of a datalog program over a database alongside its physical incorporation into the database itself. The de-facto method of computing it, is through the recursive application of inference rules. Due to it being a costly operation, it is a must for datalog engines to provide incremental materialization, that is, to adjustthe computation to new data, instead of restarting from scratch. One of the major caveats, is that deleting data is notoriously more involved than adding, since one has to take into account all possible data that has been entailed from what is being deleted. DifferentialDataflow is a computational model that provides efficient incremental maintenance, notoriously with equal performance between additions and deletions, and work distribution, of iterative dataflows. In this paper we investigate the performance of materialization with three reference datalog implementations, out of which one is built on top of a lightweight relational engine, and the two others are differential-dataflow and non-differential versions of the same rewrite algorithm, with the same optimizations.

[8] vixra:2307.0103 [pdf]

A Comparative Analysis of Smart Contract Fuzzers’ Effectiveness

Antonio Viggiano

This study presents a comparative analysis of randomized testing algorithms, commonly known as fuzzers, with a specific emphasis on their effectiveness in catching bugs in Solidity smart contracts. We employ the non-parametric Mann-Whitney U-test to gauge performance, defined as the ``time to break invariants per mutant'', using altered versions of the widely-forked Uniswap v2 protocol. We conduct 30 tests, each with a maximum duration of 24 hours or 4,294,967,295 runs, and evaluate the speed at which the fuzzers Foundry and Echidna can breach any of the 22 protocol's invariant properties for each of the 12 mutants, created both with mutation testing tools and with manual bug injection methods. The research shows significant performance variabilities between runs for both Foundry and Echidna depending on the instances of mutated code. Our analysis indicates that Foundry was able to break invariants faster in 9 out of 12 tests, while Echidna in 1 out of 12 tests, and in the remaining 2 tests, the difference in performance between the two fuzzers was not statistically significant. The paper concludes by emphasizing the necessity for further research to incorporate additional fuzzers and real-world bugs, and paves ground for further developments of more precise and rigorous evaluations of fuzzer effectiveness.

[9] vixra:2306.0119 [pdf]

The Fabbrini Problem

Ed Andersen

This document intends to emphasize some aspects of a recent algorithm capable of generating a secret key by transmitting information over a public channel. Given that the scheme’s construction is engaging and represents a topical innovation, we deem it useful to refer to it as "The Fabbrini Problem", after its author.

[10] vixra:2305.0088 [pdf]

Design and Implementation of a Real-time Portfolio Risk Management System

Farid Soroush

The purpose of this report is to describe the design and implementation of a real-time portfolio risk management system. The system is developed in Python and utilizes pandas and numpy libraries for data management and calculations. With the advent of high-frequency trading, risk management has become a crucial aspect of the trading process. Traditional risk management practices are often not suitable due to the high-speed nature of these trades. Therefore, there is a need for a real-time risk management system that can keep pace with high-frequency trades.

[11] vixra:2305.0062 [pdf]

A Simple Market Making Strategy for the S&P 500 Index Using Synthetic Bid-Ask Spread Data

Farid Soroush

Market making is a crucial component of financial markets, providing liquidity to market participants by quoting both buy (bid) and sell (ask) prices for an asset. The main objective of a market maker is to profit from the bid-ask spread while managing inventory risk. In this paper, we implement a simple market making strategy for the S&P 500 index using synthetic bid-ask spread data.

[12] vixra:2302.0099 [pdf]

Route Planning Algorithms for Efficient 3D Printing

Shulang Lei

General 3D printers use Fused Deposition Modeling (FDM) and Stereolithography (SLA) technologies to print 3D models. However, turning the nozzle on and off during FDM or SLA extruding will cause unwanted results. This project created an experimental 3D model slicer named Embodier that generates continuous extruding paths whenever possible. This enables 3D printers to draw out the printing layers accurately in a stable manner. Furthermore, the slicer partitions the outlines to tree structures for efficiency and applies flooding algorithm for water-tightness validation. Lastly, a 3D printing simulator is also created to visualize the printed paths in 3D for a more intuitive review of the Embodier slicer. The end result is that we have discovered that not only a single continuous-extruded-path slicer is possible, it can also be optimized for performance and robustness in practice.

[13] vixra:2212.0144 [pdf]

Models of Pen-PL Gaussian Games in Non-cooperative Communication

Gilbert Krougman, Pasoul Rozhier, Kawabata Yakurami

Consider non-cooperative pen games where both players act strategically and heavily influence each other. In spam and malware detection, players exploit randomization to obfuscate malicious data and increase their chances of evading detection at test time. The result shows Pen-PL Games have a probability distribution that approximates a Gaussian distribution according to some probability distribution defined over the respective strategy set. With quadratic cost functions and multivariate Gaussian processes, evolving according to first order auto-regressive models, we show that Pen-PL "smooth" curve signaling rules are optimal. Finally, we show that computing a socially optimal Pen-PL network placement is NP-hard and that this result holds for all P-PL-G distributions.

[14] vixra:2207.0150 [pdf]

Simpler O(1) Query Algorithm for Level Ancestors

Sanjeev Saxena

This note describes a very simple O(1) query time algorithm for finding level ancestors. This is basically a serial (re)-implementation of the parallel algorithm of Berkman and Vishkin(O.Berkman and U.Vishkin, Finding level-ancestors in trees, JCSS, 48, 214--230, 1994).Although the basic algorithm has preprocessing time of O(n log n), by having additional levels or using table lookup, the preprocessing time can be reduced to almost linear or linear.The table lookup algorithm can be built in O(1) parallel time with n processors and can also be used to simplify the parallel algorithm of Berkman and Vishkin and make it optimal.

[15] vixra:2205.0039 [pdf]

A Hundred Attacks in Distributed Systems

Arash Vaezi, Sara Azarnoush, Parsa Mohammadian

The objective of any security system is the capacity to keep a secret. It is vital to keep the data secret when it is stored as well as when it is sent over a network. Nowadays, many people utilize the internet to access various resources, and several businesses employ a dispersed environment to give services to their users. As a result, a more secure distributed environment is required, in which all transactions and processes can be effectively completed safely. It is critical in a distributed system environment to deliver reliable services to users at any time and from any place. As an example of a distributed system, Blockchain is a unique distributed system that has confronted lots of attacks despite its security mechanism. Security is a top priority in a distributed setting. This paper organizes many attacks that byzantine users may apply to take advantage of the loyal users of a system. A wide range of previous articles dealt considered diverse types of attacks. However, we could not find a well-organized document that helps scientists consider different attacking aspects while designing a new distributed system. A hundred various kinds of most essential attacks are categorized and summarized in this article.

[16] vixra:2204.0040 [pdf]

Saving Proof-of-Work by Hierarchical Block Structure: Bitcoin 2.0?

Qui Somnium

We argue that the current Proof of Work based consensus algorithm of the Bitcoin network suffers from a fundamental economic discrepancy between the real-world transaction costs incurred by miners and the wealth that is being transacted. Put simply, whether one transacts 1 satoshi or 1 bitcoin, the same amount of electricity is needed when including this transaction into a block. The notorious Bitcoin blockchain problems such as its high energy usage per transaction or its scalability issues are, either partially or fully, mere consequences of this fundamental economic inconsistency. We propose making the computational cost of securing the transactions proportional to the wealth being transfered, at least temporarily. First, we present a simple incentive based model of Bitcoin's security. Then, guided by this model, we augment each transaction by two parameters, one controlling the time spent securing this transaction and the second determining the fraction of the network used to accomplish this. The current Bitcoin transactions are naturally embedded into this parametrized space. Then we introduce a sequence of hierarchical block structures (HBSs) containing these parametrized transactions. The first of those HBSs exploits only a single degree of freedom of the extended transaction, namely the time investment, but it allows already for transactions with a variable level of trust together with aligned network fees and energy usage. In principle, the last HBS should scale to tens of thousands timely transactions per second while preserving what the previous HBSs achieved. We also propose a simple homotopy based transition mechanism which enables us to relatively safely and continuously introduce new HBSs into the existing blockchain. Our approach is constructive and as rigorous as possible and we attempt to analyze all aspects of these developments, al least at a conceptual level. The process is supported by evaluation on recent transaction data.

[17] vixra:2201.0155 [pdf]

Image Deblurring: a Class of Matrices Approximating Toeplitz Matrices

Antonio Boccuto, Ivan Gerace, Valentina Giorgetti

We deal with the image deblurring problem. We assume that the blur mask has large dimensions. To restore the images, we propose a GNC-type technique, in which a convex approximation of the energy function is first minimized. The computational cost of the GNC algorithm depends strongly on the cost of such a first minimization. So, we propose of approximating the Toeplitz symmetric matrices in the blur operator by means of suitable matrices. Such matrices are chosen in a class of matrices which can be expressed as a direct sum between a circulant and a reverse circulant matrix.

[18] vixra:2112.0088 [pdf]

Knowledge Graph Based Query Processing by Matching Words to Entities Using Wikipedia

Guanxuan Wu

The thirty-years development of Search Engines could never set apart the fundamental problem: to improve relativity for the page rankings with the given query. As the NLP is now widely used, this paper discusses a data abstraction via knowledge graph (KG) for NLP models that could be applied in relating keywords to entities with a higher probability.

[19] vixra:2109.0113 [pdf]

Unishox a Hybrid Encoder for Short Unicode Strings

Arundale Ramanathan

Unishox is a hybrid encoding technique, with which short unicode strings could be compressed using context aware pre-mapped codes and delta coding resulting in surprisingly good ratios.

[20] vixra:2106.0132 [pdf]

Lower Bound for Arbitrarily Aligned Minimal Enclosing Rectangle

Boris Zavlin

We determine the lower bound for arbitrarily aligned perimeter and area Minimal Enclosing Rectangle (MER) problems to be Omega(n log n) by using a reduction technique for a problem with this known lower bound.

[21] vixra:2101.0033 [pdf]

Evaluation and Implementation of Proven Real-Time Image Processing Features for Use in Web Browsers

Amr Abdellatif

We explored the requirement of proven features for real-time use in web browsers, adopting a linear SVM based face detection model as a test case to evaluate each descriptor with appropriate parameters. After checking multiple feature extraction algorithms, we decided to study the following four descriptors Histogram of oriented gradients, Canny edge detection, Local binary pattern, and Dense DAISY . These four descriptors are used in various computer vision tasks to oer a wide range of options. We then investigated the influence of different parameters as well as dimension reduction on each descriptor computational time and its ability to be processed in real-time. We also evaluated the influence of such changes on the accuracy of each model.

[22] vixra:2101.0032 [pdf]

Evaluation of ML Methods for Online Use in the Browser

Hao Liu

Machine learning continues to be an increasingly integral component of our lives, whether we are applying the techniques to research or business problems. Machine learning models ought to be able to give accurate predictions in order to create real value for a given organization. At the same time Machine learning training by running algorithms on the browser has gradually become a trend. As the closest link to users in the Internet, the web front-end can also create a better experience for our users through AI capabilities. This article will focus on how to evaluate machine learning algorithms and deploy machine learning models in the browser.We will use "Cars", "MNIST" and "Cifar-10" datasets to test LeNet, AlexNet, GoogLeNet and ResNet models in the browser. On the other hand, we will also test emerging lightweight models such as MobileNet. By trying, comparing, and comprehensively evaluating regression and classification tasks, we can summarize some excellent methods/models and experiences suitable for machine learning in the browser.

[23] vixra:2101.0031 [pdf]

Improving BrowserWatermarking with Eye Tracking

Nikolai-Iraj Sanamrad

The problem of authenticating online users can be divided in two general sub problems: confirming two different web users being the same person, and confirming two different web users being not the same person. The easiest and most accessible method for fingerprinting used by online services is browser fingerprinting. Browser fingerprinting distinguishes between user devices using information about the browser and the system, which is usually provided by most browsers to any website, even when cookies are turned off. This data is usually unique enough even for identical devices, since it heavily relies on device usage by the user. In this work, browser fingerprinting is being improved with the usage of information acquired from an eye tracker.

[24] vixra:2101.0030 [pdf]

Erstellung Und Bewertung Eineswebbrowserbasierten Frameworks Zur Datenakquise Für Studien

Oliwia Oles

Diese Arbeit beschäftigt sich mit der Ertsellung und Bewertung eines Frameworks zur Durchführung und Datenerfassung von Studien, wie beispielswweise eine Blickpostionsvorhersage oder Emotionserkennung. Der Eyetracker basiert auf der webgazer.js Biblitohek der Brown Universität und die Emotionserkennung beruht auf einer FrontEnd-Emotionserkennungs Bibliothek auf Grundlage von der Google Shape Detection API. Beide Bibiliotheken, wurden implementiert, um sie für Studien zu gebrauchen. Zusätzlich zu diesen Daten werden auch noch Angaben wie das Alter und das Geschlecht, jedoch nur auf freiwilliger Basis, erfasst, um mehr Aussagekraft zu erhalten. Anschließend wird das Framework getestet und bewertet, um zu erfahren, ob es genauso gut in der Praxis wie in der Theorie funktioniert, da es sich hierbei um komplett clientseitige Anwendungen handelt. Im Normalfall benötigen solche Anwendungen nämlich noch zusätzliche Hardware, um zu funktionieren. Die verwendeten Biblitoheken kommen jedoch ohne diese aus, fordern nur eine funktionierende Webcam und bestehen nur auf JavaScript Code. Deshalb stellt sich hier nun die Frage, ob eine rein clientseitige Anwendung vergleichbar gut mit anderen Anwednungen in diesen Bereichen funktioniert und ob sich die Daten gut erfassen lassen.

[25] vixra:2101.0029 [pdf]

Erstellung Und Bewertung Einer Webplattformbasierten Datenhaltungs Und Bewertungssoftware Für Studien

Roufayda Salaheddine

Aufgrund der zunehmenden Nutzung und Felxibilität des Internets, hat sich der Einsatz von Webanwendungen in vielen Anwendungsgebieten etabliert. Vor allem in der empirischen Forschung, z.B. in Bereichen des maschinellen Lernens oder der Mensch-Computer-Interaktion, findet eine steigende Integration vonWebanwendungen in den Studienbetrieb statt, um schnell und flexibel große Datensätze verwalten und auswerten zu können. Hierzu ist es wichtig, eine Datenhaltungssoftware so zu gestalten, dass eine qualitativ hochwertige und effziente Speicherung der erhobenen Studiendaten gewährleistet wird. Das Ziel dieser Arbeit ist die Entwicklung und Bewertung einer Webplattformbasierten Datenhaltungs- und Bewertungssoftware für Eye-Tracking- und Emotion-Detection-Studien. Dazu wurde eine auf dem Client-Server-Modell basierende Webplattform erstellt, die Probandendaten sammelt und anschließend in einer Datenbank verwaltet. Darauf aufbauend wird im Rahmen dieser Arbeit diese Software unter Betrachtung verschiedener Anforderungen evaluiert und bewertet.

[26] vixra:2012.0211 [pdf]

EsoCipher: An Instance of Hybrid Cryptography

Neel Adwani

This paper proposes a whole new concept in the field of Cryptography, i.e., EsoCiphers. Short for Esoteric Cipher, EsoCipher is an algorithm, which can be understood by a few, who have the knowledge about its backend architecture. The idea behind this concept is derived from esoteric programming languages, but EsoCiphers will be having a practical use in the future, as more research is done on the topic. Although the algorithm is quite simple to understand, the complexity of the output will indeed prove to be difficult to brute-force if a secret is embedded in it. It uses a hybrid cryptography-based technique, which combines ASCII, Binary, Octal, and ROT 13 ciphers. The implementation and similarity index has been provided to show that it can be implemented practically.

[27] vixra:2011.0123 [pdf]

Computation of Cup I-Product and Steenrod Operations on the Classifying Space of Finite Groups

Daher Al Baydli

The aim of this paper is to give a computational treatment to compute the cup iproduct and Steerod operations on cohomology rings of as many groups as possible. We find a new method that could be faster than the methods of Rusin, Vo, and Guillot. There are some available approaches for computing Steenrod operations on these cohomology rings. The computation of all Steenrod squares on the Mod 2 cohomology of all groups of order dividing 32 and all but 58 groups of order 64; partial information on Steenrod square is obtained for all but two groups of order 64. For groups of order 32 this paper completes the partial results due to Rusin, Thanh Tung Vo and Guillot.

[28] vixra:2010.0059 [pdf]

Structural Entropy of Daily Number of COVID-19 Related Fatalities

Eren Unlu

A recently proposed temporal correlation based network framework applied on financial markets called Structural Entropy has prompted us to utilize it as a means of analysis for COVID-19 fatalities across countries. Our observation on the resemblance of volatility of fluctuations of daily novel coronavirus related number of deaths to the daily stock exchange returns suggests the applicability of this approach.

[29] vixra:2008.0122 [pdf]

Representing Sets with Minimal Space: Jamie’s Trit

James Dow Allen

The theoretical minimum storage needed to represent a set of size N drawn from a universe of size M is about N * (log_2(M/N) + 1.4472) bits (assuming neither N nor M/N is very small). I review the technique of `quotienting' which is used to approach this minimum, and look at the actual memory costs achieved by practical designs. Instead of somehow implementing and exploiting 1.4472 bits of steering information, most practical schemes use two bits (or more). In the conclusion I mention a scheme to reduce the overhead cost from two bits to a single trit.

[30] vixra:2008.0092 [pdf]

On Seat Allocation Problem with Multiple Merit Lists

Rahul Kumar Singh, Sanjeev Saxena

In this note, we present a simpler algorithm for joint seat allocation problem in case there are two or more merit lists. In case of two lists (the current situation for Engineering seats in India), the running time of the algorithm is proportional to sum of running time for two separate (delinked) allocations. The algorithm is straight forward and natural and is not (at least directly) based on deferred acceptance algorithm of Gale and Shapley. Each person can only move higher in his or her preference list. Thus, all steps of the algorithm can be made public. This will improve transparency and trust in the system.

[31] vixra:2007.0143 [pdf]

The Blockcard Protocol: It's the Proof-of-Thought That Counts

Robert S. Adlemir, Robert S. Adlemir, Chris K. Wong

We identify a major security flaw in modern gift transaction protocol that allows for malicious entities to send questionable metadata to insecure1 recipients. To address these weaknesses we introduce the Blockcard protocol, a novel variant of Blockchain technology that uses an asymmetric proof-of-work CPU cost function over payload metadata to provide a cryptographically secure and efficient method of verifying that gift-givers thought enough about the recipients payload or lack thereof for it to count. This has the advantage of making it computationally infeasible and socially awkward for adversarial gift-givers to double-spend, spoof, or precompute their celebratory thoughts.

[32] vixra:2007.0063 [pdf]

Is it Still Worth the Cost to Teach Compiling in 2020 ? a Pedagogical Experience Through Hortensias Compiler and Virtual Machine

Karim Baina, Boualem Benatallah

With the disruption produced by extensive automation of automation due to advanced research in machine learning, and auto machine learning, even in programming language translation, the main goal of this paper is to discuss the following question "Is it still worth the cost to teach compiling in 2020 ?". Our paper defends the "Yes answer" within software engineering majors. The paper also shares the experience of teaching compiling techniques course best practices since more than 15 years, presents and evaluates this experience through Hortensias, a pedagogical compiling laboratory platform providing a language compiler and a virtual machine. Hortensias is a multilingual pedagogical platform for learning end teaching how to build compilers front and back-end. Hortensias language offers the possibility to the programmer to customise the compiler associativity management, visualise the intermediary representations of compiled code, or customise the optimisation management, and the error management language for international students communities. Hortensias offers the possibility to the beginner programmer to use a graphical user interface to program by clicking. Hortensias compiling pedagogy evaluation has been conducted through two surveys involving in a voluntarily basis engineering students and alumni during one week. It targeted two null hypothesis : the first null hypothesis supposes that compiling teaching is becoming outdated with regards to current curricula evolution, and the second null hypothesis supposes Hortensias compiling based pedagogy has no impact neither on understanding nor on implementing compilers and interpreters. During fifteen years of teaching compiler engineering, Hortensias was a wonderful pedagogic experiment either for teaching and for learning, since vulgarising abstract concepts becomes very easier for teachers, lectures follow a gamification-like approach, and students become efficient in delivering versions of their compiler software product in a fast pace.

[33] vixra:2006.0086 [pdf]

Towards Geographic, Demographic, and Climatic Hypotheses Exploration on Covid-19 Spread an Open Source Johns Hopkins University Time Series Normalisation, and Integration Layer

Karim Baina

Epidemiologist, Scientists, Statisticians, Historians, Data engineers and Data scientists are working on finding descriptive models and theories to explain COVID-19 expansion phenomena or on building analytics predictive models for learning the apex of COVID-19 confirmed cases, recovered cases, and deaths evolution time series curves. In CRISP-DM life cycle, 75% of time is consumed only by data preparation phase causing lot of pressures and stress on scientists and data scientists building machine learning models. This paper aims to help reducing data preparation efforts by presenting detailed data preparation repository with shell and python scripts for formatting, normalising, and integrating Johns Hopkins University COVID-19 daily data via three normalisation user stories applying data preparation at lexical, syntactic & semantics and pragmatic levels, and four integration user stories through geographic, demographic, climatic, and distance based similarity dimensions, among others. This paper and related open source repository will help data engineers and data scientists aiming to deliver results in an agile analytics life cycle adapted to critical COVID-19 context.

[34] vixra:2006.0043 [pdf]

New Bounds for the Stochastic Knapsack Problem

Ekesh Kumar

The knapsack problem is a problem in combinatorial optimization that seeks to maximize an objective function subject to the a weight constraint. We consider the stochastic variant of this problem in which $\mathbf{v}$ remains deterministic, but $\mathbf{x}$ is an $n$-dimensional vector drawn uniformly at random from $[0, 1]^{n}$. We establish a sufficient condition under which the summation-bound condition is almost surely satisfied. Furthermore, we discuss the implications of this result on the deterministic problem.

[35] vixra:2006.0040 [pdf]

Accounting for the Impact of Media Coverage on Polling

Arun Jose

This paper examines the feedback cycle of news ratings and electoral polling, and offers an algorithmic news algorithm to patch the problem. The cycle hinges on overexposure of a candidate to familiarize their name in otherwise apathetic voters, and therefore, the algorithm weighs down exposure on a logarithmic scale to only pass increasingly important news as coverage of a candidate inflates. This problem is a symptom of a deeper issue, and the solution proposes to patch it for the present, as well as offer insight into the machinations of the issue, and therefore aid its understanding

[36] vixra:2005.0292 [pdf]

The Ritva Blockchain: Enabling Confidential Transactions at Scale

Henri Aare, Peter Vitols

The distributed ledger technology has been widely hailed as the break- through technology. It has realised a great number of application scenarios, and improved workflow of many domains. Nonetheless, there remain a few major concerns in adopting and deploying the distributed ledger technology at scale. In this white paper, we tackle two of them, namely the through- put scalability and confidentiality protection for transactions. We learn from the existing body of research, and build a scale-out blockchain plat- form that champions privacy called RVChain. RVChain takes advantage of trusted execution environment to offer confidentiality protection for trans- actions, and scale the throughput of the network in proportion with the number of network participants by supporting parallel shadow chains.

[37] vixra:2003.0680 [pdf]

Nonuniform Linear Depth Circuits Decide Everything

Luowen Qian

In this work, we introduce the nonuniform complexity class LD, which stands for linear depth circuit. We also prove that LD = ALL, which is the complexity class that contains all decision languages, and therefore resolving all questions on this new complexity class. No further research is needed [Mun20]. In principle, this shows that anything can be computed via circuits in linear time, despite with (possibly undecidable) pre-computation and very inefficient advice, however, we note that exponential sized advice suffices to achieve ALL.

[38] vixra:2003.0039 [pdf]

Cross-Language Substitution Cipher: An Approach of the Voynich Manuscript

Elie Duthoo

The Voynich Manuscript (VMS) is an illustrated hand-written document carbon-dated in the early 15th century. This paper aims at providing a statistically robust method for translating voynichese, the language used in the VMS. We will first provide a set of statistical properties that can be applied to any tokenizable language with sub-token elements, apply it to Universal Dependencies (UD) dataset plus VMS (V101 transliteration) to see how it compares to the 157 corpora written in 90 different languages from UD. In a second phase we will provide an algorithm to map characters from one language to characters from another language, and we will apply it to the 158 corpora we have in our possession to measure its quality. We managed to attack more than 60% of UD corpora with this method though results for VMS don't appear to be usable.

[39] vixra:2002.0415 [pdf]

TF-PSST : A Spatio-Temporal Scheduling Approach for Multi-FPGA Parallel Heterogeneous Architecture in High Performance Computing

Aditya Das, Rishab Bhattacharyya, Pramit Ghosh

This work is a proposed architectural prototype in the field of High Performance Computing (HPC). Intel Altera DE4 and Altera DE5a - Net FPGA boards were used as functional processors in our designed system. We further explore Peripheral Component Interconnect (PCI) Express communication and amalgamate the transfer of data through PCIe to two different kinds of FPGAs at the same time using a proposed scheduling algorithm called TF-PSST : Time First Power Second Scheduling Technique. This significantly improves efficiency of the system by reducing execution time and because of the heterogeneous nature of the architectural prototype, we also found a way to increase the hardware resource utilisation.

[40] vixra:2002.0186 [pdf]

A Constraint Based K-Shortest Path Searching Algorithm for Software Defined Networking

Siddhant Ray

Software Defined Networking (SDN) is a concept in the area of computer networks in which the control plane and data plane of traditional computer networks are separated as opposed to the mechanism in conventional routers and switches. SDN aims to provide a central control mechanism in the network through a controller known as the SDN Controller. The Controller then makes use of various southbound Application Programming Interfaces (APIs) to connect to the physical switches located on the network and pass on the control information, which used to program the data plane. SDN Controller also exposes several northbound APIs to connect to the applications which can leverage the controller to orchestrate the network. The controller used with regard to this paper is the Open Network Operating System (ONOS), on which the algorithm in question is to be deployed. ONOS provides several APIs which is leveraged to connect the application to the network devices. The typical network path between any two endpoints is a shortest path fulfilling a set of constraints. The algorithm developed here is for optimal K-Shortest path searching in a given network satisfying specified constraints, controlled by ONOS.

[41] vixra:1911.0127 [pdf]

Robust Quaternion Estimation with Geometric Algebra

Mauricio Cele Lopez Belon

Robust methods for finding the best rotation aligning two sets of corresponding vectors are formulated in the linear algebra framework, using tools like the SVD for polar decomposition or QR for finding eigenvectors. Those are well established numerical algorithms which on the other hand are iterative and computationally expensive. Recently, closed form solutions has been proposed in the quaternion's framework, those methods are fast but they have singularities i.e., they completely fail on certain input data. In this paper we propose a robust attitude estimator based on a formulation of the problem in Geometric Algebra. We find the optimal eigenquaternion in closed form with high accuracy and with competitive performance respect to the fastest methods reported in literature.

[42] vixra:1910.0534 [pdf]

Problème du Voyageur de Commerce TSP

Ayoub Abraich

The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?" It is an NP-hard problem in combinatorial optimization, important in operations research and theoretical computer science.

[43] vixra:1908.0630 [pdf]

Determining Satisfiability of 3-Sat in Polynomial Time

Ortho Flint, Asanka Wickramasinghe, Jay Brasse, Chris Fowler

In this paper, we provide a polynomial time (and space), algorithm that determines satisfiability of 3-SAT. The complexity analysis for the algorithm takes into account no efficiency and yet provides a low enough bound, that efficient versions are practical with respect to today's hardware. We accompany this paper with a serial version of the algorithm without non-trivial efficiencies (link: polynomial3sat.org).

[44] vixra:1908.0562 [pdf]

A Performance Study of RDF Stores for Linked Sensor Data

Hoan Nguyen Mau Quoc, Martin Serrano, Han Nguyen Mau, John G. Breslin, Danh Le Phuoc

The ever-increasing amount of Internet of Things (IoT) data emanating from sensor and mobile devices is creating new capabilities and unprecedented economic opportunity for individuals, organizations and states. In comparison with traditional data sources, and in combination with other useful information sources, the data generated by sensors is also providing a meaningful spatio-temporal context. This spatio-temporal correlation feature turns the sensor data become even more valuables, especially for applications and services in Smart City, Smart Health-Care, Industry 4.0, etc. However, due to the heterogeneity and diversity of these data sources, their potential benefits will not be fully achieved if there are no suitable means to support interlinking and exchanging this kind of information. This challenge can be addressed by adopting the suite of technologies developed in the Semantic Web, such as Linked Data model and SPARQL. When using these technologies, and with respect to an application scenario which requires managing and querying a vast amount of sensor data, the task of selecting a suitable RDF engine that supports spatio-temporal RDF data is crucial. In this paper, we present our empirical studies of applying an RDF store for Linked Sensor Data. We propose an evaluation methodology and metrics that allow us to assess the readiness of an RDF store. An extensive performance comparison of the system-level aspects for a number of well-known RDF engines is also given. The results obtained can help to identify the gaps and shortcomings of current RDF stores and related technologies for managing sensor data which may be useful to others in their future implementation efforts.

[45] vixra:1908.0443 [pdf]

N-SAT in P with Non-Coherent Space Factorization

F.L.B. Périat

We know since Cook that Boolean satisfiability problems with at least three literals in each clauses are in NP and are NP-complete. With proving that 3-SAT (or more) is in P, corollary proves that P = NP. In this document, we explain how to find a SAT problem solution in polynomial complexity time.

[46] vixra:1908.0403 [pdf]

Unishox Guaranteed Compression for Short Unicode Strings

Arundale Ramanathan

A new hybrid encoding method is proposed, with which short unicode strings could be compressed using context aware pre-mapped codes and delta coding resulting in surprisingly good ratios.

[47] vixra:1902.0484 [pdf]

Shox96 - Guaranteed Compression for Short Strings

Arundale R., Charumathi A., Harsha N.

None of the lossless entropy encoding methods so far have addressed compression of small strings of arbitrary lengths. Although it appears inconsequent, space occupied by several independent small strings become significant in memory constrained environments. It is also significant when attempting efficient storage of such small strings in a database where while block compression is most efficient, retrieval efficiency could be improved if the strings are individually compressed. This paper formulates a hybrid encoding method with which small strings could be compressed using context aware static codes resulting in surprisingly good ratios and also be used in constrained environments like Arduino. We also go on to prove that this technique can guarantee compression for any English language sentence of minimum 3 words.

[48] vixra:1902.0184 [pdf]

Computation, Complexity, and P!=NP Proof

Hugh Wang

If we refer to a string for Turing machines as a guess and a rejectable substring a flaw, then all algorithms reject similarly flawed guesses flaw by flaw until they chance on an unflawed guess, settle with a flawed guess, or return the unflawed guesses. Deterministic algorithms therefore must identify all flaws before guessing flawlessly in the worst case. Time complexity is then bounded below by the order of the product of the least number of flaws to cover all flawed guesses and the least time to identify a flaw. Since there exists 3-SAT problems with an exponential number of flaws, 3-SAT is not in P, and therefore P!=NP.

[49] vixra:1901.0028 [pdf]

Algorithm for Evaluating Bivariate Kolmogorov Statistics in O(N Log N) Time.

Krystian Zawistowski

We propose an O(n log n) algorithm for evaluation of bivariate Kolmogorov- Smirnov statistics (n is number of samples). It offers few orders of magnitude of speedup over existing implementations for n > 100k samples of input. The algorithm is based on static Binary Search Trees and sweep algorithm. We share C++ implementation with Python bindings.

[50] vixra:1810.0096 [pdf]

Developing a New Cryptic Communication Protocol by Quantum Tunnelling over Classic Computer Logic

Mesut Kavak

I have been working for a time about basic laws of directing the universe [1,2]. It seems that the most basic and impressive principle which causes any physical phenomenon is the Uncertainty Principle of Heisenberg [3], that existence have any property because of the uncertainty. During this process, while I was thinking about conservation of information I noticed, that information cannot be lost; but at a point, it becomes completely unrecognizable according to us as there is no alternative. Any information and the information searched for become the same after a point relatively to us. The sensitivity increases forever but its loss. Each sensitivity level also has higher level; so actually an absolute protection seems possible.

[51] vixra:1809.0070 [pdf]

A Note on Rank Constrained Solutions to Linear Matrix Equations

Shravan Mohan

This preliminary note presents a heuristic for determining rank constrained solutions to linear matrix equations (LME). The method proposed here is based on minimizing a non- convex quadratic functional, which will hence-forth be termed as the Low-Rank-Functional (LRF). Although this method lacks a formal proof/comprehensive analysis, for example in terms of a probabilistic guarantee for converging to a solution, the proposed idea is intuitive and has been seen to perform well in simulations. To that end, many numerical examples are provided to corroborate the idea.

[52] vixra:1801.0274 [pdf]

A Note on Deutsch-Jozsa Quantum Algorithm

Zhengjun Cao, Lihua Liu

The Deutsch-Jozsa quantum algorithm is of great importance to modern quantum computation, but we find it is flawed. It confuses two unitary transformations: one is performed on a pure state, and the other on a superposition. In the past decades, no constructive specification on the unitary operator performed on involved superposition has been found, and no experimental test on the algorithm has been practically carried out. We think it needs more constructive specifications on the algorithm so as to confirm its correctness.

[53] vixra:1712.0407 [pdf]

Ellipsoid Method for Linear Programming Made Simple

Sanjeev Saxena

In this paper, ellipsoid method for linear programming is derived using only minimal knowledge of algebra and matrices. Unfortunately, most authors first describe the algorithm, then later prove its correctness, which requires a good knowledge of linear algebra.

[54] vixra:1712.0081 [pdf]

The Mathematics of Eating

Timothy W. Jones

Using simple arithmetic of the prime numbers, a model of nutritional content of food need and content is created. This model allows for the dynamics of natural languages to be specified as a game theoretical construct. The goal of this model is to evolve human culture.

[55] vixra:1710.0201 [pdf]

Automatic Intelligent Translation of Videos

Shivam Bansal

There are a lot of educational videos online which are in English and inaccessible to 80% population of the world. This paper presents a process to translate a video into another language by creating its transcript and using TTS to produce synthesized fragments of speech. It introduces an algorithm which synthesyses intelligent, synchronized, and easily understandable audio by combining those fragments of speech. This algorithm is also compared to an algorithm from another research paper on the basis of performance.

[56] vixra:1708.0373 [pdf]

Further Tractability Results for Fractional Hypertree Width

Wolfgang Fischl, Georg Gottlob, Reinhard Pichler

The authors have recently shown that recognizing low fractional hypertree-width (fhw) is NP-complete in the general case and that the problem becomes tractable if the hypergraphs under consideration have degree and intersection width bounded by a constant, i.e., every vertex is contained in only constantly many different edges and the intersection of two edges contains only constantly many vertices. In this article, we show that bounded degree alone suffices to ensure tractability.

[57] vixra:1612.0185 [pdf]

Computational Techniques for Efficient Conversion of Image Files from Area Detectors

Taha Sochi

Area detectors are used in many scientific and technological applications such as particle and radiation physics. Thanks to the recent technological developments, the radiation sources are becoming increasingly brighter and the detectors become faster and more efficient. The result is a sharp increase in the size of data collected in a typical experiment. This situation imposes a bottleneck on data processing capabilities, and could pose a real challenge to scientific research in certain areas. This article proposes a number of simple techniques to facilitate rapid and efficient extraction of data obtained from these detectors. These techniques are successfully implemented and tested in a computer program to deal with the extraction of X-ray diffraction patterns from EDF image files obtained from CCD detectors.

[58] vixra:1612.0179 [pdf]

Testing the Connectivity of Networks

Taha Sochi

In this article we discuss general strategies and computer algorithms to test the connectivity of unstructured networks which consist of a number of segments connected through randomly distributed nodes.

[59] vixra:1611.0352 [pdf]

Proof of Collatz' Conjecture

Robert Deloin

Collatz' conjecture (stated in 1937 by Collatz and also named Thwaites conjecture, or Syracuse, 3n+1 or oneness problem) can be described as follows: Take any positive whole number N. If N is even, divide it by 2. If it is odd, multiply it by 3 and add 1. Repeat this process to the result over and over again. Collatz' conjecture is the supposition that for any positive integer N, the sequence will invariably reach the value 1. The main contribution of this paper is to present a new approach to Collatz' conjecture. The key idea of this new approach is to clearly differentiate the role of the division by two and the role of what we will name here the jump: a = 3n + 1. With this approach, the proof of the conjecture is given as well as generalizations for jumps of the form qn + r and for jumps being polynomials of degree m >1.

[60] vixra:1607.0141 [pdf]

Kalman Folding 4: Streams and Observables

Brian Beckman

In Kalman Folding, Part 1, we present basic, static Kalman filtering as a functional fold, highlighting the unique advantages of this form for deploying test-hardened code verbatim in harsh, mission-critical environments. In that paper, all examples folded over arrays in memory for convenience and repeatability. That is an example of developing filters in a friendly environment. Here, we prototype a couple of less friendly environments and demonstrate exactly the same Kalman accumulator function at work. These less friendly environments are - lazy streams, where new observations are computed on demand but never fully realized in memory, thus not available for inspection in a debugger - asynchronous observables, where new observations are delivered at arbitrary times from an external source, thus not available for replay once consumed by the filter

[61] vixra:1607.0084 [pdf]

Kalman Folding 5: Non-Linear Models and the EKF

Brian Beckman

We exhibit a foldable Extended Kalman Filter that internally integrates non-linear equations of motion with a nested fold of generic integrators over lazy streams in constant memory. Functional form allows us to switch integrators easily and to diagnose filter divergence accurately, achieving orders of magnitude better speed than the source example from the literature. As with all Kalman folds, we can move the vetted code verbatim, without even recompilation, from the lab to the field.

[62] vixra:1607.0059 [pdf]

Kalman Folding 3: Derivations

Brian Beckman

In Kalman Folding, Part 1, we present basic, static Kalman filtering as a functional fold, highlighting the unique advantages of this form for deploying test-hardened code verbatim in harsh, mission-critical environments. The examples in that paper are all static, meaning that the states of the model do not depend on the independent variable, often physical time. Here, we present mathematical derivations of the basic, static filter. These are semi-formal sketches that leave many details to the reader, but highlight all important points that must be rigorously proved. These derivations have several novel arguments and we strive for much higher clarity and simplicity than is found in most treatments of the topic.

[63] vixra:1606.0328 [pdf]

Kalman Folding, Part 1 (Review Draft)

Brian Beckman

Kalman filtering is commonplace in engineering, but less familiar to software developers. It is the central tool for estimating states of a model, one observation at a time. It runs fast in constant memory. It is the mainstay of tracking and navigation, but it is equally applicable to econometrics, recommendations, control: any application where we update models over time. By writing a Kalman filter as a functional fold, we can test code in friendly environments and then deploy identical code with confidence in unfriendly environments. In friendly environments, data are deterministic, static, and present in memory. In unfriendly, real-world environments, data are unpredictable, dynamic, and arrive asynchronously. The flexibility to deploy exactly the code that was tested is especially important for numerical code like filters. Detecting, diagnosing and correcting numerical issues without repeatable data sequences is impractical. Once code is hardened, it can be critical to deploy exactly the same code, to the binary level, in production, because of numerical brittleness. Functional form makes it easy to test and deploy exactly the same code because it minimizes the coupling between code and environment.

[64] vixra:1605.0152 [pdf]

Sequential Ranking Under Random Semi-Bandit Feedback

Hossein Vahabi, Paul Lagree, Claire Vernade, Olivier Cappe

In many web applications, a recommendation is not a single item sug- gested to a user but a list of possibly interesting contents that may be ranked in some contexts. The combinatorial bandit problem has been studied quite extensively these last two years and many theoretical re- sults now exist : lower bounds on the regret or asymptotically optimal algorithms. However, because of the variety of situations that can be considered, results are designed to solve the problem for a specific reward structure such as the Cascade Model. The present work focuses on the problem of ranking items when the user is allowed to click on several items while scanning the list from top to bottom.

[65] vixra:1605.0109 [pdf]

Optimising Linear Seating Arrangements with a First-Principles Genetic Algorithm

Robert A. Martin

We discuss the problem of finding an optimum linear seating arrangement for a small social network, i.e. approaching the problem put forth in XKCD comic 173 – for a small social network, how can one determine the seating order in a row (e.g at the cinema) that corresponds to maximum enjoyment? We begin by improving the graphical notation of the network, and then propose a method through which the total enjoyment for a particular seating arrangement can be quantified. We then discuss genetic programming, and implement a first-principles genetic algorithm in python, in order to find an optimal arrangement. While the method did produce acceptable results, outputting an optimal arrangement for the XKCD network, it was noted that genetic algorithms may not be the best way to find such an arrangement. The results of this investigation may have tangible applications in the organising of social functions such as weddings.

[66] vixra:1603.0386 [pdf]

Communication Optimization of Parallel Applications in the Cloud

Carreño ED, Diener M, Cruz EHM, Navaux POA

One of the most important aspects that influences the performance of parallel applications is the speed of communication between their tasks. To optimize communication, tasks that exchange lots of data should be mapped to processing units that have a high network performance. This technique is called communication-aware task mapping and requires detailed information about the underlying network topology for an accurate mapping. Previous work on task mapping focuses on network clusters or shared memory architectures, in which the topology can be determined directly from the hardware environment. Cloud computing adds significant challenges to task mapping, since information about network topologies is not available to end users. Furthermore, the communication performance might change due to external factors, such as different usage patterns of other users. In this paper, we present a novel solution to perform communication- aware task mapping in the context of commercial cloud environments with multiple instances. Our proposal consists of a short profiling phase to discover the network topology and speed between cloud instances. The profiling can be executed before each application start as it causes only a negligible overhead. This information is then used together with the communication pattern of the parallel application to group tasks based on the amount of communication and to map groups with a lot of communication between them to cloud instances with a high network performance. In this way, application performance is increased, and data traffic between instances is reduced. We evaluated our proposal in a public cloud with a variety of MPI-based parallel benchmarks from the HPC domain, as well as a large scientific application. In the experiments, we observed substantial performance improvements (up to 11 times faster) compared to the default scheduling policies.

[67] vixra:1510.0473 [pdf]

A Still Simpler Way of Introducing Interior-Point Method for Linear Programming

Kurt Mehlhorn, Sanjeev Saxena

Linear programming is now included in algorithm undergraduate and postgraduate courses for computer science majors. We give a self-contained treatment of an interior-point method which is particularly tailored to the typical mathematical background of CS students. In particular, only limited knowledge of linear algebra and calculus is assumed.

[68] vixra:1510.0325 [pdf]

Multi-label Methods for Prediction with Sequential Data

Jesse Read, Luca Martino Jaakko Hollmén

The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data. If labels indices are considered as time indices, the problems can often be seen as equivalent. In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data. From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data. We carry out an empirical evaluation investigating performance on real-world sequential-prediction tasks: electricity demand, and route prediction. As well as showing that several popular multi-label algorithms are in fact easily applicable to sequencing tasks, our novel approaches, which benefit from a unified view of these areas, prove very competitive against established methods.

[69] vixra:1509.0104 [pdf]

A Formula Based Approach to Arithmetic Coding

Arundale Ramanathan

The Arithmetic Coding process involves re-calculation of intervals for each symbol that need to be encoded. This article discovers a formula based approach for calculating compressed codes and provides proof for deriving the formula from the usual approach. A spreadsheet is also provided for verification of the approach. Consequently, the similarities between Arithmetic Coding and Huffman coding are also visually illustrated.

[70] vixra:1506.0119 [pdf]

Viterbi Classifier Chains for Multi-Dimensional Learning

L. Martino, J. Read, F. Louzada

Multi-dimensional classification (also known variously as multi-target, multi-objective, and multi-output classification) is the supervised learning problem where an instance is associated to qualitative discrete variables (a.k.a. labels), rather than with a single class, as in traditional classification problems. Since these classes are often strongly correlated, modeling the dependencies between them allows MDC methods to improve their performance -- at the expense of an increased computational cost. A popular method for multi-label classification is the classifier chains (CC), in which the predictions of individual classifiers are cascaded along a chain, thus taking into account inter-label dependencies. Different variant of CC methods have been introduced, and many of them perform very competitively across a wide range of benchmark datasets. However, scalability limitations become apparent on larger datasets when modeling a fully-cascaded chain. In this work, we present an alternative model structure among the labels, such that the Bayesian optimal inference is then computationally feasible. The inference is efficiently performed using a Viterbi-type algorithm. As an additional contribution to the literature we analyze the relative advantages and interaction of three aspects of classifier chain design with regard to predictive performance versus efficiency: finding a good chain structure vs.a random structure, carrying out complete inference vs. approximate or greedy inference, and a linear vs. non-linear base classifier. We show that our Viterbi CC can perform best on a range of real-world datasets.

[71] vixra:1503.0218 [pdf]

Fusion D'images Par la Théorie de Dezert-Smarandache (DSmT) en Vue D'applications en Télédétection (Thèse Doctorale)

Azeddine Elhassouny

Thèse dirigée par Pr. Driss Mammass, préparée au Laboratoire Image et Reconnaissance de Formes-Systèmes Intelligents et Communicants IRF-SIC, soutenue le 22 juin 2013, Agadir, Maroc. L'objectif de cette thèse est de fournir à la télédétection des outils automatiques de la classification et de la détection des changements d'occupation du sol utiles à plusieurs fins, dans ce cadre, nous avons développé deux méthodes générales de fusion utilisées pour la classification des images et la détection des changements utilisant conjointement l'information spatiale obtenue par la classification supervisée ICM et la théorie de Dezert-Smarandache (DSmT) avec des nouvelles règles de décision pour surmonter les limites inhérentes des règles de décision existantes dans la littérature. L'ensemble des programmes de cette thèse ont été implémentés avec MATLAB et les prétraitements et visualisation des résultats ont été réalisés sous ENVI 4.0, ceci a permis d'effectuer une validation des résultats avec précision et dans des cas concrets. Les deux approches sont évaluées sur des images LANDSAT ETM+ et FORMOSAT-2 et les résultats sont prometteurs. The main objective of this thesis is to provide automatic remote sensing tools of classification and of change detection of land cover for many purposes, in this context, we have developed two general methods used for classification fusion images and change detection using joint spatial information obtained by supervised classification ICM and Dezert-Smarandache theory (DSmT) with new decision rules to overcome the limitations of decision rules existing in the literature. All programs of this thesis have been implemented in MATLAB and C language and preprocessing and visualization of results were achieved in ENVI 4.0, this has allowed for a validation of the results accurately and in concrete cases. Both approaches are evaluated on LANDSAT ETM + and FORMOSAT-2 and the results are promising.

[72] vixra:1502.0231 [pdf]

A Swot Analysis of Instruction Sequence Theory

Jan A. Bergstra

After 15 years of development of instruction sequence theory (IST) writing a SWOT analysis about that project is long overdue. The paper provides a comprehensive SWOT analysis of IST based on a recent proposal concerning the terminology for the theory and applications of instruction sequences.

[73] vixra:1502.0228 [pdf]

A Terminology for Instruction Sequencing

Jan A. Bergstra

Instruction sequences play a key role in computing and have the potential of becoming more important in the conceptual development of informatics in addition to their existing role in computer technology and machine architectures. After 15 years of development of instruction sequence theory a more robust and outreaching terminology is needed for it which may support further development. Instruction sequencing is the central concept around which a new family of terms and phrases is developed.

[74] vixra:1502.0003 [pdf]

An Interesting Perspective to the P Versus NP Problem

Wenming Zhang

We discuss the P versus NP problem from the perspective of addition operation about polynomial functions. Two contradictory propositions for the addition operation are presented. With the proposition that the sum of k (k<=n+1) polynomial functions on n always yields a polynomial function, we prove that P=NP, considering the maximum clique problem. And with the proposition that the sum of k polynomial functions may yield an exponential function, we prove that P!=NP by constructing an abstract decision problem. Furthermore, we conclude that P=NP and P!=NP if and only if the above propositions hold, respectively.

[75] vixra:1501.0203 [pdf]

A Nopreprint on Algebraic Algorithmics: Paraconsistency as an Afterthought

Jan A. Bergstra

Algebraic Algorithmics, a phase taken from G.E. Tseitlin, is given a specific interpretation for the line of work in the tradition of the program algebra and thread algebra. An application to algebraic algorithmics of preservationist paraconsistent reasoning in the style of chunk and permeate is suggested and discussed. In the first appendix nopreprint is coined as a tag for new a publication category, and a rationale for its use is given. In a second appendix some rationale is provided for the affiliation from which the paper is written and posted.

[76] vixra:1410.0193 [pdf]

High Availability-Aware Optimization Digest for Applications Deployment in Cloud

Manar Jammal, Ali Kanso, Abdallah Shami

Cloud computing is continuously growing as a business model for hosting information and communication technology applications. Although on-demand resource consumption and faster deployment time make this model appealing for the enterprise, other concerns arise regarding the quality of service offered by the cloud. One major concern is the high availability of applications hosted in the cloud. This paper demonstrates the tremendous effect that the placement strategy for virtual machines hosting applications has on the high availability of the services provided by these applications. In addition, a novel scheduling technique is presented that takes into consideration the interdependencies between applications components and other constraints such as communication delay tolerance and resource utilization. The problem is formulated as a linear programming multi-constraint optimization model. The evaluation results demonstrate that the proposed solution improves the availability of the scheduled components compared to OpenStack Nova scheduler.

[77] vixra:1409.0071 [pdf]

Anima: Adaptive Personalized Software Keyboard

Panos Sakkos, Dimitrios Kotsakos, Ioannis Katakis, Dimitrios Gunopoulos

We present a Software Keyboard for smart touchscreen de- vices that learns its owner’s unique dictionary in order to produce personalized typing predictions. The learning pro- cess is accelerated by analysing user’s past typed communi- cation. Moreover, personal temporal user behaviour is cap- tured and exploited in the prediction engine. Computational and storage issues are addressed by dynamically forgetting words that the user no longer types. A prototype implemen- tation is available at Google Play Store.

[78] vixra:1408.0048 [pdf]

Trillion by Trillion Matrix Inverse: not Actually that Crazy

Alexander Fix, Misha Collins

A trillion by trillion matrix is almost unimaginably huge, and finding its inverse seems to be a truly im- possible task. However, given current trends in com- puting, it may actually be possible to achieve such a task around 2040 — if we were willing to devote the the entirety of human computing resources to a single computation. Why would we want to do this? Perhaps, as Mallory said of Everest: “Because it’s there”.

[79] vixra:1407.0010 [pdf]

A Lower Bound of 2^n Conditional Jumps for Boolean Satisfiability on A Random Access Machine

Samuel C. Hsieh

We establish a lower bound of 2^n conditional jumps for deciding the satisfiability of the conjunction of any two Boolean formulas from a set called a full representation of Boolean functions of n variables - a set containing a Boolean formula to represent each Boolean function of n variables. The contradiction proof first assumes that there exists a RAM program that correctly decides the satisfiability of the conjunction of any two Boolean formulas from such a set by following an execution path that includes fewer than 2^n conditional jumps. By using multiple runs of this program, with one run for each Boolean function of n variables, the proof derives a contradiction by showing that this program is unable to correctly decide the satisfiability of the conjunction of at least one pair of Boolean formulas from a full representation of n-variable Boolean functions if the program executes fewer than 2^n conditional jumps. This lower bound of 2^n conditional jumps holds for any full representation of Boolean functions of n variables, even if a full representation consists solely of minimized Boolean formulas derived by a Boolean minimization method. We discuss why the lower bound fails to hold for satisfiability of certain restricted formulas, such as 2CNF satisfiability, XOR-SAT, and HORN-SAT. We also relate the lower bound to 3CNF satisfiability.

[80] vixra:1306.0128 [pdf]

Analysis of Point Clouds Using Conformal Geometric Algebra

Dietmar Hildenbrand, Eckhard Hitzer

This paper presents some basics for the analysis of point clouds using the geometrically intuitive mathematical framework of conformal geometric algebra. In this framework it is easy to compute with osculating circles for the description of local curvature. Also methods for the fitting of spheres as well as bounding spheres are presented. In a nutshell, this paper provides a starting point for shape analysis based on this new, geometrically intuitive and promising technology. Keywords: geometric algebra, geometric computing, point clouds, osculating circle, fitting of spheres, bounding spheres.

[81] vixra:1303.0045 [pdf]

On the P-Untestability of Combinational Faults

Suresh k Devanathan, Michael L Bushnell

Abstract—We describe the p-untestability of faults in combinational circuits. They are similar to redundant faults, but are defined probabilistically. P-untestable fault is a fault that is not detectable after N random pattern simulation or a fault, FAN either proves to be redundant or aborts after K backtracks. We chose N to be about 1000000 and K to be about 1000. We provide a p-untestability detectability algorithm that works in about 85% of the cases, with average of about 14% false negatives. The algorithm is a simple hack to FAN and uses structural information and can be easily implemented. The algorithm does not prove redundancy completely but establishes a fault as a probabilistically redundant, meaning a fault with low probability of detection or no detection.

[82] vixra:1303.0043 [pdf]

Pror Compaction Scheme for Larger Circuits and Longer Vectors with Deterministic Atpg

Suresh k Devanathan, Michael L Bushnell

Abstract—Reverse order restoration ROR techniques have found great use in sequential automatic test pattern generation ATPG, esp. spectral and perturbation-based ATPG. This paper deals with improving ROR for that purpose. We introduce parallel-fault multipass 2-level polynomial reverse order restoration PROR algorithms with constant complexity of the form H(n)G(n) + c where H(n) is the number of vectors to be released this iteration and G(n) is the attenuation factor. In PROR H(n) = nk and G(n) here is 1

[83] vixra:1303.0037 [pdf]

Fquantum :A Quantum Computing Fault Simulator

Suresh kumar Devanathan, Michael L Bushnell

Abstract—We like to introduce fQuantum, a Quantum Computing Fault Simulator and new quantum computing fault model based on Hadamard, PauliX, PauliY and PauliZ gates, and the traditional stuckat-1 SA1 and stuck-at-0 SA0 faults. We had close to 100% fault coverage on most circuits. The problem with lower coverage comes from function gates, which we will deal with, in future versions of this paper.

[84] vixra:1212.0109 [pdf]

Polynomial 3-SAT-Solver

Matthias Mueller

Five different polynomial 3-SAT algorithms named "Algorithm A/B/C/D[M]/E" are provided: <table border='0'> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v1: "Algorithm A": </td> <td>Obsolete, incorrect. My first attempt. In retrospect, this Algorithm A is just a bounded-width logical resolution and thus no polynomial 3-SAT solver.</td> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v2: "Algorithm B": </td> <td>Obsolete. Never failed for millions of test runs, but I'm not sure if this Algorithm B is really correct. Some time after publishing, I found out that the algorithm keeps too many tuples enabled, for some SAT CNFs. Mr. M. Prunescu's paper 'About a surprizing computer program of Matthias Mueller' is about this Algorithm B.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v3: "Algorithm C": </td> <td>Obsolete, incorrect. A trial to replace the tuples of Algorithm B by single clauses. Fails for some SAT CNF types (e.g. for large pigeon hole problem CNFs).</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v4: "Algorithm D‑1.0": </td> <td>Obsolete. Never failed, solved absolutely everything that I ever inputted, detected even the pigeon hole problem PHP6 as UNSAT, what would not have been possible if this Algorithm D was just a resolution solver. The problem is that I did not understand the Algorithm D completely (I found it through trial and error, noticed it did never fail). The proof of correctness might not be completely satisfying.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v5: "Algorithm D‑1.1": </td> <td>Obsolete. Very same algorithm as v4, but better explained and with a re-written, completely new part of the proof of correctness.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v6: "Algorithm D‑1.1": </td> <td>Obsolete. Some helpful improvements (compared to v5).</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v7: "Algorithm D‑1.2": </td> <td>Obsolete. Paper from May 22nd, 2016.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v8: "Algorithm D‑1.3": </td> <td>Obsolete. Parts of the proof of correctness have been replaced by a completely re-written, more detailed variant.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>v9: "Algorithm D‑1.3": </td> <td>Obsolete. Another part of the proof of correctness has been made more detailed.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>vA: "Algorithm DM‑1.0": </td> <td>Obsolete. Completely re-written document. Still describes algorithm D, but as short as possible and in mathematical notation.</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>vB: "Algorithm DM‑2.0": </td> <td>Obsolete. Heavily revised and extended vA document. A much more precise notation is used (compared to vA) and most formulas are now comprehensively commented and explained. Might be easier to understand for learned readers, while others prefer v9 (D-1.3).</td> </tr> <tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>vC: "Algorithm DM‑2.1": </td> <td>Obsolete. Compared to DM-2.0, three substantial extensions have been added: Why the algorithm does NOT have the restrictions of a logical resolution, that the polynomial solver correctly solved the pigeon hole problem for n=6 ("PHP6"), and what the ideas behind the three rules of the polynomial algorithm are.</td> </tr> <td style='width:1px;white-space:nowrap;vertical-align:top;'>vD: "Algorithm E‑1.0": </td> <td>Please read this paper! This is the newest polynomial solving algorithm, called Algorithm E. Its source code is extremely simple, the main part consists of merely 4 nested loops and does not use tuples of 3-SAT clauses any more to save its internal state but merely single 3-SAT clauses. The first time it is explained in detail how the polynomial algorithm comes about, this time the polynomial algorithm is most widely understood. The algorithm has been extensively tested, the related code and binaries can be downloaded (see below). The algorithm and related paper should be much easier understandable than any of my previous works. </tr> </table> You can also visit www.louis-coder.com/Polynomial_3-SAT_Solver/Polynomial_3-SAT_Solver.htm for latest updates and news, and the Zip file containing the Windows/Linux demo implementation.

[85] vixra:1109.0036 [pdf]

An OpenCL Fast Fourier Transformation Implementation Strategy

Sven De Smet

This paper describes an implementation strategy in preparation for an implementation of an OpenCL FFT. The two most essential factors (memory bandwidth and locality) that are crucial to obtain high performance on a GPU for an FFT implementation are highlighted. Theoretical upper bounds for performance in terms of the locality factor are derived. An implementation strategy is proposed that takes these factors into consideration so that the resulting implementation has the potential to achieve high performance.

[86] vixra:1108.0028 [pdf]

One More Step Towards Generalized Graph-Based Weakly Relational Domains

Sven de Smet

This paper proposes to extend graph-based weakly relational domains to a generalized relational context. Using a new definition of coherence, we show that the definition of a normal form for this domain is simplified. A transitive closure algorithm for combined relations is constructed and a proof of its correctness is given. Using the observed similarity between transitive closure of a combined relation and the normal form closure of a graph-based weakly relational domain, we extract a mathematical property that a relational abstract domain must satisfy in order to allow us to use an algorithm with the same form as the transitive closure algorithm to compute the normal form of a graph-based weakly relational domain.

[87] vixra:1106.0033 [pdf]

Towards a Group Theoretical Model for Algorithm Optimization

Sven de Smet

This paper proposes to use a group theoretical model for the optimization of algorithms. We first investigate some of the fundamental properties that are required in order to allow the optimization of parallelism and communication. Next, we explore how a group theoretical model of computations can satisfy these requirements. As an application example, we demonstrate how this group theoretical model can uncover new optimization possibilities in the polyhedral model.

[88] vixra:1106.0023 [pdf]

A Lattice Model for the Optimization of Communication in Parallel Algorithms

Sven de Smet

This paper describes a unified model for the optimization of communication in parallel algorithms and architectures. Based on a property that provides a unified view of locality in space and time, an algorithm is constructed that generates a parallel architecture that is optimized for communication for a given computation. The optimization algorithm is constructed using the lattice algebraic properties of congruence relations and is therefor applicable in a general context. An application to a bio-informatics algorithm demonstrates the value of the model and optimization algorithm.

[89] vixra:1106.0022 [pdf]

Parallelisation with a Grid Abstraction

Sven de Smet

This paper describes a new technique for automatic parallelisation in the Z-polyhedral model. The presented technique is applicable to arbitrarily nested loopnests with iteration spaces that can be represented as unions of Z-polyhedra and affine modular data-access functions. The technique partitions both iteration and data spaces of the computation. The maximal amount of parallelism that can be represented using grid partitions is extracted.

viXra One

Data Structures and Algorithms