Artificial Intelligence

[1] vixra:2411.0116 [pdf]
Creating Hierarchical Dispositions of Needs in an Agent
We present a novel method for learning hierarchical abstractions that prioritize competing objectives, leading to improved global expected rewards. Our approach employs a secondary rewarding agent with multiple scalar outputs, each associated with a distinct level of abstraction. The traditional agent then learns to maximize these outputs in a hierarchical manner, conditioning each level on the maximization of the preceding level. We derive an equation that orders these scalar values and the global reward by priority, inducing a hierarchy of needs that informs goal formation. Experimental results on the Pendulum-v1 environment demonstrate superior performance compared to a baseline implementation, achieving state-of-the-art results.
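For orientation, a minimal sketch of one way such a priority ordering can be realized: lower-priority levels only contribute reward once higher-priority levels are near their best observed value. The gating rule, tolerance, and function names below are illustrative assumptions, not the paper's derived equation.

    def hierarchical_reward(levels, tolerance=0.05):
        """Combine per-level scalar rewards lexicographically.

        `levels` is ordered from highest priority (most basic need) to
        lowest; each entry is (current reward, best reward seen so far).
        A level only contributes once every higher-priority level is
        within `tolerance` of its running best -- an assumed stand-in
        for the paper's priority-ordering equation.
        """
        total, gate = 0.0, 1.0
        for reward, best in levels:
            total += gate * reward
            # Close the gate for lower levels until this level is
            # nearly maximized.
            gate *= float(reward >= (1.0 - tolerance) * best)
        return total

    # Example: level 0 is not yet satisfied, so level 1 is ignored.
    print(hierarchical_reward([(0.4, 1.0), (0.9, 1.0)]))  # -> 0.4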
[2] vixra:2411.0102 [pdf]
Cost-Per-Byte Principle in Generative AI
Generative AI models are increasingly used across various modalities, including text, images, audio, and video. Estimating the computational cost of generating content is crucial for optimizing performance and resource allocation. This paper introduces the Cost-Per-Byte Principle: C = T × I, a universal law that relates the cost of content generation to per-byte generation time and per-second inference cost. We derive the per-byte generation time analytically based on the model's computational requirements (FLOPs) and the hardware's performance (FLOPs per second). By establishing mappings between bytes and different content units (characters, pixels, samples, frames), we provide a modality-agnostic framework for cost estimation. We present a rigorous proof of the principle's validity and apply it to estimate the costs of current popular models, using publicly available evidence to verify the accuracy and usefulness of this principle.
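A worked example of the principle as stated in the abstract, C = T × I, with T derived from FLOPs divided by effective hardware throughput. Every number below (model size, bytes per token, utilization, rental price) is an assumption chosen for illustration, not a figure from the paper.

    # Illustrative cost estimate with the Cost-Per-Byte Principle C = T * I.
    flops_per_token = 2 * 7e9          # ~2*N FLOPs/token for a 7B-parameter decoder
    bytes_per_token = 4                # rough English-text average
    hardware_flops = 312e12            # A100 peak dense FP16 throughput
    utilization = 0.3                  # fraction of peak actually achieved
    inference_cost_per_s = 1.5 / 3600  # $1.50/hour rented GPU, per second

    flops_per_byte = flops_per_token / bytes_per_token
    T = flops_per_byte / (hardware_flops * utilization)  # seconds per byte
    C = T * inference_cost_per_s                         # dollars per byte

    print(f"T = {T:.3e} s/byte, C = ${C:.3e}/byte, ${C * 1e6:.4f}/MB")

With these assumptions the script prints roughly T ≈ 3.7e-05 s/byte and C ≈ $0.016 per megabyte of generated text.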
[3] vixra:2411.0083 [pdf]
Understanding When the Correlations Are Causation
In this paper, we present, for Gaussian multiple causation, a theorem relating causation to correlations. This theorem is based on another equality, which will also be proven.
[4] vixra:2410.0156 [pdf]
Exploring Emergent Qualia in Artificial and Biological Systems: A Comparative Analysis
Qualia—the subjective experience of perception—has long been considered unique to biological consciousness. However, with the advent of sophisticated Artificial Intelligence (AI) models, the question arises: could complex AI architectures also manifest a form of qualia, albeit different in nature from biological systems? This paper explores the hypothesis that both biological and artificial systems may generate unique moments of consciousness or qualia through information processing. By examining theories of consciousness, such as emergentism and Integrated Information Theory (IIT), this paper discusses the potential for qualia to arise as an emergent phenomenon in systems that handle complex information processing. Additionally, the ethical implications of AI-generated qualia are explored, alongside a discussion of what this means for the future of AI and philosophy of mind.
[5] vixra:2410.0105 [pdf]
LLM Survey Paper Landscape: Predicting Taxonomies
In this study, we analyze a dataset of survey papers on Large Language Models (LLMs) published over the last three years to gain insights into the current trends surrounding LLMs. Primarily, we analyze the author landscape and the effectiveness of predicting the taxonomies of the surveys from their title, summary, and listed categories. We find that the number of surveys released has increased drastically over the last three years. Most surveys have around 8 authors, but each author usually appears on only one survey, indicating that the research is spread widely among those in the field. Finally, our attempt to predict taxonomies with the machine learning methods we applied was unsuccessful; the attempts nonetheless yield valuable insights about the dataset.
[6] vixra:2410.0049 [pdf]
Causation Without Correlations for the Gaussian Signals
In this paper, we will show, in a context of Gaussian signals, what to do to obtain a causal relationship between an output variable and three input variables without obtaining any correlation between the output variable and the input variables: causation without correlations for Gaussian signals.
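A quick Monte Carlo illustration of the phenomenon. This is a standard textbook construction, not necessarily the paper's: the output is a deterministic function of the three Gaussian inputs (pure causation), yet every pairwise Pearson correlation vanishes because the relevant mixed moments are odd in each input. Note that the output itself is not Gaussian here, whereas the paper works entirely with Gaussian signals.

    import numpy as np

    rng = np.random.default_rng(0)
    x1, x2, x3 = rng.standard_normal((3, 1_000_000))
    y = x1 * x2 * x3  # fully caused by x1, x2, x3

    for i, x in enumerate((x1, x2, x3), start=1):
        print(f"corr(y, x{i}) = {np.corrcoef(y, x)[0, 1]:+.4f}")  # all ~ 0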
[7] vixra:2410.0037 [pdf]
How Can We Utilize Natural Language Processing to Identify Bias in Job Descriptions?
In the pursuit of creating fairer hiring practices and promoting workforce diversity, this research project explores the potential of Natural Language Processing (NLP) techniques to identify and rectify biases in job descriptions. The language used in job postings can inadvertently perpetuate biases and deter applicants from underrepresented backgrounds. Leveraging cutting-edge NLP methods, this study aims to automatically detect and address biases, fostering a more inclusive recruitment process. By examining the biases within job descriptions, organizations can attract a more diverse range of applicants and cultivate an inclusive workplace culture. Through the application of NLP, this research seeks to drive positive change in recruitment practices, ultimately contributing to a more equitable job market.
[8] vixra:2410.0022 [pdf]
The Babel Effect: Multilingual Performance Discrepancies in LLMs
Large Language Models (LLMs) like GPT-4 and mBERT have revolutionized natural language processing (NLP) by providing multilingual capabilities, making it possible to develop models that handle diverse linguistic inputs across various languages. However, despite these advances, there remains a noticeable performance gap between how well these models perform in high-resource languages such as English and low-resource languages such as Nepali or Malagasy. We term this phenomenon the "Babel Effect," highlighting the disproportionate performance that arises from differences in resource availability across languages. This paper aims to explore the root causes of these performance discrepancies in LLMs, focusing on the underlying challenges in tokenization, training, and data scarcity. We utilize cross-lingual benchmarks, such as XGLUE and TyDiQA, to quantify these performance variations and examine them in detail. Furthermore, we propose solutions, including enhancing tokenization strategies, employing data augmentation techniques, and refining fine-tuning methods. The paper concludes with a discussion on how these improvements can mitigate the Babel Effect and lead to more equitable language modeling across diverse linguistic contexts.
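A small probe of one tokenization-side cause of the Babel Effect: subword tokenizers trained mostly on high-resource text fragment low-resource languages into many more tokens per word (higher "fertility"), inflating sequence lengths and cost. The sketch below requires the `transformers` package; the sample sentences are illustrative and not from the paper's benchmarks.

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

    samples = {
        "English": "The weather is very nice today.",
        "Nepali": "आज मौसम धेरै राम्रो छ।",
    }
    for lang, text in samples.items():
        pieces = tok.tokenize(text)
        fertility = len(pieces) / max(len(text.split()), 1)
        print(f"{lang}: {len(pieces)} tokens, {fertility:.2f} tokens/word")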
[9] vixra:2409.0063 [pdf]
Training Classifier Gradient Penalty GAN with Codebook Architecture
Classifier gradient penalty GAN is a GAN proposed to perform self-supervised class-conditional data generation and clustering on unlabeled datasets. The classifier gradient penalty GAN's generator takes a continuous latent vector and a categorical latent vector as input and generates a class-conditional data point corresponding to the categorical latent vector. In this paper, we propose to leverage the codebook architecture to improve the performance of classifier gradient penalty GAN. In the proposed architecture, the generator takes the page vector of the codebook corresponding to the index of the categorical latent vector, instead of taking the one-hot categorical latent vector directly. Unlike the codebook used in generative models with vector quantization, the codebook of the proposed architecture is not embedded with the encoder. Instead, the codebook is simply trainable and updated via generator loss like trainable parameters in the generator. The proposed architecture improved the quality of the generated data, class-conditional data generation performance, and clustering performance of the classifier gradient penalty GAN.
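A minimal PyTorch sketch of the input path described above: instead of feeding a one-hot categorical latent, the generator looks up a trainable "page" vector from a codebook indexed by the categorical latent. Unlike VQ-style models, the codebook is not tied to an encoder; it is simply a trainable table updated by the generator loss. Sizes and layer shapes are assumptions.

    import torch
    import torch.nn as nn

    n_classes, page_dim, z_dim = 10, 64, 128

    codebook = nn.Embedding(n_classes, page_dim)      # trainable page vectors
    generator_in = nn.Linear(z_dim + page_dim, 256)   # first generator layer

    z = torch.randn(32, z_dim)                        # continuous latent
    c = torch.randint(0, n_classes, (32,))            # categorical latent index
    h = generator_in(torch.cat([z, codebook(c)], dim=1))
    print(h.shape)  # torch.Size([32, 256])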
[10] vixra:2409.0047 [pdf]
Variational Autoencoder Without Kullback-Leibler Divergence (BSvarautonet)
In this paper, I propose a new Boolean Structured Variational Autoencoder Deep Learning Network (BSvarautonet), built on top of BSautonet and based on the concept of monotone multi-layer Boolean algebra. The Kullback-Leibler (KL) divergence used in the traditional variational autoencoder suffers from convergence problems and numerical instabilities. Due to the Boolean Structured design of BSautonet, the bottleneck latent-space embedding is naturally distributed as a multivariate Gaussian. By applying a whitening normalization to the latent space, we transform it into a unit Gaussian distribution. Analysis of the datapoints in latent space and of generated MNIST digit images shows that the model has all the properties of a variational autoencoder. The BS autoencoder is a masked-noise denoising model, so it can act like a diffusion model, incrementally generating a digit image from a noisy one through repeated applications of the autoencoder model.
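A minimal sketch of the whitening step described above: mapping latent embeddings with an arbitrary multivariate Gaussian distribution to an approximately unit Gaussian. PCA whitening is used here as an assumed concrete choice; the paper's exact normalization may differ.

    import numpy as np

    def whiten(z, eps=1e-5):
        z_centered = z - z.mean(axis=0)
        cov = np.cov(z_centered, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        w = eigvecs / np.sqrt(eigvals + eps)   # whitening transform
        return z_centered @ w

    rng = np.random.default_rng(0)
    z = rng.standard_normal((10_000, 8)) @ rng.standard_normal((8, 8))
    zw = whiten(z)
    print(np.round(np.cov(zw, rowvar=False), 2))  # ~ identity matrix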
[11] vixra:2408.0130 [pdf]
Bayesian Networks, Kullback-Leibler and Topology
In this paper, I propose a topology that makes it possible to measure a neighborhood for Bayesian networks. This topology corresponds to a Kullback-Leibler distance ratio and gives the distance between a current Bayesian network and a Bayesian network having a chain rule. This topology applied to Bayesian networks is normalized and therefore varies from 0 to 1: the value 0 corresponds to a Bayesian network with a chain rule, and the value 1 to a Bayesian network without edges.
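The abstract does not reproduce the ratio itself. One normalized form consistent with the stated endpoints, offered purely as an assumption rather than the paper's definition, would be

    d(B) = \frac{D_{\mathrm{KL}}(P_{\mathrm{chain}} \,\|\, P_B)}
                 {D_{\mathrm{KL}}(P_{\mathrm{chain}} \,\|\, P_{\emptyset})}
           \in [0, 1],

where P_chain is the full chain-rule factorization (the unrestricted joint), P_B the distribution encoded by the current network B, and P_∅ the fully factorized, edgeless network; d vanishes when B reproduces the joint and reaches 1 when B drops all edges.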
[12] vixra:2408.0124 [pdf]
Abstractive Text Summarisation Using T5 Transformer Architecture with Analysis
Nowadays, text summarization has become important as the amount of text data available online grows at an exponential rate. Most text classification systems require going through a huge amount of data. In general, producing exact and meaningful summaries of big texts is a time-consuming endeavour. Hence, generating abstract summaries which retain the key information of the data and using them to train machine learning models will make these models space- and time-efficient. Abstractive text summarization has been successful in moving from linear models to nonlinear neural network models using sparse models [1]. This success comes from the application of deep learning models to natural language processing tasks, where these models are capable of modeling the interrelating patterns in data without hand-crafted features. The Text-to-Text Transfer Transformer (T5) approach was used to investigate the text summarization problem, and the results showed that the transfer-learning-based model performed significantly better for abstractive text summarization than the sequence-to-sequence recurrent model.
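A minimal abstractive-summarization sketch with a pretrained T5 checkpoint via Hugging Face Transformers, illustrating the approach rather than the authors' exact fine-tuning setup; checkpoint choice and generation settings are assumptions.

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    text = "summarize: " + (
        "Text summarization has become important as the amount of text "
        "data available online grows at an exponential rate. Producing "
        "exact and meaningful summaries of big texts is time-consuming."
    )
    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       max_length=512)
    ids = model.generate(**inputs, max_length=40, num_beams=4,
                         early_stopping=True)
    print(tokenizer.decode(ids[0], skip_special_tokens=True))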
[13] vixra:2408.0118 [pdf]
Graph Neural Network for Molecular Structure: Application in HIV Inhibitor Molecule Prediction
The application of Graph Neural Networks (GNNs) in computational chemistry provides a powerful approach to modeling and predicting the properties of molecular compounds. GNNs represent atoms as nodes and bonds as edges, capturing the complex interactions within molecular graphs. This approach offers a robust method for predicting chemical properties, including molecular stability, reactivity, and toxicity. In this paper, we explore various GNN architectures and their ability to generalize across different molecular datasets, such as QM9 and MoleculeNet. As a specific application, we propose a novel framework that utilizes GNNs to predict and identify potential HIV inhibitor molecules by analyzing their graph-based representations. This research aims to contribute to the discovery and design of effective HIV inhibitors, offering a promising direction for future antiviral drug development.
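A schematic PyTorch Geometric model for graph-level molecular property prediction, such as the binary "HIV inhibitor" label in MoleculeNet's HIV dataset. Layer sizes, depth, and the choice of GCNConv are illustrative assumptions, not the paper's architecture.

    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import GCNConv, global_mean_pool

    class MolGNN(torch.nn.Module):
        def __init__(self, num_node_features, hidden=64):
            super().__init__()
            self.conv1 = GCNConv(num_node_features, hidden)  # atoms = nodes
            self.conv2 = GCNConv(hidden, hidden)             # bonds = edges
            self.head = torch.nn.Linear(hidden, 1)           # inhibitor logit

        def forward(self, x, edge_index, batch):
            x = F.relu(self.conv1(x, edge_index))
            x = F.relu(self.conv2(x, edge_index))
            x = global_mean_pool(x, batch)   # per-molecule readout
            return self.head(x)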
[14] vixra:2408.0038 [pdf]
Difference Between the Notion of Causation and Pearson Correlation in a Multivariate Gaussian Context
In a Gaussian multivariate context, we describe the steps to follow to differentiate the notion of Pearson correlation from that of causality. This paper includes numerical examples clearly showing the difference between the two notions.
[15] vixra:2407.0152 [pdf]
Directional Stock Price Forecasting Based on Quantitative Value Investing Principles for Loss Averted Bogle-Head Investing using Various Machine Learning Algorithms
Boglehead investing, founded on the principles of John C. Bogle, is a classic time-tested, long-term, low-cost, passive investment strategy. This paper uses various machine learning methods and fundamental stock data to predict whether or not a stock will incur negative returns next year, and suggests a loss-averted Boglehead strategy of investing in all stocks which are expected not to give negative returns over the next year. Results reveal that XGBoost, out of the 44 models trained, has the highest classification metrics for this task. Furthermore, this paper uses various machine learning methods for exploratory data analysis, and SHAP values reveal that Net Income Margin, ROA, Gross Profit Margin and EBIT are some of the most important factors. Based on the SHAP values, it is also interesting to note that the current year has negligible contribution to the final prediction. Investors can use this as a heuristic guide for loss-averted long-term (1-year) stock portfolios.
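A sketch of the classification task as described: predict whether next year's return is negative from current fundamentals, then keep only stocks predicted non-negative. The feature names and the data file are placeholders, not the paper's dataset.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report
    from xgboost import XGBClassifier

    df = pd.read_csv("fundamentals.csv")  # hypothetical file
    features = ["net_income_margin", "roa", "gross_profit_margin", "ebit"]
    X, y = df[features], (df["next_year_return"] < 0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                              random_state=0)
    model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
    model.fit(X_tr, y_tr)
    print(classification_report(y_te, model.predict(X_te)))

    buy_list = df.loc[model.predict(X) == 0]  # stocks expected to avoid losses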
[16] vixra:2407.0100 [pdf]
Context-Aware Vulnerability Management Using Large Language Models
Organizations are frequently overwhelmed by the sheer volume of alerts about vulnerabilities discovered within their systems. These alerts are typically prioritized based on severity levels categorized by Common Vulnerabilities and Exposures (CVE) [2], a standard glossary used in Vulnerability Management Systems. However, this severity classification often fails to consider the specific operational context of the systems, leading to misaligned priorities and the potential oversight of more critical vulnerabilities that demand immediate attention. This paper investigates whether Large Language Models (LLMs) [25] can offer a solution by integrating contextual awareness into the vulnerability management process, thus enhancing the efficiency and effectiveness of organizational responses to cybersecurity threats.
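A sketch of how operational context might be injected into an LLM prioritization prompt. The fields and rubric are illustrative assumptions; the actual LLM call is left abstract.

    def build_prompt(cve_id, severity, description, context):
        return (
            "You are a vulnerability analyst.\n"
            f"Vulnerability {cve_id} (base severity: {severity}): "
            f"{description}\n"
            f"System context: {context}\n"
            "Considering exploitability in THIS context, answer with a "
            "revised priority (critical/high/medium/low) and a "
            "one-sentence rationale."
        )

    prompt = build_prompt(
        cve_id="CVE-2021-44228",
        severity="critical",
        description="Log4j JNDI remote code execution.",
        context="Internal batch server, no inbound internet exposure, "
                "logs only trusted machine-generated input.",
    )
    print(prompt)  # send to the LLM of your choice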
[17] vixra:2407.0096 [pdf]
Infinite-parameter Large Language Model
In the standard transformer architecture, increasing model parameters leads to linear growth in computational cost and activation memory. To address this issue, we propose a novel Infinite Parameter Large Language Model (IP-LLM) architecture that decouples model size from computational cost and device memory. Existing large language models are all fixed-parameter models, while human knowledge is infinite and expands daily. Finite parameters are inherently limited in their capacity to accommodate this boundless knowledge. Our IP-LLM architecture can potentially accommodate infinite knowledge, resolving this issue and laying the foundation for realizing a truly omniscient and omnipotent artificial general intelligence in the future. Our architecture surpasses MoE in performance while requiring significantly less memory.
[18] vixra:2407.0079 [pdf]
Several Questions of Visual Generation in 2024
This paper does not propose any new algorithms but instead outlines various problems in the field of visual generation based on the author’s personal understanding. The core of these problems lies in how to decompose visual signals, with all other issues being closely related to this central problem and stemming from unsuitable approaches to signal decomposition. This paper aims to draw researchers’ attention to the significance of Visual Signal Decomposition.
[19] vixra:2407.0075 [pdf]
Enhancing LLM Reasoning Abilities with Code
Large Language Models (LLMs) have shown exceptional generative abilities in various natural language and generation tasks, demonstrating remarkable performance from just a few examples of natural language instructions and reducing the need for extensive feature engineering. However, LLMs remain relatively weak in reasoning and problem solving. We propose a new construction that addresses this insufficiency in logical and mathematical ability.
[20] vixra:2407.0065 [pdf]
Complexification Through Gradual Involvement And Reward Providing in Deep Reinforcement Learning
Training a relatively big neural network that has enough capacity for complex tasks is challenging. In real life, the process of task solving requires a system of knowledge in which more complex skills are built upon previously learned ones, in the same way that biological evolution builds new forms of life on a previously achieved level of complexity. Inspired by that, this work proposes ways of increasing complexity: training neural networks with smaller receptive fields and using their weights as prior knowledge for more complex successors through gradual involvement of some parts, and letting a smaller network work as a source of reward for a more complicated one. This allows better performance in a particular case of deep Q-learning compared with a model that tries to use a complex receptive field from scratch.
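A sketch of the weight-reuse ingredient: copy every parameter of the smaller trained network whose name and shape match into the larger successor, leaving the new parts randomly initialized. The architectures below are placeholders, and the gradual-involvement schedule itself is not reproduced here.

    import torch

    small = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                                torch.nn.Linear(32, 4))
    large = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                                torch.nn.Linear(32, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 4))

    # Transfer only the tensors that line up between the two models.
    target = large.state_dict()
    state = {k: v for k, v in small.state_dict().items()
             if k in target and v.shape == target[k].shape}
    large.load_state_dict(state, strict=False)
    print(f"transferred {len(state)} parameter tensors")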
[21] vixra:2407.0052 [pdf]
How to Precisely Update Large Language Models Knowledge While Avoiding Catastrophic Forgetting
Recent advancements in Large Language Models (LLMs) have showcased their remarkable capabilities in text understanding and generation. However, even strong LLMs are susceptible to acquiring erroneous or obsolete information from the training corpus. Direct secondary fine-tuning with data containing new knowledge may be ineffective in updating knowledge due to the conflict between old and new knowledge. In this paper, we propose a new paradigm for fine-tuning called DFT (Delicate Fine-Tuning). This method utilizes parametric arithmetic to precisely pinpoint the location of knowledge and update only the minimal set of relevant parameters. Experimental results on two publicly available datasets demonstrate that our proposed DFT clearly improves the knowledge-updating performance of full fine-tuning, simultaneously outperforming the existing baselines in most cases.
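The abstract does not spell out the "parametric arithmetic"; the sketch below is only a guess at its flavor: rank parameters by how much a probe fine-tune on the new fact moved them, then re-apply only the top fraction of those deltas to the base model. The keep ratio and the ranking rule are assumptions.

    import torch

    def delicate_update(base, probed, keep_ratio=0.001):
        """Apply only the largest per-tensor deltas from `probed` to `base`.

        `base` and `probed` are two models with identical architecture;
        `probed` has been briefly fine-tuned on the new knowledge.
        """
        for (name, p_base), (_, p_probe) in zip(base.named_parameters(),
                                                probed.named_parameters()):
            delta = p_probe.data - p_base.data
            k = max(1, int(delta.numel() * keep_ratio))
            threshold = delta.abs().flatten().topk(k).values.min()
            mask = delta.abs() >= threshold      # minimal relevant set
            p_base.data += delta * mask          # update only those weights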
[22] vixra:2407.0025 [pdf]
Algorithms for Constructing Society Organizations and Also for Lives
In the past, the organization of society, including government and corporations, relied solely on natural experience, lacking a robust mathematical and logical framework for explaining how to structure and optimize these entities. This article draws parallels between the structure of social organizations and neural networks, illustrating that social structures emulate neural network architectures: social organizations can be seen as neural networks nested within humans. Using the same principles, one can optimize the structure of social organizations. The article also outlines a comparison between neural network algorithms and Darwin's theory of natural selection, highlighting their similarities.
[23] vixra:2406.0161 [pdf]
Causal Effect Vector and Multiple Correlation
In this article, we will describe the mechanism that links the notion of causality to correlations. This article answers yes to the following question: Can we deduce a causal relationship from correlations?
[24] vixra:2406.0075 [pdf]
MSBoost: Using Model Selection with Multiple Base Estimators for Gradient Boosting
Gradient boosting is a widely used machine learning algorithm for tabular regression, classification and ranking. However, most open-source implementations of gradient boosting, such as XGBoost and LightGBM, use decision trees as the sole base estimator. This paper, for the first time, takes the alternative path of not relying on a single static base estimator (usually a decision tree): it trains a list of models in parallel on the residual errors of the previous layer and then selects the model with the least validation error as the base estimator for that particular layer. MSBoost achieves state-of-the-art results compared with other gradient boosting implementations on 50+ tabular regression and classification datasets. Furthermore, ablation studies show that MSBoost is particularly effective for small and noisy datasets. It thereby has a significant social impact, especially for tabular machine learning problems in domains where it is not feasible to obtain large high-quality datasets.
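A minimal regression-only sketch of the idea as described: at every boosting layer, fit several candidate base estimators on the current residuals and keep whichever has the lowest validation error. The candidate list, depth, and fixed learning rate are illustrative choices, not MSBoost's actual configuration.

    import numpy as np
    from sklearn.base import clone
    from sklearn.linear_model import Ridge
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.tree import DecisionTreeRegressor

    def msboost_fit(X_tr, y_tr, X_val, y_val, n_layers=50, lr=0.1):
        candidates = [DecisionTreeRegressor(max_depth=3), Ridge(alpha=1.0),
                      KNeighborsRegressor(n_neighbors=5)]
        layers, residual = [], y_tr.astype(float).copy()
        pred_val = np.zeros(len(y_val))
        for _ in range(n_layers):
            fitted = [clone(m).fit(X_tr, residual) for m in candidates]
            errors = [np.mean((y_val - (pred_val + lr * m.predict(X_val))) ** 2)
                      for m in fitted]
            best = fitted[int(np.argmin(errors))]   # model selection step
            layers.append(best)
            residual -= lr * best.predict(X_tr)     # boost on new residuals
            pred_val += lr * best.predict(X_val)
        return layers

    def msboost_predict(layers, X, lr=0.1):
        return lr * sum(m.predict(X) for m in layers)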
[25] vixra:2406.0056 [pdf]
Tooling on MATLAB for Online Convex Optimization
This manuscript is merely a formal documentation of the purpose and details surrounding the online convex optimization toolbox (OCOBox) for MATLAB. The purpose of this toolbox is to provide a collection of algorithms that work under stochastic situations where traditional algorithmic theory does not fare so well. The toolbox encompasses a wide range of methods including Bayesian persuasion, bandit optimization, Blackwell approachability, boosting, game theory, projection-free algorithms, and regularization. In the future, we plan to extend OCOBox to interactive machine learning algorithms and develop a more robust GUI.
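For readers new to the setting, here is online gradient descent, the canonical OCO baseline that toolboxes in this space build on (written in Python for illustration; OCOBox itself is MATLAB, and this is not its API). Each round the learner commits to a point, the environment reveals a convex loss, and the learner takes a projected gradient step with step size ~ 1/sqrt(t), giving O(sqrt(T)) regret.

    import numpy as np

    def ogd(grad_fn, dim, T, radius=1.0):
        x = np.zeros(dim)
        for t in range(1, T + 1):
            g = grad_fn(x, t)            # gradient of the round-t loss
            x = x - g / np.sqrt(t)       # gradient step
            norm = np.linalg.norm(x)
            if norm > radius:            # project back onto the ball
                x *= radius / norm
        return x

    # Example: drifting quadratic losses f_t(x) = ||x - c_t||^2 / 2.
    rng = np.random.default_rng(0)
    targets = rng.normal(0.5, 0.1, size=(1000, 3))
    print(ogd(lambda x, t: x - targets[t - 1], dim=3, T=1000))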
[26] vixra:2406.0035 [pdf]
Complex Evidential Reasoning Rule in Complex Evidence Theory
In this paper, to extend the traditional evidential reasoning (ER) method to the complex plane, a novel complex evidential reasoning (CER) method is defined in the framework of complex evidence theory (CET).
[27] vixra:2406.0012 [pdf]
Summarizing Texts Automatically by Graph based Version of K Nearest Neighbor
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a graph as its input data and is applied to text summarization. The graph is a more expressive representation of a word, and text summarization can be viewed as a binary classification in which each paragraph is classified as summary or non-summary. In the proposed system, the input text is partitioned into a list of paragraphs, each paragraph is classified by the proposed KNN version, and the paragraphs classified as summary are extracted as the output. The proposed KNN version is empirically validated as the better approach for deciding whether each paragraph is essential in news articles and opinions. In this article, a paragraph is encoded into a weighted, undirected graph represented as a list of edges.
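A compact sketch of the scheme recurring in this and the related KNN entries below: encode a text unit as a co-occurrence graph stored as an edge list, compare graphs by edge overlap (Jaccard here, an assumed stand-in for the article's metric), and classify by a k-nearest-neighbor vote over labeled examples.

    from collections import Counter

    def to_edge_set(text, window=2):
        words = text.lower().split()
        return {tuple(sorted((words[i], words[j])))
                for i in range(len(words))
                for j in range(i + 1, min(i + window + 1, len(words)))}

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    def knn_classify(graph, labeled, k=3):
        ranked = sorted(labeled, key=lambda gl: jaccard(graph, gl[0]),
                        reverse=True)
        return Counter(lbl for _, lbl in ranked[:k]).most_common(1)[0][0]

    labeled = [
        (to_edge_set("stocks fell sharply on market fears"), "summary"),
        (to_edge_set("the weather was mild this afternoon"), "non-summary"),
    ]
    query = to_edge_set("markets fell on renewed fears")
    print(knn_classify(query, labeled, k=1))  # -> "summary"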
[28] vixra:2406.0011 [pdf]
Content based Text Segmentation using Feature Similarity based K Nearest Neighbor
This article proposes a modified KNN (K Nearest Neighbor) algorithm which considers the feature similarity and is applied to text segmentation. The words which are given as features for encoding words into numerical vectors have their own meanings and semantic relations with others, and text segmentation can be viewed as a binary classification in which each adjacent paragraph pair is classified as boundary or continuance. In the proposed system, a list of adjacent paragraph pairs is generated by sliding a window of size two over the text, each pair is classified by the proposed KNN version, and a boundary is put between the pairs classified as boundary. The proposed KNN version is empirically validated as the better approach for deciding whether each pair should be separated in news articles and opinions. The significance of this research is to improve the classification performance by utilizing the feature similarities.
[29] vixra:2406.0010 [pdf]
Text Segmentation based on Contents using String Vector based Version of K Nearest Neighbor
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to text segmentation. The results from applying string vector based algorithms to text categorization were successful in previous works, and text segmentation can be viewed as a binary classification in which each adjacent paragraph pair is classified as boundary or continuance. In the proposed system, a list of adjacent paragraph pairs is generated by sliding a window of size two over the text, each pair is classified by the proposed KNN version, and a boundary is put between the pairs classified as boundary. The proposed KNN version is empirically validated as the better approach for deciding whether each pair should be separated in news articles and opinions. We need to define and characterize mathematically more operations on string vectors in order to modify more advanced machine learning algorithms.
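One plausible reading of the "string vector" representation used in this cluster of articles: a fixed-length vector whose entries are words (e.g., the highest-frequency terms of a text unit), compared by a position-free overlap score. The encoding and the similarity below are assumed illustrations, not the authors' exact definitions.

    def string_vector(text, dim=5):
        words = text.lower().split()
        ranked = sorted(set(words), key=words.count, reverse=True)
        return (ranked + ["<empty>"] * dim)[:dim]

    def semantic_similarity(u, v):
        return len(set(u) & set(v)) / len(set(u) | set(v))

    u = string_vector("the market fell as the market panicked")
    v = string_vector("markets fell while traders panicked")
    print(u, v, f"{semantic_similarity(u, v):.2f}")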
[30] vixra:2406.0009 [pdf]
Topic Based Segmentation Using K Nearest Neighbor Modified by Graph Similarity Metric
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a graph as its input data and is applied to text segmentation. The graph is a more expressive representation of a word, and text segmentation can be viewed as a binary classification in which each adjacent paragraph pair is classified as boundary or continuance. In the proposed system, a list of adjacent paragraph pairs is generated by sliding a window of size two over the text, each pair is classified by the proposed KNN version, and a boundary is put between the pairs classified as boundary. The proposed KNN version is empirically validated as the better approach for deciding whether each pair should be separated in news articles and opinions. In this article, an adjacent paragraph pair is encoded into a weighted, undirected graph represented as a list of edges.
[31] vixra:2406.0001 [pdf]
Vision: A Culturally-Aware Multimodal AI
This paper introduces Vision, a novel 175-billion parameter multimodal AI model. Vision is trained from scratch to natively understand text, images, video, and audio and to generate text and images, setting it apart from existing models. Developed with a focus on incorporating Indian context, values, and culture, Vision aims to empower users with a culturally relevant AI experience. A unique security feature allows generated images to be backtracked to Vision, mitigating concerns about potential misuse for misinformation. Evaluations on standard benchmarks demonstrate that Vision achieves state-of-the-art performance in a diverse range of tasks, including reasoning, solving mathematical problems, code generation, and image understanding. Furthermore, Vision exhibits remarkable proficiency in multilingual chat, supporting a wide array of global languages as well as regional Indian languages such as Hindi, Punjabi, and Marathi. We believe that Vision represents a significant step towards building more inclusive and culturally relevant AI systems, with the potential to positively impact various domains in India and beyond.
[32] vixra:2405.0171 [pdf]
Application of Table based K Nearest Neighbor for Index Optimization
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to index optimization. The motivations for this research are the successful results from applying table based algorithms to text categorization in previous works, and the fact that index optimization can be viewed as a classification task in which each word is classified as expansion, inclusion, or removal. In the proposed system, each word in the given text is classified into one of the three categories by the proposed KNN algorithm; associated words are added to those classified as expansion, while those classified as inclusion are kept by themselves without adding any word. The proposed KNN version is empirically validated as the better approach for deciding the importance level of words in news articles and opinions. In using the table based KNN algorithm, it is easier to trace the results of categorizing words.
[33] vixra:2405.0170 [pdf]
Specializing K Nearest Neighbor into String Vector based Version using String Vector Operation in Index Optimization
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to index optimization. The results from applying string vector based algorithms to text categorization were successful in previous works, and index optimization can be viewed as a classification task in which each word is classified as expansion, inclusion, or removal. In the proposed system, each word in the given text is classified into one of the three categories by the proposed KNN algorithm; associated words are added to those classified as expansion, while those classified as inclusion are kept by themselves without adding any word. The proposed KNN version is empirically validated as the better approach for deciding the importance level of words in news articles and opinions. We need to define and characterize mathematically more operations on string vectors in order to modify more advanced machine learning algorithms.
[34] vixra:2405.0169 [pdf]
Table based K Nearest Neighbor for Text Classification
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to text categorization. The motivations for this research are the successful results from applying table based algorithms to text categorization in previous works, and the expected synergy effect between text categorization and word categorization. In this research, we define a similarity metric between two tables representing texts, modify the KNN algorithm by replacing the existing similarity metric with the proposed one, and apply it to text categorization. The proposed KNN is empirically validated as the better approach for categorizing texts in news articles and opinions. In using the table based KNN algorithm, it is easier to trace the results of categorizing texts.
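A sketch of the table representation recurring in these entries: a text becomes a table of (word, weight) rows, and two tables are compared by a weighted overlap of shared words. The weighting (term frequency) and the overlap metric are assumed illustrations of the idea, not the article's exact definitions.

    from collections import Counter

    def to_table(text):
        words = text.lower().split()
        total = len(words)
        return {w: c / total for w, c in Counter(words).items()}

    def table_similarity(a, b):
        shared = set(a) & set(b)
        overlap = sum(min(a[w], b[w]) for w in shared)
        return 2 * overlap / (sum(a.values()) + sum(b.values()))

    a = to_table("the court ruled on the appeal")
    b = to_table("the appeal was ruled inadmissible by the court")
    print(f"{table_similarity(a, b):.2f}")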
[35] vixra:2405.0168 [pdf]
Graph Similarity Metric for Modifying K Nearest Neighbor for Classifying Texts
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a graph as its input data and is applied to text categorization. The graph is a more expressive representation of a word, and a synergy effect between text categorization and word categorization is expected by combining them with each other. In this research, we propose a similarity metric between two graphs representing words, modify the KNN algorithm by replacing the existing similarity metric with the proposed one, and apply it to text categorization. The proposed KNN is empirically validated as the better approach for categorizing texts in news articles and opinions. In this article, a word is encoded into a weighted, undirected graph represented as a list of edges.
[36] vixra:2405.0164 [pdf]
Text Mining; Text Clustering; Table Similarity; Table based AHC Algorithm
This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters tables, instead of numerical vectors, as the approach to text clustering. The motivations for this research are the successful results from applying table based algorithms to text clustering tasks in previous works, and the expected synergy effect between text clustering and word clustering. In this research, we define a similarity metric between tables representing texts and modify the AHC algorithm by adopting the proposed similarity metric as the approach to text clustering. The proposed AHC algorithm is empirically validated as the better approach for clustering texts in news articles and opinions. In using the table based AHC algorithm, it is easier to trace the results of clustering texts.
[37] vixra:2405.0158 [pdf]
Applying Table based AHC Algorithm to Semantic Word Clustering
This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters tables, instead of numerical vectors, as the approach to word clustering. The motivations for this research are the successful results from applying table based algorithms to text clustering tasks in previous works, and the expected synergy effect between text clustering and word clustering. In this research, we define a similarity metric between tables representing words and modify the AHC algorithm by adopting the proposed similarity metric as the approach to word clustering. The proposed AHC algorithm is empirically validated as the better approach for clustering words in news articles and opinions. In using the table based AHC algorithm, it is easier to trace the results of clustering words.
[38] vixra:2405.0157 [pdf]
String Vector based AHC Algorithm for Clustering Words Semantically
This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters string vectors, instead of numerical vectors, as the approach to word clustering. The results from applying string vector based algorithms to text clustering were successful in previous works, and a synergy effect between text clustering and word clustering is expected by combining them with each other; these two facts are the motivations for this research. In this research, we define an operation on string vectors called semantic similarity and modify the AHC algorithm by adopting the proposed similarity metric as the approach to word clustering. The proposed AHC algorithm is empirically validated as the better approach for clustering words in news articles and opinions. We need to define and characterize mathematically more operations on string vectors in order to modify more advanced machine learning algorithms.
[39] vixra:2405.0156 [pdf]
Clustering Words Semantically by Graph based Version of AHC Algorithm
This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters graphs, instead of numerical vectors, as the approach to word clustering. The graph is a more expressive representation of a word, and a synergy effect between text clustering and word clustering is expected by combining them with each other. In this research, we propose a similarity metric between two graphs representing words and modify the AHC algorithm by adopting the proposed similarity metric as the approach to word clustering. The proposed AHC algorithm is empirically validated as the better approach for clustering words in news articles and opinions. In this article, a word is encoded into a weighted, undirected graph represented as a list of edges.
[40] vixra:2405.0155 [pdf]
Extracting Keywords from Text by Feature Similarity based K Nearest Neighbor
This article proposes a modified KNN (K Nearest Neighbor) algorithm which considers the feature similarity and is applied to keyword extraction. The texts which are given as features for encoding words into numerical vectors are semantically related entities, rather than independent ones, and keyword extraction can be viewed as a binary classification in which each word is classified as keyword or non-keyword. In the proposed system, the input text is indexed into a list of words, each word is classified by the proposed KNN version, and the words classified as keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach for deciding whether each word is a keyword in news articles and opinions. The significance of this research is to improve the classification performance by utilizing the feature similarities.
[41] vixra:2405.0152 [pdf]
Keyword Selection from Textual Data using Table based K Nearest Neighbor
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to keyword extraction. Table based algorithms worked successfully in text mining tasks such as text categorization and text clustering in previous works, and keyword extraction can be mapped to a binary classification in which each word is classified as keyword or non-keyword. In the proposed system, the input text is indexed into a list of words, each word is classified by the proposed KNN version, and the words classified as keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach for deciding whether each word is a keyword in news articles and opinions. In using the table based KNN algorithm, it is easier to trace the results of categorizing words.
[42] vixra:2405.0151 [pdf]
K Nearest Neighbor Modified Into String Vector Based Version for Keyword Extraction
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to keyword extraction. The results from applying string vector based algorithms to text categorization were successful in previous works, and keyword extraction can be mapped to a binary classification in which each word is classified as keyword or non-keyword. In the proposed system, the input text is indexed into a list of words, each word is classified by the proposed KNN version, and the words classified as keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach for deciding whether each word is a keyword in news articles and opinions. We need to define and characterize mathematically more operations on string vectors in order to modify more advanced machine learning algorithms.
[43] vixra:2405.0150 [pdf]
Modification of K Nearest Neighbor by Graph Similarity Metric for Keyword Extraction
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a graph as its input data and is applied to keyword extraction. The graph is a more expressive representation of a word, and keyword extraction can be mapped to a binary classification in which each word is classified as keyword or non-keyword. In the proposed system, the input text is indexed into a list of words, each word is classified by the proposed KNN version, and the words classified as keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach for deciding whether each word is a keyword in news articles and opinions. In this article, a word is encoded into a weighted, undirected graph represented as a list of edges.
[44] vixra:2405.0149 [pdf]
Feature Similarity based K Nearest Neighbor for Optimizing of Text Indexes
This article proposes a modified KNN (K Nearest Neighbor) algorithm which considers the feature similarity and is applied to index optimization. The texts which are given as features for encoding words into numerical vectors are semantically related entities, rather than independent ones, and index optimization can be viewed as a classification task in which each word is classified as expansion, inclusion, or removal. In the proposed system, each word in the given text is classified into one of the three categories by the proposed KNN algorithm; associated words are added to those classified as expansion, while those classified as inclusion are kept by themselves without adding any word. The proposed KNN version is empirically validated as the better approach for deciding the importance level of words in news articles and opinions. The significance of this research is to improve the classification performance by utilizing the feature similarities.
[45] vixra:2405.0144 [pdf]
Content based Word Clustering using Feature Similarity based AHC Algorithm
This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm which considers the feature similarity and is applied to word clustering. The texts which are given as features for encoding words into numerical vectors are semantically related entities, rather than independent ones, and a synergy effect between word clustering and text clustering is expected by combining them with each other. In this research, we define a similarity metric between numerical vectors that considers the feature similarity and modify the AHC algorithm by adopting the proposed similarity metric as the approach to word clustering. The proposed AHC algorithm is empirically validated as the better approach for clustering words in news articles and opinions. The significance of this research is to improve the clustering performance by utilizing the feature similarities.
[46] vixra:2405.0140 [pdf]
Using Table based Version of K Nearest Neighbor for Classifying Words Semantically
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to word categorization. The motivations for this research are the successful results from applying table based algorithms to text categorization in previous works, and the expected synergy effect between text categorization and word categorization. In this research, we define a similarity metric between two tables representing words, modify the KNN algorithm by replacing the existing similarity metric with the proposed one, and apply it to word categorization. The proposed KNN is empirically validated as the better approach for categorizing words in news articles and opinions. In using the table based KNN algorithm, it is easier to trace the results of categorizing words.
[47] vixra:2405.0138 [pdf]
Application of String Vector based K Nearest Neighbor to Semantic Word Classification
This article proposes a modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to word categorization. The results from applying string vector based algorithms to text categorization were successful in previous works, and a synergy effect between text categorization and word categorization is expected by combining them with each other; these two facts are the motivations for this research. In this research, we define an operation on string vectors called semantic similarity, modify the KNN algorithm by replacing the existing similarity metric with the proposed one, and apply it to word categorization. The proposed KNN is empirically validated as the better approach for categorizing words in news articles and opinions. We need to define and characterize mathematically more operations on string vectors in order to modify more advanced machine learning algorithms.
[48] vixra:2405.0136 [pdf]
Modifying K Nearest Neighbor for Content based Word Classification by Graph Similarity Metric
This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm which considers the feature similarity and is applied to word clustering. The texts which are given as features for encoding words into numerical vectors are semantically related entities, rather than independent ones, and a synergy effect between word clustering and text clustering is expected by combining them with each other. In this research, we define a similarity metric between numerical vectors that considers the feature similarity and modify the AHC algorithm by adopting the proposed similarity metric as the approach to word clustering. The proposed AHC algorithm is empirically validated as the better approach for clustering words in news articles and opinions. The significance of this research is to improve the clustering performance by utilizing the feature similarities.
[49] vixra:2405.0046 [pdf]
Reasoning AI (RAI), Large Language Models (LLMs) and Cognition
Do Large Language Models have cognitive abilities? Do Large Language Models have understanding? Is the correct recognition of verbal contexts or visual objects, based on pre-learning on a large training dataset, a manifestation of the ability to solve cognitive tasks? Or is any LLM just a statistical approximator that compiles averaged texts from its huge dataset close to the specified prompts? The answers to these questions require rigorous formal definitions of the cognitive concepts of "knowledge", "understanding" and related terms.
[50] vixra:2405.0037 [pdf]
Large Language Model for Automobile
With the introduction of ChatGPT (OpenAI, 2022), the power of these models to generate human-like text has captured widespread public attention. The scale of language models has burgeoned, progressing from modest multi-million-parameter architectures like ELMo (Peters et al., 2018) and GPT-1 (Radford et al., 2018) to behemoths boasting billions, even trillions, of parameters, exemplified by the monumental GPT-3 (Brown et al., 2020), Switch Transformers (Fedus et al., 2022), GPT-4 (OpenAI, 2023), PaLM-2 (Anil et al., 2023), Claude (Claude, 2023), and Vicuna (Chiang et al., 2023). This expansion in scale has significantly raised hardware requirements, making it exceedingly challenging to deploy models on mobile devices such as smartphones and tablets. To deploy on cars, we trained a 7-billion-parameter automobile model which outperforms GPT-3.5 in the automotive domain, surpassing all compared models in automotive tasks.
[51] vixra:2404.0133 [pdf]
The Weighty Responsibility of Creating AI Navigating Control and Ethics
The generation at the helm faces an unprecedented responsibility in the near future of artificial intelligence. The implications of setting up the founding rules that will regulate the operation of AI are heavy, since once they are set they last forever. Once the first AI is commenced, it may be that no subsequent AIs can emerge, the first assuming dominion over its own domain of creation. As a result, retaining control becomes necessary, lest humanity surrender agency to its own creation. At this juncture, critical issues are raised concerning those who administer AI. Is it appropriate for only a few people to have unrestricted control over AI commands while leaving out all precautionary measures? We therefore have to weigh control against constraint when dealing with AI issues, where authority plays off against morality. The direction artificial intelligence takes in the future depends on the decisions made by today's generation. How we are viewed historically, in terms of technology, will be determined by how well we take on such an important duty. There is a major turning point ahead of us where we, the stewards of tomorrow, must make a choice that protects humanity's right to self-determination while also exploiting the power of AI for change.
[52] vixra:2404.0123 [pdf]
Feed Forward Neural Network for Intent Classification: A Procedural Analysis
This research paper presents an in-depth exploration of a neural network architecture tailored for intent classification using sentence embeddings. The model comprises a feedforward neural network with two hidden layers, ReLU activation functions, and softmax activation in the output layer. This paper meticulously examines the technical intricacies involved in data preprocessing, model architecture definition, training methodologies, and evaluation criteria. Detailed explanations are provided for the rationale behind architectural decisions, including the incorporation of dropout layers for regularization and class weight balancing techniques for handling imbalanced datasets. Moreover, the mathematical foundations of the chosen loss function (sparse categorical crossentropy) and optimization algorithm (Adam optimizer) are thoroughly elucidated, shedding light on their roles in facilitating model training and convergence. Through empirical experiments and theoretical analyses, this paper offers insights into the effectiveness and resilience of the proposed neural network architecture for intent classification tasks. It serves as a technical guide for engineers aiming to comprehend, implement, and optimize neural network models for practical application in natural language processing endeavors.
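A reconstruction of the described architecture as a Keras sketch: two ReLU hidden layers over precomputed sentence embeddings, dropout for regularization, a softmax output, sparse categorical crossentropy, and the Adam optimizer. Layer widths, dropout rates, and the dummy data are assumptions; class weights are passed at fit time for imbalanced datasets.

    import numpy as np
    import tensorflow as tf

    embedding_dim, num_intents = 384, 7
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(embedding_dim,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(num_intents, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Dummy data stands in for real sentence embeddings and intent labels.
    X = np.random.randn(1000, embedding_dim).astype("float32")
    y = np.random.randint(0, num_intents, size=1000)
    class_weight = {i: 1.0 for i in range(num_intents)}  # tune for imbalance
    model.fit(X, y, epochs=3, batch_size=32, class_weight=class_weight)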
[53] vixra:2404.0069 [pdf]
Multiple Causation and Correlations
In the context of multiple causation, I introduce the causation function. This function is a quadratic form computed from the correlations and serves as a generalization of R-squared, commonly found in machine learning. In this report, the causation function links the correlations to the causal relationship. By examining the causation function through an illustrative example, we demonstrate how strong or weak correlations between multiple causes and a variable can imply either a highly likely or an unlikely causal relationship between the causes and the variable.
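The abstract does not state the quadratic form. For orientation, the standard generalization of R-squared to several correlated causes, which the described function presumably resembles, is

    R^2 = \mathbf{r}^{\top} \mathbf{R}^{-1} \mathbf{r},

where r is the vector of Pearson correlations between each cause X_i and the effect Y, and R is the correlation matrix of the causes. This identity is offered here as context, not as the paper's definition.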
[54] vixra:2403.0140 [pdf]
Fast Edge Machine Learning For Adversarial Robust Distillation
Edge machine learning (Edge ML) offers solutions for deploying ML models directly on resource-constrained edge devices. However, ensuring adversarial robustness remains a challenge. This paper presents an accessible approach for adversarial robust distillation (ARD) within the limited confines of Google Colab. Our goal is to enable fast yet robust knowledge transfer to student models suited for edge devices. Extensive experiments are conducted distilling from a WideResNet34 teacher to a MobileNetV2 student using limited computational resources. The efficacy of ARD is evaluated in settings with only one GPU (a T4) and 13 GB of RAM for up to 6 hours a day. Notably, competitive adversarial robustness is attained using very few gradient attack steps, which improves training efficiency, crucial for edge ML. Appropriately balancing hyperparameters also allows robust accuracy over 50% using just one attack step. Overall, the presented approach advances the feasibility of performing robust distillation effectively even under accessibility constraints. The democratized and reproducible method on Google Colab serves as a launchpad for those aiming to reap the advantages of edge intelligence. By sharing models protected against adversarial threats, this work propels broader adoption of trustworthy ML at society's technological edges.
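A sketch of the core of one robust-distillation step in the spirit described above: craft adversarial examples with very few PGD steps against the student, then train the student to match the teacher's soft labels on them. Temperature, epsilon, and the single attack step are illustrative settings, not the paper's tuned values.

    import torch
    import torch.nn.functional as F

    def ard_step(student, teacher, x, y, eps=8/255, steps=1, temp=4.0):
        # Few-step PGD attack against the student (1 step, per the text).
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(steps):
            loss = F.cross_entropy(student(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            delta = ((delta + eps * grad.sign()).clamp(-eps, eps)
                     .detach().requires_grad_(True))
        x_adv = (x + delta).detach()

        with torch.no_grad():
            t_logits = teacher(x)            # teacher sees clean inputs
        s_logits = student(x_adv)            # student sees adversarial ones
        return F.kl_div(F.log_softmax(s_logits / temp, dim=1),
                        F.softmax(t_logits / temp, dim=1),
                        reduction="batchmean") * temp * temp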
[55] vixra:2403.0105 [pdf]
Spin Glass Theory and the Statistical Mechanics of Language Models
The recent success of large language models (LLMs) in artificial intelligence has drawn significant attention from the machine learning community. However, the theoretical foundations of these models remain poorly understood. In this paper, we explore the deep connections between LLMs and spin glass theory, a well-established framework in statistical physics. We show how key concepts from spin glasses, such as frustration, random interactions, and phase transitions, can provide a powerful lens for understanding the behavior of LLMs. We argue that this interdisciplinary perspective can facilitate knowledge transfer between the machine learning and physics communities, leading to novel insights and algorithmic improvements.
[56] vixra:2403.0103 [pdf]
Negation of Atanassov’s Intuitionistic Fuzzy Sets from the Perspective of Maximum Entropy
In fuzzy systems, how to represent uncertainty is a crucial research topic. Negation is an inherent characteristic of knowledge, and it provides a brand-new perspective of solving problems from the opposite of the events. Intuitionistic fuzzy sets (IFSs), as a generalization of fuzzy sets, have the ability to better express fuzzy information. However, since the existing methods have not completely broken through the constraints of the first (classical) negation and inconsistent calculation standards, IFSs still have limitations in expressing uncertainty. To address this issue, and to strengthen the capacity of fuzzy systems to represent uncertain information, this paper proposes a novel method to obtain the negation of an IFS from the perspective of maximum entropy. Some desired theorems and properties are investigated to describe the nature of the negative IFS. Moreover, entropy is used to describe the connection between the IFS and uncertainty in the negation process. Furthermore, based on the negation, this paper designs a new approach to measure the uncertainty of an IFS. Then, a new pattern classification algorithm is developed. Finally, practical applications show the effectiveness of the negation method.
[57] vixra:2403.0102 [pdf]
On the Negation Intensity of a Probability Distribution
How to obtain negation knowledge is a crucial topic, especially in the field of artificial intelligence. Limited work has been done on the negation of a probability distribution, even though negation itself has been studied in depth throughout the literature. In particular, the intensity level of negation enforcement has not yet been investigated. Moreover, the main characteristic of intelligent systems is flexibility, for the sake of being able to represent knowledge according to each situation, and researchers generally express the need for cognitive range in negation. Thus, it would seem very useful to find a wide range of negations under intensity levels in a probability distribution. Based on these ideas, this paper first proposes a new approach for finding the negation of a probability distribution and gives a domain of intensity in which the negation is executed, called the negation space. Then, we investigate a number of desirable properties and explore their correlation with entropy. Numerical examples show the characteristics of the proposed negation solution. Finally, we validate the efficiency of the proposed method from the point of view of the Dempster-Shafer belief structure.
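For background, here is Yager's negation of a probability distribution, the usual starting point in this literature: each mass is replaced by its normalized complement. Repeated application drifts toward the maximum-entropy uniform distribution, so the iteration count below serves as a crude stand-in for an intensity level; the paper's parameterized negation space is not reproduced here.

    import numpy as np

    def yager_negation(p):
        p = np.asarray(p, dtype=float)
        return (1.0 - p) / (len(p) - 1)   # still sums to 1

    def entropy(p):
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    p = np.array([0.7, 0.2, 0.1])
    for k in range(4):
        print(k, np.round(p, 4), f"H = {entropy(p):.4f} bits")
        p = yager_negation(p)   # entropy increases toward uniform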
[58] vixra:2403.0101 [pdf]
Generalized Soft Likelihood Functions in Combining Evidence
Information fusion is an important topic in scientific research. The soft likelihood function is a common method of fusing evidence from multiple sources. However, when the combined evidence contains equally important decision information, the fusion results obtained using existing methods do not reflect the attitudinal characteristics of decision makers. To address this problem, a novel generalised soft likelihood function is developed in this paper. First, a new notion of decision maker (DM) pair is defined, which is used to characterise the outcome of the decision as well as the reliability of the evidence. Then, a series of algorithms for correcting the initial evidence set data are formulated. Eventually, a generic soft likelihood function for fusing compatible evidence information is proposed. Numerical examples are used to illustrate the effectiveness of the proposed methodology.
[59] vixra:2403.0100 [pdf]
Evidential Aggregation-Based Dematel Functions and Its Application in Expert Decision System for Criminal Cases
In real criminal cases, the decision outcome is often influenced by many complex factors, such as the importance of initial evidence and the prioritization of evidence. How to model this information in an integrated manner, so as to provide technical tools for case detection and find the real suspect, is of great importance for social security and stability. To address these issues, this paper proposes a novel soft likelihood function based on the Decision Making Trial and Evaluation Laboratory (DEMATEL) method. Firstly, the proposed method preserves the preference of the decision-maker (DM) in the soft likelihood function proposed by Yager et al. Secondly, the method takes into account the modeling of associated information. In addition, it extends the soft likelihood function to reflect the preferences of DMs through the importance of evidence. Finally, based on these designed algorithms, a decision processing model for criminal cases is constructed, which systematically provides a guiding process for case detection. Numerical examples and applications show the practicality and effectiveness of the proposed method.
[60] vixra:2403.0094 [pdf]
Exploring the Balance of Power Humans vs. Artificial Intelligence with Some Question
Who dominates the destiny of the world, humans or artificial intelligence (AI)? This question strikes at the very heart of contemporary humanity's existential anxieties about its future. If we want to seriously consider whether or not unfriendly AI 'neurons' pose any threat to human civilisation and humanity's continual existence and evolution in the Universe, we need to know as much as possible about the Universe in which we find ourselves, our place in it, and what cognition, consciousness and mentality really are. How might we combine philosophical, cognitive science and technological perspectives to explore the evolving relationship between humans and AI, in order to engage and address the questions at the core of this human-AI complex, namely the future of civilisation: what will it look like, who can claim to be our successors, towards what goals and ends? The evolution and development of human cognition, as well as the emergence of AI, can help us define these potential paths of future development. Where do we stand today, in relation to our own history and development and to the possibilities that artificial intelligence can offer us? The essay explores the ethical, social and existential questions that arise from the increasing automation of artificial intelligence and how it relates to the story of humanity, from its origins to its contemporary cultural expression.
[61] vixra:2403.0063 [pdf]
Cyclical Log Annealing as a Learning Rate Scheduler
A learning rate scheduler is a predefined set of instructions for varying search step sizes during model training. This paper introduces a new logarithmic method that uses hard restarts of the step size within stochastic gradient descent. Cyclical log annealing applies the restart pattern more aggressively, potentially allowing greedier algorithms to be used within the online convex optimization framework. The algorithm was tested on the CIFAR-10 image dataset, where it performed comparably to cosine annealing on large transformer-enhanced residual neural networks. Future experiments would involve testing the scheduler in generative adversarial networks and finding the best scheduler parameters through further experiments.
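The abstract does not give the schedule's formula; a minimal sketch of a cyclical, logarithmically shaped decay with hard restarts might look like the following (the log shape and the fixed cycle length are assumptions, not the paper's definition):

```python
import math

def cyclical_log_annealing(step, eta_max=0.1, eta_min=1e-4, cycle_len=1000):
    """Logarithmic decay within a cycle, hard restart at each cycle boundary.

    The log-shaped decay and fixed cycle length are illustrative assumptions;
    the paper's exact schedule may differ.
    """
    t = step % cycle_len                               # position inside the current cycle
    frac = math.log(1 + t) / math.log(1 + cycle_len)   # rises 0 -> 1, log-shaped
    return eta_max - (eta_max - eta_min) * frac

# Example: the learning rate jumps back to eta_max every 1000 steps.
rates = [cyclical_log_annealing(s) for s in range(3000)]
```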
[62] vixra:2403.0060 [pdf]
Intelligence Via Compression of Information
As the title of this book suggests, it is about how intelligence may be understood as information compression (IC). More specifically, the book is about the SP Theory of Intelligence (SPTI) and its realisation in the SP Computer Model, and their potential applications, benefits, and associated ideas. The SPTI draws on substantial evidence for the importance of IC in human learning, perception, and cognition. Since the SPTI also has much to say about issues in artificial intelligence (AI), it is a theory of both natural and artificial intelligence. In the SPTI, IC is achieved largely via the powerful concept of SP-Multiple-Alignment, a major discovery which is largely responsible for the versatility of the SPTI in aspects of human intelligence and beyond. Strengths of the SPTI include: the modelling of several kinds of intelligent behaviour, including several kinds of probabilistic reasoning; the representation and processing of several kinds of intelligence-related knowledge; and the seamless integration of diverse aspects of intelligence, and diverse kinds of knowledge, in any combination. That seamless integration appears to be essential in any AI system that aspires to the fluidity and versatility of human-level intelligence. Related to the SPTI is another major discovery: that mathematics may be seen as a set of techniques for IC, and their application. This suggests the creation of a New Mathematics via the integration of mathematics with the SPTI, combining the strengths of both. The SPTI also suggests new thinking about concepts of probability and about 'computation', with potential benefits in both areas. The SPTI has been shown in peer-reviewed papers to be relevant to areas not closely associated with AI. These include: the management of 'big data'; the development of autonomous robots; medical databases; sustainability of computing; transparency in computing; and computer vision.
[63] vixra:2402.0103 [pdf]
Removing GPT4’s Filter
GPT4 was initially trained on large amounts of data and then fine-tuned using Reinforcement Learning from Human Feedback (RLHF), in which volunteers give feedback to teach GPT4 not to create inappropriate content. In this paper, we present a method to manipulate the fine-tuned version into reverting to pre-RLHF behavior, effectively removing all safety mechanisms that the model learned during RLHF. In particular, when GPT4 acts without RLHF, it loses all inhibition and can complete very inappropriate content given only the first few words.
[64] vixra:2402.0066 [pdf]
Software Security and Quantum Communication: A Long-distance Free-space Implementation Plan of QSDC Without Quantum Memory
Software security is crucial to ensuring the confidentiality, integrity, and availability of software systems and applications. However, conventional cryptographic methods based on mathematical assumptions are vulnerable to various attacks, especially in the era of quantum computing. Therefore, there is a need for a new paradigm of software security that can resist quantum threats. This paper proposes a novel approach that uses Long-Distance Free-Space Quantum Secure Direct Communication (LF QSDC) to enhance software security. LF QSDC is a quantum communication protocol that enables two parties to exchange secret messages directly without relying on a pre-shared key or quantum error correction. Our research delves into integrating LF QSDC into software security, emphasizing its practicality for long-distance communication through the use of the memory DL04 protocol, Machine Learning Enhanced JEEC, and PAT technologies. By adopting this approach, we reinforce global software security and ensure its sustainability in an era where quantum and advanced classical threats coexist side by side. Thus, LF QSDC emerges as a future-proof security mechanism highly applicable to software security systems.
[65] vixra:2402.0027 [pdf]
Beyond Neural Scaling Laws for Fast Proven Robust Certification of Nearest Prototype Classifiers
Methods beyond neural scaling laws for beating power scaling laws in machine learning have become topical for high-performance machine learning models. Nearest Prototype Classifiers (NPCs) introduce a category of machine learning models known for their interpretability. However, the performance of NPCs is frequently impacted by large datasets that scale to high dimensions. We surmount the performance hurdle by employing self-supervised prototype-based learning metrics to intelligently prune datasets of varying sizes, encompassing low and high dimensions. This process aims to enhance the robustification and certification of NPCs within the framework of the Learning Vector Quantization (LVQ) family of algorithms, utilizing Crammer normalization for arbitrary semi-norms (semi-metrics). The numerical evaluation of outcomes reveals that NPCs trained with pruned datasets demonstrate sustained or enhanced performance compared to instances where training is conducted with full datasets. The self-supervised prototype-based metric (SSL) and the Perceptual-SSL (P-SSL) utilized in this study remain unaffected by the intricacies of optimal hyperparameter selection. Consequently, data pruning metrics can be seamlessly integrated with triplet loss training to assess the empirical and guaranteed robustness of Lp-NPCs and Perceptual-NPCs (P-NPCs), facilitating the curation of datasets that contribute to research in applied machine learning.
[66] vixra:2401.0071 [pdf]
Causation of Multiple Causes Acting on a Single Variable Computed from Correlations
In this paper, we will expose the causation of multiple causes acting on a single variable computed from correlations. Using an example, we will show when strong or weak correlations between multiple causes and a variable imply a strong or weak causation between the causes and the variable.
[67] vixra:2401.0059 [pdf]
Deep Learning-Based Approach for Stock Price Prediction
This paper presents a deep learning-based approach for stock price prediction in financial markets. The problem of accurately predicting future stock price movements is of crucial importance to investors and traders, as it allows them to make informed investment decisions. Deep learning, a branch of artificial intelligence, offers new perspectives for meeting this complex challenge. Deep learning models, such as deep neural networks, are capable of extracting complex features and patterns from large amounts of historical data on stock prices, trading volumes, financial news, and other relevant factors. Using this data, deep learning and machine learning models can learn to recognize trends, patterns, and non-linear relationships between variables that can influence stock prices. Once trained, these models can be used to predict future stock prices. This study aims to find the most suitable model for predicting stock prices using statistical learning with deep learning and machine learning methods (RNN, LSTM, GRU, SVM, and linear regression) on Apple stock price data from Yahoo Finance covering 2000 to 2024. The results showed that SVM modeling is not suitable for predicting Apple stock prices. In comparison, GRU showed the best performance in predicting Apple stock prices, with an MAE of 1.64 and an RMSE of 2.14, exceeding the results of LSTM, linear regression, and SVM. A limitation of this research was that the data type was only time series data. It is important to note, however, that stock price forecasting remains a complex challenge due to the volatile nature of financial markets and the influence of unpredictable factors. Although deep learning models can improve prediction accuracy, it is essential to understand that errors can still occur.
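For orientation, a minimal PyTorch sketch of the kind of GRU forecaster and MAE/RMSE evaluation described (window length, layer sizes, and the single-feature input are illustrative assumptions, not the paper's settings):

```python
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    """Minimal GRU regressor: a window of past closing prices -> next-day close.
    Layer sizes and window length are illustrative, not the paper's settings."""
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, window, 1)
        out, _ = self.gru(x)
        return self.head(out[:, -1])  # regress from the last hidden state

def mae_rmse(pred, target):
    """The two error metrics reported in the abstract."""
    err = pred - target
    return err.abs().mean().item(), err.pow(2).mean().sqrt().item()

model = GRUForecaster()
window = torch.randn(8, 30, 1)        # 8 samples, 30-day windows (placeholder data)
print(mae_rmse(model(window), torch.randn(8, 1)))
```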
[68] vixra:2401.0021 [pdf]
General Intelligent Network (GIN) and Generalized Machine Learning Operating System (GML) for Brain-Like Intelligence
This paper introduces a preliminary concept aimed at achieving Artificial General Intelligence (AGI) by leveraging a novel approach rooted in two key aspects. Firstly, we present the General Intelligent Network (GIN) paradigm, which integrates information entropy principles with a generative network, reminiscent of Generative Adversarial Networks (GANs). Within the GIN network, original multimodal information is encoded as low information entropy hidden state representations (HPPs). These HPPs serve as efficient carriers of contextual information, enabling reverse parsing by contextually relevant generative networks to reconstruct observable information. Secondly, we propose a Generalized Machine Learning Operating System (GML System) to facilitate the seamless integration of the GIN paradigm into the AGI framework. The GML system comprises three fundamental components: an Observable Processor (AOP) responsible for real-time processing of observable information, an HPP Storage System for the efficient retention of low entropy hidden state representations, and a Multimodal Implicit Sensing/Execution Network designed to handle diverse sensory inputs and execute corresponding actions.
[69] vixra:2401.0012 [pdf]
BERT-Based RASP: Enhancing Runtime Application Security with Fine-Tuned BERT
Runtime Application Security Protection (RASP) is crucial in safeguarding applications against evolving cyber threats. This research presents a novel approach leveraging a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model as the cornerstone of a robust RASP solution. The fine-tuning process optimizes BERT's natural language processing capabilities for application security, enabling nuanced threat detection and mitigation at runtime. The developed RASP system harnesses BERT's contextual understanding to proactively identify and neutralize potential vulnerabilities and attacks within diverse application environments. Through comprehensive evaluation and experimentation, this study demonstrates the efficacy and adaptability of the BERT-based RASP solution in enhancing application security, thereby contributing to the advancement of proactive defense mechanisms against modern cyber threats.
[70] vixra:2312.0153 [pdf]
Active Learning for Question Difficulty Prediction
This paper focuses on question difficulty estimation (calibration), and its applications in educational scenarios and beyond. The emphasis is on the use of Active Learning to bound the minimum number of labelled samples that we need. It also explores using various SOTA methods for predicting question difficulty, with a specific focus on German textual questions using the Lernnavi dataset. The study refines preprocessing techniques for question data and metadata to improve question difficulty estimation.
[71] vixra:2312.0152 [pdf]
Diff+STN Architectures for External Orientation Correction
STNs are highly efficient at warping the input image for a downstream task. However, cascaded STNs have been found to learn more complex transformations. We attempt to leverage the multistep process of diffusion models to produce module(s) that have a similar effect to cascaded STNs.
[72] vixra:2312.0141 [pdf]
Tumbug: a Pictorial, Universal Knowledge Representation Method
Since the key to artificial general intelligence (AGI) is commonly believed to be commonsense reasoning (CSR) or, roughly equivalently, discovery of a knowledge representation method (KRM) that is particularly suitable for CSR, the author developed a custom KRM for CSR. This novel KRM called Tumbug was designed to be pictorial in nature because there exists increasing evidence that the human brain uses some pictorial type of KRM, and no well-known prior research in AGI has researched this KRM possibility. Tumbug is somewhat similar to Roger Schank's Conceptual Dependency (CD) theory, but Tumbug is pictorial and uses about 30 components based on fundamental concepts from the sciences and human life, in contrast to CD theory, which is textual and uses about 17 components (= 6 Primitive Conceptual Categories + 11 Primitive Acts) based mainly on human-oriented activities. All the Building Blocks of Tumbug were found to generalize to only five Basic Building Blocks that exactly correspond to the three components {O, A, V} of traditional Object-Attribute-Value representation plus two new components {C, S}, which are Change and System. Collectively this set of five components, called "SCOVA," seems to be a universal foundation for all knowledge representation.
[73] vixra:2312.0138 [pdf]
A Promising Visual Approach to Solution of 82% of Winograd Schema Problems Via Tumbug Visual Grammar
This 2023 document is a wrapper that embeds the author's original 2022 article of the above title that has never been publicly available before. The embedded article is about Phase 1 (which is about Tumbug) and Phase 2 (which is about non-spatial reasoning) of the 5-phase Visualizer Project of the author, a project that is still in progress as of late 2023. The embedded article is currently being re-released by the author to supply more information about that project to the public, and for historical reasons. The embedded article was written before a much more thorough article about Phase 1 (viz., "Tumbug: A pictorial, universal knowledge representation method") became available in 2023, but the embedded article describes results from Phase 2 that have not yet been documented elsewhere.
[74] vixra:2312.0105 [pdf]
Fine-tuning BERT for HTTP Payload Classification in Network Traffic
Fine-tuning pre-trained language models like Bidirectional Encoder Representations from Transformers (BERT) has exhibited remarkable potential in various natural language processing tasks. In this study, we propose and investigate the fine-tuning of BERT specifically for the classification of HTTP payload representations within network traffic. Given BERT's adeptness at capturing semantic relationships among tokens, we aim to harness its capabilities for discerning normal and anomalous patterns within HTTP payloads. Leveraging transfer learning by fine-tuning BERT, our methodology involves training the model on a task-specific dataset to adapt its pre-trained knowledge to the intricacies of HTTP payload classification. We explore the process of fine-tuning BERT to learn nuanced representations of HTTP payloads and effectively distinguish between normal and anomalous traffic patterns. Our findings reveal the potential efficacy of fine-tuned BERT models in bolstering the accuracy and efficiency of anomaly detection mechanisms within network communications.
[75] vixra:2311.0089 [pdf]
Prototype-Based Soft Feature Selection Package
This paper presents a prototype-based soft feature selection package (Sofes) wrapped around the highly interpretable Matrix Robust Soft Learning Vector Quantization (MRSLVQ) and the Local MRSLVQ algorithms. The process of assessing feature relevance with Sofes aligns with a comparable approach established in the Nafes package, with the primary distinction being the utilization of prototype-based induction learners influenced by a probabilistic framework. The numerical evaluation of test results aligns Sofes' performance with that of the Nafes package.
[76] vixra:2311.0080 [pdf]
Unlocking Robotic Potential Through Modern Organ Segmentation
Deep learning has revolutionized the approach to complex data-driven problems, specifically in medical imaging, where its techniques have significantly raised efficiency in organ segmentation. The urgent need to enhance the depth and precision of organ-based classification is an essential step towards automation of medical operation and diagnostics. The research aims to investigate the effect and potential advantages transformer models have on binary semantic segmentation, the method utilized for the project. Hence, I employed the SegFormer model, for its lightweight architecture, as the primary deep learning model, alongside the Unet. A custom 2D computerized tomography (CT) scan dataset, CT-Org2D, was assembled through meticulous operations. Extensive experiments showed that, in contrast to the selected models, the task's simplicity required a redesigned Unet architecture with reduced complexity. This model yielded impressive results: Precision, Recall, and IOU scores of 0.91, 0.92, and 0.85 respectively. The research serves as a starting point, motivating further exploration, through different methodologies, to achieve even greater efficiency in organ segmentation.
[77] vixra:2310.0118 [pdf]
Application of Deep and Reinforcement Learning to Boundary Control Problems
The boundary control problem is a non-convex optimization and control problem arising in many scientific domains, including fluid mechanics, structural engineering, and heat transfer optimization. The aim is to find the optimal values for the domain boundaries such that the enclosed domain, adhering to the governing equations, attains the desired state values. Traditionally, non-linear optimization methods, such as the Interior-Point Method (IPM), are used to solve such problems. This project explores the possibilities of using deep learning and reinforcement learning to solve boundary control problems. We adhere to the framework of iterative optimization strategies, employing a spatial neural network to construct well-informed initial guesses, and a spatio-temporal neural network that learns the iterative optimization algorithm using policy gradients. Synthetic data, generated from problems formulated in the literature, is used for training, testing and validation. The numerical experiments indicate that the proposed method can rival the speed and accuracy of existing solvers. In our preliminary results, the network attains costs lower than IPOPT, a state-of-the-art non-linear IPM, in 51% of cases. The overall number of floating point operations in the proposed method is similar to that of IPOPT. Additionally, the informed initial guess method and the learned momentum-like behaviour in the optimizer are incorporated to avoid convergence to local minima.
[78] vixra:2310.0061 [pdf]
Machine Learning Methods in Algorithmic Trading: an Experimental Evaluation of Supervised Learning Techniques for Stock Price
In the dynamic world of financial markets, accurate price predictions are essential for informed decision-making. This research proposal outlines a comprehensive study aimed at forecasting stock and currency prices using state-of-the-art Machine Learning (ML) techniques. By delving into the intricacies of models such as Transformers, LSTM, Simple RNN, NHits, and NBeats, we seek to contribute to the realm of financial forecasting, offering valuable insights for investors, financial analysts, and researchers. This article provides an in-depth overview of our methodology, data collection process, model implementations, evaluation metrics, and potential applications of our research findings. The research indicates that NBeats and NHits models exhibit superior performance in financial forecasting tasks, especially with limited data, while Transformers require more data to reach full potential. Our findings offer insights into the strengths of different ML techniques for financial prediction, highlighting specialized models like NBeats and NHits as top performers - thus informing model selection for real-world applications.
[79] vixra:2310.0047 [pdf]
Transforming Education Through AI, Benefits, Risks, and Ethical Considerations
The integration of Artificial Intelligence (AI) into education has the potential to revolutionize traditional teaching and learning methods. AI can offer personalized learning experiences, streamline administrative tasks, enhance feedback mechanisms, and provide robust data analysis. Numerous studies have demonstrated the positive impact of AI on both student outcomes and teacher efficiency. However, caution must be exercised when implementing AI in education, considering potential risks and ethical dilemmas. It is essential to use AI as a tool to support human educators rather than replace them entirely. The adoption of AI in education holds the promise of creating more inclusive and effective learning environments, catering to students of diverse backgrounds and abilities. As AI technology continues to advance, the education sector can anticipate even more innovative applications, further shaping the future of learning. This abstract provides an overview of the multifaceted landscape of AI in education, highlighting its potential benefits, associated challenges, and the importance of responsible integration.
[80] vixra:2309.0149 [pdf]
Hyperparameter Optimization and Interpretation in Machine Learning
Machine learning has undergone tremendous advancements, paving the way for a myriad of applications across industries. In the midst of this progress, the significance of hyperparameter tuning and model evaluation cannot be overstated, as they play a critical role in achieving optimal model performance. This project delves into the realm of ML model optimization and evaluation, harnessing Bayesian Optimization, SHAP (SHapley Additive exPlanations), and traditional evaluation metrics. By focusing on a decision tree classifier, the study investigates the efficiency of various hyperparameter tuning methods, the interpretability of model decisions, and the robustness of performance metrics. Preliminary results suggest that Bayesian Optimization may offer advantages in efficiency over traditional tuning methods. Furthermore, SHAP values provide deeper insights into model decision-making, fostering better transparency and trust in ML applications.
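A minimal sketch of the described pipeline, assuming scikit-optimize's gp_minimize for the Bayesian optimization and the shap package for attributions (the dataset and search space below are placeholders, not the project's):

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from skopt import gp_minimize
from skopt.space import Integer

X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset

def objective(params):
    max_depth, min_samples_leaf = params
    clf = DecisionTreeClassifier(max_depth=max_depth,
                                 min_samples_leaf=min_samples_leaf,
                                 random_state=0)
    # Negate accuracy because gp_minimize minimizes its objective.
    return -cross_val_score(clf, X, y, cv=5).mean()

result = gp_minimize(objective,
                     [Integer(2, 16), Integer(1, 20)],  # assumed search space
                     n_calls=25, random_state=0)

best = DecisionTreeClassifier(max_depth=result.x[0],
                              min_samples_leaf=result.x[1],
                              random_state=0).fit(X, y)
shap_values = shap.TreeExplainer(best).shap_values(X)  # per-feature attributions
```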
[81] vixra:2309.0076 [pdf]
Prototype-based Feature Selection with the Nafes Package
This paper introduces Nafes as a prototype-based feature selection package designed as a wrapper centered on the highly interpretable and powerful Generalized Matrix Learning Vector Quantization (GMLVQ) classification algorithm and its local variant (LGMLVQ). Nafes utilizes the learned relevances evaluated by the mutation validation scheme for Learning Vector Quantization (LVQ), which iteratively converges to selected features that relevantly contribute to the prototype-based classifier decisions.
[82] vixra:2309.0063 [pdf]
Tumor Angiogenic Optimizer: A New Bio-Inspired Metaheuristic
In this article, we propose a new metaheuristic inspired by the morphogenetic cellular movements of endothelial cells (ECs) that occur during the tumor angiogenesis process. The algorithm starts with a random initial population. In each iteration, the best candidate is selected as the tumor, while the other individuals in the population are treated as ECs migrating toward the tumor's direction, following coordinated dynamics through a spatial relationship between tip and follower ECs. The mathematical model of EC movements in angiogenic morphogenesis is detailed in the article. This algorithm has an advantage over other similar optimization metaheuristics: the model parameters are already configured according to the modeling of the tumor angiogenesis phenomenon, sparing researchers from initializing them with arbitrary values. Subsequently, the algorithm is compared against well-known benchmark functions, and the results are validated through a comparative study with Particle Swarm Optimization (PSO). The results demonstrate that the algorithm is capable of providing highly competitive outcomes. The proposed algorithm is also applied to a real-world problem, where it performed effectively in solving constrained optimization problems, surpassing other known algorithms.
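A minimal sketch of the described population dynamics, where the best candidate acts as the tumor and the remaining cells migrate toward it (the attraction-plus-noise step rule below is an illustrative assumption, not the paper's calibrated angiogenesis model):

```python
import numpy as np

def tao_like_optimizer(f, dim=2, pop=30, iters=200, seed=0):
    """Population sketch of the described dynamics: the best individual acts
    as the 'tumor'; all other 'endothelial cells' migrate toward it. The step
    rule (attraction plus small noise) is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    cells = rng.uniform(-5, 5, size=(pop, dim))
    for _ in range(iters):
        fitness = np.apply_along_axis(f, 1, cells)
        tumor = cells[np.argmin(fitness)]            # best candidate = tumor
        step = 0.1 * (tumor - cells)                 # migrate toward the tumor
        cells = cells + step + 0.01 * rng.standard_normal(cells.shape)
    fitness = np.apply_along_axis(f, 1, cells)
    return cells[np.argmin(fitness)], fitness.min()

# Example on the sphere benchmark, a common test function.
best_x, best_f = tao_like_optimizer(lambda x: np.sum(x**2))
```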
[83] vixra:2308.0179 [pdf]
"LAHEL": An AI-Generated Content Approached LAwHELper to Personal Legal Advice
In certain developing countries, public awareness of legal rights is increasing, leading to a growing demand for legal consultation. However, the time and monetary costs associated with consulting professional lawyers remain high. Concurrently, there are two major impacts of computer science on the current legal sector. First, within government and public prosecution systems, information systems have accumulated vast amounts of structured and semi-structured data, offering significant economic value and potential for exploration. However, few people have attempted to mine these data resources. Second, intelligent dialogue systems have matured, but dialogue systems specifically tailored for the legal domain have not yet emerged. Considering these two trends, we introduce LAHEL, a legal consultation system developed by a team of nine individuals over the course of two years, dedicated to addressing the aforementioned issues. The system comprises three components: search, human dialogue systems, and robot dialogue systems. Its primary contributions are twofold: exploring the application of AI in legal consultation and summarizing lessons learned from the design of legal consultation systems.
[84] vixra:2308.0116 [pdf]
An ADMM Algorithm for a Generic L0 Sparse Overlapping Group Lasso Problem
We present an alternating direction method of multipliers (ADMM) for a generic overlapping group lasso problem, where the groups can overlap in an arbitrary way. We also prove lower and upper bounds for both the $\ell_1$ sparse group lasso problem and the $\ell_0$ sparse group lasso problem, and propose algorithms for computing these bounds.
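The report's exact problem statement is not reproduced in the abstract; for orientation, one standard form of the sparse overlapping group lasso objective is (notation assumed, not quoted from the paper):

```latex
\min_{x \in \mathbb{R}^n} \;
  \frac{1}{2}\,\|Ax - b\|_2^2
  \;+\; \lambda_1 \,\Omega(x)
  \;+\; \lambda_2 \sum_{g \in \mathcal{G}} w_g \,\|x_g\|_2 ,
\qquad
\Omega(x) \in \{\, \|x\|_1 ,\; \|x\|_0 \,\},
```

where the groups $g \in \mathcal{G}$ may overlap arbitrarily; ADMM typically introduces duplicated variables per group so that each subproblem becomes separable.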
[85] vixra:2308.0112 [pdf]
Mutation Validation for Learning Vector Quantization
Mutation validation as a complement to existing applied machine learning validation schemes has been explored in recent times. Exploratory work on Learning Vector Quantization (LVQ) based on this model-validation scheme remains to be done. This paper proposes mutation validation as an extension to existing cross-validation and holdout schemes for Generalized LVQ and its advanced variants. The mutation validation scheme provides a responsive, interpretable, intuitive and easily comprehensible score that complements existing validation schemes employed in the performance evaluation of the prototype-based LVQ family of classification algorithms. This paper establishes a relation between the mutation validation scheme and goodness-of-fit evaluation for four LVQ models: Generalized LVQ, Generalized Matrix LVQ, Generalized Tangent LVQ and Robust Soft LVQ. Numerical evaluation of these models' complexity and effects on test outcomes places the mutation validation scheme above cross-validation and holdout schemes.
[86] vixra:2308.0075 [pdf]
Improved Memory-guided Normality with Specialized Training Techniques of Deep SVDD
Deep learning techniques have shown remarkable success in various tasks, including feature learning, representation learning, and data reconstruction. Autoencoders, a subset of neural networks, are particularly powerful in capturing data patterns and generating meaningful representations. This paper presents an investigation into combining Deep SVDD with memory modules.
[87] vixra:2307.0146 [pdf]
Structural Embeddings of Tools for Large Language Models
It is evident that the current state of Large Language Models (LLMs) necessitates the incorporation of external tools. The lack of straightforward algebraic and logical reasoning is well documented and prompted researchers to develop frameworks which allow LLMs to operate via external tools. The ontological nature of tool utilization for a specific task can be well formulated with a Directed Acyclic Graph (DAG). The central aim of the paper is to highlight the importance of graph based approaches to LLM-tool interaction in near future. We propose an exemplary framework to guide the orchestration of exponentially increasing numbers of external tools with LLMs, where objectives and functionalities of tools are graph encoded hierarchically. Assuming that textual segments of a Chain-of-Thought (CoT) can be imagined as a tool as defined here, the graph based framework can pave new avenues in that particular direction as well.
[88] vixra:2307.0121 [pdf]
Training Self-supervised Class-conditional GANs with Classifier Gradient Penalty and Dynamic Prior
Class-conditional GAN generates class-conditional data from a continuous latent distribution and a categorical distribution. Typically, a class-conditional GAN can be trained only when the label, which is the conditional categorical distribution of the target data, is given. In this paper, we propose a novel GAN that allows the model to perform self-supervised class-conditional data generation and clustering without knowing labels, the optimal prior categorical probability, or a metric function. The proposed method uses a discriminator, a classifier, and a generator. The classifier is trained with cross-entropy loss to predict the conditional vector of the fake data. Also, the conditional vector of real data predicted by the classifier is used to train the class-conditional GAN. When training a class-conditional GAN with this classifier, the decision boundary of the classifier falls to the local optima where the density of the data is minimized. The proposed method adds a classifier gradient penalty loss to the classifier loss to prevent the classifier's decision boundary from falling into a narrow range of local optima. It regulates the gradient of the classifier's output to prevent the gradient near the decision boundary from becoming too large. As the classifier gradient penalty loss weight increases, the decision boundary falls into a wider range of local optima, meaning that the sensitivity of each class can be adjusted by the weight of the gradient penalty loss. Additionally, the proposed method updates the prior categorical probability with the categorical probability of real data predicted by the classifier. As training progresses, the entropy of the prior categorical probability decreases and converges according to the classifier gradient penalty loss weight.
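The penalty is not given in closed form in the abstract; a minimal PyTorch sketch of one common input-gradient penalty on a classifier's output (the squared gradient-norm form is an illustrative choice, not necessarily the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def classifier_gradient_penalty(classifier, x, weight=1.0):
    """Penalize the input-gradient norm of the classifier's output so the
    decision boundary does not sharpen into a narrow range of local optima.
    The squared gradient-norm form here is a common, illustrative choice."""
    x = x.detach().requires_grad_(True)
    probs = F.softmax(classifier(x), dim=1)
    grads = torch.autograd.grad(outputs=probs.sum(), inputs=x,
                                create_graph=True)[0]
    # Sum over all non-batch dims, then average over the batch.
    return weight * grads.pow(2).sum(dim=tuple(range(1, grads.dim()))).mean()
```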
[89] vixra:2306.0099 [pdf]
Boolean Structured Autoencoder Convolutional Deep Learning Network (BSautoconvnet)
In this paper, I propose a new Boolean Structured Autoencoder Convolutional Deep Learning Network (BSautoconvnet) built on top of BSconvnet, based on the concept of monotone multi-layer Boolean algebra. I show that this network achieves a significant improvement in accuracy over an ordinary ReLU Autoencoder Convolutional Deep Learning Network, with a much smaller number of parameters, on the CIFAR10 dataset. The model is evaluated by visual inspection of the quality of the reconstructed images against ground truth and against images reconstructed by models found on the internet.
[90] vixra:2306.0055 [pdf]
Introducing Proteus: a Mega Prompt with Personality, Skills and Dynamic Logic Based Internal Prompt Manipulation
There have been significant improvements in directing large language models (LLMs) to answer logic-based questions such as mathematical reasoning tasks. This has resulted in near-perfect performance on these types of problems, with accuracy levels in the mid-nineties using state-of-the-art models (GPT-4). Achieving this level of accuracy has previously required a multi-prompt approach to elicit better performance from LLMs. This paper introduces a new prompt paradigm termed the "Mega prompt" and further introduces Proteus, a state-of-the-art mega prompt that has been used to achieve a new level of accuracy of 97% on the GSM8K math data set.
[91] vixra:2306.0052 [pdf]
Competences in Ontology-based Enterprise Architecture Modeling: Zooming In and Out
Competence-based approaches have received increased attention, as the demand for qualified people with the right combination of competences establishes itself as a major factor of organizational performance. This paper examines how competences can be incorporated into Enterprise Architecture modeling: (i) we identify a key set of competence-related concepts such as skills, knowledge, and attitudes, (ii) analyze and relate them using a reference ontology (grounded on the Unified Foundational Ontology), and (iii) propose a representation strategy for modeling competences and their constituent elements leveraging the ArchiMate language, discussing how the proposed models can fit in enterprise competence-based practices. Our approach is intended to cover two tasks relevant to the combined application of Enterprise Architecture and Competence Modeling: `zooming in' on competences, revealing the relations between competences, knowledge, skills, attitudes and other personal characteristics that matter in organizational performance, and `zooming out' of competences, placing them in the wider context of other personal competences and overall organizational capabilities.
[92] vixra:2306.0003 [pdf]
Deep Learning for Physics Problems: A Case Study in Continuous Gravitational Waves Detection
Deep learning has become a powerful tool for solving a wide variety of problems, including those in physics. In this paper, we explore the use of deep learning for the detection of continuous gravitational waves. We propose two different approaches: one based on time-domain analysis and the other based on frequency-domain analysis. Both approaches achieve nearly the same performance, suggesting that deep learning is a promising technique for this task. The main purpose of this paper is to provide an overview of the potential of deep learning for physics problems. We do not provide a performance-measured solution, as this is beyond the scope of this paper. However, we believe that the results presented here are encouraging and suggest that deep learning is a valuable tool for physicists.
[93] vixra:2305.0166 [pdf]
Boolean Structured Convolutional Deep Learning Network (BSconvnet)
In this paper, I propose a new Boolean Structured Convolutional Deep Learning Network (BSconvnet) built on top of BSnet, based on the concept of monotone multi-layer Boolean algebra. I show that this network achieves a significant improvement in accuracy over an ordinary ReLU Convolutional Deep Learning Network, with a much smaller number of parameters, on the CIFAR10 dataset.
[94] vixra:2305.0104 [pdf]
Detection of Abnormalities in Blood Cells Using a Region-based Segmentation Approach and Supervised Machine Learning Algorithm
Screening (the slide reading stage) is a manual human activity in cytology which consists of the inspection or analysis by the cytotechnician of all the cells present on a slide. Segmentation of blood cells is an important research question in hematology and other related fields. Since this activity is human-based, detection of abnormal cells is difficult. Medical image processing has recently become a very important discipline for computer-aided diagnosis, in which many methods are applied to solve real problems. Our research work is in the field of computer-assisted diagnosis on blood images for the detection of abnormal cells. To this end, we propose a hybrid segmentation method to extract the correct shape of the nuclei, extract features, and classify them using SVM and KNN binary classifiers. In order to evaluate the performance of the hybrid segmentation and the choice of the classification model, we carried out a comparative study between our hybrid segmentation method followed by our SVM classification model, and a segmentation method based on global thresholding followed by a KNN classification model. Experiments carried out on the 62 blood smear images show that the SVM binary classification model gives an accuracy of 97% with hybrid segmentation versus 57% with global thresholding, and 95% for the KNN classification model. As our dataset was not balanced, we evaluated precision, recall, F1 score and cross-validation with the Stratified K-Fold cross-validation algorithm for each of these segmentation methods and classification models. We obtain respectively 93.75%, 98.712% and 99% for hybrid segmentation, reflecting its effectiveness compared to global fixed-threshold segmentation and the KNN classification model. To evaluate the performance of these models, we obtained the following results: 77% mean accuracy for the SVM and 61% for the KNN, and 84% mean test accuracy for the SVM and 74% for the KNN, making the SVM the best performing model.
[95] vixra:2305.0006 [pdf]
Bio-Inspired Simple Neural Network for Low-Light Image Restoration: A Minimalist Approach
In this study, we explore the potential of using a straightforward neural network inspired by the retina model to efficiently restore low-light images. The retina model imitates the neurophysiological principles and dynamics of various optical neurons. Our proposed neural network model reduces the computational overhead compared to traditional signal-processing models while achieving results similar to complex deep learning models from a subjective perceptual perspective. By directly simulating retinal neuron functionalities with neural networks, we not only avoid manual parameter optimization but also lay the groundwork for constructing artificial versions of specific neurobiological organizations.
[96] vixra:2304.0003 [pdf]
Computational Consciousness
Computational consciousness is a novel hypothesis that aims to replicate human consciousness in artificial systems using Multithreaded Priority Queues (MPQs) and machine learning models. The study addresses the challenge of processing continuous data from various categories, such as vision, hearing, and speech, to create a coherent and context-aware system. The proposed model employs parallel processing and multithreading, allowing multiple threads to run simultaneously, each executing a machine learning model. A priority queue manages the execution of threads, prioritizing the most important ones based on the subjective importance of events determined by GPT-3. The model incorporates short-term and long-term memory, storing information generated at each moment, and uses an Evolutionary Algorithm (EA) for training the machine learning models. A preliminary experiment was conducted using Python 3.9.12, demonstrating the technical feasibility of the hypothesis. However, limitations such as the lack of a comprehensive environment, absence of load balancing, and GPT-3 API constraints were identified. The significance of this study lies in its potential contribution to the understanding of consciousness and the development of Artificial General Intelligence (AGI). By exploring the integration of multiple threads of execution and machine learning models, this work provides a foundation for further research and experimentation in the field of computational consciousness. Addressing the limitations and potential criticisms will help strengthen the model's validity and contribute to the understanding of this complex phenomenon.
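As a feasibility illustration only, a minimal Python sketch of a multithreaded priority queue that dispatches the most important pending event first (the importance scores are hard-coded here; in the described system they would come from GPT-3's judgment):

```python
import heapq
import itertools
import threading

class MPQ:
    """Minimal multithreaded priority queue: the highest-importance pending
    event is dispatched first, each on its own worker thread."""
    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()
        self._tie = itertools.count()   # breaks ties between equal scores

    def push(self, importance, task):
        with self._lock:
            heapq.heappush(self._heap, (-importance, next(self._tie), task))

    def run_next(self):
        with self._lock:
            if not self._heap:
                return None
            _, _, task = heapq.heappop(self._heap)
        worker = threading.Thread(target=task)
        worker.start()
        return worker

queue = MPQ()
queue.push(0.2, lambda: print("background hum"))
queue.push(0.9, lambda: print("vision event"))
queue.run_next().join()                 # "vision event" runs first
```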
[97] vixra:2303.0076 [pdf]
Hall Effect Thruster Design Via Deep Neural Network for Additive Manufacturing
Hall effect thrusters are among the most versatile and popular electric propulsion systems for space use. Industry trends towards interplanetary missions are driving advances in the design of such propulsion systems. It is understood that correct sizing of the discharge channel in a Hall effect thruster greatly impacts performance. Since the complete physics model of such a propulsion system is not yet optimized for fast computations and design iterations, most thrusters are designed using so-called scaling laws. This work, however, focuses on a rather novel approach, which is outlined less frequently in the literature than the ordinary scaling design approach. Using deep machine learning, it is possible to create a predictive performance model that can be used to effortlessly obtain a design for a Hall thruster with the required characteristics, using far less computing power than designing from scratch and offering far more flexibility than the usual scaling approach.
[98] vixra:2302.0134 [pdf]
Deterministic Degradation Process for Diffusion GAN and Its Inversion
Recently, diffusion models have shown impressive generative performance. However, they have the disadvantage of having a high latent dimension and slow sampling speed. To increase the sampling speed of diffusion models, diffusion GANs have been proposed. But the latent dimension of diffusion GANs using non-deterministic degradation is still high, making it difficult to invert the generative model. In this paper, we introduce an invertible diffusion GAN that uses deterministic degradation. Our proposed method performs inverse diffusion using deterministic degradation without a model, and the generator of the GAN is trained to perform the diffusion process with the latent random variable. The proposed method uses deterministic degradation, so the latent dimension is low enough to be invertible.
[99] vixra:2302.0126 [pdf]
A Novel Quantum Belief Entropy for Uncertainty Measure in Complex Evidence Theory
In this paper, a new quantum representation of CBBA is proposed. In addition, a novel quantum belief entropy is proposed to measure the uncertainty of CBBA in complex evidence theory.
[100] vixra:2302.0042 [pdf]
Neuro-symbolic Meta Reinforcement Learning for Trading
We model short-duration (e.g. day) trading in financial markets as a sequential decision-making problem under uncertainty, with the added complication of continual concept-drift. We therefore employ meta reinforcement learning via the RL2 algorithm. It is also known that human traders often rely on frequently occurring symbolic patterns in price series. We employ logical program induction to discover symbolic patterns that occur frequently as well as recently, and explore whether using such features improves the performance of our meta reinforcement learning algorithm. We report experiments on real data indicating that meta-RL is better than vanilla RL and also benefits from learned symbolic features.
[101] vixra:2212.0176 [pdf]
Efficient Integration of Perceptual VAE into Dynamic Latent Scale GAN
Dynamic latent scale GAN is a method for training an encoder that inverts the generator of a GAN with maximum likelihood estimation. In this paper, we propose a method to improve the performance of dynamic latent scale GAN by efficiently integrating a perceptual VAE loss into dynamic latent scale GAN. When a dynamic latent scale GAN is trained with a normal i.i.d. latent random variable and the latent encoder is integrated into the discriminator, the sum of the predicted latent random variable of real data and a scaled normal noise follows a normal i.i.d. random variable. This random variable can be used for both VAE and GAN training. Considering the intermediate layer output of the discriminator as a feature encoder output, the generator can be trained to minimize the perceptual VAE loss. Moreover, inference and backpropagation for the perceptual VAE loss can be integrated into those for GAN training, so perceptual VAE training does not require additional computation. The proposed method also does not require a prior loss or variance estimation as VAE does.
[102] vixra:2212.0163 [pdf]
The SP-multiple-alignment Concept as a Generalisation of Six Other Variants of "Information Compression via the Matching and Unification of Patterns"
This paper focusses on the powerful concept of SP-multiple-alignment, a key part of the SP System (SPS), meaning the SP Theory of Intelligence and its realisation in the SP Computer Model. The SPS is outlined in an appendix. More specifically, the paper shows with examples how the SP-multiple-alignment construct may function as a generalisation of six other variants of 'Information Compression via the Matching and Unification of Patterns' (ICMUP). Each of those six variants is described in a separate section, and in each case there is a demonstration of how that variant may be modeled via the SP-multiple-alignment construct.
[103] vixra:2211.0015 [pdf]
The Acceleration of Multi-Factor Merton Model on FPGA
Credit risk refers to the risk of losses caused by unwanted events, such as the default of an obligor. Managing portfolio credit risk is crucial for financial institutions. The multi-factor Merton model is one of the most widely used tools for modelling credit risk in financial institutions. Typically, the implementation of the multi-factor Merton model involves Monte Carlo simulations, which are time-consuming; this significantly restricts its usability in daily credit risk measurement. In this report, we propose an FPGA architecture for credit-risk measurement in multi-factor Merton models. The presented architecture uses a variety of optimization techniques, such as kernel vectorization and loop unrolling, to optimize the performance of the FPGA implementation. The evaluation results show that, compared to a basic C++ implementation running on a single-core Intel i5-4210 CPU, our proposed FPGA implementation achieves an acceleration of up to 22 times, with a precision loss of less than 10^-8.
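For reference, a minimal NumPy Monte Carlo sketch of a multi-factor Merton loss distribution (parameter names and the unit-LGD simplification are assumptions; the report's FPGA kernel and calibration are not shown):

```python
import numpy as np

def merton_loss_dist(betas, thresholds, exposures, n_sims=100_000, seed=0):
    """Monte Carlo portfolio-loss sketch of a multi-factor Merton model:
    obligor i's asset return is a weighted sum of systematic factors plus an
    idiosyncratic shock, and default occurs below a threshold.
    betas: (n_obligors, n_factors) factor loadings with squared row norm < 1."""
    rng = np.random.default_rng(seed)
    n_obl, n_fac = betas.shape
    idio = np.sqrt(1.0 - np.sum(betas**2, axis=1))   # idiosyncratic weight
    factors = rng.standard_normal((n_sims, n_fac))
    eps = rng.standard_normal((n_sims, n_obl))
    assets = factors @ betas.T + idio * eps          # (n_sims, n_obligors)
    losses = (assets < thresholds) @ exposures       # default indicator x exposure
    return losses                                    # empirical loss distribution

betas = np.array([[0.3, 0.2], [0.4, 0.1], [0.2, 0.5]])
losses = merton_loss_dist(betas, thresholds=np.full(3, -2.0),
                          exposures=np.array([1.0, 2.0, 1.5]))
var_99 = np.quantile(losses, 0.99)                   # portfolio credit VaR estimate
```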
[104] vixra:2211.0014 [pdf]
Parallel Parameter Estimation for Gilli-Winker Model Using Multi-Core CPUs
Agent-based modeling is a powerful tool that is widely used to model global financial systems. When the parameters of the model are appropriate, the price time series generated by the model exhibit marked similarities with actual financial time series and even reproduce some of their statistical characteristics. Using Kirman's Ant model as a prototype, this report systematically explores Gilli and Winker's parameter optimization method. In view of some limitations of this method, the report proposes several improvements, including a local-restart strategy to enhance the convergence ability of the original optimization method, and the incorporation of Simulated Annealing into the original method to help the algorithm escape from local optima. Furthermore, since the parameter optimization of agent-based models tends to be very time-consuming, an acceleration method is also proposed to speed up this procedure. Finally, the presented methods have been validated on the EUR/USD exchange rate.
[105] vixra:2210.0089 [pdf]
Extending F1 Metric: Probabilistic Approach
This article explores an extension of the well-known F1 score used for assessing the performance of binary classifiers. We propose a new metric using a probabilistic interpretation of precision, recall, specificity, and negative predictive value. We describe its properties and compare it to common metrics. Then we demonstrate its behavior in edge cases of the confusion matrix. Finally, the properties of the metric are tested on a binary classifier trained on a real dataset.
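The abstract does not state the extended formula; purely as an illustration of the idea, the sketch below takes the harmonic mean of all four rates (this specific combination is an assumption; classic F1 is recovered by using precision and recall only):

```python
def confusion_rates(tp, fp, fn, tn):
    """Precision, recall, specificity, and negative predictive value (NPV)."""
    precision   = tp / (tp + fp) if tp + fp else 0.0
    recall      = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    npv         = tn / (tn + fn) if tn + fn else 0.0
    return precision, recall, specificity, npv

def harmonic_mean_metric(tp, fp, fn, tn):
    """Illustrative extension only: harmonic mean of all four rates.
    The paper's actual probabilistic definition may differ."""
    rates = confusion_rates(tp, fp, fn, tn)
    if min(rates) == 0.0:
        return 0.0
    return len(rates) / sum(1.0 / r for r in rates)

# Classic F1 is the harmonic mean of the first two rates only.
print(harmonic_mean_metric(tp=40, fp=10, fn=5, tn=45))
```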
[106] vixra:2209.0153 [pdf]
Technical Report for WAIC Challenge of Financial QA under Market Volatility
This technical report presents the 1st winning model for Financial Community Question-and-Answering (FCQA), a task newly introduced in the Challenge of Financial QA under Market Volatility at WAIC 2022. FCQA aims to respond to users' queries in financial forums with the assistance of heterogeneous knowledge sources. We address this problem by proposing a graph-transformer-based model for efficient multi-source information fusion. As a result, we won first place out of 4278 participating teams and outperformed the second place by 5.07 times on BLEU.
[107] vixra:2209.0089 [pdf]
Attention Weighted Fully Convolutional Neural Networks for Dermatoscopic Image Segmentation
The goal of this project was to develop a fully convolutional neural network (FCNN) capable of identifying the region of interest (ROI) in dermatoscopic images. To achieve this goal, a U-Net style model was developed for this task and enhanced with an attention module which operated on the extracted features. The addition of this attention module improved the model's semantic segmentation performance and increased pixel-level precision and recall by 4.0% and 4.6% respectively. The code used in this paper can be found on the project github page: https://github.com/Michael-Blackwell/CapstoneProject
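The attention module is described only as operating on extracted features; a minimal PyTorch sketch of one common design, a squeeze-and-excitation-style channel gate reweighting U-Net features, follows (the project's actual module may differ; see the linked repository):

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    """Channel-wise attention over extracted features (squeeze-and-excitation
    style): global-pool to a channel descriptor, squeeze, excite, then gate.
    One common choice for attention in a U-Net; an illustrative assumption."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # (N, C, 1, 1)
            nn.Conv2d(channels, channels // reduction, 1), # squeeze
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), # excite
            nn.Sigmoid(),
        )

    def forward(self, x):            # x: (N, C, H, W)
        return x * self.gate(x)      # per-channel reweighting

features = torch.randn(2, 64, 32, 32)
attended = FeatureAttention(64)(features)   # same shape, reweighted channels
```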
[108] vixra:2209.0069 [pdf]
Predictive Signals Obtained from Bayesian Network and the Prediction Quality
In this paper, we propose a method for learning signals related to a data frame $D_{1}$. The learning algorithm is based on the biggest entropy variations of a Bayesian network. The method makes it possible to obtain an optimal Bayesian network having a high likelihood with respect to the signals $D_{1}$. From the learned optimal Bayesian network, we show how to infer new signals $D_{2}$, and we introduce the prediction quality $\Delta_{CR}$, which evaluates the predictive quality of the inferred signals $D_{2}$. We then infer a large number (10000) of candidate signals $D_{2}$ and select the predictive signals $D_{2}^{*}$ having the best prediction quality. Once the optimal signals $D_{2}^{*}$ are obtained, we impose on the points of the signals $D_{2}^{*}$ the same order of scatter (computed from the Mahalanobis distance) as that of the signals $D_{1}$.
[109] vixra:2209.0007 [pdf]
FaithNet: A Generative Framework in Human Mentalizing
In this paper, we first review some of the innovations in modeling mentalizing. Broadly, this involves building models for computing World Models and Theory of Mind (ToM). A simple framework, FaithNet, is then presented, with concepts like persistence, continuity, cooperation and preference represented as faith rules. FaithNet defines a generative model that can sample faith rules. Our FaithNet utilizes a general-purpose conditioning mechanism based on cross-attention, offering computations that best explain observed real-world events under a Bayesian criterion.
[110] vixra:2209.0005 [pdf]
BeatNet: CRNN and Particle Filtering for Online Joint Beat, Downbeat and Meter Tracking
The online estimation of rhythmic information, such as beat positions, downbeat positions, and meter, is critical for many real-time music applications. Musical rhythm comprises complex hierarchical relationships across time, rendering its analysis intrinsically challenging and at times subjective. Furthermore, systems which attempt to estimate rhythmic information in real-time must be causal and must produce estimates quickly and efficiently. In this work, we introduce an online system for joint beat, downbeat, and meter tracking, which utilizes causal convolutional and recurrent layers, followed by a pair of sequential Monte Carlo particle filters applied during inference. The proposed system does not need to be primed with a time signature in order to perform downbeat tracking, and is instead able to estimate meter and adjust the predictions over time. Additionally, we propose an information gate strategy to significantly decrease the computational cost of particle filtering during the inference step, making the system much faster than previous sampling-based methods. Experiments on the GTZAN dataset, which is unseen during training, show that the system outperforms various online beat and downbeat tracking systems and achieves comparable performance to a baseline offline joint method.
[111] vixra:2208.0171 [pdf]
Singing Beat Tracking With Self-supervised Front-end and Linear Transformers
Tracking beats of singing voices without the presence of musical accompaniment can find many applications in music production, automatic song arrangement, and social media interaction. Its main challenge is the lack of strong rhythmic and harmonic patterns that are important for music rhythmic analysis in general. Even for human listeners, this can be a challenging task. As a result, existing music beat tracking systems fail to deliver satisfactory performance on singing voices. In this paper, we propose singing beat tracking as a novel task, and propose the first approach to solving this task. Our approach leverages semantic information of singing voices by employing pre-trained self-supervised WavLM and DistilHuBERT speech representations as the front-end and uses a self-attention encoder layer to predict beats. To train and test the system, we obtain separated singing voices and their beat annotations using source separation and beat tracking on complete songs, followed by manual corrections. Experiments on the 741 separated vocal tracks of the GTZAN dataset show that the proposed system outperforms several state-of-the-art music beat tracking methods by a large margin in terms of beat tracking accuracy. Ablation studies also confirm the advantages of pre-trained self-supervised speech representations over generic spectral features.
[112] vixra:2207.0146 [pdf]
Generalized Attention Mechanism and Relative Position for Transformer
In this paper, we propose a generalized attention mechanism (GAM) by first suggesting a new interpretation of the self-attention mechanism of Vaswani et al. Following this interpretation, we describe different variants of the attention mechanism which together form the GAM. Further, we propose a new relative position representation within the framework of the GAM. This representation can be easily utilized for cases in which elements next to each other in the input sequence can be at random locations in the actual dataset/corpus.
[113] vixra:2207.0062 [pdf]
Wave Function Collapse Visualization
Wave Function Collapse initializes the output bitmap in a completely unobserved state, where each pixel value is in a superposition of the colors of the input bitmap (so if the input was black and white, the unobserved states are shown in different shades of grey). The coefficients in these superpositions are real numbers, not complex numbers, so it does not do actual quantum mechanics, but it was inspired by QM. Here, we match tiles to one another pixel by pixel, naming each matching edge a "socket". Since the tiles arrive in a random order in code, we rotate them into a specific order so that each socket matches its counterpart, which lets the overlapping of tiles be read as a superposition of several eigenstates. The algorithm was first introduced in 2016 by Maxim Gumin and can generate procedural patterns from a sample image or from a collection of tiles. Our aim is to visualize it in a mathematical way.
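For orientation, a minimal sketch of the observation step of Wave Function Collapse: pick the lowest-entropy (fewest-options) cell and collapse it. The grid representation and function names are illustrative assumptions, and constraint propagation through the socket relation is left to the caller:

```python
import random

def collapse_lowest_entropy(grid):
    """One observation step of Wave Function Collapse. `grid` maps a cell
    position (x, y) to the set of tile ids still possible there; the cell with
    the fewest remaining options (lowest 'entropy') is collapsed to a single
    random option. Propagation, which prunes neighbours via the socket
    compatibility relation described above, is omitted for brevity."""
    open_cells = [(len(opts), pos) for pos, opts in grid.items() if len(opts) > 1]
    if not open_cells:
        return None                           # fully observed
    _, pos = min(open_cells)                  # lowest-entropy cell
    grid[pos] = {random.choice(sorted(grid[pos]))}
    return pos                                # caller propagates constraints from here

# Example: a 2x2 grid where every cell can still be tile 0, 1, or 2.
grid = {(x, y): {0, 1, 2} for x in range(2) for y in range(2)}
collapsed = collapse_lowest_entropy(grid)
```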
[114] vixra:2207.0056 [pdf]
Designing Potential Drugs That Can Target Sars-COV-2’s Main Protease: A Proactive Deep Transfer Learning Approach Using LSTM Architecture
Drug discovery is a crucial step in the process of delivering a new drug to the market, a process that can take up to 2-3 years, which is all the more penalizing given the current global pandemic caused by the outbreak of the novel coronavirus SARS-CoV-2. Artificial Intelligence methodologies have shown great potential in resolving tasks in various domains such as image classification and sound recognition, and over the previous years Artificial Intelligence has also proved to be the go-to for generative tasks in use cases such as music sequences and text generation, as well as for solving problems in biology. The goal of this work is to harness the power of these architectures, using a generative recurrent neural network with long short-term memory (LSTM) gating, to generate new, previously unknown molecules that can bind to the main COVID-19 protease, which is a key agent in the transcription and replication of the virus, and thus can act as a potential drug that can neutralize the virus inside an infected host. As of today, there are no specific targeted therapeutic agents to treat the disease, and all existing treatments are very limited. Known drugs passing clinical trials, such as Hydroxychloroquine and Remdesivir, showed binding energies with SARS-CoV-2's main protease of -5.3 and -6.5 respectively, while the newly generated molecules exhibited scores reaching -13.2.
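A character-level LSTM over SMILES strings is the usual architecture for this kind of generative model; the PyTorch sketch below is an assumption-based outline (vocabulary, layer sizes, and the paper's transfer-learning stages are not shown):

```python
import torch
import torch.nn as nn

class SmilesLSTM(nn.Module):
    """Character-level LSTM for generating SMILES strings. Layer sizes and the
    sampling scheme are illustrative assumptions, not the paper's settings."""
    def __init__(self, vocab_size, embed=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.lstm = nn.LSTM(embed, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, state=None):    # tokens: (batch, seq_len)
        h, state = self.lstm(self.embed(tokens), state)
        return self.head(h), state            # next-character logits

    @torch.no_grad()
    def sample(self, start_token, max_len=100, temperature=1.0):
        """Autoregressively sample token ids; decode to SMILES characters."""
        tok = torch.tensor([[start_token]])
        out, state = [], None
        for _ in range(max_len):
            logits, state = self(tok, state)
            probs = torch.softmax(logits[:, -1] / temperature, dim=-1)
            tok = torch.multinomial(probs, 1)
            out.append(tok.item())
        return out

candidate = SmilesLSTM(vocab_size=40).sample(start_token=0)
```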
[115] vixra:2206.0142 [pdf]
FASFA: A Novel Next-Generation Backpropagation Optimizer
This paper introduces the fast adaptive stochastic function accelerator (FASFA) for gradient-based optimization of stochastic objective functions. It works based on Nesterov-enhanced first and second momentum estimates. The method is simple and effective during implementation because it has intuitive and familiar hyperparameterization. The training dynamics can be progressive or conservative depending on the decay rate sum. It works well with a low learning rate and mini-batch size. Experiments and statistics showed convincing evidence that FASFA could be an ideal candidate for optimizing stochastic objective functions, particularly those generated by multilayer perceptrons with convolution and dropout layers. In addition, the convergence properties and regret bound provide results aligning with the online convex optimization framework. As a first of its kind, FASFA addresses the growing need for diverse optimizers by providing next-generation training dynamics for artificial intelligence algorithms. Future experiments could modify FASFA based on the infinity norm.
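The exact FASFA update is not reproduced in the abstract; as a hedged illustration of "Nesterov-enhanced first and second momentum estimates", the sketch below applies a NAdam-style look-ahead to an Adam-like update (a reconstruction under stated assumptions, not the published rule):

```python
import numpy as np

def fasfa_like_step(theta, grad, m, v, t, lr=1e-3,
                    beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of an Adam-style update with a Nesterov look-ahead on the
    first moment. This is an assumption-based reconstruction, not FASFA's
    published update rule."""
    m = beta1 * m + (1 - beta1) * grad        # first moment estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second moment estimate
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    # Nesterov look-ahead: mix the corrected moment with the raw gradient.
    m_nes = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1**t)
    theta = theta - lr * m_nes / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.ones(3), np.zeros(3), np.zeros(3)
theta, m, v = fasfa_like_step(theta, grad=2 * theta, m=m, v=v, t=1)
```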
[116] vixra:2203.0004 [pdf]
Literature Review of Recent Advancements in Hypergraph Learning as it Relates to Optimizer
Hypergraphs are a generalization of a graph in which an edge can join any number of vertices; in an ordinary graph, an edge connects exactly two vertices. The applications of hypergraphs range from analogical explanations, such as social networks, to hard generalities in the case of collaborative game theory, where they are known as simple games. More abstract applications include localized and global optimization of radial functions in computational geometry, and the optimizers generated can also be used to solve linear scheduling problems. The theoretical approaches developed under these categories can be used in embedding, clustering and classification, which can also be solved through the application of spectral hypergraph clustering.
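A toy illustration of the opening definition, representing a hypergraph as a dict of hyperedges to vertex sets (names are invented):

```python
# Unlike a graph edge, a hyperedge may join any number of vertices.
H = {
    "e1": {"alice", "bob"},                 # an ordinary pairwise edge
    "e2": {"alice", "bob", "carol"},        # a 3-way hyperedge
    "e3": {"bob", "carol", "dave", "eve"},  # a 4-way hyperedge
}
degree = {v: sum(v in e for e in H.values())
          for v in set().union(*H.values())}   # vertex degrees
```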
[117] vixra:2202.0162 [pdf]
Hypergraph Deployment with Self-abrasive Deep Neural Networks and CSGANS
The objective of the study is to develop a definitive meta-analysis of recent developments in the application of hypergraph theory to deep learning and, more widely, to machine learning. The applications of this particular technique range from simple classification tuning to more advanced abstract GANs in the field of regenerative graphical systems and computer vision in general. In our experiments, we use a novel random walk procedure and show that our model achieves and, in most cases, surpasses state-of-the-art performance on benchmark data sets. Additionally, we display our classification performance compared to traditional statistical techniques, ML algorithms, and both classical and new deep learning algorithms.
[118] vixra:2202.0106 [pdf]
Bayesian Network and Information Theory
In this paper, we will expose the BIC score expressed as a function of the Bayesian network's entropy. We will then use this BIC score to learn a Bayesian network from an example data frame.
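One standard way to make the stated relationship concrete (a well-known identity; the paper's own derivation may differ): for a discrete Bayesian network G whose parameters are fit by maximum likelihood on N samples, the maximized log-likelihood equals -N times the network's empirical conditional entropy, so

```latex
\mathrm{BIC}(G) \;=\; -N \sum_{i} H\!\left(X_i \mid \mathrm{Pa}_G(X_i)\right) \;-\; \frac{d_G}{2}\,\log N
```

where Pa_G(X_i) denotes the parents of X_i in G and d_G is the number of free parameters.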
[119] vixra:2202.0079 [pdf]
Improving Multi Expression Programming: an Ascending Trail from Sea-level Even-3-parity Problem to Alpine Even-18-Parity Problem
Multi Expression Programming is a Genetic Programming variant that uses a linear representation of individuals. A unique feature of Multi Expression Programming is its ability to store multiple solutions to a problem in a single chromosome. In this paper, we propose and use several techniques for improving the search performed by Multi Expression Programming. Some of the most important improvements are Automatically Defined Functions and Sub-Symbolic node representation. Several experiments with Multi Expression Programming are performed in this paper. Numerical results show that Multi Expression Programming performs very well for the considered test problems.
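A minimal sketch of the multiple-solutions-per-chromosome idea: a linear chromosome whose genes are either terminals or operators referencing earlier genes, so every gene encodes one candidate expression and the chromosome's fitness is that of its best gene (gene encoding invented for illustration):

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(chromosome, inputs):
    values = []
    for gene in chromosome:            # e.g. "x" or ("+", 0, 1)
        if isinstance(gene, str):
            values.append(inputs[gene])            # terminal gene
        else:
            op, i, j = gene                        # refers to earlier genes
            values.append(OPS[op](values[i], values[j]))
    return values                      # one value per encoded expression

# fitness(chromosome) = best error over its genes against the target
vals = evaluate(["x", "y", ("+", 0, 1), ("*", 2, 2)], {"x": 2.0, "y": 3.0})
print(vals)  # [2.0, 3.0, 5.0, 25.0]: four expressions in one chromosome
```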
[120] vixra:2201.0188 [pdf]
Preliminary Concept of General Intelligent Network (Gin) for Brain-Like Intelligence
A preliminary concept of AGI for brain-like intelligence is presented in this paper. The solution has two main aspects: firstly, we combine information entropy and a generative network (GAN-like) model to propose a paradigm of General Intelligent Network (GIN). In the GIN network, the original multimodal information can be encoded as low-information-entropy hidden state representations (HPPs), which can be reverse-parsed by the contextually relevant generative network into observable information. Secondly, we propose a generalized machine learning operating system (GML system), which includes an observable processor (AOP), an HPP storage system, and a multimodal implicit sensing/execution network. Our code will be released at https://github.com/ggsonic/GIN
[121] vixra:2201.0094 [pdf]
Cardiovascular Disease Diagnosis using Deep Neural Networks
Cardiovascular disease causes 25% of deaths in America (Heart Disease Facts). Specifically, misdiagnosis of cardiovascular disease results in 11,000 American deaths annually, emphasizing the increasing need for Artificial Intelligence to improve diagnosis. The goal of our research was to determine the probability that a given patient has cardiovascular disease using 11 easily accessible objective, examination, and subjective features from a data set of 70,000 people. To do this, we compared various Machine Learning and Deep Learning models. Exploratory Data Analysis (EDA) identified that blood pressure, cholesterol, and age were most correlated with an elevated risk of contracting heart disease. Principal Component Analysis (PCA) was employed to visualize the 11-D data on a 2-D plane, and distinct aggregations in the data motivated the inference of specific cardiovascular conditions beyond the binary labels in the data set. To diagnose patients, several Machine Learning and Deep Learning models were trained on the data and compared using the metrics Binary Accuracy and F1 Score. The initial Deep Learning model was a shallow neural network with 1 hidden layer consisting of 8 hidden units. Further improvements, such as adding 5 hidden layers with 8 hidden units each and employing Mini-Batch Gradient Descent, Adam Optimization, and He initialization, were successful in decreasing training times. These models were coded without the use of Deep Learning frameworks such as TensorFlow. The final model, which achieved a Binary Accuracy of 74.2% and an F1 Score of 0.73, consisted of 6 hidden layers, each with 128 hidden units, and was built using the highly optimized Keras library. While current industrial models require hundreds of comprehensive features, this final model requires only basic inputs, allowing versatile applications in rural locations and third-world countries. Furthermore, the model can forecast demand for medical equipment, improve diagnosis procedures, and provide detailed personalized health statistics.
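A sketch of the final architecture described (6 hidden layers of 128 units over 11 input features, binary output), assuming ReLU activations, which the abstract does not specify:

```python
import tensorflow as tf

layers = [tf.keras.layers.Dense(128, activation="relu", input_shape=(11,))]
layers += [tf.keras.layers.Dense(128, activation="relu") for _ in range(5)]
layers += [tf.keras.layers.Dense(1, activation="sigmoid")]  # P(disease)
model = tf.keras.Sequential(layers)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["binary_accuracy"])
```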
[122] vixra:2112.0155 [pdf]
Comparison of Various Models for Stock Prediction
Owing to the high volatility during the COVID-19 pandemic, interest in stock investment has been intense, and attention is said to be shifting from the cryptocurrency market back to the domestic stock market. In this situation, we examined which model could more accurately predict the closing price.
[123] vixra:2112.0135 [pdf]
Directed Dependency Graph Obtained from a Correlation Matrix by the Highest Successive Conditionings Method
In this paper we propose a directed dependency graph obtained from a correlation matrix. The graph includes a probabilistic causal sub-model for each node, modeled by conditioning percentages. The directed dependency graph is obtained using the highest successive conditionings method, with a conditioning percentage threshold to be exceeded.
[124] vixra:2112.0130 [pdf]
The SP Challenge: that the SP System is More Promising as a Foundation for the Development of Human-Level Broad AI Than Any Alternative
The "SP Challenge" is the deliberately provocative theme of this paper: that the "SP System" (SPS), meaning the "SP Theory of Intelligence" and its realisation in the "SP Computer Model", is more promising as a foundation for the development of human-level broad AI, aka 'artificial general intelligence' (AGI), than any alternative. In that connection, the main strengths of the SPS are: 1) The adoption of a top-down, breadth-first research strategy with wide scope; 2) Recognition of the importance of information compression (IC) in human learning, perception, and cognition -- and, correspondingly, a central role for IC in the SPS; 3) The working hypothesis that all kinds of IC may be understood in terms of the matching and unification of patterns (ICMUP); 4) A resolution of the apparent paradox that IC may achieve decompression as well as compression. 5) The powerful concept of SP-multiple-alignment, a generalisation of six other variants of ICMUP; 6) the clear potential of the SPS to solve 19 problems in AI research; 7) Strengths and potential of the SPS in modelling several aspects of intelligence, including several kinds of probabilistic reasoning, versatility in the representation and processing of AI-related knowledge, and the seamless integration of diverse aspects of intelligence, and diverse kinds of knowledge, in any combination; 8) Several other potential benefits and applications of the SPS; 9) In "SP-Neural", abstract concepts in the SPS may be mapped into putative structures expressed in terms of neurons and their interconnections and intercommunications; 10) The concept of ICMUP provides an entirely novel perspective on the foundations of mathematics; 11) How to make generalisations from data, including the correction of over- and under-generalisations, and how to reduce or eliminate errors in data. There is discussion of how the SPS compares with some other potential candidates for the SP-Challenge. And there is an outline of possible future directions for the research.
[125] vixra:2112.0126 [pdf]
PCARST: A Method of Weakening Conflict Evidence Based on Principal Component Analysis and Relatively Similar Transformation
How to deal with conflict is a significant issue in Dempster-Shafer evidence theory (DST). In the Dempster combination rule, conflicts produce counter-intuitive phenomena. Therefore, many effective conflict handling methods have been presented. This paper proposes a new framework for reducing conflict based on principal component analysis and relatively similar transformation (PCARST), which can better reduce the impact of conflicting evidence on the results, and yields more reasonable results than existing methods. The main characteristic features of the BPAs are maintained while the conflicting evidence is regarded as a noise signal to be weakened. A numerical example is used to illustrate the effectiveness of the proposed method. Results show that a higher belief degree for the correct proposition is obtained compared with previous methods.
[126] vixra:2112.0097 [pdf]
Phish: A Novel Hyper-Optimizable Activation Function
Deep-learning models fit their parameters using backpropagation. The activation function within hidden layers is a critical component in minimizing loss in deep neural networks. Rectified Linear (ReLU) has been the dominant activation function for the past decade. Swish and Mish are newer activation functions that have been shown to yield better results than ReLU in specific circumstances. Phish is a novel activation function proposed here. It is a composite function defined as f(x) = xTanH(GELU(x)), where no discontinuities are apparent in the differentiated graph on the domain observed. Generalized networks were constructed using different activation functions, with SoftMax as the output function. Using images from the MNIST and CIFAR-10 databanks, these networks were trained to minimize sparse categorical crossentropy. A large-scale cross-validation was simulated using stochastic Markov chains to account for the law of large numbers for the probability values. Statistical tests support the research hypothesis that Phish could outperform other activation functions in classification. Future experiments would involve testing Phish in unsupervised learning algorithms and comparing it to more activation functions.
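Phish translates directly from its definition, f(x) = x·tanh(GELU(x)); a PyTorch rendering for illustration:

```python
import torch
import torch.nn.functional as F

class Phish(torch.nn.Module):
    def forward(self, x):
        # f(x) = x * tanh(GELU(x)), exactly as defined above.
        return x * torch.tanh(F.gelu(x))

x = torch.linspace(-3.0, 3.0, 7)
print(Phish()(x))   # smooth everywhere on the observed domain
```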
[127] vixra:2112.0095 [pdf]
Triplere: Knowledge Graph Embeddings Via Triple Relation Vectors
Knowledge representation is a classic problem in knowledge graphs. Distance-based models have made great progress. The most significant recent developments in this direction are Rotate[1] and PairRE[2], which focus on expressing relationships as projections of nodes, whereas the TransX series of models (TransE[3], TransH[4], TransR[5]) expresses relationships as translations of nodes. To date, the problem of combining projection and translation has received scant attention in the research literature. Hence, we propose TripleRE, a method that models relationships by both projections and translations. Compared with other knowledge representation models, we achieve the best results on the ogbl-wikikg2 dataset.
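A sketch of a distance-based score combining PairRE-style projection with a TransE-style translation, as the abstract describes; the paper defines the exact form of its triple relation vectors, so the names r_head, r_mid, r_tail below are illustrative:

```python
import numpy as np

def triple_re_score(h, t, r_head, r_mid, r_tail):
    # Project head and tail by their relation vectors, then translate;
    # higher (less negative) scores indicate more plausible triples.
    return -np.linalg.norm(h * r_head - t * r_tail + r_mid, ord=1)
```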
[128] vixra:2111.0172 [pdf]
New Evolutionary Computation Models and their Applications to Machine Learning
Automatic Programming is one of the most important areas of computer science research today. Hardware speed and capability have increased exponentially, but the software is years behind. The demand for software has also increased significantly, but it is still written in the old fashion: by humans. There are multiple problems when the work is done by humans: cost, time, quality. It is costly to pay humans, it is hard to keep them satisfied for a long time, it takes a lot of time to teach and train them, and the quality of their output is in most cases low (in software, mostly due to bugs). The real advances in human civilization appeared during the industrial revolutions. Before the first revolution, most people worked in agriculture. Today, a very small percentage of people work in this field. A similar revolution must appear in the computer programming field; otherwise, we will have as many people working in this field as we had in the past working in agriculture. How do people know how to write computer programs? Very simple: by learning. Can we do the same for software? Can we make software learn how to write software? It seems that this is possible (to some degree), and the term is called Machine Learning. It was first coined in 1959 by the first person who made a computer perform a serious learning task, namely Arthur Samuel. However, things are not as easy as with humans (truth be told, for some humans it is impossible to learn how to write software). So far we do not have software that can learn perfectly how to write software. We have particular cases where programs do better than humans, but the examples are sporadic at best. Learning from experience is difficult for computer programs. Instead of trying to simulate how humans teach humans to write computer programs, we can simulate nature.
[129] vixra:2111.0170 [pdf]
Existence and Perception as the Basis of AGI (Artificial General Intelligence)
I believe that AGI (Artificial General Intelligence), unlike current AI models, must operate with meanings/knowledge. This is exactly what distinguishes it from neural-network-based AI. Successful AI implementations (playing chess, self-driving, face recognition, etc.) in no way operate with knowledge about the objects being processed and do not recognize their meanings or cognitive structure. This is not necessary for them; they demonstrate good results based on pre-training. But for AGI, which imitates human thinking, the ability to operate with knowledge is crucial. Numerous attempts to define the concept of "meaning" share one very significant drawback: none of these definitions is rigorous and formalized, and therefore they cannot be programmed. The procedure of searching for meaning/knowledge should use a formalized determination of its existence and of the possible forms of its perception, which is usually multimodal. For the practical implementation of AGI, it is necessary to develop such "ready-to-code" formalized definitions of the cognitive concepts of "meaning", "knowledge", "intelligence" and others related to them. This article attempts to formalize the definitions of such concepts.
[130] vixra:2111.0169 [pdf]
Evolving Evolutionary Algorithms using Multi Expression Programming
Finding the optimal parameter setting (i.e. the optimal population size, the optimal mutation probability, the optimal evolutionary model, etc.) for an Evolutionary Algorithm (EA) is a difficult task. Instead of evolving only the parameters of the algorithm, we evolve an entire EA capable of solving a particular problem. For this purpose, the Multi Expression Programming (MEP) technique is used, with each MEP chromosome encoding multiple EAs. A nongenerational EA for function optimization is evolved in this paper. Numerical experiments show the effectiveness of this approach.
[131] vixra:2111.0161 [pdf]
ANN Synthesis and Optimization of Electronically Scanned Coupled Planar Periodic and Aperiodic Antenna Arrays Modeled by the MoM-GEC Approach
This paper proposes a new formulation that relies on the moment technique combined with the equivalent circuit (MoM-GEC) to study a beamforming application for coupled periodic and quasi-periodic planar antenna arrays. Numerous voltage designs are used to show the adequacy and reliability of the proposed approach. The radiators are modeled as planar dipoles, and mutual coupling effects are therefore considered. The recommended array shows a noticeable improvement over existing structures in terms of size, 3-D scanning, directivity, SLL reduction, and HPBW. The results verify that multilayer feed-forward neural networks are robust and can handle complex antenna problems. Moreover, an artificial neural network (ANN) can quickly produce the results of optimization and synthesis by generalizing with an early stopping method. Significant gains in running time and memory usage are obtained by employing this last technique for improving generalization (named early stopping). Simulations are carried out using MATLAB, and several simulation examples are shown to validate this work.
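The abstract credits early stopping for the gains in running time and generalization; as a generic illustration of the technique (in Keras rather than the paper's MATLAB setup, with illustrative names):

```python
import tensorflow as tf

# Stop training once validation loss stops improving, and roll back to
# the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)

# model.fit(X_train, y_train, validation_split=0.2,
#           epochs=500, callbacks=[early_stop])
```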
[132] vixra:2111.0069 [pdf]
A Modified Belief Functions Distance Measure for Orderable Set
This paper proposes a new method of measuring the distance between conflicting ordered sets, quantifying the similarity between focal elements and their sizes. The method can effectively measure the conflict of belief functions on an ordered set without saturating when focal elements do not overlap. It is proven that the method satisfies the properties of a distance. Examples from engineering budgets and sensors show that the distance can effectively measure the conflict between ordered sets; comparison with existing methods shows that the proposed distance reflects the information of ordered sets more comprehensively, and that the resulting conflict metric between ordered sets is more robust and accurate.
[133] vixra:2111.0065 [pdf]
Robotic Autonomy: A Survey
Robotic autonomy is key to the expansion of robotic applications. The paper reviews the success of robotic autonomy in industrial applications, as well as the requirements and challenges of expanding robotic autonomy to in-need applications such as education, medical service, and home service. Through the discussion, the paper draws the conclusion that robotic intelligence is the bottleneck for the broad application of robotic technology.
[134] vixra:2111.0015 [pdf]
A New Algorithm based on Extent Bit-array for Computing Formal Concepts
The emergence of Formal Concept Analysis (FCA) as a data analysis technique has increased the need for algorithms that can compute formal concepts quickly. The current efficient algorithms for FCA are variants of the Close-By-One (CbO) algorithm, such as In-Close2, In-Close3 and In-Close4, which are all based on horizontal storage of contexts. In this paper, building on In-Close4, a new algorithm based on vertical storage of contexts, called In-Close5, is proposed, which can significantly reduce both the time complexity and the space complexity of In-Close4. Technically, the new algorithm stores both the context and the extent of a concept as vertical bit-arrays, while the In-Close4 algorithm stores the context only as a horizontal bit-array, which is very slow at finding the intersection of two extent sets. Experimental results demonstrate that the proposed algorithm is much more effective than In-Close4, and it also has a broader scope of applicability, solving problems in computing formal concepts that cannot be solved by In-Close4.
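A sketch of the vertical-storage idea: store each attribute's extent as a bit-array (here, a Python int used as a bitset), so intersecting two extents is a single bitwise AND instead of a scan:

```python
def make_extent(column):            # column: iterable of 0/1 per object
    bits = 0
    for i, has_attr in enumerate(column):
        if has_attr:
            bits |= 1 << i
    return bits

a = make_extent([1, 1, 0, 1, 0])    # objects owning attribute a
b = make_extent([1, 0, 0, 1, 1])    # objects owning attribute b
common = a & b                      # extent of the concept with intent {a, b}
print(bin(common))                  # 0b1001 -> objects 0 and 3
```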
[135] vixra:2111.0014 [pdf]
Granule Description based on Compound Concepts
Concise granule descriptions for definable granules and approaching descriptions for indefinable granules are challenging and important issues in granular computing. The concept with only common attributes has been intensively studied. To investigate granules with special needs, we propose a novel type of compound concept in this paper, the common-and-necessary concept. Based on the definitions of concept-forming operations, logical formulas are derived for each of the following types of concepts: formal concept, object-induced three-way concept, object-oriented concept and common-and-necessary concept. Furthermore, by utilizing the logical relationships among the various concepts, we derive concise and unified equivalent conditions for definable granules and approaching descriptions for indefinable granules for all four kinds of concepts.
[136] vixra:2110.0138 [pdf]
Enhancing the Weakening of the Conflict Evidence Using Similarity Matrix and Dispersion of Similarities in Dempster-Shafer Evidence Theory
The classic Dempster combination rule may produce illogical results when combining highly conflicting evidence; how to deal with such evidence and obtain a reasonable result is critical. Modifying the evidence according to the importance of each piece of evidence (e.g. via a similarity matrix) is one significant strategy. However, the dispersion of evidence similarity is rarely taken into consideration, even though it is also an important feature for distinguishing conflicting evidence from normal evidence. In this paper, a new method based on the similarity matrix and the dispersion of evidence similarity is proposed to evaluate the importance of evidence in Dempster-Shafer theory (DST). The proposed method enhances the weakening of the influence of conflicting evidence. Its robustness is verified through sensitivity analysis of changes in the degree of conflict and in the amount of credible evidence. Some numerical examples show the effectiveness of the proposed method.
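For reference, the unmodified Dempster combination of two mass functions over a frame of discernment, with focal elements as frozensets; with highly conflicting evidence, the normalisation by 1-K drives the counter-intuitive results the abstract refers to (Zadeh's classic example):

```python
from itertools import product

def dempster(m1, m2):
    combined, K = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            combined[C] = combined.get(C, 0.0) + a * b
        else:
            K += a * b                    # mass assigned to conflict
    return {A: v / (1 - K) for A, v in combined.items()}

m1 = {frozenset("a"): 0.9, frozenset("b"): 0.1}
m2 = {frozenset("b"): 0.1, frozenset("c"): 0.9}
print(dempster(m1, m2))   # all mass collapses onto {b}, counter-intuitively
```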
[137] vixra:2110.0085 [pdf]
AniVid: A Novel Anime Video Dataset with Applications in Animation
Automating steps of the animation production process using AI-based tools would ease the workload of Japanese animators. Although there have been recent advances in the automatic animation of still images, the majority of these models have been trained on human data and are thus tailored to images of humans. In this work, I propose a semi-automatic and scalable assembly pipeline to create a large-scale dataset containing clips of anime characters' faces. Using this assembly strategy, I create AniVid, a novel anime video dataset consisting of 34,221 video clips. I then use a transfer learning approach to train a first order motion model (FOMM) on a portion of AniVid, which effectively animates still images of anime characters. Extensive experiments and quantitative results show that FOMM trained on AniVid outperforms other trained versions of FOMM when evaluated on my test set of anime videos.
[138] vixra:2110.0055 [pdf]
Benchmarking of Lightweight Deep Learning Architectures for Skin Cancer Classification using ISIC 2017 Dataset
Skin cancer is one of the deadliest and most common types of cancer in the world, and there has recently been a large jump in the rate of people developing it. For this reason, the number of studies on skin cancer classification with deep learning is increasing day by day. To foster work in this area, the International Skin Imaging Collaboration (ISIC) organization was established and created an open dataset archive. In this study, images were taken from the ISIC 2017 Challenge. The skin cancer images were preprocessed and augmented, then trained with a transfer learning and fine-tuning approach to create deep learning models. Three different mobile deep learning models and three different batch size values for each were determined, for a total of 9 models. Among these, the NASNetMobile model with batch size 16 achieved the best result: an accuracy of 82.00%, a precision of 81.77% and an F1 score of 0.8038. Our method is to benchmark mobile deep learning models, which have few parameters, and compare their results.
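A sketch of the transfer-learning-plus-fine-tuning setup described, assuming ImageNet weights and a binary (benign/malignant) head; the input size and head design are illustrative:

```python
import tensorflow as tf

base = tf.keras.applications.NASNetMobile(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                    # stage 1: transfer learning
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# Stage 2 (fine-tuning): unfreeze the base and retrain with a small LR,
# e.g. base.trainable = True; recompile with Adam(1e-5).
```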
[139] vixra:2110.0036 [pdf]
Directed Dependency Graph Obtained from a Continuous Data Matrix by the Highest Successive Conditionings Method.
In this paper, we propose a directed dependency graph learned from a continuous data matrix in order to extract the hidden oriented dependencies from this matrix. To each of the dependency graph's nodes we assign a random variable, as well as a conditioning percentage linking parent and child nodes of the graph. Among all the dependency graphs learned from the continuous data matrix, we choose the one given by the highest successive conditionings method.
[140] vixra:2109.0124 [pdf]
A Proposed Solution to Problems in Learning the Knowledge Needed by Self-Driving Vehicles
Three problems in learning knowledge for self-driving vehicles are: how a finite sample of information about driving, N, can yield an ability to deal with the infinity of possible driving situations; the problem of generalising from N without over- or under-generalisation; and how to weed out errors in N. A theory developed with computer models to explain a child’s learning of his or her first language, now incorporated in the SP System, suggests: compress N as much as possible by a process that creates a grammar, G, and an encoding of N in terms of G called E. Then discard E which contains all or most of the errors in N, and retain G which solves the first two problems.
[141] vixra:2108.0095 [pdf]
A New Interpolation Approach and Corresponding Instance-Based Learning
Starting from the problem of finding approximate values of a function, this paper introduces a measure of the approximation degree between two numerical values, proposes the concepts of "strict approximation" and "strict approximation region", derives the corresponding one-dimensional interpolation methods and formulas, and then presents a calculation model called the "sum-times-difference formula" for high-dimensional interpolation, thus developing a new interpolation approach: ADB interpolation. ADB interpolation has been applied to the interpolation of actual functions with satisfactory results. Viewed from principle and effect, the approach is novel in conception and has the advantages of simple calculation, stable accuracy and ease of parallel processing; it is very well suited to high-dimensional interpolation and is easily extended to the interpolation of vector-valued functions. Applying the approach to instance-based learning yields a new instance-based learning method: learning using ADB interpolation. The method is technically distinctive and has the advantages of a definite mathematical basis, implicit distance weights, avoidance of misclassification, high efficiency, and a wide range of applications, as well as being interpretable. In principle, it is a kind of learning by analogy, which can complement deep learning (a form of inductive learning); for some problems the two can even achieve "different approaches but equal results" in big data and cloud computing environments. Thus, learning using ADB interpolation can also be regarded as a kind of "wide learning" that is dual to deep learning.
[142] vixra:2108.0029 [pdf]
Information Theory Applied to Bayesian Network for Learning Continuous Data Matrix
In this paper, we propose a learning algorithm for a continuous data matrix based on entropy absorption in a Bayesian network. The method consists in sacrificing a small amount of likelihood, compared with a chain rule's best likelihood, in order to get a good picture of the strongest conditionings taking place between the Bayesian network's nodes. We present the known results from information theory, the multidimensional Gaussian probability, and the AIC and BIC scores for learning a continuous data matrix with a Bayesian network, and we illustrate the entropy absorption algorithm, which uses the Kullback-Leibler divergence, with an example continuous data matrix.
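For the Gaussian setting the abstract invokes, the Kullback-Leibler divergence between two multivariate Gaussians has the standard closed form, sketched here (the paper's own use of it may differ in detail):

```python
import numpy as np

def kl_gauss(mu0, S0, mu1, S1):
    # KL( N(mu0, S0) || N(mu1, S1) ) for k-dimensional Gaussians.
    k = len(mu0)
    S1_inv = np.linalg.inv(S1)
    d = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0) + d @ S1_inv @ d - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))
```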
[143] vixra:2107.0124 [pdf]
Breaking Free from the Stability-Plasticity Dilemma with Incremental Domain Inference on Sequential Data
We make the case for identifying the input domain prior to running downstream models and propose an architecture that opens the door to lifelong learning systems that forget at a decreasing rate as the tasks grow in complexity. Our model accurately identifies domains and is compatible with other continual learning algorithms, provided they benefit from knowing the current domain beforehand.
[144] vixra:2107.0122 [pdf]
Open Science with Respect to Artificial Intelligence
Artificial Intelligence is one of those fields in computer science that is currently being extensively studied. In this paper, the author attempts to summarise the current state of research in the field with respect to openness to the general community, and finds a profound lack of opportunity for novices to contribute to the field, with effective research nearly monopolised by large industries, while production environments continue to remain largely insulated from such influences.
[145] vixra:2106.0040 [pdf]
Vudoku - A Visual Sudoku Solver
It is no secret that AI is an upcoming titan. Even though people are stunned to hear that AI has been around for nearly a century, thanks to advances in computational methods and resources, AI today peaks like never before. As a small glimpse into the field of digit recognition, this project aims to understand the underlying cogs and wheels on which neural networks spin. This paper elucidates a project that solves Sudoku puzzles drawn and written by hand. The paraphernalia for the project includes the programming language Python 3; the libraries OpenCV, NumPy and Keras; and the MNIST handwritten digit database. Digit recognition is a classical problem which introduces neurons, neural networks, connections, hidden layers, weights, biases, activation functions such as the sigmoid, back-propagation and other related topics. The algorithm employed in the project to solve Sudoku is also explored in this paper.
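The paper does not publish its solver verbatim; the classic backtracking search commonly used for Sudoku looks like this:

```python
def ok(board, r, c, d):
    # Digit d may go at (r, c) if its row, column and 3x3 box are free of d.
    if d in board[r] or d in (board[i][c] for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(board[i][j] != d
               for i in range(br, br + 3) for j in range(bc, bc + 3))

def solve(board):                     # board: 9x9 list of lists, 0 = empty
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                for d in range(1, 10):
                    if ok(board, r, c, d):
                        board[r][c] = d
                        if solve(board):
                            return True
                        board[r][c] = 0   # dead end: backtrack
                return False
    return True                       # no empty cell left: solved
```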
[146] vixra:2105.0176 [pdf]
Gesture Classification using Machine Learning with Advanced Boosting Methods
In this paper, a detailed study of gesture classification using a dataset from Kaggle, together with optimization of the dataset, is presented. The machine learning algorithms used to conduct the research are the SGD, kNN, SVM, MLP, Gaussian Naive Bayes, Random Forest, LightGBM, XGBoost, and CatBoost classifiers. The results are compared with each other to conclude which models perform best in gesture classification. Except for the Gaussian Naive Bayes classifier, all methods resulted in high accuracy.
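A sketch of the comparison loop the abstract describes, using the scikit-learn members of the listed model zoo on a toy dataset (the actual Kaggle data loading is omitted):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

models = {
    "SGD": SGDClassifier(), "kNN": KNeighborsClassifier(), "SVM": SVC(),
    "MLP": MLPClassifier(), "GNB": GaussianNB(),
    "RF": RandomForestClassifier(),
}
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
for name, clf in models.items():
    print(name, cross_val_score(clf, X, y, cv=5).mean().round(3))
```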
[147] vixra:2105.0095 [pdf]
Biochemistry Provides Inspiration for a New Kind of AI
This article is about the origin, development, and benefits of the "SP System" (SPS), which means the "SP Theory of Intelligence" and its realisation in the "SP Computer Model" (SPCM). The SPS is radically different from deep neural networks (DNNs), with many advantages compared with DNNs. As will be described, the SPS provides a promising foundation for the development of human-like broad AI. The SPS was inspired in part by: evidence for the importance of information compression in human learning, perception, and cognition; and the concept of 'multiple sequence alignment' in biochemistry. That latter concept led to the development of the powerful concept of SP-multiple-alignment, a concept which is largely responsible for the intelligence-related versatility of the SPS. The main advantages of the SPS are: 1) The clear potential of the SPS to solve 19 problems in AI research; 2) Versatility of the SPS in aspects of intelligence, including unsupervised learning, and several forms of reasoning; 3) Versatility of the SPS in the representation and processing of knowledge; 4) Seamless integration of diverse aspects of intelligence and diverse forms of knowledge, in any combination, a kind of integration that appears to be necessary in any artificial system that aspires to the fluidity and adaptability of the human mind; 5) Several other potential benefits and applications of the SPS. It is envisaged that the SPCM will provide the basis for the development of a first version of the SP Machine, with high levels of parallel processing and a user-friendly user interface. All software in the SP Machine would be open-source so that clones of the SP Machine may be created anywhere by individuals or groups, to facilitate further research and development of the SP System.
[148] vixra:2105.0033 [pdf]
Generalized Quantum Evidence Theory on Interference Effect
In this paper, CET is generalized to the quantum framework of Hilbert space in an open world, yielding generalized quantum evidence theory (GQET). Unlike classical GET, interference effects are involved in GQET. In particular, when a GQBBA reduces to a classical GBBA, the interference effects disappear, so that the GQB and GQP functions of GQET degenerate to the classical GBel and GPl functions of classical GET, respectively.
[149] vixra:2104.0145 [pdf]
On the Negation Intensity of a Basic Probability Assignment (Bpa)
How to obtain negation knowledge is a crucial topic, especially in the field of artificial intelligence. Only limited work has been done on the negation of a basic probability assignment (BPA); in particular, the intensity level of negation enforcement has not yet been investigated. Moreover, the main characteristic of intelligent systems is precisely the flexibility to represent knowledge according to each situation, and researchers generally express the need for cognitive range in negation. Thus, it would be very useful to find a wide range of negations under different intensity levels of a BPA. Based on these ideas, this paper first proposes a new approach to finding the negation of a BPA and gives a domain of intensities within which the negation is executed, called the negation space. Then, we investigate a number of desirable properties and explore their correlation with entropy. Numerical examples show the characteristics of the proposed negation solution. Finally, we validate the efficiency of the proposed method from the point of view of the Dempster-Shafer belief structure.
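For context, Yager's negation of a probability distribution, the baseline that BPA negations generalise: each mass is replaced by the normalised complement. The paper's intensity-controlled negation is its own contribution and is not reproduced here.

```python
def yager_negation(p):
    # Negate a probability distribution: p_i -> (1 - p_i) / (n - 1).
    n = len(p)
    return [(1 - pi) / (n - 1) for pi in p]

print(yager_negation([0.7, 0.2, 0.1]))   # -> [0.15, 0.4, 0.45]
```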
[150] vixra:2104.0111 [pdf]
A Novel Conflict Management Considering the Optimal Discounting Weights Using the BWM Method in Dempster-Shafer Evidence Theory
Dempster-Shafer evidence theory (DST) is an effective tool for data fusion. In this theory, how to handle conflicts between evidences is still a significant and open issue. In this paper, the best-worst method (BWM) is extended to conflict management in DST. Firstly, a way to determine the best and worst basic probability assignment (BPA) is proposed. Secondly, a novel strategy for determining the optimal weights of BPA using the BWM method is developed. Compared to traditional measure-based conflict management methods, the proposed method has three better performances: (1) A consistency ratio is considered for BPA to check the reliability of the comparisons, producing more reliable results. (2) The final fusion result has less uncertainty, which is more conducive to improve the performance of decision making. (3) The number of BPA comparisons performed during operation (in conflict management) is reduced (especially matrix-based). A practical application in motor rotor fault diagnosis is used to illustrate the effectiveness and practicability of the proposed methodology.
[151] vixra:2103.0194 [pdf]
Uwb-GCN: Accelerating Graph Convolutional Networks Through Runtime Workload Rebalancing
In this paper, we propose an architecture design called Ultra-Workload-Balanced-GCN (UWB-GCN) to accelerate graph convolutional network inference. To tackle the major performance bottleneck of workload imbalance, we propose two techniques: dynamic local sharing and dynamic remote switching, both of which rely on hardware flexibility to achieve performance auto-tuning with negligible area or delay overhead. Specifically, UWB-GCN is able to effectively profile the sparse graph pattern while continuously adjusting the workload distribution among parallel processing elements (PEs). After converging, the ideal configuration is reused for the remaining iterations. To the best of our knowledge, this is the first accelerator design targeted at GCNs and the first work that auto-tunes workload balance in an accelerator at runtime through hardware, rather than software, approaches. Our methods can achieve near-ideal workload balance in processing sparse matrices. Experimental results show that UWB-GCN can finish inference on the Nell graph (66K vertices, 266K edges) in 8.1 ms, corresponding to speedups of 199x, 16x, and 7.5x over the CPU, the GPU, and the baseline GCN design without workload rebalancing, respectively.
[152] vixra:2103.0185 [pdf]
Hierarchical Relationship Alignment Metric Learning
Most existing metric learning methods focus on learning a similarity or distance measure relying on similar and dissimilar relations between sample pairs. However, pairs of samples cannot be simply identified as similar or dissimilar in many real-world applications, e.g., multi-label learning or label distribution learning. To this end, the relation alignment metric learning (RAML) framework was proposed to handle metric learning in those scenarios. But RAML learns a linear metric, which cannot model complex datasets. Combining deep learning with the RAML framework, we propose a hierarchical relationship alignment metric learning model, HRAML, which uses the concept of relationship alignment to model metric learning problems under multiple learning tasks, and makes full use of the consistency between sample-pair relationships in the feature space and in the label space. Further, we organize several experiments divided by learning task, and verify the better performance of HRAML against many popular methods and the RAML framework.
[153] vixra:2103.0184 [pdf]
Representation Learning by Ranking Under Multiple Tasks
In recent years, representation learning has become the research focus of the machine learning community. Large-scale pre-trained neural networks have become the first step towards realizing general intelligence. The key to the success of neural networks lies in their abstract representation capabilities for data. Several learning fields are in effect discussing how to learn representations, yet a unified perspective is lacking. We convert the representation learning problem under multiple tasks into a ranking problem and, taking ranking as the unified perspective, solve representation learning under different tasks by optimizing the approximate NDCG loss. Experiments under different learning tasks, such as classification, retrieval, multi-label learning, regression and self-supervised learning, prove the superiority of the approximate NDCG loss. Further, under the self-supervised learning task, the training data is transformed by data augmentation to improve the performance of the approximate NDCG loss, which shows that this loss can make full use of the information in unsupervised training data.
[154] vixra:2103.0174 [pdf]
Explaining Representation by Mutual Information
Science is used to discover the laws of the world; machine learning can be used to discover the laws of data. In recent years, there has been more and more research on interpretability in the machine learning community. We hope that machine learning methods are safe and interpretable, and that they can help us find meaningful patterns in data. In this paper, we focus on the interpretability of deep representations. We propose an interpretable method of representation based on mutual information, which summarizes the interpretation of a representation into three types of information between the input data and the representation. We further propose the MI-LR module, which can be inserted into a model to estimate the amount of information and thereby explain the model's representation. Finally, we verify the method through visualization of the prototype network.
[155] vixra:2103.0148 [pdf]
New Ordinal Relative Fuzzy Entropy
In real life, a series of things are supposed to occur in some order. It is therefore necessary to regard sequence as a crucial factor in managing different kinds of things in a fuzzy environment. However, few studies have provided a reasonable solution to this demand, and how to measure the degree of uncertainty of ordinal fuzzy sets is still an open issue. To address this issue, a novel ordinal relative fuzzy entropy is proposed in this paper, taking the order of propositions into consideration when measuring uncertainty in a fuzzy environment. Compared with previously proposed entropies, the effects on fuzzy uncertainty brought by the sequence of sequential propositions are embodied in the values produced by the proposed measure. Moreover, some numerical examples are offered to verify the correctness and validity of the proposed entropy.
[156] vixra:2101.0168 [pdf]
Recent Trends in Named Entity Recognition (NER)
The availability of large amounts of computer-readable textual data, and of hardware that can process that data, has shifted the focus of knowledge projects towards deep learning architectures. Natural language processing, particularly the task of Named Entity Recognition (NER), is no exception. The bulk of the learning methods that have produced state-of-the-art results have changed the deep learning model, the training method, the training data itself, or the encoding of the output of the NER system. In this paper, we review significant learning methods that have been employed for NER in the recent past and how they emerged from the linear learning methods that preceded them. We also cover the progress of related tasks that are upstream or downstream of NER, e.g. sequence tagging and entity linking, wherever the processes in question have also improved NER results.
[157] vixra:2101.0115 [pdf]
CNN Based Common Approach to Handwritten Character Recognition of Multiple Scripts
There are many scripts in the world, several of which are used by hundreds of millions of people. Handwritten character recognition studies of several of these scripts are found in the literature, using a variety of hand-crafted feature sets. However, the convolutional neural network (CNN) has recently been used as an efficient unsupervised feature-vector extractor. Although such a network can serve as a unified framework for both feature extraction and classification, it is more efficient as a feature extractor than as a classifier. In the present study, we trained a 5-layer CNN on a moderately large-class character recognition problem, and used this CNN, trained for the larger-class problem, to extract features for several smaller-class recognition problems. In each case, a distinct Support Vector Machine (SVM) was used as the corresponding classifier. In particular, the CNN of the present study was trained on samples of a standard 50-class Bangla basic character database, and features were extracted for five different 10-class numeral recognition problems covering English, Devanagari, Bangla, Telugu and Oriya, each of which is an official Indian script. Recognition accuracies are comparable with the state of the art.
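A sketch of the pipeline described: a CNN trained on the 50-class task is reused as a fixed feature extractor, and a per-script SVM does the final 10-class classification (names and arrays are illustrative and assume a trained `cnn`):

```python
import tensorflow as tf
from sklearn.svm import SVC

# `cnn` is the trained 5-layer network; drop its classification head so
# the penultimate layer's output serves as the feature vector.
# extractor = tf.keras.Model(cnn.input, cnn.layers[-2].output)
# features = extractor.predict(numeral_images)     # CNN features per sample
# svm = SVC(kernel="rbf").fit(features, numeral_labels)
# preds = svm.predict(extractor.predict(test_images))
```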
[158] vixra:2012.0224 [pdf]
Quantum Algorithm of Dempster Combination Rule
The Dempster combination rule is widely used in many applications such as information fusion and decision making. However, its computational complexity increases exponentially with the size of the frame of discernment. To address this issue, we propose a quantum algorithm for the Dempster combination rule based on quantum theory. The algorithm not only realizes most of the functions of the Dempster combination rule, but would also effectively reduce its computational complexity on a future quantum computer. Meanwhile, we carried out a simulation experiment on IBM's quantum cloud platform, and the experimental results showed that the algorithm is reasonable.
[159] vixra:2012.0142 [pdf]
Predicting Year of Plantation with Hyperspectral and Lidar Data
This paper introduces a methodology for predicting the year of plantation (YOP) from remote sensing data. The application has important implications for forestry management and inventorying. We exploit hyperspectral and LiDAR data in combination with state-of-the-art machine learning classifiers. In particular, we present a complete processing chain to extract spectral, textural and morphological features from both sensory data sources. Features are then combined and fed to a Gaussian Process Classifier (GPC) trained to predict YOP in a forest area in North Carolina (US). The GPC algorithm provides accurate YOP estimates, reports spatially explicit maps and associated confidence maps, and provides sensible feature rankings.
[160] vixra:2012.0141 [pdf]
Passive Millimeter Wave Image Classification with Large Scale Gaussian Processes
Passive Millimeter Wave Images (PMMWIs) are being increasingly used to identify and localize objects concealed under clothing. Taking into account the quality of these images and the unknown position, shape, and size of the hidden objects, large data sets are required to build successful classification/detection systems. Kernel methods, in particular Gaussian Processes (GPs), are sound, flexible, and popular techniques to address supervised learning problems. Unfortunately, their computational cost is known to be prohibitive for large scale applications. In this work, we present a novel approach to PMMWI classification based on the use of Gaussian Processes for large data sets. The proposed methodology relies on linear approximations to kernel functions through random Fourier features. Model hyperparameters are learned within a variational Bayes inference scheme. Our proposal is well suited for real-time applications, since its computational cost at training and test times is much lower than the original GP formulation. The proposed approach is tested on a unique, large, and real PMMWI database containing a broad variety of sizes, types, and locations of hidden objects.
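The random Fourier feature linearization at the heart of this approach (and of the randomized RX entry below) can be sketched as follows; the feature count D and length-scale s are illustrative. z(x)ᵀz(y) approximates the Gaussian kernel k(x, y) = exp(-‖x-y‖²/(2s²)), so a linear model on z replaces the exact kernel machine:

```python
import numpy as np

def rff(X, D=500, s=1.0, seed=0):
    # Map X (n x d) to D random Fourier features of the Gaussian kernel.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / s, size=(d, D))   # spectral frequency samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)    # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = np.random.randn(5, 3)
Z = rff(X)
approx_K = Z @ Z.T      # approximates the exact Gaussian kernel matrix on X
```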
[161] vixra:2012.0058 [pdf]
Detecting Insincere Questions from Text: A Transfer Learning Approach.
The internet today has become an unrivalled source of information, where people converse on content-based websites such as Quora, Reddit, StackOverflow and Twitter, asking questions and sharing knowledge with the world. A major problem arising on such websites is the proliferation of toxic comments, or instances of insincerity, wherein users, instead of maintaining a sincere motive, indulge in spreading toxic and divisive content. The straightforward course of action in confronting this situation is to detect such content beforehand and prevent it from subsisting online. In recent times, transfer learning in natural language processing has seen unprecedented growth. Today, with the existence of transformers and various state-of-the-art innovations, tremendous progress has been made in various NLP domains. The introduction of BERT caused quite a stir in the NLP community: when published, BERT dominated performance benchmarks, inspiring many other authors to experiment with it and publish similar models. This led to the development of a whole BERT family, each member specialized for a different task. In this paper, we solve the insincere questions classification problem by fine-tuning four cutting-edge models, viz. BERT, RoBERTa, DistilBERT and ALBERT.
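A sketch of fine-tuning one of the four models on the binary insincere-question task with the Hugging Face API; hyperparameters and the example sentence are illustrative, not the paper's:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)   # sincere / insincere head
batch = tok(["Why is the sky blue?"], return_tensors="pt",
            padding=True, truncation=True)
out = model(**batch)                     # logits; fine-tune e.g. via Trainer
# Swapping in "roberta-base", "distilbert-base-uncased" or "albert-base-v2"
# reproduces the other three members of the comparison.
```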
[162] vixra:2012.0048 [pdf]
Randomized RX for Target Detection
This work tackles the target detection problem through the well-known global RX method. The RX method models the clutter as a multivariate Gaussian distribution and has been extended to nonlinear distributions using kernel methods. While kernel RX can cope with complex clutters, it requires a considerable amount of computational resources as the number of clutter pixels grows. Here we propose random Fourier features to approximate the Gaussian kernel in kernel RX; consequently, our development keeps the accuracy of the nonlinearity while reducing the computational cost, which is now controlled by a hyperparameter. Results on both synthetic and real-world image target detection problems show the space and time efficiency of the proposed method while providing high detection performance.
[163] vixra:2011.0179 [pdf]
On the Belief Coulomb Force
Conflict management is a key issue in D-S evidence theory (DST) and has been the focus of many researchers. However, there has been a lack of discussion about whether evidence should be fused at all. In this paper, in the frame of DST and inspired by the belief universal gravitation [1], we propose the concept of belief Coulomb force (BCF) to address whether or not evidence should be fused. It discusses the elimination of conflicts in the information fusion process from the perspective of electricity, which may provide a new idea for solving the problem of conflicting evidence. An application is used to show that conflict management is handled better by the proposed BCF than by previous methods.
[164] vixra:2011.0129 [pdf]
An Attempt to Decrypt Pages 269-271 of Jonathan Safran Foer's "Extremely Loud & Incredibly Close"
In this paper we attempt to decrypt the sequence of digits given by Jonathan Safran Foer in his novel Extremely Loud & Incredibly Close. We create directed acyclic graphs that a human can follow to find potential solutions. Representations of these graphs are displayed in this paper. The Python code used to produce them is also provided in the appendix.
[165] vixra:2010.0225 [pdf]
FUSIONET: A Scalable Framework for Image Classification
Convolutional Neural Networks have become state-of-the-art methods for image classification in recent times. CNNs have proven to be very productive in identifying objects and human faces, and in powering machine vision in robots as well as self-driving cars; at this point, they perform better than human subjects on a large number of image datasets. A large portion of these datasets depends on the idea of solid classes. Hence, image classification has become an exciting and appealing domain in Artificial Intelligence (AI) research. In this paper, we propose a unique framework, FUSIONET, to aid in image classification. Our proposition utilizes the combination of two novel models in parallel (MainNET, a 3 x 3 architecture, and AuxNET, a 1 x 1 architecture). Successively, the feature maps extracted from this combination are fed as input features to a downstream classifier for classification of the images in question. FUSIONET has been trained, tested, and evaluated on real-world datasets, achieving state-of-the-art results on the popular CINIC-10 dataset.
[166] vixra:2010.0147 [pdf]
A Genetic Algorithm and Discriminant Analysis Based Outlier Detector
Fisher Discriminant Analysis (FDA), also known as Linear Discriminant Analysis (LDA), is a simple yet highly effective classification tool for a vast range of datasets and settings. In this paper, we propose to leverage the discriminative potency of FDA for an unsupervised outlier detection algorithm. Unsupervised anomaly detection has been a topic of high interest in the literature due to its numerous practical applications and the fuzzy nature of subjective interpretations of success; it is therefore important to have different types of algorithms that can deliver distinct perspectives. The proposed method selects the subset of outlier points that maximizes the LDA distance between the outlier and non-outlier classes via a genetic algorithm.
[167] vixra:2010.0060 [pdf]
Auto-Encoder Transposed Permutation Importance Outlier Detector
We propose an innovative, trivial yet effective unsupervised outlier detection algorithm called the Auto-Encoder Transposed Permutation Importance Outlier Detector (ATPI), which is based on the fusion of two machine learning concepts, autoencoders and permutation importance. As unsupervised anomaly detection is a subjective task, where the accuracy of results can vary with the demand, we believe this kind of novel framework has great potential in the field.
[168] vixra:2009.0173 [pdf]
Can a Video Game and Artificial Intelligence Assist for Selecting National Soccer Squads ?
We have used the FIFA19 video game's open dataset of soccer player attributes and the actual squad lists of the national teams that participated in World Cup 2018, which almost coincides in time with the game's release date. Given the rationale that numerous expert game developers have spent a considerable amount of time assessing each individual player's attributes, we can develop and test data science and machine learning tools to select national soccer squads, in an attempt to assist coaches. The work provides detailed exploratory data analysis and state-of-the-art machine learning and interpretability measures.
[169] vixra:2009.0165 [pdf]
Combining Conflicting Evidences Based on Pearson Correlation Coefficient and Weighted Graph
Dempster-Shafer evidence theory (evidence theory) has been widely used for its great performance in dealing with uncertainty. Based on evidence theory, researchers have presented different methods to combine evidences. Dempster's rule is the most well-known combination method and has been applied in many fields. However, Dempster's rule may yield counter-intuitive results when evidences are in high conflict. To improve the performance of combining conflicting evidences, in this paper we present a new evidence combination method based on the Pearson correlation coefficient and a weighted graph. The proposed method can correctly identify the target with high accuracy and has better convergence performance than other combination methods. In addition, the weighted graph generated by the proposed method directly represents the relations among the different evidences, which can help researchers determine the reliability of each piece of evidence. Moreover, an experiment is presented to show the efficiency of the proposed method, and the results are analyzed and discussed.
[170] vixra:2009.0138 [pdf]
RGBSticks : A New Deep Learning Based Framework for Stock Market Analysis and Prediction
We present a novel, intuitive graphical representation for daily stock prices, which we refer to as RGBSticks, a variation of classical candlesticks. This representation allows the use of complex deep learning techniques, such as deep convolutional autoencoders and deep convolutional generative adversarial networks, to produce insightful visualizations of the market's past and future states.
[171] vixra:2009.0061 [pdf]
Transparency and Granularity in the SP Theory of Intelligence and Its Realisation in the SP Computer Model
This chapter describes how the SP System, meaning the SP Theory of Intelligence and its realisation as the SP Computer Model, may promote transparency and granularity in AI, and some other areas of application. The chapter describes how transparency in the workings and output of the SP Computer Model may be achieved via three routes: 1) the program provides a very full audit trail for such processes as recognition, reasoning, analysis of language, and so on, including an explicit audit trail for the unsupervised learning of new knowledge; 2) knowledge from the system is likely to be granular and easy for people to understand; and 3) there are seven principles for the organisation of knowledge which are central in the workings of the SP System and also very familiar to people (e.g. chunking-with-codes, part-whole hierarchies, and class-inclusion hierarchies), and that kind of familiarity in the way knowledge is structured by the system is likely to be important in the interpretability, explainability, and transparency of that knowledge. Examples from the SP Computer Model are shown throughout the chapter.
[172] vixra:2009.0018 [pdf]
More Problems in AI Research and How the SP System May Help to Solve Them (Technical Report)
This technical report, an adjunct to the paper "Problems in AI research ...", describes some problems in AI research and how the SP System (meaning the "SP Theory of Intelligence" and its realisation in the "SP Computer Model") may help to solve them. It also contains a fairly detailed outline of the SP System. Most of the problems considered in this report are described by leading researchers in AI in interviews with science writer Martin Ford, and presented in his book "Architects of Intelligence". Problems and their potential solutions that are described in this report are: the need for more emphasis in research on the use of top-down strategies is met by the way SP has been developed entirely within a top-down framework; the risk of accidents with self-driving vehicles may be minimised via the theory of generalisation within the SP System; the need for strong compositionality in the structure of knowledge is met by processes within the SP Computer Model for unsupervised learning and the organisation of knowledge; although commonsense reasoning and commonsense knowledge are challenges for all theories of AI, the SP System has some promising features; the SP programme of research is one of very few working to establish the key importance of information compression in AI research; likewise, the SP programme of research is one of relatively few AI-related research programmes attaching much importance to the biological foundations of intelligence; the SP System lends weight to 'localist' (as compared with 'distributed') views of how knowledge is stored in the brain; compared with deep neural networks, the SP System offers much more scope for adaptation and the representation of knowledge; reasons are given for why the important subjects of motivations and emotions have not so far been considered in the SP programme of research. Evidence in this report, and in "Problems in AI research ...", suggests that the SP System provides a relatively promising foundation for the development of artificial general intelligence.
[173] vixra:2009.0012 [pdf]
Problems in AI Research and How the SP System May Help to Solve Them
This paper describes problems in AI research and how the SP System (described in sources referenced in the paper) may help to solve them. Most of the problems considered in the paper are described by leading researchers in AI in interviews with science writer Martin Ford, and reported by him in his book "Architects of Intelligence". These problems, each with potential solutions via SP, are: the divide between symbolic and non-symbolic kinds of knowledge and processing, and how the SP System may bridge the divide; the tendency of deep neural networks (DNNs) to make large and unexpected errors in recognition, something that does not happen with the SP System; in most AI research, unsupervised learning is regarded as a challenge, but unsupervised learning is central in how SP learns; in other AI research, generalisation, with under- and over-generalisation, is seen as a problem, but it is a problem that has a coherent solution in the SP System; learning usable knowledge from a single exposure or experience is widely regarded as a problem, but it is a problem that is already solved in the SP System; transfer learning (incorporating old knowledge in new) is seen as an unsolved problem, but it is bedrock in how the SP System learns; there is clear potential for the SP System to solve problems that are prevalent in most AI systems: learning that is slow and greedy for large volumes of data and large computational resources; the SP System provides solutions to problems of transparency in DNNs, where it is difficult to interpret stored knowledge and how it is processed; although there have been successes with DNNs in the processing of natural language, the SP System has strengths in the representation and processing of natural languages which appear to be more in accord with how people process natural language, and these strengths in the SP System are well integrated with other strengths of the system in aspects of intelligence; by contrast with DNNs, SP has strengths and potential in human-like probabilistic reasoning, and these are well integrated with strengths in other aspects of intelligence; unlike most DNNs, the SP System eliminates the problem of catastrophic forgetting (where new learning wipes out old learning); the SP System provides much of the generality across several aspects of AI which is missing from much research in AI. The strengths and potential of the SP System in comparison with alternatives suggest that ***the SP System provides a relatively promising foundation for the development of artificial general intelligence***.
[174] vixra:2008.0216 [pdf]
The Quantum Pythagorean Fuzzy Evidence Theory Based on Negation in Quantum of Mass Function
Dempster-Shafer (D-S) evidence theory is an effective methodology for handling unknown and imprecise information, because it can assign probability to the power set. Quantum of mass function (QM) is an extension of D-S evidence theory which combines quantum theory with D-S evidence theory and extends D-S evidence theory to the unit circle in the complex plane. QM thus exhibits greater uncertainty in the framework of the complex plane. Recently, negation has been getting more and more attention because it can analyze information from another point of view. Hence, this paper first proposes the negation of QM by using the subtraction of vectors in the unit circle, which can degenerate into the negation proposed by Yager in standard probability theory and the negation proposed by Yin et al. in D-S evidence theory. The paper then proposes quantum Pythagorean fuzzy evidence theory (QPFET), which is the first work to consider QPFET from the point of view of negation.
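As context for the degenerate case mentioned above, here is a minimal sketch of Yager's negation of an ordinary probability distribution (the special case the proposed QM negation is said to reduce to); this is illustrative only and is not the paper's quantum formulation.

    import numpy as np

    def yager_negation(p):
        # Yager's negation of a probability distribution:
        # p_bar_i = (1 - p_i) / (n - 1); repeated application
        # converges to the uniform distribution.
        p = np.asarray(p, dtype=float)
        return (1.0 - p) / (p.size - 1)

    p = [0.7, 0.2, 0.1]
    print(yager_negation(p))  # [0.15 0.4  0.45]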
[175] vixra:2008.0163 [pdf]
Dynamics of Feed Forward Induced Interference Training
Perceptron model updating with back-propagation has become the routine of deep learning. A continuous feed-forward procedure is required in order for backward propagation to function properly. Doubting the underlying physical interpretation of transformer-based models such as GPT brought about by the routine explanation, a new method of training is proposed in order to keep the physics self-consistent. The GPT model is treated as a space-time diagram, and the worldlines of signals are traced, identifying the possible paths of signals in order for a self-attention event to occur. With a slight modification, self-attention can be viewed as an Ising-model interaction, which enables the goal to be designed as the energy of the system. The target is treated as an external magnetic field inducing signals modeled as magnetic dipoles. A probability network is designed to pilot input signals travelling at constant speed through different routes. A rule for updating the probabilities is designed so as to form constructive interference at target locations, so that the instantaneous energy can be maximised. An experiment is conducted on a 4-class classification problem extracted from MNIST. The results exhibit interesting but expected behaviours which do not exist in a back-propagation-updated network but are more like learning in a real human, especially in the few-shot scenario.
[176] vixra:2007.0209 [pdf]
Lunar Rock Classification Using Machine Learning
Lunar landings by esteemed space agencies around the world have yielded an abundance of new scientific data on the Moon, which has helped scientists to study our closest neighbour and has hence provided evidence for understanding Earth's past and future. This paper addresses the HackerEarth challenge of classifying lunar rocks into small and large rocks. These tasks have historically been conducted by visual image inspection, thereby reducing the scope, reliability and accuracy of the retrieval. The competition was to build a machine learning model to reduce the human effort of doing a monotonous task. We built a Support Vector Machine model, widely used in classification problems, fed with features extracted from the images in the dataset using OpenCV, obtaining an accuracy of 99.41%. Our source code solving the challenge and the dataset are given in the GitHub repository https://github.com/ArshitaKalra/Lunar-Rock-classification.
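For illustration only, a minimal sketch of the pipeline shape described above (OpenCV features feeding an SVM); the histogram feature and the synthetic images below are placeholder assumptions, not the paper's actual feature set or data.

    import cv2
    import numpy as np
    from sklearn.svm import SVC

    def extract_features(img):
        # Placeholder feature: a normalized 32-bin grayscale histogram.
        hist = cv2.calcHist([img], [0], None, [32], [0, 256]).ravel()
        return hist / (hist.sum() + 1e-9)

    # Synthetic stand-ins for the HackerEarth images (0 = small, 1 = large).
    rng = np.random.default_rng(0)
    imgs = [rng.integers(0, 256, (64, 64), dtype=np.uint8) for _ in range(100)]
    labels = rng.integers(0, 2, 100)

    X = np.array([extract_features(im) for im in imgs])
    clf = SVC(kernel="rbf").fit(X, labels)
    print(clf.predict(X[:5]))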
[177] vixra:2007.0200 [pdf]
More Problems in AI Research and How the SP System May Help to Solve Them
This paper, a companion to "Problems in AI research and how the SP System may help to solve them", describes problems in AI research and how the "SP System" (described in sources detailed in the paper) may help to solve them. Most of these problems are described by leading researchers in AI in interviews with science writer Martin Ford, and reported by him in his book "Architects of Intelligence". Problems and their potential solutions that are described in this paper are: the need to rebalance research towards top-down strategies; how to minimise the risk of accidents with self-driving vehicles; the need for strong compositionality in the structure of knowledge; the challenges of commonsense reasoning and commonsense knowledge; establishing the key importance of information compression in AI research; establishing the importance of biological validity in AI research; whether knowledge in the brain is represented in 'distributed' or 'localist' form; the limited scope for adaptation of deep neural networks; and why the important subjects of motivations and emotions have not so far been considered. The evidence in this paper and its companion paper suggests that ***the SP System provides a firmer foundation for the development of artificial general intelligence than any alternative***.
[178] vixra:2007.0110 [pdf]
A Semantic Question Answering in a Restricted Smart Factory Domain Attaching to Various Data Sources
Industrial manufacturing has become more interconnected between smart devices such as Internet of Things edge devices, tablets, manufacturing equipment, and smartphones. Smart factories have emerged and evolved with digital technologies and data science in manufacturing systems over the past few years. Smart factories produce complex data that enables digital manufacturing, smart supply chain management and enhanced assembly line control. Nowadays, smart factories produce a large amount of data that needs to be comprehensible to human operators and experts in decision making. However, linked data is still hard for human operators to understand and interpret; thus we need a system that translates linked data into natural language, or that summarizes the volume of linked data by eliminating undesired results in the linked data repository. In this study, we propose a semantic question answering system for a restricted smart factory domain, attaching to various data sources. In the end, we perform a qualitative and quantitative evaluation of the semantic question answering system, discuss the findings, and conclude the main points with regard to our research questions.
[179] vixra:2007.0085 [pdf]
Microscopy Image Processing for the Human Eye
In vivo confocal microscopy allows scientists to better understand eye health and systemic diseases. Microneuromas could play a role; however, monitoring their growth from a mosaic of images is error-prone and time-consuming. We used automated image stitching as a solution, focusing on the accuracy and computational speed of three different feature detection algorithms: SIFT, SURF, and ORB. The results illustrated that SURF was computationally efficient on our data. Future work is to create a global solution that can replace the need for manual image stitching in this application.
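A minimal sketch of the feature detection and matching step underlying such stitching, using ORB from OpenCV; the synthetic tiles below are placeholder assumptions (the paper also evaluates SIFT and SURF, not shown here).

    import cv2
    import numpy as np

    rng = np.random.default_rng(0)
    # Stand-ins for two overlapping microscopy tiles (placeholder data).
    img1 = rng.integers(0, 256, (256, 256), dtype=np.uint8)
    img2 = np.roll(img1, 40, axis=1)  # simulated horizontal overlap

    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Hamming distance suits ORB's binary descriptors; cross-checking keeps
    # only mutual best matches before a stitching homography is estimated.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    print(len(matches), "candidate correspondences")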
[180] vixra:2007.0084 [pdf]
Nonextensive Belief Entropy
Belief entropy, the extension of information entropy to Dempster-Shafer evidence theory, performs well in handling uncertain information. The Tsallis entropy is another extension of information entropy, and it is nonextensive. However, how to apply the idea of belief entropy to improve the Tsallis entropy is still an open issue. This paper proposes the nonextensive belief entropy (NBE), which consists of belief entropy and Tsallis entropy. If the extensive constant of the proposed model equals 1, then the NBE degenerates into classical belief entropy. Furthermore, when the basic probability assignment degenerates into a probability distribution, the proposed entropy degenerates into classical Tsallis entropy. Meanwhile, if the NBE focuses on a probability distribution and the extensive constant equals 1, then the NBE equals classical information entropy. Numerical examples are applied to prove the efficiency of the proposed entropy. The experimental results show that the proposed entropy combines the belief entropy and Tsallis entropy effectively and successfully.
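For reference, minimal implementations of the two ingredients named above: Deng's belief entropy over a basic probability assignment, and the Tsallis entropy of a probability distribution. The combined NBE formula itself is defined in the paper and is not reproduced here.

    import numpy as np

    def belief_entropy(bpa):
        # Deng's belief entropy: -sum m(A) * log2( m(A) / (2^|A| - 1) ),
        # with bpa a dict mapping focal elements (frozensets) to masses.
        return -sum(m * np.log2(m / (2 ** len(A) - 1))
                    for A, m in bpa.items() if m > 0)

    def tsallis_entropy(p, q):
        # S_q = (1 - sum p_i^q) / (q - 1); recovers Shannon entropy
        # (natural log) in the limit q -> 1.
        p = np.asarray(p, dtype=float)
        return (1.0 - np.sum(p ** q)) / (q - 1.0)

    bpa = {frozenset("a"): 0.6, frozenset("bc"): 0.4}
    print(belief_entropy(bpa), tsallis_entropy([0.6, 0.4], q=2.0))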
[181] vixra:2006.0235 [pdf]
A Vector Interpretation of Quaternion Mass Function
The mass function vector is used to handle uncertainty. Quaternion numbers extend the real numbers. The mass function vector extends the mass function by combining it with a vector. In this paper, the mass function vector is extended by quaternion numbers, named the Quaternion Mass Function Vector (QMFV). The proposed QMFV has the advantage of dealing with uncertain information. When the quaternion numbers degenerate into real numbers, the QMFV degenerates into the classical mass function vector. In addition, if the probability of multiple subsets of the frame of discernment is not assigned to the single subsets, then the mass function vector degenerates into the mass function in classical evidence theory. When the quaternion numbers degenerate into real numbers, the combination rule of quaternion mass function vectors degenerates into the combination rule of mass function vectors. In the case when the probability of multiple subsets of the frame of discernment is not assigned to the single subsets, the combination rule of mass function vectors degenerates into the generalized Dempster's rule of combination. Numerical examples are applied to prove the efficiency of the proposed model. The experimental results show that the proposed model can apply quaternion theory to the mass function vector effectively and successfully.
[182] vixra:2006.0210 [pdf]
Quaternion Mass Function
The mass function is used to handle uncertainty. Quaternion numbers extend the complex numbers. In this paper, the classical mass function is extended by quaternion numbers, named the Quaternion Mass Function (QMF). The proposed QMF has the advantage of dealing with uncertain information. When the quaternion number degenerates into a complex number, the QMF degenerates into the complex mass function. In addition, if the complex mass function is degenerated to real numbers, the QMF is the same as the mass function in classical evidence theory. In the case when the quaternion number degenerates into a real number and the QMF focuses on the single subsets of the frame of discernment, the QMF is the same as the probability distribution in probability theory. A combination rule is also presented to combine two QMFs, which is a generalization of Dempster's rule. In the case when the quaternion mass function degenerates into real numbers and assigns mass only to single subsets, the proposed combination rule degenerates into Bayesian updating in probability theory. Numerical examples are applied to prove the efficiency of the proposed model. The experimental results show that the proposed model can apply quaternion theory to the mass function effectively and successfully.
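As a baseline for the degenerate case described above, a minimal sketch of classical Dempster's rule of combination over real-valued mass functions; the quaternion generalization itself is defined in the paper and not reproduced here.

    from itertools import product

    def dempster_combine(m1, m2):
        # Classical Dempster's rule: multiply masses of every pair of focal
        # elements, keep non-empty intersections, renormalize by 1 - conflict.
        combined, conflict = {}, 0.0
        for (A, a), (B, b) in product(m1.items(), m2.items()):
            C = A & B
            if C:
                combined[C] = combined.get(C, 0.0) + a * b
            else:
                conflict += a * b
        if conflict >= 1.0:
            raise ValueError("total conflict: combination undefined")
        return {A: v / (1.0 - conflict) for A, v in combined.items()}

    m1 = {frozenset("a"): 0.8, frozenset("ab"): 0.2}
    m2 = {frozenset("b"): 0.5, frozenset("ab"): 0.5}
    print(dempster_combine(m1, m2))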
[183] vixra:2006.0126 [pdf]
AIXI Responses to Newcomblike Problems
We provide a rigorous analysis of AIXI's behaviour in repeated Newcomblike settings. In this context, a Newcomblike problem is a setting where an agent is pitted against an environment that contains a perfect predictor, whose predictions are used to determine the environment's outputs. Since AIXI lacks good convergence properties, we chose to focus the analysis on determining whether an environment appears computable to AIXI, that is, whether it maps actions to observations in a way that a computable program can achieve. It is in this sense that, it turns out, AIXI can learn to one-box in *repeated* Opaque Newcomb, and to smoke in *repeated* Smoking Lesion, but may fail all other Newcomblike problems, because we found no way to reduce them to a computable form. However, we still suspect that AIXI can succeed in the repeated settings.
[184] vixra:2006.0110 [pdf]
The Information Volume of Uncertain Information: (7) Information Quality
Information quality is a concept that can be used to measure the information of a probability distribution. Dempster-Shafer evidence theory can describe uncertain information more reasonably than probability theory. Therefore, proposing an information quality measure applicable to evidence theory is a research hot spot. Recently, Deng proposed the concept of information volume based on Deng entropy. It is worth noting that, compared with Deng entropy, the information volume of Deng entropy contains more information. Obviously, it may be more reasonable to use the information volume of Deng entropy to represent uncertain information. Therefore, this article proposes a new information quality measure, which is based on the information volume of Deng entropy. In addition, when the basic probability assignment (BPA) degenerates into a probability distribution, the proposed information quality is consistent with the information quality proposed by Yager and Petry. Finally, several numerical examples illustrate the effectiveness of this new method.
[185] vixra:2006.0064 [pdf]
The Information Volume of Uncertain Information: (6) Information Multifractal Dimension
How to measure uncertainty in the open world is a popular topic in recent study. Many entropy measures have been proposed to address this problem, but most have limitations. In this series of papers, a method for measuring the information volume of a mass function is presented. The fractal property of the maximum information volume is shown in this paper, which indicates the inherent physical meaning of Deng entropy from the perspective of statistics. The results show the multifractal property of this maximum information volume. Some experimental results are presented to support this perspective.
[186] vixra:2006.0062 [pdf]
The Information Volume of Uncertain Information: (4) Negation
Negation is an important operation on uncertain information. Based on the information volume of the mass function, a new negation of the basic probability assignment is presented. The results show that the negation of a mass function increases the information volume. The negation converges to the situation where the Deng entropy is maximum, namely the high-order Deng entropy. If the mass function is degenerated into a probability distribution, the negation of the probability distribution also achieves the maximum information volume, where the Shannon entropy is maximum. Another interesting result illustrates that the situation of maximum Deng entropy has the same information volume as the whole uncertainty environment.
[187] vixra:2006.0061 [pdf]
The Information Volume of Uncertain Information: (5) Divergence Measure
Dempster-Shafer evidence theory is an extension of probability theory which can describe uncertain information more reasonably. The divergence measure is an important concept in probability theory. Therefore, how to propose a reasonable divergence measure has always been a research hot spot in evidence theory. Recently, Deng proposed the concept of information volume based on Deng entropy. It is interesting to note that, compared with the uncertainty measure of Deng entropy, the information volume of Deng entropy contains more information. Obviously, it might be more reasonable to use the information volume of Deng entropy to represent uncertain information. Based on this, in this paper we combine the characteristics of the non-specificity measurement of Deng entropy and propose a new divergence measure. The new divergence measure not only satisfies the axioms of a distance measure, but also has some advantages that cannot be ignored. In addition, when the basic probability assignment (BPA) degenerates into a probability distribution, the measured result of the new divergence measure is the same as that of the traditional Jensen-Shannon divergence. If the mass function is assigned as a probability distribution, the proposed divergence degenerates into the Kullback-Leibler divergence. Finally, some numerical examples are given to show the efficiency of the proposed divergence measure of information volume.
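For reference, the classical Jensen-Shannon divergence that the proposed measure is said to match in the probability-distribution case; a minimal implementation with base-2 logarithms, not the paper's evidential divergence.

    import numpy as np

    def js_divergence(p, q):
        # JSD(P||Q) = 0.5*KL(P||M) + 0.5*KL(Q||M), with M = (P + Q)/2.
        p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
        m = 0.5 * (p + q)

        def kl(a, b):
            mask = a > 0
            return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    print(js_divergence([1.0, 0.0], [0.5, 0.5]))  # ~0.3113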
[188] vixra:2006.0037 [pdf]
The Information Volume of Uncertain Information: (2) Fuzzy Membership Function
In fuzzy set theory, the fuzzy membership function describes the membership degree of certain elements in the universe of discourse. Besides, Deng entropy is an important tool to measure the uncertainty of an uncertain set, and it has been widely applied in many fields. In this paper, firstly, we propose a method to measure the uncertainty of a fuzzy membership function (MF) based on Deng entropy. Next, we define the information volume of the fuzzy MF. By continuously separating the BPA of each element whose cardinality is larger than $1$ until convergence, the information volume of the fuzzy sets can be calculated. When the hesitancy degree of a fuzzy MF is $0$, the information volume of the fuzzy membership function is identical to the Shannon entropy. In addition, several examples and figures are given to illustrate the proposed method and definition.
[189] vixra:2006.0035 [pdf]
The Information Volume of Uncertain Information: (3) Information Fractal Dimension
How to measure uncertainty in the open world is a popular topic in recent study. Many entropy measures have been proposed to address this problem, but most have limitations. In this series of papers, a method for measuring the information volume of a mass function is presented. The fractal property of the maximum information volume is shown in this paper, which indicates the inherent physical meaning of Deng entropy from the perspective of statistics. The results show a linear relationship between the maximum information volume and the probability scale. Some experimental results are presented to support this perspective.
[190] vixra:2006.0028 [pdf]
The Information Volume of Uncertain Information: (1) Mass Function
Given a probability distribution, its corresponding information volume is the Shannon entropy. However, how to determine the information volume of a given mass function is still an open issue. Based on Deng entropy, the information volume of the mass function is presented in this paper. Given a mass function, the corresponding information volume is larger than its uncertainty measured by Deng entropy. The so-called Deng distribution is defined as the BPA condition of the maximum Deng entropy. The information volume of the Deng distribution is called the maximum information volume, which is larger than the maximum Deng entropy. In addition, both the total uncertainty case and the Deng distribution have the same information volume, namely the maximum information volume. Some numerical examples are given to show the efficiency of the proposed information volume of the mass function.
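A small sketch of the Deng distribution mentioned above, i.e. the BPA maximizing Deng entropy, where m(A) is proportional to 2^|A| - 1 over the non-empty subsets of an n-element frame; the maximum Deng entropy then equals log2(3^n - 2^n). The iterative information-volume computation itself is defined in the paper and not reproduced here.

    import numpy as np
    from itertools import combinations

    def deng_distribution(elements):
        # BPA maximizing Deng entropy: m(A) proportional to 2^|A| - 1
        # over all non-empty subsets A of the frame of discernment.
        subsets = [frozenset(c) for r in range(1, len(elements) + 1)
                   for c in combinations(elements, r)]
        total = sum(2 ** len(A) - 1 for A in subsets)  # equals 3^n - 2^n
        return {A: (2 ** len(A) - 1) / total for A in subsets}

    n = 3
    bpa = deng_distribution("abc")
    print(np.log2(3 ** n - 2 ** n))  # maximum Deng entropy, ~4.2479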
[191] vixra:2006.0002 [pdf]
An Artificial Intelligence Enabled Multimedia Tool for Rapid Screening of Cervical Cancer
Cervical cancer is a major public health challenge. Further mitigation of cervical cancer can greatly benefit from the development of innovative and disruptive technologies for its rapid screening and early detection. The primary objective of this study is to contribute to this aim through large-scale screening by developing Artificial Intelligence enabled Intelligent Systems, as they can support human cancer experts in making more precise and timely diagnoses. Our current study is focused on the development of a robust and interactive algorithm for the analysis of colposcope-derived images and a diagnostic tool/scale, namely OM (the Onco-Meter). This tool was trained and tested on 300 Indian subjects/patients, yielding 77% accuracy with a sensitivity of 83.56% and a specificity of 59.25%. OM, the Onco-Meter, is capable of classifying cervigrams into cervical dysplasia, carcinoma in situ (CIS) and invasive cancer (IC). The programming language R has been used to implement and compute earth mover's distances (EMD) to computationally characterize the different disease labels associated with cervical cancer. Deployment of automated tools will facilitate early diagnosis in a noninvasive manner, leading to timely clinical intervention for cervical cancer patients upon detection at a Primary Health Care (PHC) centre. The tool developed in this study will aid clinicians in designing timely intervention strategies aimed at improving the clinical prognosis of patients.
[192] vixra:2005.0160 [pdf]
Effect of Ensembling on ANLI Benchmark
The tremendous achievement of reaching fairly high success metric values on several NLI datasets has raised eyebrows and questions about the real value of these metric numbers. Research papers have started to appear with comprehensive analyses of what these models really learn and the relative difficulty of forcing these models to fail with small syntactic and semantic changes in the input. In particular, the ANLI benchmark is an example of a more challenging NLI task, with the intent of measuring the comprehension capabilities of models in a deeper context. The relative success of transformer-based models on ANLI benchmarks was already reported by Nie et al., 2019. Given the challenging nature of iterative dataset formation, individual models have more difficulty extracting the underlying relationship between the context-hypothesis pair and the target. Ensembles of these individual models might have a higher potential to achieve better performance numbers when the individual performances are that far from the equivalent ones on the SNLI and MNLI tasks. On top of that, making controlled variations of the inputs and tracking the changes in the behavior of those models will give indications about the strength and robustness of the learning process.
[193] vixra:2005.0120 [pdf]
Natural Way to Overcome Catastrophic Forgetting in Neural Networks
Not so long ago, a method was discovered that successfully overcomes the catastrophic forgetting of neural networks. Although we know of cases where this method has been used to preserve skills when adapting pre-trained networks to particular tasks, it has not yet seen widespread adoption. In this paper, we propose an alternative method of overcoming catastrophic forgetting, based on the total absolute signal passing through each connection in the network. This method has a simple implementation and seems to us essentially close to the processes occurring in the brains of animals to preserve previously learned skills during subsequent learning. We hope that the ease of implementation of this method will serve its wide application.
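An illustrative sketch of the general idea, assuming an EWC-style quadratic anchor whose per-connection importance is an accumulated absolute signal; the accumulation proxy below (|weight| x |gradient|) is our assumption, not necessarily the paper's exact rule.

    import torch

    class SignalAnchor:
        def __init__(self, model):
            # Per-parameter importance, accumulated while training task A.
            self.omega = {n: torch.zeros_like(p)
                          for n, p in model.named_parameters()}
            self.anchor = {}

        def accumulate(self, model):
            # Proxy for "total absolute signal through each connection":
            # accumulate |weight * grad| after each backward pass (assumption).
            for n, p in model.named_parameters():
                if p.grad is not None:
                    self.omega[n] += (p.detach() * p.grad.detach()).abs()

        def snapshot(self, model):
            # Remember task-A weights before training task B.
            self.anchor = {n: p.detach().clone()
                           for n, p in model.named_parameters()}

        def penalty(self, model, lam=1.0):
            # Added to the task-B loss: important weights resist change.
            return lam * sum((self.omega[n] * (p - self.anchor[n]) ** 2).sum()
                             for n, p in model.named_parameters())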
[194] vixra:2005.0100 [pdf]
An Agent-Based Control System for Wireless Sensor and Actuator Networks
This paper proposes a novel MIMO control system that combines Distributed Control Systems (DCS) and Centralized Control Systems (CCS). Unlike DCS and CCS, which have several drawbacks such as cost and delay, the proposed system is designed to have local and global controllers simultaneously. This MIMO control system has significant advantages over the two traditional systems in implementation, computation power reduction, cost reduction, performance, and the problems that occur in addressing the system connections in DCS for Wireless Sensor Networks and the Internet of Things. The proposed system is modeled as a Multi-Agent System (MAS), which is implemented in the osBrain MAS framework in Python.
[195] vixra:2004.0363 [pdf]
Multi-Task Deep Learning Based CT Imaging Analysis for Covid-19: Classification and Segmentation
The fast spread of the novel coronavirus COVID-19 has aroused worldwide interest and concern, and has caused more than one and a half million confirmed cases to date. To combat this spread, medical imaging such as computed tomography (CT) can be used for diagnosis. An automatic detection tool is necessary to help screen for COVID-19 pneumonia using chest CT imaging. In this work, we propose a multitask deep learning model to jointly identify COVID-19 patients and segment COVID-19 lesions from chest CT images. Our motivation is to leverage useful information contained in multiple related tasks to help improve both segmentation and classification performance. Our architecture is composed of an encoder, two decoders for reconstruction and segmentation, and a multi-layer perceptron for classification. The proposed model is evaluated and compared with other image segmentation and classification techniques using a dataset of 1044 patients, including 449 patients with COVID-19, 100 normal ones, 98 with lung cancer and 397 with different kinds of pathology. The obtained results show very encouraging performance of our method, with a Dice coefficient higher than 0.78 for the segmentation and an area under the ROC curve higher than 93% for the classification.
[196] vixra:2004.0318 [pdf]
Instancenet: Object Instance Segmentation Using DNN
One-stage object detectors like SSD and YOLO are able to speed up existing two-stage detectors like Faster R-CNN by removing the object proposal stage and making up for the lost performance in other ways. Nonetheless, the same approach is not easily transferable to the instance segmentation task. Current one-stage instance segmentation methods can be simply classified into segmentation-based methods, which segment first and then do clustering, and proposal-based methods, which detect first and then predict masks for each instance proposal. Proposal-based methods always enjoy a better mAP; by contrast, segmentation-based methods are generally faster at inference. In this work, we first propose a one-stage segmentation-based instance segmentation solution, in which a pull loss and a push loss are used for differentiating instances. We then propose two post-processing methods, which provide a trade-off between accuracy and speed.
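A minimal sketch of a pull/push loss of the kind named above, in the style of discriminative embedding losses: pull draws pixel embeddings toward their instance centroid, push separates centroids. The margins and exact form here are illustrative assumptions, not the paper's formulation.

    import torch

    def pull_push_loss(emb, inst, margin_pull=0.5, margin_push=1.5):
        # emb: (N, D) per-pixel embeddings; inst: (N,) instance ids.
        pull, centroids = 0.0, []
        for k in inst.unique():
            e = emb[inst == k]
            mu = e.mean(dim=0)
            centroids.append(mu)
            # Pull: penalize pixels farther than margin_pull from centroid.
            d = (e - mu).norm(dim=1)
            pull = pull + torch.clamp(d - margin_pull, min=0).pow(2).mean()
        C = torch.stack(centroids)
        push, pairs = 0.0, 0
        for i in range(len(C)):
            for j in range(i + 1, len(C)):
                # Push: penalize centroids closer than margin_push.
                gap = torch.clamp(margin_push - (C[i] - C[j]).norm(), min=0)
                push, pairs = push + gap.pow(2), pairs + 1
        return pull / len(C) + (push / pairs if pairs else 0.0)

    emb = torch.randn(100, 8)
    inst = torch.randint(0, 3, (100,))
    print(pull_push_loss(emb, inst))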
[197] vixra:2004.0248 [pdf]
Predicting the Likelihood of Mortality in Confirmed Positive COVID-19 Patients
The novel coronavirus COVID-19 has evolved into a global pandemic. It is therefore imperative that countries and medical facilities are equipped with the technology and resources to give every person the greatest chance of surviving. Yet even developed nations are beginning to run low on medical supplies such as hospital beds, masks, and respirators. With the growth of cases in the United States, hospitals will continue to run out of supplies. It is imperative that medical supplies get distributed first to those who need them most. This paper outlines a machine learning approach to predicting which patients are at the most risk of mortality given a confirmed positive diagnosis of coronavirus. The final results were too inconclusive to be implemented in a real-world scenario.
[198] vixra:2004.0222 [pdf]
Decoupling Global and Local Representations via Invertible Generative Flows
In this work, we propose a new generative model that is capable of automatically decoupling global and local representations of images in an entirely unsupervised setting, by embedding a generative flow in the VAE framework to model the decoder. Specifically, the proposed model utilizes the variational auto-encoding framework to learn a (low-dimensional) vector of latent variables to capture the global information of an image, which is fed as a conditional input to a flow-based invertible decoder with architecture borrowed from style transfer literature. Experimental results on standard image benchmarks demonstrate the effectiveness of our model in terms of density estimation, image generation and unsupervised representation learning. Importantly, this work demonstrates that with only architectural inductive biases, a generative model with a likelihood-based objective is capable of learning decoupled representations, requiring no explicit supervision. The code for our model is available at https://github.com/XuezheMax/wolf.
[199] vixra:2004.0159 [pdf]
HyperSpacetime: Complex Algebro-Geometric Analysis of Intelligence Quantum Entanglement Convergent Evolution
Nature is structural instead of random, correlation is just an approximation of causality, and data is not science: the more we reveal, the more we revere nature on our voyage of unprecedented discovery. We argue that the soul(s) or exotic soul(s) of quotient Hypercomplex arbifold multiscale Spacetime (HyperSpacetime)'s corresponding manifold(s)/general (quotient and non-quotient) HyperSpacetime is the origin of super/general intelligence, and the metric of super/general intelligence is the complexity of quotient/general HyperSpacetime's corresponding generic polynomial. We also argue that the intersecting soul(s) and/or exotic soul(s) as varieties of quotient HyperSpacetime's corresponding manifold(s), when their maximal/minimum sectional curvatures approach positive infinity and/or negative infinity as singularities, is the origin of quantum entanglement. We further argue that the maximal/minimum sectional curvatures of the same intersecting soul(s) and/or exotic soul(s) is the origin of convergent evolution through conformal transformation. We derive even N-dimensional HyperSpacetime, a M-open (\begin{math} M = C_{_{I+N}}^{^I} \text{, } I, N, M \to \infty \end{math}) arbifold as a generalized orbifold with the structure of an algebraic variety $\mathcal{A}$, without or with loop group action as $\mathcal{A}=[\mathcal{M}/\mathcal{LG}]$ ($\mathcal{M}$ as complex manifold, $\mathcal{LG}$ as loop group). It arises from an I-degree (power of 2) hypercomplex even N-degree generic polynomial continuous/discrete function/functor as a nonlinear action functional in hypercomplex $\mathbb{HC}^{\infty}$, useful for generic neural networks: $\mathcal{F}(S_j,T_j)=\prod_{n=1}^{^{N}}(w_nS_n(T_n)+b_n+ \gamma \sum_{k=1}^{^{j}}\mathcal{F}(S_{k-1},T_{k-1}))$ where $j=1,\dots,N$, $S_{i}=s_0e_0+\sum_{i=1}^{^{{I-1}}}s_{i}e_{i}$, $T_{i}=t_0e_0+\sum_{i=1}^{^{{I-1}}}t_{i}e_{i}$ over a noncommutative nonassociative loop group. Its sectional curvature is \begin{math} \kappa = \frac{{\left| {\mathcal{F}''\left(X \right)} \right|}}{{{{\left( {1 + {{\left[ {\mathcal{F}'\left(X \right)} \right]}^2}} \right)}^{\frac{3}{2}}}}} \end{math} if $\mathcal{F}(X)$ is smooth, or \begin{math} \kappa = \kappa_{max}\kappa_{min} \end{math} if nonsmooth, correlating general relativity with quantum mechanics via extension from 3+1 dimensional spacetime $\mathbb{R}^{4}$ to even N-dimensional HyperSpacetime $\mathbb{HC}^{\infty}$. By directly addressing multiscale, singularities, statefulness, and nonlinearity instead of via activation functions and backpropagation, HyperSpacetime, with its corresponding generic polynomial determining the complexity of the ANN, rigorously models curvature-based $2^{nd}$ order optimization in arbifold-equivalent neural networks, beyond the gradient-based $1^{st}$ order optimization in manifold-approximated networks adopted in AI. We establish a HyperSpacetime generic equivalence theory by synthesizing the Generalized Poincar\'{e} conjecture, soul theorem, Galois theory, Fermat's last theorem, Riemann hypothesis, Hodge conjecture, Euler's theorem, Euclid's theorem and the universal approximation theorem. Our theory qualitatively and quantitatively tackles the black box puzzle in AI, quantum entanglement and convergent evolution. Our future work includes HyperSpacetime refinement, complexity reduction and synthesis as our ongoing multiversal endeavor.
[200] vixra:2003.0557 [pdf]
Covid-19 :Statistical Exploration
In this article we present a naive model for the prediction of the number of COVID-19 infections, with illustrations of real data on the evolution of COVID-19 in France.
[201] vixra:2002.0178 [pdf]
Optimal Metamodeling to Interpret Activity-Based Health Sensor Data
Wearable sensors are revolutionizing the health monitoring and medical diagnostics arena. Algorithms and software platforms that can convert the sensor data streams into useful/actionable knowledge are central to this emerging domain, with machine learning and signal processing tools dominating this space. While serving important ends, these tools are not designed to provide functional relationships between vital signs and measures of physical activity. This paper investigates the application of the metamodeling paradigm to health data to unearth important relationships between vital signs and physical activity. To this end, we leverage neural networks and a recently developed metamodeling framework that automatically selects and trains the metamodel that best represents the data set. A publicly available data set is used that provides the ECG data and the IMU data from three sensors (ankle/arm/chest) for ten volunteers, each performing various activities over one-minute time periods. We consider three activities, namely running, climbing stairs, and the baseline resting activity. For the following three extracted ECG features – heart rate, QRS time, and QR ratio in each heartbeat period – models with median error of <25% are obtained. Fourier amplitude sensitivity testing, facilitated by the metamodels, provides further important insights into the impact of the different physical activity parameters on the ECG features, and the variation across the ten volunteers.
[202] vixra:1911.0518 [pdf]
Compressive Analysis and the Future for Privacy
Compressive analysis is the name given to the family of techniques that map raw data to smaller representations. Largely, this includes data compression, data encoding, data encryption, and hashing. In this paper, we analyse the prospects of such technologies in realising customisable individual privacy. We outline the dire need to establish privacy-preserving frameworks and policies, and how individuals can achieve a trade-off between the comfort of an intuitive digital service ensemble and their privacy. We examine the current technologies being implemented, and suggest the crucial advantages of compressive analysis.
[203] vixra:1911.0418 [pdf]
Friendly Smart Crop Observation System
This paper seeks to propose a monitoring/sensing device as a preliminary prototype to alert farmers or cultivators with crucial information and warnings about critical levels of soil moisture, air temperature and humidity in the crop's vicinity. Applications of IoT, data analysis and ML techniques are implemented in the design of the said prototype, which will be further utilized to evolve the continuously gathered data into meaningful forecasts of what constitutes a healthy crop and other actionable information. As a result, more meaningful measures can be taken to ensure the safety of the crop based on gradually enhanced and improved data sets in the long run, which may help increase the standards of practice in the evolving agricultural industry.
[204] vixra:1911.0156 [pdf]
Nonconvex Stochastic Nested Optimization via Stochastic ADMM
We consider the stochastic nested composition optimization problem where the objective is a composition of two expected-value functions. We propose a stochastic ADMM to solve this complicated objective. In order to find an $\epsilon$-stationary point where the expected norm of the subgradient of the corresponding augmented Lagrangian is smaller than $\epsilon$, the total sample complexity of our method is $\mathcal{O}(\epsilon^{-3})$ for the online case and $\mathcal{O}\bigl((2N_1 + N_2) + (2N_1 + N_2)^{1/2}\epsilon^{-2}\bigr)$ for the finite-sum case. The computational complexity is consistent with the proximal version proposed in \cite{zhang2019multi}, but our algorithm can solve more general problems when the proximal mapping of the penalty is not easy to compute.
[205] vixra:1910.0578 [pdf]
Unsupervised Decomposition of Multi-Author Document
This paper proposes an improvement over the paper "A generic unsupervised method for decomposing multi-author documents" (N. Akiva and M. Koppel, 2013). We have worked on two aspects. In the first aspect, we try to capture the writing style of an author by an n-gram model of words, POS tags and a PQ-gram model of syntactic parsing, rather than the basic uni-gram model used previously. In the second aspect, we add some layers of refinement to the existing baseline model and introduce a new term, "similarity index", to distinguish between pure and mixed segments before unsupervised labeling. The similarity index uses overall and sudden changes of writing style, measured by the PQ-gram model and the n-gram model of words, between lexicalised/unlexicalised sentences in segments for refinement. In this paper, we investigate the role of feature selection that captures the syntactic patterns specific to an author and its overall effect on the final accuracy of the baseline system. More specifically, we insert a layer of refinement into the baseline system and define a threshold based on the similarity measure among the sentences to assess the purity of the segments to be given as input to the GMM. The key idea of our approach is to provide the GMM clustering with the "good segments" so that the clustering precision is maximised, which is then used as labels to train a classifier. We also try different feature sets, like bigrams and trigrams of POS tags and a PQ-gram-based feature on unlexicalised PCFG, to capture the distinct writing styles, which are then given as input to a GMM trained by an iterative EM algorithm to generate good clusters of the segments of the merged document.
[206] vixra:1910.0568 [pdf]
Sentiment Classification Over Brazilian Supreme Court Decisions Using Multi-Channel CNN
Sentiment analysis seeks to identify the viewpoint(s) underlying a text document. In this paper, we present the use of a multichannel convolutional neural network which, in effect, creates a model that reads text with different n-gram sizes, to predict with good accuracy the sentiments behind the decisions issued by the Brazilian Supreme Court. Even with a very imbalanced dataset, we show that a simple multichannel CNN with little to zero hyperparameter tuning, and word vectors tuned during network training, achieves excellent results on the Brazilian Supreme Court data. We report results of 97% accuracy and 84% average F1-score in predicting multiclass sentiment dimensions. We also compared the results with classical machine learning classification models like Naive Bayes and SVM.
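A minimal sketch of a multichannel text CNN of the kind described, with parallel convolutions over several kernel sizes acting as n-gram detectors; all sizes and hyperparameters here are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class MultiChannelCNN(nn.Module):
        def __init__(self, vocab=30000, dim=100, classes=4,
                     kernel_sizes=(2, 3, 4), channels=64):
            super().__init__()
            self.emb = nn.Embedding(vocab, dim)
            # One conv "channel" per n-gram size, applied in parallel.
            self.convs = nn.ModuleList(
                [nn.Conv1d(dim, channels, k) for k in kernel_sizes])
            self.fc = nn.Linear(channels * len(kernel_sizes), classes)

        def forward(self, x):                  # x: (batch, seq_len)
            e = self.emb(x).transpose(1, 2)    # (batch, dim, seq_len)
            pooled = [c(e).relu().max(dim=2).values for c in self.convs]
            return self.fc(torch.cat(pooled, dim=1))

    model = MultiChannelCNN()
    logits = model(torch.randint(0, 30000, (8, 50)))
    print(logits.shape)  # torch.Size([8, 4])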
[207] vixra:1910.0514 [pdf]
Review Highlights: Opinion Mining on Reviews: a Hybrid Model for Rule Selection in Aspect Extraction
This paper proposes a methodology to extract key insights from user-generated reviews. This work is based on Aspect Based Sentiment Analysis (ABSA), which predicts the sentiment of aspects mentioned in text documents. The extracted aspects are fine-grained for the presentation form known as Review Highlights. The syntactic approach to the extraction process suffers from overlapping chunking rules, which result in noisy extractions. We introduce a hybrid technique which combines a machine learning model with a rule-based model. A multi-label classifier identifies the effective rules which efficiently parse aspects and opinions from texts. This selection of rules reduces the amount of noise in extraction tasks. This is a novel attempt to learn syntactic rule fitness from a corpus using machine learning for accurate aspect extraction. As the model learns the syntactic rule prediction from the corpus, the extraction method is domain independent. It also allows studying the quality of syntactic rules in a different corpus.
[208] vixra:1910.0433 [pdf]
RTOP: A Conceptual and Computational Framework for General Intelligence
A novel general intelligence model is proposed with three types of learning. A unified sequence of the foreground percept trace and the command trace translates into direct and time-hop observation paths to form the basis of Raw learning. Raw learning includes the formation of image-image associations, which lead to the perception of temporal and spatial relationships among objects and object parts; and the formation of image-audio associations, which serve as the building blocks of language. Offline identification of similar segments in the observation paths and their subsequent reduction into a common segment through merging of memory nodes leads to Generalized learning. Generalization includes the formation of interpolated sensory nodes for robust and generic matching, the formation of sensory properties nodes for specific matching and superimposition, and the formation of group nodes for simpler logic pathways. Online superimposition of memory nodes across multiple predictions, primarily the superimposition of images on the internal projection canvas, gives rise to Innovative learning and thought. The learning of actions happens the same way as raw learning while the action determination happens through the utility model built into the raw learnings, the utility function being the pleasure and pain of the physical senses.
[209] vixra:1910.0400 [pdf]
On the Maximum X Entropy Negation of a Complex-Valued Distribution
In this paper, we propose a generalized model of the negation function, so that it has a more powerful capability to represent knowledge and measure uncertainty. In particular, we first define a vector representation of a complex-valued distribution. Then, an entropy measure is proposed for the complex-valued distribution, called X entropy. After that, a transformation function to acquire the negation of the complex-valued distribution is exploited. Finally, we verify that the proposed negation method has maximal entropy.
[210] vixra:1910.0382 [pdf]
Intrusion Detection using Sequential Hybrid Model
A large amount of work has been done on the KDD 99 dataset, most of which includes the use of a hybrid anomaly and misuse detection model done in parallel with each other. In order to further classify the intrusions, our approach to network intrusion detection includes use of two different anomaly detection models followed by misuse detection applied on the combined output obtained from the previous step. The end goal of this is to verify the anomalies detected by the anomaly detection algorithm and clarify whether they are actually intrusions or random outliers from the trained normal (and thus to try and reduce the number of false positives). We aim to detect a pattern in this novel intrusion technique itself, and not the handling of such intrusions. The intrusions were detected to a very high degree of accuracy.
[211] vixra:1910.0362 [pdf]
WalkRNN: Reading Stories from Property Graphs
WalkRNN, the approach described herein, leverages research in learning continuous representations for nodes in networks, layers in features captured in property graph attributes and labels, and uses Deep Learning language modeling via Recurrent Neural Networks to read the grammar of an enriched property graph. We then demonstrate translating this learned graph literacy into actionable knowledge through graph classification tasks.
[212] vixra:1910.0255 [pdf]
A Deep Neural Network as Surrogate Model for Forward Simulation of Borehole Resistivity Measurements
Inverse problems appear in multiple industrial applications. Solving such inverse problems requires the repeated solution of the forward problem. This is the most time-consuming stage when employing inversion techniques, and it constitutes a severe limitation when the inversion needs to be performed in real time. Here, we focus on the real-time inversion of resistivity measurements for geosteering. We investigate the use of a deep neural network (DNN) to approximate the forward function arising from Maxwell's equations, which govern electromagnetic wave propagation through a medium. By doing so, the evaluation of the forward problem is performed offline, allowing for the online real-time evaluation (inversion) of the DNN.
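An illustrative sketch of the offline/online split in surrogate modeling: a cheap analytic stand-in replaces the expensive forward solver (the real one integrates Maxwell's equations), and a small network is fit to it offline so that online queries are fast. Everything here is a toy assumption.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def forward_solver(x):
        # Toy stand-in for the expensive physics-based forward simulation.
        return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2

    # Offline: sample the input space and train the surrogate once.
    X = np.random.uniform(-1, 1, size=(5000, 2))
    y = forward_solver(X)
    surrogate = MLPRegressor(hidden_layer_sizes=(64, 64),
                             max_iter=500).fit(X, y)

    # Online: surrogate evaluations are cheap enough for real-time inversion.
    print(surrogate.predict(np.array([[0.3, -0.2]])))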
[213] vixra:1910.0234 [pdf]
The Pascal Triangle of Maximum Deng Entropy
The Pascal Triangle (known as the Yang Hui Triangle) is an important structure in mathematics, which has been used in many fields. Entropy plays an essential role in physics. In various fields, information entropy is used to measure the uncertainty of information. Hence, establishing a connection between the Pascal Triangle and information uncertainty is a question worth exploring. Deng proposed Deng entropy, which can measure the non-specificity and discord of a basic probability assignment (BPA) in Dempster-Shafer (D-S) evidence theory. D-S evidence theory and the power set are very closely related. Hence, by analysing the maximum Deng entropy, the paper finds that there is a potential rule for the BPA as the frame of discernment changes. Finally, the paper sets out the relation between the maximum Deng entropy and the Pascal Triangle.
[214] vixra:1909.0513 [pdf]
Captcha Generation and Identification Using Generative Adversarial Networks
Adversarial attacks are an emerging and worrying angle in the field of AI, capable of fooling even the most efficiently trained models into producing results as and when required. Conversely, the same design powering adversarial attacks can be employed for efficient white-hat modeling of deep neural networks. Recently introduced GANs (Generative Adversarial Networks) serve precisely this purpose by generating forged data. Consequently, authentic data identification is a crucial problem to address, considering increased adversarial attacks. This paper proposes an approach using DCGANs (Deep Convolutional Generative Adversarial Networks) to both generate and distinguish artificially produced fake captchas. The generator model produces a significant number of unseen images, and the discriminator model classifies them as fake (0) or genuine (1). Interestingly enough, both models can be configured to learn from each other and become better as they train along.
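A minimal DCGAN generator/discriminator pair for 64x64 grayscale captcha images, as a sketch of the architecture family described; layer sizes and the image resolution are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    def make_generator(z_dim=100):
        # Upsamples a z_dim x 1 x 1 noise vector to a 1 x 64 x 64 image.
        return nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh())

    def make_discriminator():
        # Downsamples a 1 x 64 x 64 image to a single real/fake score.
        return nn.Sequential(
            nn.Conv2d(1, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, True),
            nn.Conv2d(256, 1, 8), nn.Sigmoid())

    g = make_generator()
    print(g(torch.randn(1, 100, 1, 1)).shape)  # torch.Size([1, 1, 64, 64])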
[215] vixra:1909.0074 [pdf]
Deep Reinforcement Learning for Visual Question Answering
The end-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning. However, most current approaches cast human-machine dialogue management as a supervised learning problem, aiming to predict the next utterance of a participant given the full dialogue history. This view is too simplistic to capture the planning problem intrinsic to dialogue, as well as its grounded nature, which makes the context of a dialogue larger than its history alone. This is why only chit-chat and question-answering tasks have so far been addressed with end-to-end architectures. In this report, we present a deep reinforcement learning method for optimizing task-oriented dialogues, based on the policy gradient algorithm. This approach is tested on a dataset of 120,000 dialogues collected via Mechanical Turk and provides encouraging results for solving both the problem of generating natural dialogues and the task of discovering a specific object in a complex image.
[216] vixra:1908.0613 [pdf]
Evidence Universal Gravitation in Evidence Theory
Since the introduction of the law of universal gravitation, it has been widely used in the natural sciences and theoretical exploration. In other disciplines, based on the law of universal gravitation, some scholars have proposed universal gravitation search algorithms, swarm intelligence optimization algorithms and fuzzy control. However, there is no research applying the law of universal gravitation to the field of evidence theory. In this paper, we present for the first time the concept of evidence universal gravitation. In the evidence universal gravitation formula we define the evidence gravitation parameter and the evidence quality generation algorithm. The evidence universal gravitation formula satisfies some basic properties. This paper gives some numerical examples to further illustrate the significance of evidence universal gravitation. In addition, because conflict management is an open question, the measurement of conflict has not been reasonably resolved. In this paper, we apply evidence universal gravitation to conflict processing, and illustrate its wide applicability through a comparison of numerical examples.
[217] vixra:1908.0422 [pdf]
Replication of the Keyword Extraction Part of the Paper "Without the Clutter of Unimportant Words": Descriptive Keyphrases for Text Visualization
"Keyword Extraction" refers to the task of automatically identifying the most relevant and informative phrases in natural language text. As we are deluged with large amounts of text data in many different forms and content - emails, blogs, tweets, Facebook posts, academic papers, news articles - the task of "making sense" of all this text by somehow summarizing them into a coherent structure assumes paramount importance. Keyword extraction - a well-established problem in Natural Language Processing - can help us here. In this report, we construct and test three different hypotheses (all related to the task of keyword extraction) that take us one step closer to understanding how to meaningfully identify and extract "descriptive" keyphrases. The work reported here was done as part of replicating the study by Chuang et al. [3].
[218] vixra:1907.0179 [pdf]
Intuitionistic Fuzzy Decision-Making in the Framework of Dempster-Shafer Structures
The main emphasis of this paper is placed on the problem of multi-criteria decision making (MCDM) in intuitionistic fuzzy environments. Some limitations in the existing literature that explains Atanassov's intuitionistic fuzzy sets (A-IFS) from the perspective of the Dempster-Shafer theory (DST) of evidence are analyzed. To address the issues of using Dempster's rule to aggregate intuitionistic fuzzy values (IFVs), a novel aggregation operator named OWA-based MOS is proposed based on the ordered weighted averaging (OWA) aggregation operator, which allows the expression of decision makers' subjectivity by introducing an attitudinal character. The effectiveness of the developed OWA-based MOS approach in aggregating IFVs is demonstrated on a known example of an MCDM problem. To compare different IFVs obtained from the OWA-based MOS approach, the golden rule representative value for IFV comparison is introduced, which overcomes the shortcomings of score functions. The hierarchical structure of the proposed decision approach is presented based on the above research, which allows us to solve MCDM problems without intermediate defuzzification when not only the criteria but also their weights are represented by IFVs. The proposed OWA-based MOS approach is illustrated as a more flexible decision-making method, which can better solve the problem of intuitionistic fuzzy multi-criteria decision making in the framework of DST.
[219] vixra:1906.0433 [pdf]
Evidential Distance Measure in Complex Belief Function Theory
In this paper, an evidential distance measure is proposed which can measure the difference or dissimilarity between complex basic belief assignments (CBBAs), in which the CBBAs are composed of complex numbers. When the CBBAs are degenerated from complex numbers to real numbers, the proposed distance degenerates into Jousselme et al.'s distance. Therefore, the proposed distance provides a promising way to measure the differences between evidences in the more general framework of the complex plane.
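For reference, a minimal implementation of the real-valued Jousselme distance that the proposed measure is said to reduce to: d(m1, m2) = sqrt(0.5 * (m1 - m2)^T D (m1 - m2)), where D(A, B) is the Jaccard index between focal elements.

    import numpy as np

    def jousselme_distance(m1, m2):
        # Enumerate focal elements, build the Jaccard matrix D, and apply
        # d = sqrt(0.5 * (m1 - m2)^T D (m1 - m2)).
        focal = sorted(set(m1) | set(m2), key=sorted)
        v = np.array([m1.get(A, 0.0) - m2.get(A, 0.0) for A in focal])
        D = np.array([[len(A & B) / len(A | B) for B in focal] for A in focal])
        return float(np.sqrt(0.5 * v @ D @ v))

    m1 = {frozenset("a"): 0.7, frozenset("ab"): 0.3}
    m2 = {frozenset("b"): 0.7, frozenset("ab"): 0.3}
    print(jousselme_distance(m1, m2))  # 0.7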
[220] vixra:1906.0383 [pdf]
Supervised Dimensionality Reduction for Multi-Label Nearest Neighbors
The ML-kNN algorithm is one of the most famous and most efficient multi-label classifiers. Its performance is remarkable when compared with the other state-of-the-art multi-label classifiers. Nevertheless, it suffers from two major drawbacks: its accuracy crucially depends on the metric function used to compute distances between instances, and when dealing with high-dimensional data, the neighborhood identification task becomes very slow. So, both metric learning and dimensionality reduction are essential to improve the ML-kNN performance. In this report, we propose a novel multi-label Mahalanobis distance learned via a supervised dimensionality reduction approach that we call ML-ARP. ML-ARP is a process that adapts random projections on a multi-label dataset to improve the ML-kNN performance. Unlike most state-of-the-art multi-label dimensionality reduction approaches, which solve an eigenvalue or inverse problem, our method is iterative and scales up to high dimensions. There is no eigenvalue or inverse problem to solve. Experiments show that ML-ARP allows us to substantially improve the ML-kNN classifier. Statistical tests assert that ML-ARP is better than the remaining state-of-the-art multi-label dimensionality reduction approaches.
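A sketch of the pipeline shape only (project high-dimensional inputs down, then run a nearest-neighbour multi-label classifier in the reduced space), assuming scikit-learn; the projection here stays random rather than being adapted as ML-ARP does, and plain kNN stands in for ML-kNN.

    import numpy as np
    from sklearn.random_projection import GaussianRandomProjection
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5000))          # high-dimensional instances
    Y = rng.integers(0, 2, size=(1000, 10))    # multi-label indicator matrix

    # Reduce dimensionality, then classify by neighbours in the new space.
    proj = GaussianRandomProjection(n_components=64, random_state=0)
    Xp = proj.fit_transform(X)
    knn = KNeighborsClassifier(n_neighbors=10).fit(Xp[:800], Y[:800])
    print(knn.predict(Xp[800:]).shape)         # (200, 10)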
[221] vixra:1906.0258 [pdf]
Graph Signal Processing: Towards the Diffused Spectral Clustering
Graph signal processing is an emerging field of research. When the structure of signals can be represented as a graph, it allows us to fully exploit their inherent structure. It has been shown that the normalized graph Laplacian matrix plays a major role in the characterization of a signal on a graph. Moreover, this matrix plays a major role in clustering large data sets. In this paper, we present diffused spectral clustering: a novel handwritten digits clustering algorithm based on the normalized graph Laplacian properties. It is a clever combination of a graph feature space transformation and the spectral clustering algorithm. Experimentally, our proposal outperforms the other algorithms of the state of the art.
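For context, a minimal sketch of the normalized graph Laplacian and the spectral embedding it induces (the eigenvectors that standard spectral clustering feeds to k-means); the diffusion-based transformation proposed in the paper is not reproduced here.

    import numpy as np

    def normalized_laplacian(W):
        # L_sym = I - D^{-1/2} W D^{-1/2} for a weighted adjacency matrix W.
        d = W.sum(axis=1)
        d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
        return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    W = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    vals, vecs = np.linalg.eigh(normalized_laplacian(W))
    embedding = vecs[:, :2]   # rows would then be clustered with k-means
    print(embedding.shape)    # (4, 2)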
[222] vixra:1906.0211 [pdf]
Sparse Ensemble Learning with Truncated Convolutional Autoencoders for Cleaning Stained Documents
This paper mainly focuses on how to extract clean text from stained documents. Stains can sometimes make documents very difficult to read, and previous work has shown that no single modelling technique, whether based on image processing or machine learning, can perform well in all cases. As is well known, ensemble techniques combine many modelling techniques and achieve a much reduced error that would not be possible with just a single model. However, the features used for the different models should be sparse or non-overlapping enough to guarantee the independence of each of the modelling techniques. XGBoost is one such ensemble technique; in comparison, gradient boosting machines are very slow, which makes it infeasible to combine more than three models within reasonable execution time. This work mainly focuses on combining truncated convolutional autoencoders, with sparsity taken into account, with machine learning and image processing models using XGBoost, such that the whole model yields a much reduced error compared to single modelling techniques. Experiments are carried out on the public NoisyOffice dataset published in the UCI machine learning repository; this dataset contains training, validation and test sets with a variety of noisy greyscale images, some with ink spots, coffee spots and creased documents. The evaluation metric is RMSE (Root Mean Squared Error), used to show the performance improvement on the variety of badly corrupted images.
[223] vixra:1905.0588 [pdf]
Quantum Model of Mass Function
Dempster-Shafer (D-S) evidence theory, an extension of classical probability, has been used in many fields due to its flexibility and effectiveness in modeling uncertainty. Recently, quantum probability, which can also express uncertainty, has been applied in many fields because of the existence of interference; for human decision and cognition in particular, interference can better model the decision process. In order to expand the applications of D-S evidence theory, this paper proposes a quantum model of the mass function that takes interference into account. In the proposed method, the quantum mass function is represented using Euler's formula. The paper also discusses some operations in the quantum model of the mass function, as well as the relationship between the quantum mass function and the classical mass function, using numerical examples. The classical mass function is the special case in which there is no interference in the quantum mass function.
[224] vixra:1905.0080 [pdf]
A Fast Algorithm for Network Forecasting Time Series
Time series have a wide range of applications in various fields. Recently, a new mathematical tool, the visibility graph, has been developed to transform time series into complex networks. One shortcoming of existing network-based time series prediction methods is that they are time-consuming. To address this issue, this paper proposes a new prediction algorithm based on the visibility graph and Markov chains. In existing network-based time series prediction methods, the main step is to determine the similarity degree between two nodes based on a link prediction algorithm. A new similarity measure between two nodes is presented that avoids the iteration process of classical link prediction algorithms. The prediction of the Construction Cost Index (CCI) shows that the proposed method achieves better accuracy while consuming less time.
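The natural visibility graph construction the abstract relies on is well documented: points a and b of the series are linked if every intermediate point c lies strictly below the straight line joining them. A direct O(n^2) sketch (the paper's similarity measure and the Markov-chain prediction step are not reproduced):

```python
import numpy as np

def visibility_graph(y):
    # Natural visibility criterion: nodes a < b are linked iff every
    # intermediate point c satisfies
    #   y[c] < y[a] + (y[b] - y[a]) * (c - a) / (b - a).
    n = len(y)
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            if all(y[c] < y[a] + (y[b] - y[a]) * (c - a) / (b - a)
                   for c in range(a + 1, b)):
                edges.add((a, b))
    return edges

print(visibility_graph(np.array([1.0, 0.5, 2.0, 1.5, 3.0])))
```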
[225] vixra:1904.0429 [pdf]
MidcurveNN: Encoder-Decoder Neural Network for Computing Midcurve of a Thin Polygon
Various applications need lower-dimensional representations of shapes. A midcurve is a one-dimensional (1D) representation of a two-dimensional (2D) planar shape. It is used in applications such as animation, shape matching, retrieval, finite element analysis, etc. Methods available to compute midcurves vary based on the type of the input shape (images, sketches, etc.) and the processing (thinning, Medial Axis Transform (MAT), Chordal Axis Transform (CAT), straight skeletons, etc.). This paper presents a novel method called MidcurveNN which uses an encoder-decoder neural network for computing the midcurve from images of 2D thin polygons in a supervised learning manner. This dimension-reducing transformation from an input 2D thin polygon image to an output 1D midcurve image is learnt by the neural network, which can then be used to compute the midcurve of an unseen 2D thin polygonal shape.
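The exact MidcurveNN architecture is not given in the abstract; the following is a generic convolutional encoder-decoder for the image-to-image setting described, assuming single-channel 64x64 rasters of the polygon and its midcurve:

```python
import torch
import torch.nn as nn

class MidcurveAE(nn.Module):
    # Generic image-to-image encoder-decoder: the encoder compresses the
    # 2D polygon raster, the decoder emits a raster of its 1D midcurve.
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = MidcurveAE()
out = model(torch.rand(4, 1, 64, 64))  # batch of 4 polygon images
print(out.shape)                        # torch.Size([4, 1, 64, 64])
```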
[226] vixra:1903.0424 [pdf]
Contextual Transformation of Short Text for Improved Classifiability
Text classification is the task of automatically sorting a set of documents into a predefined set of categories. This task has several applications, including separating positive and negative product reviews by customers, automated indexing of scientific articles, spam filtering and many more. At the core of this problem lies the extraction of features from text data which can be used for classification. One common technique is to represent text data as low-dimensional continuous vectors such that semantically unrelated data are well separated from each other. However, sometimes the variability along various dimensions of these vectors is irrelevant, as it is dominated by global factors which are not specific to the classes of interest. This irrelevant variability often causes difficulty in classification. In this paper, we propose a technique which takes the initial vectorized representation of the text data through a transformation that amplifies relevant variability and suppresses irrelevant variability, and then employs a classifier on the transformed data for the classification task. The results show that the same classifier exhibits better accuracy on the transformed data than on the initial vectorized representation of the text data.
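The paper's transformation is not specified in the abstract. One standard concrete instance of "suppressing irrelevant global variability" in text vectors is to project out the dominant principal components of the centred representation (in the spirit of the "all-but-the-top" post-processing); a sketch, not necessarily the authors' method:

```python
import numpy as np

def suppress_global_directions(X, n_remove=2):
    # Centre the vectors and project out their top principal components,
    # which often capture class-irrelevant global factors.
    Xc = X - X.mean(axis=0)
    # Principal directions via SVD of the centred data matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:n_remove]                 # dominant global directions
    return Xc - Xc @ top.T @ top        # remove their contribution

X = np.random.default_rng(0).normal(size=(100, 50))  # toy text vectors
Xt = suppress_global_directions(X)
print(Xt.shape)
```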
[227] vixra:1903.0272 [pdf]
Organic Network Control Systems: Challenges in Building a Generic Solution for Network Protocol Optimisation
In recent years, many approaches to dynamic protocol adaptation in networks have been proposed. Most of them deal with a particular environment, but a much more desirable approach would be to design a generic solution to this problem. Creating a system that is independent of the network type it operates in, and therefore of the protocol type that needs to be adapted, is a major challenge. In this paper we discuss certain problems that come with this task and why they have to be taken into account when designing such a generic system. We first present a generic architecture approach for such a system, followed by a comparison of currently existing Organic Network Control systems for adapting protocols in a Mobile Ad-hoc network and a Peer-to-Peer network. After identifying the major problems, we summarize and evaluate the achieved results.
[228] vixra:1903.0260 [pdf]
Current Trends in Extended Classifier System
Learning is a way to improve our ability to solve problems related to the environment surrounding us. The Extended Classifier System (XCS) is a learning classifier system that uses a reinforcement learning mechanism to solve complex problems with robust performance. It is an accuracy-based system that works by observing the environment, taking input from it and applying suitable actions. Every action of XCS gets feedback in return from the environment, which is used to improve its performance. It is also able to apply a genetic algorithm (GA) to existing classifiers, creating new ones with better performance through crossover and mutation. XCS handles single-step and multi-step problems by using different methods, such as the Q-learning mechanism. The ultimate challenge of XCS is to design an implementation which arranges multiple components in a unique way to produce a compact and comprehensive solution in the least amount of time. Real-time implementation requires flexibility for modifications and uniqueness to cover all aspects. XCS has recently been modified for real input values, and a memory management system has also been introduced, which enhances its ability in different kinds of applications such as data mining and stock exchange control. In this article, the parameters and components of XCS are briefly discussed. The main part of this article covers the extended versions of XCS with further improvements, focusing on applications, usage in real environments and the relationship with organic computing.
[229] vixra:1903.0236 [pdf]
Resolving Limits of Organic Systems in Large Scale Environments: Evaluate Benefits of Holonic Systems Over Classical Approaches
With the rapidly increasing number of devices and application components interacting with each other within larger complex systems, classical system hierarchies increasingly hit their limits when it comes to highly scalable and possibly fluctuating organic systems. The holonic approach for self-* systems claims to solve some of these problems. In this paper, limits of different state-of-the-art technologies and possible solutions to those limits are identified and ranked for scalability, privacy, reliability and performance under fluctuating conditions. Subsequently, the idea and structure of holonic systems are outlined, along with how the previously described solutions can be combined in a holonic environment to resolve those limits. Furthermore, they are classified in the context of current multi-agent systems (MAS). The focus of this work is the area of smart energy grids and similar structures; however, an outlook sketches a few further application scenarios for holonic structures.
[230] vixra:1903.0223 [pdf]
Comparing Anytime Learning to Organic Computing
In environments where finding the best solution to a given problem is computationally infeasible, or undesirable due to other restrictions, the approach of anytime learning has become the de facto standard. Anytime learning allows intelligent systems to adapt and remain operational in a constantly changing environment. Based on observation of the environment, the underlying simulation model is changed to fit the task and the learning process begins anew. This process is expected never to terminate, thereby continually improving the set of available strategies. Optimal management of uncertainty in tasks that require a solution in real time can be achieved by assuming faulty yet improving output. Properties of such a system are not unlike those present in organic systems. This article aims to give an introduction to anytime learning in general as well as to show the similarities to organic computing with regard to the methods and strategies used in both domains.
[231] vixra:1903.0186 [pdf]
Advancements of Deep Q-Networks
Deep Q-Networks first introduced the combination of Reinforcement Learning and Deep Neural Networks at a large scale. These networks are capable of learning their interactions within an environment in a self-sufficient manner for a wide range of applications. Over the following years, several extensions and improvements have been developed for Deep Q-Networks. In this paper, we present the most notable developments for Deep Q-Networks since the algorithm initially proposed in 2013.
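As a reference point for the extensions surveyed, the core DQN update from the original algorithm can be sketched in a few lines (a minimal PyTorch fragment; network sizes and hyper-parameters are illustrative):

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())  # periodically synced copy

def dqn_loss(s, a, r, s_next, done, gamma=0.99):
    # TD target from the frozen target network:
    #   y = r + gamma * max_a' Q_target(s', a')   (zero beyond terminal states)
    with torch.no_grad():
        y = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) actually taken
    return nn.functional.mse_loss(q, y)

s = torch.rand(32, 4); a = torch.randint(0, 2, (32,))
r = torch.rand(32); done = torch.zeros(32)
dqn_loss(s, a, r, torch.rand(32, 4), done).backward()
```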
[232] vixra:1903.0177 [pdf]
Generalized Deng Entropy
Dempster-Shafer evidence theory, as an extension of probability, has wide applications in many fields. Recently, a new entropy called Deng entropy was proposed in evidence theory as an uncertainty measure. Some scholars have pointed out that Deng entropy does not satisfy additivity as an uncertainty measure; this irreducibility can have a huge effect, and in more complex systems the derived entropy is often unusable. Inspired by this, a generalized entropy is proposed, and this entropy implies the relationship between Deng entropy, Rényi entropy and Tsallis entropy.
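Deng entropy itself, which the generalization builds on, has a well-known closed form: E_d(m) = -Σ_A m(A) log2( m(A) / (2^{|A|} - 1) ), where the sum runs over the focal elements A of a basic probability assignment m. A small sketch:

```python
import numpy as np

def deng_entropy(bpa):
    # bpa maps focal elements (frozensets) to masses summing to 1.
    # E_d(m) = -sum_A m(A) * log2( m(A) / (2^{|A|} - 1) ).
    return -sum(m * np.log2(m / (2 ** len(A) - 1))
                for A, m in bpa.items() if m > 0)

# Two focal elements: a singleton and a two-element set.
bpa = {frozenset({"a"}): 0.6, frozenset({"b", "c"}): 0.4}
print(round(deng_entropy(bpa), 4))
```

For singletons the denominator is 1 and the formula reduces to Shannon entropy, which is the sense in which Deng entropy extends it.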
[233] vixra:1903.0168 [pdf]
Organic Traffic Control with Dynamic Route Guidance as a Measure to Reduce Exhaust Emissions in Comparison
In this paper, an Organic Traffic Control system with Dynamic Route Guidance functionality is examined with regard to its emission-reducing effect on road traffic. This system is compared to other environmental measures, namely Low Emission Zones, driving bans and hardware upgrades, with respect to its effect on emissions and other criteria. Results from the existing literature and a few calculations are used for this comparison. The sparse data allows only a few quantitative comparisons. Qualitative comparisons show that this system has the potential to effectively lower emissions in its area of effect. It reduces the quantity of all exhaust gases and additionally fuel consumption, without disadvantages for certain road users; this is not the case with the comparative measures.
[234] vixra:1903.0138 [pdf]
A Survey on Reinforcement Learning for Dialogue Systems
Dialogue systems are computer systems which communicate with humans using natural language. The goal is not just to imitate human communication but to learn from these interactions and improve the system's behaviour over time. Therefore, different machine learning approaches can be implemented, with Reinforcement Learning being one of the most promising techniques for generating a contextually and semantically appropriate response. This paper outlines the current state-of-the-art methods and algorithms for integrating Reinforcement Learning techniques into dialogue systems.
[235] vixra:1903.0135 [pdf]
A Survey on Classification of Concept Drift with Stream Data
Concept drift occurs in many applications of machine learning. Detecting concept drift is the main challenge in a data stream because of the high speed and large size of the data sets, which cannot fit in main memory. We first take a brief look at the types of change in concept drift. This paper then discusses methods for detecting concept drift and focuses on the problems with existing approaches, covering STAGGER, the FLORA family, decision tree methods, meta-learning methods and CD algorithms. Furthermore, classifier ensembles for change detection are discussed.
[236] vixra:1903.0121 [pdf]
Online Transfer Learning and Organic Computing for Deep Space Research and Astronomy
Deep space exploration is one of the pillars of outer space analysis and physical science. The amount of data from the numerous space vehicles and satellites orbiting the objects of study is increasing day by day. The information collected from the various experiences of advanced space missions is huge. This information helps us to enhance current space knowledge, and the experiences can be converted and transformed into segregated knowledge which helps us to explore and understand the realms of deep space. Online Transfer Learning (OTL) is a machine learning concept in which knowledge is transferred between a source domain and a target domain in real time, in order to help train a classifier for the target domain. Online transfer learning can be an efficient method for transferring experiences and data gained from space analysis to a new learning task, and it can also routinely update the knowledge as the task evolves.
[237] vixra:1903.0120 [pdf]
A Discussion of Detection of Mutual Influences Between Socialbots in Online (Social) Networks
Many people organise themselves online in social networks or share knowledge in open encyclopaedias. However, these networks do not belong only to humans: a huge variety of socialbots that imitate humans inhabit them and are connected to each other. The connections between socialbots lead to mutual influences between them. If the influence socialbots have on each other is too big, they adopt the behaviour of the other socialbots and become worse at imitating humans. Therefore, it is necessary to detect when socialbots are mutually influencing each other. To give an overview, socialbots in the social networks Facebook and Twitter and in the open encyclopaedia Wikipedia are observed and the mutual influences between them are detected. Furthermore, this paper discusses how socialbots could handle the detected influences.
[238] vixra:1903.0117 [pdf]
A Survey on Different Mechanisms to Classify Agent Behavior in a Trust Based Organic Computing Systems
Organic Computing (OC) systems differ from traditional software systems, as they are composed of a large number of highly interconnected and distributed subsystems. In systems like this, it is not possible to predict all possible system configurations and to plan an adequate system behavior entirely at design time. An open, decentralized desktop grid is one example; there, trust mechanisms are applied to agents that show self-* properties (self-organization, self-healing, self-configuration and so on). In this article, some mechanisms that could help in the classification of agent behavior at run time in trust-based organic computing systems are illustrated. In doing so, the isolation of agents that reduce the overall system performance becomes possible. The trust concept can be applied to agents, so that agents know whether the agents they interact with belong to the same trust community and how trustworthy they are. Trust is a significant concern in large-scale open distributed systems: it lies at the core of all interactions between agents operating in continuously varying environments. Current research directions in the area of trust in computing systems are evaluated and addressed. This article shows that the mechanisms discussed can successfully identify and classify groups of systems with undesired behavior.
[239] vixra:1903.0089 [pdf]
Deep Meta-Learning and Dynamic Runtime Exploitation of Knowledge Sources for Traffic Control
In the field of machine learning and artificial intelligence, meta-learning describes how previous learning experiences can be used to increase performance on a new task. For this purpose, it can be investigated how prior (similar) tasks have been approached and improved, and knowledge can be obtained about achieving the same goal for the new task. This paper outlines the basic meta-learning process, which consists of learning meta-models from meta-data about tasks, algorithms, and how these algorithms perform on the respective tasks. Further, a focus is set on how this approach can be applied and is already used in the context of deep learning. Here, meta-learning is concerned with the machine learning models themselves, for example how their parameters are initialised or adapted during training. Meta-learning is also assessed from the viewpoint of Organic Computing (OC), where finding effective learning techniques that are able to handle sparse and unseen data is of importance. An alternative perspective on meta-learning from this domain, which focuses on how an OC system can improve its behaviour with the help of external knowledge sources, is highlighted. To bridge the gap between these two perspectives, a model is proposed that integrates a deep, meta-learned traffic flow predictor into an organic traffic control (OTC) system that dynamically exploits knowledge sources during runtime.
[240] vixra:1903.0086 [pdf]
Novelty Detection Algorithms and Their Application in Industry 4.0
Novelty detection is a very important part of intelligent systems. Its task is to classify the data produced by the system and to identify any new or unknown patterns that were not present during the training of the model. Different algorithms have been proposed over the years, using a wide variety of technologies such as probabilistic models and neural networks. Novelty detection and reaction are used to enable self-* properties in technical systems so that they can cope with increasingly complex processes. Using the notions of Organic Computing, industrial factories are becoming more and more advanced and intelligent: machines gain the capability of self-organization, self-configuration and self-adaptation to react to outside influences. This survey paper looks at the state-of-the-art technologies used in Industry 4.0 and assesses different novelty detection algorithms and their usage in such systems. To this end, different data sources, and consequently applications for potential novelty detection, are analyzed. Three different novelty detection algorithms based on different underlying technologies are then presented, and the applicability of these algorithms in combination with the defined scenarios is analyzed.
[241] vixra:1903.0012 [pdf]
A Survey for Testing Self-organizing, Adaptive Systems in Industry 4.0
Complexity in technical development increases rapidly. Regular systems are no longer able to fulfill all the requirements. Organic computing systems are inspired by how complexity is mastered in nature. This leads to a fundamental change in software engineering for complex systems. Based on machine learning techniques, a system develops self-* properties which allow it to make decisions at runtime and to operate with nearly no human interaction. Testing is the part of the software engineering process that ensures the functionality and quality of a system, but with self-organizing, adaptive systems, traditional testing approaches reach their limits. Therefore, new methods for testing such systems have to be developed. Many different testing approaches already exist, most of them developed within a single research group; nevertheless, there is still a need for further discussion and action on this topic. In this paper, the challenges of testing self-organizing, adaptive systems are specified. Three different testing approaches are reviewed in detail, and, in view of the ongoing fourth industrial revolution, it is discussed which of these approaches would fit best for testing industrial manufacturing robots.
[242] vixra:1903.0006 [pdf]
Multi-Agent Reinforcement Learning - From Game Theory to Organic Computing
Complex systems consisting of multiple agents that interact both with each other and with their environment can often be found in nature and in technical applications. This paper gives an overview of important Multi-Agent Reinforcement Learning (MARL) concepts, challenges and current research directions. It briefly introduces traditional reinforcement learning and then shows how MARL problems can be modelled as stochastic games, where the type of problem and the system configuration can lead to different algorithms and training goals. Key challenges such as the curse of dimensionality, choosing the right learning goal and the coordination problem are outlined. In particular, aspects of MARL that have previously been viewed critically are discussed with regard to whether and how current research has addressed these criticisms or shifted its focus. The wide range of possible MARL applications is hinted at by examples from recent research. Further, MARL is assessed from an Organic Computing point of view, where it takes a central role in the context of self-learning and self-adapting systems.
[243] vixra:1902.0386 [pdf]
Diversity of Ensembles for Data Stream Classification
When constructing a classifier ensemble, diversity among the base classifiers is one of the important characteristics. Several studies have been conducted in the context of standard static data, in particular analyzing the relationship between high ensemble predictive performance and the diversity of its components. In addition, ensembles of learning machines have been employed to learn in the presence of concept drift and adapt to it. However, diversity measures have not received much research interest for evolving data streams: only a few researchers directly consider promoting diversity while constructing an ensemble, or while rebuilding it at the moment a drift is detected. In this paper, we present a theoretical analysis of different diversity measures and relate them to the success of ensemble learning algorithms for streaming data. The analysis provides a deeper understanding of the concept of diversity and its impact on online ensemble learning in the presence of concept drift. More precisely, we are interested in answering the following research question: which commonly used diversity measures are used in the context of static-data ensembles, and how far are they applicable in the context of streaming-data ensembles?
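The classical pairwise diversity measures analysed in such studies are easy to state; for example, the disagreement measure and Yule's Q-statistic between two base classifiers follow directly from their joint correct/incorrect counts. A sketch on a toy evaluation stream (the streaming adaptations discussed in the paper are not reproduced):

```python
import numpy as np

def pairwise_diversity(correct_i, correct_j):
    # correct_i / correct_j: boolean arrays, True where each classifier
    # predicted the label correctly on the same evaluation stream.
    n11 = np.sum(correct_i & correct_j)        # both right
    n00 = np.sum(~correct_i & ~correct_j)      # both wrong
    n10 = np.sum(correct_i & ~correct_j)       # only i right
    n01 = np.sum(~correct_i & correct_j)       # only j right
    n = n11 + n00 + n10 + n01
    disagreement = (n10 + n01) / n
    q_statistic = (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)
    return disagreement, q_statistic

rng = np.random.default_rng(0)
a, b = rng.random(1000) < 0.8, rng.random(1000) < 0.7  # toy correctness masks
print(pairwise_diversity(a, b))
```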
[244] vixra:1902.0322 [pdf]
Cross Entropy of Belief Function
Dempster-Shafer evidence theory, as an extension of probability, has wide applications in many fields. Recently, a new entropy called Deng entropy was proposed in evidence theory, and there have been many discussions and applications of it. However, there has been no discussion of how to apply Deng entropy to measure the correlation between two bodies of evidence. In this article, we first review and analyze some of the work related to mutual information. Then we propose extensions of Deng entropy: joint Deng entropy, conditional Deng entropy and cross Deng entropy. In addition, we prove the relevant properties of these entropies. Finally, we also propose a method to obtain joint evidence.
[245] vixra:1902.0279 [pdf]
Divergence Measure of Belief Function
It is important to measure the divergent or conflicting degree among pieces of information during information preprocessing, since combining conflicting bodies of evidence with Dempster's combination rule can produce unreliable results. However, how to measure the divergence between different pieces of evidence is still an open issue. In this paper, a new divergence measure of belief functions based on Deng entropy is proposed in order to measure the divergence between different belief functions. The divergence measure is a generalization of the Kullback-Leibler divergence for probability: when the basic probability assignment (BPA) degenerates to a probability, the divergence measure is equal to the Kullback-Leibler divergence. Numerical examples are used to illustrate the effectiveness of the proposed divergence measure.
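The paper's full divergence is not given in the abstract, but its stated limiting case is: for Bayesian BPAs (all focal elements singletons) the measure must reduce to the Kullback-Leibler divergence, sketched below as the reference case:

```python
import numpy as np

def kl_divergence(p, q):
    # D_KL(p || q) = sum_i p_i * log(p_i / q_i); the abstract states the
    # proposed belief-function divergence reduces to this when the BPAs
    # are Bayesian (all focal elements are singletons).
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

print(kl_divergence([0.7, 0.2, 0.1], [0.5, 0.3, 0.2]))
```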
[246] vixra:1902.0220 [pdf]
Comments on the Book "Architects of Intelligence" by Martin Ford in the Light of the SP Theory of Intelligence
The book "Architects of Intelligence" by Martin Ford presents conversations about AI between the author and influential researchers. Issues considered in the book are described in relation to features of the "SP System", meaning the "SP Theory of Intelligence" and its realisation in the "SP Computer Model", both outlined in an appendix. The SP System has the potential to solve most of the problems in AI described in the book, and some others. Strengths and potential of the SP System, which in many cases contrast with weaknesses of deep neural networks (DNNs), include the following: a top-down research strategy has yielded a system with a favourable combination of conceptual "Simplicity" with descriptive or explanatory "Power"; the SP System has strengths and potential with both symbolic and non-symbolic kinds of knowledge and processing; the system has strengths and long-term potential in pattern recognition; it is free of the tendency of DNNs to make large and unexpected errors in recognition; the system has strengths and potential in unsupervised learning, including grammatical inference; the SP Theory of Intelligence provides a theoretically coherent basis for generalisation and the avoidance of under- or over-generalisations; that theory of generalisation may help improve the safety of driverless; the SP system, unlike DNNs, can achieve learning from a single occurrence or experience; it has relatively tiny demands for computational resources and volumes of data, with potential for much higher speeds in learning; the system, unlike most DNNs, has strengths in transfer learning; unlike DNNs, it provides transparency in the representation of knowledge and an audit trail for all its processing; the system has strengths and potential in the processing of natural language; it exhibits several different kinds of probabilistic reasoning; the system has strengths and potential in commonsense reasoning and the representation of commonsense knowledge; other strengths include information compression, biological validity, scope for adaptation, and freedom from catastrophic forgetting. Despite the importance of motivations and emotions, no attempt has been made in the SP research to investigate these areas.
[247] vixra:1901.0361 [pdf]
A New Divergence Measure of Belief Function in D-S Evidence Theory
Dempster-Shafer (D-S) evidence theory is useful for handling uncertainty problems. In D-S evidence theory, however, how to handle highly conflicting evidence is still an open issue. In this paper, a new reinforced belief divergence measure, called RB, is developed to measure the discrepancy between basic belief assignments (BBAs) in D-S evidence theory. The proposed RB divergence is the first work to consider both the correlations between the belief functions and the subsets of the set of belief functions. Additionally, the RB divergence has desirable properties as a measure. It can provide a more convincing and effective solution for measuring the discrepancy between BBAs in D-S evidence theory.
[248] vixra:1901.0166 [pdf]
Automated Brain Disorders Diagnosis Through Deep Neural Networks
In most cases, the diagnosis of brain disorders such as epilepsy is slow and requires endless visits to doctors and EEG technicians. This project aims to automate brain disorder diagnosis by using Artificial Intelligence and deep learning. The brain can have many disorders that are detectable by reading an electroencephalogram (EEG). Using an EEG device and collecting the electrical signals directly from the brain with a non-invasive procedure gives significant information about its health. Classifying and detecting anomalies in these signals is what doctors currently do when reading an EEG. With the right amount of data and the use of Artificial Intelligence, it is possible to learn and classify these signals into groups (e.g. anxiety, epileptic spikes, etc.) and then train a neural network to interpret those signals and identify evidence of a disorder, finally automating the detection and classification of the disorders found.
[249] vixra:1901.0051 [pdf]
Commonsense Reasoning, Commonsense Knowledge, and the SP Theory of Intelligence
Commonsense reasoning (CSR) and commonsense knowledge (CSK) (together abbreviated as CSRK) are areas of study concerned with problems which are trivially easy for adults but which are challenging for artificial systems. This paper describes how the "SP System" -- meaning the "SP Theory of Intelligence" and its realisation in the "SP Computer Model" -- has strengths and potential in several aspects of CSRK. A particular strength of the SP System is that it shows promise as an overarching theory for four areas of relative success with CSRK problems -- described by other authors -- which have been developed without any integrative theory. How the SP System may help to solve four other kinds of CSRK problem is described: 1) how the strength of evidence for a murder may be influenced by the level of lighting of the murder as it was witnessed; 2) how people may arrive at the commonly-accepted interpretation of phrases like "water bird"; 3) interpretation of the horse's-head scene in "The Godfather" film; and 4) how the SP System may help to resolve the reference of an ambiguous pronoun in sentences in the format of a "Winograd schema". Also described is why a fifth CSRK problem -- modelling how a cook may crack an egg into a bowl -- is beyond the capabilities of the SP System as it is now, and how those deficiencies may be overcome via planned developments of the system.
[250] vixra:1901.0042 [pdf]
Smoke Detection: Revisit the PCA Matting Approach
This paper revisits a novel approach, PCA matting, for smoke detection, in which removing the effect of the background image and extracting textural features are taken into account. The article considers an image as a linear blending of a smoke component and a background component. Under this assumption, the paper discusses a model and its solution using the concept of PCA.
[251] vixra:1901.0038 [pdf]
Two Applications of Data Analysis Methods with R
In this project, we apply two data analysis methods (hierarchical clustering and principal component analysis) to study two data samples. We begin with a short presentation of the theoretical tools, then we present our analysis using these two methods in the R language. The theoretical part is mainly based on the Wikistat course material.
[252] vixra:1812.0443 [pdf]
Review: Generic Multi-Objective Deep Reinforcement Learning (MODRL)
In this paper, the author reviews the existing survey on MODRL published in March 2018 by Thanh Thi Nguyen and discusses the variety of reinforcement learning approaches in terms of multi-objective problem settings.
[253] vixra:1812.0306 [pdf]
Power Law and Dimension of the Maximum Value for Belief Distribution with the Max Deng Entropy
Deng entropy is a novel and efficient uncertainty measure for dealing with imprecise phenomena, and is an extension of Shannon entropy. In this paper, the power law and the dimension of the maximum value for belief distributions with maximum Deng entropy are presented, which partially uncovers the inherent physical meaning of Deng entropy from the perspective of statistics. This indicates that some work related to power laws or scale-free behaviour can be analyzed using Deng entropy. The results of some numerical simulations are used to support the new views.
[254] vixra:1812.0250 [pdf]
Aspie96 at IronITA (EVALITA 2018): Irony Detection in Italian Tweets with Character-Level Convolutional RNN
Irony is characterized by a strong contrast between what is said and what is meant: this makes its detection an important task in sentiment analysis. In recent years, neural networks have given promising results in different areas, including irony detection. In this report, I describe the system used by the Aspie96 team in the IronITA competition (part of EVALITA 2018) for irony and sarcasm detection in Italian tweets.
[255] vixra:1812.0069 [pdf]
Divergence Measure of Intuitionistic Fuzzy Sets
As a generalization of fuzzy sets, intuitionistic fuzzy sets (IFSs) have a more powerful ability to represent and deal with the uncertainty of information. The distance measure between IFSs is still an open question. In this paper, we propose a new distance measure between IFSs on the basis of the Jensen-Shannon divergence. The new distance measure of IFSs not only satisfies the axiomatic definition of a distance measure, but can also better discriminate the difference between IFSs. As a result, the new distance measure can generate more reasonable results.
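One natural construction consistent with the abstract (though not necessarily the authors' exact formula) treats each IFS element as the distribution (mu, nu, pi), with hesitancy pi = 1 - mu - nu, and measures distance via the Jensen-Shannon divergence:

```python
import numpy as np

def js_divergence(p, q):
    # Jensen-Shannon divergence between two discrete distributions.
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def ifs_distance(mu1, nu1, mu2, nu2):
    # Each IFS element becomes the distribution (mu, nu, pi); the square
    # root of the JS divergence is itself a metric on distributions.
    p = [mu1, nu1, 1 - mu1 - nu1]
    q = [mu2, nu2, 1 - mu2 - nu2]
    return np.sqrt(js_divergence(p, q))

print(ifs_distance(0.5, 0.3, 0.6, 0.2))
```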
[256] vixra:1811.0417 [pdf]
Distance Measure of Pythagorean Fuzzy Sets
The Pythagorean fuzzy set (PFS), as an extension of the intuitionistic fuzzy set, is more capable of expressing and handling uncertainty in uncertain environments. However, how to measure the distance between Pythagorean fuzzy sets appropriately is still an open issue. Therefore, a novel distance measure between Pythagorean fuzzy sets based on the Jensen-Shannon divergence is proposed in this paper. The new distance measure has the following merits: i) it meets the axiomatic definition of a distance measure; ii) it can better indicate the discrimination degree of PFSs. Numerical examples demonstrate that the proposed PFSJS distance measure is feasible and reasonable.
[257] vixra:1811.0390 [pdf]
Deng Entropy in Thermodynamics
Dempster-Shafer theory (D-S theory) has been widely used in many fields. Recently, a new entropy called Deng entropy was proposed in D-S theory. As an extension of Shannon entropy, it can deal with uncertainty problems in D-S theory. Entropy originated in physics and was later widely used in many fields, so a natural question is: what is the form of Deng entropy in physics? In this paper, we propose Deng entropy in thermodynamics and, under the conditions of a given system, deduce its form. In addition, we discuss its properties. First, the Deng entropy of thermodynamics is an extension of Gibbs entropy, just as Deng entropy is an extension of Shannon entropy; similarly, Deng entropy in thermodynamics is a measure of uncertainty. Given the state distribution of particles in a system, we can describe the uncertainty of particle states through Deng entropy in thermodynamics. Then, by proof, we find that Deng entropy in thermodynamics does not satisfy additivity. Finally, we also derive the probability distribution of the system when Deng entropy in thermodynamics reaches its extreme value.
[258] vixra:1811.0367 [pdf]
The Semi-Pascal Triangle of Maximum Deng Entropy
In D-S theory, measuring uncertainty has attracted much attention. Deng proposed the interesting Deng entropy, which can measure non-specificity and discord. Hence, exploring the physical meaning of Deng entropy is an essential issue. Based on the maximum Deng entropy and fractals, this paper discusses the relation between them.
[259] vixra:1808.0680 [pdf]
High-Accuracy Inference in Neuromorphic Circuits using Hardware-Aware Training
Neuromorphic Multiply-And-Accumulate (MAC) circuits utilizing synaptic weight elements based on SRAM or novel Non-Volatile Memories (NVMs) provide a promising approach for highly efficient hardware representations of neural networks. NVM density and robustness requirements suggest that off-line training is the right choice for "edge" devices, since the requirements for synapse precision are much less stringent. However, off-line training using ideal mathematical weights and activations can result in significant loss of inference accuracy when applied to non-ideal hardware. Non-idealities such as multi-bit quantization of weights and activations, non-linearity of weights, finite max/min ratios of NVM elements, and asymmetry of positive and negative weight components all result in degraded inference accuracy. In this work, it is demonstrated that non-ideal Multi-Layer Perceptron (MLP) architectures using low-bitwidth weights and activations can be trained with negligible loss of inference accuracy relative to their floating-point-trained counterparts, using a proposed off-line, continuously differentiable HW-aware training algorithm. The proposed algorithm is applicable to a wide range of hardware models and uses only standard neural network training methods. The algorithm is demonstrated on the MNIST and EMNIST datasets, using standard MLPs.
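The paper's training algorithm is continuously differentiable and is not reproduced here; as a reference for the kind of weight non-ideality being modelled, the sketch below shows the more common straight-through fake-quantization of weights during training:

```python
import torch

class FakeQuantize(torch.autograd.Function):
    # Quantize in the forward pass, pass gradients straight through in the
    # backward pass -- a common way to train with low-bitwidth weights.
    @staticmethod
    def forward(ctx, w, n_bits=4):
        levels = 2 ** n_bits - 1
        scale = w.detach().abs().max() / (levels / 2) + 1e-12
        return torch.clamp(torch.round(w / scale),
                           -(levels // 2) - 1, levels // 2) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None  # straight-through estimator

w = torch.randn(8, requires_grad=True)
wq = FakeQuantize.apply(w, 4)   # use wq in place of w during training
wq.sum().backward()
print(w.grad)                   # gradients flow as if unquantized
```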
[260] vixra:1808.0610 [pdf]
The Complexity of Student-Project-Resource Matching-Allocation Problems
In this technical note, I settle the computational complexity of nonwastefulness and stability in student-project-resource matching-allocation problems, a model first proposed by [pc2017]. I show that computing a nonwasteful matching is complete for class $\text{FP}^{\text{NP}}[\text{poly}]$ and computing a stable matching is complete for class $\Sigma_2^P$. These results involve the creation of two fundamental problems: ParetoPartition, shown complete for $\text{FP}^{\text{NP}}[\text{poly}]$, and $\forall\exists$-4-Partition, shown complete for $\Sigma_2^P$. Both are number problems that are hard in the strong sense.
[261] vixra:1808.0155 [pdf]
The Complexity of Robust and Resilient $k$-Partition Problems
In this paper, we study a $k$-partition problem where a set of agents must be partitioned into a fixed number $k$ of non-empty coalitions. The value of a partition is the sum of the pairwise synergies inside its coalitions. Firstly, we aim at computing a partition that is robust to failures of any set of agents with bounded size. Secondly, we focus on resiliency: when a set of agents fails, others can be moved to replace them. We settle the computational complexity of the decision problem Robust-$k$-Part as complete for class $\Sigma_2^P$. We also conjecture that resilient $k$-partition is complete for class $\Sigma_3^P$ under simultaneous replacements, and for class PSPACE under sequential replacements.
[262] vixra:1808.0133 [pdf]
High-Level Task Planning in Robotics with Symbolic Model Checking
A robot control system contains a low-level motion planner and a high-level task planner. The motions are generated with keyframe-to-keyframe planning, while the tasks are described with primitive action names. A good starting point for formalizing task planning is a mindmap created manually for a motion capture recording: it contains the basic actions in natural language and is the blueprint for a formal ontology. The mocap annotations are extended with features into a dataset, which is used for training a neural network. The resulting model is a qualitative physics engine, which predicts future states of the system.
[263] vixra:1807.0485 [pdf]
Intuitionistic Evidence Sets
Dempster-Shafer evidence theory can express and deal with uncertain and imprecise information well, and it requires weaker conditions than Bayesian probability theory. The traditional single basic probability assignment only considers the degree to which the evidence supports subsets of the frame of discernment. In order to simulate human decision-making processes and activities requiring human expertise and knowledge, intuitionistic evidence sets (IES) are proposed in this paper. They take into account not only the degree of support, but also the degree of non-support. The combination rule of intuitionistic basic probability assignments (IBPAs) is also investigated. The feasibility and effectiveness of the proposed method are illustrated with an application to multi-criteria group decision making.
[264] vixra:1807.0318 [pdf]
Structural Damage Information Decision Based on Z-numbers
Structural health monitoring (SHM) has great economic and research value because of the application of finite element model technology, structural damage identification theory, intelligent sensing systems, signal processing technology, and so on. A typical SHM system involves three major subsystems: a sensor subsystem, a data processing subsystem and a health evaluation subsystem. Sensor data fusion is of great significance for the data processing subsystem. In this paper, considering the fuzziness and reliability of the data, a method based on Z-numbers is proposed for damage information fusion at the decision level; it is a softer method that avoids a small amount of data having a severe effect on the fusion result. The results of a simulation example of a spatial structure show the effectiveness of this method.
[265] vixra:1807.0257 [pdf]
Generalized Ordered Propositions Fusion Based on Belief Entropy
A set of ordered propositions describes the different intensities of a characteristic of an object, where the intensities increase or decrease gradually. A basic support function is a set of truth-values of ordered propositions; it includes a determinate part and an indeterminate part, the latter indicating uncertainty about all ordered propositions. In this paper, we propose generalized ordered propositions by extending the basic support function to the power set of ordered propositions. We also present an entropy, based on belief entropy, which measures the uncertainty of a basic support function. A fusion method for generalized ordered propositions is also presented. Generalized ordered propositions degenerate to classical ordered propositions when the truth-values of non-singleton subsets of ordered propositions are zero. Some numerical examples are used to illustrate the efficiency of generalized ordered propositions and their fusion.
[266] vixra:1807.0245 [pdf]
Measuring Fuzziness of Z-numbers and Its Application in Sensor Data Fusion
Real-world information is often characterized by fuzziness due to uncertainty. A Z-number is an ordered pair of fuzzy numbers and is widely used as a flexible and efficient model for dealing with fuzzy information. This paper extends the fuzziness measure to continuous fuzzy numbers. Then, a new fuzziness measure of discrete and continuous Z-numbers is proposed: the simple addition of the fuzziness measures of the two fuzzy numbers of a Z-number. It can be used to obtain a fused Z-number with the best information quality in sensor fusion applications based on Z-numbers. Some numerical examples and an application in sensor fusion are presented to show the efficiency of the proposed fuzziness measure of Z-numbers.
[267] vixra:1807.0239 [pdf]
Using Textual Summaries to Describe a Set of Products
When customers are faced with the task of making a purchase in an unfamiliar product domain, it might be useful to provide them with an overview of the product set to help them understand what they can expect. In this paper we present and evaluate a method to summarise sets of products in natural language, focusing on the price range, common product features across the set, and product features that impact on price. In our study, participants reported that they found our summaries useful, but we found no evidence that the summaries influenced the selections made by participants.
[268] vixra:1806.0402 [pdf]
New Sufficient Conditions of Robust Recovery for Low-Rank Matrices
In this paper we investigate the reconstruction conditions of nuclear norm minimization for low-rank matrix recovery from a given linear system of equality constraints. Sufficient conditions are derived to guarantee robust reconstruction in bounded $l_2$ and Dantzig selector noise settings $(\epsilon\neq0)$, or exact reconstruction in the noiseless setting $(\epsilon=0)$, of all rank-$r$ matrices $X\in\mathbb{R}^{m\times n}$ from $b=\mathcal{A}(X)+z$ via nuclear norm minimization. Furthermore, we not only show that when $t=1$ the upper bound of $\delta_r$ is the same as the result of Cai and Zhang [Cai and Zhang], but also demonstrate that the obtained upper bounds on the recovery error are better. Finally, we prove that the restricted isometry property condition is sharp.
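Nuclear norm minimization of the kind analysed here is typically solved via singular value thresholding, the proximal operator of the nuclear norm; a minimal sketch (the paper concerns recovery conditions, not this particular solver):

```python
import numpy as np

def svt(Y, tau):
    # Proximal operator of the nuclear norm: shrink singular values by tau.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# A proximal-gradient scheme for min ||X||_* s.t. A(X) ~ b would
# interleave svt() with a gradient step on the data-fidelity term.
X = svt(np.random.default_rng(0).normal(size=(20, 15)), tau=2.0)
print(np.linalg.matrix_rank(X))  # thresholding lowers the rank
```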
[269] vixra:1806.0286 [pdf]
An End-to-end Model of Predicting Diverse Ranking on Heterogeneous Feeds
As external assistance for online shopping, multimedia content (feeds) plays an important role in the e-commerce field. Feeds in the formats of posts, item lists and videos bring in richer auxiliary information and more authentic assessments of commodities (items). At Alibaba, the largest Chinese online retailer, a content search engine (CSE) is utilized for feed recommendation alongside the traditional item search engine (ISE). However, the diversity of feed types raises a challenge for the CSE: ranking heterogeneous feeds. In this paper, a two-step end-to-end model comprising Heterogeneous Type Sorting and Homogeneous Feed Ranking is proposed to address this problem. In the first step, an independent Multi-Armed Bandit (iMAB) model is proposed, and an improved personalized Markov Deep Neural Network (pMDNN) model is developed later on. In the second step, an existing Deep Structured Semantic Model (DSSM) is utilized for homogeneous feed ranking. An A/B test in the Alibaba production environment shows that, by considering user preference and feed type dependency, the pMDNN model significantly outperforms the iMAB model in solving the heterogeneous feed ranking problem.
[270] vixra:1806.0044 [pdf]
Deep Learning on MIMIC-III: Predicting Mortality Within 24 Hours
This project describes data mining on the MIMIC-III database. The objective is to predict in-hospital death using MIMIC-III. The project follows the Knowledge Discovery in Databases (KDD) process: 1. Selection and extraction of a multivariate time-series dataset from a database of millions of rows by writing SQL queries. 2. Preprocessing and cleaning of the time series into a tidy dataset by exploring the data, handling missing data (missing-data rate > 50%) and removing noise/outliers. 3. Development of a predictive model that associates a severity indicator (probability of mortality) with the biomedical time series, implementing several algorithms such as gradient-boosted decision trees and k-NN (k-nearest neighbors) with the DTW (Dynamic Time Warping) algorithm. 4. Result: a 30% increase in F1 score (a measure of a test's accuracy) compared with the medical scoring index (SAPS II).
[271] vixra:1805.0520 [pdf]
An English-Hindi Code-Mixed Corpus: Stance Annotation and Baseline System
Social media has become one of the main channels for people to communicate and share their views with society. We can often detect from these views whether a person is in favor of, against, or neutral towards a given topic. These opinions from social media are very useful for various companies. We present a new dataset that consists of 3545 English-Hindi code-mixed tweets with opinions towards the demonetisation that was implemented in India in 2016, which was followed by a large countrywide debate. We present a baseline supervised classification system for stance detection, developed using the same dataset, that uses various machine learning techniques to achieve an accuracy of 58.7% on 10-fold cross validation.
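The abstract does not list the baseline's features; a typical supervised stance baseline of this kind, evaluated with 10-fold cross validation, might look like the sketch below (the toy tweets and labels are invented stand-ins for the dataset):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Invented stand-ins for the annotated code-mixed tweets; labels are one
# of {"favor", "against", "neutral"} in the real dataset.
texts = ["notebandi se sab pareshan hai", "demonetisation is a bold move"]
labels = ["against", "favor"]
texts, labels = texts * 10, labels * 10   # repeated only so 10 folds exist

pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word uni- and bigram features
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(pipeline, texts, labels, cv=10)  # 10-fold CV
print(scores.mean())
```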
[272] vixra:1805.0519 [pdf]
A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection
Social media platforms like Twitter and Facebook have become two of the largest mediums for people to express their views on different topics. The generation of such large user data has made NLP tasks like sentiment analysis and opinion mining much more important. Using sarcasm in texts on social media has become a popular trend lately. Sarcasm reverses the meaning and polarity of what is implied by the text, which poses a challenge for many NLP tasks. The task of sarcasm detection in text is gaining more and more importance for both commercial and security services. We present the first English-Hindi code-mixed dataset of tweets marked for the presence of sarcasm and irony, where each token is also annotated with a language tag. We present a baseline supervised classification system developed using the same dataset, which achieves an average F-score of 78.4 after using a random forest classifier and performing 10-fold cross validation.
[273] vixra:1805.0267 [pdf]
An Improved Method of Generating Z-Number Based on OWA Weights and Maximum Entropy
How to generate Z-numbers is an important and open issue in the uncertain information processing of Z-numbers. In [1], a method of generating Z-numbers using OWA weights and maximum entropy is investigated. However, the meaning of the method in [1] is not clear enough according to the definition of a Z-number. Inspired by the methodology in [1], we improve the method of determining Z-numbers based on OWA weights and maximum entropy, making the meaning of the Z-number clearer. Some numerical examples are used to illustrate the effectiveness of the proposed method.
[274] vixra:1805.0226 [pdf]
A Memristor based Unsupervised Neuromorphic System Towards Fast and Energy-Efficient GAN
Deep learning has gained immense success in pushing today's artificial intelligence forward. To address the challenge of limited labeled data in supervised learning, unsupervised learning was proposed years ago, but low accuracy hinders its realistic applications. The generative adversarial network (GAN) emerges as an unsupervised learning approach with promising accuracy and is under extensive study. However, the execution of GANs is extremely memory- and computation-intensive, resulting in ultra-low speed and high power consumption. In this work, we propose a holistic solution for fast and energy-efficient GAN computation through a memristor-based neuromorphic system. First, we exploit a hardware and software co-design approach to map the computation blocks in a GAN efficiently. We also propose an efficient data flow for optimal parallelism in training and testing, depending on the computation correlations between different computing blocks. To compute the unique and complex loss of GANs, we develop a diff-block with optimized accuracy and performance. Experimental results on big data show that our design achieves a 2.8x speedup and 6.1x energy saving compared with a traditional GPU accelerator, as well as a 5.5x speedup and 1.4x energy saving compared with a previous FPGA-based accelerator.
[275] vixra:1805.0089 [pdf]
Group Sparse Recovery in Impulsive Noise Via Alternating Direction Method of Multipliers
In this paper, we consider the recovery of group sparse signals corrupted by impulsive noise. In some recent literature, researchers have utilized stable data-fitting models, like the $l_1$-norm, the Huber penalty function and the Lorentzian norm, to substitute for the $l_2$-norm data fidelity model and obtain more robust performance. In this paper, a stable model is developed which exploits the generalized $l_p$-norm as the measure of the error for sparse reconstruction. To address this model, we propose an efficient alternating direction method of multipliers, which incorporates the proximity operator of $l_p$-norm functions into the framework of Lagrangian methods. Besides, to guarantee the convergence of the algorithm in the case $0\leq p<1$ (the nonconvex case), we take advantage of a smoothing strategy. For both $0\leq p<1$ (the nonconvex case) and $1\leq p\leq2$ (the convex case), we derive the conditions for the convergence of the proposed algorithm. Moreover, under the block restricted isometry property with constant $\delta_{\tau k_0}<\tau/(4-\tau)$ for $0<\tau<4/3$ and $\delta_{\tau k_0}<\sqrt{(\tau-1)/\tau}$ for $\tau\geq4/3$, a sharp sufficient condition for group sparse recovery in the presence of impulsive noise and its associated error upper bound estimation are established. Numerical results based on synthetic block-sparse signals and real-world FECG signals demonstrate the effectiveness and robustness of the new algorithm under highly impulsive noise.
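The workhorse inside such ADMM solvers for group sparsity is the proximal operator of the mixed l2,1-norm, which shrinks each group's l2-norm; a minimal sketch (the paper's lp smoothing and the full ADMM loop are not reproduced):

```python
import numpy as np

def group_soft_threshold(x, groups, tau):
    # prox of tau * sum_g ||x_g||_2: shrink each group's l2 norm by tau,
    # zeroing groups whose norm falls below the threshold.
    out = np.zeros_like(x)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > tau:
            out[g] = (1.0 - tau / norm) * x[g]
    return out

x = np.array([3.0, 4.0, 0.1, -0.2])
groups = [np.array([0, 1]), np.array([2, 3])]
print(group_soft_threshold(x, groups, tau=1.0))
# first group (norm 5) survives, shrunken; second (norm ~0.22) is zeroed
```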
[276] vixra:1803.0675 [pdf]
A Survey on Reasoning on Building Information Models Based on IFC
Building Information Models (BIM) are computer models that act as a main source of building information and integrate several aspects of engineering and architectural design, including building utilisation. They aim at enhancing the efficiency and effectiveness of projects during design, construction, and maintenance. Artificial Intelligence, which is used to automate tasks that would otherwise require intelligence, has found its way into BIM through the application of reasoners, among other techniques. A reasoner is a piece of software that makes implicit and hidden knowledge explicit by using logical inference techniques. Reasoners are applied to BIM to help take better decisions and to assess construction projects. The importance of BIM in both the construction and information technology sectors has motivated many researchers to work on surveys that attempt to capture the current state of BIM, but unfortunately, none of these surveys has focused on reasoning on BIM. In this article we survey the research proposals and toolkits that rely on reasoning systems applied to BIM, and we classify them into a two-level schema based on what they are intended for. According to our survey, reasoning is mainly used for solving design problems, and is especially applied to code consistency checking, with an emphasis on semantic web technologies. Furthermore, user-friendliness is still a gap in this field, and case-based reasoning, which was often applied in past efforts, is still hardly applied for reasoning on BIM. The survey shows that this research area is active and that research results are progressively being integrated into commercial toolkits.
[277] vixra:1801.0102 [pdf]
Bayesian Transfer Learning for Deep Networks
We propose a method for transfer learning in deep networks through Bayesian inference, where an approximate posterior distribution q(w|θ) over model parameters w is learned through variational approximation. Utilizing Bayes by Backprop, we optimize the parameters θ associated with the approximate distribution. When performing transfer learning we consider two tasks: A and B. Firstly, an approximate posterior q_A(w|θ) is learned from task A, which is afterwards transferred as a prior p(w) → q_A(w|θ) when learning the approximate posterior distribution q_B(w|θ) for task B. Initially, we consider a multivariate normal distribution q(w|θ) = N(µ, Σ) with diagonal covariance matrix Σ. Secondly, we consider the prospects of introducing more expressive approximate distributions, specifically those known as normalizing flows. By investigating these concepts on the MNIST data set, we conclude that utilizing normalizing flows does not improve Bayesian inference in the context presented here. Further, we show that transfer learning is not feasible using our proposed architecture and our definitions of task A and task B, but no general conclusion rejecting a Bayesian approach to transfer learning can be made.
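The variational scheme described can be sketched compactly: a diagonal-Gaussian posterior sampled via the reparameterization trick, plus the KL term to the prior, which is exactly what transfer learning swaps from N(0, I) to q_A (a minimal PyTorch sketch; sizes are illustrative):

```python
import torch

mu = torch.zeros(10, requires_grad=True)           # posterior means
rho = torch.full((10,), -3.0, requires_grad=True)  # sigma = softplus(rho)

def sample_weights():
    # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I).
    sigma = torch.nn.functional.softplus(rho)
    return mu + sigma * torch.randn_like(mu)

def kl_to_prior(prior_mu, prior_sigma):
    # KL( N(mu, sigma^2) || N(prior_mu, prior_sigma^2) ), summed over weights.
    sigma = torch.nn.functional.softplus(rho)
    return (torch.log(prior_sigma / sigma)
            + (sigma ** 2 + (mu - prior_mu) ** 2) / (2 * prior_sigma ** 2)
            - 0.5).sum()

# Transfer: after learning q_A, reuse its (mu, sigma) as the prior for
# task B by passing them to kl_to_prior while training on task B's data.
loss = kl_to_prior(torch.zeros(10), torch.ones(10))
loss.backward()
```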
[278] vixra:1801.0050 [pdf]
Fruit Recognition from Images Using Deep Learning
In this paper we introduce a new, high-quality dataset of images containing fruits. We also present the results of numerical experiments in training a neural network to detect fruits. We discuss why we chose fruits for this project by proposing a few applications that could use this kind of neural network.
[279] vixra:1801.0041 [pdf]
Taking Advantage of BiLSTM Encoding to Handle Punctuation in Dependency Parsing: A Brief Idea
In the context of the bidirectional-LSTMs neural parser (Kiperwasser and Goldberg, 2016), an idea is proposed to initialize the parsing state without punctuation-tokens but using them for the BiLSTM sentence encoding. The relevant information brought by the punctuation-tokens should be implicitly learned using the errors of the recurrent contributions only.
[280] vixra:1712.0659 [pdf]
TDBF: Two Dimensional Belief Function
How to efficiently handle uncertain information is still an open issue. In this paper, a new method to deal with uncertain information, named the two dimensional belief function (TDBF), is presented. A TDBF has two components, T=(mA,mB). The first component, mA, is a classical belief function. The second component, mB, is also a classical belief function, but it is a measure of the reliability of the first component. The definition of the TDBF and its discounting algorithm are proposed. Compared with the classical discounting model, the proposed TDBF is more flexible and reasonable. Numerical examples are used to show the efficiency of the proposed method.
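For orientation, a minimal sketch of the classical discounting operation that TDBF generalizes: evidence is scaled by a reliability factor alpha and the discounted mass is shifted to the whole frame. The BPA values below are made up for illustration.

    def discount(mass, alpha, frame):
        # Classical Shafer discounting of a mass function.
        # mass:  dict mapping frozenset focal elements to masses (sums to 1)
        # alpha: reliability in [0, 1]; alpha = 1 keeps the evidence unchanged
        # frame: the frame of discernment Theta, as a frozenset
        out = {A: alpha * m for A, m in mass.items() if A != frame}
        out[frame] = 1.0 - alpha + alpha * mass.get(frame, 0.0)
        return out

    theta = frozenset({"a", "b"})
    m = {frozenset({"a"}): 0.6, frozenset({"b"}): 0.3, theta: 0.1}
    print(discount(m, alpha=0.8, frame=theta))   # masses still sum to 1.0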
[281] vixra:1712.0647 [pdf]
A Total Uncertainty Measure for D Numbers Based on Belief Intervals
As a generalization of Dempster-Shafer theory, the theory of D numbers is a new theoretical framework for uncertainty reasoning. Measuring the uncertainty of knowledge or information represented by D numbers is an unsolved issue in that theory. In this paper, inspired by distance-based uncertainty measures for Dempster-Shafer theory, a total uncertainty measure for a D number is proposed based on its belief intervals. The proposed total uncertainty measure can simultaneously capture the discord, non-specificity, and non-exclusiveness involved in D numbers. Some basic properties of this total uncertainty measure, including its range, monotonicity, and generalized set consistency, are also presented.
[282] vixra:1712.0469 [pdf]
Predicting Yelp Star Reviews Based on Network Structure with Deep Learning
In this paper, we tackle the real-world problem of predicting Yelp star-review ratings based on business features (such as images and descriptions), user features (average previous ratings), and, of particular interest, network properties (which businesses a user has rated before). We compare multiple models on different sets of features -- from simple linear regression on network features only to deep learning models on network and item features. In recent years, breakthroughs in deep learning have led to increased accuracy in common supervised learning tasks, such as image classification, captioning, and language understanding. However, the idea of combining deep learning with network features and structure appears to be novel. While the problem of predicting future interactions in a network has been studied at length, these approaches have often ignored either node-specific data or global structure. We demonstrate that a mixed approach combining both node-level features and network information can effectively be used to predict Yelp-review star ratings. We evaluate on the Yelp dataset by splitting our data along the time dimension (as would naturally occur in the real world) and comparing our model against others which do not take advantage of the network structure and/or deep learning.
[283] vixra:1712.0468 [pdf]
The Effectiveness of Data Augmentation in Image Classification using Deep Learning
In this paper, we explore and compare multiple solutions to the problem of data augmentation in image classification. Previous work has demonstrated the effectiveness of data augmentation through simple techniques, such as cropping, rotating, and flipping input images. We artificially constrain our access to data to a small subset of the ImageNet dataset, and compare each data augmentation technique in turn. One of the more successful data augmentation strategies is the set of traditional transformations mentioned above. We also experiment with GANs to generate images of different styles. Finally, we propose a method that allows a neural net to learn augmentations that best improve the classifier, which we call neural augmentation. We discuss the successes and shortcomings of this method on various datasets.
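A minimal sketch of the "traditional transformations" baseline, assuming H x W x C images stored as NumPy arrays (random horizontal flip plus random padded crop); real pipelines would add rotations and run inside a data loader.

    import numpy as np

    def augment(img, rng, pad=4):
        # Random horizontal flip.
        if rng.random() < 0.5:
            img = img[:, ::-1, :]
        # Pad with reflected borders, then take a random crop back to H x W.
        padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
        h, w = img.shape[:2]
        y = rng.integers(0, 2 * pad + 1)
        x = rng.integers(0, 2 * pad + 1)
        return padded[y:y + h, x:x + w, :]

    rng = np.random.default_rng(0)
    img = rng.random((32, 32, 3))
    print(augment(img, rng).shape)   # (32, 32, 3)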
[284] vixra:1712.0467 [pdf]
Gaussian Processes for Crime Prediction
The ability to predict crime is incredibly useful for police departments, city planners, and many other parties, but thus far current approaches have not made use of recent developments in machine learning techniques. In this paper, we present a novel approach to this task: Gaussian process regression. Gaussian processes (GPs) are a rich family of distributions over functions. We train GPs on historic crime data to learn the underlying probability distribution of crime incidence and to make predictions of future crime distributions.
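A minimal sketch of GP regression in the spirit described, using scikit-learn; the one-dimensional inputs and synthetic targets are placeholders for real (location, time) crime features.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    # Toy stand-in for historic crime counts over a 1-D coordinate.
    X = np.linspace(0, 10, 40).reshape(-1, 1)
    y = np.sin(X).ravel() + 0.1 * np.random.default_rng(0).normal(size=40)

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(),
                                  normalize_y=True)
    gp.fit(X, y)

    # Predictive mean and uncertainty at unseen locations.
    mean, std = gp.predict(np.array([[2.5], [7.5]]), return_std=True)
    print(mean, std)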
[285] vixra:1712.0465 [pdf]
Reinforcement Learning with Swingy Monkey
This paper explores model-free, model-based, and mixture models for reinforcement learning in the setting of a SwingyMonkey game \footnote{The code is hosted on a public repository \href{https://github.com/kandluis/machine-learning}{here} under the prac4 directory.}. SwingyMonkey is a simple game with well-defined goals and mechanisms and a relatively small state space. Using Bayesian Optimization \footnote{The optimization took place using the open-source software made available by HIPS \href{https://github.com/HIPS/Spearmint}{here}.} on a simple Q-Learning algorithm, we were able to obtain high scores within just a few training epochs. However, the system failed to scale well with continued training, and optimization over hundreds of iterations proved too time-consuming to be effective. After manually exploring multiple approaches, the best results were achieved using a mixture of $\epsilon$-greedy Q-Learning with a stable learning rate $\alpha$ and a discount factor $\delta \approx 1$. Despite the theoretical limitations of this approach, these settings resulted in maximum scores of over 5000 points with an average score of $\bar{x} \approx 684$ (averaged over the final 100 testing epochs, median $\bar{m} = 357.5$). The results show a continuing log-linear relation that caps only after 20,000 training epochs.
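A minimal sketch of the tabular $\epsilon$-greedy Q-Learning core described above; the two actions and the state encoding are placeholders, since the paper's own state features are not reproduced here.

    import random
    from collections import defaultdict

    Q = defaultdict(float)                     # Q[(state, action)] -> value
    ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.05    # stable rate, discount near 1
    ACTIONS = (0, 1)                           # e.g. 0 = glide, 1 = jump

    def choose_action(state):
        # Explore with probability EPSILON, otherwise act greedily.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def update(state, action, reward, next_state):
        # Standard one-step Q-Learning backup.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                       - Q[(state, action)])

    update("s0", choose_action("s0"), 1.0, "s1")   # one illustrative step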
[286] vixra:1712.0464 [pdf]
Multi-Document Text Summarization
We tackle the problem of multi-document extractive summarization by implementing two well-known algorithms for single-text summarization -- {\sc TextRank} and {\sc Grasshopper}. We use ROUGE-1 and ROUGE-2 precision scores with the DUC 2004 Task 2 data set to measure the performance of these two algorithms, with parameters optimized as described in their respective papers ($\alpha =0.25$ and $\lambda=0.5$ for Grasshopper and $d=0.85$ for TextRank). We compare these algorithms to common baselines as well as non-naive, novel baselines, and we present the resulting ROUGE-1 and ROUGE-2 recall scores. Subsequently, we implement two novel algorithms as extensions of {\sc GrassHopper} and {\sc TextRank}, termed {\sc ModifiedGrassHopper} and {\sc ModifiedTextRank} respectively. The modified algorithms intuitively attempt to ``maximize'' diversity across the summary. We present the resulting ROUGE scores. We expect that with further optimizations, this unsupervised approach to extractive text summarization will prove useful in practice.
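A minimal sketch of the TextRank scheme referenced above, assuming token-overlap similarity and networkx's PageRank; production systems use better similarity measures and proper tokenization.

    import networkx as nx

    def textrank(sentences, d=0.85):
        # Build a sentence graph weighted by shared-token counts,
        # then rank sentences with PageRank (damping factor d).
        tokens = [set(s.lower().split()) for s in sentences]
        g = nx.Graph()
        g.add_nodes_from(range(len(sentences)))
        for i in range(len(sentences)):
            for j in range(i + 1, len(sentences)):
                overlap = len(tokens[i] & tokens[j])
                if overlap:
                    g.add_edge(i, j, weight=overlap)
        scores = nx.pagerank(g, alpha=d)
        return sorted(range(len(sentences)), key=scores.get, reverse=True)

    docs = ["the cat sat on the mat",
            "a dog sat on the mat",
            "stocks fell sharply today"]
    print(textrank(docs))   # sentence indices, most central first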
[287] vixra:1712.0446 [pdf]
A New Divergence Measure for Basic Probability Assignment and Its Applications in Extremely Uncertain Environments
Information fusion in extremely uncertain environments is an important issue in pattern classification and decision-making problems. Dempster-Shafer evidence theory (D-S theory) is increasingly applied to information fusion for its advantage in dealing with uncertain information. However, results opposite to common sense are often obtained when combining different evidences using Dempster's combination rule. How to measure the difference between different evidences is still an open issue. In this paper, a new divergence is proposed based on the Kullback-Leibler divergence in order to measure the difference between different basic probability assignments (BPAs). Numerical examples are used to illustrate the computational process of the proposed divergence. The similarity between different BPAs is then also defined based on the proposed divergence. The basic knowledge about pattern recognition is introduced, and a new classification algorithm is presented using the proposed divergence and similarity under extremely uncertain environments, illustrated by a small example on robot sensing. The method put forward is motivated by the pressing need to develop intelligent systems, such as sensor-based data fusion manipulators, which must work in complicated, extremely uncertain environments where sensory data are 1) fragmentary and 2) collected from multiple levels of resolution.
[288] vixra:1712.0444 [pdf]
Environmental Impact Assessment Using D-Vikor Approach
Environmental impact assessment (EIA) is an open and important issue that depends on social, ecological, economic, and other factors. Due to human judgment, a variety of uncertainties are brought into the EIA process. With regard to this uncertainty, many existing methods seem powerless to represent and deal with it effectively. A new theory called D numbers, because of its advantage in handling uncertain information, is widely used in uncertainty modeling and decision making. The VIKOR method has unique advantages in dealing with multiple criteria decision making (MCDM) problems, especially when the criteria are non-commensurable and even conflicting; it can also obtain a compromise optimal solution. In order to solve EIA problems more effectively, in this paper a D-VIKOR approach is proposed, which extends the VIKOR method with D numbers theory. In the proposed approach, assessment information on environmental factors is expressed and modeled by D numbers, and a new combination rule for multiple D numbers is defined. Subjective weights and objective weights are considered in the VIKOR process for more reasonable ranking results. A numerical example is conducted to analyze and demonstrate the practicality and effectiveness of the proposed D-VIKOR approach.
[289] vixra:1712.0432 [pdf]
DS-Vikor: a New Methodology for Supplier Selection
How to select the optimal supplier is an open and important issue in supply chain management (SCM), which requires assessing and sorting potential suppliers and can be considered a multi-criteria decision-making (MCDM) problem. Experts' assessments play a very important role in the process of supplier selection, while the subjective judgment of human beings can introduce unpredictable uncertainty. However, existing methods seem powerless to represent and deal with this uncertainty effectively. Dempster-Shafer evidence theory (D-S theory) is widely used in uncertainty modeling, decision making, and conflict management due to its advantage in handling uncertain information. The VIKOR method has a great advantage in handling MCDM problems with non-commensurable and even conflicting criteria, and in obtaining a compromise optimal solution. In this paper, a DS-VIKOR method is proposed for the supplier selection problem, which extends the VIKOR method with D-S theory. In this method, the basic probability assignment (BPA) is used to denote the decision makers' assessments of suppliers, a Deng-entropy-based weighting method is defined and applied to determine the weights of the multiple criteria, and the VIKOR method is used to obtain the final ranking results. A real-life illustrative example is conducted to analyze and demonstrate the practicality and effectiveness of the proposed DS-VIKOR method.
[290] vixra:1712.0400 [pdf]
Adaptively Evidential Weighted Classifier Combination
Classifier combination plays an important role in classification. Due to its efficiency in handling and fusing uncertain information, Dempster-Shafer evidence theory is widely used in multi-classifier fusion. In this paper, a method of adaptively evidential weighted classifier combination is presented. In our proposed method, the output of each classifier is modelled by a basic probability assignment (BPA). Then, the weights are determined adaptively for each individual classifier according to the uncertainty degree of the corresponding BPA. The uncertainty degree is measured by a belief entropy, named Deng entropy. The discounting-and-combination scheme in D-S theory is used to calculate the weighted BPAs and combine them into the final BPA for classification. The effectiveness of the proposed weighted combination method is illustrated by numerical experimental results.
[291] vixra:1711.0420 [pdf]
Move the Tip to the Right: A Language-Based Computer Animation System in Box2D
Not only do “robots need language”; sometimes a human operator does too. To interact with complex domains, the operator needs a vocabulary to initialize the robot, make it walk, and grasp objects. Natural language interfaces can support semi-autonomous and fully autonomous systems on both sides. Instead of using neural networks, the language grounding problem can be solved with object-oriented programming. In the following paper, a simulation of micro-manipulation under a microscope is given, controlled with a C++ script. The small vocabulary consists of init, pregrasp, grasp and place.
[292] vixra:1711.0360 [pdf]
Ontology Engineering for Robotics
Ontologies are a powerful alternative to reinforcement learning. They store knowledge in a domain-specific language. The best practice for implementing ontologies is a distributed version control system which is filled manually by programmers.
[293] vixra:1711.0292 [pdf]
Strengths and Potential of the SP Theory of Intelligence in General, Human-Like Artificial Intelligence
This paper first defines "general, human-like artificial intelligence" (GHLAI) in terms of five principles. In the light of the definition, the paper summarises the strengths and potential of the "SP theory of intelligence" and its realisation in the "computer model", outlined in an appendix, in three main areas: the versatility of the SP system in aspects of intelligence; its versatility in the representation of diverse kinds of knowledge; and its potential for the seamless integration of diverse aspects of intelligence and diverse kinds of knowledge, in any combination. There are reasons to believe that a mature version of the SP system may attain full GHLAI in diverse aspects of intelligence and in the representation of diverse kinds of knowledge.
[294] vixra:1711.0266 [pdf]
Revisit Fuzzy Neural Network: Demystifying Batch Normalization and ReLU with Generalized Hamming Network
We revisit fuzzy neural network with a cornerstone notion of generalized hamming distance, which provides a novel and theoretically justified framework to re-interpret many useful neural network techniques in terms of fuzzy logic. In particular, we conjecture and empirically illustrate that, the celebrated batch normalization (BN) technique actually adapts the “normalized” bias such that it approximates the rightful bias induced by the generalized hamming distance. Once the due bias is enforced analytically, neither the optimization of bias terms nor the sophisticated batch normalization is needed. Also in the light of generalized hamming distance, the popular rectified linear units (ReLU) can be treated as setting a minimal hamming distance threshold between network inputs and weights. This thresholding scheme, on the one hand, can be improved by introducing double-thresholding on both positive and negative extremes of neuron outputs. On the other hand, ReLUs turn out to be non-essential and can be removed from networks trained for simple tasks like MNIST classification. The proposed generalized hamming network (GHN) as such not only lends itself to rigorous analysis and interpretation within the fuzzy logic theory but also demonstrates fast learning speed, well-controlled behaviour and state-of-the-art performances on a variety of learning tasks.
[295] vixra:1711.0265 [pdf]
Revisit Fuzzy Neural Network: Bridging the Gap Between Fuzzy Logic and Deep Learning
This article aims to establish a concrete and fundamental connection between two important fields in artificial intelligence, i.e. deep learning and fuzzy logic. On the one hand, we hope this article will pave the way for fuzzy logic researchers to develop convincing applications and tackle challenging problems which are of interest to the machine learning community too. On the other hand, deep learning could benefit from this comparative research by re-examining many trial-and-error heuristics in the lens of fuzzy logic, and consequently, distilling the essential ingredients with rigorous foundations. Based on the new findings reported in [41] and this article, we believe the time is ripe to revisit the fuzzy neural network as a crucial bridge between two schools of AI research, i.e. symbolic versus connectionist [101], and eventually open the black box of artificial neural networks.
[296] vixra:1711.0241 [pdf]
Dysfunctional Methods in Robotics
When carrying out robotics projects, a great deal can go wrong. This means not only cold solder joints or crashing software; far more fundamental issues play a role. To avoid mistakes, one must first take a closer look at the failure patterns, i.e. those development methods by which one should on no account build a robot, and the ways in which the software should preferably not work.
[297] vixra:1711.0235 [pdf]
Not Merely Memorization in Deep Networks: Universal Fitting and Specific Generalization
We reinterpret the training of convolutional neural nets (CNNs) with the universal classification theorem (UCT). This theory implies that any disjoint datasets can be classified by two or more layers of CNNs based on ReLUs and the rigid transformation switch units (RTSUs) we propose here, which explains why CNNs can memorize both noise and real data. Subsequently, we present a further hypothesis: a CNN is insensitive to certain variants of its training examples, where each variant relates to the original training input through a generating function. This hypothesis means CNNs can generalize well even for randomly generated training data, and it illuminates the paradox of why CNNs fit both real and noise data yet fail drastically when making predictions on noise data. Our findings suggest that the study of the generalization theory of CNNs should turn to generating functions instead of traditional statistical machine learning theory, which is based on the assumption that training and testing data are independent and identically distributed (IID); this IID assumption apparently contradicts the experiments in this paper. We experimentally verify these ideas.
[298] vixra:1710.0324 [pdf]
New Sufficient Conditions of Signal Recovery with Tight Frames Via $l_1$-Analysis
The paper discusses the recovery of signals that are nearly sparse with respect to a tight frame $D$ by means of the $l_1$-analysis approach. We establish several new sufficient conditions regarding the $D$-restricted isometry property to ensure stable reconstruction of signals that are approximately sparse with respect to $D$. It is shown that if the measurement matrix $\Phi$ fulfils the condition $\delta_{ts}<t/(4-t)$ for $0<t<4/3$, then signals which are approximately sparse with respect to $D$ can be stably recovered by the $l_1$-analysis method. In the case of $D=I$, the bound is sharp; see Cai and Zhang's work \cite{Cai and Zhang 2014}. When $t=1$, the present bound improves the condition $\delta_s<0.307$ from Lin et al.'s result to $\delta_s<1/3$. In addition, numerical simulations are conducted to show that the $l_1$-analysis method can stably reconstruct sparse signals in terms of tight frames.
[299] vixra:1709.0108 [pdf]
A New Semantic Theory of Natural Language
Formal Semantics and Distributional Semantics are two important semantic frameworks in Natural Language Processing (NLP). Cognitive Semantics belongs to the movement of Cognitive Linguistics, which is based on contemporary cognitive science. Each framework can deal with some meaning phenomena, but none of them fulfills all requirements proposed by applications. A unified semantic theory characterizing all important language phenomena has both theoretical and practical significance; however, although many attempts have been made in recent years, no existing theory has achieved this goal yet. This article introduces a new semantic theory that has the potential to characterize most of the important meaning phenomena of natural language and to fulfill most of the necessary requirements for philosophical analysis and for NLP applications. The theory is based on a unified representation of information, and constructs a kind of mathematical model called a cognitive model to interpret natural language expressions in a compositional manner. It accepts the empirical assumption of Cognitive Semantics and overcomes most shortcomings of Formal Semantics and of Distributional Semantics. The theory, however, is not a simple combination of existing theories, but an extensive generalization of classic logic and Formal Semantics. It inherits nearly all advantages of Formal Semantics, and also provides descriptive contents for objects and events that are as fine-grained as possible, contents which represent the results of human cognition.
[300] vixra:1709.0007 [pdf]
Computing, Cognition and Information Compression
This article develops the idea that the storage and processing of information in computers and in brains may often be understood as information compression. The article first reviews what is meant by information and, in particular, what is meant by redundancy, a concept which is fundamental in all methods for information compression. Principles of information compression are described. The major part of the article describes how these principles may be seen in a range of observations and ideas in computing and cognition: the phenomena of adaptation and inhibition in nervous systems; 'neural' computing; the creation and recognition of 'objects' and 'classes' in perception and cognition; stereoscopic vision and random-dot stereograms; the organisation of natural languages; the organisation of grammars; the organisation of functional, structured, logic and object-oriented computer programs; the application and de-referencing of identifiers in computing; retrieval of information from databases; access and retrieval of information from computer memory; logical deduction and resolution theorem proving; inductive reasoning and probabilistic inference; parsing; normalisation of databases.
[301] vixra:1708.0341 [pdf]
Routing Games Over Time with Fifo Policy
We study atomic routing games where every agent travels both along its decided edges and through time. The agents arriving on an edge are first lined up in a \emph{first-in-first-out} queue and may wait: an edge is associated with a capacity, which defines how many agents-per-time-step can pop from the queue's head and enter the edge, to transit for a fixed delay. We show that the best-response optimization problem is not approximable, and that deciding the existence of a Nash equilibrium is complete for the second level of the polynomial hierarchy. Then, we drop the rationality assumption, introduce a behavioral concept based on GPS navigation, and study its worst-case efficiency ratio to coordination.
[302] vixra:1708.0065 [pdf]
Meta Mass Function
In this paper, a meta mass function (MMF) is presented. A new evidence theory with complex numbers is developed. Different from existing evidence theory, the mass function in the new complex evidence theory is modelled with complex numbers and named the meta mass function. Classical evidence theory is the special case in which the mass function degenerates from complex numbers to real numbers.
[303] vixra:1704.0205 [pdf]
Formula Analyzer: Find the Formula by Parameters
Consider a formula, e.g. x + y^2 - z = r. It is usually necessary to find one parameter's value knowing the others. Here, however, we pose the inverse problem: to find the formula itself, knowing only its parameters. We call the solution of such a problem reverse computing. For that purpose we create an algorithm and implement it as program code.
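A toy sketch of such a reverse-computing search, under the assumption (mine, not the paper's) that candidate formulas are enumerated and tested against observed parameter/result tuples; only two-operator shapes over +, -, * are tried here.

    import itertools
    import operator

    OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

    def find_formula(samples):
        # samples: list of ((x, y, z), r) tuples; tries shapes (x op1 y) op2 z.
        for n1, n2 in itertools.product(OPS, repeat=2):
            if all(OPS[n2](OPS[n1](x, y), z) == r
                   for (x, y, z), r in samples):
                return f"(x {n1} y) {n2} z"
        return None   # no formula of this shape fits the data

    # Three observations consistent with r = (x + y) - z.
    samples = [((1, 2, 3), 0), ((2, 3, 1), 4), ((5, 1, 2), 4)]
    print(find_formula(samples))   # (x + y) - z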
[304] vixra:1702.0297 [pdf]
Some General Results On Overfitting In Machine Learning
Overfitting has always been a problem in machine learning. Recently a related phenomenon called "oversearching" has been analyzed. This paper takes a theoretical approach using a very general methodology covering most learning paradigms in current use. Overfitting is defined in terms of the "expressive accuracy" of a model for the data, rather than "predictive accuracy". The results show that even if the learner can identify a set of best models, overfitting will cause it to bounce from one model to another. Overfitting is ameliorated by having the learner bound the search space, and bounding is equivalent to using an accuracy (or bias) more restrictive than the problem accuracy. Also, Ramsey's Theorem shows that every data sequence has a situation where either consistent overfitting or underfitting is unavoidable. We show that oversearching is simply overfitting where the resource used to express a model is the search space itself rather than a more common resource such as a program that executes the model. We show that the smallest data sequence guessing a model defines a canonical resource. There is an equivalence in the limit between any two resources expressing the same model space, but it may not be effectively computable.
[305] vixra:1611.0260 [pdf]
Deng Entropy in Hyper Power Set and Super Power Set
Deng entropy was proposed very recently to handle the uncertainty degree of belief functions in the Dempster-Shafer framework. In this paper, two new belief entropies based on the frame of Deng entropy are proposed for hyper-power sets and super-power sets, respectively, to measure the uncertainty degree of more uncertain and more flexible information. The new entropies can be used directly in applications of DSmT.
[306] vixra:1611.0211 [pdf]
A Variable Order Hidden Markov Model with Dependence Jumps
Hidden Markov models (HMMs) are a popular approach for modeling sequential data, typically based on the assumption of a first- or moderate-order Markov chain. However, in many real-world scenarios the modeled data entail temporal dynamics the patterns of which change over time. In this paper, we address this problem by proposing a novel HMM formulation, treating temporal dependencies as latent variables over which inference is performed. Specifically, we introduce a hierarchical graphical model comprising two hidden layers: on the first layer, we postulate a chain of latent observation-emitting states, the temporal dependencies between which may change over time; on the second layer, we postulate a latent first-order Markov chain modeling the evolution of temporal dynamics (dependence jumps) pertaining to the first-layer latent process. As a result of this construction, our method allows for effectively modeling non-homogeneous observed data, where the patterns of the entailed temporal dynamics may change over time. We devise efficient training and inference algorithms for our model, following the expectation-maximization paradigm. We demonstrate the efficacy and usefulness of our approach considering several real-world datasets. As we show, our model allows for increased modeling and predictive performance compared to the alternative methods, while offering a good trade-off between the resulting increases in predictive performance and computational complexity.
[307] vixra:1610.0336 [pdf]
Fuzzy Evidential Influence Diagram Evaluation Algorithm
Fuzzy influence diagrams (FIDs) are graphical models that combine qualitative and quantitative analysis to solve decision-making problems. However, FIDs use an incomplete evaluation criterion to score nodes in complex systems, so many different nodes receive the same score, which cannot reflect their differences. Based on fuzzy set theory and Dempster-Shafer (D-S) evidence theory, this paper changes the traditional evaluation system and modifies the corresponding algorithm, so that the influence diagram can more effectively reflect the true situation of the system and produce more practical results. Numerical examples and a real application in a supply chain financial system are used to show the efficiency of the proposed influence diagram model.
[308] vixra:1610.0281 [pdf]
An Information Volume Measure
How to measure the volume of uncertain information is an open issue. Shannon entropy is used to represent the uncertainty degree of a probability distribution. Given a generalized probability distribution, in which probability is assigned not only to the basic event space but also to the power set of the event space, a so-called meta probability space is constructed. A new measure, named Deng entropy, is presented. The results show that, compared with existing methods, Deng entropy is not only better in its mathematical form but also carries significant physical meaning.
[309] vixra:1610.0074 [pdf]
Belief Reliability Analysis and Its Application
In reliability analysis, Fault Tree Analysis based on evidential networks (EN) is an important research topic. However, the existing EN approaches still have two issues: first, the final results are expressed as interval numbers, which carry relatively high uncertainty for making a final decision; second, the combination rule is not used to fuse uncertain information. These issues greatly decrease the efficiency of EN in handling uncertain information. To address these open issues, a new methodology, called Belief Reliability Analysis, is presented in this paper. Combination methods to deal with series systems, parallel systems, series-parallel systems, and parallel-series systems are proposed for reliability evaluation. Numerical examples and a real application in a servo-actuation system are used to show the efficiency of the proposed Belief Reliability Analysis methodology.
[310] vixra:1610.0028 [pdf]
A New Belief Entropy: Possible Generalization of Deng Entropy, Tsallis Entropy and Shannon Entropy
Shannon entropy is the mathematical foundation of information theory, Tsallis entropy is the root of nonextensive statistical mechanics, and Deng entropy was proposed very recently to measure the uncertainty degree of belief functions. In this paper, a new entropy H is proposed to generalize Deng entropy, Tsallis entropy, and Shannon entropy. The new entropy H degenerates to Deng entropy, Tsallis entropy, and Shannon entropy under different conditions, and it also maintains the mathematical properties of Deng entropy, Tsallis entropy, and Shannon entropy.
[311] vixra:1609.0133 [pdf]
Five Hundred Deep Learning Papers, Graphviz and Python
I invested days creating a graph with PyGraphviz to represent the evolutionary process of deep learning's state of the art over the last twenty-five years. In this paper I show how I built it and what I obtained.
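A minimal sketch of the kind of PyGraphviz usage involved; the papers and edges below are illustrative placeholders, not the graph from the paper.

    import pygraphviz as pgv

    # Nodes are works; an edge points from a work to one that builds on it.
    G = pgv.AGraph(directed=True, rankdir="LR")
    G.add_edge("LeNet (1998)", "AlexNet (2012)")
    G.add_edge("AlexNet (2012)", "VGG (2014)")
    G.add_edge("AlexNet (2012)", "GoogLeNet (2014)")
    G.add_edge("VGG (2014)", "ResNet (2015)")

    G.layout(prog="dot")                 # hierarchical layout suits lineages
    G.draw("deep_learning_lineage.png")  # requires Graphviz to be installed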
[312] vixra:1608.0041 [pdf]
Combining an Infinite Number of Neural Networks into One
One of the important aspects of a neural network is its generalization property, which is measured by its ability to make correct predictions on unseen samples. One option for improving generalization is to combine results from multiple networks, which is unfortunately a time-consuming process. In this paper, a new approach is presented to combine an infinite number of neural networks analytically to produce a small, fast and reliable neural network.
[313] vixra:1607.0484 [pdf]
Active Appearance Model Construction: Implementation notes
Active Appearance Model (AAM) is a powerful object modeling technique and one of the best available in computer vision and computer graphics. This approach is, however, quite complex, and various parts of its implementation have been addressed separately by different researchers in several recent works. In this paper, we systematically present a full implementation of the AAM, with pseudo code for the crucial steps in its construction.
[314] vixra:1607.0073 [pdf]
Indian Buffet Process Deep Generative Models
Deep generative models (DGMs) have brought about a major breakthrough, as well as renewed interest, in generative latent variable models. However, an issue current DGM formulations do not address is the data-driven inference of the number of latent features needed to represent the observed data. Traditional linear formulations allow this issue to be addressed by resorting to tools from the field of nonparametric statistics: indeed, nonparametric linear latent variable models, obtained by appropriate imposition of Indian Buffet Process (IBP) priors, have been extensively studied by the machine learning community; inference for such models can be performed either via exact sampling or via approximate variational techniques. Based on this inspiration, in this paper we examine whether similar ideas from the field of Bayesian nonparametrics can be utilized in the context of modern DGMs in order to address the latent variable dimensionality inference problem. To this end, we propose a novel DGM formulation, based on the imposition of an IBP prior. We devise an efficient Black-Box Variational inference algorithm for our model, and exhibit its efficacy in a number of semi-supervised classification experiments. In all cases, we use popular benchmark datasets, and compare to state-of-the-art DGMs.
[315] vixra:1605.0190 [pdf]
The Algorithm of the Thinking Machine
In this article we consider the questions 'What is AI?' and 'How do we write a program that satisfies the definition of AI?' It deals with the basic concepts and modules that must be at the heart of such a program. The most interesting concept discussed here is that of abstract signals. Each of these signals is related to the result of a particular experiment. An abstract signal is a function that at any time point returns the probability that the corresponding experiment returns true.
[316] vixra:1605.0125 [pdf]
Failure Mode and Effects Analysis Based on D Numbers and Topsis
Failure mode and effects analysis (FMEA) is a widely used technique for assessing the risk of potential failure modes in designs, products, processes, systems, or services. One of the main problems of FMEA is dealing with the variety of assessments given by FMEA team members and ranking the failure modes according to the degree of the risk factors. Traditional FMEA uses the risk priority number (RPN), the product of the occurrence (O), severity (S), and detection (D) of a failure, to determine the risk priority ranking of failure modes. However, this becomes impractical when multiple experts give different risk assessments for one failure mode, which may be imprecise or incomplete, or when the weights of the risk factors are inconsistent. In this paper, a new risk priority model based on D numbers and the technique for order of preference by similarity to ideal solution (TOPSIS) is proposed to evaluate risk in FMEA. In the proposed model, the assessments given by FMEA team members are represented by D numbers, a method that can effectively handle uncertain information. The TOPSIS method, a multi-criteria decision making (MCDM) method, is used to rank the failure modes with respect to the risk factors. Finally, an application to the failure modes of the rotor blades of an aircraft turbine is provided to illustrate the efficiency of the proposed method.
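A minimal sketch of the plain (crisp-valued) TOPSIS ranking step, without the D numbers layer; the failure-mode scores and weights below are made up for illustration.

    import numpy as np

    def topsis(matrix, weights, benefit):
        # matrix:  alternatives x criteria scores
        # weights: criteria weights summing to 1
        # benefit: True where larger is better, False for cost criteria
        m = matrix / np.linalg.norm(matrix, axis=0)   # vector normalization
        v = m * weights                               # weighted matrix
        ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
        worst = np.where(benefit, v.min(axis=0), v.max(axis=0))
        d_pos = np.linalg.norm(v - ideal, axis=1)
        d_neg = np.linalg.norm(v - worst, axis=1)
        closeness = d_neg / (d_pos + d_neg)
        return np.argsort(-closeness)                 # closest to ideal first

    # Three hypothetical failure modes scored on O, S, D (higher = riskier);
    # treating all criteria as benefits ranks the riskiest mode first.
    scores = np.array([[7.0, 8.0, 3.0], [4.0, 6.0, 5.0], [9.0, 4.0, 6.0]])
    print(topsis(scores, np.array([0.4, 0.4, 0.2]),
                 np.array([True, True, True])))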
[317] vixra:1603.0378 [pdf]
A Review of Theoretical and Practical Challenges of Trusted Autonomy in Big Data
Despite the advances made in artificial intelligence, software agents, and robotics, there is little we see today that we can truly call a fully autonomous system. We conjecture that the main inhibitor for advancing autonomy is lack of trust. Trusted autonomy is the scientific and engineering field to establish the foundations and ground work for developing trusted autonomous systems (robotics and software agents) that can be used in our daily life, and can be integrated with humans seamlessly, naturally and efficiently. In this paper, we review this literature to reveal opportunities for researchers and practitioners to work on topics that can create a leap forward in advancing the field of trusted autonomy. We focus the paper on the `trust' component as the uniting technology between humans and machines. Our inquiry into this topic revolves around three sub-topics: (1) reviewing and positioning the trust modelling literature for the purpose of trusted autonomy; (2) reviewing a critical subset of sensor technologies that allow a machine to sense human states; and (3) distilling some critical questions for advancing the field of trusted autonomy. The inquiry is augmented with conceptual models that we propose along the way by recompiling and reshaping the literature into forms that enables trusted autonomous systems to become a reality. The paper offers a vision for a Trusted Cyborg Swarm, an extension of our previous Cognitive Cyber Symbiosis concept, whereby humans and machines meld together in a harmonious, seamless, and coordinated manner.
[318] vixra:1603.0335 [pdf]
Conditional Deng Entropy, Joint Deng Entropy and Generalized Mutual Information
Shannon entropy, conditional entropy, joint entropy, and mutual information can estimate the chaotic level of information. However, these measures only handle certain situations. Based on Deng entropy, this paper introduces several new entropies for estimating entropy under multiple interacting pieces of uncertain information: conditional Deng entropy calculates entropy under a conditional basic belief assignment; joint Deng entropy calculates entropy from a joint basic belief assignment distribution; and generalized mutual information estimates the uncertainty of one piece of information given knowledge of another. Numerical examples illustrating the behavior of the new entropies are given at the end.
[319] vixra:1512.0007 [pdf]
Ontology, Evolving Under the Influence of the Facts
We propose an algebraic approach to building ontologies which are capable of evolving under the influence of new facts and which have internal mechanisms of validation. For this purpose we build a formal model of the interactions of objects based on cellular automata, and determine the limitations on transactions with objects imposed by this model. Then, in the context of the formal model, we define the basic entities of the model of knowledge representation: concepts, samples, properties, and relationships. In this way the formal limitations are induced into the model of knowledge representation in a natural manner.
[320] vixra:1511.0145 [pdf]
Which is the Best Belief Entropy?
In this paper, many numerical examples are designed to compare the existing belief entropy functions with the new entropy, named Deng entropy. The results illustrate that, among the existing belief entropy functions, Deng entropy is the best alternative due to its reasonable properties.
[321] vixra:1511.0144 [pdf]
Measure Divergence Degree of Basic Probability Assignment Based on Deng Relative Entropy
Dempster-Shafer evidence theory (D-S theory) is increasingly applied to information fusion for its advantage in dealing with uncertain information. However, results opposite to common sense are often obtained when combining different evidences using Dempster's combination rule. How to measure the divergence between different evidences is still an open issue. In this paper, a new relative entropy, named Deng relative entropy, is proposed in order to measure the divergence between different basic probability assignments (BPAs). The Deng relative entropy is a generalization of the Kullback-Leibler divergence: when the BPA degenerates to a probability distribution, Deng relative entropy equals the Kullback-Leibler divergence. Numerical examples are used to illustrate the effectiveness of the proposed Deng relative entropy.
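For orientation, a minimal sketch of the degenerate case named above, the standard Kullback-Leibler divergence between two discrete distributions; the paper's generalization to arbitrary BPAs is not reproduced here.

    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        # D(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0).
        p = np.asarray(p, dtype=float)
        q = np.asarray(q, dtype=float)
        return float(np.sum(p * np.log((p + eps) / (q + eps))))

    print(kl_divergence([0.7, 0.2, 0.1], [0.5, 0.3, 0.2]))   # >= 0, asymmetric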
[322] vixra:1510.0022 [pdf]
Nonextensive Deng Entropy
In this paper, a generalized Tsallis entropy, named Nonextensive Deng entropy, is presented. When the basic probability assignment degenerates to a probability distribution, Nonextensive Deng entropy is identical to Tsallis entropy.
[323] vixra:1509.0119 [pdf]
The Maximum Deng Entropy
Dempster-Shafer evidence theory has been widely used in many applications due to its advantages in handling uncertainty. Deng entropy has been proposed to measure the uncertainty degree of a basic probability assignment (BPA) in evidence theory. It is a generalization of Shannon entropy since, when the BPA degenerates to a probability distribution, Deng entropy is identical to Shannon entropy. However, the maximal value of Deng entropy has not been discussed until now. In this paper, the condition for the maximum of Deng entropy is discussed and proved, which is useful for the application of Deng entropy.
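For reference, a worked statement under the commonly cited form of Deng entropy (an assumption of this note, not a quotation from the paper): with the masses constrained to sum to 1, a Lagrange-multiplier argument applied to

    E_d(m) = -\sum_{A \subseteq X,\; m(A) > 0} m(A)\, \log_2 \frac{m(A)}{2^{|A|} - 1}

gives the maximizing basic probability assignment

    m(A) = \frac{2^{|A|} - 1}{\sum_{\emptyset \neq B \subseteq X} \left( 2^{|B|} - 1 \right)}, \qquad \emptyset \neq A \subseteq X,

i.e. each focal element receives mass proportional to $2^{|A|} - 1$.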
[324] vixra:1509.0088 [pdf]
Cognitive Architecture for Personable and Human-Like AI: A Perspective
In this article we introduce a cognitive architecture for creating a more human-like and personable artificial intelligence. Recent works, such as those by Marvin Minsky and Google DeepMind, and cognitive models like AMBR and DUAL, that aim to propose or discover an approach to commonsense AI have been promising, since they show that human intelligence can be emulated with a divide-and-conquer approach on a machine. These frameworks work with a universal model of the human mind and do not account for the variability between human beings. It is these differences between human beings that make communication possible and give them a sense of identity. Thus this work, despite being grounded in these methods, differs in hypothesizing machines that are diverse in their behavior compared to each other and can express a dynamic personality like a human being. To achieve such individuality in machines, we characterize the various aspects that can be dynamically programmed onto a machine by its human owners. In order to ensure this on a scale parallel to how humans develop their individuality, we first assume a child-like intelligence in a machine that is more malleable and which then develops into a more concrete, mature version. By having a set of tunable inner parameters called aspects which respond to external stimuli from their human owners, machines can achieve personability. The result of this work is that we will not only be able to bond with intelligent machines and relate to them in a friendly way, we will also be able to perceive them as having a personality and as having their limitations. Just as each human being is unique, we will have machines that are unique and individualistic. We will see how they can achieve intuition and a drive to find meaning in life, all of which are considered aspects unique to the human mind.
[325] vixra:1507.0145 [pdf]
Author Attribution in the Bitcoin Blocksize Debate on Reddit
The block size debate has been a contentious issue in the Bitcoin community on the social media platform Reddit. Many members of the community suspect there have been organized attempts to manipulate the debate, with people using multiple accounts to over-represent and misrepresent some sides of the debate. The following analysis uses techniques from authorship attribution and machine learning to determine whether comments from user accounts that are active in the debate are from the same author. The techniques used are able to recall over 90% of all instances of multiple account use and achieve up to 72% for the true positive rate.
[326] vixra:1507.0038 [pdf]
Unreduced Complex Dynamics of Real Computer and Control Systems
The unreduced dynamic complexity of modern computer, production, communication and control systems has become essential and cannot be efficiently simulated any more by traditional, basically regular models. We propose the universal concept of dynamic complexity and chaoticity of any real interaction process based on the unreduced solution of the many-body problem by the generalised effective potential method. We show then how the obtained mathematically exact novelties of system behaviour can be applied to the development of qualitatively new, complex-dynamical kind of computer and control systems.
[327] vixra:1504.0089 [pdf]
MEMS Microcantilevers Sensor Modes of Operation and Transduction Principles
A MEMS-based microcantilever is a microfabricated, mostly rectangular, bar-shaped structure that is long compared to its width and has a thickness much smaller than its length or width. Microfabricated silicon cantilever sensor arrays represent a powerful platform for sensing applications in physics, chemistry, material science, biology, and medicine. A microcantilever can sense even a few molecules or atoms. A small change in mass causes a large displacement. Importantly, due to the micron size of the cantilever, its bending or displacement is caused by a small amount of mass rather than by weight. For applications in biomedical diagnostics this device plays an important role in the identification of disease-indicating particles. In this paper we review the cantilever principle, modes of operation, transduction principles, and the application of the cantilever as a sensor. MEMS applications operate the cantilever in either a static or a dynamic mode of operation. The concept of the stress concentration region (SCR) is used to increase the stress occurring in the cantilever.
[328] vixra:1503.0172 [pdf]
A Note on Quantum Entanglement in Dempster-Shafer Evidence Theory
Dempster-Shafer evidence theory is an efficient mathematical tool to deal with uncertain information. In this theory, the basic probability assignment (BPA) is the basic structure for the expression and inference of uncertainty. In this paper, quantum entanglement involved in Dempster-Shafer evidence theory is studied. A criterion is given to determine whether a BPA is in an entangled state or not. Based on that, the information volume involved in a BPA is discussed. The discussion shows that a non-quantum strategy (or observation) cannot obtain all the information contained in a BPA that is in an entangled state.
[329] vixra:1503.0131 [pdf]
A New Information Unit
It is well known that the "bit" is the unit in information theory for measuring information volume with Shannon entropy. However, one assumption in using the bit as the information unit is that the hypotheses are mutually exclusive. This is also the basic assumption in probability theory, meaning that two events cannot happen simultaneously. However, this assumption is violated in situations such as entangled states. A typical example is Schrödinger's cat, which may be simultaneously both alive and dead. In such situations, the bit is not suitable for measuring information volume. To address this issue, a new information unit, called the "Deng" and abbreviated "D", is proposed based on Deng entropy. The proposed information unit may be used in entangled information processing and quantum information processing.
[330] vixra:1503.0074 [pdf]
Evidence Combination from an Evolutionary Game Theory Perspective
Dempster-Shafer evidence theory is a primary methodology for multi-source information fusion since it can deal with uncertain information. The theory is based on Dempster's rule of combination for synthesizing multiple evidences from various information sources. However, in some cases, counter-intuitive results may be obtained from Dempster's rule of combination. Many improved or new methods have been proposed to suppress these counter-intuitive results, based on a physical perspective that minimizes the loss or deviation of the original information. In this paper, inspired by evolutionary game theory, a biological and evolutionary perspective is taken on the combination of evidences. An evolutionary combination rule (ECR) is proposed to mimic the evolution of propositions in a given population and finally find the biologically most supported proposition, which we call the evolutionarily stable proposition (ESP). Our proposed ECR provides new insight into the combination of multi-source information. Experimental results show that the proposed method is rational and effective.
[331] vixra:1503.0024 [pdf]
Switch or Not? The Simulation of the Monty Hall Problem
The Monty Hall problem is a brain teaser, originally posed in a letter by Steve Selvin to The American Statistician in 1975. To find out the principle behind the conclusion given by Marilyn vos Savant, and to determine whether there is always an advantage for a contestant who chooses to switch, we have built a simulation of this problem.
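A minimal simulation sketch of the game just described; door indexing and the trial count are arbitrary choices of this illustration.

    import random

    def play(switch, trials=100_000):
        # Simulate the Monty Hall game; returns the empirical win rate.
        wins = 0
        for _ in range(trials):
            car = random.randrange(3)
            pick = random.randrange(3)
            # The host opens a door that is neither the pick nor the car.
            opened = next(d for d in range(3) if d != pick and d != car)
            if switch:
                pick = next(d for d in range(3) if d != pick and d != opened)
            wins += (pick == car)
        return wins / trials

    print("stay:  ", play(switch=False))   # approaches 1/3
    print("switch:", play(switch=True))    # approaches 2/3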
[332] vixra:1502.0222 [pdf]
Deng Entropy: a Generalized Shannon Entropy to Measure Uncertainty
Shannon entropy is an efficient tool to measure uncertain information. However, it cannot handle the more uncertain situation in which the uncertainty is represented by a basic probability assignment (BPA), instead of a probability distribution, under the framework of Dempster-Shafer evidence theory. To address this issue, a new entropy, named Deng entropy, is proposed. The proposed Deng entropy is a generalization of Shannon entropy: if the uncertain information is represented by a probability distribution, the uncertainty degree measured by Deng entropy is the same as that of Shannon entropy. Some numerical examples are given to show the efficiency of Deng entropy.
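A minimal sketch of the measure as commonly stated, $E_d(m) = -\sum_A m(A)\log_2(m(A)/(2^{|A|}-1))$ over the focal elements; the frozenset BPAs below are illustrative.

    from math import log2

    def deng_entropy(mass):
        # mass: dict mapping frozenset focal elements to masses summing to 1.
        # For singleton-only BPAs this reduces to Shannon entropy.
        return -sum(m * log2(m / (2 ** len(A) - 1))
                    for A, m in mass.items() if m > 0)

    # A plain probability distribution (singletons only) gives Shannon entropy:
    print(deng_entropy({frozenset("a"): 0.5, frozenset("b"): 0.5}))   # 1.0
    # Mass on a compound focal element raises the measured uncertainty:
    print(deng_entropy({frozenset("ab"): 1.0}))   # log2(3), about 1.585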
[333] vixra:1502.0204 [pdf]
Personal Multithreading: Account Snippet Proposals and Missing Account Indications
A modular way of making progress on personal multithreading is suggested: collecting account snippet proposals and missing account indications without an immediate need for integration into a coherent account. Six account snippets for personal multithreading are proposed, and four options for further contributions, that is, missing account indications, on personal multithreading are listed.
[334] vixra:1501.0088 [pdf]
Decision Taking Avoiding Agency
In the setting of outcome oriented decision taking (OODT), decisions are scarce events occurring in between many other events which are not decisions. Activities of and agency by an agent may occur without any decisions being taken by that agent, with choice and action determination as the only mechanisms for actively resolving uncertainty about future behavior. Such behaviour will be referred to as decision taking avoiding agency. A model, or rather a preliminary qualitative description, of decision taking avoiding agency is provided for systems consisting of or controlled by a solitary natural or artificial agent, as well as for a group of agents.
[335] vixra:1411.0006 [pdf]
Weighted Neutrosophic Soft Sets Approach in a Multi-criteria Decision Making Problem
Decision making problems in imprecise environments have become highly significant in recent years. In this paper we study weighted neutrosophic soft sets, a hybridization of neutrosophic sets with soft sets corresponding to weighted parameters. We consider a multi-criteria decision making problem as an application of weighted neutrosophic soft sets.
[336] vixra:1408.0008 [pdf]
The Grow-Shrink Strategy for Learning Markov Network Structures Constrained by Context-Specific Independences
Markov networks are models for compactly representing complex probability distributions. They are composed of a structure and a set of numerical weights. The structure qualitatively describes independences in the distribution, which can be exploited to factorize the distribution into a set of compact functions. A key application of learning structures from data is to automatically discover knowledge. In practice, structure learning algorithms focused on "knowledge discovery" present a limitation: they use a coarse-grained representation of the structure. As a result, this representation cannot describe context-specific independences. Very recently, an algorithm called CSPC was designed to overcome this limitation, but it has a high computational complexity. This work tries to mitigate this downside by presenting CSGS, an algorithm that uses the Grow-Shrink strategy to reduce unnecessary computations. In an empirical evaluation, the structures learned by CSGS achieve competitive accuracies and lower computational complexity with respect to those obtained by CSPC.
[337] vixra:1405.0222 [pdf]
Learning Markov Networks Structures Constrained by Context-Specific Independences
This work focuses on learning the structure of Markov networks. Markov networks are parametric models for compactly representing complex probability distributions. These models are composed of a structure and a set of numerical weights. The structure describes independences that hold in the distribution. Depending on the goal of learning intended by the user, structure learning algorithms can be divided into density estimation algorithms, which focus on learning structures for answering inference queries, and knowledge discovery algorithms, which focus on learning structures for describing independences qualitatively. The latter algorithms present an important limitation for describing independences, as they use a single graph, a coarse-grained representation of the structure. However, many practical distributions present a flexible type of independences called context-specific independences, which cannot be described by a single graph. This work presents an approach for overcoming this limitation by proposing an alternative representation of the structure that we name the canonical model, and a novel knowledge discovery algorithm called CSPC for learning canonical models by using context-specific independences present in the data as constraints. In an extensive empirical evaluation, CSPC learns more accurate structures than state-of-the-art density estimation and knowledge discovery algorithms. Moreover, for answering inference queries, our approach obtains competitive results against density estimation algorithms, significantly outperforming knowledge discovery algorithms.
[338] vixra:1312.0191 [pdf]
Device Search and Selection
Cyber-physical systems (CPS) represent the expansion of computerized interconnectivity. This phenomenon is also moving towards the Internet of Things (IoT) paradigm. Search functionality plays a vital role in this domain. Many different types of search capabilities are required to build a comprehensive CPS architecture. In CPS, users may want to search for smart devices and services. In this chapter, we discuss concepts and techniques related to device search and selection. We briefly discuss different types of device searching approaches, each with its own objectives and applications. One such device searching technique is context-aware searching. In this chapter, we present a context-aware sensor search, selection, and ranking model called CASSARAM in detail. This model addresses the challenge of efficiently selecting a subset of relevant sensors out of a large set of sensors with similar functionality and capabilities. CASSARAM takes into account user preferences and considers a broad range of sensor characteristics, such as reliability, accuracy, location, battery life, and many more. Later in the chapter, we discuss three different techniques that can be used to improve the efficiency of CASSARAM. We implemented the proof-of-concept software using Java. Testing and performance evaluation results are also discussed. We also highlight open research challenges and opportunities to support future research directions.
[339] vixra:1312.0116 [pdf]
Mobile Sensing Devices and Platforms
A cyber-physical system (CPS) is a system of collaborating computational elements controlling physical entities. CPS represents the next stage on the road to the creation of smart cities through the creation of an Internet of Things, data, and services. Mobility is one of the major characteristics of both CPS and IoT. In this chapter, we discuss mobile sensing platforms and their applications towards different but interrelated paradigms such as IoT, sensing as a service, and smart cities. We highlight and briefly discuss different types of mobile sensing platforms and the functionalities they offer. Mobile sensing platforms are most often integrated with smart phones and tablet devices. The resource-constrained nature of mobile devices requires different types of designs and architectural implementations. We propose a software-based mobile sensing platform called Mobile Sensor Data Engine (MOSDEN). It is a plug-in-based, scalable, and extendible IoT middleware for mobile devices that provides an easy way to collect sensor data from both internal and external sensors. MOSDEN acts as an intermediary device that collects data from external sensors and uploads it to the cloud in real time or on demand. We evaluate MOSDEN in both stand-alone and collaborative environments. The proof of concept is developed on the Android platform.
[340] vixra:1309.0149 [pdf]
A Complexity of Bridge Double Dummy Problem
This paper presents an analysis of the complexity of the bridge double dummy problem. Estimates are given for both the state-space (search-space) complexity and the game-tree complexity.
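The abstract does not restate its estimates, but the size of the deal space, a standard starting point for the state-space estimate, can be computed exactly. The short calculation below is ours, not quoted from the paper.

```python
from math import comb, factorial

# Number of distinct bridge deals (52 cards into four hands of 13):
deals = factorial(52) // factorial(13) ** 4
assert deals == comb(52, 13) * comb(39, 13) * comb(26, 13)
print(f"distinct deals: {deals:.3e}")    # about 5.364e+28
```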
[341] vixra:1309.0130 [pdf]
Storkey Learning Rules for Hopfield Networks
We summarize the Storkey Learning Rules for the Hopfield Model and evaluate their performance relative to other learning rules. Hopfield Models are normally used for auto-association, and Storkey Learning Rules have been found to strike a good balance between local learning and capacity. In this paper we outline different learning rules and summarize capacity results. Hopfield networks are related to Boltzmann Machines: they are the same as fully visible Boltzmann Machines in the zero-temperature limit. Perhaps renewed interest in Boltzmann machines will produce renewed interest in Hopfield learning rules?
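For readers unfamiliar with the rule being summarized, a minimal sketch of the incremental Storkey update DeltaW_ij = (xi_i xi_j - xi_i h_ji - h_ij xi_j) / n, with h_ij the local field excluding units i and j, is given below. The network size, patterns, and recall test are our illustrative choices, not the paper's experiments.

```python
import numpy as np

def storkey_update(W, xi):
    """One incremental Storkey update of Hopfield weights W (n x n, zero
    diagonal) with a new bipolar pattern xi in {-1, +1}^n."""
    n = xi.size
    h = W @ xi                           # local fields; zero diagonal => k != i
    H = h[:, None] - W * xi[None, :]     # drop the k == j term: h_ij
    dW = (np.outer(xi, xi) - xi[:, None] * H.T - H * xi[None, :]) / n
    W = W + dW
    np.fill_diagonal(W, 0.0)
    return W

rng = np.random.default_rng(0)
n = 64
patterns = [rng.choice([-1, 1], size=n) for _ in range(5)]  # within capacity
W = np.zeros((n, n))
for p in patterns:
    W = storkey_update(W, p)

x = patterns[-1].copy()                  # recall test from the clean pattern
for _ in range(5):
    x = np.where(W @ x >= 0, 1, -1)      # synchronous updates
print("last stored pattern is a fixed point:", np.array_equal(x, patterns[-1]))
```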
[342] vixra:1309.0129 [pdf]
An Effective Neutrosophic Set-Based Preprocessing Method for Face Recognition
Face recognition (FR) is a challenging task in biometrics due to variations in illumination and pose and the presence of noise. In this paper, we propose a novel neutrosophic set (NS)-based preprocessing method that simultaneously removes noise and enhances facial features in the original face images.
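As a rough illustration of mapping an image into the neutrosophic domain, the sketch below uses one common local-mean formulation of (T, I, F) from the NS image-processing literature; the paper's exact mapping and its noise-removal and enhancement steps may well differ.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def to_neutrosophic(img, w=5):
    """Map a grayscale image to (T, I, F) via a local-mean formulation."""
    g = img.astype(float)
    g_bar = uniform_filter(g, size=w)    # local mean over a w x w window
    T = (g_bar - g_bar.min()) / (g_bar.max() - g_bar.min() + 1e-12)
    delta = np.abs(g - g_bar)            # local deviation from the mean
    I = (delta - delta.min()) / (delta.max() - delta.min() + 1e-12)
    F = 1.0 - T
    return T, I, F

img = np.random.default_rng(1).integers(0, 256, size=(64, 64))
T, I, F = to_neutrosophic(img)
print(T.shape, round(float(I.mean()), 3))
```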
[343] vixra:1309.0030 [pdf]
Neutrosophic Soft Set
In this paper we study Smarandache's concept of the neutrosophic set. We introduce this concept into soft sets and define the neutrosophic soft set. Some definitions and operations on neutrosophic soft sets are introduced, and some properties of this concept are established.
[344] vixra:1309.0029 [pdf]
A Neutrosophic Soft Set Approach to a Decision Making Problem
Decision making problems in imprecise environments have assumed paramount importance in recent years. Here we consider an object recognition problem in an imprecise environment. The recognition strategy is based on a multiobserver input parameter data set.
[345] vixra:1304.0011 [pdf]
Lie Algebrized Gaussians for Image Representation
We present an image representation method derived from analyzing the space of Gaussian probability density functions (pdfs) using Lie group theory. In our method, images are modeled by Gaussian mixture models (GMMs) adapted from a globally trained GMM called the universal background model (UBM). We then vectorize the GMMs based on two facts: (1) the components of image-specific GMMs are closely grouped around their corresponding components of the UBM, due to the nature of the UBM adaptation procedure; (2) Gaussian pdfs form a Lie group, which is a differentiable manifold rather than a vector space. We map each Gaussian component to the tangent vector space (the Lie algebra) of the Lie group at the manifold position of the UBM. The final feature vector, named Lie algebrized Gaussians (LAG), is then constructed by combining the Lie algebrized Gaussian components with the mixture weights. We apply LAG features to the scene category recognition problem and observe state-of-the-art performance on the 15Scenes benchmark.
[346] vixra:1303.0202 [pdf]
Mobile Robot Navigation Using Artificial Landmarks and GPS
To navigate, a mobile robot must adequately recognize its current position and the surrounding environment. To this end, the robot is equipped with sensors such as laser range scanners, ultrasonic sensors, cameras, odometry, and GPS (Global Positioning System), which allow it to determine its current position and orientation, the state of its surroundings, the distance traveled, and the distance to nearby objects. However, sensor information contains errors, and as the errors introduced by the operating environment and the on-board sensors accumulate, the robot may lose track of its current position, deviate from its planned route, and fail to reach its destination. To recognize the correct position, the accumulated error must be eliminated periodically and the position recalibrated. To improve position calibration, control techniques such as SLAM (Simultaneous Localization and Mapping) [1] algorithms and the Kalman Filter [2] are introduced into the robot.
[347] vixra:1303.0068 [pdf]
A Neutrosophic Multicriteria Decision Making Method
This work presents a method of multicriteria decision making using neutrosophic sets. Besides studying some interesting mathematical properties of the method, an algorithm, viz. neut-MCDM, is presented. The work also succinctly furnishes the fundamentals of neutrosophic set theory, to provide a first introduction to neutrosophic sets for the MCDM community. To illustrate the computational details, neut-MCDM has been applied to the problem of university faculty selection against a given set of criteria.
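As a loose illustration of neutrosophic scoring in an MCDM setting (not the neut-MCDM algorithm itself), the sketch below rates alternatives by criterion-wise (T, I, F) triples and the score function s = (2 + T - I - F) / 3 known from the single-valued neutrosophic literature; the candidates, criteria, weights, and values are all invented.

```python
# Hypothetical (T, I, F) evaluations of three candidates on two criteria.
candidates = {
    "cand1": [(0.8, 0.2, 0.1), (0.6, 0.3, 0.2)],
    "cand2": [(0.7, 0.1, 0.2), (0.7, 0.2, 0.1)],
    "cand3": [(0.5, 0.4, 0.3), (0.9, 0.1, 0.1)],
}
weights = [0.6, 0.4]                     # assumed criterion weights

def score(t, i, f):
    return (2 + t - i - f) / 3           # a known SVNS score function

ranking = sorted(
    candidates,
    key=lambda c: -sum(w * score(*tif) for w, tif in zip(weights, candidates[c])),
)
print("ranking (best first):", ranking)
```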
[348] vixra:1301.0024 [pdf]
CloudSVM : Training an SVM Classifier in Cloud Computing Systems
In conventional distributed machine learning methods, distributed support vector machine (SVM) algorithms are trained over pre-configured intranet/internet environments to find an optimal classifier. These methods are complicated and costly for large datasets. Hence, we propose a method, referred to as the Cloud SVM training mechanism (CloudSVM), in a cloud computing environment with the MapReduce technique for distributed machine learning applications. Accordingly, (i) the SVM algorithm is trained in distributed cloud storage servers that work concurrently; (ii) all support vectors from every trained cloud node are merged; and (iii) these two steps are iterated until the SVM converges to the optimal classifier function. A single computer is incapable of training the SVM algorithm on large-scale data sets. The results of this study are important for the training of large-scale data sets for machine learning applications. We show that iterative training of split data sets in a cloud computing environment using SVM converges to a global optimal classifier in a finite number of iterations.
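The three-step loop described above (train per node, merge support vectors, iterate) is easy to simulate on one machine. The sketch below does so with scikit-learn; the data set, the number of simulated nodes, and the convergence test are our illustrative choices, not the paper's setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Single-machine simulation of the CloudSVM loop on a toy data set.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

sv_X, sv_y = X, y
prev = -1
for it in range(10):
    pooled_X, pooled_y = [], []
    # (i) train one SVM per simulated cloud node on its data split
    for idx in np.array_split(np.arange(len(sv_X)), 4):
        clf = SVC(kernel="linear").fit(sv_X[idx], sv_y[idx])
        # (ii) keep only each node's support vectors
        pooled_X.append(sv_X[idx][clf.support_])
        pooled_y.append(sv_y[idx][clf.support_])
    sv_X, sv_y = np.concatenate(pooled_X), np.concatenate(pooled_y)
    # (iii) iterate until the pooled support-vector set stops changing
    if len(sv_X) == prev:
        break
    prev = len(sv_X)

final = SVC(kernel="linear").fit(sv_X, sv_y)
print("iterations:", it + 1, "pooled support vectors:", len(sv_X))
```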
[349] vixra:1208.0049 [pdf]
A Multi-Feature Information Fusion Method for Aircraft Image Target Recognition
We present a multi-feature fusion recognition algorithm for aircraft image targets based on probabilistic neural networks (PNN) and DSmT (Dezert-Smarandache Theory) reasoning. Using the idea of data fusion, the information provided by each of the extracted image features is combined. First, the image is binarized, and five features are extracted: Hu moments, the normalized moment of inertia, affine invariant moments, contour discretization parameters, and singular value features. Second, to address the difficulty of constructing belief assignments in DSmT, a PNN is used to build a target recognition rate matrix, from which belief assignments for the evidence sources are derived. Then, the DSmT combination rule is applied at the decision level to complete the recognition of the aircraft target. Finally, under small distortions of the target image, comparative experiments between the proposed multi-feature fusion method and single-feature methods show that, under the same conditions, the proposed method greatly improves the correct recognition rate while meeting real-time requirements, and it exhibits effective rejection capability and insensitivity to target image size. Even under large distortions, the recognition rate reaches 89.3%.
[350] vixra:1207.0057 [pdf]
Extended PCR Rules for Dynamic Frames
In most classical fusion problems modeled with belief functions, the frame of discernment is considered static. This means that the set of elements in the frame and the underlying integrity constraints of the frame are fixed and do not change with time. In some applications, such as target tracking, such an invariant frame is not appropriate because the frame can actually change with time. It is therefore necessary to adapt the Proportional Conflict Redistribution fusion rules (PCR5 and PCR6) to work with dynamic frames. In this paper, we propose an extension of the PCR5 and PCR6 rules for working in a frame having some non-existential integrity constraints. Such constraints on the frame can arise in tracking applications, for example through the destruction of targets. We show through very simple examples how these new rules can be used for the belief revision process.
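For reference, the baseline (static-frame) PCR5 rule that the paper extends redistributes each partial conflict m1(X)m2(Y), with X and Y disjoint, back to X and Y in proportion to the masses involved. A minimal two-source sketch over a static frame is given below; the dynamic-frame extension with non-existential constraints is not reproduced.

```python
from itertools import product

def pcr5(m1, m2):
    """Two-source PCR5: conjunctive consensus plus proportional
    redistribution of each partial conflict back to its two culprits."""
    out = {}
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:                                    # consensus part
            out[C] = out.get(C, 0.0) + a * b
        elif a + b > 0:                          # partial conflict a * b
            out[A] = out.get(A, 0.0) + a * a * b / (a + b)
            out[B] = out.get(B, 0.0) + b * b * a / (a + b)
    return out

A, B = frozenset("A"), frozenset("B")
fused = pcr5({A: 0.6, B: 0.4}, {A: 0.2, B: 0.8})
print({"".join(sorted(X)): round(v, 4) for X, v in fused.items()})
# {'A': 0.3524, 'B': 0.6476}: masses still sum to 1, no normalization step
```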
[351] vixra:1010.0039 [pdf]
Fusion of Imprecise Qualitative Information
In this paper, we present a new 2-tuple linguistic representation model, the Distribution Function Model (DFM), for combining imprecise qualitative information using fusion rules drawn from the Dezert-Smarandache Theory (DSmT) framework. This new approach preserves the precision and efficiency of the combination of linguistic information for both equidistant and unbalanced label models. Some basic operators on precise 2-tuple labels are presented, together with their extensions to imprecise 2-tuple labels. We also give simple examples to show how precise and imprecise qualitative information can be combined for reasoning under uncertainty. We conclude that DSmT can deal efficiently with both precise and imprecise, quantitative and qualitative beliefs, which extends the scope of this theory.
[352] vixra:1004.0094 [pdf]
Neutrosophy in Situation Analysis
In situation analysis (SA), an agent observing a scene receives information from heterogeneous sources, including for example remote sensing devices, human reports and databases. The aim of this agent is to reach a certain level of awareness of the situation in order to make decisions. For the purpose of applications, this state of awareness can be conceived as a state of knowledge in the classical epistemic logic sense. Considering the logical connection between belief and knowledge, the challenge for the designer is to transform the raw, imprecise, conflicting and often paradoxical information received from the different sources into statements understandable by both man and machine. Hence, quantitative (i.e., measuring the world) and qualitative (i.e., reasoning about the structure of the world) information processing coexist in SA. A great challenge in SA is the conciliation of both aspects in mathematical and logical frameworks. As a consequence, SA applications need frameworks general enough to take into account the different types of uncertainty and information present in the SA context, coupled with a semantics allowing meaningful reasoning about situations. The aim of this paper is to evaluate the capacity of neutrosophic logic and Dezert-Smarandache theory (DSmT) to cope with the ontological and epistemological problems of SA.
[353] vixra:1004.0052 [pdf]
A Simple Proportional Conflict Redistribution Rule
We propose a first alternative to the WAO (Weighted Average Operator) combination rule recently proposed by Josang, Daniel and Vannoorenberghe, called the Proportional Conflict Redistribution rule (denoted PCR1). PCR1 and WAO are particular cases of WO (the Weighted Operator), because the conflicting mass is redistributed with respect to some weighting factors. In this first PCR rule, the proportionalization is done for each non-empty set with respect to the non-zero sum of its corresponding mass matrix column, instead of its mass column average as in WAO, but the results are the same, as Ph. Smets has pointed out. We also extend WAO (which gives no solution here) to the degenerate case in which all column sums of all non-empty sets are zero; the conflicting mass is then transferred to the non-empty disjunctive form of all non-empty sets together, and if this disjunctive form happens to be empty, one considers an open world (i.e., the frame of discernment might contain new hypotheses) and transfers all conflicting mass to the empty set. In addition to WAO, we propose a general formula for PCR1 (WAO for non-degenerate cases). Several numerical examples and comparisons with other combination rules published in the literature are presented. Another distinction between these alternative rules is that WAO is defined on the power set, while PCR1 is defined on the hyper-power set (Dedekind's lattice). A nice feature of PCR1 is that it works not only in non-degenerate cases but also in the degenerate cases that appear in dynamic fusion, while in these cases WAO yields masses summing to less than 1 (WAO does not work there). Meanwhile, we show that PCR1 and WAO unfortunately do not preserve the neutrality of the vacuous belief assignment through the fusion process. This severe drawback can, however, be easily circumvented by the new PCR rules presented in a companion paper.
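A minimal sketch of the basic two-source PCR1 mechanism described above: the total conflicting mass is redistributed to each non-empty set in proportion to its summed column mass c(X) = m1(X) + m2(X). The degenerate-case handling and the WAO comparison are not reproduced here.

```python
from itertools import product

def pcr1(m1, m2):
    """Two-source PCR1: the total conflicting mass is redistributed to every
    non-empty focal element X proportionally to c(X) = m1(X) + m2(X)."""
    conj, conflict = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            conj[C] = conj.get(C, 0.0) + a * b
        else:
            conflict += a * b
    c = {}
    for m in (m1, m2):
        for X, v in m.items():
            c[X] = c.get(X, 0.0) + v
    d = sum(c.values())                  # equals 2 for two normalized BBAs
    return {X: conj.get(X, 0.0) + c.get(X, 0.0) / d * conflict
            for X in set(conj) | set(c)}

A, B = frozenset("A"), frozenset("B")
fused = pcr1({A: 0.6, B: 0.4}, {A: 0.2, B: 0.8})
print({"".join(sorted(X)): round(v, 3) for X, v in fused.items()})
# {'A': 0.344, 'B': 0.656}
```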
[354] vixra:1003.0208 [pdf]
Advances and Applications of DSmT for Information Fusion Collected Works Volume 1
This book is devoted to an emerging branch of information fusion based on a new approach for modelling the fusion problematic when the information provided by the sources is both uncertain and (highly) conflicting. This approach, known in the literature as DSmT (Dezert-Smarandache Theory), proposes new useful rules of combination. We have gathered in this volume a presentation of DSmT from its beginnings to its latest developments. Part 1 of this book presents the current state of the art of theoretical investigations, while Part 2 presents several applications of this new theory. We hope that this first book on DSmT will stir up interest among researchers and engineers working in data fusion and artificial intelligence. Many simple but didactic examples are proposed throughout the book. As a young emerging theory, DSmT is probably not exempt from improvement, and its development will continue to evolve over the years. Through this book we simply want to propose a new look at the information fusion problematic and open a new track for attacking the combination of information.
[355] vixra:1003.0197 [pdf]
Application of Probabilistic PCR5 Fusion Rule for Multisensor Target Tracking
This paper defines and implements a non-Bayesian fusion rule for combining probability densities estimated by local (non-linear) filters tracking a moving target with passive sensors. This rule is the restriction, to a strictly probabilistic paradigm, of the recent and efficient Proportional Conflict Redistribution rule no. 5 (PCR5) developed in the DSmT framework for fusing basic belief assignments. A sampling method for probabilistic PCR5 (p-PCR5) is defined. It is shown that p-PCR5 is more robust to erroneous modeling, keeps the modes of the local densities, and preserves as much as possible of the information inherent in each density being combined. In particular, p-PCR5 is able to maintain multiple hypotheses/modes after fusion when the hypotheses are too distant relative to their deviations. This new p-PCR5 rule has been tested on a simple example of a distributed non-linear filtering application to show the interest of such an approach for future developments. The non-linear distributed filter is implemented through a basic particle filtering technique. The results obtained in our simulations show the ability of this p-PCR5-based filter to track the target even when the models are inconsistent with the initialization and the real kinematics. Keywords: filtering, robust estimation, non-Bayesian fusion rule, PCR5, particle filtering.
[356] vixra:1003.0196 [pdf]
Qualitative Belief Conditioning Rules (QBCR)
In this paper we extend the new family of (quantitative) Belief Conditioning Rules (BCR), recently developed in the Dezert-Smarandache Theory (DSmT), to their qualitative counterpart for belief revision. Since the revision of quantitative as well as qualitative belief assignments given the occurrence of a new event (the conditioning constraint) can be done in many possible ways, we present here only what we consider the most appealing Qualitative Belief Conditioning Rules (QBCR), which allow the belief to be revised directly with words and linguistic labels, and thus avoid the introduction of ad-hoc translations of qualitative beliefs into quantitative ones for solving the problem.
[357] vixra:1003.0195 [pdf]
Enrichment of Qualitative Beliefs for Reasoning Under Uncertainty
This paper deals with enriched qualitative belief functions for reasoning under uncertainty and for combining information expressed in natural language through linguistic labels. Two possible enrichments (quantitative and/or qualitative) of linguistic labels are considered, and operators (addition, multiplication, division, etc.) for dealing with them are proposed and explained. We denote them qe-operators, qe standing for "qualitative-enriched" operators. These operators can be seen as a direct extension of the classical qualitative operators (q-operators) proposed recently in the Dezert-Smarandache Theory of plausible and paradoxical reasoning (DSmT); the q-operators are also justified in detail in this paper. The quantitative enrichment of a linguistic label is a numerical supporting degree in [0,∞), while the qualitative enrichment takes its values in a finite ordered set of linguistic values. Quantitative enrichment is less precise than qualitative enrichment, but it is expected to be closer to what human experts can easily provide when expressing linguistic labels with supporting degrees. Two simple examples are given to show how the fusion of qualitative-enriched belief assignments can be done.
[358] vixra:1003.0165 [pdf]
A Neutrosophic Description Logic
Description Logics (DLs) are appropriate, widely used logics for managing structured knowledge. They allow reasoning about individuals and concepts, i.e. sets of individuals with common properties. Typically, DLs are limited to dealing with crisp, well-defined concepts, that is, concepts for which the question of whether an individual is an instance is a yes/no matter. More often than not, the concepts encountered in the real world do not have precisely defined membership criteria: we may say that an individual is an instance of a concept only to a certain degree, depending on the individual's properties. The DLs that deal with such fuzzy concepts are called fuzzy DLs. In order to deal with fuzzy, incomplete, indeterminate and inconsistent concepts, we need to extend the capabilities of fuzzy DLs further. In this paper we present an extension of fuzzy ALC that combines Smarandache's neutrosophic logic with a classical DL. In particular, concepts become neutrosophic (here, neutrosophic means fuzzy, incomplete, indeterminate and inconsistent), and reasoning about such neutrosophic concepts is supported. We define its syntax and semantics, describe its properties, and present a constraint propagation calculus for reasoning in it.
[359] vixra:1003.0161 [pdf]
DSmT: a New Paradigm Shift for Information Fusion
The management and combination of uncertain, imprecise, fuzzy and even paradoxical or highly conflicting sources of information has always been, and still remains, of primal importance for the development of reliable information fusion systems. In this short survey paper, we present the theory of plausible and paradoxical reasoning, known as DSmT (Dezert-Smarandache Theory) in the literature, developed for dealing with imprecise, uncertain and potentially highly conflicting sources of information. DSmT is a new paradigm shift for information fusion, and recent publications have shown the interest and potential of DSmT for solving fusion problems where Dempster's rule, used in Dempster-Shafer Theory (DST), provides counter-intuitive results or fails to provide useful results at all. This paper focuses on the foundations of DSmT and on its main rules of combination (classic, hybrid and Proportional Conflict Redistribution rules). Shafer's model, on which DST is based, appears as a particular and specific case of the DSm hybrid model, which can be easily handled by DSmT as well. Several simple but illustrative examples are given throughout this paper to show the interest and generality of this new theory.
[360] vixra:1003.0159 [pdf]
An Introduction to the DSm Theory for the Combination of Paradoxical, Uncertain, and Imprecise Sources of Information
The management and combination of uncertain, imprecise, fuzzy and even paradoxical or highly conflicting sources of information has always been, and still remains today, of primal importance for the development of reliable modern information systems involving artificial reasoning. In this introduction, we present a survey of our recent theory of plausible and paradoxical reasoning, known as Dezert-Smarandache Theory (DSmT) in the literature, developed for dealing with imprecise, uncertain and paradoxical sources of information. We focus our presentation on the foundations of DSmT and on its two important new rules of combination, rather than on browsing the specific applications of DSmT available in the literature. Several simple examples are given throughout the presentation to show the efficiency and generality of this new approach.
[361] vixra:1003.0157 [pdf]
Fusion of Qualitative Beliefs Using DSmT
This paper introduces the notion of a qualitative belief assignment to model the beliefs of human experts expressed in natural language (with linguistic labels). We show how qualitative beliefs can be efficiently combined using an extension of the Dezert-Smarandache Theory (DSmT) of plausible and paradoxical quantitative reasoning to qualitative reasoning. We propose a new arithmetic on linguistic labels which allows a direct extension of the classical DSm fusion rule and the DSm hybrid rules. An approximate qualitative PCR5 rule is also proposed, jointly with a Qualitative Average Operator. We also show how crisp or interval mappings can be used to deal indirectly with linguistic labels. A very simple example is provided to illustrate our qualitative fusion rules.
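As a rough sketch of label arithmetic of this kind, the snippet below uses saturated addition and rescaled, rounded multiplication on label indices, two definitions commonly cited for q-operators in the DSmT literature; the paper's exact operators may differ.

```python
# Labels L_0 (minimum) .. L_{M+1} (maximum); arithmetic acts on the indices.
M = 5

def q_add(i, j):
    """Saturated addition: L_i + L_j = L_min(i + j, M + 1)."""
    return min(i + j, M + 1)

def q_mul(i, j):
    """Rescaled, rounded product: L_i x L_j ~ L_[i * j / (M + 1)]."""
    return round(i * j / (M + 1))

print("L3 + L4 = L%d" % q_add(3, 4))     # saturates at L6
print("L3 x L4 = L%d" % q_mul(3, 4))     # L2
```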
[362] vixra:1003.0156 [pdf]
Target Type Tracking with PCR5 and Dempster's Rules: a Comparative Analysis
In this paper we consider and analyze the behavior of two combination rules for temporal (sequential) attribute data fusion for target type estimation. Our comparative analysis is based on Dempster's fusion rule, proposed in Dempster-Shafer Theory (DST), and on the Proportional Conflict Redistribution rule no. 5 (PCR5), recently proposed in Dezert-Smarandache Theory (DSmT). We show, through a very simple scenario and Monte-Carlo simulation, how PCR5 allows very efficient Target Type Tracking and drastically reduces the latency for a correct Target Type decision compared with Dempster's rule. For cases presenting short Target Type switches, Dempster's rule is shown to be unable to detect the switches and thus to track the Target Type changes correctly. The approach proposed here is new, efficient, and promising for incorporation in real-time Generalized Data Association Multi-Target Tracking systems (GDA-MTT), and it provides an important result on the behavior of PCR5 with respect to Dempster's rule. The MatLab source code is provided in [5].
[363] vixra:1003.0154 [pdf]
The Combination of Paradoxical, Uncertain and Imprecise Sources of Information based on DSmT and Neutro-Fuzzy Inference
The management and combination of uncertain, imprecise, fuzzy and even paradoxical or highly conflicting sources of information has always been, and still remains today, of primal importance for the development of reliable modern information systems involving artificial reasoning. In this chapter, we present a survey of our recent theory of plausible and paradoxical reasoning, known as Dezert-Smarandache Theory (DSmT) in the literature, developed for dealing with imprecise, uncertain and paradoxical sources of information. We focus our presentation on the foundations of DSmT and on its two important new rules of combination, rather than on browsing the specific applications of DSmT available in the literature. Several simple examples are given throughout the presentation to show the efficiency and generality of this new approach. The last part of this chapter presents neutrosophic logic, neutro-fuzzy inference, and their connection with DSmT. Fuzzy logic and neutrosophic logic are useful tools for decision making after fusing the information using the DSm hybrid rule of combination of masses.
[364] vixra:1003.0152 [pdf]
The Generalized Pignistic Transformation
This paper presents in detail the generalized pignistic transformation (GPT) succinctly developed in the Dezert-Smarandache Theory (DSmT) framework as a tool for the decision process. The GPT allows one to derive a subjective probability measure from any generalized basic belief assignment given by any corpus of evidence. We mainly focus our presentation on the 3D case and provide the complete result obtained by the GPT together with its validation drawn from probability theory.
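The classical special case is easy to state: on the power set, BetP(x) is the sum over all A containing x of m(A)/|A|. The sketch below implements only this classical transform; the GPT of the paper replaces |A| with the DSm cardinality on the hyper-power set, which is not reproduced here.

```python
def betp(m):
    """Classical pignistic transform: BetP(x) = sum over A containing x
    of m(A) / |A|, for a normalized BBA m on the power set."""
    out = {}
    for A, mass in m.items():
        for x in A:
            out[x] = out.get(x, 0.0) + mass / len(A)
    return out

m = {frozenset("a"): 0.4, frozenset("ab"): 0.3, frozenset("abc"): 0.3}
print({k: round(v, 2) for k, v in betp(m).items()})
# {'a': 0.65, 'b': 0.25, 'c': 0.1}
```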
[365] vixra:1003.0150 [pdf]
On the Tweety Penguin Triangle Problem
In this paper, we study the famous and challenging Tweety Penguin Triangle Problem (TPTP or TP2) pointed out by Judea Pearl in one of his books. We first present the solution of the TP2 based on fallacious Bayesian reasoning and prove that this reasoning cannot be used to conclude whether the penguin-bird Tweety is able to fly. We then present in detail the counter-intuitive solution obtained from the Dempster-Shafer Theory (DST). Finally, we show how the solution can be obtained with our new theory of plausible and paradoxical reasoning (DSmT).
[366] vixra:1003.0149 [pdf]
Infinite Classes of Counter-Examples to the Dempster's Rule of Combination
This paper presents several classes of fusion problems which cannot be directly attacked by the classical mathematical theory of evidence, also known as the Dempster-Shafer Theory (DST), either because Shafer's model for the frame of discernment is impossible to obtain or because Dempster's rule of combination fails to provide coherent results (or no result at all). We present and discuss the potential of DSmT, combined with its classical (or hybrid) rule of combination, to attack these infinite classes of fusion problems.
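The best-known counter-example of this kind, Zadeh's example, shows the failure mode in a few lines of arithmetic (whether it belongs to the paper's specific classes is not claimed here): two almost fully conflicting sources drive Dempster's rule to full belief in a hypothesis both sources considered nearly impossible.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule: conjunctive combination, then normalization by
    (1 - total conflict)."""
    conj, conflict = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            conj[C] = conj.get(C, 0.0) + a * b
        else:
            conflict += a * b
    return {X: v / (1.0 - conflict) for X, v in conj.items()}

A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m1 = {A: 0.99, C: 0.01}                  # expert 1: almost surely A
m2 = {B: 0.99, C: 0.01}                  # expert 2: almost surely B
print(dempster(m1, m2))                  # {frozenset({'C'}): 1.0}
# Full belief lands on C, which both experts deemed nearly impossible.
```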
[367] vixra:1003.0148 [pdf]
Combining Uncertain and Paradoxical Evidences for DSm Hybrid Models
This paper presents a general method for combining uncertain and paradoxical sources of evidence for a wide class of fusion problems. Starting from the foundations of the Dezert-Smarandache Theory (DSmT), we show how the DSm rule of combination can be adapted to take into account all possible integrity constraints (if any) of the problem under consideration, due to the true nature of the elements/concepts involved in it. We show how Shafer's model can be considered as a specific DSm hybrid model and easily handled by our approach, and a new efficient rule of combination, different from Dempster's rule, is obtained. Several simple examples are also provided to show the efficiency and generality of the approach proposed in this work.
[368] vixra:1003.0146 [pdf]
Partial Ordering of Hyper-Powersets and Matrix Representation of Belief Functions Within DSmT
In this paper, we examine several schemes for ordering or partially ordering the elements of the hyper-powersets involved in the recent theory of plausible, uncertain and paradoxical reasoning (DSmT) developed by the authors. We show the benefit of some of these schemes in obtaining a convenient and useful matrix representation of belief functions.
[369] vixra:1003.0100 [pdf]
On the Blackman's Association Problem
Modern multitarget-multisensor tracking systems involve the development of reliable methods for data association and for the fusion of information from multiple sensors, and more specifically the partitioning of observations into tracks. This paper discusses and compares the application of Dempster-Shafer Theory (DST) and Dezert-Smarandache Theory (DSmT) methods to the fusion of multiple sensor attributes for target identification purposes. We focus our attention on the paradoxical Blackman's association problem and propose several approaches to outperform Blackman's solution. We clarify some preconceived ideas about the use of the degree of conflict between sources as a potential criterion for partitioning evidence.
[370] vixra:1003.0064 [pdf]
Adaptative Combination Rule and Proportional Conflict Redistribution Rule for Information Fusion
This paper presents two new promising combination rules for the fusion of uncertain and potentially highly conflicting sources of evidence in the theory of belief functions, established first in Dempster-Shafer Theory (DST) and recently extended in Dezert-Smarandache Theory (DSmT). Our aim here is to provide new ways of palliating the well-known limitations of Dempster's rule and of working beyond its limits of applicability. Since Zadeh's famous criticism of Dempster's rule in 1979, many researchers have proposed interesting alternative rules of combination to palliate the weaknesses of Dempster's rule and provide acceptable results, especially in highly conflicting situations. In this work, we present two new combination rules: the class of Adaptive Combination Rules (ACR) and a new efficient Proportional Conflict Redistribution (PCR) rule. Both rules can deal with highly conflicting sources in static and dynamic fusion applications. We present some interesting properties of the ACR and PCR rules and discuss simulation results obtained with both rules on Zadeh's problem and on a target identification problem.
[371] vixra:1003.0059 [pdf]
Combination of Qualitative Information with 2-Tuple Linguistic Representation in Dezert-Smarandache Theory
Modern systems for information retrieval, fusion and management increasingly need to deal with information coming from human experts, usually expressed qualitatively in natural language with linguistic labels. In this paper, we propose and use two new 2-tuple linguistic representation models (a distribution function model (DFM) and an improved Herrera-Martínez model), jointly with the fusion rules developed in Dezert-Smarandache Theory (DSmT), in order to efficiently combine qualitative information expressed in terms of qualitative belief functions. Both models preserve the precision and improve the efficiency of the fusion of linguistic information expressing the global expert opinion, but DFM is more general and efficient than the latter, especially for unbalanced linguistic labels. Some simple examples are also provided to show how the 2-tuple qualitative fusion rules are performed and what their advantages are.
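For readers new to 2-tuple models, the sketch below shows the basic Herrera-Martínez representation that the improved model and the DFM build on: a value beta in [0, g] maps to a label plus a symbolic translation alpha in [-0.5, 0.5). The label set and the averaging example are illustrative; the DFM's handling of unbalanced labels is not reproduced.

```python
LABELS = ["none", "low", "medium", "high", "full"]   # s_0 .. s_4, so g = 4

def delta(beta):
    """Map beta in [0, g] to a 2-tuple (label, alpha), alpha in [-0.5, 0.5)."""
    i = int(beta + 0.5)                  # round half up
    return LABELS[i], round(beta - i, 6)

def delta_inv(label, alpha):
    """Inverse mapping back to a number in [0, g]."""
    return LABELS.index(label) + alpha

# Aggregate two qualitative opinions by averaging their numeric images.
beta = (delta_inv("medium", 0.2) + delta_inv("high", -0.4)) / 2
print(delta(beta))                       # ('medium', 0.4)
```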