A self-supervised, physics-aware, Bayesian neural network architecture for modelling galaxy emission-line kinematics

Author(s):  
James M Dawson ◽  
Timothy A Davis ◽  
Edward L Gomez ◽  
Justus Schock

Abstract In the upcoming decades large facilities, such as the SKA, will provide resolved observations of the kinematics of millions of galaxies. In order to assist in the timely exploitation of these vast datasets we explore the use of a self-supervised, physics-aware neural network capable of Bayesian kinematic modelling of galaxies. We demonstrate the network’s ability to model the kinematics of cold gas in galaxies with an emphasis on recovering physical parameters and accompanying modelling errors. The model is able to recover rotation curves, inclinations and disc scale lengths for both CO and H i data which match well with those found in the literature. The model is also able to provide modelling errors over learned parameters thanks to the application of quasi-Bayesian Monte-Carlo dropout. This work shows the promising use of machine learning, and in particular self-supervised neural networks, in the context of kinematically modelling galaxies. This work represents the first steps in applying such models for kinematic fitting, and we propose that variants of our model would be especially suitable for enabling emission-line science from upcoming surveys with e.g. the SKA, allowing fast exploitation of these large datasets.
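The quasi-Bayesian Monte-Carlo dropout mentioned in the abstract can be illustrated with a minimal sketch: dropout is kept active at prediction time and the network is sampled many times, so the spread of the stochastic predictions serves as an uncertainty estimate. The tiny one-layer "network", its weights, and the dropout rate below are illustrative assumptions, not the authors' architecture.

```python
import random
import statistics

def forward(x, weights, drop_rate, rng):
    """One stochastic forward pass: each hidden unit is dropped
    with probability drop_rate, even at prediction time."""
    hidden = []
    for w in weights:
        if rng.random() < drop_rate:
            hidden.append(0.0)  # unit dropped
        else:
            hidden.append(w * x / (1.0 - drop_rate))  # inverted-dropout scaling
    return sum(hidden)

def mc_dropout_predict(x, weights, drop_rate=0.2, n_samples=500, seed=0):
    """Monte-Carlo dropout: sample many stochastic passes; the mean is the
    prediction and the standard deviation is a model-uncertainty estimate."""
    rng = random.Random(seed)
    samples = [forward(x, weights, drop_rate, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)

mean, err = mc_dropout_predict(2.0, weights=[0.5, -0.2, 0.8])
```

Because of the inverted-dropout scaling, the sample mean stays close to the deterministic prediction while the standard deviation grows with the dropout rate.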

2021 ◽  
Author(s):  
Martin Mundt

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods to learn a plethora of parameters are now used in favor of previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but rather focus on gathering large amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets, as well as a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address respective immediate weaknesses. In the author’s humble opinion, these prevalent shortcomings can be tied to the fact that the involved steps in the machine learning workflow are frequently decoupled. Success is predominantly measured based on accuracy measures designed for evaluation with static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in context of a particular application. Correspondingly, in this dissertation, three key challenges have been identified: 1. Choice and flexibility of a neural network architecture. 2. Identification and rejection of unseen unknown data to avoid false predictions. 3. Continual learning without forgetting of already learned information. These latter challenges have already been crucial topics in older literature, but seem to require a renaissance in modern deep learning literature. 
Initially, it may appear that they pose independent research questions; however, this thesis posits that these aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is thus how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context, which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. Thus, the central emphasis of this dissertation is to build on top of existing deep learning strengths, yet also acknowledge mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form. The respective publications can be grouped according to the three challenges outlined above. Correspondingly, chapter 1 is focused on choice and extendability of neural network architectures, analyzed in context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and is first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 comprises the complementary ensuing questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. 
Finally, chapter 3 culminates in an overarching view, where developed parts are connected. Here, an extensive survey further serves the purpose to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the overall thesis’ contribution to advance neural network based machine learning towards a unified solution that ties together choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.


The applications of a content-based image retrieval system in fields such as multimedia, security, medicine, and entertainment have been implemented on a huge real-time database by using a convolutional neural network architecture. In general, thus far, content-based image retrieval systems have been implemented with classical machine learning algorithms. A classical machine learning algorithm is applicable only to a limited database because of the few feature-extraction hidden layers between the input and the output layers. The proposed convolutional neural network architecture was successfully implemented using 128 convolutional layers, pooling layers, rectified linear units (ReLU), and fully connected layers. A convolutional neural network architecture yields better results because of its ability to extract features from an image. The Euclidean distance metric is used for calculating the similarity between the query image and the database images. The system is implemented using the COREL database and is successfully evaluated using precision, recall, and F-score.
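The Euclidean-distance ranking step described above can be sketched in a few lines: the CNN reduces each image to a feature vector, and retrieval returns the database images whose vectors lie closest to the query's. The 3-dimensional vectors and file names below are hypothetical stand-ins for real CNN embeddings, which would have hundreds of dimensions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query_features, database, top_k=2):
    """Rank database images by distance to the query; smaller is more similar."""
    ranked = sorted(database.items(),
                    key=lambda item: euclidean(query_features, item[1]))
    return [name for name, _ in ranked[:top_k]]

# Hypothetical feature vectors; a real CNN would emit much longer ones.
database = {
    "beach.jpg":  [0.9, 0.1, 0.2],
    "forest.jpg": [0.1, 0.8, 0.3],
    "coast.jpg":  [0.8, 0.2, 0.25],
}
print(retrieve([0.85, 0.15, 0.2], database))  # → ['beach.jpg', 'coast.jpg']
```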


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network at its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsamplings and shortcuts ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40k parameters in total, 74.3% on CIFAR-100 with less than 600k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15M parameters. However, the proposed method typically requires more computations than existing counterparts.
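The core idea of the recursion can be sketched in miniature: a single kernel is reused at every step, with a non-linearity and a residual shortcut supplying expressivity, so the parameter count stays fixed no matter how deep the recursion goes. The 1-D "convolution", 3-tap kernel, and iteration count below are illustrative simplifications, not the ThriftyNet configuration.

```python
def conv1d(signal, kernel):
    """Same-padding 1-D convolution with a 3-tap kernel."""
    padded = [0.0] + signal + [0.0]
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def relu(xs):
    return [max(0.0, x) for x in xs]

def thrifty_forward(signal, kernel, n_iterations=4):
    """ThriftyNet-style recursion: the SAME kernel is reused at every
    iteration, with a ReLU non-linearity and a residual shortcut.
    Depth grows with n_iterations, but the parameter count does not."""
    h = signal
    for _ in range(n_iterations):
        h = [a + b for a, b in zip(h, relu(conv1d(h, kernel)))]  # shortcut + conv
    return h

out = thrifty_forward([1.0, 0.0, -1.0, 0.5], kernel=[0.1, 0.5, 0.1])
```

Whether the recursion runs 4 or 40 times, the only learned parameters here are the three kernel weights, which is the "maximal parameter factorization" the abstract refers to.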


2019 ◽  
pp. 64-75
Author(s):  
A. A. Yarygin ◽  
B. H. Aytbaev ◽  
A. Yu. Kanyshev ◽  
E. A. Alekseeva

To make full use of scientific and engineering achievements in the field of bionic prosthetics, a comfortable and natural human-prosthesis interface must be provided for the end user. In this article we look into ways and methods of analysing the signal collected through electromyographic activity of muscles on the skin surface. Such a signal is nonstationary and unstable by its nature, and depends on various factors. sEMG-based interfaces have several unsolved problems at the moment, such as insufficient recognition accuracy and a noticeable delay caused by signal recognition and processing. The article is dedicated to the application of deep machine learning to provide decent recognition of electromyographic signals. In the course of the research, hardware was developed to register muscle activity, and a data collection system and gesture recognition algorithms were designed as well. In conclusion, decent results were achieved by using a convolutional neural network with a two-dimensional input, since the data stream has an obvious translational orientation. In the future, modifications of the neural network architecture and learning algorithms, as well as experiments with the structure of the data, are planned.


Author(s):  
Vincent Grari ◽  
Sylvain Lamprier ◽  
Marcin Detyniecki

The past few years have seen a dramatic rise of academic and societal interest in fair machine learning. While plenty of fair algorithms have been proposed recently to tackle this challenge for discrete variables, only a few ideas exist for continuous ones. The objective in this paper is to ensure some level of independence between the outputs of regression models and any given continuous sensitive variable. For this purpose, we use the Hirschfeld-Gebelein-Rényi (HGR) maximal correlation coefficient as a fairness metric. We propose to minimize the HGR coefficient directly with an adversarial neural network architecture. The idea is to predict the output Y while minimizing the ability of an adversarial neural network to find the estimated transformations which are required to predict the HGR coefficient. We empirically assess and compare our approach and demonstrate significant improvements over previously presented work in the field.
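The HGR coefficient is the largest Pearson correlation achievable between any transformation of the model output and any transformation of the sensitive variable. A crude, non-adversarial illustration replaces the paper's adversarial networks with a small fixed family of transforms and takes the maximum over them; the transforms, outputs, and sensitive values below are all hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def hgr_lower_bound(outputs, sensitive, transforms):
    """Lower bound on HGR: in the paper the transformations f, g are
    adversarial neural networks trained to maximise this correlation;
    here a small fixed family of functions stands in for them."""
    return max(abs(pearson([f(y) for y in outputs], [g(s) for s in sensitive]))
               for f in transforms for g in transforms)

transforms = [lambda v: v, lambda v: v * v, lambda v: abs(v)]
outputs   = [0.1, 0.4, 0.35, 0.8, 0.05]      # regression predictions
sensitive = [20, 35, 30, 60, 18]             # e.g. age as a continuous attribute
penalty = hgr_lower_bound(outputs, sensitive, transforms)
```

In the paper, this estimated coefficient acts as a fairness penalty added to the regression loss, so training pushes the predictions towards independence from the sensitive variable.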


10.2196/14502 ◽  
2019 ◽  
Vol 7 (4) ◽  
pp. e14502
Author(s):  
Po-Ting Lai ◽  
Wei-Liang Lu ◽  
Ting-Rung Kuo ◽  
Chia-Ru Chung ◽  
Jen-Chieh Han ◽  
...  

Background Research on disease-disease association (DDA), such as comorbidity and complication, provides important insights into disease treatment and drug discovery, and a large body of the literature has been published in the field. However, using current search tools, it is not easy for researchers to retrieve information on the latest DDA findings. First, comorbidity and complication keywords pull up large numbers of PubMed studies. Second, diseases are not highlighted in search results. Finally, DDA is not identified, as currently no disease-disease association extraction (DDAE) dataset or tools are available. Objective As there are no available DDAE datasets or tools, this study aimed to develop (1) a DDAE dataset and (2) a neural network model for extracting DDA from the literature. Methods In this study, we formulated DDAE as a supervised machine learning classification problem. To develop the system, we first built a DDAE dataset. We then employed two machine learning models, support vector machine and convolutional neural network, to extract DDA. Furthermore, we evaluated the effect of using the output layer as features of the support vector machine-based model. Finally, we implemented a large margin context-aware convolutional neural network architecture to integrate context features and convolutional neural networks through the large margin function. Results Our DDAE dataset consisted of 521 PubMed abstracts. Experiment results showed that the support vector machine-based approach achieved an F1 measure of 80.32%, which is higher than the convolutional neural network-based approach (73.32%). Using the output layer of the convolutional neural network as a feature for the support vector machine does not further improve the performance of the support vector machine. 
However, our large margin context-aware convolutional neural network achieved the highest F1 measure of 84.18%, demonstrating that combining the hinge loss function of the support vector machine with a convolutional neural network in a single neural network architecture outperforms the other approaches. Conclusions To facilitate the development of text-mining research for DDAE, we developed the first publicly available DDAE dataset, consisting of disease mentions, Medical Subject Heading IDs, and relation annotations. We developed different conventional machine learning models and neural network architectures and evaluated their effects on our DDAE dataset. To further improve DDAE performance, we propose a large margin context-aware convolutional neural network model for DDAE that outperforms the other approaches.
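The hinge loss that the abstract describes grafting onto the CNN can be sketched as follows: every incorrect class score is pushed at least one margin below the true class's score, and only violations contribute to the loss. The three relation classes and the score values are hypothetical, chosen only to show a satisfied and a violated margin.

```python
def large_margin_loss(scores, true_class, margin=1.0):
    """SVM-style multiclass hinge loss on a network's class scores:
    each wrong class must score at least `margin` below the true class."""
    correct = scores[true_class]
    return sum(max(0.0, s - correct + margin)
               for i, s in enumerate(scores) if i != true_class)

# Hypothetical scores for three relation classes, e.g.
# (no-DDA, comorbidity, complication), with class 1 being correct.
loss_ok  = large_margin_loss([0.2, 3.1, 1.5], true_class=1)  # margins satisfied
loss_bad = large_margin_loss([2.9, 3.1, 3.0], true_class=1)  # margins violated
```

Minimising this loss instead of cross-entropy is what turns the CNN into the large-margin variant: confident, well-separated predictions incur zero loss, while near-ties are penalised.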


2021 ◽  
Vol 11 (16) ◽  
pp. 7181
Author(s):  
Jakub Caputa ◽  
Daria Łukasik ◽  
Maciej Wielgosz ◽  
Michał Karwatowski ◽  
Rafał Frączek ◽  
...  

We present experimental results of using the YOLOv3 neural network architecture to automatically detect tumor cells in cytological samples taken from the skin of canines. A rich dataset of 1219 smeared sample images with 28,149 objects was gathered and annotated by a veterinary doctor to perform the experiments. It covers three types of common round cell neoplasms: mastocytoma, histiocytoma, and lymphoma. The dataset has been thoroughly described in the paper and is publicly available. The YOLOv3 neural network architecture was trained using various schemes involving modifications of the original dataset and different model parameters. The experiments showed that the prototype model achieved 0.7416 mAP, which outperforms state-of-the-art machine learning results and human estimates. We also provide a series of analyses that may facilitate ML-based solutions by casting more light on some aspects of their performance, and we present the main discrepancies between ML-based and human-based diagnoses. This outline may help depict the scenarios in which automated tools may support the diagnosis process.


2020 ◽  
Vol 10 (17) ◽  
pp. 5988
Author(s):  
Saleh Albahli ◽  
Fatimah Alhassan ◽  
Waleed Albattah ◽  
Rehan Ullah Khan

Neural networks have several useful applications in machine learning. However, benefiting from a neural-network architecture can be tricky in some instances due to the large number of parameters that can influence performance. In general, given a particular dataset, a data scientist cannot do much to improve the efficiency of the model itself. However, by tuning certain hyperparameters, the model’s accuracy and execution time can be improved. Hence, it is of utmost importance to select the optimal values of hyperparameters, which requires experience and mastery of the machine learning paradigm. In this paper, neural network-based architectures are tested by altering the values of hyperparameters for handwritten digit recognition. Various neural network-based models are used to analyze different aspects of the problem, primarily accuracy as a function of hyperparameter values. The extensive experimental setup in this article should therefore provide the most accurate and time-efficient solution models. Such an evaluation will help in selecting the optimized values of hyperparameters for similar tasks.
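The kind of systematic hyperparameter sweep the abstract describes can be sketched as an exhaustive grid search: every combination of candidate values is evaluated and the best-scoring one is kept. The grid, the parameter names, and the `mock_accuracy` function below are hypothetical stand-ins; in the paper's setting, evaluation would mean training the digit-recognition network and measuring validation accuracy.

```python
from itertools import product

def grid_search(grid, evaluate):
    """Exhaustively try every hyperparameter combination; keep the best."""
    best_params, best_score = None, float("-inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def mock_accuracy(params):
    """Stand-in for train-plus-validate: pretends accuracy peaks at a
    learning rate of 0.01 and dips slightly for large batches."""
    return (0.9 - abs(params["learning_rate"] - 0.01)
            - 0.001 * (params["batch_size"] == 256))

grid = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [64, 256]}
best, score = grid_search(grid, mock_accuracy)
```

Grid search is the simplest such strategy; its cost grows multiplicatively with each added hyperparameter, which is why extensive experimentation like the paper's is needed to narrow the ranges worth sweeping.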

