Multitask learning for Transformers with application to large-scale single-cell transcriptomes

Mapping Intimacies ◽

10.1101/2020.02.05.935239 ◽

2020 ◽

Author(s):

Minxing Pang ◽

Jesper Tegnér

Keyword(s):

Single Cell ◽

Large Scale ◽

Single Cell Analysis ◽

Brain Atlas ◽

Biological Knowledge ◽

Data Sets ◽

Cell Analysis ◽

Large Scale Data ◽

Components Analysis ◽

Scale Data

Recent progress in machine learning provides competitive methods for many traditional bioinformatics topics, such as transcriptome sequencing and single-cell analysis. However, discovering biomedical correlations among cells across large-scale data sets remains challenging. Our attention-based neural network module with 300 million parameters is able to capture biological knowledge in a data-driven way. The module provides high-quality embedding, taxonomy analysis and similarity measurement. We tested the model on the Mouse Brain Atlas, which consists of 160,000 cells and 25,000 genes. Our module produced findings that have been verified by biologists and outperformed an autoencoder and principal components analysis in benchmarks.

Faculty Opinions recommendation of Comparative assessment of large-scale data sets of protein-protein interactions.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1006598.82257 ◽

2002 ◽

Author(s):

Rob Russell

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Comparative Assessment ◽

Data Sets ◽

Protein Protein Interactions ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

Pattern Recognition in Large-Scale Data Sets: Application in Integrated Circuit Manufacturing

Big Data Analytics - Lecture Notes in Computer Science ◽

10.1007/978-3-319-03689-2_13 ◽

2013 ◽

pp. 185-196 ◽

Cited By ~ 1

Author(s):

Choudur K. Lakshminarayan ◽

Michael I. Baron

Keyword(s):

Pattern Recognition ◽

Integrated Circuit ◽

Large Scale ◽

Data Sets ◽

Integrated Circuit Manufacturing ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

Discovering Latent Class Labels for Multi-Label Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/423 ◽

2020 ◽

Author(s):

Jun Huang ◽

Linchuan Xu ◽

Jing Wang ◽

Lei Feng ◽

Kenji Yamanishi

Keyword(s):

Large Scale ◽

Latent Class ◽

Training Data ◽

Data Sets ◽

Robust Learning ◽

Large Scale Data ◽

Novel Approach ◽

Fixed Set ◽

Class Labels ◽

Scale Data

Existing multi-label learning (MLL) approaches mainly assume all the labels are observed and construct classification models with a fixed set of target labels (known labels). However, in some real applications, multiple latent labels may exist outside this set and hide in the data, especially for large-scale data sets. Discovering and exploring the latent labels hidden in the data may not only find interesting knowledge but also help us to build a more robust learning model. In this paper, a novel approach named DLCL (i.e., Discovering Latent Class Labels for MLL) is proposed which can not only discover the latent labels in the training data but also predict new instances with the latent and known labels simultaneously. Extensive experiments show a competitive performance of DLCL against other state-of-the-art MLL approaches.

WebViz: A Web-Based Collaborative Interactive Visualization System for Large-Scale Data Sets

Lecture Notes in Earth System Sciences - GPU Solutions to Multi-scale Problems in Science and Engineering ◽

10.1007/978-3-642-16405-7_37 ◽

2013 ◽

pp. 587-606 ◽

Cited By ~ 2

Author(s):

Yichen Zhou ◽

Robin M. Weiss ◽

Elizabeth McArthur ◽

David Sanchez ◽

Xiang Yao ◽

...

Keyword(s):

Large Scale ◽

Interactive Visualization ◽

Data Sets ◽

Web Based ◽

Visualization System ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

Artificial Neural Network Models for Large-Scale Data

Advances in Data Mining and Database Management - Handbook of Research on Big Data and the IoT ◽

10.4018/978-1-5225-7432-3.ch022 ◽

2019 ◽

pp. 406-439

Author(s):

Vo Ngoc Phu ◽

Vo Thi Ngoc Tran

Keyword(s):

Large Scale ◽

Network Models ◽

Data Sets ◽

Neural Network Models ◽

Large Scale Data ◽

The World ◽

Commercial Applications ◽

Artificial Neural Network Models ◽

Scale Data ◽

Large Scale Data Sets

Artificial intelligence (ARTINT) and information have been prominent fields for many years. One reason is that many different areas have advanced quickly on the basis of ARTINT and information, creating significant value over that time. This value has increasingly been used by national economies around the world, other sciences, companies, organizations, etc. Many massive corporations and big organizations have been established rapidly as these economies have developed strongly. Unsurprisingly, these corporations and organizations have generated large amounts of information and large-scale data sets. Processing and storing them successfully has been a major challenge for many commercial applications, studies, etc. To handle this problem, many algorithms have been proposed for processing these big data sets.

Q-Learning with Fisher Score for Feature Selection of Large-Scale Data Sets

Knowledge Science, Engineering and Management - Lecture Notes in Computer Science ◽

10.1007/978-3-030-82147-0_25 ◽

2021 ◽

pp. 306-318

Author(s):

Min Gan ◽

Li Zhang

Keyword(s):

Feature Selection ◽

Large Scale ◽

Data Sets ◽

Fisher Score ◽

Q Learning ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets ◽

Selection Of

Collection, Storage, Protection, and Sharing Issues With Large-Scale Data Sets

10.4135/9781473999053 ◽

2017 ◽

Author(s):

Shirley M. Matteson ◽

Sonya E. Sherrod ◽

Sevket Ceyhun Cetin

Keyword(s):

Large Scale ◽

Data Sets ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

On the Effectiveness of Hybrid Canopy with Hoeffding Adaptive Naive Bayes Trees

International Journal of Applied Evolutionary Computation ◽

10.4018/ijaec.2017040102 ◽

2017 ◽

Vol 8 (2) ◽

pp. 30-43

Author(s):

Mrutyunjaya Panda

Keyword(s):

Big Data ◽

Clustering Analysis ◽

Large Scale ◽

Data Sets ◽

Recent Past ◽

Large Scale Data ◽

Huge Data ◽

With Memory ◽

Memory Constraints ◽

Scale Data

Big Data, due to its complicated and diverse nature, poses many challenges for extracting meaningful observations. This calls for smart and efficient algorithms that can cope with the computational complexity and memory constraints arising from their iterative behavior. The issue may be addressed with parallel computing techniques, where a single machine or multiple machines perform the work simultaneously, dividing the problem into subproblems and assigning private memory to each subproblem. Clustering analysis has been found useful for handling such huge data in the recent past. Although many investigations in Big Data analysis are under way, to address this issue Canopy and K-Means++ clustering are used to process large-scale data in a shorter amount of time with no memory constraints. To assess the suitability of the approach, several data sets are considered, ranging from small to very large and drawn from diverse fields of application. The experimental results indicate that the proposed approach is fast and accurate.
