document mining Latest Research Papers

Abstract: Mass spectrometry methods of peptide identification involve comparing observed tandem spectra with in-silico derived spectrum models. Presented here is a proteomics search engine that offers a new variation of the standard approach, with improved results. The proposed method employs information theory and probabilistic information retrieval on a pre-computed and indexed fragmentation database, generating a peptide-to-spectrum match (PSM) score modeled on fragment ion frequency. As a result, the direct application of modern document mining allows for treating the collection of peptides as a corpus and the corresponding fragment ions as indexable words, leveraging ready-built search engines and common predefined ranking algorithms. Fast and accurate PSMs are achieved, yielding a 5-10% higher rate of peptide identities than current database mining methods. Immediate applications of this search engine are aimed at identifying peptides from large sequence databases consisting of homologous proteins with minor sequence variations, such as the genetic variation expected in the human population.
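
The corpus analogy in this abstract can be illustrated with a minimal sketch. It is not the authors' engine: the abstract does not state how fragment ions become words or which ranking function is used, so the m/z binning in `tokenize`, the IDF-style weight in `score_spectrum`, and the toy peptide names and m/z values are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def tokenize(mz_values, bin_width=1.0):
    """Turn fragment m/z values into discrete 'words' by binning (assumed scheme)."""
    return {f"mz{int(mz // bin_width)}" for mz in mz_values}

def build_index(peptide_fragments, bin_width=1.0):
    """Inverted index: fragment word -> set of peptide ids that contain it."""
    index = defaultdict(set)
    for peptide, mzs in peptide_fragments.items():
        for word in tokenize(mzs, bin_width):
            index[word].add(peptide)
    return index

def score_spectrum(observed_mz, index, n_peptides, bin_width=1.0):
    """Rank peptides by the summed IDF-style weight of matched fragment words."""
    scores = defaultdict(float)
    for word in tokenize(observed_mz, bin_width):
        postings = index.get(word, ())
        if not postings:
            continue
        # Rare fragment words count more than common ones (assumed weighting).
        idf = math.log(1 + n_peptides / len(postings))
        for peptide in postings:
            scores[peptide] += idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy corpus: peptide id -> predicted fragment m/z values.
corpus = {
    "PEPA": [100.2, 200.5, 300.7],
    "PEPB": [100.4, 250.1, 350.9],
    "PEPC": [400.3, 500.6],
}
index = build_index(corpus)
ranked = score_spectrum([100.3, 200.6, 300.1], index, len(corpus))
```

Because the index is built once, each query only visits the posting lists of the words it contains, which is the property that makes a pre-computed database fast to search.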

Discriminant Pearson Correlative Feature Selection based Gentle Adaboost Classification for Medical Document Mining

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c5391.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 3777-3783

Keyword(s):

Feature Selection ◽

False Positive Rate ◽

Significant Feature ◽

Correlative Analysis ◽

Pubmed Database ◽

Search Processes ◽

Positive Rate ◽

Document Mining ◽

Medical Document ◽

Selection Of

This paper examines Discriminant Pearson Correlative Analysis Based Multivariate Gentle Adaboost Classification (DPCA-MGAC), which is used to improve the performance of medical document mining with minimal time complexity. A large number of documents are collected from the PubMed database through semantic-based search. Preprocessing steps are then carried out: stop-word removal, stemming, feature identification, and feature selection, i.e., choosing the relevant keywords for document classification. Significant features are selected using DPCA, and the documents are then categorized into classes using MGAC on the selected features. The classification step combines the results of all weak learners into a strong classifier in order to improve the precision of medical data mining and minimize the false positive rate. Experimental evaluation was performed on the PubMed database.
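
The feature-selection step can be sketched roughly as follows. The abstract does not define DPCA, so this uses the plain Pearson correlation between a term's presence and a binary class label as a stand-in; stemming and the boosting classifier are omitted, and the function names, stop-word list and toy documents are hypothetical.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient; returns 0.0 for constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_features(docs, labels, stop_words, top_k):
    """Keep the top_k terms whose presence correlates most strongly with the label."""
    token_sets = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*token_sets) - stop_words)  # stop-word removal
    scored = []
    for term in vocab:
        presence = [1.0 if term in toks else 0.0 for toks in token_sets]
        scored.append((abs(pearson(presence, labels)), term))
    # Strongest correlation first; ties broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [term for _, term in scored[:top_k]]

# Hypothetical toy data: label 1 = oncology, label 0 = other.
docs = [
    "the tumor biopsy showed carcinoma",
    "carcinoma of the lung tumor",
    "the patient has seasonal flu",
    "flu vaccine for the patient",
]
labels = [1, 1, 0, 0]
selected = select_features(docs, labels, {"the", "of", "for", "has"}, top_k=4)
```

Taking the absolute value keeps terms that are strongly indicative of either class, which suits a two-class setting; a multi-class variant would need one score per class.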

The study of Economic Complexity and GDP Effect on Inflation Rate and Income Inequality in Persian Gulf States 2002 -2015

Mapta Journal of Mechanical and Industrial Engineering (MJMIE) ◽

10.33544/mjmie.v2i3.83 ◽

2018 ◽

Vol 2 (3) ◽

pp. 31-39

Author(s):

Leila Sharif Moghadasi

Keyword(s):

Income Inequality ◽

Persian Gulf ◽

Inflation Rate ◽

Statistical Population ◽

Descriptive Research ◽

Economic Complexity ◽

Correlational Research ◽

Negative Effect ◽

Gulf States ◽

Document Mining

The purpose of this study was to examine the effect of economic complexity and gross domestic product (GDP) on inflation rate and income inequality between 2002 and 2015. The statistical population is the Persian Gulf states; the independent variables are economic complexity and GDP, and the dependent variables are inflation rate and income inequality. The study is applied research, essentially descriptive, and correlational in terms of methodology. The theoretical literature and research background were gathered using the library method, and the research data were collected using the document mining method. Descriptive and inferential statistics were used to describe and summarize the collected data. First, variance heterogeneity pre-tests, the F-Limer test, the Hausman test and the Jarque-Bera test were used to analyze the data; then multivariate regression in EViews was used to confirm or reject the research hypotheses. The results show that economic complexity and GDP have a positive and significant effect on inflation rate, while economic complexity and GDP have a negative effect on income inequality.

An Efficient Pharse Based Pattern Taxonomy Deploying Method for Text Document Mining

International Journal of Trend in Scientific Research and Development ◽

10.31142/ijtsrd11270 ◽

2018 ◽

Vol 2 (3) ◽

pp. 1375-1383

Author(s):

S. Brindha ◽

Dr. S. Sukumaran ◽

Keyword(s):

Text Document ◽

Document Mining

A Novel Approach on Document Mining Using Dirichlet Process Mixture Models and Its Kernels

SSRN Electronic Journal ◽

10.2139/ssrn.3216381 ◽

2018 ◽

Author(s):

Ratnam Dodda ◽

Dr. A. Suresh Babu

Keyword(s):

Mixture Models ◽

Dirichlet Process ◽

Dirichlet Process Mixture ◽

Novel Approach ◽

Dirichlet Process Mixture Models ◽

Document Mining

Analysis and research of document mining based on Chinese sentence structure database

Proceedings of the 2nd International Conference on Big Data Research - ICBDR 2018 ◽

10.1145/3291801.3291806 ◽

2018 ◽

Author(s):

Jiahao Li ◽

Jingchang Pan ◽

Wu Minglei

Keyword(s):

Sentence Structure ◽

Structure Database ◽

Document Mining

Time-Varying Dynamic Topic Model

Journal of Global Information Management ◽

10.4018/jgim.2018010106 ◽

2018 ◽

Vol 26 (1) ◽

pp. 104-119 ◽

Cited By ~ 1

Author(s):

Jun Han ◽

Yu Huang ◽

Kuldeep Kumar ◽

Sukanto Bhattacharya

Keyword(s):

Parameter Estimation ◽

Topic Model ◽

Global Information ◽

Time Varying ◽

Dynamic Features ◽

Promising Tool ◽

Prior Literature ◽

Mining Methods ◽

Document Mining

In this paper, the authors build on prior literature to develop an adaptive and time-varying metadata-enabled dynamic topic model (mDTM) and apply it to a large Weibo dataset, using an online Gibbs sampler for parameter estimation. Their approach simultaneously captures the maximum number of inherent dynamic features of microblogs, thereby setting it apart from other online document mining methods in the extant literature. In summary, the authors' results show that mDTM yields better-quality mined information than prior research, and they showcase mDTM as a promising tool for the effective mining of microblogs in a rapidly changing global information space.
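
For readers unfamiliar with Gibbs sampling for topic models, the sketch below shows a collapsed Gibbs sampler for plain, static LDA. It is not mDTM: the time-varying and metadata components and the online update described in the abstract are not reproduced, and the function name, hyperparameter defaults and toy posts are assumptions.

```python
import random

def lda_gibbs(docs, n_topics, n_iters=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for static LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2i = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w2i[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, z = w2i[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wi] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wi] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wi] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab

# Hypothetical toy microblog posts, already tokenized.
posts = [
    ["rain", "storm", "flood", "rain"],
    ["match", "goal", "team", "goal"],
    ["storm", "flood", "warning"],
]
doc_topic, topic_word, vocab = lda_gibbs(posts, n_topics=2)
```

A dynamic model would additionally tie the topic-word counts of one time slice to those of the previous slice; that coupling is what the static sketch leaves out.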

document mining
Recently Published Documents

Machine learning approach for categorical document mining

Literature Survey on Aircraft Maintenance Issues with Human Errors and Skill Set Mismatch Using Document Mining Technique

Multimedia Document Mining using Sequential Multimedia Feature Patterns

A Pre-computed Probabilistic Molecular Search Engine for Tandem Mass Spectrometry Proteomics

Discriminant Pearson Correlative Feature Selection based Gentle Adaboost Classification for Medical Document Mining

The study of Economic Complexity and GDP Effect on Inflation Rate and Income Inequality in Persian Gulf States 2002 -2015

An Efficient Pharse Based Pattern Taxonomy Deploying Method for Text Document Mining

A Novel Approach on Document Mining Using Dirichlet Process Mixture Models and Its Kernels

Analysis and research of document mining based on Chinese sentence structure database

Time-Varying Dynamic Topic Model
