document mining
Recently Published Documents


Total documents: 31 (five years: 2)
H-index: 6 (five years: 0)

2021, pp. 53-64
Author(s): T. Nanthakumaran Thulasy, Puteri N. E. Nohuddin, Norlizawati Abd Rahim, Astuty Amrin

2020
Author(s): Jeff Jones

Mass spectrometry methods of peptide identification involve comparing observed tandem spectra with in-silico derived spectrum models. Presented here is a proteomics search engine that offers a new variation of the standard approach, with improved results. The proposed method applies information theory and probabilistic information retrieval to a pre-computed, indexed fragmentation database, generating a peptide-to-spectrum match (PSM) score modeled on fragment ion frequency. Applying modern document mining directly in this way allows the collection of peptides to be treated as a corpus and the corresponding fragment ions as indexable words, leveraging ready-built search engines and common predefined ranking algorithms. Fast and accurate PSMs are achieved, yielding a 5-10% higher rate of peptide identifications than current database mining methods. Immediate applications of this search engine are aimed at identifying peptides from large sequence databases consisting of homologous proteins with minor sequence variations, such as the genetic variation expected in the human population.
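
The corpus analogy in this abstract can be illustrated with a minimal sketch. It is not the author's search engine: it assumes, for illustration only, that fragment ions are represented as binned m/z tokens (the `mz_...` strings), that the peptide names and fragment values are made-up toy data, and that a standard BM25 ranking stands in for the "common predefined ranking algorithms" the abstract mentions.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: each peptide is a "document" whose "words" are
# binned fragment m/z tokens. Names and values are illustrative only.
CORPUS = {
    "PEPTIDE_A": ["mz_227", "mz_324", "mz_90", "mz_205"],
    "PEPTIDE_B": ["mz_227", "mz_324", "mz_147", "mz_262"],
    "PEPTIDE_C": ["mz_201", "mz_314", "mz_175", "mz_246"],
}


def build_index(corpus):
    """Build an inverted index: token -> {peptide: term frequency}."""
    index = defaultdict(dict)
    for peptide, tokens in corpus.items():
        for token, count in Counter(tokens).items():
            index[token][peptide] = count
    return index


def bm25_rank(query_tokens, corpus, index, k1=1.2, b=0.75):
    """Rank peptides against an observed spectrum using BM25 scores."""
    n_docs = len(corpus)
    avg_len = sum(len(tokens) for tokens in corpus.values()) / n_docs
    scores = defaultdict(float)
    for token in set(query_tokens):
        postings = index.get(token, {})
        if not postings:
            continue  # observed peak matches no indexed fragment
        df = len(postings)
        # Rare fragments get a higher weight than common ones.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for peptide, tf in postings.items():
            length_norm = 1 - b + b * len(corpus[peptide]) / avg_len
            scores[peptide] += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = build_index(CORPUS)
# Hypothetical observed spectrum, already binned; "mz_500" is an unmatched peak.
observed = ["mz_227", "mz_324", "mz_147", "mz_262", "mz_500"]
ranked = bm25_rank(observed, CORPUS, index)
for peptide, score in ranked:
    print(f"{peptide}: {score:.3f}")
```

In this toy case the fragments shared by two peptides contribute less to the score than the fragments unique to one peptide, which is the frequency-based weighting the abstract describes in general terms.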


2019, Vol 8 (3), pp. 3777-3783

This paper examines Discriminant Pearson Correlative Analysis Based Multivariate Gentle Adaboost Classification (DPCA-MGAC), which is used to improve the performance of medical document mining with minimal time complexity. A large number of documents are collected from PubMed databases through semantic-based search. Preprocessing covers stop word removal, stemming, feature identification, and feature selection, i.e., choosing the relevant keywords for document classification. The significant features are selected using DPCA, and with the selected features the documents are categorized into different classes using MGAC. This classification process combines the results of all weak learners into a strong classifier in order to improve the precision of medical data mining and minimize the false positive rate. Experimental evaluation was performed using the PubMed database.
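
The feature selection step can be sketched in a simplified form. This is not DPCA-MGAC itself: it assumes a plain Pearson correlation between term counts and a binary class label as the relevance score, uses a tiny illustrative stop word list, omits stemming and the boosting classifier, and runs on made-up documents and labels.

```python
import math
from collections import Counter

# Tiny illustrative stop word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "of", "in", "and", "a", "with", "for", "is"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def select_features(docs, labels, top_k):
    """Keep the top_k terms whose counts correlate most strongly with the label."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    scored = [(abs(pearson([c[term] for c in counts], labels)), term)
              for term in vocab]
    # Sort by descending correlation strength, then alphabetically for ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_k]]


# Hypothetical toy documents: label 1 = diabetes-related, 0 = oncology-related.
docs = [
    "insulin therapy in diabetes patients",
    "glucose control with insulin",
    "tumor growth and chemotherapy",
    "chemotherapy response of the tumor",
]
labels = [1, 1, 0, 0]
selected = select_features(docs, labels, top_k=3)
print(selected)
```

The selected keywords would then feed a classifier; the abstract uses a Gentle AdaBoost variant for that stage, which this sketch does not attempt to reproduce.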


2018, Vol 2 (3), pp. 31-39
Author(s): Leila Sharif Moghadasi

The purpose of this study was to examine the effect of economic complexity and gross domestic product (GDP) on inflation rate and income inequality between 2002 and 2015. The statistical population of this research is the Persian Gulf states; the independent variables are economic complexity and GDP, and the dependent variables are inflation rate and income inequality. The research is applied and essentially descriptive, and in terms of methodology it is correlational. The theoretical literature and research background were compiled using the library method, and the research data were collected using the document mining method. Descriptive and inferential statistics were used to describe and summarize the collected data. First, variance heterogeneity (heteroskedasticity) pre-tests, the F-Limer test, the Hausman test and the Jarque-Bera test were used to analyze the data; then multivariate regression in EViews software was used to confirm or reject the research hypotheses. The results show that economic complexity and GDP have a positive and significant effect on inflation rate, while economic complexity and GDP have a negative effect on income inequality.


2018, Vol 26 (1), pp. 104-119
Author(s): Jun Han, Yu Huang, Kuldeep Kumar, Sukanto Bhattacharya

In this paper the authors build on prior literature to develop an adaptive and time-varying metadata-enabled dynamic topic model (mDTM) and apply it to a large Weibo dataset, using an online Gibbs sampler for parameter estimation. Their approach simultaneously captures the maximum number of inherent dynamic features of microblogs, which sets it apart from other online document mining methods in the extant literature. In summary, the authors' results show that mDTM outperforms prior research in terms of the quality of the mined information and showcase mDTM as a promising tool for the effective mining of microblogs in a rapidly changing global information space.

