MixMir: microRNA motif discovery from gene expression data using mixed linear models

Liyang Diao; Antoine Marcais; Scott Norton; Kevin C. Chen

doi:10.1093/nar/gku672

Nucleic Acids Research ◽

10.1093/nar/gku672 ◽

2014 ◽

Vol 42 (17) ◽

pp. e135-e135 ◽

Cited By ~ 11

Author(s):

Liyang Diao ◽

Antoine Marcais ◽

Scott Norton ◽

Kevin C. Chen

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Motif Discovery ◽

Linear Models ◽

Expression Data ◽

Mixed Linear Models

MixMir: microRNA motif discovery from gene expression data using mixed linear models

10.1101/004010 ◽

2014 ◽

Author(s):

Liyang Diao ◽

Antoine Marcais ◽

Scott Norton ◽

Kevin C. Chen

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Motif Discovery ◽

Linear Models ◽

Developmental Stages ◽

Sequence Similarity ◽

Association Studies ◽

Genome Wide Association Studies ◽

Expression Data ◽

Mixed Linear Models

MicroRNAs (miRNAs) are a class of ~22 nt non-coding RNAs that potentially regulate over 60% of human protein-coding genes. MiRNA activity is highly specific, differing between cell types, developmental stages and environmental conditions, so the identification of active miRNAs in a given sample is of great interest. Here we present a novel computational approach for analyzing both mRNA sequence and gene expression data, called MixMir. Our method corrects for 3' UTR background sequence similarity between transcripts, which is known to correlate with mRNA transcript abundance. We demonstrate that after accounting for k-mer sequence similarities in 3' UTRs, a statistical linear model based on motif presence/absence can effectively discover active miRNAs in a sample. MixMir utilizes fast software implementations for solving mixed linear models, which are widely used in genome-wide association studies (GWAS). Essentially, we use 3' UTR sequence similarity in place of population cryptic relatedness in the GWAS problem. Compared to similar methods such as miREDUCE, Sylamer and cWords, we found that MixMir performed better at discovering true miRNA motifs in Dicer knockout CD4+ T-cells, as well as in protein and mRNA expression data obtained from miRNA transfection experiments in human cell lines. MixMir can be freely downloaded from https://github.com/ldiao/MixMir.
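The idea in the abstract above can be illustrated with a minimal Python sketch. It is not the MixMir implementation: the function names, the choice of k, the toy sequences and the fixed variance ratio `delta` are all assumptions made for illustration. It builds a kinship-like similarity matrix from 3' UTR k-mer counts and then estimates the effect of motif presence/absence on expression by generalized least squares, with the similarity matrix entering the covariance. Real mixed-model software estimates the variance components from the data rather than fixing them.

```python
from itertools import product

import numpy as np


def kmer_profile(seq, k=2):
    """Count overlapping occurrences of every DNA k-mer in seq."""
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing non-ACGT characters
            counts[j] += 1
    return counts


def similarity_matrix(utrs, k=2):
    """Kinship-like matrix from standardized k-mer counts (illustrative choice)."""
    X = np.array([kmer_profile(s, k) for s in utrs])
    X = X - X.mean(axis=0)
    sd = X.std(axis=0)
    X = X[:, sd > 0] / sd[sd > 0]  # drop k-mers that do not vary
    return X @ X.T / X.shape[1]


def motif_effect(y, has_motif, K, delta=1.0):
    """GLS estimate of the motif effect under covariance V = K + delta * I.

    delta is fixed here as an assumption; mixed-model solvers estimate it.
    """
    n = len(y)
    V = K + delta * np.eye(n)
    D = np.column_stack([np.ones(n), has_motif])  # intercept + motif indicator
    Vinv_D = np.linalg.solve(V, D)
    Vinv_y = np.linalg.solve(V, y)
    beta = np.linalg.solve(D.T @ Vinv_D, D.T @ Vinv_y)
    return beta[1]


# Toy data: six hypothetical 3' UTRs, a hypothetical motif, made-up expression.
utrs = ["ACGTACGTAA", "ACGTTTGCAA", "GGGCACGTAC",
        "TTTTGCACCC", "GCACGGGTTT", "AAAACCCGGG"]
has_motif = np.array([float("GCAC" in u) for u in utrs])
y = np.array([2.1, 1.9, 0.4, 0.6, 0.5, 2.0])
K = similarity_matrix(utrs, k=2)
print(motif_effect(y, has_motif, K))
```

In this toy example a negative estimate would indicate that transcripts carrying the motif tend to be less abundant after accounting for background sequence similarity, which is the pattern one would expect for an active miRNA.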

SArKS: de novo discovery of gene expression regulatory motifs and domains by suffix array kernel smoothing

10.1101/133934 ◽

2017 ◽

Author(s):

Dennis Wylie ◽

Hans A. Hofmann ◽

Boris V. Zemelman

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Gene Expression Data ◽

Motif Discovery ◽

Kernel Smoothing ◽

De Novo ◽

Supplementary Information ◽

Spatial Proximity ◽

Regulatory Sequences ◽

Expression Data

Abstract. Motivation: We set out to develop an algorithm that can mine differential gene expression data to identify candidate cell type-specific DNA regulatory sequences. Differential expression is usually quantified as a continuous score (fold-change, test statistic, or p-value) comparing biological classes. Unlike existing approaches, our de novo strategy, termed SArKS, applies nonparametric kernel smoothing to uncover promoter motifs that correlate with elevated differential expression scores. SArKS detects motifs by smoothing sequence scores over sequence similarity. A second round of smoothing over spatial proximity reveals multi-motif domains (MMDs). Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing. Results: We applied SArKS to published gene expression data representing distinct neocortical neuron classes in M. musculus and interneuron developmental states in H. sapiens. When benchmarked against several existing algorithms for correlative motif discovery using a cross-validation procedure, SArKS identified larger motif sets that formed the basis for regression models with higher correlative power. Availability: https://github.com/denniscwylie/[email protected] Supplementary information: appended to document.
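The first smoothing step described in this abstract can be sketched roughly as follows. This is a simplified illustration and not the SArKS code: the function name, the naive suffix sort, the uniform kernel and the edge padding are all assumptions. Sequences are concatenated, their suffixes are sorted so that suffixes sharing a long prefix become neighbours, and each suffix's score (inherited from its source sequence) is averaged with those of its neighbours in the sorted order.

```python
import numpy as np


def suffix_array_smooth(seqs, scores, half_width=1):
    """Average per-sequence scores over lexicographically adjacent suffixes."""
    # "$" separates sequences; it sorts before A, C, G and T.
    text = "$".join(seqs) + "$"
    # owner[i] = index of the sequence that text position i belongs to
    owner = []
    for idx, s in enumerate(seqs):
        owner.extend([idx] * (len(s) + 1))
    # Naive suffix array over non-separator positions (fine for toy inputs).
    positions = [i for i in range(len(text)) if text[i] != "$"]
    sa = sorted(positions, key=lambda i: text[i:])
    raw = np.array([scores[owner[i]] for i in sa], dtype=float)
    # Uniform kernel of width 2 * half_width + 1, with edge values repeated.
    window = 2 * half_width + 1
    padded = np.pad(raw, half_width, mode="edge")
    smoothed = np.convolve(padded, np.ones(window) / window, mode="valid")
    return sa, smoothed


# Toy usage: three hypothetical promoter sequences with made-up scores.
seqs = ["ACGT", "CGTA", "TTAC"]
sa, smoothed = suffix_array_smooth(seqs, [1.0, 0.8, -0.5], half_width=1)
for pos, value in zip(sa, smoothed):
    print(pos, round(float(value), 3))
```

Positions whose smoothed score is high mark suffixes that start with a substring shared by several high-scoring sequences, which is the kind of candidate motif the abstract describes. The second smoothing round over spatial proximity and the permutation testing are not shown here.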

IntLIM: integration using linear models of metabolomics and gene expression data

BMC Bioinformatics ◽

10.1186/s12859-018-2085-6 ◽

2018 ◽

Vol 19 (1) ◽

Cited By ~ 13

Author(s):

Jalal K. Siddiqui ◽

Elizabeth Baskin ◽

Mingrui Liu ◽

Carmen Z. Cantemir-Stone ◽

Bofei Zhang ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Linear Models ◽

Expression Data

Cancer Classification from Gene Expression data using Fuzzy-Rough techniques: An Empirical Study

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i6.415420 ◽

2018 ◽

Vol 6 (6) ◽

pp. 415-420

Author(s):

Ansuman Kumar ◽

Anindya Halder

Keyword(s):

Gene Expression ◽

Empirical Study ◽

Gene Expression Data ◽

Cancer Classification ◽

Expression Data

Statistical methods for analysis of time course gene expression data

Frontiers in Bioscience ◽

10.2741/a743 ◽

2002 ◽

Vol 7 (1) ◽

pp. a90-98 ◽

Cited By ~ 5

Author(s):

Hongzhe Li

Keyword(s):

Gene Expression ◽

Statistical Methods ◽

Gene Expression Data ◽

Time Course ◽

Expression Data

Faculty Opinions recommendation of A new type of stochastic dependence revealed in gene expression data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1032265.370760 ◽

2006 ◽

Author(s):

Arcady Mushegian

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Expression Data ◽

Stochastic Dependence ◽

New Type

Faculty Opinions recommendation of A systematic comparison and evaluation of biclustering methods for gene expression data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1087930.540878 ◽

2007 ◽

Author(s):

Daniel Chamovitz

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Expression Data ◽

Systematic Comparison

Faculty Opinions recommendation of CAERUS: predicting CAncER oUtcomeS using relationship between protein structural information, protein networks, gene expression data, and mutation data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.11029956.11977055 ◽

2011 ◽

Author(s):

Yuanpeng Janet Huang

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Structural Information ◽

Protein Networks ◽

Expression Data ◽

Cancer Outcomes ◽

Mutation Data

Faculty Opinions recommendation of Discovery of agents that eradicate leukemia stem cells using an in silico screen of public gene expression data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1116793.572870 ◽

2008 ◽

Author(s):

Laura Haneline

Keyword(s):

Gene Expression ◽

Stem Cells ◽

Gene Expression Data ◽

In Silico ◽

Leukemia Stem Cells ◽

Expression Data

Faculty Opinions recommendation of Discovery of agents that eradicate leukemia stem cells using an in silico screen of public gene expression data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1116793.1165055 ◽

2010 ◽

Author(s):

Anthony D Ho ◽

Eike C Buss ◽

Patrick Wuchter

Keyword(s):

Gene Expression ◽

Stem Cells ◽

Gene Expression Data ◽

In Silico ◽

Leukemia Stem Cells ◽

Expression Data
