miRmedon: confident detection of microRNA editing

Mapping Intimacies ◽

10.1101/774661 ◽

2019 ◽

Author(s):

Amitai Mordechai ◽

Alal Eran

Keyword(s):

Cancer Progression ◽

Large Scale ◽

Complex Processes ◽

Mirna Genes ◽

Multiple Loci ◽

Rnaseq Data ◽

Error Distributions ◽

Improved Performance ◽

Mirna Editing ◽

Small Rnaseq

Summary: microRNAs (miRNAs), key regulators of gene expression, are prime targets for adenosine deaminase acting on RNA (ADAR) enzymes. Although ADAR-mediated A-to-I miRNA editing has been shown to be essential for orchestrating complex processes, including neurodevelopment and cancer progression, only a few human miRNA editing sites have been reported. Several computational approaches have been developed for the detection of miRNA editing in small RNAseq data, all based on the identification of systematic mismatches of ‘G’ at primary adenosine sites in known miRNA sequences. However, these methods have several limitations: they can detect only one editing site per sequence (although editing of multiple sites per miRNA has been reproducibly validated), they focus on uniquely mapping reads (although 20% of human miRNAs are transcribed from multiple loci), and they cannot detect editing in miRNA genes harboring genomic variants (although 73% of human miRNA loci include a reported SNP or indel). To overcome these limitations, we developed miRmedon, which leverages large-scale human variation data, a combination of local and global alignments, and a comparison of the inferred editing and error distributions for confident detection of miRNA editing in small RNAseq data. We demonstrate its improved performance compared with currently available methods and describe its advantages. Availability and implementation: Python source code is available at https://github.com/Amitai88/[email protected]
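The mismatch-based idea that the summary attributes to existing methods (counting ‘G’ calls at reference adenosine positions) can be sketched as follows. This is only an illustration under simplifying assumptions, not miRmedon's implementation: the function name `a_to_g_counts` is hypothetical, reads are assumed to be ungapped and already aligned to the first base of the reference, and sequences use the DNA alphabet as in sequencing output.

```python
def a_to_g_counts(reference, reads):
    """For each adenosine in the reference, return (G count, coverage).

    Assumes every read starts at reference position 0 with no gaps
    (a simplification made for this sketch only).
    """
    counts = {}
    for pos, base in enumerate(reference):
        if base != "A":
            continue  # only adenosines can show A-to-I (read as A-to-G) editing
        covered = [read[pos] for read in reads if len(read) > pos]
        g_calls = sum(1 for b in covered if b == "G")
        counts[pos] = (g_calls, len(covered))
    return counts


if __name__ == "__main__":
    reference = "TAGCA"
    reads = ["TAGCA", "TGGCA", "TGGCG", "TAG"]
    for pos, (g_calls, depth) in a_to_g_counts(reference, reads).items():
        print(f"position {pos}: {g_calls}/{depth} reads show G")
```

A raw G fraction like this cannot by itself separate editing from sequencing error or genomic variants, which is the gap the summary says miRmedon addresses by comparing inferred editing and error distributions.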

School Choice During a Period of Radical School Reform. Evidence from Academy Conversion in England

Economic Policy ◽

10.1093/epolic/eiaa023 ◽

2020 ◽

Author(s):

Marco Bertoni ◽

Stephen Gibbons ◽

Olmo Silva

Keyword(s):

School Choice ◽

Large Scale ◽

English Education ◽

Limited Information ◽

State Control ◽

High Performing ◽

Opt Out ◽

Improved Performance ◽

Expected Benefits ◽

State Schools

Abstract: We study how demand responds to the rebranding of existing state schools as autonomous ‘academies’ in the context of a radical and large-scale reform to the English education system. The academy programme encouraged schools to opt out of local state control and funding, but provided parents and students with limited information on the expected benefits. We use administrative data on school applications for three cohorts of students to estimate whether this rebranding changes schools’ relative popularity. We find that families – particularly higher-income, White British – are more likely to rank converted schools above non-converted schools on their applications. We also find that it is mainly schools that are high-performing, popular and proximate to families’ homes that attract extra demand after conversion. Overall, the patterns we document suggest that families read academy conversion as a signal of future quality gains – although this signal is in part misleading as we find limited evidence that conversion causes improved performance.

Online sequential ensembling of predictive fuzzy systems

Evolving Systems ◽

10.1007/s12530-021-09398-x ◽

2021 ◽

Author(s):

Edwin Lughofer ◽

Mahardhika Pratama

Keyword(s):

Data Streams ◽

Fuzzy Systems ◽

Large Scale ◽

Fuzzy Model ◽

System Delay ◽

Actual System ◽

Processing Times ◽

Target Values ◽

Improved Performance ◽

Prediction Techniques

Abstract: Evolving fuzzy systems (EFS) have attracted wide interest in the community for learning from data streams in an incremental, single-pass and transparent manner. The main focus so far has been on the development of approaches for single EFS models, basically used for prediction purposes. Forgetting mechanisms have been used to increase their flexibility, especially to adapt quickly to changing situations such as drifting data distributions. These require forgetting factors that steer the degree to which older learned concepts are out-weighted over time, and setting them adequately, in advance or adaptively, is neither easy nor fully resolved. In this paper, we propose a new concept of learning fuzzy systems from data streams, which we call online sequential ensembling of fuzzy systems (OS-FS). It is able to model the recent dependencies in streams on a chunk-wise basis: for each new incoming chunk, a new fuzzy model is trained from scratch and added to the ensemble (of fuzzy systems trained before). This induces (i) maximal flexibility in terms of being able to apply variable chunk sizes according to the actual system delay in receiving target values and (ii) fast reaction possibilities in the case of arising drifts. The latter are realized with specific prediction techniques on new data chunks based on the sequential ensemble members trained so far over time. We propose four different prediction variants including various weighting concepts in order to put higher weights on the members with higher inference certainty when amalgamating the predictions of single members into a final prediction. In this sense, older members, which retain knowledge about past states, may get dynamically reactivated in the case of cyclic drifts, which induce dynamic changes in the process behavior that recur from time to time. Furthermore, we integrate a concept for properly resolving possible contradictions among members with similar inference certainties. The reaction to drifts is thus autonomously handled on demand and on the fly during the prediction stage (and not during the model adaptation/evolution stage as conventionally done in single EFS models), which yields enormous flexibility. Finally, in order to cope with large-scale and (theoretically) infinite data streams within a reasonable amount of prediction time, we demonstrate two concepts for pruning past ensemble members, one based on atypically high error trends of single members and one based on the non-diversity of ensemble members. The results based on two data streams showed significantly improved performance compared to single EFS models in terms of a better convergence of the accumulated chunk-wise ahead prediction error trends, especially in the case of regular and cyclic drifts. Moreover, the more advanced prediction schemes could significantly outperform standard averaging over all members’ outputs. Furthermore, resolving contradictory outputs among members helped to improve the performance of the sequential ensemble further. Results on a wider range of data streams from different application scenarios showed (i) improved error trend lines over single EFS models, as well as over the related AI methods OS-ELM and MLP neural networks retrained on data chunks, and (ii) slightly worse trend lines than on-line bagged EFS (as specific EFS ensembles), but with around 100 times faster processing times (well below the millisecond range for single-sample updates).
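The chunk-wise scheme described in the abstract (train a fresh model on each incoming chunk, keep it in the ensemble, and weight members when predicting) can be sketched roughly as below. This is a simplified illustration, not the OS-FS method itself: a one-dimensional least-squares line stands in for a fuzzy system, and members are weighted by the inverse of their error on the most recent chunk rather than by inference certainty. The class names `LineModel` and `SequentialEnsemble` are hypothetical.

```python
class LineModel:
    """Stand-in base learner: a 1-D least-squares line (the paper uses fuzzy systems)."""

    def __init__(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx if sxx else 0.0
        self.intercept = mean_y - self.slope * mean_x

    def predict(self, x):
        return self.slope * x + self.intercept


class SequentialEnsemble:
    """Adds one new member per chunk; weights members by error on the latest chunk."""

    def __init__(self):
        self.members = []
        self.weights = []

    def update(self, xs, ys):
        # Train a new member from scratch on the incoming chunk.
        self.members.append(LineModel(xs, ys))
        # Re-weight all members by inverse mean squared error on this chunk.
        # Note: the newest member is scored on its own training data,
        # which favours it; a real system would score on held-out data.
        self.weights = []
        for member in self.members:
            mse = sum((member.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
            self.weights.append(1.0 / (mse + 1e-9))

    def predict(self, x):
        total = sum(self.weights)
        weighted = sum(w * m.predict(x) for w, m in zip(self.weights, self.members))
        return weighted / total


if __name__ == "__main__":
    ensemble = SequentialEnsemble()
    ensemble.update([0, 1, 2, 3], [0, 2, 4, 6])   # first regime: y = 2x
    ensemble.update([4, 5, 6, 7], [6, 5, 4, 3])   # drift: y = -x + 10
    print(ensemble.predict(8))
```

Because old members are kept rather than overwritten, a later chunk that resembles the first regime would raise the first member's weight again, which loosely mirrors the reactivation under cyclic drifts that the abstract describes.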

Mutational landscape of EGFR-, MYC-, and Kras-driven genetically engineered mouse models of lung adenocarcinoma

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1613601113 ◽

2016 ◽

Vol 113 (42) ◽

pp. E6409-E6417 ◽

Cited By ~ 80

Author(s):

David G. McFadden ◽

Katerina Politi ◽

Arjun Bhutkar ◽

Frances K. Chen ◽

Xiaoling Song ◽

...

Keyword(s):

Lung Adenocarcinoma ◽

Mouse Models ◽

Cancer Progression ◽

Large Scale ◽

Human Cancer ◽

Genetically Engineered ◽

Model Systems ◽

Driver Mutations ◽

Genetic Profile ◽

Genetically Engineered Mouse Models

Genetically engineered mouse models (GEMMs) of cancer are increasingly being used to assess putative driver mutations identified by large-scale sequencing of human cancer genomes. To accurately interpret experiments that introduce additional mutations, an understanding of the somatic genetic profile and evolution of GEMM tumors is necessary. Here, we performed whole-exome sequencing of tumors from three GEMMs of lung adenocarcinoma driven by mutant epidermal growth factor receptor (EGFR), mutant Kirsten rat sarcoma viral oncogene homolog (Kras), or overexpression of MYC proto-oncogene. Tumors from EGFR- and Kras-driven models exhibited, respectively, 0.02 and 0.07 nonsynonymous mutations per megabase, a dramatically lower average mutational frequency than observed in human lung adenocarcinomas. Tumors from models driven by strong cancer drivers (mutant EGFR and Kras) harbored few mutations in known cancer genes, whereas tumors driven by MYC, a weaker initiating oncogene in the murine lung, acquired recurrent clonal oncogenic Kras mutations. In addition, although EGFR- and Kras-driven models both exhibited recurrent whole-chromosome DNA copy number alterations, the specific chromosomes altered by gain or loss were different in each model. These data demonstrate that GEMM tumors exhibit relatively simple somatic genotypes compared with human cancers of a similar type, making these autochthonous model systems useful for additive engineering approaches to assess the potential of novel mutations on tumorigenesis, cancer progression, and drug sensitivity.

Improved techniques for liquid culture of human and mouse bone marrow

Blood ◽

10.1182/blood.v47.3.369.bloodjournal473369 ◽

1976 ◽

Vol 47 (3) ◽

pp. 369-379

Author(s):

MJ Cline ◽

DW Golde

Keyword(s):

Bone Marrow ◽

Stem Cell ◽

Large Scale ◽

Membrane Surface ◽

Diffusion Chamber ◽

Mononuclear Phagocytes ◽

Mouse Bone Marrow ◽

Improved Performance ◽

Proliferation In Vitro

Previous studies using the in vitro diffusion chamber (Marbrook) have shown that bone marrow grown in this system will undergo limited stem cell replication and differentiation to mature granulocytes and mononuclear phagocytes. A series of studies with modified culture systems was initiated to improve cell production and committed stem cell (CFU-C) proliferation in vitro. Introduction of a continuous-flow system and a migration technique providing means of egress for mature neutrophils resulted in substantially improved performance. CFU-C were found to be capable of migration through a 3-μm pore membrane. These studies indicated that membrane surface area, culture medium circulation, and mature cell egress were among the conditions that could be optimized for maximum hematopoietic cell proliferation in suspension culture. The present observations also suggested that large-scale in vitro growth of mammalian bone marrow may be feasible.

Mutational analysis of driver genes with tumor suppressive and oncogenic roles in gastric cancer

PeerJ ◽

10.7717/peerj.3585 ◽

2017 ◽

Vol 5 ◽

pp. e3585 ◽

Cited By ~ 2

Author(s):

Tianfang Wang ◽

Yining Liu ◽

Min Zhao

Keyword(s):

Gastric Cancer ◽

Cancer Progression ◽

Complex Disease ◽

Mutational Analysis ◽

Driver Mutations ◽

Driver Genes ◽

Protein Coding ◽

Mirna Genes ◽

Genetic Mechanisms ◽

New Treatment

Gastric cancer (GC) is a complex disease with heterogeneous genetic mechanisms. Genomic mutational profiling of gastric cancer not only expands our knowledge about cancer progression at a fundamental genetic level, but also could provide guidance on new treatment decisions, which are currently based on tumor histology. The fact that precision medicine-based treatment is successful in a subset of tumors indicates the need for better identification of clinically related molecular tumor phenotypes, especially with regard to driver mutations in tumor suppressor genes (TSGs) and oncogenes (ONGs). We surveyed 313 TSGs and 160 ONGs associated with 48 protein coding and 19 miRNA genes with both TSG and ONG roles. Using public cancer mutational profiles, we confirmed the dual roles of CDKN1A and CDKN1B. In addition to the widely recognized alterations, we identified another 82 frequently mutated genes in a public gastric cancer cohort. In summary, these driver mutation profiles of individual GC will form the basis of personalized treatment of gastric cancer, leading to substantial therapeutic improvements.

A Bayesian hidden Potts mixture model for analyzing lung cancer pathology images

Biostatistics ◽

10.1093/biostatistics/kxy019 ◽

2018 ◽

Vol 20 (4) ◽

pp. 565-581

Author(s):

Qiwei Li ◽

Xinlei Wang ◽

Faming Liang ◽

Faliu Yi ◽

Yang Xie ◽

...

Keyword(s):

Lung Cancer ◽

Spatial Patterns ◽

Cancer Progression ◽

Large Scale ◽

Digital Pathology ◽

Interaction Strength ◽

Monte Carlo Sampling ◽

Square Lattice ◽

Different Types ◽

Pathology Image

Summary: Digital pathology imaging of tumor tissues, which captures histological details in high resolution, is fast becoming a routine clinical procedure. Recent developments in deep-learning methods have enabled the identification, characterization, and classification of individual cells from pathology image analysis at a large scale. This creates new opportunities to study the spatial patterns of and interactions among different types of cells. Reliable statistical approaches to modeling such spatial patterns and interactions can provide insight into tumor progression and shed light on the biological mechanisms of cancer. In this article, we consider the problem of modeling a pathology image with irregular locations of three different types of cells: lymphocyte, stromal, and tumor cells. We propose a novel Bayesian hierarchical model, which incorporates a hidden Potts model to project the irregularly distributed cells to a square lattice and a Markov random field prior model to identify regions in a heterogeneous pathology image. The model allows us to quantify the interactions between different types of cells, some of which are clinically meaningful. We use Markov chain Monte Carlo sampling techniques, combined with a double Metropolis–Hastings algorithm, in order to simulate samples approximately from a distribution with an intractable normalizing constant. The proposed model was applied to the pathology images of 205 lung cancer patients from the National Lung Screening trial, and the results show that the interaction strength between tumor and stromal cells predicts patient prognosis (P = 0.005). This statistical methodology provides a new perspective for understanding the role of cell–cell interactions in cancer progression.

An Efficient Software Architecture for Automated Coupling of Convection and Thermal Radiation Tools

Heat Transfer: Volume 3 ◽

10.1115/ht2008-56303 ◽

2008 ◽

Cited By ~ 1

Author(s):

Christian Rauch ◽

Thomas Hörmann ◽

Sebastian Jagsch ◽

Raimund Almbauer

Keyword(s):

Software Architecture ◽

Large Scale ◽

Exhaust System ◽

Industrial Applications ◽

Proof Of Concept ◽

Speed Up ◽

Simulation Results ◽

Improved Performance ◽

New System ◽

Efficient Software

Research and development engineers have recently paid much attention to multi-physics calculations. One way to perform them is to couple commercial tools for examining complex systems. Since a software architecture for coupling programs was proposed in a previous paper, significant changes have led to improved performance for large-scale industrial applications. This architecture is described and, as a proof of concept, a simulation is conducted by coupling two commercial solvers. The speed-up of the new system is presented. The simulation results are then compared with measured surface temperatures of the exhaust system of an actual sport utility vehicle (SUV), and conclusions are drawn. The proposed architecture is easily adaptable to various programs, as it is implemented in C++ and changes for a specific code can be restricted to a few classes.

4284 Development of a Survey Instrument to Predict Uptake of and Adherence to Active Surveillance among Men with Low-Risk Prostate Cancer

Journal of Clinical and Translational Science ◽

10.1017/cts.2020.383 ◽

2020 ◽

Vol 4 (s1) ◽

pp. 128-129

Author(s):

Aaron T Seaman ◽

Kathryn L. Taylor ◽

Kimberly Davis ◽

Kenneth G. Nepple ◽

Michelle A. Mengeling ◽

...

Keyword(s):

Prostate Cancer ◽

Active Surveillance ◽

Cancer Progression ◽

Large Scale ◽

Survey Instrument ◽

Low Risk ◽

Cognitive Interviews ◽

Risk Prostate Cancer ◽

Low Risk Prostate Cancer ◽

Single Instrument

OBJECTIVES/GOALS: Active surveillance (AS) is a recognized strategy to manage low-risk prostate cancer (PCa) in the absence of cancer progression. Few prospective data exist on the decisional factors associated with selecting and adhering to AS in the absence of cancer progression. We developed a survey instrument to predict AS uptake and adherence. METHODS/STUDY POPULATION: We utilized a three-step process to develop and refine a survey instrument designed to predict AS uptake and adherence among men with low-risk PCa: 1) We identified relevant conceptual domains based on prior research and a literature review. 2) We conducted 21 semi-structured concept elicitation interviews to identify patient-perceived barriers and facilitators to AS uptake and adherence among men with low-risk PCa who had been on AS for ≥1 year. The identified concepts became the basis of our draft survey instrument. 3) We conducted two rounds of cognitive interviews with men with low-risk PCa (n = 12; n = 6) to refine and initially validate the instrument. RESULTS/ANTICIPATED RESULTS: Relevant concepts identified from the initial interviews included the importance of the patient's knowledge of their PCa risk, value placed on delaying treatment, trust in the urologist and the AS protocol, and perceived social support. Initially, the survey was drafted as a single instrument, to be administered after a patient had selected AS, comprising sections on patient health, AS selection, and AS adherence. Based on the first round of cognitive interviews, we revised the single instrument into two surveys to track shifts in patient preference and experience. The first, administered at diagnosis, focuses on selection, and the second, a 6-month follow-up, focuses on adherence. Following revisions, participants indicated the revised 2-part instrument was clear and not burdensome to complete.
DISCUSSION/SIGNIFICANCE OF IMPACT: The instrument’s content validity was evaluated through cognitive interviews, which supported that the survey items’ intended and understood meanings were isomorphic. In the next phase, we plan to conduct a large-scale prospective cohort study to evaluate the predictive validity, after which it will be available for public research use.

Methylation profile of group of miRNA genes in clear cell renal cell carcinoma and their involvement in cancer progression

Russian Journal of Genetics ◽

10.1134/s1022795413030034 ◽

2013 ◽

Vol 49 (3) ◽

pp. 320-328 ◽

Cited By ~ 10

Author(s):

E. V. Beresneva ◽

S. V. Rykov ◽

D. S. Khodyrev ◽

I. V. Pronina ◽

V. D. Ermilova ◽

...

Keyword(s):

Renal Cell Carcinoma ◽

Cell Carcinoma ◽

Cancer Progression ◽

Renal Cell ◽

Clear Cell ◽

Methylation Profile ◽

Cell Renal Cell Carcinoma ◽

Mirna Genes

The Influence of PBL Parameterization on the Practical Predictability of Convection Initiation during the Mesoscale Predictability Experiment (MPEX)

Weather and Forecasting ◽

10.1175/waf-d-16-0174.1 ◽

2017 ◽

Vol 32 (3) ◽

pp. 1161-1183 ◽

Cited By ~ 10

Author(s):

Bryan M. Burlingame ◽

Clark Evans ◽

Paul J. Roebber

Keyword(s):

Large Scale ◽

Distribution Functions ◽

Forecast Skill ◽

Cumulative Distribution ◽

Spatial Error ◽

Error Distributions ◽

Pbl Parameterization ◽

Large Scale Flow ◽

Planetary Boundary Layer Parameterization ◽

Convection Initiation

Abstract: This study evaluates the influence of planetary boundary layer (PBL) parameterization on short-range (0–15 h) convection initiation (CI) forecasts within convection-allowing ensembles that utilize subsynoptic-scale observations collected during the Mesoscale Predictability Experiment. Three cases, 19–20 May, 31 May–1 June, and 8–9 June 2013, are considered, each characterized by a different large-scale flow pattern. An object-based method is used to verify and analyze CI forecasts. Local mixing parameterizations have, relative to nonlocal mixing parameterizations, higher probabilities of detection but also higher false alarm ratios, such that the ensemble mean forecast skill varied only subtly between the parameterizations considered. Temporal error distributions associated with matched events are approximately normal around a zero mean, suggesting little systematic timing bias. Spatial error distributions are skewed, with average mean (median) distance errors of approximately 44 km (28 km). Matched event cumulative distribution functions suggest limited forecast skill increases beyond temporal and spatial thresholds of 1 h and 100 km, respectively. Forecast skill variation is greatest between cases, with smaller variation between PBL parameterizations or between individual ensemble members for a given case, implying that larger-scale features exert greater control on CI forecast skill than PBL parameterization. In agreement with previous studies, local mixing parameterizations tend to produce simulated boundary layers that are too shallow, cool, and moist, while nonlocal mixing parameterizations tend to produce boundary layers that are deeper, warmer, and drier. Forecasts poorly resolve strong capping inversions across all parameterizations, which is hypothesized to result primarily from implicit numerical diffusion associated with the default finite-differencing formulation for vertical advection used herein.
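The probability of detection and false alarm ratio mentioned in the abstract are standard contingency-table scores, and a minimal sketch of how they are computed from matched events follows. The counts are made-up illustration values, not results from the study, and the function name `verification_scores` is hypothetical; the critical success index is included only as one common way to summarize the two scores together.

```python
def verification_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from forecast/observation match counts.

    POD = hits / (hits + misses)                  probability of detection
    FAR = false_alarms / (hits + false_alarms)    false alarm ratio
    CSI = hits / (hits + misses + false_alarms)   critical success index
    """
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi


if __name__ == "__main__":
    # Illustrative counts only: 30 matched events, 10 missed, 20 false alarms.
    pod, far, csi = verification_scores(hits=30, misses=10, false_alarms=20)
    print(f"POD={pod:.2f} FAR={far:.2f} CSI={csi:.2f}")  # POD=0.75 FAR=0.40 CSI=0.50
```

A scheme that forecasts more events tends to raise both the hit count and the false alarm count, so POD and FAR can rise together while a combined score changes little, which is consistent with the abstract's remark that skill varied only subtly between parameterizations.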
