Efficient duplicate rate estimation from subsamples of sequencing libraries

10.7287/peerj.preprints.1298v2 ◽

2015 ◽

Author(s):

Christopher Schröder ◽

Sven Rahmann

Keyword(s):

Mathematical Model ◽

Quality Control ◽

High Throughput Sequencing ◽

Linear Optimization ◽

Computational Approach ◽

Mathematical Framework ◽

Quality Metric ◽

Sampling Process ◽

Copy Numbers ◽

Occupancy Distribution

In high-throughput sequencing (HTS) projects, the sequenced fragments’ duplicate rate is a key quality metric. A high duplicate rate may arise from a low amount of input DNA and many PCR cycles. Many methods for downstream analyses require that duplicates be removed. If the duplicate rate is high, most of the sequencing effort and money spent would have been in vain. Therefore, it is of considerable interest to estimate the duplicate rate after sequencing only a small subsample at low depth (multiplexed with other libraries) for quality control before running the full experiment. In this article, we provide an elementary mathematical framework and an efficient computational approach based on quadratic and linear optimization to estimate the true duplicate rate from a small subsample. Our method is based on up-sampling the occupancy distribution of the reads’ copy numbers. Compared to an existing approach, we use an explicit and easily explained mathematical model that accurately inverts the sub-sampling process. We evaluate the performance of our approach in comparison to that of the existing method on several artificial and real datasets. The same ideas can be used for diversity estimation in general. Software implementing our approach is available under the MIT license.
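The quantities named in this abstract can be illustrated with a short sketch. It is not the authors' estimator (the abstract does not give their quadratic and linear optimization formulation). It only shows what an occupancy distribution of copy numbers and a duplicate rate look like for a full library and for a random subsample. The function names, the toy data, and the definition of the duplicate rate as one minus the ratio of distinct fragments to total reads are assumptions made for illustration.

```python
import random
from collections import Counter


def occupancy_distribution(reads):
    """Map copy number k to the number of distinct fragments seen exactly k times."""
    copies_per_fragment = Counter(reads)
    return Counter(copies_per_fragment.values())


def duplicate_rate(occupancy):
    """Fraction of reads that are redundant copies (assumed definition)."""
    total_reads = sum(k * n for k, n in occupancy.items())
    distinct_fragments = sum(occupancy.values())
    if total_reads == 0:
        return 0.0
    return 1.0 - distinct_fragments / total_reads


def subsample(reads, fraction, seed=0):
    """Draw a uniform random subsample of the reads without replacement."""
    rng = random.Random(seed)
    return rng.sample(reads, int(len(reads) * fraction))


# Toy library (made-up data): one fragment identifier per read.
library = ["a"] * 4 + ["b"] * 2 + ["c", "d"]

full_occupancy = occupancy_distribution(library)
full_rate = duplicate_rate(full_occupancy)  # 8 reads, 4 distinct fragments

sub_reads = subsample(library, 0.5)
sub_rate = duplicate_rate(occupancy_distribution(sub_reads))

print("full library:", dict(full_occupancy), full_rate)
print("subsample:   ", sub_reads, sub_rate)
```

The duplicate rate observed in a shallow subsample generally differs from that of the full library, which is why an estimator has to invert the sub-sampling step rather than report the subsample's rate directly.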


Efficient duplicate rate estimation from subsamples of sequencing libraries

10.7287/peerj.preprints.1298v1 ◽

2015 ◽

Author(s):

Christopher Schröder ◽

Sven Rahmann

Keyword(s):

Mathematical Model ◽

Quality Control ◽

High Throughput Sequencing ◽

Linear Optimization ◽

Computational Approach ◽

Mathematical Framework ◽

Quality Metric ◽

Sampling Process ◽

Copy Numbers ◽

Occupancy Distribution



Computational approach based on wavelets for financial mathematical model governed by distributed order fractional differential equation

Mathematics and Computers in Simulation ◽

10.1016/j.matcom.2021.05.026 ◽

2021 ◽

Author(s):

Yashveer Kumar ◽

Vineet Kumar Singh

Keyword(s):

Differential Equation ◽

Mathematical Model ◽

Fractional Differential Equation ◽

Computational Approach ◽

Fractional Differential ◽

Distributed Order


A Branching Process to Characterize the Dynamics of Stem Cell Differentiation

10.1101/016295 ◽

2015 ◽

Cited By ~ 1

Author(s):

David Miguez

Keyword(s):

Mathematical Model ◽

Cell Cycle ◽

Stem Cell ◽

Cycle Length ◽

Branching Process ◽

Stem Cell Differentiation ◽

Stem Cell Population ◽

Mathematical Framework ◽

Average Cell ◽

Cell Cycle Length

The understanding of the regulatory processes that orchestrate stem cell maintenance is a cornerstone in developmental biology. Here, we present a mathematical model based on a branching process formalism that predicts average rates of proliferative and differentiative divisions in a given stem cell population. In the context of vertebrate spinal neurogenesis, the model predicts complex non-monotonic variations in the rates of pp, pd and dd modes of division as well as in cell cycle length, in agreement with experimental results. Moreover, the model shows that the differentiation probability follows a binomial distribution, allowing us to develop equations to predict the rates of each mode of division. A phenomenological simulation of the developing spinal cord informed with the average cell cycle length and division rates predicted by the mathematical model reproduces the correct dynamics of proliferation and differentiation in terms of average numbers of progenitors and differentiated cells. Overall, the present mathematical framework represents a powerful tool to unveil the changes in the rate and mode of division of a given stem cell pool by simply quantifying numbers of cells at different times.
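One way to read the binomial statement in this abstract is sketched below. The sketch assumes that each of the two daughter cells differentiates independently with the same probability, so the number of differentiated daughters per division is Binomial(2, p). The exact equations of the paper are not reproduced in the abstract, so the function names, the independence assumption, and the synchronous-division simplification are all hypothetical.

```python
def division_mode_probabilities(p_diff):
    """Probabilities of pp, pd and dd divisions under a Binomial(2, p_diff) assumption."""
    if not 0.0 <= p_diff <= 1.0:
        raise ValueError("p_diff must lie in [0, 1]")
    return {
        "pp": (1.0 - p_diff) ** 2,           # both daughters stay progenitors
        "pd": 2.0 * p_diff * (1.0 - p_diff),  # one progenitor, one differentiated cell
        "dd": p_diff ** 2,                    # both daughters differentiate
    }


def expected_progenitors(initial, p_diff, rounds):
    """Expected progenitor count after a number of synchronous division rounds."""
    modes = division_mode_probabilities(p_diff)
    # Each dividing progenitor leaves 2 progenitors (pp), 1 (pd) or 0 (dd).
    mean_offspring = 2.0 * modes["pp"] + 1.0 * modes["pd"]
    return initial * mean_offspring ** rounds


modes = division_mode_probabilities(0.25)
print(modes)
print(expected_progenitors(100, 0.25, 2))
```

Under these assumptions the progenitor pool grows when the differentiation probability is below 0.5, stays constant on average at 0.5, and shrinks above it.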


DNA-Based Herbal Teas’ Authentication: An ITS2 and psbA-trnH Multi-Marker DNA Metabarcoding Approach

Plants ◽

10.3390/plants10102120 ◽

2021 ◽

Vol 10 (10) ◽

pp. 2120

Author(s):

Jessica Frigerio ◽

Giulia Agostinetto ◽

Valerio Mezzasalma ◽

Fabrizio De Mattia ◽

Massimo Labra ◽

...

Keyword(s):

Quality Control ◽

Quantitative Analysis ◽

Medicinal Plants ◽

High Throughput ◽

High Throughput Sequencing ◽

The Other ◽

Plant Component ◽

Identification Rate ◽

Dna Metabarcoding ◽

Therapeutic Properties

Medicinal plants have been widely used in traditional medicine due to their therapeutic properties. Although they are mostly used as herbal infusions and tinctures, their use as ingredients of food supplements is increasing. However, fraud and adulteration are widespread issues. In our study, we aimed at evaluating DNA metabarcoding as a tool to identify product composition. To accomplish this, we analyzed fifteen commercial products with DNA metabarcoding, using two barcode regions: psbA-trnH and ITS2. Results showed that on average, 70% (44–100) of the declared ingredients were identified. The ITS2 marker appears to identify more species (n = 60) than psbA-trnH (n = 35), with an ingredient identification rate of 52% versus 45%, respectively. Some species are identified by only one of the two markers. Additionally, to evaluate the quantitative ability of high-throughput sequencing (HTS) to relate the plant component to the corresponding assigned sequences, we created six mock mixtures of plants in the laboratory, starting both from biomass and gDNA. Our analysis also supports the application of DNA metabarcoding for a relative quantitative analysis. These results move towards the application of HTS analysis for studying the composition of herbal teas for medicinal plants’ traceability and quality control.


Accurate Inference of Tumor Purity and Absolute Copy Numbers From High-Throughput Sequencing Data

Frontiers in Genetics ◽

10.3389/fgene.2020.00458 ◽

2020 ◽

Vol 11 ◽

Author(s):

Xiguo Yuan ◽

Zhe Li ◽

Haiyong Zhao ◽

Jun Bai ◽

Junying Zhang

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Data ◽

Tumor Purity ◽

Copy Numbers ◽

High Throughput Sequencing Data


Modeling and prototyping of a soft closed-chain modular gripper

Industrial Robot: the international journal of robotics research and application ◽

10.1108/ir-09-2018-0180 ◽

2019 ◽

Vol 46 (1) ◽

pp. 135-145 ◽

Cited By ~ 5

Author(s):

Muddasar Anwar ◽

Toufik Al Khawli ◽

Irfan Hussain ◽

Dongming Gan ◽

Federico Renda

Keyword(s):

Mathematical Model ◽

Degrees Of Freedom ◽

Kinematic Model ◽

Chain Structure ◽

Mathematical Representation ◽

Mathematical Framework ◽

Complex Objects ◽

Content Type ◽

Design And Optimization ◽

Closed Chain

Purpose: This paper aims to present a soft closed-chain modular gripper for robotic pick-and-place applications. The proposed biomimetic gripper design is inspired by the Fin Ray effect, derived from fish fin physiology. It is composed of three axisymmetric fingers, actuated with a single actuator. Each finger has a modular under-actuated closed-chain structure. The finger structure is compliant in the contact normal direction, with stiff crossbeams reorienting to help the finger structure conform around objects. Design/methodology/approach: Starting with the design and development of the proposed gripper, a consequent mathematical representation consisting of closed-chain forward and inverse kinematics is detailed. The proposed mathematical framework is validated through finite element modeling simulations. Additionally, a set of experiments was conducted to compare the simulated and prototype finger trajectories, as well as to assess qualitative grasping ability. Findings: Key findings are the presented mathematical model for closed-loop chain mechanisms, as well as design and optimization guidelines to develop controlled closed-chain grippers. Research limitations/implications: The proposed methodology and mathematical model could be taken as a fundamental modular base block to explore similar distributed degrees of freedom (DOF) closed-chain manipulators and grippers. The enhanced kinematic model contributes to optimized dynamics and control of soft closed-chain grasping mechanisms. Practical implications: The approach aims to improve the development of soft grippers that are required to grasp complex objects found in human–robot cooperation and collaborative robot (cobot) applications. Originality/value: The proposed closed-chain mathematical framework is based on distributed DOFs instead of the conventional lumped joint approach. This is to better optimize and understand the kinematics of soft robotic mechanisms.


Automating Linear Tolerance Analysis Across Assemblies

Journal of Mechanical Design ◽

10.1115/1.2916912 ◽

1992 ◽

Vol 114 (1) ◽

pp. 174-179 ◽

Cited By ~ 6

Author(s):

N. P. Juster ◽

P. M. Dew ◽

A. de Pennington

Keyword(s):

Mathematical Model ◽

Manufacturing Process ◽

Tolerance Analysis ◽

Mathematical Framework ◽

Worst Case ◽

Experimental Software

One of the tests carried out by designers in an attempt to check whether an assembly of components will function correctly is tolerance analysis. Tolerance analysis, although relatively straightforward, is liable to be time consuming and error prone. It cannot be automated unless a suitable mathematical framework is developed to model the variations introduced by the manufacturing process. The designer allows for the variations by means of tolerances attached to the dimensions. This paper describes a suitable mathematical model and shows how it may be used to automate linear worst case tolerance analysis across assemblies. Experimental software has been written, based on the theory.
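A linear worst-case analysis of the kind this abstract refers to can be sketched as follows. This is a generic textbook stack-up, not the model from the paper: the `Dimension` class, the `worst_case_stack` function, and the example chain are hypothetical, and tolerances are assumed to be symmetric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    nominal: float    # nominal size
    tolerance: float  # symmetric +/- tolerance (assumed)
    direction: int    # +1 if the dimension opens the gap, -1 if it closes it


def worst_case_stack(chain):
    """Return (minimum, nominal, maximum) of the gap closed by a linear dimension chain."""
    nominal = sum(d.direction * d.nominal for d in chain)
    # In the worst case every tolerance pushes the gap the same way,
    # so the tolerances add regardless of direction.
    spread = sum(d.tolerance for d in chain)
    return nominal - spread, nominal, nominal + spread


# Made-up assembly: a 50 mm housing holding a 20 mm part and a 29 mm part.
chain = [
    Dimension(50.0, 0.25, +1),
    Dimension(20.0, 0.125, -1),
    Dimension(29.0, 0.125, -1),
]
gap_min, gap_nominal, gap_max = worst_case_stack(chain)
print(gap_min, gap_nominal, gap_max)
```

A positive minimum gap means the parts fit even when every dimension is at its least favourable limit. Statistical stack-ups relax this by assuming the extremes rarely coincide.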


Endophytic root bacteria associated with the natural vegetation growing at the hydrocarbon-contaminated Bitumount Provincial Historic site

Canadian Journal of Microbiology ◽

10.1139/cjm-2017-0039 ◽

2017 ◽

Vol 63 (6) ◽

pp. 502-515 ◽

Cited By ~ 12

Author(s):

Natalie P. Blain ◽

Bobbi L. Helgason ◽

James J. Germida

Keyword(s):

Plant Species ◽

Contaminated Soils ◽

Bacterial Communities ◽

High Throughput Sequencing ◽

Growth Promotion ◽

Plant Root ◽

Gene Copy ◽

Sampling Location ◽

Historic Site ◽

Copy Numbers

The Bitumount Provincial Historic site is the location of 2 of the world’s first oil-extracting and -refining operations. Despite hydrocarbon levels ranging from 330 to 24 700 mg·(kg soil)⁻¹, plants have been able to recolonize the site through natural revegetation. This study was designed to achieve a better understanding of the plant-root-associated bacterial partnerships occurring within naturally revegetated hydrocarbon-contaminated soils. Root endophytic bacterial communities were characterized from representative plant species throughout the site by both high-throughput sequencing and culturing techniques. Population abundance of rhizosphere and root endosphere bacteria was significantly influenced (p < 0.05) by plant species and sampling location. In general, members of the Actinomycetales, Rhizobiales, Pseudomonadales, Burkholderiales, and Sphingomonadales orders were the most commonly identified. Community structure of root-associated bacteria was influenced by both plant species and sampling location. Quantitative real-time polymerase chain reaction was used to determine the potential functional diversity of the root endophytic bacteria. The gene copy numbers of 16S rRNA and 2 hydrocarbon-degrading genes (CYP153 and alkB) were significantly affected (p < 0.05) by the interaction of plant species and sampling location. Our findings suggest that some of the bacterial communities detected are known to exhibit plant growth promotion characteristics.


PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets

Cancer Informatics ◽

10.4137/cin.s13890 ◽

2014 ◽

Vol 13s1 ◽

pp. CIN.S13890 ◽

Cited By ~ 1

Author(s):

Changjin Hong ◽

Solaiappan Manimaran ◽

William Evan Johnson

Keyword(s):

Quality Control ◽

High Throughput ◽

High Performance ◽

High Throughput Sequencing ◽

Next Generation Sequencing Data ◽

Data Sets ◽

Sequencing Data ◽

Computationally Efficient ◽

High Throughput Sequencing Data ◽

Downstream Analysis

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/ .
