In Silico Estimation of the Abundance and Phylogenetic Significance of the Composite Oct4-Sox2 Binding Motifs within a Wide Range of Species

Data ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 111
Author(s):  
Arman Kulyyassov ◽  
Ruslan Kalendar

High-throughput sequencing technologies have greatly accelerated the progress of genomics, transcriptomics, and metagenomics. Currently, a large amount of genomic data from various organisms is being generated, the volume of which is increasing every year. Therefore, the development of methods that allow the rapid search and analysis of DNA sequences is urgent. Here, we present a novel motif-based high-throughput sequence scoring method that generates genome information. We found and identified Utf1-like, Fgf4-like, and Hoxb1-like motifs, which are cis-regulatory elements for the pluripotency transcription factors Sox2 and Oct4 within the genomes of different eukaryotic organisms. The genome-wide analysis of these motifs was performed to understand the impact of their diversification on mammalian genome evolution. Utf1-like, Fgf4-like, and Hoxb1-like motif diversity was evaluated across genomes from multiple species.

2021 ◽  
Author(s):  
Yu Hamaguchi ◽  
Chao Zeng ◽  
Michiaki Hamada

Abstract Background: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. Results: Using “mappability”, a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability was significantly different among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically downstream in the DE analysis. Conclusions: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.


2019 ◽  
Author(s):  
Emilie Lejal ◽  
Agustín Estrada-Peña ◽  
Maud Marsot ◽  
Jean-François Cosson ◽  
Olivier Rué ◽  
...  

Abstract Background: The development of high throughput sequencing technologies has substantially improved analysis of bacterial community diversity, composition, and functions. Over the last decade, high throughput sequencing has been used extensively to identify the diversity and composition of tick microbial communities. However, a growing number of studies are warning about the impact of contamination brought along the different steps of the analytical process, from DNA extraction to amplification. In low biomass samples, e.g. individual tick samples, these contaminants may represent a large part of the obtained sequences, and thus generate considerable errors in downstream analyses and in the interpretation of results. Most studies of tick microbiota either do not mention the inclusion of controls during the DNA extraction or amplification steps, or consider the lack of an electrophoresis signal as an absence of contamination. In this context, we aimed to assess the proportion of contaminant sequences resulting from these steps. We analyzed the microbiota of individual Ixodes ricinus ticks by including several categories of controls throughout the analytical process: crushing, DNA extraction, and DNA amplification. Results: Controls yielded a significant number of sequences (1,126 to 13,198 mean sequences, depending on the control category). Some operational taxonomic units (OTUs) detected in these controls belong to genera reported in previous tick microbiota studies. In this study, these OTUs accounted for 50.9% of the total number of sequences in our samples, and were considered contaminants. Contamination levels (i.e. the percentage of sequences belonging to OTUs identified as contaminants) varied with tick stage and gender: 76.3% of nymphs and 75% of males demonstrated contamination over 50%, while most females (65.7%) had rates lower than 20%. Contamination mainly corresponded to OTUs detected in crushing and DNA extraction controls, highlighting the importance of carefully controlling these steps. Conclusion: Here, we showed that contaminant OTUs from extraction and amplification steps can represent more than half the total sequence yield in sequencing runs, and lead to unreliable results when characterizing tick microbial communities. We thus strongly advise the routine use of negative controls in tick microbiota studies, and more generally in studies involving low biomass samples.
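
The contamination level defined in the abstract above (the percentage of a sample's sequences that belong to OTUs identified as contaminants) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `contamination_level`, the OTU labels, and the read counts are all hypothetical, and the set of contaminant OTUs is assumed to have been determined beforehand from the negative controls.

```python
def contamination_level(otu_counts, contaminant_otus):
    """Return the percentage of a sample's sequences assigned to contaminant OTUs.

    otu_counts: dict mapping OTU label -> number of sequences in the sample.
    contaminant_otus: set of OTU labels already flagged as contaminants
    (assumed here to come from the negative controls).
    """
    total = sum(otu_counts.values())
    if total == 0:
        # An empty sample has no sequences, so report 0% rather than divide by zero.
        return 0.0
    contaminant_reads = sum(
        count for otu, count in otu_counts.items() if otu in contaminant_otus
    )
    return 100.0 * contaminant_reads / total


# Hypothetical read counts for one tick sample (made-up labels and numbers).
sample = {"OTU_1": 600, "OTU_2": 300, "OTU_3": 100}
contaminants = {"OTU_2", "OTU_3"}

level = contamination_level(sample, contaminants)
print(f"{level:.1f}% of sequences belong to contaminant OTUs")
```

Computing this value per sample is what makes it possible to group ticks by thresholds such as "over 50%" or "lower than 20%", as reported in the results.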


2015 ◽  
Author(s):  
M.V. Cannon ◽  
J. Hester ◽  
A. Shalkhauser ◽  
E.R. Chan ◽  
K. Logue ◽  
...  

Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semiaquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity.


2021 ◽  
Vol 12 ◽  
Author(s):  
Manas Joshi ◽  
Adamandia Kapopoulou ◽  
Stefan Laurent

The unprecedented rise of high-throughput sequencing and assay technologies has provided a detailed insight into the non-coding sequences and their potential role as gene expression regulators. These regulatory non-coding sequences are also referred to as cis-regulatory elements (CREs). Genetic variants occurring within CREs have been shown to be associated with altered gene expression and phenotypic changes. Such variants are known to occur spontaneously and ultimately get fixed, due to selection and genetic drift, in natural populations and, in some cases, pave the way for speciation. Hence, the study of genetic variation at CREs has improved our overall understanding of the processes of local adaptation and evolution. Recent advances in high-throughput sequencing and better annotations of CREs have enabled the evaluation of the impact of such variation on gene expression, phenotypic alteration and fitness. Here, we review recent research on the evolution of CREs and concentrate on studies that have investigated genetic variation occurring in these regulatory sequences within the context of population genetics.


2021 ◽  
Author(s):  
Matthew Vinson Cannon ◽  
Haikel N Bogale ◽  
Devika Bhalerao ◽  
Kalil Keita ◽  
Denka Camara ◽  
...  

Vector-borne pathogens cause many human infectious diseases and are responsible for high mortality and morbidity throughout the world. They can also cause livestock epidemics with dramatic social and economic consequences. Due to the high costs, vector-borne disease surveillance is often limited to current threats, and the investigation of emerging pathogens typically occurs after the reports of clinical cases. Here, we use high-throughput sequencing to detect and identify a wide range of parasites and viruses carried by mosquitoes from Cambodia, Guinea, Mali and Maryland. We apply this approach to individual Anopheles mosquitoes as well as pools of mosquitoes captured in traps, and compare the outcomes of this assay when applied to DNA or RNA. We identified known human and animal pathogens and mosquito parasites belonging to a wide range of taxa, insect Flaviviruses, and novel DNA sequences from previously uncharacterized organisms. Our results also revealed that analysis of the content of an entire trap is an efficient approach to monitor and identify potential vector-borne pathogens in large surveillance studies, and that analysis of RNA extracted from mosquitoes is preferable, when possible, over DNA-based analysis. Overall, we describe a flexible and easy-to-customize assay that can provide important information for vector-borne disease surveillance and research studies to efficiently complement current approaches.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yu Hamaguchi ◽  
Chao Zeng ◽  
Michiaki Hamada

Abstract Background: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. Results: Using “mappability”, a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability was significantly different among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically downstream in the DE analysis. Conclusions: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.


Biology Open ◽  
2021 ◽  
Author(s):  
Matthew V. Cannon ◽  
Haikel N. Bogale ◽  
Devika Bhalerao ◽  
Kalil Keita ◽  
Denka Camara ◽  
...  

Vector-borne pathogens cause many human infectious diseases and are responsible for high mortality and morbidity throughout the world. They can also cause livestock epidemics with dramatic social and economic consequences. Due to its high costs, vector-borne disease surveillance is often limited to current threats, and the investigation of emerging pathogens typically occurs after the reports of clinical cases. Here, we use high-throughput sequencing to detect and identify a wide range of parasites and viruses carried by mosquitoes from Cambodia, Guinea, Mali and Maryland. We apply this approach to individual Anopheles mosquitoes as well as pools of mosquitoes captured in traps, and compare the outcomes of this assay when applied to DNA or RNA. We identified known human and animal pathogens and mosquito parasites belonging to a wide range of taxa, as well as novel DNA sequences from previously uncharacterized organisms. Our results also revealed that analysis of the content of an entire trap could be an efficient approach to monitor and identify rare vector-borne pathogens in large surveillance studies. Overall, we describe a high-throughput and easy-to-customize assay to screen for a wide range of pathogens and efficiently complement current vector-borne disease surveillance approaches.


Author(s):  
Stella C. Yuan ◽  
Eric Malekos ◽  
Melissa T. R. Hawkins

Abstract: The use of museum specimens held in natural history repositories for population and conservation genetic research is increasing in tandem with the use of massively parallel sequencing technologies. Short Tandem Repeats (STRs), or microsatellite loci, are commonly used genetic markers in wildlife and population genetic studies. However, they have traditionally suffered from a host of issues, including length homoplasy, high costs, low throughput, and difficulties in reproducibility across laboratories. Massively parallel sequencing technologies can address these problems, but the incorporation of museum specimen-derived DNA suffers from significant fragmentation and exogenous DNA contamination. Combatting these issues requires extra measures of stringency in the lab and during data analysis, yet there have not been any high-throughput sequencing studies evaluating microsatellite allelic dropout from DNA extracted from museum specimens. In this study, we evaluate genotyping errors derived from mammalian museum skin DNA extracts for previously characterized microsatellites across PCR replicates utilizing high-throughput sequencing. We found it useful to classify samples based on DNA concentration, which determined the rate at which genotypes were accurately recovered. Longer microsatellites performed worse in all museum specimens. Allelic dropout rates across loci were dependent on sample quantity, with high-concentration museum specimens performing as well and recovering quality metrics nearly as high as the frozen tissue sample. Based on our results, we provide a set of best practices for quality assurance and incorporation of reliable genotypes from museum specimens.


2019 ◽  
Author(s):  
Reneth Millas ◽  
Mary Espina ◽  
CM Sabbir Ahmed ◽  
Angelina Bernardini ◽  
Ekundayo Adeleke ◽  
...  

Abstract: One of the most important tools in genetic improvement is mutagenesis, which is used to induce genetic and phenotypic variation for trait improvement and the discovery of novel genes. A JTN-5203 (MG V) mutant population was generated using ethyl methane sulfonate (EMS)-induced mutagenesis and was used for detection of induced mutations in the FAD2-1A and FAD2-1B genes using a reverse genetics approach. An optimum concentration of EMS was used to treat 15,000 bulk JTN-5203 seeds, producing an M2 population of 1,820 individuals. DNA was extracted, normalized, and pooled from these individuals. Specific primers were designed from the FAD2-1A and FAD2-1B genes, which are involved in the fatty acid biosynthesis pathway, for further analysis using next-generation sequencing. High-throughput mutation discovery through a TILLING-by-Sequencing approach was used to detect novel allelic variations in this population. Several mutations and allelic variations with high impacts were detected for FAD2-1A and FAD2-1B. These include GC to AT transition mutations in FAD2-1A (20%) and FAD2-1B (69%). Mutation density for this population is estimated to be about 1/136 kb. Through mutagenesis and high-throughput sequencing technologies, novel alleles underlying the mutations observed in mutants with reduced polyunsaturated fatty acids will be identified, and these mutants can be further used in breeding soybean lines with an improved fatty acid profile, thereby developing heart-healthy soybeans.

