Positive selection signatures in Anqing six‐end‐white pig population based on reduced‐representation genome sequencing data

2021 ◽  
Author(s):  
L. Guo ◽  
H. Sun ◽  
Q. Zhao ◽  
Z. Xu ◽  
Z. Zhang ◽  
...  
2020 ◽  
Vol 66 (11) ◽  
pp. 1450-1458 ◽  
Author(s):  
Divinlal Harilal ◽  
Sathishkumar Ramaswamy ◽  
Tom Loney ◽  
Hanan Al Suwaidi ◽  
Hamda Khansaheb ◽  
...  

Abstract Background With the gradual reopening of economies and resumption of social life, robust surveillance mechanisms should be implemented to control the ongoing COVID-19 pandemic. Unlike RT-qPCR, SARS-CoV-2 whole genome sequencing (cWGS) has the added advantage of identifying cryptic origins of the virus, and the extent of community-based transmissions versus new viral introductions, which can in turn influence public health policy decisions. However, the practical and cost considerations of cWGS should be addressed before it is widely implemented. Methods We performed shotgun transcriptome sequencing using RNA extracted from nasopharyngeal swabs of patients with COVID-19, and compared it to targeted SARS-CoV-2 genome amplification and sequencing with respect to virus detection, scalability, and cost-effectiveness. To track virus origin, we used open-source multiple sequence alignment and phylogenetic tools to compare the assembled SARS-CoV-2 genomes to publicly available sequences. Results We found considerable improvement in whole genome sequencing data quality and viral detection using amplicon-based target enrichment of SARS-CoV-2. With enrichment, more than 99% of the sequencing reads mapped to the viral genome, compared to an average of 0.63% without enrichment. Consequently, an increase in genome coverage was obtained using substantially less sequencing data, enabling higher scalability and sizable cost reductions. We also demonstrated how SARS-CoV-2 genome sequences can be used to determine their possible origin through phylogenetic analysis including other viral strains. Conclusions SARS-CoV-2 whole genome sequencing is a practical, cost-effective, and powerful approach for population-based surveillance and control of viral transmission in the next phase of the COVID-19 pandemic.


Author(s):  
Judith Himmelbauer ◽  
Gábor Mészáros ◽  
Johann Sölkner

A copy number variation (CNV) is a loss or a gain of DNA sequence, ranging from 50 base pairs to a few megabase pairs. Most studies use whole genome sequencing data to detect deletions. Because SNP chip data are more commonly used in livestock, especially in cattle, detecting deletions from SNP chip data is of interest. In the present study, an approach based on SNP chip data and the analysis of Mendelian mismatches in parent-offspring pairs was developed. It exploits the fact that deletions appear as homozygous genotypes after SNP chip genotyping. For some SNPs with a high number of mismatches, the inheritance of the mismatches could be traced back to one or a few bulls, and regions of possible deletions were defined accordingly. The study shows that an approach based on Mendelian mismatches and SNP chip data is a promising way of detecting deletions.
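The mismatch test described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. It assumes genotypes are coded as two-letter strings ('AA', 'AB', 'BB') with missing calls removed beforehand, and it counts a Mendelian mismatch only when parent and offspring are homozygous for opposite alleles. The function names, SNP identifiers, and threshold are hypothetical.

```python
def is_opposing_homozygote(parent_gt, offspring_gt):
    """Return True when parent and offspring are homozygous for different alleles."""
    parent_hom = len(set(parent_gt)) == 1
    offspring_hom = len(set(offspring_gt)) == 1
    return parent_hom and offspring_hom and parent_gt != offspring_gt


def count_mismatches(pairs_by_snp):
    """Count opposing-homozygote pairs per SNP.

    pairs_by_snp maps a SNP id to a list of (parent_gt, offspring_gt) tuples.
    Assumption: missing genotype calls were filtered out before this step.
    """
    return {
        snp: sum(is_opposing_homozygote(p, o) for p, o in pairs)
        for snp, pairs in pairs_by_snp.items()
    }


def candidate_snps(pairs_by_snp, min_mismatches):
    """Return SNP ids whose mismatch count reaches an (arbitrary) threshold."""
    counts = count_mismatches(pairs_by_snp)
    return sorted(snp for snp, n in counts.items() if n >= min_mismatches)


# Hypothetical toy data: three parent-offspring pairs at two SNPs.
example = {
    "snp_1": [("AA", "BB"), ("AA", "BB"), ("AB", "BB")],
    "snp_2": [("AA", "AB"), ("BB", "BB"), ("AB", "AA")],
}
print(count_mismatches(example))
print(candidate_snps(example, min_mismatches=2))
```

SNPs flagged this way are only candidates. Tracing the mismatches back to shared ancestors, as the abstract describes, would be a separate step.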


2020 ◽  
Author(s):  
Divinlal Harilal ◽  
Sathishkumar Ramaswamy ◽  
Tom Loney ◽  
Hanan Al Suwaidi ◽  
Hamda Khansaheb ◽  
...  

Abstract Background With the gradual reopening of economies and resumption of social life, robust surveillance mechanisms should be implemented to control the ongoing COVID-19 pandemic. Unlike RT-qPCR, SARS-CoV-2 Whole Genome Sequencing (cWGS) has the added advantage of identifying cryptic origins of the virus, and the extent of community-based transmissions versus new viral introductions, which can in turn influence public health policy decisions. However, practical and cost considerations of cWGS should be addressed before it can be widely implemented. Methods We performed shotgun transcriptome sequencing using RNA extracted from nasopharyngeal swabs of patients with COVID-19, and compared it to targeted SARS-CoV-2 full genome amplification and sequencing with respect to virus detection, scalability, and cost-effectiveness. To track virus origin, we used open-source multiple sequence alignment and phylogenetic tools to compare the assembled SARS-CoV-2 genomes to publicly available sequences. Results We show a significant improvement in whole genome sequencing data quality and viral detection using amplicon-based target enrichment of SARS-CoV-2. With enrichment, more than 99% of the sequencing reads mapped to the viral genome compared to an average of 0.63% without enrichment. Consequently, a dramatic increase in genome coverage was obtained using significantly less sequencing data, enabling higher scalability and significant cost reductions. We also demonstrate how SARS-CoV-2 genome sequences can be used to determine their possible origin through phylogenetic analysis including other viral strains. Conclusions SARS-CoV-2 whole genome sequencing is a practical, cost-effective, and powerful approach for population-based surveillance and control of viral transmission in the next phase of the COVID-19 pandemic.


2017 ◽  
Author(s):  
Huan Fan ◽  
Anthony R. Ives ◽  
Yann Surget-Groba

Abstract Although genome sequencing is becoming cheaper and faster, reducing the quantity of data by only sequencing part of the genome lowers both sequencing costs and computational burdens. One popular genome-reduction approach is restriction site associated DNA sequencing, or RADseq. RADseq was initially designed for studying genetic variation across genomes, usually at the population level, and it has also proved to be suitable for interspecific phylogeny reconstruction. RADseq data pose challenges for standard phylogenomic methods, however, due to incomplete coverage of the genome and large amounts of missing data. Alignment-free methods are both efficient and accurate for phylogenetic reconstructions with whole genomes and are especially practical for non-model organisms; nonetheless, alignment-free methods have only been applied with whole genome sequences. Here, we test a full-genome assembly and alignment-free method, AAF, in application to RADseq data and propose two procedures for read selection to remove missing data. We validate these methods using both simulations and a real dataset. Read selection improved the accuracy of phylogenetic reconstruction in every simulated scenario and the real dataset, making AAF comparable to or better than alignment-based methods with much lower computational burdens. We also investigated the sources of missing data in RADseq and their effects on phylogeny reconstruction using AAF. The AAF pipeline modified for RADseq data, phyloRAD, is available on GitHub (https://github.com/fanhuan/phyloRAD).
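As a rough illustration of what "alignment-free" means here, the sketch below compares two samples by the k-mers their reads contain, with no alignment step. It is not AAF or phyloRAD: the Jaccard distance is swapped in as a simple stand-in for whatever distance those tools compute, and the function names, reads, and choice of k are hypothetical.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def sample_kmers(reads, k):
    """Pool the k-mers of every read belonging to one sample."""
    pooled = set()
    for read in reads:
        pooled |= kmers(read, k)
    return pooled


def jaccard_distance(kmers_a, kmers_b):
    """1 - |A ∩ B| / |A ∪ B|; a stand-in distance, not the AAF formula."""
    union = kmers_a | kmers_b
    if not union:
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


# Hypothetical toy reads from two samples.
sample_a = sample_kmers(["ACGTAC"], k=3)
sample_b = sample_kmers(["ACGTTC"], k=3)
print(jaccard_distance(sample_a, sample_b))
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method. K-mers present in one sample only because a locus was not sequenced in the other inflate the distance, which is the missing-data problem the abstract addresses with read selection.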


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Grace Png ◽  
Andrei Barysenka ◽  
Linda Repetto ◽  
Pau Navarro ◽  
Xia Shen ◽  
...  

Abstract Despite the increasing global burden of neurological disorders, there is a lack of effective diagnostic and therapeutic biomarkers. Proteins are often dysregulated in disease and have a strong genetic component. Here, we carry out a protein quantitative trait locus analysis of 184 neurologically-relevant proteins, using whole genome sequencing data from two isolated population-based cohorts (N = 2893). In doing so, we elucidate the genetic landscape of the circulating proteome and its connection to neurological disorders. We detect 214 independently-associated variants for 107 proteins, the majority of which (76%) are cis-acting, including 114 variants that have not been previously identified. Using two-sample Mendelian randomisation, we identify causal associations between serum CD33 and Alzheimer’s disease, GPNMB and Parkinson’s disease, and MSR1 and schizophrenia, describing their clinical potential and highlighting drug repurposing opportunities.


BMJ ◽  
2021 ◽  
pp. n214
Author(s):  
Weedon MN ◽  
Jackson L ◽  
Harrison JW ◽  
Ruth KS ◽  
Tyrrell J ◽  
...  

Abstract Objective To determine whether the sensitivity and specificity of SNP chips are adequate for detecting rare pathogenic variants in a clinically unselected population. Design Retrospective, population based diagnostic evaluation. Participants 49 908 people recruited to the UK Biobank with SNP chip and next generation sequencing data, and an additional 21 people who purchased consumer genetic tests and shared their data online via the Personal Genome Project. Main outcome measures Genotyping (that is, identification of the correct DNA base at a specific genomic location) using SNP chips versus sequencing, with results split by frequency of that genotype in the population. Rare pathogenic variants in the BRCA1 and BRCA2 genes were selected as an exemplar for detailed analysis of clinically actionable variants in the UK Biobank, and BRCA related cancers (breast, ovarian, prostate, and pancreatic) were assessed in participants through use of cancer registry data. Results Overall, genotyping using SNP chips performed well compared with sequencing; sensitivity, specificity, positive predictive value, and negative predictive value were all above 99% for 108 574 common variants directly genotyped on the SNP chips and sequenced in the UK Biobank. However, the likelihood of a true positive result decreased dramatically with decreasing variant frequency; for variants that are very rare in the population, with a frequency below 0.001% in UK Biobank, the positive predictive value was very low and only 16% of 4757 heterozygous genotypes from the SNP chips were confirmed with sequencing data. Results were similar for SNP chip data from the Personal Genome Project, and 20/21 individuals analysed had at least one false positive rare pathogenic variant that had been incorrectly genotyped. 
For pathogenic variants in the BRCA1 and BRCA2 genes, which are individually very rare, the overall performance metrics for the SNP chips versus sequencing in the UK Biobank were: sensitivity 34.6%, specificity 98.3%, positive predictive value 4.2%, and negative predictive value 99.9%. Rates of BRCA related cancers in UK Biobank participants with a positive SNP chip result were similar to those for age matched controls (odds ratio 1.31, 95% confidence interval 0.99 to 1.71) because the vast majority of variants were false positives, whereas sequence positive participants had a significantly increased risk (odds ratio 4.05, 2.72 to 6.03). Conclusions SNP chips are extremely unreliable for genotyping very rare pathogenic variants and should not be used to guide health decisions without validation.
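The four performance metrics reported above follow from a 2x2 confusion table with sequencing treated as the reference. The sketch below shows the standard definitions and why positive predictive value collapses when true variants are rare. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic metrics from confusion-table counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true variants that were detected
        "specificity": tn / (tn + fp),  # non-carriers correctly called negative
        "ppv": tp / (tp + fp),          # positive calls that are real
        "npv": tn / (tn + fn),          # negative calls that are real
    }


# Hypothetical counts: 10 true carriers among 10,000 genotyped people.
metrics = diagnostic_metrics(tp=9, fp=91, fn=1, tn=9899)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, specificity stays above 99% while fewer than one in ten positive calls is real, because false positives from the large non-carrier group outnumber the few true carriers.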


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sung Yong Park ◽  
Gina Faraci ◽  
Pamela M. Ward ◽  
Jane F. Emerson ◽  
Ha Youn Lee

Abstract COVID-19 global cases have climbed to more than 33 million, with over a million total deaths, as of September 2020. Real-time massive SARS-CoV-2 whole genome sequencing is key to tracking chains of transmission and estimating the origin of disease outbreaks. Yet no methods have simultaneously achieved high precision, simple workflow, and low cost. We developed a high-precision, cost-efficient SARS-CoV-2 whole genome sequencing platform for COVID-19 genomic surveillance, CorvGenSurv (Coronavirus Genomic Surveillance). CorvGenSurv directly amplified viral RNA from COVID-19 patients’ nasopharyngeal/oropharyngeal (NP/OP) swab specimens and sequenced the SARS-CoV-2 whole genome in three segments by long-read, high-throughput sequencing. Sequencing of the whole genome in three segments significantly reduced sequencing data waste, thereby preventing dropouts in genome coverage. We validated the precision of our pipeline by both control genomic RNA sequencing and Sanger sequencing. We produced near full-length whole genome sequences from individuals who tested positive for COVID-19 between April and June 2020 in Los Angeles County, California, USA. These sequences were highly diverse in the G clade, with nine novel amino acid mutations including NSP12-M755I and ORF8-V117F. With its readily adaptable design, CorvGenSurv grants wide access to genomic surveillance, permitting immediate public health response to sudden threats.

