DISSECT: A new tool for analyzing extremely large genomic datasets

2015 ◽  
Author(s):  
Oriol Canela-Xandri ◽  
Andy Law ◽  
Alan Gray ◽  
John A. Woolliams ◽  
Albert Tenesa

Computational tools are quickly becoming the main bottleneck in analyzing large-scale genomic and genetic data. This big-data problem, which affects a wide range of fields, is becoming more acute as the volume of available data grows rapidly. To address it, we developed DISSECT, a new, easy-to-use, and freely available software tool that exploits the parallel computer architectures of supercomputers to perform a wide range of genomic and epidemiologic analyses that currently can only be carried out on reduced sample sizes or under restricted conditions. We showcased our new tool by addressing the challenge of predicting phenotypes from genotype data in human populations using Mixed Linear Model analysis. We analyzed simulated traits from half a million individuals genotyped for 590,004 SNPs using the combined computational power of 8,400 processor cores. We found that prediction accuracies in excess of 80% of the theoretical maximum could be achieved with large numbers of training individuals.

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5179 ◽  
Author(s):  
Samuel R. Borstein ◽  
Brian C. O’Meara

Background. DNA sequences are pivotal for a wide array of research in biology. Large sequence databases, like GenBank, provide an amazing resource to utilize DNA sequences for large scale analyses. However, many sequence records on GenBank contain more than one gene or are portions of genomes. Inconsistencies in the way genes are annotated and the numerous synonyms a single gene may be listed under provide major challenges for extracting large numbers of subsequences for comparative analysis across taxa. At present, there is no easy way to extract portions from many GenBank accessions based on annotations where gene names may vary extensively. Results. The R package AnnotationBustR allows users to extract sequences based on GenBank annotations through the ACNUC retrieval system given search terms of gene synonyms and accession numbers. AnnotationBustR extracts subsequences of interest and then writes them to a FASTA file for users to employ in their research endeavors. Conclusion. FASTA files of extracted subsequences and accession tables generated by AnnotationBustR allow users to quickly find and extract subsequences from GenBank accessions. These sequences can then be incorporated in various analyses, like the construction of phylogenies to test a wide range of ecological and evolutionary hypotheses.
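The workflow this abstract describes (match annotated gene names against a list of synonyms, cut out the annotated coordinates, write FASTA) can be sketched in a few lines. The following is a minimal Python illustration under stated assumptions, not the AnnotationBustR API: the package itself is written in R and queries GenBank through ACNUC, whereas here the records, accession identifiers, coordinates, and the function names `normalize`, `extract_by_synonyms`, and `to_fasta` are all hypothetical and the data live in memory.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout (an assumption, not a GenBank or ACNUC format):
# {"accession": str, "sequence": str, "features": [(gene_name, start, end), ...]}
# Coordinates are 1-based and inclusive, as in GenBank feature tables.
Hit = Tuple[str, str, str]  # (accession, annotated gene name, subsequence)


def normalize(name: str) -> str:
    """Lower-case a gene name and keep only letters and digits,
    so that 'CO-1', 'co1' and 'CO 1' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def extract_by_synonyms(records: Iterable[Dict], synonyms: Iterable[str]) -> List[Hit]:
    """Return the subsequence of every feature whose annotated name
    matches one of the supplied synonyms after normalization."""
    wanted = {normalize(s) for s in synonyms}
    hits: List[Hit] = []
    for rec in records:
        for gene, start, end in rec["features"]:
            if normalize(gene) in wanted:
                # Convert 1-based inclusive coordinates to a Python slice.
                hits.append((rec["accession"], gene, rec["sequence"][start - 1:end]))
    return hits


def to_fasta(hits: Iterable[Hit], width: int = 60) -> str:
    """Format hits as FASTA text, wrapping sequence lines at `width` characters."""
    lines: List[str] = []
    for accession, gene, seq in hits:
        lines.append(f">{accession}|{gene}")
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up accessions and sequences, for illustration only.
    records = [
        {"accession": "ACC0001", "sequence": "ATGCGTACGTTAGC",
         "features": [("COI", 1, 6), ("cytb", 7, 14)]},
        {"accession": "ACC0002", "sequence": "GGGATGAAACCC",
         "features": [("cox1", 4, 9)]},
    ]
    print(to_fasta(extract_by_synonyms(records, ["COI", "COX1", "CO1"])), end="")
```

Normalizing names before comparison is one simple way to absorb case and punctuation differences; true synonyms (for example COI versus cox1) still have to be listed explicitly, which is why the search takes a set of synonyms rather than a single name.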


Author(s):  
Alexandra Sanmark

Chapter 5 shifts the focus to the rituals and activities of the wider community in Scandinavia. At thing sites a wide range of community activities and rituals were enacted, which most likely created collective memories and strengthened social cohesion. Many of these activities may have been designed by the elite, but equally the idea of assemblies as communal spaces may have been collectively driven. The archaeological signature of meeting-places and assembly-sites suggests associations with feasting and eating on a large scale, and architectural layouts that emphasised the collective over the individual and facilitated group interaction and cohesion. The construction, enlargement and maintenance of monuments and other features required the participation of large numbers of people. By joining in this work the population gained shared ownership of the sites. This was further enhanced by communal activities during the meetings, which also involved games and sports, as well as trade. Assemblies therefore formed arenas of interplay between the top-elite and the wider population; kings were elected and ruled through the assembly, while at the same time continuously dependent on the endorsement of the people.


Author(s):  
Samuel R. Borstein ◽  
Brian C. O'Meara

Background. DNA sequences are pivotal for a wide array of research in biology. Large sequence databases, like GenBank, provide an amazing resource to utilize DNA sequences for large scale analyses. However, many sequences on GenBank contain more than one gene or are portions of genomes, and inconsistencies in the way genes are annotated and the numerous synonyms a single gene may be listed under provide major challenges for extracting large numbers of subsequences for comparative analysis across taxa. At present, there is no easy way to extract portions from multiple GenBank accessions based on annotations where gene names may vary extensively. Results. The R package AnnotationBustR allows users to extract sequences based on GenBank annotations through the ACNUC retrieval system given search terms of gene synonyms and accession numbers. AnnotationBustR extracts portions of interest and then writes them to a FASTA file for users to employ in their research endeavors. Conclusion. FASTA files of extracted subsequences and accession tables generated by AnnotationBustR allow users to quickly find and extract subsequences from GenBank accessions. These sequences can then be incorporated in various analyses, like the construction of phylogenies to test a wide range of ecological and evolutionary hypotheses.


2019 ◽  
Vol 374 (1782) ◽  
pp. 20190224 ◽  
Author(s):  
Daniel J. Becker ◽  
Alex D. Washburne ◽  
Christina L. Faust ◽  
Erin A. Mordecai ◽  
Raina K. Plowright

Disease emergence events, epidemics and pandemics all underscore the need to predict zoonotic pathogen spillover. Because cross-species transmission is inherently hierarchical, involving processes that occur at varying levels of biological organization, such predictive efforts can be complicated by the many scales and vastness of data potentially required for forecasting. A wide range of approaches are currently used to forecast spillover risk (e.g. macroecology, pathogen discovery, surveillance of human populations, among others), each of which is bound within particular phylogenetic, spatial and temporal scales of prediction. Here, we contextualize these diverse approaches within their forecasting goals and resulting scales of prediction to illustrate critical areas of conceptual and pragmatic overlap. Specifically, we focus on an ecological perspective to envision a research pipeline that connects these different scales of data and predictions from the aims of discovery to intervention. Pathogen discovery and predictions focused at the phylogenetic scale can first provide coarse and pattern-based guidance for which reservoirs, vectors and pathogens are likely to be involved in spillover, thereby narrowing surveillance targets and where such efforts should be conducted. Next, these predictions can be followed with ecologically driven spatio-temporal studies of reservoirs and vectors to quantify spatio-temporal fluctuations in infection and to mechanistically understand how pathogens circulate and are transmitted to humans. This approach can also help identify general regions and periods for which spillover is most likely. We illustrate this point by highlighting several case studies where long-term, ecologically focused studies (e.g. Lyme disease in the northeast USA, Hendra virus in eastern Australia, Plasmodium knowlesi in Southeast Asia) have facilitated predicting spillover in space and time and facilitated the design of possible intervention strategies. 
Such studies can in turn help narrow human surveillance efforts and help refine and improve future large-scale, phylogenetic predictions. We conclude by discussing how greater integration and exchange between data and predictions generated across these varying scales could ultimately help generate more actionable forecasts and interventions. This article is part of the theme issue ‘Dynamic and integrative approaches to understanding pathogen spillover’.


2020 ◽  
Author(s):  
Marshall W. Ritchie ◽  
Jeff W. Dawson ◽  
Heath A. MacMillan

Abstract. The body temperature of ectothermic animals is heavily dependent on environmental temperature, impacting fitness. Laboratory exposure to favorable and unfavorable temperatures is used to understand these effects, as well as the physiological, biochemical, and molecular underpinnings of variation in thermal performance. Although small ectotherms, like insects, can often be easily reared in large numbers, it can be challenging and expensive to simultaneously create and manipulate several thermal environments in a laboratory setting. Here, we describe the creation and use of a thermal gradient device that can produce a wide range of constant or varying temperatures concurrently. This device is composed of a solid aluminum plate and copper piping, combined with a pair of programmable refrigerated circulators. As a simple proof-of-concept, we completed single experimental runs to produce a low-temperature survival curve for flies (Drosophila melanogaster) and explore the effects of daily thermal cycles of varying amplitude on growth rates of crickets (Gryllodes sigillatus). This approach avoids the use of multiple heating/cooling water or glycol baths or incubators for large-scale assessments of organismal thermal performance. It makes static or dynamic thermal experiments (e.g., creating thermal performance or survival curves, quantifying responses to fluctuating thermal environments, or monitoring animal behaviour across a range of temperatures) easier, faster, and less costly.


1993 ◽  
Vol 323 ◽  
Author(s):  
Y. S. Li ◽  
M. A. van Daelen ◽  
D. King-Smith ◽  
M. Wrinn ◽  
E. Wimmer ◽  
...  

Abstract. Density functional theory provides a first-principles approach for computing the geometric and electronic structures, and a wealth of corresponding properties, of a wide range of materials types and compositions, including bulk solids, surfaces, defects and clusters of molecules. Parallel advances in hardware performance, implementation strategies and algorithms have all contributed to a rapid growth in the number of important applications. Recent developments under each of these themes are outlined and the breadth of current applications is illustrated by typical examples. Issues associated with the implementation and performance of density functional methods on parallel computer architectures are discussed.


2021 ◽  
pp. 1-18
Author(s):  
Andrew Cardow ◽  
Jean-Sebastien Imbeau ◽  
Bill Willie Apiata ◽  
Jenny Martin

Abstract. Transition from the military environment into a civilian environment is a topic that has seen increasing attention within the last two decades. The literature clearly articulates that transition from the military to the civilian world is somewhat different from transitioning from school to work, from career to career, or from work to retirement. Many, but not all, of the extant examples regarding military transition are case studies, focus groups or small-scale qualitative surveys. The following article details a large-scale survey that took place in New Zealand in 2019. From just over 1400 responses, a wide range of information was gathered. The aim of the survey was to uncover the experiences of military personnel who had undergone transition within New Zealand. In this respect, the survey was exploratory. We report here the qualitative results that expand the existing body of knowledge of military transition. Our results are in line with international results and demonstrate that a large majority of respondents had a less than desirable transition experience. The contribution made is therefore a reinforcement that current practice in this area needs a great deal of attention. The following outlines the experiences our New Zealand-based respondents had and how these mirror the extant international literature. As this was the first survey of its kind to attract large numbers of respondents within New Zealand, the results and discussion that follow present aspects of transition that the Ministry of Defence and the New Zealand Defence Force may wish to consider when planning future transition programmes.


Author(s):  
V. C. Kannan ◽  
A. K. Singh ◽  
R. B. Irwin ◽  
S. Chittipeddi ◽  
F. D. Nkansah ◽  
...  

Titanium nitride (TiN) films have historically been used as a diffusion barrier between silicon and aluminum, as an adhesion layer for tungsten deposition, and as an interconnect material, among other uses. Recently, the role of TiN films as contact barriers in very large scale silicon integrated circuits (VLSI) has been extensively studied. TiN films have resistivities on the order of 20 μΩ-cm, which is much lower than that of titanium (nearly 66 μΩ-cm). Deposited TiN films show resistivities that vary from 20 to 100 μΩ-cm depending upon the type of deposition and process conditions. TiNx is known to have a NaCl-type crystal structure for a wide range of compositions. The change in color from metallic luster to gold reflects the stabilization of the TiNx (FCC) phase over the close-packed Ti(N) hexagonal phase. It was found that the ideal TiN (1:1) composition with the FCC (NaCl-type) structure gives the best electrical properties.


Author(s):  
Jose-Maria Carazo ◽  
I. Benavides ◽  
S. Marco ◽  
J.L. Carrascosa ◽  
E.L. Zapata

Obtaining the three-dimensional (3D) structure of negatively stained biological specimens at a resolution of, typically, 2-4 nm is becoming a relatively common practice in an increasing number of laboratories. A combination of new conceptual approaches, new software tools, and faster computers has made this situation possible. However, all these 3D reconstruction processes are quite computer intensive, and the medium-term future is full of suggestions entailing an even greater need for computing power. Up to now all published 3D reconstructions in this field have been performed on conventional (sequential) computers, but new parallel computer architectures offer the potential for order-of-magnitude increases in computing power and should, therefore, be considered for possible application to the most computing-intensive tasks. We have studied both shared-memory-based computer architectures, like the BBN Butterfly, and local-memory-based architectures, mainly hypercubes implemented on transputers, where we have used the algorithmic mapping method proposed by Zapata et al. In this work we have developed the basic software tools needed to obtain a 3D reconstruction from non-crystalline specimens (“single particles”) using the so-called Random Conical Tilt Series Method. We start from a pair of images presenting the same field, first tilted (by ≃55°) and then untilted. It is then assumed that we can supply the system with the image of the particle we are looking for (ideally, a 2D average from a previous study) and with a matrix describing the geometrical relationships between the tilted and untilted fields (this step is now accomplished by interactively marking a few pairs of corresponding features in the two fields). From here on the 3D reconstruction process may be run automatically.

