Troubleshooting unstable molecules in chemical space

2021 · Vol 12 (15) · pp. 5566-5573
Author(s): Salini Senthil, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

A high-throughput workflow for connectivity-preserving geometry optimization minimizes unintended structural rearrangements during quantum chemistry big data generation.
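The core check in such a workflow can be illustrated with a minimal sketch (not the authors' code): re-perceive the bonds from the relaxed coordinates and compare them with the intended bonding graph. This assumes a recent RDKit build that ships rdDetermineBonds, and that the reference molecule and the XYZ block share the same atom ordering.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import rdDetermineBonds

def connectivity_preserved(reference: Chem.Mol, optimized_xyz: str) -> bool:
    """True if bonds perceived from the optimized geometry match the reference graph."""
    opt = Chem.MolFromXYZBlock(optimized_xyz)    # atoms and coordinates only
    rdDetermineBonds.DetermineConnectivity(opt)  # perceive bonds from 3D positions
    ref_adj = Chem.GetAdjacencyMatrix(reference)
    opt_adj = Chem.GetAdjacencyMatrix(opt)
    return ref_adj.shape == opt_adj.shape and np.array_equal(ref_adj, opt_adj)
```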

2015 · Vol 2015 · pp. 1-8
Author(s): Andreas Friedrich, Erhan Kenar, Oliver Kohlbacher, Sven Nahnsen

Big data bioinformatics aims at drawing biological conclusions from huge and complex biological datasets. Added value from the analysis of big data, however, is only possible if the data are accompanied by accurate metadata annotation. Particularly in high-throughput experiments, intelligent approaches are needed to keep track of the experimental design, including the conditions that are studied as well as information that might be relevant for failure analysis or future follow-up experiments. In addition to the management of this information, means for an integrated design and interfaces for structured data annotation are urgently needed by researchers. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface allowing the collection of arbitrary metadata. To exchange and edit information, we provide a spreadsheet-based, human-readable format. Subsequently, sample sheets with identifiers and meta-information for data generation facilities can be created. Data files created after measurement of the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.
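The factor-based idea can be sketched concisely: a full-factorial expansion of factor levels into a flat, spreadsheet-compatible sample sheet, one row per condition and replicate. This is a minimal illustration, not the system's implementation; the factor names, levels, and TSV layout are hypothetical.

```python
import csv
from itertools import product

# Hypothetical experimental factors and their levels
factors = {
    "genotype": ["wild-type", "knockout"],
    "treatment": ["control", "drug"],
    "timepoint_h": [0, 24],
}
replicates = 3

with open("sample_sheet.tsv", "w", newline="") as fh:
    writer = csv.writer(fh, delimiter="\t")
    writer.writerow(["sample_id", *factors, "replicate"])  # header from factor names
    for i, (levels, rep) in enumerate(
        product(product(*factors.values()), range(1, replicates + 1)), start=1
    ):
        # One row per factor-level combination and replicate (2*2*2*3 = 24 rows)
        writer.writerow([f"S{i:03d}", *levels, rep])
```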


2020 · Vol 39 (5) · pp. 397-421
Author(s): Charlene Andraos, Il Je Yu, Mary Gulumian

Despite several studies addressing nanoparticle (NP) interference with conventional toxicity assay systems, researchers still rely heavily on these assays, particularly in high-throughput screening (HTS) applications, to generate "big" data for predictive toxicity approaches. Moreover, researchers often neglect to investigate the different interference mechanisms, which evidently depend on the type of assay system implemented. The approaches in the literature appear inadequate, as they often address only one type of interference mechanism to the exclusion of others. For example, interference of NPs that have entered cells would require intracellular assessment of their interference with fluorescent dyes, which has so far been neglected. The present study investigated the mechanisms of interference of gold and silver NPs in assay systems implemented in HTS, including optical interference as well as adsorption or catalysis. The conventional assays selected cover all optical read-out systems, that is, absorbance (XTT toxicity assay), fluorescence (CytoTox-ONE Homogeneous membrane integrity assay), and luminescence (CellTiter-Glo luminescent assay). Furthermore, this study demonstrated NP quenching of fluorescent dyes also used in HTS (2′,7′-dichlorofluorescein, propidium iodide, and 5,5′,6,6′-tetrachloro-1,1′,3,3′-tetraethylbenzimidazolylcarbocyanine iodide). To conclude, NP interference is not a novel concept; however, ignoring it in HTS may jeopardize attempts at predictive toxicology. It should be mandatory to report the assessment of all interference mechanisms within HTS, as well as to confirm results with label-free methodologies, to ensure reliable big data generation for predictive toxicology.
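One such interference check can be expressed numerically: compare a cell-free nanoparticle-plus-reagent control against the reagent-only blank and flag read-outs that shift beyond a tolerance. A minimal sketch, not the study's analysis code; the signal values and tolerance are hypothetical.

```python
def flags_interference(np_reagent_signal: float,
                       reagent_blank_signal: float,
                       tolerance: float = 0.10) -> bool:
    """True if nanoparticles shift the cell-free read-out by more than the tolerance."""
    if reagent_blank_signal == 0:
        raise ValueError("blank signal must be non-zero")
    relative_shift = abs(np_reagent_signal - reagent_blank_signal) / reagent_blank_signal
    return relative_shift > tolerance

# e.g. NP quenching of a fluorescent dye: blank reads 1000 RFU,
# NP + dye reads 620 RFU -> 38% shift -> interference flagged
print(flags_interference(620.0, 1000.0))  # True
```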


Author(s): Xabier Rodríguez-Martínez, Enrique Pascual-San-José, Mariano Campoy-Quiles

This review article presents the state of the art in high-throughput computational and experimental screening routines with applications in organic solar cells, including materials discovery, device optimization, and machine-learning algorithms.


2021 · Vol 9 (9) · pp. 3324-3333
Author(s): Ke Zhao, Ömer H. Omar, Tahereh Nematiaram, Daniele Padula, Alessandro Troisi

125 potential TADF candidates are identified through quantum chemistry calculations on 700 molecules drawn from a database of 40 000 molecular semiconductors. Most of them are new, and some do not belong to the class of donor–acceptor molecules.
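The funnel-style selection can be sketched with the standard TADF design criterion, a small singlet-triplet gap. This is a minimal illustration, not the paper's screening code; the gap threshold and input data are hypothetical, and the paper's actual criteria may differ.

```python
# Hypothetical screening results: (molecule_id, E_S1_eV, E_T1_eV)
candidates = [
    ("mol-0001", 3.10, 2.95),
    ("mol-0002", 2.80, 2.20),
    ("mol-0003", 3.40, 3.32),
]

MAX_GAP_EV = 0.20  # hypothetical DeltaE_ST cut-off for efficient reverse ISC

# Keep only molecules whose singlet-triplet gap is below the threshold
tadf_hits = [
    mol_id for mol_id, e_s1, e_t1 in candidates
    if (e_s1 - e_t1) <= MAX_GAP_EV
]
print(tadf_hits)  # ['mol-0001', 'mol-0003']
```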


2021 · Vol 8 (1)
Author(s): Ikbal Taleb, Mohamed Adel Serhani, Chafik Bouhaddioui, Rachida Dssouli

Big Data is an essential research area for governments, institutions, and private agencies to support their analytics decisions. Big Data concerns every aspect of data: how it is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in data quality may have unpredictable consequences, as confidence in the data and its source is lost. In the Big Data context, data characteristics such as volume, multiple heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a very costly and time-consuming process, since excessive computing resources are required. Maintaining quality throughout the Big Data lifecycle requires quality profiling and verification before any processing decision. A BDQ management framework for enhancing pre-processing activities while strengthening data control is proposed. The framework uses a new concept, the Big Data Quality Profile, which captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the framework's profiling and sampling components, a fast and efficient data quality estimation is initiated before and after an intermediate pre-processing phase. The exploratory profiling component plays the initial role in quality profiling; it uses a set of predefined quality metrics to evaluate important data quality dimensions, and it generates quality rules by applying various pre-processing activities and their related functions. These rules feed the Data Quality Profile and result in quality scores for the selected quality attributes. The framework implementation and dataflow management across the various quality management processes are discussed, and ongoing work on framework evaluation and deployment to support quality evaluation decisions concludes the paper.
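The profiling-before-processing idea can be illustrated with a toy quality profile that scores a data sample on two common dimensions, completeness and validity, before committing to full pre-processing. A minimal sketch, not the proposed framework; the field, validity rule, and acceptance threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class QualityProfile:
    completeness: float  # fraction of non-missing values
    validity: float      # fraction of values passing a quality rule

    def acceptable(self, threshold: float = 0.9) -> bool:
        # Accept the sample for full processing only if every dimension passes
        return min(self.completeness, self.validity) >= threshold

def profile_ages(sample: list) -> QualityProfile:
    present = [v for v in sample if v is not None]
    valid = [v for v in present if 0 <= v <= 120]  # hypothetical rule: plausible age
    n = len(sample) or 1
    return QualityProfile(len(present) / n, len(valid) / n)

profile = profile_ages([34, 29, None, 151, 47])
print(profile, profile.acceptable())  # completeness 0.8, validity 0.6 -> False
```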


Author(s): Pijush Kanti Dutta Pramanik, Saurabh Pal, Moutan Mukhopadhyay

Like other fields, the healthcare sector has been greatly impacted by big data. A huge volume of healthcare data and other related data is continually generated from diverse sources. Tapping and analysing these data suitably would open up new avenues and opportunities for healthcare services. In view of that, this paper presents a systematic overview of big data and big data analytics as applicable to modern-day healthcare. Acknowledging the massive upsurge in healthcare data generation, various 'V's specific to healthcare big data are identified. Different types of data analytics applicable to healthcare are discussed. Along with presenting the technological backbone of healthcare big data and analytics, the advantages and challenges of healthcare big data are meticulously explained. A brief report on the present and future market of healthcare big data and analytics is also presented, and several applications and use cases are discussed in detail.


Author(s): M. Mazhar Rathore, Anand Paul, Awais Ahmad, Gwanggil Jeon

Recently, rapid population growth in urban regions has increased the demand for services and infrastructure. These needs can be met with the use of Internet of Things (IoT) devices, such as sensors, actuators, smartphones, and smart systems, leading from Smart City concepts towards next-generation Super City planning. However, as thousands of IoT devices interconnect and communicate with each other over the Internet to establish smart systems, a huge amount of data, termed Big Data, is generated. Integrating IoT services and processing Big Data efficiently for decision making in a future Super City is a challenging task. Therefore, to meet such requirements, this paper presents an IoT-based system for next-generation Super City planning using Big Data analytics. The authors propose a complete system that includes various types of IoT-based smart systems for data generation, such as smart homes, vehicular networking, weather and water systems, smart parking, and surveillance objects. The proposed architecture comprises four tiers: (1) a Bottom Tier, (2) Intermediate Tier-1, (3) Intermediate Tier-2, and (4) a Top Tier, which handle data generation and collection, communication, data administration and processing, and data interpretation, respectively. The system implementation model is presented from data generation and collection through to decision making. The proposed system is implemented using the Hadoop ecosystem with MapReduce programming. The throughput and processing-time results show that the proposed Super City planning system is efficient and scalable.
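As a flavour of the processing tiers, the sketch below shows a Hadoop Streaming style mapper/reducer pair (two separate scripts in practice) that averages sensor readings per device. This is a minimal illustration, not the authors' implementation; the tab-separated "device_id, reading" input format is hypothetical.

```python
import sys

def mapper() -> None:
    # mapper.py: emit key<TAB>value pairs; Hadoop sorts them by key
    for line in sys.stdin:
        device_id, reading = line.rstrip("\n").split("\t")
        print(f"{device_id}\t{reading}")

def reducer() -> None:
    # reducer.py: input arrives grouped by key; average the readings per device
    current, total, count = None, 0.0, 0
    for line in sys.stdin:
        device_id, reading = line.rstrip("\n").split("\t")
        if device_id != current and current is not None:
            print(f"{current}\t{total / count:.2f}")
            total, count = 0.0, 0
        current = device_id
        total += float(reading)
        count += 1
    if current is not None:
        print(f"{current}\t{total / count:.2f}")
```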


2021
Author(s): Adarsh Kalikadien, Evgeny A. Pidko, Vivek Sinha

Local chemical space exploration of an experimentally synthesized material can be done by making slight structural variations of that material. Generating many molecular structures of reasonable quality that resemble an existing, purposeful (chemical) material is needed for high-throughput screening in material design. Large databases of geometries and chemical properties of transition metal complexes are not readily available, although these complexes are widely used in homogeneous catalysis. A Python-based workflow, ChemSpaX, aimed at automating local chemical space exploration for any type of molecule, is introduced, and its overall computational workflow is explained in detail. ChemSpaX uses 3D information to place functional groups on an input structure. For example, the input structure can be a catalyst for which one wants to use high-throughput screening to investigate whether the catalytic activity can be improved. The newly placed substituents are optimized using a computationally cheap force-field method; higher-level optimizations using xTB or DFT instead of a force field are also possible in the current workflow. In representative applications, it is shown that the structures generated by ChemSpaX are of reasonable quality for use in high-throughput screening. These applications include investigating various adducts on functionalized Mn-based pincer complexes, hydrogenation of Ru-based pincer complexes, functionalization of cobalt porphyrin complexes, and functionalization of a bipyridyl-functionalized cobalt porphyrin trapped in an M2L4-type cage complex. Descriptors that can be used in data-driven design and discovery of catalysts, such as the Gibbs free energy of reaction and the HOMO-LUMO gap, were selected and studied in more detail for the selected use cases. The relatively fast GFN2-xTB method was used to calculate these descriptors, and a comparison was made against DFT-calculated descriptors. ChemSpaX is open source and aims to bolster the efforts of the scientific community towards data-driven material discovery.
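The core operation ChemSpaX automates, placing a substituent at a chosen site and relaxing the result with a cheap force field, can be illustrated with plain RDKit. This sketch is not the ChemSpaX API; the scaffold, the Br placeholder marking the substitution site, and the substituent list are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

scaffold = Chem.MolFromSmiles("c1ccc(Br)cc1")  # Br marks the site to vary (hypothetical)
site = Chem.MolFromSmarts("[Br]")              # substructure query for that site
substituents = {"methyl": "C", "amino": "N", "trifluoromethyl": "C(F)(F)F"}

for name, smiles in substituents.items():
    group = Chem.MolFromSmiles(smiles)
    # Swap the placeholder for the substituent, then rebuild valences
    product = Chem.ReplaceSubstructs(scaffold, site, group, replaceAll=True)[0]
    Chem.SanitizeMol(product)
    product = Chem.AddHs(product)
    AllChem.EmbedMolecule(product, randomSeed=1)  # generate a 3D geometry
    AllChem.MMFFOptimizeMolecule(product)         # cheap force-field relaxation
    print(name, Chem.MolToSmiles(Chem.RemoveHs(product)))
```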


2020
Author(s): Anna M. Sozanska, Charles Fletcher, Dóra Bihary, Shamith A. Samarajiwa

More than three decades ago, the microarray revolution brought high-throughput data generation capability to biology and medicine. Subsequently, the emergence of massively parallel sequencing technologies led to many big-data initiatives such as the Human Genome Project and the Encyclopedia of DNA Elements (ENCODE) project. These, in combination with cheaper, faster massively parallel DNA sequencing, have democratised multi-omic (genomic, transcriptomic, translatomic, and epigenomic) data generation, leading to a data deluge in biomedicine. While some of these datasets are trapped in inaccessible silos, the vast majority are stored in public data resources and controlled-access data repositories, enabling their wider use (or misuse). Currently, most peer-reviewed publications require the deposition of the dataset associated with a study in one of these public data repositories. However, clunky, difficult-to-use interfaces and subpar or incomplete annotation prevent the discovery, searching, and filtering of these multi-omic data and hinder their re-purposing for other use cases. In addition, the proliferation of a multitude of different data repositories, with partially redundant storage of similar data, is yet another obstacle to their continued usefulness. Similarly, interfaces where annotation is spread across multiple web pages, accession identifiers with ambiguous or multiple interpretations, and a lack of good curation make these datasets difficult to use. We have produced SpiderSeqR, an R package whose main features include integration between the NCBI GEO and SRA databases, enabling a unified search of SRA and GEO datasets and associated annotations, conversion between database accessions, convenient filtering of results, and saving past queries for future use. All of the above features aim to promote data reuse, facilitating new discoveries and maximising the potential of existing datasets.
Availability: https://github.com/ss-lab-cancerunit/SpiderSeqR
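SpiderSeqR itself is an R package; as a rough Python illustration of the kind of cross-database query it automates, the sketch below searches GEO through NCBI Entrez with Biopython and prints the matching accessions. The search term and contact e-mail are placeholders, and the summary fields are as returned by NCBI's esummary for the gds database.

```python
from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder; NCBI asks for a contact address

# Search GEO DataSets (db="gds") with a hypothetical query
handle = Entrez.esearch(db="gds",
                        term="chip-seq[All Fields] AND human[Organism]",
                        retmax=5)
record = Entrez.read(handle)
handle.close()

# Fetch and print the summary (accession + title) for each hit
for uid in record["IdList"]:
    summary = Entrez.read(Entrez.esummary(db="gds", id=uid))
    doc = summary[0]
    print(doc["Accession"], doc.get("title", ""))
```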

