Resumption of Data Extraction Process in Parallel Data Warehouses

Author(s):  
Marcin Gorawski ◽  
Pawel Marks


Pharmaceutics ◽ 
2021 ◽  
Vol 13 (3) ◽  
pp. 358 ◽  
Author(s):  
Chiara R. M. Brambilla ◽  
Ogochukwu Lilian Okafor-Muo ◽  
Hany Hassanin ◽  
Amr ElShaer

Three-dimensional (3D) printing is a recent technology that makes it possible to manufacture personalised dosage forms and has a broad range of applications. One of its most developed applications is the manufacture of oral solid dosage forms, and the four 3DP techniques most widely used for this purpose are FDM, inkjet 3DP, SLA and SLS. This systematic review statistically analyses the 3DP techniques currently employed in manufacturing oral solid formulations and assesses recent trends in this technology. The work was organised into four steps: (1) screening of the articles, definition of the inclusion and exclusion criteria, and classification of the articles into the two main groups (included/excluded); (2) quantification and characterisation of the included articles; (3) evaluation of data validity and the data extraction process; and (4) data analysis, discussion, and conclusions to determine which technique offers the best properties for manufacturing oral solid formulations. For the SLS technique, the majority of articles performed all the characterisation tests required by the BP (drug content, drug dissolution profile, hardness, friability, disintegration time and uniformity of weight), with the exception of the friability test. However, it is not possible to identify which of the four 3DP techniques is the most suitable for manufacturing oral solid formulations, because the choice depends on several parameters, such as the type of formulation and the physical-mechanical properties to be achieved. Moreover, each technique has its own advantages and disadvantages: for FDM the biggest challenge is degradation of the drug due to the high printing temperature, while for SLA it is the toxicity and carcinogenic risk of the photopolymerising material.


Author(s):  
Francisco Andres Rivera-Quiroz ◽  
Jeremy Miller

Traditional taxonomic publications have served as a biological data repository, accumulating vast amounts of data on species diversity, geographical and temporal distributions, ecological interactions, and taxonomic relations, among many other types of information. However, the fragmented nature of taxonomic literature has made this data difficult to access and use to its full potential. Current anthropogenic impact on biodiversity demands faster knowledge generation, but also making better use of what we already have. This could help us make better-informed decisions about conservation and resource management. In past years, several efforts have been made to make taxonomic literature more mobilized and accessible. These include online publications, open access journals, the digitization of old paper literature, and improved availability through online specialized repositories such as the Biodiversity Heritage Library (BHL) and the World Spider Catalog (WSC), among others. Although easy to share, PDF publications still have most of their biodiversity data embedded in strings of text, making them less dynamic and more difficult or impossible to read and analyze without a human interpreter. Recently developed tools such as GoldenGATE-Imagine (GGI) allow PDFs to be transformed into XML files in which taxonomically relevant data are extracted and categorized. These data can then be aggregated in databases such as Plazi TreatmentBank, where they can be re-explored, queried and analyzed. Here we combined several of these cybertaxonomic tools to test the data extraction process for one potential application: the design and planning of an expedition to collect fresh material in the field. We targeted the ground spider Teutamus politus and other related species from the Teutamus group (TG) (Araneae; Liocranidae). These spiders are known from South East Asia and have been cataloged in the family Liocranidae; however, their relations, biology and evolution are still poorly understood. We marked up 56 publications that contained taxonomic treatments with specimen records for the Liocranidae. Of these publications, 20 contained information on members of the TG. Geographical distributions and occurrences of 90 TG species were analyzed based on 1,309 specimen records. These data were used to design our field collection in a way that allowed us to optimize the collection of adult specimens of our target taxa. The TG genera were most common in Indonesia, Thailand and Malaysia. Of these, Thailand was the second richest but had the most records of T. politus. Seasonal distribution of TG specimens in Thailand suggested June and July as the best time for collecting adults. Based on these analyses, we decided to sample from mid-July to mid-August 2018 in the three Thai provinces that combined the most records of TG species and T. politus. Relying on the results of our literature analyses and using standard collection methods for ground spiders, we captured at least one specimen of every TG genus reported for Thailand. Our one-month expedition captured 231 TG spiders; of these, T. politus was the most abundant species with 188 specimens (95 adults). By comparison, a total of 196 specimens of the TG and 66 of T. politus had been reported for the same provinces in the last 40 years. Our sampling greatly increased the number of available specimens, especially for the genera Teutamus and Oedignatha. We also extended the known distribution of Oedignatha and Sesieutes within Thailand. 
These results illustrate the relevance of making biodiversity data contained within taxonomic treatments accessible and reusable. They also exemplify one potential use of taxonomic legacy data: using existing biodiversity data more efficiently to fill knowledge gaps. A similar approach can be used to study neglected or interesting taxa and geographic areas, generating better biodiversity documentation that could aid in decision making, management and conservation.
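The kind of occurrence-record analysis described above can be reproduced on any tabular export of specimen records. The following is a minimal Python sketch, assuming a hypothetical CSV export (for example from Plazi TreatmentBank) with columns province, month, life_stage and n_specimens; the file name and column names are illustrative, not those used by the authors.

import pandas as pd

# Hypothetical export of Teutamus-group specimen records; column names are assumed.
records = pd.read_csv("tg_specimen_records.csv")

# Rank provinces by total records to choose candidate sampling areas.
by_province = (records.groupby("province")["n_specimens"]
               .sum()
               .sort_values(ascending=False))

# Seasonal distribution of adult specimens to choose the collecting window.
adults = records[records["life_stage"] == "adult"]
by_month = adults.groupby("month")["n_specimens"].sum().sort_index()

print(by_province.head(3))  # provinces combining the most records
print(by_month.idxmax())    # month with the most adult records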


Author(s):  
Peter S. Curtis ◽  
Kerrie Mengersen ◽  
Marc J. Lajeunesse ◽  
Hannah R. Rothstein ◽  
Gavin B. Stewart

This chapter discusses the data extraction process, the meta-analysis database, and critical appraisal of data. The efficient and accurate extraction of data from primary studies is an important component of successful research reviews. It is one of the most time-consuming parts of a research review and should be approached with the goal of repeatability and transparency of results. Careful definition of the research question and identification of the effect size metric(s) to be used are prerequisites to efficient data extraction. The extraction spreadsheet may simply be appended to a growing database stored in a single spreadsheet (also known as a “flat-file” database) (e.g., Microsoft Excel, Lotus, Quattro Pro), but it may be advantageous to develop a relational database (e.g., using Microsoft Access, Paradox or dBase software), particularly for large or complex data sets. During the process of data extraction, the investigator also has an opportunity for critical appraisal of data quality. One approach to quantitative assessment of study quality has been the use of numerical scales in which points are assigned to specific elements of the study and summed to produce an overall quality score.
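As a rough illustration of the flat-file approach and the additive quality scoring described in the chapter, the following Python sketch appends one extraction record per study to a CSV file and sums a checklist of quality items into an overall score; the field names and checklist items are hypothetical.

import csv

# Hypothetical quality checklist; one point per satisfied item.
QUALITY_ITEMS = ["randomised", "controls_reported", "variance_reported", "blinded"]

def quality_score(study):
    return sum(int(bool(study.get(item, False))) for item in QUALITY_ITEMS)

def append_record(path, study):
    fields = ["study_id", "effect_size", "variance", "n", "quality_score"]
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if f.tell() == 0:          # new file: write the header row first
            writer.writeheader()
        row = {k: study[k] for k in fields[:-1]}
        row["quality_score"] = quality_score(study)
        writer.writerow(row)

append_record("extraction.csv", {"study_id": "S01", "effect_size": 0.42,
                                 "variance": 0.05, "n": 120,
                                 "randomised": True, "controls_reported": True})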


CJEM ◽  
2018 ◽  
Vol 20 (S1) ◽  
pp. S48-S49
Author(s):  
H. C. Lindsay ◽  
J. Gallaher ◽  
C. Wright ◽  
L. Korchinski ◽  
C. Kim Sing

Introduction: For patients with chest pain, the target time from first medical contact to obtaining an electrocardiogram (ECG) is 10 minutes, as reperfusion within 120 minutes can reduce the risk of death and adverse outcomes in patients with ST elevation myocardial infarction (STEMI). In 2007, Vancouver Coastal Health (VCH) began tracking key indicators including time to first ECG. The Vancouver General Hospital (VGH) Emergency Department (ED) has had the longest door-to-ECG times in the region since 2014. In 2016, the VGH ED Quality Council developed a strategy to address this issue, with the aim of obtaining an ECG within 10 minutes of presentation for 95% of patients presenting to the VGH ED with active chest pain, within a 6-month period. Methods: The VGH ED Quality Council brought together frontline clinicians, ECG technicians, and other stakeholders and completed a process map. We obtained baseline data on the median time to ECG both in patients with STEMI and in all patients presenting with chest pain. Root cause analysis identified two main barriers: access to a designated space in which to obtain ECGs, and the need for patients to be registered in the computer system before an ECG could be ordered. The team identified strategies to eliminate these barriers, designating a dedicated space and undergoing multiple PDSA cycles to change the workflow so that patients are streamed to this space before registration. Results: Our median time to ECG in patients with STEMI fell from 33 minutes to 8 minutes as of June 2017. In all patients presenting with chest pain, the median improved from 36 to 17 minutes. As of April 2017 we are obtaining an ECG within 10 minutes in 27% of our patients, compared to 3% in 2016. Given the limitations of our data extraction process, we were not able to differentiate between patients with active chest pain and those whose chest pain had resolved. Conclusion: By involving frontline staff and having frontline champions provide real-time support, we were able to make significant changes to the culture at triage. We cultivated sustainability by changing the workflow and physical space rather than relying on education alone. While we have improved the times for our walk-in patients, we have not perfected the process when a patient moves immediately to a bed or presents via ambulance. Implementing small changes and incorporating feedback has allowed us to identify these new challenges early.
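For reference, metrics like the medians and the within-10-minutes proportion reported above could be computed from timestamped visit data along the following lines; this is a generic Python sketch with a hypothetical file and column names, not the abstract's actual extraction method.

import pandas as pd

# Hypothetical export of chest-pain visits with arrival and ECG timestamps.
visits = pd.read_csv("chest_pain_visits.csv",
                     parse_dates=["arrival_time", "ecg_time"])
minutes = (visits["ecg_time"] - visits["arrival_time"]).dt.total_seconds() / 60

print("median door-to-ECG (min):", minutes.median())
print("proportion within 10 min:", (minutes <= 10).mean())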


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Jingfeng Yang ◽  
Nanfeng Zhang ◽  
Ming Li ◽  
Yanwei Zheng ◽  
Li Wang ◽  
...  

With continuing progress in vehicle hardware, onboard platforms are no longer unable to run complex algorithms, and studies have reported exponential growth in the data generated during actual operation. To address the problem of transmitting large volumes of data during operation, a wireless transmission scheme is proposed for text information (including position information), based on the principles of maximum entropy probability and a neural network prediction model combined with an optimized Huffman encoding algorithm, covering the process from data exchange through the entire data extraction process. Test results showed that the optimized compression and transmission algorithm for text-type vehicle information could effectively compress the data, achieve a high compression rate while maintaining data transmission integrity, and guarantee distortion-free decompression. This makes it possible to improve the efficiency of vehicle information transmission, ensure the integrity of the information, support vehicle monitoring and control, and capture the traffic situation in real time.
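The compression step rests on Huffman encoding. The Python sketch below shows only standard Huffman coding of a short text-type message, as a point of reference; it does not include the paper's maximum-entropy and neural-network-based optimisations, and the sample message is invented.

import heapq
from collections import Counter

def huffman_codes(text):
    # Heap of (weight, tie_breaker, node); leaves are characters, internal nodes are pairs.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, n, (left, right)))
        n += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"   # degenerate single-symbol alphabet
    walk(heap[0][2], "")
    return codes

message = "lat=23.1291,lon=113.2644,speed=42"   # invented sample position record
codes = huffman_codes(message)
encoded = "".join(codes[ch] for ch in message)
print(len(encoded), "bits compressed vs", 8 * len(message), "bits uncompressed")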


Author(s):  
MOHAMMAD SHAFKAT AMIN ◽  
HASAN JAMIL

In the last few years, several works in the literature have addressed the problem of data extraction from web pages. The importance of this problem derives from the fact that, once extracted, the data can be handled much like instances of a traditional database, which in turn facilitates web data integration and various other domain-specific applications. In this paper, we propose a novel table extraction technique that works on web pages generated dynamically from a back-end database. The proposed system can automatically discover table structure by mining relevant patterns from web pages in an efficient way, and can generate regular expressions for the extraction process. Moreover, the proposed system can assign intuitive column names to the columns of the extracted table by leveraging the Wikipedia knowledge base for table annotation. To improve the accuracy of the assignment, we exploit the structural homogeneity of the column values and their co-location information to weed out less likely candidates. The approach requires no human intervention, and experimental results have shown its accuracy to be promising. Moreover, the wrapper generation algorithm runs in linear time.
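To make the idea of regular-expression-based extraction concrete, here is a tiny hand-written Python example for a template-generated page; the paper's system induces such patterns automatically and annotates columns via Wikipedia, neither of which is reproduced here, and the sample HTML and column names are invented.

import re

html = """
<tr><td>Alan Turing</td><td>1912</td><td>Computer science</td></tr>
<tr><td>Rosalind Franklin</td><td>1920</td><td>Chemistry</td></tr>
"""

# A repeated-row pattern of the kind a generated wrapper might contain.
row_pattern = re.compile(r"<tr><td>(.*?)</td><td>(\d{4})</td><td>(.*?)</td></tr>")

rows = row_pattern.findall(html)
# In the paper, column names would come from the annotation step; here they are assumed.
table = [dict(zip(["name", "birth_year", "field"], row)) for row in rows]
print(table)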


2013 ◽  
Vol 756-759 ◽  
pp. 2583-2587 ◽  
Author(s):  
Zi Yang Han ◽  
Feng Ying Wang ◽  
Ping Sun ◽  
Zheng Yu Li

The Internet contains a great many Deep Web sources holding large amounts of valuable data. This paper proposes a Deep Web data extraction and service system based on the principles of cloud technology. We adopt a multi-node parallel computing architecture and design a task scheduling algorithm for the data extraction process; on this foundation, the task load is balanced among the nodes so that data extraction is completed rapidly. The experimental results show that using cloud parallel computing and dispersed network resources to extract Deep Web data is valid, improving both the data extraction efficiency and the service quality of the Deep Web system.
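The abstract does not detail the scheduling algorithm, so the following Python sketch shows only one common approach, greedy least-loaded assignment of extraction tasks to parallel nodes, under the assumption that each task carries an estimated cost; it is illustrative rather than the authors' method.

import heapq

def schedule(tasks, node_count):
    """Assign (task_id, estimated_cost) pairs to the currently least-loaded node."""
    nodes = [(0.0, n, []) for n in range(node_count)]   # (load, node_id, assigned tasks)
    heapq.heapify(nodes)
    for task_id, cost in sorted(tasks, key=lambda t: -t[1]):   # largest tasks first
        load, node_id, assigned = heapq.heappop(nodes)
        assigned.append(task_id)
        heapq.heappush(nodes, (load + cost, node_id, assigned))
    return {node_id: assigned for _, node_id, assigned in nodes}

tasks = [("source_A", 4.0), ("source_B", 2.5), ("source_C", 1.0), ("source_D", 3.0)]
print(schedule(tasks, node_count=2))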

