Comment: A brief survey of the current state of play for Bayesian computation in data science at big-data scale

2017 ◽  
Vol 31 (4) ◽  
pp. 686-691 ◽  
Author(s):  
David Draper ◽  
Alexander Terenin


JAMIA Open ◽  
2018 ◽  
Vol 1 (2) ◽  
pp. 136-141 ◽  
Author(s):  
Philip R O Payne ◽  
Elmer V Bernstam ◽  
Justin B Starren

Abstract There are an ever-increasing number of reports and commentaries describing the challenges and opportunities associated with the use of big data and data science (DS) in biomedical education, research, and practice. These publications argue that data-centric approaches to complex biomedical problems yield substantial benefits, including an accelerated rate of scientific discovery, improved clinical decision making, and the ability to promote healthy behaviors at a population level. In addition, an aligned and emerging body of literature describes the ethical, legal, and social issues that must be addressed in order to use big data responsibly in such contexts. At the same time, there has been growing recognition that the challenges and opportunities attributed to the expansion of DS often parallel those experienced by the biomedical informatics community. Indeed, many informaticians would consider some of these issues central to the theories and methods of biomedical informatics science and practice. In response, the 2016 American College of Medical Informatics Winter Symposium included a series of presentations and focus group discussions intended to define the current state of, and identify future directions for, interaction and collaboration between people who identify themselves as working on big data, DS, and biomedical informatics. We provide a perspective on these discussions and the outcomes of that meeting, and also present a set of recommendations generated through a thematic analysis of those outcomes.
Ultimately, this report is intended to: (1) summarize the key issues currently being discussed by the biomedical informatics community as it seeks to better understand how to constructively interact with the emerging biomedical big data and DS fields; and (2) propose a framework and agenda that can serve to advance this type of constructive interaction, with mutual benefit accruing to both fields.


2018 ◽  
Author(s):  
Jen Schradie

With growing interest in data science and online analytics, researchers are increasingly using data derived from the Internet. Whether for qualitative or quantitative analysis, online data, including "Big Data," can often exclude marginalized populations, especially the poor and working class, as the digital divide remains a persistent problem. This methodological commentary on the current state of digital data and methods disentangles the hype from the reality of digitally produced data for sociological research. In the process, it offers strategies to address the weaknesses of Internet-derived data so that marginalized populations are better represented.


2019 ◽  
Vol 9 (11) ◽  
pp. 2331 ◽  
Author(s):  
Luis Bote-Curiel ◽  
Sergio Muñoz-Romero ◽  
Alicia Guerrero-Curieses ◽  
José Luis Rojo-Álvarez

In the last few years, growing expectations have formed around the analysis of the large amounts of data often available in organizations, which has been both scrutinized by the academic world and successfully exploited by industry. Nowadays, two of the most common terms heard in scientific circles are Big Data and Deep Learning. In this double review, we aim to shed some light on the current state of these different, yet somehow related, branches of Data Science, in order to understand their current state and future evolution within the healthcare area. We start by giving a simple description of the technical elements of Big Data technologies, as well as an overview of the elements of Deep Learning techniques, according to their usual description in the scientific literature. Then, we turn to the application fields that can be said to have delivered relevant real-world success stories, with emphasis on examples from large technology companies and financial institutions, among others. The academic effort that has been put into bringing these technologies to the healthcare sector is then summarized and analyzed from a twofold view: first, the landscape of application examples is scrutinized globally according to the varying nature of medical data, including the data forms in electronic health records, medical time signals, and medical images; second, a specific application field, electrocardiographic signal analysis, is given special attention, as a number of works have been published in it in the last two years. A set of toy application examples is provided with the publicly available MIMIC dataset, aiming to help beginners start with some principled, basic, and structured material and available code. Critical discussion is provided of current and forthcoming challenges in the use of both sets of techniques in our future healthcare.
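As a flavor of the kind of toy medical time-signal analysis the review points to (the actual examples in the paper use the MIMIC dataset), here is a minimal, self-contained sketch that counts beats in a synthetic pulse-like signal by simple threshold crossing. The signal, sampling rate, and threshold are illustrative assumptions, not drawn from MIMIC.

```python
import math

# Synthetic "heartbeat-like" signal: a 1.2 Hz sinusoid (72 beats/min)
# sampled at 100 Hz for 10 seconds. All parameters are illustrative.
FS = 100       # sampling rate, Hz
BEAT_HZ = 1.2  # beat frequency, Hz
signal = [math.sin(2 * math.pi * BEAT_HZ * n / FS) for n in range(10 * FS)]

def count_peaks(x, threshold=0.9):
    """Count upward threshold crossings: a crude beat detector."""
    peaks = 0
    above = False
    for v in x:
        if v > threshold and not above:
            peaks += 1
            above = True
        elif v < threshold:
            above = False
    return peaks

beats = count_peaks(signal)
bpm = beats * 60 / 10  # beats in a 10 s window, scaled to per-minute
```

Real ECG beat detection is far more involved (noisy baselines, variable morphology), but the sketch shows the shape of a principled, basic starting point.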


2016 ◽  
Vol 39 (1) ◽  
pp. 63-77 ◽  
Author(s):  
Susan A. Matney ◽  
Theresa (Tess) Settergren ◽  
Jane M. Carrington ◽  
Rachel L. Richesson ◽  
Amy Sheide ◽  
...  

Disparate data must be represented in a common format to enable comparison across multiple institutions and facilitate Big Data science. Nursing assessments represent a rich source of information. However, a lack of agreement regarding essential concepts and standardized terminology currently prevents their use for Big Data science. The purpose of this study was to align a minimum set of physiological nursing assessment data elements with national standardized coding systems. Six institutions shared their 100 most common electronic health record nursing assessment data elements. From these, a set of distinct elements was mapped to the nationally recognized Logical Observation Identifiers Names and Codes (LOINC®) and Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT®) standards. We identified 137 observation names (55% new to LOINC) and 348 observation values (20% new to SNOMED CT) organized into 16 panels (72% new to LOINC). This reference set can support the exchange of nursing information, facilitate multi-site research, and provide a framework for nursing data analysis.
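The mapping step described above, aligning differently named local elements with a shared coded observation, can be sketched in Python. The local element names and the code values below are hypothetical placeholders, not actual LOINC or SNOMED CT identifiers.

```python
# Sketch of aligning local nursing assessment elements with a standard
# terminology. Names and codes are hypothetical placeholders.

# Each institution's local element name maps to one shared observation name.
LOCAL_TO_OBSERVATION = {
    "resp_rate": "Respiratory rate",
    "RR/min": "Respiratory rate",
    "pain_score_0_10": "Pain severity score",
}

# Shared observation names map to a (placeholder) LOINC-style code.
OBSERVATION_CODES = {
    "Respiratory rate": "XXXX-1",
    "Pain severity score": "XXXX-2",
}

def to_common_format(local_name: str, value: str) -> dict:
    """Represent a local element as a coded observation; raises if unmapped."""
    observation = LOCAL_TO_OBSERVATION[local_name]
    return {
        "observation": observation,
        "code": OBSERVATION_CODES[observation],
        "value": value,
    }

# Two institutions' differently named elements land on the same code,
# which is what makes the data comparable across sites.
a = to_common_format("resp_rate", "18")
b = to_common_format("RR/min", "18")
```

The two-level design (local name → observation → code) mirrors the idea of a shared reference set sitting between institutional vocabularies and the national standards.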


Author(s):  
Shaveta Bhatia

The epoch of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. Big data has gained widespread popularity, but it also raises many challenges for security and privacy. Various threats and attacks, such as data leakage, unauthorized third-party access, viruses, and vulnerabilities, stand against the security of big data. This paper discusses these security threats and approaches to mitigating them in the fields of biomedical research, cyber security, and cloud computing.


MedienJournal ◽  
2017 ◽  
Vol 38 (4) ◽  
pp. 50-61 ◽  
Author(s):  
Jan Jagodzinski

This paper first briefly maps out the shift from disciplinary to control societies (what I call designer capitalism; the idea of control comes from Gilles Deleuze) in relation to surveillance and the mediation of life through screen cultures. The paper then turns to issues of digitalization in relation to big data, which risk continuing to close off life as zoë, that is, life that is creative rather than captured via attention technologies through marketing techniques and surveillance. The last part of the paper develops the way artists are able to resist the big data archive by turning the data in on itself, offering viewers and participants a glimpse of the current state of manipulating desire and maintaining copyright in order to keep the future closed rather than potentially open.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

BACKGROUND The Coronavirus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk for infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is rapidly changing, with over 6,786,352 cases and 199,024 deaths reported. South Carolina (SC), as of 9/21/2020, reported 138,624 cases and 3,212 deaths across the state.

OBJECTIVE The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level.

METHODS The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the Office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF).

RESULTS The project was funded as of June 2020 by the National Institutes of Health.

CONCLUSIONS The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.
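The linkage step described in the methods, joining de-identified records from multiple sources, can be sketched as a join on a shared surrogate key. The source names, fields, and values below are illustrative assumptions, not the project's actual REDCap schema.

```python
# Sketch of linking de-identified records from two sources on a shared
# surrogate key. Fields and values are invented for illustration.

surveillance = [
    {"pid": "P001", "test_date": "2020-06-01", "result": "positive"},
    {"pid": "P002", "test_date": "2020-06-03", "result": "negative"},
]
hospital_registry = [
    {"pid": "P001", "admitted": True, "los_days": 5},
]

def link_sources(primary, secondary, key="pid"):
    """Left-join secondary records onto primary by the de-identified key."""
    index = {rec[key]: rec for rec in secondary}
    linked = []
    for rec in primary:
        merged = dict(rec)                      # copy, keep sources intact
        merged.update(index.get(rec[key], {}))  # add fields when matched
        linked.append(merged)
    return linked

rows = link_sources(surveillance, hospital_registry)
```

Because the key is a de-identified surrogate, the linked rows remain usable for machine learning without exposing patient identity, which is the point of the REDCap design described above.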


2020 ◽  
Vol 22 (5) ◽  
pp. 51-55
Author(s):  
Oleg N. Korchagin ◽  
Anastasia V. Lyadskaya

The article is devoted to the current state of digitalization aimed at solving urgent problems of combating corruption in public administration and the private business sector. The work considers the experience of foreign countries and the influence of digital technologies on the fight against corruption. It is noted that the digitalization of public administration is becoming one of the decisive factors in increasing the efficiency of the anti-corruption system and improving management mechanisms. Big Data, if integrated and structured according to given parameters, allows legislative, law-enforcement, and control and supervisory activities to be carried out reliably and transparently. Big Data tools make it possible to analyze processes, identify dependencies, and predict corruption risks. The authors describe the most significant problems that complicate the transfer of offline technologies into the online environment. The paper analyzes promising directions for the development of digital technologies that would solve the arising problems and accomplish tasks that previously seemed unattainable. The article also describes current developments in the collection and management of large amounts of data, the "Internet of Things," modern network architecture, and other advances in IT, and provides applied examples of their potential use in combating corruption. The study argues that, in the context of combating corruption, digitalization should be treated as a separate area of activity controlled and regulated by the state.


Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current dynamics of scientific production and communication reveal the prominence of data-oriented science, understood broadly and represented mainly by terms such as "e-Science" and "Data Science." Objectives: To present the worldwide scientific production related to data-oriented science, based on the terms "e-Science" and "Data Science" in Scopus and Web of Science, between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) retrieving the bibliometric records; c) complementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific production analyzed were Distributed computer systems (2006), Grid computing (2007 to 2013), and Big data (2014 to 2016). In Library and Information Science, the emphasis falls on the topics Digital library and Open access, highlighting the field's central role in discussions of mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift of focus from themes concerning data-sharing operations to the analytical perspective of searching for patterns in large volumes of data. Keywords: Data Science. E-Science. Data-oriented science. Scientific production. Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114
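The results step of a bibliometric analysis like this one, identifying the most prominent keyword per year, can be sketched with a simple frequency count. The toy records below are invented for illustration; the study's actual data come from Scopus and Web of Science.

```python
from collections import Counter

# Toy bibliographic records: (year, keywords). Invented for illustration.
records = [
    (2006, ["Distributed computer systems", "Grid computing"]),
    (2006, ["Distributed computer systems"]),
    (2010, ["Grid computing", "e-Science"]),
    (2010, ["Grid computing"]),
    (2015, ["Big data", "Data Science"]),
    (2015, ["Big data"]),
]

def top_keyword_by_year(recs):
    """Return the most frequent keyword for each year."""
    by_year = {}
    for year, keywords in recs:
        by_year.setdefault(year, Counter()).update(keywords)
    return {year: counts.most_common(1)[0][0]
            for year, counts in sorted(by_year.items())}

tops = top_keyword_by_year(records)
```

Run over a real corpus, a count like this is what surfaces the diachronic shift the abstract reports, from Grid computing toward Big data.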

