Comment: A brief survey of the current state of play for Bayesian computation in data science at big-data scale

2017 ◽  
Vol 31 (4) ◽  
pp. 686-691 ◽  
Author(s):  
David Draper ◽  
Alexander Terenin


JAMIA Open ◽  
2018 ◽  
Vol 1 (2) ◽  
pp. 136-141 ◽  
Author(s):  
Philip R O Payne ◽  
Elmer V Bernstam ◽  
Justin B Starren

Abstract There are an ever-increasing number of reports and commentaries describing the challenges and opportunities associated with the use of big data and data science (DS) in biomedical education, research, and practice. These publications argue that data-centric approaches to complex biomedical problems yield substantial benefits, including an accelerated rate of scientific discovery, improved clinical decision making, and the ability to promote healthy behaviors at a population level. In addition, an aligned and emerging body of literature describes the ethical, legal, and social issues that must be addressed in order to use big data responsibly in such contexts. At the same time, there has been growing recognition that the challenges and opportunities attributed to the expansion of DS often parallel those experienced by the biomedical informatics community. Indeed, many informaticians would consider some of these issues central to the theories and methods of biomedical informatics science and practice. In response, the 2016 American College of Medical Informatics Winter Symposium included a series of presentations and focus group discussions intended to define the current state of, and identify future directions for, interaction and collaboration between people who identify themselves as working on big data, DS, and biomedical informatics. We provide a perspective on these discussions and the outcomes of that meeting, and also present a set of recommendations generated through a thematic analysis of those outcomes.
Ultimately, this report is intended to: (1) summarize the key issues currently being discussed by the biomedical informatics community as it seeks to better understand how to constructively interact with the emerging biomedical big data and DS fields; and (2) propose a framework and agenda that can serve to advance this type of constructive interaction, with mutual benefit accruing to both fields.


2018 ◽  
Author(s):  
Jen Schradie

With growing interest in data science and online analytics, researchers are increasingly using data derived from the Internet. Whether for qualitative or quantitative analysis, online data, including "Big Data," can often exclude marginalized populations, especially the poor and working class, as the digital divide remains a persistent problem. This methodological commentary on the current state of digital data and methods disentangles the hype from the reality of digitally produced data for sociological research. In the process, it offers strategies to address the weaknesses of Internet-derived data so that marginalized populations are better represented.


2019 ◽  
Vol 9 (11) ◽  
pp. 2331 ◽  
Author(s):  
Luis Bote-Curiel ◽  
Sergio Muñoz-Romero ◽  
Alicia Guerrero-Curieses ◽  
José Luis Rojo-Álvarez

In the last few years, growing expectations have formed around the analysis of the large amounts of data often available in organizations, which has been both scrutinized by the academic world and successfully exploited by industry. Nowadays, two of the most common terms heard in scientific circles are Big Data and Deep Learning. In this double review, we aim to shed some light on the current state of these different, yet somehow related, branches of Data Science, in order to understand their current state and future evolution within the healthcare area. We start by giving a simple description of the technical elements of Big Data technologies, as well as an overview of the elements of Deep Learning techniques, according to their usual description in the scientific literature. Then, we turn to the application fields that can be said to have delivered relevant real-world success stories, with emphasis on examples from large technology companies and financial institutions, among others. The academic effort that has been put into bringing these technologies to the healthcare sector is then summarized and analyzed from a twofold view: first, the landscape of application examples is scrutinized globally according to the varying nature of medical data, including the data forms in electronic health records, medical time signals, and medical images; second, a specific application field, electrocardiographic signal analysis, is given special attention, as a number of works have been published in it in the last two years. A set of toy application examples is provided with the publicly available MIMIC dataset, aiming to help beginners start with some principled, basic, and structured material and available code. Critical discussion is provided of current and forthcoming challenges in the use of both sets of techniques in our future healthcare.
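As a flavor of the kind of toy medical time-signal analysis the review points to (the actual examples in the paper use the MIMIC dataset), here is a minimal, self-contained sketch that counts beats in a synthetic pulse-like signal by simple threshold crossing. The signal, sampling rate, and threshold are illustrative assumptions, not drawn from MIMIC.

```python
import math

# Synthetic "heartbeat-like" signal: a 1.2 Hz sinusoid (72 beats/min)
# sampled at 100 Hz for 10 seconds. All parameters are illustrative.
FS = 100       # sampling rate, Hz
BEAT_HZ = 1.2  # beat frequency, Hz
signal = [math.sin(2 * math.pi * BEAT_HZ * n / FS) for n in range(10 * FS)]

def count_peaks(x, threshold=0.9):
    """Count upward threshold crossings: a crude beat detector."""
    peaks = 0
    above = False
    for v in x:
        if v > threshold and not above:
            peaks += 1
            above = True
        elif v < threshold:
            above = False
    return peaks

beats = count_peaks(signal)
bpm = beats * 60 / 10  # beats in a 10 s window, scaled to per-minute
```

Real ECG beat detection is far more involved (noisy baselines, variable morphology), but the sketch shows the shape of a principled, basic starting point.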


2016 ◽  
Vol 39 (1) ◽  
pp. 63-77 ◽  
Author(s):  
Susan A. Matney ◽  
Theresa (Tess) Settergren ◽  
Jane M. Carrington ◽  
Rachel L. Richesson ◽  
Amy Sheide ◽  
...  

Disparate data must be represented in a common format to enable comparison across multiple institutions and facilitate Big Data science. Nursing assessments represent a rich source of information. However, a lack of agreement regarding essential concepts and standardized terminology currently prevents their use for Big Data science. The purpose of this study was to align a minimum set of physiological nursing assessment data elements with national standardized coding systems. Six institutions shared their 100 most common electronic health record nursing assessment data elements. From these, a set of distinct elements was mapped to the nationally recognized Logical Observation Identifiers Names and Codes (LOINC®) and Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT®) standards. We identified 137 observation names (55% new to LOINC) and 348 observation values (20% new to SNOMED CT) organized into 16 panels (72% new to LOINC). This reference set can support the exchange of nursing information, facilitate multi-site research, and provide a framework for nursing data analysis.
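The mapping step described above, aligning differently named local elements with a shared coded observation, can be sketched in Python. The local element names and the code values below are hypothetical placeholders, not actual LOINC or SNOMED CT identifiers.

```python
# Sketch of aligning local nursing assessment elements with a standard
# terminology. Names and codes are hypothetical placeholders.

# Each institution's local element name maps to one shared observation name.
LOCAL_TO_OBSERVATION = {
    "resp_rate": "Respiratory rate",
    "RR/min": "Respiratory rate",
    "pain_score_0_10": "Pain severity score",
}

# Shared observation names map to a (placeholder) LOINC-style code.
OBSERVATION_CODES = {
    "Respiratory rate": "XXXX-1",
    "Pain severity score": "XXXX-2",
}

def to_common_format(local_name: str, value: str) -> dict:
    """Represent a local element as a coded observation; raises if unmapped."""
    observation = LOCAL_TO_OBSERVATION[local_name]
    return {
        "observation": observation,
        "code": OBSERVATION_CODES[observation],
        "value": value,
    }

# Two institutions' differently named elements land on the same code,
# which is what makes the data comparable across sites.
a = to_common_format("resp_rate", "18")
b = to_common_format("RR/min", "18")
```

The two-level design (local name → observation → code) mirrors the idea of a shared reference set sitting between institutional vocabularies and the national standards.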


Author(s):  
Shaveta Bhatia

The epoch of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. Big data has gained widespread popularity, but it also raises many challenges for security and privacy. Various threats and attacks, such as data leakage, unauthorized third-party access, viruses, and vulnerabilities, stand against the security of big data. This paper discusses these security threats and approaches to mitigating them in the fields of biomedical research, cyber security, and cloud computing.


MedienJournal ◽  
2017 ◽  
Vol 38 (4) ◽  
pp. 50-61 ◽  
Author(s):  
Jan Jagodzinski

This paper first briefly maps out the shift from disciplinary to control societies (what I call designer capitalism; the idea of control comes from Gilles Deleuze) in relation to surveillance and the mediation of life through screen cultures. The paper then turns to issues of digitalization in relation to big data, which risk continuing to close off life as zoë, that is, life that is creative rather than captured via attention technologies through marketing techniques and surveillance. The last part of the paper develops the way artists are able to resist the big data archive by turning the data in on itself, offering viewers and participants a glimpse of the current state of manipulating desire and maintaining copyright in order to keep the future closed rather than potentially open.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

BACKGROUND The Coronavirus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk for infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is rapidly changing, with over 6,786,352 cases and 199,024 deaths reported. South Carolina (SC), as of 9/21/2020, reported 138,624 cases and 3,212 deaths across the state.

OBJECTIVE The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level.

METHODS The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the Office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF).

RESULTS The project was funded as of June 2020 by the National Institutes of Health.

CONCLUSIONS The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.
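The linkage step described in the methods, joining de-identified records from multiple sources, can be sketched as a join on a shared surrogate key. The source names, fields, and values below are illustrative assumptions, not the project's actual REDCap schema.

```python
# Sketch of linking de-identified records from two sources on a shared
# surrogate key. Fields and values are invented for illustration.

surveillance = [
    {"pid": "P001", "test_date": "2020-06-01", "result": "positive"},
    {"pid": "P002", "test_date": "2020-06-03", "result": "negative"},
]
hospital_registry = [
    {"pid": "P001", "admitted": True, "los_days": 5},
]

def link_sources(primary, secondary, key="pid"):
    """Left-join secondary records onto primary by the de-identified key."""
    index = {rec[key]: rec for rec in secondary}
    linked = []
    for rec in primary:
        merged = dict(rec)                      # copy, keep sources intact
        merged.update(index.get(rec[key], {}))  # add fields when matched
        linked.append(merged)
    return linked

rows = link_sources(surveillance, hospital_registry)
```

Because the key is a de-identified surrogate, the linked rows remain usable for machine learning without exposing patient identity, which is the point of the REDCap design described above.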


2020 ◽  
Vol 22 (5) ◽  
pp. 51-55
Author(s):  
Oleg N. Korchagin ◽  
Anastasia V. Lyadskaya

The article is devoted to the current state of digitalization aimed at solving urgent problems of combating corruption in public administration and the private business sector. The work considers the experience of foreign countries and the influence of digital technologies on the fight against corruption. It is noted that the digitalization of public administration is becoming one of the decisive factors in increasing the efficiency of the anti-corruption system and improving management mechanisms. Big Data, if integrated and structured according to given parameters, allows legislative, law-enforcement, and control and supervisory activities to be carried out reliably and transparently. Big Data tools make it possible to analyze processes, identify dependencies, and predict corruption risks. The authors describe the most significant problems that complicate the transfer of offline technologies into the online environment. The paper analyzes promising directions for the development of digital technologies that would solve the arising problems and accomplish tasks that previously seemed unattainable. The article also describes current developments in the collection and management of large amounts of data, the "Internet of Things," modern network architecture, and other advances in IT, and provides applied examples of their potential use in combating corruption. The study argues that, in the context of combating corruption, digitalization should be treated as a separate area of activity controlled and regulated by the state.


Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current dynamics of scientific production and communication reveal the prominence of data-oriented science, understood broadly and represented mainly by terms such as "e-Science" and "Data Science." Objectives: To present the worldwide scientific production related to data-oriented science, based on the terms "e-Science" and "Data Science" in Scopus and Web of Science, between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) retrieving the bibliometric records; c) complementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific production analyzed were Distributed computer systems (2006), Grid computing (2007 to 2013), and Big data (2014 to 2016). In Library and Information Science, the emphasis falls on the topics Digital library and Open access, highlighting the field's central role in discussions of mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift of focus from themes concerning data-sharing operations to the analytical perspective of searching for patterns in large volumes of data. Keywords: Data Science. E-Science. Data-oriented science. Scientific production. Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114
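The results step of a bibliometric analysis like this one, identifying the most prominent keyword per year, can be sketched with a simple frequency count. The toy records below are invented for illustration; the study's actual data come from Scopus and Web of Science.

```python
from collections import Counter

# Toy bibliographic records: (year, keywords). Invented for illustration.
records = [
    (2006, ["Distributed computer systems", "Grid computing"]),
    (2006, ["Distributed computer systems"]),
    (2010, ["Grid computing", "e-Science"]),
    (2010, ["Grid computing"]),
    (2015, ["Big data", "Data Science"]),
    (2015, ["Big data"]),
]

def top_keyword_by_year(recs):
    """Return the most frequent keyword for each year."""
    by_year = {}
    for year, keywords in recs:
        by_year.setdefault(year, Counter()).update(keywords)
    return {year: counts.most_common(1)[0][0]
            for year, counts in sorted(by_year.items())}

tops = top_keyword_by_year(records)
```

Run over a real corpus, a count like this is what surfaces the diachronic shift the abstract reports, from Grid computing toward Big data.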

