Amplifying Domain Expertise in Clinical Data Pipelines

10.2196/19612 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e19612
Author(s):  
Protiva Rahman ◽  
Arnab Nandi ◽  
Courtney Hebert

Digitization of health records has allowed the health care domain to adopt data-driven algorithms for decision support. There are multiple people involved in this process: a data engineer who processes and restructures the data, a data scientist who develops statistical models, and a domain expert who informs the design of the data pipeline and consumes its results for decision support. Although there are multiple data interaction tools for data scientists, few exist to allow domain experts to interact with data meaningfully. Designing systems for domain experts requires careful thought because they have different needs and characteristics from other end users. There should be an increased emphasis on the system to optimize the experts’ interaction by directing them to high-impact data tasks and reducing the total task completion time. We refer to this optimization as amplifying domain expertise. Although there is active research in making machine learning models more explainable and usable, it focuses on the final outputs of the model. However, in the clinical domain, expert involvement is needed at every pipeline step: curation, cleaning, and analysis. To this end, we review literature from the database, human-computer interaction, and visualization communities to demonstrate the challenges and solutions at each of the data pipeline stages. Next, we present a taxonomy of expertise amplification, which can be applied when building systems for domain experts. This includes summarization, guidance, interaction, and acceleration. Finally, we demonstrate the use of our taxonomy with a case study.


2021 ◽  
Vol 13 (1) ◽  
pp. 1-25
Author(s):  
Michael Loster ◽  
Ioannis Koumarelas ◽  
Felix Naumann

The integration of multiple data sources is a common problem in a large variety of applications. Traditionally, handcrafted similarity measures are used to discover, merge, and integrate multiple representations of the same entity (duplicates) into a large homogeneous collection of data. Often, these similarity measures do not cope well with the heterogeneity of the underlying dataset. In addition, domain experts are needed to manually design and configure such measures, which is time-consuming and requires extensive domain expertise. We propose a deep Siamese neural network, capable of learning a similarity measure that is tailored to the characteristics of a particular dataset. With the properties of deep learning methods, we are able to eliminate the manual feature engineering process and thus considerably reduce the effort required for model construction. In addition, we show that it is possible to transfer knowledge acquired during the deduplication of one dataset to another, and thus significantly reduce the amount of data required to train a similarity measure. We evaluate our method on multiple datasets and compare our approach to state-of-the-art deduplication methods. Our approach outperforms competitors by up to +26 percent F-measure, depending on task and dataset. In addition, we show that knowledge transfer is not only feasible, but in our experiments led to an improvement in F-measure of up to +4.7 percent.
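
As a rough illustration of the Siamese idea described in this abstract, the sketch below passes both records through the same encoder (shared weights) and compares the two embeddings. It is not the authors' architecture: the trigram featurization, the single `tanh` layer, the dimensions, and the names `featurize` and `SiameseEncoder` are all assumptions made for the example, and the weights are random rather than learned, so the scores only show the data flow.

```python
import math
import random


def featurize(text, dim=64):
    """Hash character trigrams into a fixed-length count vector (hypothetical featurization)."""
    vec = [0.0] * dim
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # Deterministic polynomial hash; the built-in hash() is salted per process.
        h = sum(ord(c) * (31 ** k) for k, c in enumerate(trigram))
        vec[h % dim] += 1.0
    return vec


class SiameseEncoder:
    """One encoder applied to both inputs; weights are random here, not trained."""

    def __init__(self, in_dim=64, out_dim=16, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(in_dim)
        self.in_dim = in_dim
        self.weights = [
            [rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)
        ]

    def encode(self, text):
        x = featurize(text, self.in_dim)
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.weights]

    def similarity(self, left, right):
        """Cosine similarity between the two embeddings produced by the shared encoder."""
        a, b = self.encode(left), self.encode(right)
        dot = sum(p * q for p, q in zip(a, b))
        norm_a = math.sqrt(sum(p * p for p in a))
        norm_b = math.sqrt(sum(q * q for q in b))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)


encoder = SiameseEncoder()
score = encoder.similarity("Acme Corp., Berlin", "ACME Corporation Berlin")
print(f"similarity: {score:.3f}")
```

In a trained system the weights would be fitted on labeled duplicate and non-duplicate pairs so that the similarity reflects the dataset's own notion of a match; that training step is deliberately left out of this sketch.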


Humans apply their domain expertise intelligently and skillfully when making decisions to solve a problem. These decisions draw on knowledge acquired through experience and practice over time, knowledge that is lost when the expert’s life ends. Hence, this expert knowledge needs to be stored in a database so that a machine can be programmed to use it to make decisions; such a machine is known as an Expert System (ES). An ES tries to emulate the decision-making skills of domain experts by gathering their knowledge, storing it in a knowledge base in rule format, and then using those rules to analyze the given data and provide solutions to problems. Expert Systems can be used to analyze system log files, find the issues recorded in the log statements, and provide solutions to the errors found in those logs.
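
The rule-based approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system from the abstract: the `Rule` structure, the three example rules, their advice strings, and the sample log lines are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One piece of expert knowledge: a pattern to look for and the advice to give."""
    name: str
    pattern: str  # regular expression matched against each log line
    advice: str


# Hypothetical knowledge base; a real one would be elicited from domain experts.
KNOWLEDGE_BASE = [
    Rule("disk_full", r"No space left on device",
         "Free disk space or rotate old logs."),
    Rule("conn_refused", r"Connection refused",
         "Check that the target service is running and reachable."),
    Rule("out_of_memory", r"OutOfMemoryError",
         "Raise the memory limit or investigate a possible leak."),
]


def diagnose(log_lines, rules=KNOWLEDGE_BASE):
    """Apply every rule to every log line and collect (line number, rule name, advice)."""
    findings = []
    for line_no, line in enumerate(log_lines, start=1):
        for rule in rules:
            if re.search(rule.pattern, line):
                findings.append((line_no, rule.name, rule.advice))
    return findings


# Made-up log lines for demonstration only.
sample_log = [
    "2021-03-01 10:00:01 INFO  service started",
    "2021-03-01 10:00:05 ERROR write failed: No space left on device",
    "2021-03-01 10:00:09 ERROR upstream: Connection refused",
]

for line_no, name, advice in diagnose(sample_log):
    print(f"line {line_no}: {name} -> {advice}")
```

Keeping the rules as data, separate from the matching loop, mirrors the knowledge base and inference split the abstract describes: experts can add or refine rules without touching the code that applies them.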


2021 ◽  
pp. 026553222110107
Author(s):  
Simon Davidson

This paper investigates what matters to medical domain experts when setting standards on a language for specific purposes (LSP) English proficiency test: the Occupational English Test’s (OET) writing sub-test. The study explores what standard-setting participants value when making performance judgements about test candidates’ writing responses, and the extent to which their decisions are language-based and align with the OET writing sub-test criteria. Qualitative data is a relatively under-utilized component of standard setting and this type of commentary was garnered to gain a better understanding of the basis for performance decisions. Eighteen doctors were recruited for standard-setting workshops. To gain further insight, verbal reports in the form of a think-aloud protocol (TAP) were employed with five of the 18 participants. The doctors’ comments were thematically coded and the analysis showed that participants’ standard-setting judgements often aligned with the OET writing sub-test criteria. An overarching theme, ‘Audience Recognition’, was also identified as valuable to participants. A minority of decisions were swayed by features outside the OET’s communicative construct (e.g., clinical competency). Yet, overall, findings indicated that domain experts were undeniably focused on textual features associated with what the test is designed to assess and their views were vitally important in the standard-setting process.


Author(s):  
Nina H Di Cara ◽  
Jiao Song ◽  
Valerio Maggio ◽  
Christopher Moreno-Stokoe ◽  
Alastair R Tanner ◽  
...  

Background: Disasters such as the COVID-19 pandemic pose an overwhelming demand on resources that cannot always be met by official organisations. Limited resources and human response to crises can lead members of local communities to turn to one another to fulfil immediate needs. This spontaneous citizen-led response can be crucial to a community’s ability to cope in a crisis. It is thus essential to understand the scope of such initiatives so that support can be provided where it is most needed. Nevertheless, quickly developing situations and varying definitions can make the community response challenging to measure. Aim: To create an accessible interactive map of the citizen-led community response to need during the COVID-19 pandemic in Wales, UK that combines information gathered from multiple data providers to reflect different interpretations of need and support. Approach: We gathered data from a combination of official data providers and community-generated sources to create 14 variables representative of need and support. These variables are derived by a reproducible data pipeline that enables flexible integration of new data. The interactive tool is available online (www.covidresponsemap.wales) and can map available data at two geographic resolutions. Users choose their variables of interest, and interpretation of the map is aided by a linked bee-swarm plot. Discussion: The novel approach we developed enables people at all levels of community response to explore and analyse the distribution of need and support across Wales. While there can be limitations to the accuracy of community-generated data, we demonstrate that they can be effectively used alongside traditional data sources to maximise the understanding of community action. This adds to our overall aim to measure community response and resilience, as well as to make complex population health data accessible to a range of audiences. Future developments include the integration of other factors such as well-being.


MIS Quarterly ◽  
2021 ◽  
Vol 45 (3) ◽  
pp. 1557-1580
Author(s):  
Elmira van den Broek ◽  
Anastasia Sergeeva ◽  
Marleen Huysman ◽  
...  

The introduction of machine learning (ML) in organizations comes with the claim that algorithms will produce insights superior to those of experts by discovering the “truth” from data. Such a claim gives rise to a tension between the need to produce knowledge independent of domain experts and the need to remain relevant to the domain the system serves. This two-year ethnographic study focuses on how developers managed this tension when building an ML system to support the process of hiring job candidates at a large international organization. Despite the initial goal of getting domain experts “out the loop,” we found that developers and experts arrived at a new hybrid practice that relied on a combination of ML and domain expertise. We explain this outcome as resulting from a process of mutual learning in which deep engagement with the technology triggered actors to reflect on how they produced knowledge. These reflections prompted the developers to iterate between excluding domain expertise from the ML system and including it. Contrary to common views that imply an opposition between ML and domain expertise, our study foregrounds their interdependence and as such shows the dialectic nature of developing ML. We discuss the theoretical implications of these findings for the literature on information technologies and knowledge work, information system development and implementation, and human–ML hybrids.


2017 ◽  
Vol 26 (01) ◽  
pp. 125-132
Author(s):  
R. A. Jenders

Summary Introduction: Advances in clinical decision support (CDS) continue to evolve to support the goals of clinicians, policymakers, patients and professional organizations to improve clinical practice, patient safety, and the quality of care. Objectives: Identify key thematic areas or foci in research and practice involving clinical decision support during the 2015-2016 time period. Methods: Thematic analysis consistent with a grounded theory approach was applied in a targeted review of journal publications, the proceedings of key scientific conferences as well as activities in standards development organizations in order to identify the key themes underlying work related to CDS. Results: Ten key thematic areas were identified, including: 1) an emphasis on knowledge representation, with a focus on clinical practice guidelines; 2) various aspects of precision medicine, including the use of sensor and genomic data as well as big data; 3) efforts in quality improvement; 4) innovative uses of computer-based provider order entry (CPOE) systems, including relevant data displays; 5) expansion of CDS in various clinical settings; 6) patient-directed CDS; 7) understanding the potential negative impact of CDS; 8) obtaining structured data to drive CDS interventions; 9) the use of diagnostic decision support; and 10) the development and use of standards for CDS. Conclusions: Active research and practice in 2015-2016 continue to underscore the importance and broad utility of CDS for effecting change and improving the quality and outcome of clinical care.


Author(s):  
JAE HUN CHOI ◽  
JAE DONG YANG ◽  
DONG GILL LEE

In this paper, we propose a new approach for managing domain-specific thesauri, in which the object-oriented paradigm is applied to thesaurus construction and query-based browsing. The approach provides an object-oriented mechanism to assist domain experts in constructing thesauri; it determines a considerable part of the relationship degrees between terms by inheritance and supplies the domain expert with information available from other parts of the thesaurus, whether under construction or already constructed. In addition, it enables domain experts to construct the thesaurus incrementally, since the automatically determined relationship degrees can be refined whenever a more sophisticated thesaurus is needed. This may minimize the burden on domain experts caused by exhaustively specifying individual relationships. The approach also provides a query-based browsing facility, which enables users to find desired thesaurus terms without tedious browsing in the thesaurus. A browsing query can be formulated with terms that are rather ambiguous, yet capable of deriving the desired terms. Such a query is especially useful when users want precise results, that is, when they want to use only carefully selected thesaurus terms in reformulating Boolean queries. To demonstrate the feasibility of our approach, we fully implemented an object-based thesaurus system that supports semiautomatic thesaurus construction and the query-based browsing facility.

