Eight practices for data management to enable team data science

Author(s):  
Andrew McDavid ◽  
Anthony M. Corbett ◽  
Jennifer L. Dutra ◽  
Andrew G. Straw ◽  
David J. Topham ◽  
...  

Abstract Introduction: In clinical and translational research, data science is often and fortuitously integrated with data collection. This contrasts with the typical position of data scientists in other settings, where they are isolated from data collectors. Because of this, effective use of data science techniques to resolve translational questions requires innovation in the organization and management of these data. Methods: We propose an operational framework that respects this important difference in how research teams are organized. To maximize the accuracy and speed of the clinical and translational data science enterprise under this framework, we define a set of eight best practices for data management. Results: In our own work at the University of Rochester, we have strived to utilize these practices in a customized version of the open source LabKey platform for integrated data management and collaboration. We have applied this platform to cohorts that longitudinally track multidomain data from over 3000 subjects. Conclusions: We argue that this has made analytical datasets more readily available and lowered the bar to interdisciplinary collaboration, enabling a team-based data science that is unique to the clinical and translational setting.

2019 ◽  
Author(s):  
Dan Sholler ◽  
Diya Das ◽  
Fernando Hoces de la Guardia ◽  
Chris Hoffman ◽  
Francois Lanusse ◽  
...  

Turnover is a fact of life for any project, and academic research teams in particular can see many people come and go over the course of a project. In this article, we discuss the challenges of turnover and some potential practices for helping manage it, particularly for computational- and data-intensive research teams and projects. The topics we discuss include establishing and implementing data management plans, file and format standardization, workflow and process documentation, clear team roles, and check-in and check-out procedures.


2020 ◽  
Vol 21 (4) ◽  
pp. 824-852
Author(s):  
EDWARD J. BALLEISEN

After reflecting on the thematic evolution of business history as a field over the past 50 years, this revised presidential address invites readers to consider the potential payoffs of expanding the contexts in which business historians work together on research projects, as well as with colleagues from cognate fields and with students. In addition to charting the steady growth in collaborative research among business historians since 2000, the essay also identifies areas that especially lend themselves to this mode of historical inquiry, including comparative or transnational analysis that requires detailed knowledge of multiple societies, the development of oral history projects, and the use of data science techniques. It concludes by exploring the advantages of incorporating interdisciplinary research teams into curricular structures, using the example of the Bass Connections program at Duke University.


2020 ◽  
Vol 21 (26) ◽  
Author(s):  
Mikko Tolonen ◽  
Eetu Mäkelä ◽  
Jani Marjanen ◽  
Tuuli Tahko

Digital humanities education should focus on clearly defined subfields that make sense in the local context. At the University of Helsinki, we decided to concentrate on the field of interdisciplinary digital humanities. In this article we show that interdisciplinarity is essential to conducting digital humanities research successfully, and we argue that it is best integrated into digital humanities education through project-based learning grounded in collaborative humanities research.
Digital Humanities can be regarded as a complex landscape of partially overlapping and variously connected domains, including e.g. computational humanities, multimodal cultural heritage and digital cultural studies and cultural analytics (Svensson 2010). Yet, as a precondition for setting up an educational programme within an academic institution, one needs to be able to delineate the discipline being taught (Sinclair and Gouglas 2002, 168) in terms of a coherent academic identity, interrelations between courses, and skills that graduates will attain. Therefore any locally situated educational enterprise needs to focus on those areas of DH that can be reasonably tied to research conducted at the hosting institution. At the University of Helsinki, we have put particular effort into defining our educational profile in interdisciplinary computational humanities, taught both as a minor studies module (30 ECTS) and an MA track (120 ECTS). Because of the complexities of humanities data and the lack of standard protocols for dealing with it, it is very difficult for a humanities scholar to apply computational and statistical methods in a trustworthy manner without specialist help. At the same time, neither can computer scientists, statisticians or physicists answer humanities questions on their own, even if they understand the algorithms.
Our solution to this problem is to argue that computational humanities research, and as a consequence also digital humanities education, should be fundamentally interdisciplinary endeavours, where statisticians, computer scientists and scholars in the humanities work together to develop, test and apply the methodology to solve humanities questions. Our version of computational humanities thus exists precisely and solely at the intersection of humanities and computer science rather than as separate from either of them. Consequently, people participating in this field should primarily anchor their academic profile to one of the parent disciplines instead of trying to find an identity purely in the middle. This is reflected in our educational approach. We provide students in the humanities with instruction on how to use ready-made tools, workflows or applied programming, granting them a general digital competency and agency, but our focus is on developing a broader literacy regarding data and computational methods. By learning to contextualize their skills within the field of computational humanities as a whole, the humanities students also learn to assess where their personal boundaries lie, and where an interdisciplinary collaboration is required instead. In this context, their computational literacy also helps them converse with the methodological experts coming to the field from computer science. In this interdisciplinary setting, we take a project-based approach to learning, tying teaching to actual research projects being conducted at the faculty. This approach both harnesses the varying competencies of our students and provides an excellent basis for learning interdisciplinary collaboration (Bell 2010). 
The culmination of our project courses is the Digital Humanities Hackathon, a multidisciplinary collaboration between the University of Helsinki digital humanities programme and the data science programmes at the Department of Computer Science and Aalto University. For researchers and students from computer and data sciences, the Hackathon is an opportunity to test their abstract knowledge against complex real-life problems; for people from the humanities and social sciences, it shows what is possible to achieve with such collaboration. For both, the Hackathon gives the experience of working with people from different backgrounds as part of an interdisciplinary team and simulates group work in such professional settings as the students may find themselves in after graduation, acculturating them to work outside academia (cf. Rockwell and Sinclair 2012). Our conception of computational humanities as intrinsically collaborative and interdisciplinary is based on the realisation that the traditional, single-author research culture of the humanities is a hindrance to successfully integrating computational approaches into humanities research. We feel that our formulation of the field has the power to contribute to the renewal of research culture and education within the humanities in general, adding value to traditional disciplinary curricula, as well as equipping students with skills relevant in the workplace.


2019 ◽  
Vol 4 (2) ◽  
pp. 81-89 ◽  
Author(s):  
Linda B. Cottler ◽  
Alan I. Green ◽  
Harold Alan Pincus ◽  
Scott McIntosh ◽  
Jennifer L. Humensky ◽  
...  

Abstract: The opioid crisis in the USA requires immediate action through clinical and translational research. Network infrastructure already built through funding by the National Institute on Drug Abuse (NIDA) and National Center for Advancing Translational Sciences (NCATS) provides a major advantage for implementing opioid-focused research, and together these networks could address this crisis. NIDA supports training grants and clinical trial networks; NCATS funds the Clinical and Translational Science Award (CTSA) Program with over 50 NCATS academic research hubs for regional clinical and translational research. Together, there is unique capacity for clinical research, bioinformatics, data science, community engagement, regulatory science, institutional partnerships, training and career development, and other key translational elements. The CTSA hubs provide an unprecedented and timely response to local, regional, and national health crises to address research gaps [Clinical and Translational Science Awards Program, Center for Leading Innovation and Collaboration, Synergy paper request for applications]. This paper describes opportunities for collaborative opioid research at CTSA hubs and NIDA–NCATS opportunities that build capacity for best practices as this crisis evolves. Results of a Landscape Survey (among 63 hubs) are provided with descriptions of best practices and ideas for collaborations, with research conducted by hubs also involved in premier NIDA initiatives. Such collaborations could provide a rapid response to the opioid epidemic while advancing science in multiple disciplinary areas.


2021 ◽  
Vol 45 (2) ◽  
Author(s):  
Hoa Luong ◽  
Daria Orlowska ◽  
Colleen Fallaw ◽  
Yali Feng ◽  
Livia Garza ◽  
...  

How do you help people improve their data management skills? For our team at the University of Illinois at Urbana-Champaign, we decided the answer was “one nudge at a time”. A study conducted by Wiley and Mischo (2016) found that Illinois researchers are aware of the data services available but under-utilize them. Many researchers do not consider data management as a concern distinct from researching and producing scholarly work products. In 2017, the RDS piloted the Data Nudge – a monthly, opt-in email service to “nudge” Illinois researchers toward good data management practices and toward utilizing data services on campus. The aim of the Data Nudge was to address the gap between knowing about a service and using it by highlighting best practices and campus resources. The topics covered in the Data Nudge center around data. Some topics are applicable to everyone, such as data back-up, documentation, and file naming conventions. Other topics are specific to Illinois, like storage options, events, and conferences. After four years, the Data Nudge has accumulated over 400 subscribers through word-of-mouth, marketing channels on campus and inclusion in subject liaisons' instructional workshops. It receives stable open rates averaging 52% (compared to a 19.44% average industry rate for Higher Education) and many compliments from subscribers. We expect the Data Nudge to continue supplementing workshops and training as an effective means of communication to reach researchers on our campus. In the spirit of re-use, we are in the process of archiving the Data Nudge topics in a reusable format, readily adaptable by other institutions. Data Nudge link: https://go.illinois.edu/past_nudges


Author(s):  
Luca Barbaglia ◽  
Sergio Consoli ◽  
Sebastiano Manzan ◽  
Diego Reforgiato Recupero ◽  
Michaela Saisana ◽  
...  

Abstract: This chapter is an introduction to the use of data science technologies in the fields of economics and finance. The explosion in computation and information technology over the past decade has made available vast amounts of data in various domains, which has been referred to as Big Data. In economics and finance, in particular, tapping into these data brings research and business closer together, as data generated in ordinary economic activity can be used towards effective and personalized models. In this context, the recent use of data science technologies for economics and finance provides mutual benefits to both scientists and professionals, improving forecasting and nowcasting for several kinds of applications. This chapter introduces the subject through underlying technical challenges such as data handling and protection, modeling, integration, and interpretation. It also outlines some of the common issues in economic modeling with data science technologies and surveys the relevant big data management and analytics solutions, motivating the use of data science methods in economics and finance.


Research ecosystems within university environments are continuously evolving and requiring more resources and domain specialists to assist with the data lifecycle. Typically, academic researchers and professionals are overcommitted, making it challenging to stay up to date on recent developments in best practices of data management, curation, transformation, analysis, and visualization. Recently, research groups, university core centers, and Libraries have been revitalizing these services to fill in the gaps and to aid researchers in finding new tools and approaches to make their work more impactful, sustainable, and replicable. In this paper, we report on a student consultation program built within the University Libraries that takes an innovative, student-centered approach to meeting the research data needs in a university environment while also providing students with experiential learning opportunities. This student program, DataBridge, trains students to work in multi-disciplinary teams and as student consultants to assist faculty, staff, and students with their real-world, data-intensive research challenges. Centering DataBridge in the Libraries allows students the unique opportunity to work across all disciplines, on problems and in domains that some students may not interact with during their college careers. To encourage students from multiple disciplines to participate, we developed a scaffolded curriculum that allows students from any discipline and skill level to quickly develop the essential data science skill sets and begin contributing their own unique perspectives and specializations to the research consultations. These students, mentored by Informatics faculty in the Libraries, provide research support that can ultimately impact the entire research process.
Through our pilot phase, we have found that DataBridge enhances the utilization and openness of data created through research, extends the reach and impact of the work beyond the researcher’s specialized community, and creates a network of student “data champions” across the University who see the value in working with the Library. Here, we describe the evolution of the DataBridge program and outline its unique role in both training the data stewards of the future with regard to FAIR data practices, and in contributing significant value to research projects at Virginia Tech. Ultimately, this work highlights the need for innovative, strategic programs that encourage and enable real-world experience of data curation, data analysis, and data publication for current researchers, all while training the next generation of researchers in these best practices.


2021 ◽  
Vol 73 (2) ◽  
pp. 322-341
Author(s):  
Agusta Palsdottir

Purpose: The purpose of this paper is to investigate the knowledge of and attitudes towards research data management, the use of data management methods and the perceived need for support, in relation to participants’ field of research. Design/methodology/approach: This is a quantitative study. Data were collected by an email survey sent to 792 academic researchers and doctoral students. Total response rate was 18% (N = 139). The measurement instrument consisted of six sets of questions, covering data management plans, the assignment of additional information to research data, metadata, standard file naming systems, training in data management methods and the storing of research data. Findings: The main finding is that knowledge about the procedures of data management is limited, and data management is not a normal practice in the researchers' work. The researchers were, however, in general, of the opinion that the university should take the lead by recommending and offering access to the necessary tools of data management. Taken together, the results indicate that there is an urgent need to increase the researchers' understanding of the importance of data management that is based on professional knowledge and to provide them with resources and training that enable them to make effective and productive use of data management methods. Research limitations/implications: The survey was sent to all members of the population rather than to a sample of it. Because of the response rate, the results cannot be generalized to all researchers at the university.
Nevertheless, the findings may provide an important understanding of their research data procedures, in particular what characterizes their knowledge about data management and attitude towards it. Practical implications: Awareness of these issues is essential for information specialists at academic libraries, together with other units within the universities, to be able to design infrastructures and develop services that suit the needs of the research community. The findings can be used to develop data policies and services, based on professional knowledge of best practices and recognized standards, that assist the research community with data management. Originality/value: The study contributes to the existing literature about research data management by examining the results by participants’ field of research. Recognition of the issues is critical in order for information specialists in collaboration with universities to design relevant infrastructures and services for academics and doctoral students that can promote their research data management.

