Building employability capabilities in data science students: An interdisciplinary, industry‐focused approach

Sonia Ferns; Aloke Phatak; Susan Benson; Nina Kumagai

doi:10.1111/test.12272

3165 Diseased and Healthy Gastrointestinal Tissue Data Mining requires an Engaged Transdisciplinary team

Journal of Clinical and Translational Science ◽

10.1017/cts.2019.299 ◽

2019 ◽

Vol 3 (s1) ◽

pp. 131-132

Author(s):

Sana Syed ◽

Marium Naveed Khan ◽

Alexis Catalano ◽

Christopher Moskaluk ◽

Jason Papin ◽

...

Keyword(s):

Global Health ◽

Data Science ◽

Team Member ◽

Medical Graduate ◽

Science Students ◽

Team Leaders ◽

Research Project ◽

Team Members ◽

The Past ◽

New Methods

OBJECTIVES/SPECIFIC AIMS: To establish an effective team of researchers working towards developing and validating prognostic models that use image analyses and other numerical metadata to better understand pediatric undernutrition, and to learn how different approaches can be brought together collaboratively and efficiently. METHODS/STUDY POPULATION: Over the past 18 months we have established a transdisciplinary team spanning three countries and the Schools of Medicine, Engineering, Data Science and Global Health. We first identified two team leaders, specifically a pediatric physician scientist (SS) and a data scientist/engineer (DB). The leaders worked together to recruit team members, with the understanding that different ideas are encouraged and will be used collaboratively to tackle the problem of pediatric undernutrition. The final data analytic and interpretative core team consisted of four data science students, two PhD students, an undergraduate biology major, a recent medical graduate, and a PhD research scientist. Additional collaborative members included faculty from Biomedical Engineering and the School of Medicine (Pediatrics and Pathology), along with international Global Health faculty from Pakistan and Zambia. We learned early on that it was important to understand each member’s motivation for contributing to the project and to align that motivation with the overall goals of the team. This helped us prioritize team member tasks and streamline ideas. We also incorporated a mechanism of weekly (monthly/bimonthly for global partners) meetings with informal oral presentations, which consisted of each member’s current progress, thoughts and concerns, and next experimental goals. This method gave team leaders a 360° mechanism of feedback. Overall, we assessed the effectiveness of our team by two mechanisms: 1) ongoing team member feedback, including team leaders, and 2) progress of the research project.
RESULTS/ANTICIPATED RESULTS: Our feedback has shown that during the initial development of the team there was hesitance in communication due to the diverse backgrounds of our members along with different cultural/social expectations. We used ice-breaking methods such as dedicated time for brief introductions, career directions, and life goals for each team member. We subsequently found that, with the exception of one, all team members described our working environment as professional and conducive to productivity. We also learned from our method of ongoing feedback that at times, due to the complexity of the different disciplines, some information was lost because of differences in educational backgrounds. We have now employed new methods to relay information more effectively, not just sharing literature but also explaining its content. The progress of our research project has varied over the past 4-6 months. There was a steep learning curve for almost every member; for example, none of the data science students had studied anything related to medicine during their education, and they had minimal, if any, exposure to the ethics of medical research. Conversely, team members with medical/biology backgrounds had minimal prior exposure to computational modeling, computer engineering and the vocabulary used to communicate mathematical algorithms. While this may have slowed our progress, we learned that by asking questions and engaging every member it was easier to delegate tasks effectively. Once our team reached an overall understanding of each member’s goals there was steady progress in the project, with new results and new methods of analysis being tested every week.
DISCUSSION/SIGNIFICANCE OF IMPACT: We expect that our ongoing collaboration will result in the development of novel modalities to understand and diagnose pediatric undernutrition, and that it can be used as a model to tackle several other problems. As with many team science projects, credit and authorship are challenges for which we are outlining creative strategies, as suggested by the International Committee of Medical Journal Editors (ICMJE) and other literature.

On traversing the data landscape: Introducing APIs to data‐science students

Teaching Statistics ◽

10.1111/test.12266 ◽

2021 ◽

Vol 43 (S1) ◽

Author(s):

Anna Fergusson ◽

Chris J. Wild

Keyword(s):

Data Science ◽

Science Students

The importance and emergence of K-12 data science

Phi Delta Kappan ◽

10.1177/00317217211043627 ◽

2021 ◽

Vol 103 (1) ◽

pp. 49-53

Author(s):

Tanya LaMar ◽

Jo Boaler

Keyword(s):

Science Education ◽

Data Analysis ◽

Vaccine Efficacy ◽

Data Science ◽

Mathematics Curriculum ◽

Science Students ◽

Data Literacy ◽

Global Pandemic ◽

K 12

The COVID-19 global pandemic has required everyone to make sense of data about community spread, levels of risk, and vaccine efficacy. Yet research shows that students are underprepared in data literacy. Tanya LaMar and Jo Boaler argue that data science education provides an opportunity to address this problem while providing much-needed updates to the current mathematics curriculum. The integration of data science can provide a more equitable mathematics pipeline than the calculus-focused pathway that has excluded most students from a future in mathematics. Through data science, students can learn to answer questions that are relevant to their lives and communities, to be critical consumers of the data that surround them every day, and to wield the power of data analysis.

Helping Data Science Students Develop Task Modularity

Proceedings of the 52nd Hawaii International Conference on System Sciences ◽

10.24251/hicss.2019.134 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jeff Saltz ◽

Robert Heckman ◽

Kevin Crowston ◽

Sangseok You ◽

Yatish Hegde

Keyword(s):

Data Science ◽

Science Students

Finding Our Community: Iterating Data Science & Visualization Services

10.31229/osf.io/v725c ◽

2019 ◽

Author(s):

Mia Partlow

Keyword(s):

Data Science ◽

Peer To Peer ◽

Lessons Learned ◽

User Needs ◽

First Year ◽

Science Students ◽

Peer Training ◽

Consulting Services ◽

New Space ◽

Training And Support

Lightning Talk given at Designing Libraries VIII, Atlanta, GA October 6-8, 2019. The Dataspace at NC State University’s Hunt Library opened a year ago, offering a front door for the Libraries’ data science and visualization services with specialized computing and peer-to-peer training and support. During its first year in operation, we worked to assess and refine the services in the Dataspace to respond to user needs, but we have also been able to take advantage of those lessons learned as we prepare for the next generation of data science and visualization spaces at our other main branch, Hill Library, which is currently undergoing renovation. This presentation discusses what we’ve learned by approaching the Dataspace as both fully realized service point and testing ground, how we plan to implement assessment insights in the new space, and how we launched pop-up data science consulting services at Hill Library in order to better serve and understand the community of data science students and researchers on our central campus.

A Way How to Impart Data Science Skills to Computer Science Students Exemplified by OBDA-Systems Development

Procedia Computer Science ◽

10.1016/j.procs.2017.05.191 ◽

2017 ◽

Vol 108 ◽

pp. 2161-2170 ◽

Cited By ~ 1

Author(s):

Svetlana Chuprina ◽

Igor Postanogov ◽

Taisya Kostareva

Keyword(s):

Computer Science ◽

Data Science ◽

Systems Development ◽

Science Students

Project-Based Learning via Competition for Data Science Students

10.1162/99608f92.44f54f00 ◽

2021 ◽

Author(s):

Philip L. H. Yu ◽

Wai Keung Li

Keyword(s):

Data Science ◽

Project Based Learning ◽

Science Students

Toward Training and Assessing Reproducible Data Analysis in Data Science Education

Data Intelligence ◽

10.1162/dint_a_00053 ◽

2019 ◽

Vol 1 (4) ◽

pp. 381-392 ◽

Cited By ~ 4

Author(s):

Bei Yu ◽

Xiao Hu

Keyword(s):

Public Policy ◽

Science Education ◽

Data Analysis ◽

Action Research ◽

Data Science ◽

Research Data ◽

Reproducible Research ◽

Science Students ◽

Student Training ◽

Peer Reports

Reproducibility is a cornerstone of scientific research, and data science is no exception. In recent years scientists have been concerned about the large number of irreproducible studies. Such a reproducibility crisis in science could severely undermine public trust in science and science-based public policy. Recent efforts to promote reproducible research have mainly focused on established scientists and much less on student training. In this study, we conducted action research on students in data science to evaluate to what extent students are ready to communicate reproducible data analysis. The results show that although two-thirds of the students claimed they were able to reproduce results in peer reports, only one-third of reports provided all necessary information for replication. The actual replication results also include conflicting claims; some lacked comparisons of original and replication results, indicating that some students did not share a consistent understanding of what reproducibility means and how to report replication results. The findings suggest that more training is needed to help data science students communicate reproducible data analysis.

CS vs non-CS: Analyzing Online Social Behaviors of Data Science Students with Diverse Academic Backgrounds

Proceedings of the 26th ACM Conference on Innovation and Technology in Computer Science Education V. 2 ◽

10.1145/3456565.3460058 ◽

2021 ◽

Author(s):

Wensheng Wu

Keyword(s):

Data Science ◽

Social Behaviors ◽

Science Students

Reading datasets: Strategies for interpreting the politics of data signification

Big Data & Society ◽

10.1177/20539517211029322 ◽

2021 ◽

Vol 8 (2) ◽

pp. 205395172110293

Author(s):

Lindsay Poirier

Keyword(s):

Reading Strategies ◽

Data Science ◽

Science Students ◽

Data Ethics ◽

Double Binds ◽

Data Semantics ◽

Critical Attention ◽

Vested Interests ◽

Political Commitments ◽

Ethics Curriculum

All datasets emerge from and are enmeshed in power-laden semiotic systems. While emerging data ethics curriculum is supporting data science students in identifying data biases and their consequences, critical attention to the cultural histories and vested interests animating data semantics is needed to elucidate the assumptions and political commitments on which data rest, along with the externalities they produce. In this article, I introduce three modes of reading that can be engaged when studying datasets—a denotative reading (extrapolating the literal meaning of values in a dataset), a connotative reading (tracing the socio-political provenance of data semantics), and a deconstructive reading (seeking what gets Othered through data semantics and structure). I then outline how I have taught students to engage these methods when analyzing three datasets in Data and Society—a course designed to cultivate student competency in politically aware data analysis and interpretation. I show how combined, the reading strategies prompt students to grapple with the double binds of perceiving contemporary problems through systems of representation that are always situated, incomplete, and inflected with diverse politics. While I introduce these methods in the context of teaching, I argue that the methods are integral to any data practice in the conclusion.
