US data science education lacks a much-needed focus on ethics

Training for data science – dubbed the sexiest job of the 21st century by Harvard Business Review – falls short in preparing students for the ethical use of data science, our new study found.
Data science lies at the nexus of statistics and computer science applied to a particular field such as astronomy, linguistics, medicine, psychology or sociology. The idea behind this data crunching is to use big data to address otherwise unsolvable problems, such as how health care providers can create personalised medicine based on a patient’s genes and how businesses can make purchase predictions based on customers’ behaviour.
The United States Bureau of Labor Statistics projects 15% growth in data science careers from 2019 to 2029, corresponding with an increased demand for data science training.
Universities and colleges have responded to the demand by creating new programmes or revamping existing ones. The number of undergraduate data science programmes in the US jumped from 13 in 2014 to at least 50 as of September 2020.
As educators and practitioners in data science, we were prompted by the growth in programmes to investigate what is covered, and what is not covered, in data science undergraduate education.
In our study, we compared undergraduate data science curricula with the expectations for undergraduate data science training put forth by the National Academies of Sciences, Engineering and Medicine.
Those expectations include training in ethics. We found most programmes dedicated considerable coursework to mathematics, statistics and computer science, but little training in ethical considerations such as privacy and systemic bias. Only 50% of the degree programmes we investigated required any coursework in ethics.
Why it matters
As with any powerful tool, the responsible application of data science requires training both in how to use it and in understanding its impacts. Our results align with prior work that found little attention is paid to ethics in data science degree programmes. This suggests that undergraduate data science degree programmes may produce a workforce without the training and judgment to apply data science methods responsibly.
It isn’t hard to find examples of irresponsible use of data science. For instance, policing models that have a built-in data bias can lead to an elevated police presence in historically over-policed neighbourhoods. In another example, algorithms used by the US health care system are biased in a way that causes black patients to receive less care than white patients with similar needs.
We believe explicit training in ethical practices would better prepare a socially responsible data science workforce.
What still isn’t known, and what next
While data science is a relatively new field – still being defined as a discipline – guidelines exist for training undergraduate students in data science. These guidelines prompt the question: How much training can we expect in an undergraduate degree?
The National Academies recommend training in 10 areas, including ethical problem solving, communication and data management.
Our work focused on undergraduate data science degrees at universities classified as R1, meaning they engage in high levels of research activity. Further research could examine the amount of training and preparation in various aspects of data science at the masters and PhD levels and the nature of undergraduate data science training at universities of different research levels.
Given that many data science programmes are new, there is considerable opportunity to compare the training that students receive with the expectations of employers.
We plan to expand on our findings by investigating the pressures that might be driving curriculum development for degrees in other disciplines that are seeing similar job market growth.
Jeffrey C Oliver is a data science specialist at the University of Arizona, United States. Torbet McNeil is a PhD candidate in educational policy studies and practice at the University of Arizona, US. This article is republished from The Conversation under a Creative Commons licence. Read the original article.