Talk by Allison Theobold (PhD Defense in Statistics)

4/6/2020  12:10-1:00pm  Online


The importance of data science skills for modern environmental science research cannot be overstated, but graduate students in these fields typically lack these integral skills. Yet, over the last 20 years, statistics preparation in these fields has come to be considered vital, and statistics coursework has been readily incorporated into graduate programs. Thus, many environmental science graduate degree programs expect students to acquire the data science skills necessary for their research in the statistics coursework required for their degree. A gap exists, however, between the data science skills required for students’ participation in the entire data analysis cycle and those taught in statistics service courses. Over the last ten years, environmental science and statistics educators have outlined the shape of the data science skills specific to research in their respective disciplines. Disappointingly, both sides of these conversations have ignored the area at the intersection of these fields, specifically the data science skills necessary for environmental science practitioners of statistics.

My research focuses on describing the nature of environmental science graduate students’ need for data science skills when engaging in the data analysis cycle, through the voice of the students. In this presentation, I describe three qualitative studies, each investigating a different aspect of this need. First, I present a study describing environmental science students’ experiences acquiring the computing skills necessary to implement statistics in their research. In-depth interviews revealed three themes in these students’ paths toward computational knowledge acquisition: use of peer support, seeking out a “singular consultant,” and learning through independent research. Next, motivated by the need for extracurricular opportunities for acquiring data science skills, I describe research investigating the design and implementation of a suite of data science workshops tailored to meet the needs of environmental science graduate students. These workshops fill a critical gap in the environmental science and statistics curricula, providing students with the skills necessary to retrieve, view, wrangle, visualize, and analyze their data. Finally, I conclude with research that works toward identifying the key data science skills necessary for environmental science graduate students as they engage in the data analysis cycle.