Data Science Workflow: Overview and Challenges
October 2013 (perspective of a postdoc)
During my Ph.D., I created tools for people who write programs to obtain insights from data. Millions of professionals in fields such as science, engineering, business, finance, public policy, and journalism, as well as numerous students and computer hobbyists, perform this sort of programming on a daily basis.
Shortly after I wrote my dissertation in 2012, the term "Data Science" started appearing everywhere. Some industry pundits call data science the "sexiest job of the 21st century." And universities are putting tremendous funding into new Data Science Institutes.
I now realize that data scientists are one main target audience for the tools that I created throughout my Ph.D. However, that job title was not as prominent back when I was in grad school, so I didn't mention it explicitly in my dissertation.
What do data scientists do at work, and what challenges do they face?
This post provides an overview of the modern data science workflow, adapted from Chapter 2 of my Ph.D. dissertation, Software Tools to Facilitate Research Programming.
Last modified: 2013-10-30