Being a Data Scientist: My Experience and Toolset

If I had to sum up my position at UNC in a few words, I might not have called myself a data scientist. When I was starting my career there was no such thing, but looking at my CV / résumé, I have:

  • Worked at a billion-dollar company, writing the integration process that pushed 40+ large datasets through complex models and analytics to produce one large modeled data product.
  • Done graduate work in text mining and data mining.
  • Written an innovative search engine from scratch and worked to commercialize it with two professors (it was their patent, but I was the programmer in the end).
  • Worked at UNC, Duke, and NC State University through Renci doing data mining, cartography, and interactive and static information visualization for various domain scientists.

I have done dozens of projects, and apparently I’ve amassed a fair bit of knowledge along the way, some of it without really noticing. Sometimes I answer a question and I think, “How did I know that, anyway?”

Well, yesterday I started mentoring at Thinkful in their Flexible Data Science Bootcamp, and I have to say that I love it already. I like their approach, because it blends 1-on-1 time with remote learning and goes out of its way to support its mentors in being good educators and not just experts.

But as I dig through my data science know-how, I want to share it with more than just one student at a time. So this is the first in a series of posts about what it’s like to be a data scientist, or perhaps more accurately, what I did as a data scientist and how that might relate to someone new to doing data science in the field.

Some posts will include direct examples of doing data science projects in Python, and some will be more about the tools of the trade and how to work with open source tools to do data science. And some posts, like this one, will be more about “life as a data scientist.”

Source: Being a Data Scientist: My Experience and Toolset · Jefferson Heard