A Data Scientist… Who, Me?!

Hello, everyone! I am a new author on the ThinkToStart team and very excited to share my ideas as well as engage with this community.

Let me begin my first post by explaining that I never in a million years would have imagined that I could be considered a data scientist. Although I have gotten pretty good at solving specific types of problems using specific software tools, I really consider myself to be at the beginning stages of my development as a data scientist. In school, I was terrified of statistics, peaked in mathematics at the age of 12, and am not a programmer of any computer language. It wasn’t until I finished my doctoral degree and went out into the working world that I started to get interested in data mining.

I needed the real-world context of pressing business problems that I had to solve for my company and for my clients. I was initially very uncomfortable and lacked confidence when talking about data science, data mining, predictive analytics, or whatever terms were being used. So, I found mentors to teach me how to use specific software applications and solve specific problems; I blew the dust off my old statistics books from college; I watched online videos and read the blogs of people who knew what they were doing. I dove in. And, as I continue to dive deeper into the field and keep learning, I realize with every passing day that all of this data science is ultimately meant to help our amazingly complex and algorithm-defying brains make decisions more efficiently. There are new and better tools every day to support our decisions. But, in the end, data science is a tool to support us and our ability to make decisions more confidently about problems that only we can define and tackle.

Every day, we use data to make decisions. Hearing about a health study leads us to add something new to our diets. We purchase books and music after seeing advertisements while browsing the internet. All of these behaviors, from buying gas and groceries to writing online reviews that document experiences at restaurants and doctors’ offices, provide data to businesses and service providers. Data science is the process of taking all of this information (collected from surveys, social media comments, recorded phone conversations, purchase histories, etc.), making sense of it, and then using it to inform practical strategies pertaining to sales and marketing, customer service, product development, and other types of decisions and behaviors.

When most people think of data science, they focus on computer science, mathematics, statistics, and engineering. While those disciplines are certainly important to data science, it is also necessary to understand the business context of the problem you are trying to solve, how data is collected and analyzed to define that context, and how findings will be explained to people in the organization without a technical background, who have to implement model output and create practices and policies based on the data.

Therefore, all members of organizations are involved in data science. Some members of the organization decide what the business initiative is, some collect and analyze data, some create models and output new data, and some act upon the new data by turning it into new processes or products. It is extremely rare to find one person who can participate in all aspects of a data science project. Instead, data science done right takes a multidisciplinary team that works together.

In addition to machine learning algorithms and database queries, data science requires the desire to make things better, a sense of curiosity, and basic human communication skills.

In a series of posts, we will explore various aspects of data science: what the process is, who is involved, what is required, and what can result.

James Vineburgh

I never imagined that I would ever want to be a data scientist (or be remotely qualified to be considered one). As a student who was scared of statistics, a non-coder, and someone convinced that I had peaked in math as a sixth grader, I did not become interested in data mining until I had real-world opportunities to tackle concrete business challenges. Now, I enjoy solving business problems and developing new products in the fields of advertising technology and market research in my work with Campus Explorer. Over the past few years, I have learned how to use RapidMiner, some R and Shiny, and several other tools that don't require programming expertise. I read software manuals and online tutorials (such as ThinkToStart!), watch videos, and work with mentors to learn what I need to know to address the problem at hand. I am living proof that anyone who is curious, willing to fail over and over while learning, and interested in how to harness and make sense of data from all kinds of sources (including social media) can become a data scientist.