Data scientists take the recommendations that the business analysts make and perform a variety of tasks, including the following:
Build the technical case. Data scientists apply advanced math and statistics to build the technical case around the hypotheses that the business analysts propose, and they build the models required to test those hypotheses. This approach is central to big data: you start with a hypothesis. For example, if we change the branding colors on a product on a given day, publish the change on Twitter, and it is positively received, we can expect sales to increase by 4 percent. That is the hypothesis.
Create the mathematical models. These models define how positive sentiment is measured and determine which tests need to be run to find correlations between that sentiment and sales increases.
Discover patterns, trends, and correlations. Some tasks may not necessarily start with a hypothesis. This is where the real power of big data comes in. You find patterns and trends you didn’t even know existed.
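As a rough illustration of how the branding hypothesis above might be tested, here is a minimal Python sketch. It assumes you have daily sales figures for days with the old branding and days with the new branding. The numbers, the function names, and the choice of a permutation test are all assumptions made for this example and do not come from any real campaign or from a prescribed method.

```python
import random
from statistics import mean


def relative_uplift(control, treated):
    """Relative change in mean sales, e.g. 0.04 for a 4 percent increase."""
    return mean(treated) / mean(control) - 1.0


def permutation_p_value(control, treated, n_permutations=10_000, seed=0):
    """One-sided permutation test.

    Estimates how often randomly shuffling the group labels produces a
    difference in means at least as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(control) + list(treated)
    n_control = len(control)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical daily sales figures, invented for illustration only.
control_sales = [100, 102, 98, 101, 99, 103, 97, 100]    # old branding
treated_sales = [104, 106, 103, 105, 102, 107, 104, 105]  # new branding

uplift = relative_uplift(control_sales, treated_sales)
p_value = permutation_p_value(control_sales, treated_sales)
print(f"observed uplift: {uplift:.1%}, one-sided p-value: {p_value:.4f}")
```

A small p-value suggests the uplift is unlikely to be due to chance alone. It does not show that the branding change caused the increase; other factors on those days would still need to be ruled out.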
The skill required here is to take a business idea and model it with numbers and data. Data scientists take that data and turn it into information. There can be a fine line between what data scientists do and what computer scientists do. There are some overlaps, but there are also jobs with a significant difference, namely in scientific and academic research.
Assessing your interest
As with business analysts, there is a set of questions you can ask yourself to see whether you’re a fit for this type of job. Consider the following questions carefully.
Are you naturally inquisitive?
Just as a business analyst needs to think in terms of building hypotheses, the data scientist needs to have aptitude in this area. Data scientists need to be able to construct models that can prove or disprove a given business hypothesis. Can you see beyond the surface issues and go deep? Do you know when a result has potential and needs further testing? Are you passionate about technology?
Can you focus for a long time?
The journey required to complete a PhD or advanced degree in the big data field can be a long one. You have to commit a significant amount of study to a specific area of research. Are there areas of math, statistics, or computer science that you have a passion for studying? Do you want to address big problems that may take years to solve? Do you like to write . . . a lot? Can you maintain intense focus on a few topics for many years — maybe for an entire career?
Are you self-motivated?
Data scientists need to be able to direct their own intellectual paths. Do you naturally follow a solution to its end? Do you have a knack for knowing where to find answers if you don’t know them?
Are you multidisciplined?
Data scientists need to be knowledgeable in multiple areas — math, statistics, and computer science. Can you pick up computer science languages and concepts easily? Does the idea of a new language excite you or intimidate you? Can you easily collaborate with others to learn new things?
Can you turn an idea into reality?
Data modeling requires the ability to take business concepts and ideas and model those within a world driven by numbers and data concepts. Do you have the aptitude or interest to build experiments that capture the business value?
Looking at a job posting
Let’s take a look at a job posting for a data scientist who would operate at a junior level.
Data Consultant – Recent College Grad
Are you a recent college graduate who loves big data? Are you passionate about cutting-edge technologies and solving challenges for Fortune 500 clients? As a consultant, you’ll be part of a team that develops and implements advanced algorithms and data pipelines that extract, classify, merge, and deliver new insights and business value out of structured and unstructured data sets. You’ll work on a team whose data science efforts range from exploration and investigation to design and development of analytic systems. You’ll have a chance to gain diverse experience across multiple technologies and create path-breaking solutions. You’ll be surrounded by, and learn from, the foremost thought leaders in the big data space.
This posting describes two paths: data engineering and data science.
Key responsibilities include:
Data engineering
Designing and developing code, scripts, and data pipelines that leverage structured and unstructured data integrated from multiple sources
Software installation and configuration
Participating in requirements and design workshops with our clients
Developing project deliverable documentation
Data science
Providing big data solutions for our clients, including analytical consulting, statistical modeling, and quantitative solutions
Mentoring sophisticated organizations on large-scale data and analytics and working closely with client teams to deliver results
Helping to translate business cases to clear research projects, be they exploratory or confirmatory, to help our clients utilize data to drive their businesses
Collaborating and communicating across geographically distributed teams and with external clients
Required skills/experience include:
Data engineering
BS or MS in Computer Science or equivalent work experience
Experience programming in Java, Python, SQL, or C/C++
Background that includes mathematics, statistics, machine learning, and data mining
Experience with SQL, NoSQL, relational database design, and methods for efficiently retrieving data
Prior work/research experience with unstructured data and data modeling
Strong analytical skills and creative problem solver
Excellent verbal and written communications skills
Strong team player capable of working in a demanding startup environment
Experience building complex and non-interactive systems (batch, distributed, and so on)
Data science
BS or MS in Computer Science, Math, or equivalent work experience
Coursework in mathematics, statistics, machine learning, and data mining
Proficiency in R or other math packages (Matlab, SAS, and so on)
Excellent programming skills in object-oriented languages
Adept at learning and applying new technologies
Excellent verbal and written communication skills
Strong team player capable of working in a demanding startup environment
Experience with Java and Python
You don’t have to have a PhD to be a data scientist. The first role, data engineering, requires the candidate to have a deep understanding of data modeling, programming, machine learning, and math. Although data engineers aren’t building complicated, research-oriented algorithms as in the second role, data science, they still need a deep understanding of data and of how to structure it to extract value.
Manu Jeevan is a self-taught data scientist and loves to explain data science concepts in simple terms. You can connect with him on LinkedIn, or email him at manu@bigdataexaminer.com.