The AI project development life cycle involves five distinct tasks. No single individual has enough skills (or time) to carry out all of them. Thus, teams include individuals who each focus on part of the cycle. Figure 1 shows six technical roles and how they relate to these tasks.

Figure 1: Technical roles in an AI team vs. the tasks they carry out in AI projects.

I   What tasks does a data scientist carry out?

Data scientists carry out data engineering, modeling, and business analysis tasks as shown in Figure 1. This includes:

Their skills complement those of people who deploy models and build software infrastructure.

II   What skills does a data scientist need?

Data scientists demonstrate solid scientific foundations as well as business acumen (see Figure 2). Communication skills are usually required, because data scientists often interface with product managers, clients, or business leaders to provide insights for decision making. They understand business and product metrics such as conversions, click-through rates, and customer lifetime value.

Figure 2: A visual representation of the data scientist’s skill set and level of proficiency.

They mostly write prototype code, rather than the production code that engineers write, and discard most of it.
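To make the business and product metrics named in this section concrete, here is a minimal Python sketch of the kind of throwaway calculation a data scientist might prototype. The function names, the example numbers, and the simplified lifetime-value formula (average order value times orders per year times years retained) are illustrative assumptions and do not come from our research; real definitions vary by company and product.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Fraction of impressions that led to a click (0.0 if there were none)."""
    return clicks / impressions if impressions else 0.0


def conversion_rate(conversions: int, visitors: int) -> float:
    """Fraction of visitors who completed the target action."""
    return conversions / visitors if visitors else 0.0


def customer_lifetime_value(avg_order_value: float,
                            orders_per_year: float,
                            years_retained: float) -> float:
    """Simplified lifetime value (assumed formula): revenue per order
    times order frequency times retention period. Ignores margins,
    discounting, and churn modeling."""
    return avg_order_value * orders_per_year * years_retained


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    print("CTR:", click_through_rate(clicks=50, impressions=2000))
    print("Conversion rate:", conversion_rate(conversions=5, visitors=200))
    print("CLV:", customer_lifetime_value(40.0, 3, 2))
```

A snippet like this is typical of prototype code: small, easy to check by hand, and likely to be replaced once the metric definitions are agreed on with product managers or business leaders.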

If you’re interested in comparing your skills to other data scientists, we recommend taking the standardized machine learning, data science, mathematics, and algorithmic coding tests on Workera. If you’re a company hiring data scientists, you can administer computerized tests to AI job applicants for free using Workera Test and connect with AI practitioners using Workera Connect.

III   What tools does a data scientist use?

Data scientists in different companies use different tools, but some tools stand out. The following tools grouped by task are the most frequently used tools identified in our research.

IV   In what team structure does a data scientist fit?

Building an AI team requires bringing together complementary individuals who can progressively carry out the tasks of the AI project development life cycle. AI teams focus on data engineering and modeling from the beginning, because they need to validate the feasibility of an AI project or idea. As the project matures, the team starts focusing on deployment, business analysis, and AI infrastructure.

Data scientists combine well with software engineers and software engineers-machine learning. Data scientists prototype solutions to prove a concept, while engineers make the project available to users.


Conclusion

This article aims to clarify what a data scientist is, what tasks they carry out, and what skills they need. If you’re an AI practitioner, we hope it helps you choose a career track.

Companies may refer to this role as data scientist, data analyst, machine learning engineer, research scientist, statistician, quantitative analyst, full-stack data scientist, and other titles. If you’re a hiring manager, we hope that it helps you define your job requirements.

AI organizations are constantly evolving, so this article is a work in progress. We intend to revise it as our team learns more about new roles.

The AI project development life cycle involves five distinct tasks: data engineering, modeling, deployment, business analysis, and AI infrastructure.

Author(s)

  1. Kian Katanforoosh - Founder at Workera, Lecturer at Stanford University - Department of Computer Science, Founding member at deeplearning.ai

Acknowledgment(s)

  1. The layout for this article was originally designed and implemented by Jingru Guo, Daniel Kunin, and Kian Katanforoosh for the deeplearning.ai AI Notes, and inspired by Distill.

Footnote(s)

  1. You can practice for the machine learning test, the deep learning test, the data science test, the mathematics test, the algorithmic coding test, and the software engineering test in The Skills Boost.