The AI project development life cycle involves five distinct tasks. No single individual has enough skills (or time) to carry out all of them, so teams include individuals who each focus on part of the cycle. Here is a visual representation of six technical roles and how they relate to these tasks.
I What tasks does a software engineer - machine learning carry out?
People with the title software engineer - machine learning carry out the data engineering, modeling, deployment, and AI infrastructure tasks shown in Figure 1. This includes:
- data engineering subtasks such as defining data requirements, collecting, labeling, inspecting, cleaning, augmenting, and moving data.
- modeling subtasks such as training machine learning models, defining evaluation metrics, searching hyperparameters, and reading research papers.
- deployment subtasks such as converting prototyped code into production code, setting up a cloud environment to deploy the model, or improving response times and saving bandwidth.
- AI infrastructure subtasks such as building and maintaining reliable, fast, secure, and scalable software systems to help people working in data engineering, modeling, deployment and business analysis.
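As a rough illustration of the modeling subtasks listed above (training a model, defining an evaluation metric, and searching hyperparameters), here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy one-dimensional dataset, the nearest-neighbor classifier, the accuracy metric, and the helper names (`make_toy_data`, `knn_predict`, `search_k`) do not come from the article. In practice this work would more likely be done with the libraries named later in this article, such as scikit-learn.

```python
import random


def accuracy(y_true, y_pred):
    """Evaluation metric: fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def knn_predict(train_x, train_y, x, k):
    """Predict the majority label (0 or 1) among the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def make_toy_data(n, seed):
    """Hypothetical dataset: label is 1 when x > 5, with about 10% of labels flipped."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = []
    for x in xs:
        label = 1 if x > 5.0 else 0
        if rng.random() < 0.1:
            label = 1 - label  # simulate labeling noise
        ys.append(label)
    return xs, ys


def search_k(train, valid, candidates):
    """Grid search: score each candidate k on the validation set, keep the best."""
    train_x, train_y = train
    valid_x, valid_y = valid
    scores = {}
    for k in candidates:
        preds = [knn_predict(train_x, train_y, x, k) for x in valid_x]
        scores[k] = accuracy(valid_y, preds)
    best_k = max(scores, key=scores.get)
    return best_k, scores


train = make_toy_data(200, seed=0)
valid = make_toy_data(100, seed=1)
# Odd values of k avoid tied votes between the two classes.
best_k, scores = search_k(train, valid, candidates=[1, 3, 5, 9, 15])
print("best k:", best_k, "validation accuracy:", scores[best_k])
```

The sketch scores each hyperparameter value on a held-out validation set rather than on the training data, which is the usual way to keep the search from simply rewarding memorization.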
II What skills does a software engineer - machine learning need?
The software engineer - machine learning demonstrates solid engineering skills and is developing scientific skills (see Figure 2). Communication skill requirements vary among teams. People in this role mostly write production code and sometimes prototyping code, whereas data scientists mostly write prototyping code and software engineers write lots of production code.
If you’re interested in comparing your skills to other software engineers - machine learning, we recommend taking the standardized machine learning, data science, mathematics, algorithmic coding, and software engineering tests on Workera. If you’re a company hiring software engineers - machine learning, you can administer computerized tests to AI job applicants for free using Workera Test and connect with AI practitioners using Workera Connect.
III What tools does a software engineer - machine learning use?
Software engineers - machine learning use different tools at different companies, but some tools stand out. The following tools, grouped by task, are the ones our research identified as most frequently used.
- Modeling is primarily done in Python using packages such as numpy, scikit-learn, pandas, matplotlib, TensorFlow, and PyTorch.
- Data engineering happens in Python and/or SQL or other domain-specific query languages.
- Deployment and AI infrastructure rely on an object-oriented programming language such as Python, Java, or C++ and cloud technologies such as AWS, GCP, and Azure.
- Collaboration and workflow are managed with a version control system (for instance, Git, Subversion, or Mercurial), a command line interface (CLI) such as a Unix shell, an integrated development environment (IDE) or editor such as Jupyter Notebook or Sublime, and an issue tracking product like JIRA.
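To make the "Python and/or SQL" pairing for data engineering more concrete, here is a small sketch that inspects and cleans a labeled dataset using SQL issued from Python. It relies only on Python's built-in `sqlite3` module. The `examples` table, its columns, and the sample rows are hypothetical and chosen purely for illustration; a real pipeline would typically run against a production database or data warehouse.

```python
import sqlite3

# In-memory database standing in for a real data store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE examples (id INTEGER PRIMARY KEY, text TEXT, label TEXT)"
)

rows = [
    ("great product", "positive"),
    ("great product", "positive"),      # exact duplicate
    ("arrived broken", "negative"),
    ("not sure yet", None),             # missing label
    ("works as described", "positive"),
]
conn.executemany("INSERT INTO examples (text, label) VALUES (?, ?)", rows)

# Inspect: how many rows are missing a label?
missing = conn.execute(
    "SELECT COUNT(*) FROM examples WHERE label IS NULL"
).fetchone()[0]

# Clean: drop unlabeled rows, then keep only the first copy of each duplicate.
conn.execute("DELETE FROM examples WHERE label IS NULL")
conn.execute(
    "DELETE FROM examples WHERE id NOT IN "
    "(SELECT MIN(id) FROM examples GROUP BY text, label)"
)

# Summarize the cleaned data as a label -> count mapping.
label_counts = dict(
    conn.execute("SELECT label, COUNT(*) FROM examples GROUP BY label").fetchall()
)
conn.close()

print("rows missing a label:", missing)
print("label counts after cleaning:", label_counts)
```

Keeping the filtering and deduplication in SQL lets the database do the heavy lifting, while Python handles orchestration and any follow-up analysis.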
IV In what team structure does a software engineer - machine learning fit?
Building an AI team requires bringing together complementary individuals who can progressively carry out the tasks of the AI project development lifecycle. AI teams focus on data engineering and modeling from the beginning, because they need to validate the feasibility of an AI project or idea. As the project becomes more mature, the team starts focusing on deployment, business analysis, and AI infrastructure.
Software engineers - machine learning work well with scientists, analysts, and researchers who take charge of business analysis and modeling. The software engineer - machine learning is also the go-to role for early-stage teams or start-ups aiming to deploy machine learning models, because a person in this role can carry out a variety of tasks.
This one-person team is an alternative to the team combining a software engineer with a data scientist and/or a machine learning engineer. The latter format is most common for teams working on more mature machine learning projects such as improving an existing machine learning model in production. Also, it is easier to find people to fill this role than to find machine learning engineers, and it is less costly and quicker to fill a single position than to hire a data scientist plus a software engineer.
Conclusion
This article aims to clarify what a software engineer - machine learning is, what tasks they carry out, and what skills they need. If you’re an AI practitioner, we hope it helps you choose a career track.
Companies may refer to this position as machine learning engineer, software engineer, full-stack data scientist, or by many other titles. If you’re a hiring manager, we hope this article helps you define your job requirements.
AI organizations are constantly evolving, so this article is a work in progress. We intend to revise it as our team learns more about new roles.