AI organizations divide their work into data engineering, modeling, deployment, business analysis, and AI infrastructure. Carrying out these tasks requires a combination of technical, behavioral, and decision-making skills. The machine learning case study interview focuses on technical and decision-making skills, and you’ll encounter it during an onsite round for a Machine Learning Engineer (MLE), Data Scientist (DS), Machine Learning Researcher (MLR) or Software Engineer-Machine Learning (SE-ML) role. You can learn more about these roles in our AI Career Pathways report and about other types of interviews in The Skills Boost.
I What to expect in the machine learning case study interview
The interviewer is evaluating how you approach a real-world machine learning problem. The interview is usually a technical discussion of an open-ended question. There is no exact solution to the problem; it’s your thought process that the interviewer is evaluating. Here’s a list of interview questions you might be asked:
- How would you build a trigger word detection algorithm to spot the word “activate” in a 10 second long audio clip?
- An e-commerce company is trying to minimize the time it takes customers to purchase their selected items. As a machine learning engineer, what can you do to help them?
- You are given a data set of credit card purchases information. Each record is labeled as fraudulent or safe. You are asked to build a fraud detection algorithm. How would you proceed?
- You are provided with data from a music streaming platform. Each of the 100,000 records indicates the songs a user has listened to in the past month. How would you build a music recommendation system?
II Recommended framework
All interviews are different, but the ASPER framework is applicable to a variety of case studies:
- Ask. Ask questions to uncover details that were kept hidden by the interviewer. Specifically, you want to answer the following questions: “what are the product requirements and evaluation metrics?”, “what data do I have access to?”, and “how will the learning algorithm be used at test time, and does it need to be regularly re-trained?”
- Suppose. Make justified assumptions to simplify the problem. Examples of assumptions are: “we are in small data regime”, “human-level error is 7%”, “the data distribution won’t change over time”, etc.
- Plan. Break down the problem into tasks. A common task sequence in the machine learning case study interview is: (i) data engineering, (ii) modeling, and (iii) deployment.
- Execute. Announce your plan, and tackle the tasks one by one. In this step, the interviewer might ask you to write code or explain the maths behind your proposed method.
- Recap. At the end of the interview, summarize your answer and mention the tools and frameworks you would use to perform the work. It is also a good time to express your ideas on how the problem can be extended.
III Interview tips
Every interview is an opportunity to show your skills and motivation for the role. Thus, it is important to prepare in advance. Here are useful rules of thumb to follow:
Show your motivation.
In machine learning case study interviews, the interviewer will evaluate your excitement for the company’s product. Make sure to show your curiosity, creativity and enthusiasm.
Listen to the hints given by your interviewer.
Example: Given an imbalanced clinical dataset, you are asked to classify whether a patient’s health is at risk (1) or not (0). You focus on modeling and propose a logistic regression. The interviewer asks you “what’s your optimization objective?”. You confidently answer “the binary cross-entropy loss”. Your interviewer follows up with “Would you consider modifying your loss function?” In this scenario, the interviewer probably expects you to connect the dots between your loss function and the imbalanced data set. In fact, you might consider weighting the terms in your loss function to account for the data imbalance, so that errors on the rare class count for more.
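To make the idea of weighting the loss terms concrete, here is a minimal sketch of a class-weighted binary cross-entropy in plain Python. The function names, the toy labels and probabilities, and the inverse-frequency weighting heuristic are all assumptions chosen for illustration; they are one possible way to account for imbalance, not the answer an interviewer necessarily expects.

```python
import math


def weighted_bce(y_true, y_prob, w_pos=1.0, w_neg=1.0, eps=1e-12):
    """Mean binary cross-entropy with separate weights for each class.

    With w_pos == w_neg == 1.0 this reduces to the usual (unweighted) loss.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def inverse_frequency_weights(y_true):
    """Illustrative heuristic (an assumption): weight = n / (2 * class count).

    Assumes both classes appear at least once in y_true.
    """
    n = len(y_true)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    return n / (2.0 * n_pos), n / (2.0 * n_neg)


# Toy imbalanced data: 8 "not at risk" (0) and 2 "at risk" (1) patients.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.1] * 8 + [0.3, 0.4]  # hypothetical model outputs

w_pos, w_neg = inverse_frequency_weights(y_true)
plain = weighted_bce(y_true, y_prob)
weighted = weighted_bce(y_true, y_prob, w_pos, w_neg)
print(f"w_pos={w_pos}, w_neg={w_neg}")
print(f"unweighted loss={plain:.4f}, weighted loss={weighted:.4f}")
```

In this toy case the model is under-confident on the two positive examples, so up-weighting the positive class increases the loss and would push training to correct those errors first. In the interview, it is worth discussing how you would choose the weights and which evaluation metric you would pair with the modified loss.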
Show that you understand the development life cycle of an AI project.
Many candidates are only interested in what model they will use and how to train it. Remember that developing AI projects involves multiple tasks including data engineering, modeling, deployment, business analysis, and AI infrastructure.
Avoid clear-cut statements.
Because case studies are often open-ended and can have multiple valid solutions, avoid making categorical statements such as “the correct approach is …” You might offend the interviewer if the approach they are using is different from what you describe. It’s also better to show that you are flexible and that you understand the pros and cons of different approaches.
Study topics relevant to the company.
Machine learning case studies are often inspired by in-house projects. If the team is working on a domain-specific application, explore the literature.
Example 1: If the team is working on a face verification product, review the face recognition lessons of the Coursera Deep Learning Specialization (Course 4), as well as the DeepFace (Taigman et al., 2014) and FaceNet (Schroff et al., 2015) papers prior to the onsite.
Example 2: If the team is building an autonomous car, you might want to read about topics such as object detection, path planning, safety, or edge deployment.
Write clearly, draw charts, and introduce a notation if necessary.
The interviewer will judge the clarity of your thought process and your scientific rigor.
Example: Show your ability to strategize by drawing the AI project development life cycle on the whiteboard.
When you are not sure of your answer, be honest and say so.
Interviewers value honesty and penalize bluffing far more than lack of knowledge.
When out of ideas or stuck, think out loud rather than staying silent.
Talking through your thought process will help the interviewer correct you and point you in the right direction.
IV Resources
You can build decision-making skills by reading machine learning war stories and exposing yourself to projects. Here’s a list of useful resources to prepare for the machine learning case study interview.
- In deeplearning.ai’s course Structuring Machine Learning Projects, you’ll find insights drawn from Andrew Ng’s experience building and shipping many deep learning products. This course also has two “flight simulators” that let you practice decision-making as a machine learning project leader. It provides “industry experience” that you might otherwise get only after years of ML work experience.
- Stanford Deep Learning class by Andrew Ng and Kian Katanforoosh (CS230):
  - Deep learning intuition (video)
  - Full-cycle deep learning projects (video)
  - AI+healthcare case studies (video)
  - Deep learning project strategy (video)
  - Case study on conversational assistants (video)
- Search for case studies from the companies in the same industry as the ones you’re interviewing with. Here are examples of company case studies:
  - In Machine Learning-Powered Search Ranking of Airbnb Experiences, Grbovic explains how Airbnb built and iterated on a machine learning Search Ranking platform to grow a new two-sided marketplace called Airbnb Experiences.
  - If machine learning inference happens on the edge rather than on the cloud, users experience lower latency and their product usage is less impacted by network connectivity. In Machine Learning at Facebook: Understanding Inference at the Edge, Wu et al. present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.
  - Personalization is one key component of modern customer engagement programs. In Empowering Personalized Marketing with Machine Learning, Lyft data scientist Girard goes through an applied example of solving a personalized marketing problem.
  - Companies all over the world use recommender systems to help users discover relevant content. In Learning a Personalized Homepage, Netflix engineers Alvino and Basilico explain how to best tailor each Netflix user’s homepage to make it relevant, cover their interests and intents, and still allow for exploration of the catalog.
- You can find a complementary list of ML case studies in this Git repository by Chip Huyen.