There’s some confusion surrounding the roles of machine learning engineer vs. data scientist, primarily because they are both relatively new. However, if you parse things out and examine the semantics, the distinctions become clear.
At a high level, we’re talking about scientists and engineers. While a scientist needs to fully understand the, well, science behind their work, an engineer is tasked with building something.
But before we go any further, let’s address the difference between machine learning and data science.
It starts with having a solid definition of artificial intelligence. This term was first coined by John McCarthy in 1956 to discuss and develop the concept of “thinking machines,” which included the following:
- Automata theory
- Complex information processing
Approximately six decades later, artificial intelligence is now perceived to be a sub-field of computer science where computer systems are developed to perform tasks that would typically demand human intervention. These include:
- Speech recognition
- Translation between languages
- Visual perception
Machine learning is a branch of artificial intelligence where a class of data-driven algorithms enables software applications to become highly accurate in predicting outcomes without any need for explicit programming.
The basic premise here is to develop algorithms that can receive input data and leverage statistical models to predict an output while updating outputs as new data becomes available.
The processes involved have a lot in common with predictive modeling and data mining. This is because both approaches require searching through the data to identify patterns and adjusting the program accordingly.
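To make the "predict, then update as new data arrives" idea concrete, here is a minimal sketch of a model that learns online. It is not taken from any particular library: the class name `OnlineLinearModel`, the learning rate, and the synthetic data stream are all assumptions chosen for illustration.

```python
# Hypothetical sketch: a one-feature linear model trained online with
# stochastic gradient descent. All names and values are illustrative.

class OnlineLinearModel:
    def __init__(self, learning_rate=0.05):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        """Return the current estimate for input x."""
        return self.weight * x + self.bias

    def update(self, x, y):
        """Nudge the parameters toward the newly observed target y."""
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


model = OnlineLinearModel()

# Synthetic stream generated from y = 2x + 1 (an assumption for this demo).
stream = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]

# Replay the stream several times to mimic data arriving over time.
for _ in range(500):
    for x, y in stream:
        model.update(x, y)

prediction = model.predict(0.5)  # should land close to 2.0
```

Nothing in the code spells out the rule "multiply by two and add one." The model recovers it from the data, which is what "without explicit programming" means in practice.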
Most of us have experienced machine learning in action in one form or another. If you have shopped on Amazon or watched something on Netflix, those personalized product or movie recommendations are examples of it.
Data science can be described as the practice of description, prediction, and causal inference using both structured and unstructured data. This discipline helps individuals and enterprises make better business decisions.
It’s also a study of where data originates, what it represents, and how it could be transformed into a valuable resource. To achieve the latter, a massive amount of data has to be mined to identify patterns to help businesses:
- Gain a competitive advantage
- Identify new market opportunities
- Increase efficiencies
- Rein in costs
The field of data science draws on disciplines like computer science, mathematics, and statistics, and incorporates techniques like data mining, cluster analysis, visualization, and, yes, machine learning.
Having said all of that, this post aims to answer the following questions:
- Machine learning engineer vs. data scientist: what degree do they need?
- Machine learning engineer vs. data scientist: what do they actually do?
- Machine learning engineer vs. data scientist: what’s the average salary?
Machine Learning Engineer vs. Data Scientist: What They Do
As mentioned above, there are some similarities when it comes to the roles of machine learning engineers and data scientists.
However, if you look at the two roles as members of the same team, a data scientist does the statistical analysis required to determine which machine learning approach to use, then they model the algorithm and prototype it for testing. At that point, a machine learning engineer takes the prototyped model and makes it work in a production environment at scale.
Going back to the scientist vs. engineer split, a machine learning engineer isn’t necessarily expected to understand the predictive models and their underlying mathematics the way a data scientist is. A machine learning engineer is, however, expected to master the software tools that make these models usable.
What Does a Machine Learning Engineer Do?
Machine learning engineers sit at the intersection of software engineering and data science. They leverage big data tools and programming frameworks to ensure that the raw data gathered from data pipelines is turned into data science models that are ready to scale as needed.
Machine learning engineers feed data into models defined by data scientists. They’re also responsible for taking theoretical data science models and helping scale them out to production-level models that can handle terabytes of real-time data.
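As a rough illustration of that hand-off, the sketch below pairs a data scientist's prototype with the kind of wrapper an engineer might put around it. Everything here is hypothetical: `fit_prototype`, `PredictionService`, and the JSON request shape are invented for the example, and a real production system would add logging, monitoring, batching, and much more.

```python
import json


def fit_prototype(xs, ys):
    """Data scientist's prototype: ordinary least squares for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


class PredictionService:
    """Engineer's wrapper: loads saved parameters and validates each request."""

    def __init__(self, serialized_params):
        params = json.loads(serialized_params)
        self.slope = float(params["slope"])
        self.intercept = float(params["intercept"])

    def handle(self, request_body):
        """Take a JSON request string and return a JSON response string."""
        try:
            payload = json.loads(request_body)
            features = [float(v) for v in payload["inputs"]]
        except (ValueError, KeyError, TypeError):
            # Malformed input should produce an error response, not a crash.
            return json.dumps({"error": "expected {\"inputs\": [numbers]}"})
        predictions = [self.slope * x + self.intercept for x in features]
        return json.dumps({"predictions": predictions})


# Prototype on a small sample, then hand the parameters over as JSON.
artifact = json.dumps(fit_prototype([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]))
service = PredictionService(artifact)
response = service.handle('{"inputs": [5, 6]}')
```

The split mirrors the division of labor: the fitting function is about getting the statistics right, while the service class is about surviving bad input and exposing a stable interface to the rest of the platform.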
Machine learning engineers also build programs that control computers and robots. The algorithms they develop enable a machine to identify patterns in the data it is given and learn to interpret commands without being explicitly programmed for each one.
What Does a Data Scientist Do?
When a business needs to answer a question or solve a problem, it turns to a data scientist to gather, process, and derive valuable insights from the data. Once hired by an organization, data scientists explore all aspects of the business and develop programs in languages like Java to perform robust analytics.
They will also use online experiments along with other methods to help businesses achieve sustainable growth. Additionally, they can develop personalized data products to help companies better understand themselves and their customers to make better business decisions.
As previously mentioned, data scientists focus on the statistical analysis and research needed to determine which machine learning approach to use, then they model the algorithm and prototype it for testing.
What Do the Experts Say?
Springboard recently asked two working professionals for their definitions of machine learning engineer vs. data scientist.
Mansha Mahtani, a data scientist at Instagram, said:
“Given both professions are relatively new, there tends to be a little bit of fluidity on how you define what a machine learning engineer is and what a data scientist is. My experience has been that machine learning engineers tend to write production-level code. For example, if you were a machine learning engineer creating a product to give recommendations to the user, you’d be actually writing live code that would eventually reach your user. The data scientist would be probably part of that process—maybe helping the machine learning engineer determine what are the features that go into that model—but usually data scientists tend to be a little bit more ad hoc to drive a business decision as opposed to writing production-level code.”
Shubhankar Jain, a machine learning engineer at SurveyMonkey, said:
“A data scientist today would primarily be responsible for translating this business problem of, for example, we want to figure out what product we should sell next to our customers if they’ve already bought a product from us. And translating that business problem into more of a technical model and being able to then output a model that can take in a certain set of attributes about a customer and then spit out some sort of result. An ML engineer would probably then take that model that this data scientist developed and integrate it in with the rest of the company’s platform—and that could involve building, say, an API around this model so that it can be served and consumed, and then being able to maintain the integrity and quality of this model so that it continues to serve really accurate predictions.”
Machine Learning Engineer vs. Data Scientist: Role Requirements
What Are the Requirements for a Machine Learning Engineer?
To work as a machine learning engineer, most companies prefer candidates who have a master’s degree in computer science. However, as this field is relatively new and there is a shortage of top tech talent, many employers will be willing to make exceptions.
To stand a chance, though, potential candidates need to be familiar with the standard implementations of machine learning algorithms, which are freely available through APIs, libraries, and packages (along with the advantages and disadvantages of each approach).
A report by IBM also ranks the programming languages that machine learning engineers should know.
Here’s what you’ll need to get the job, based on current job postings:
- Master’s or Ph.D. in computer science, mathematics, or statistics
- Experience working with Java, Python, and R
- Experience with vision processing, deep neural networks, Gaussian processes, and reinforcement learning
- A solid understanding of both probability and statistics
- A firm understanding of mathematics (including the role of algorithm theory in machine learning and complex algorithms that are needed to help machines learn and communicate)
- Advanced knowledge of engineering
- Strong analytical skills
- Experience using programming tools like MATLAB
- Experience working with large amounts of data in a high throughput environment
- Linux SysAdmin skills
- Experience working with distributed systems tools like etcd, ZooKeeper, and Consul
- Experience working with messaging tools like Kafka, RabbitMQ, and ZeroMQ
- Extensive knowledge of machine learning evaluation metrics and best practices
- Competency with infrastructure as code (for example, Terraform or CloudFormation)
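One item on that list, machine learning evaluation metrics, is easy to show in miniature. The sketch below computes precision, recall, and F1 for a binary classifier from scratch. The function name and the sample labels are made up for illustration; in practice most teams would reach for a library implementation instead.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Illustrative labels only: ground truth versus a model's predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Precision asks how many of the predicted positives were right, recall asks how many of the actual positives were found, and F1 is the harmonic mean of the two.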
What Are the Requirements for a Data Scientist?
Like machine learning engineers, data scientists also need to be highly educated. In fact, many have a master’s degree or a Ph.D. Based on one recent report, most data scientists have an advanced degree in engineering (16 percent), computer science (19 percent), or mathematics and statistics (32 percent).
That being said, according to Paula Griffin, product manager at Quora, “There are large swaths of data science that don’t require [advanced degree] research-oriented skills. There’s a huge amount of impact that you can have by leveraging the skills that are better built through industry settings as well.”
Here’s what you’ll need to get the job:
- Master’s or Ph.D. in computer science, engineering, mathematics, or statistics (although for many employers, experience can be a solid substitute)
- Experience working with Java, Python, and SQL
- Strong mathematical skills
- Strong analytical skills
- Experience in statistical and data mining techniques (like boosting, generalized linear models/regression, random forests, trees, and social network analysis)
- Knowledge of advanced statistical methods and concepts
- Experience working with machine learning techniques such as artificial neural networks, clustering, and decision tree learning
- Experience using web services like DigitalOcean, Redshift, S3, and Spark
- 5-7 years of experience building statistical models and manipulating data sets
- Experience analyzing data from third-party providers like AdWords, Coremetrics, Crimson, Facebook Insights, Google Analytics, Hexagon, and Site Catalyst
- Experience working with distributed data and computing tools like Hadoop, Hive, Gurobi, MapReduce, MySQL, and Spark
- Experience visualizing and presenting data using Business Objects, D3, ggplot, and Periscope
Machine Learning Engineer vs. Data Scientist: Role Responsibilities
What Are the Responsibilities of a Machine Learning Engineer?
The responsibilities of a machine learning engineer will be relative to the project they’re working on. However, if you explore the job postings, you’ll notice that for the most part, machine learning engineers will be responsible for building algorithms that are based on statistical modeling procedures and maintaining scalable machine learning solutions in production.
Here’s what these roles typically demand:
- Develop machine learning models
- Collaborate with data engineers to develop data and model pipelines
- Apply machine learning and data science techniques and design distributed systems
- Write production-level code
- Bring code to production
- Engage in code reviews
- Improve existing machine learning models
- Be in charge of the entire lifecycle (research, design, experimentation, development, deployment, monitoring, and maintenance)
- Produce project outcomes and isolate issues
- Implement machine learning algorithms and libraries
- Communicate complex processes to business leaders
- Analyze large and complex data sets to derive valuable insights
- Research and implement best practices to enhance existing machine learning infrastructure
To get an idea of the variance of machine learning engineering jobs, we took a look at job postings on several different sites.
[Job posting: New York City-based machine learning engineer role at Twitter]
[Job posting: San Francisco-based machine learning engineer role at Adobe]
What Are the Responsibilities of a Data Scientist?
Compared to a statistician, a data scientist knows a lot more about programming. Compared to a software engineer, however, a data scientist knows much more about statistics.
Data scientists are well-equipped to store and clean large amounts of data, explore data sets to identify valuable insights, build predictive models, and run data science projects from end to end. Many data scientists once worked as data analysts.
Here’s what the role typically demands:
- Research and develop statistical models for analysis
- Better understand company needs and devise possible solutions by collaborating with product management and engineering departments
- Communicate results and statistical concepts to key business leaders
- Use appropriate databases and project designs to optimize joint development efforts
- Develop custom data models and algorithms
- Build processes and tools to help monitor and analyze performance and data accuracy
- Use predictive modeling to enhance and optimize customer experiences, revenue generation, ad targeting, and more
- Develop company A/B testing framework and test model quality
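The A/B testing item deserves a quick example, since it comes up in nearly every data scientist posting. Below is a minimal sketch of a two-proportion z-test, one common way to check whether a difference in conversion rates between two variants is likely to be real. The function name and the visitor counts are invented for the example, and a full testing framework would also cover sample-size planning and multiple comparisons.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 2,000 visitors per variant.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
significant = p_value < 0.05
```

A small p-value suggests the gap between variants is unlikely to be chance alone, which is the kind of result a data scientist would then translate into a business recommendation.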
[Job posting: New York City-based data scientist role at Asana]
[Job posting: San Francisco-based data scientist role at Metromile]
Machine Learning Engineer vs. Data Scientist: Salary
How Much Does a Machine Learning Engineer Make?
The wages commanded by machine learning engineers can vary depending on the type of role and where it’s located. According to Indeed, the average salary for a machine learning engineer is about $145,000 per year.
How Much Does a Data Scientist Make?
What data scientists make annually also depends on the type of job and where it’s located. Remember, it is a much broader role than machine learning engineer. That said, Glassdoor lists data scientist, with a median salary of $110,000, as the hottest job in America.
As the demand for data scientists and machine learning engineers grows, you can also expect these numbers to rise.
If you take a step back and look at both of these jobs, you’ll see that it’s not a question of machine learning vs. data science. Instead, it’s all about what you’re interested in working with and where you see yourself many years from now.
Let’s summarize the questions posed at the beginning of this article:
- Machine learning engineer vs. data scientist: what degree do they need?
  - Most employers would prefer an advanced degree, but to meet demand, they will be open to hiring those who have the right skills and experience.
- Machine learning engineer vs. data scientist: what do they actually do?
  - There’s some overlap, which is why some data scientists with software engineering backgrounds move into machine learning engineer roles. Still, data scientists focus on analyzing data, providing business insights, and prototyping models, while machine learning engineers focus on coding and deploying complex, large-scale machine learning products.
- Machine learning engineer vs. data scientist: what’s the average salary?
  - At present, machine learning engineers make more, but the data scientist role is a much broader one, so there is a wide variety of salaries depending on the specifics of the job.
Whether you become a machine learning engineer or a data scientist, you’re going to be working at the cutting edge of business and technology. And since the demand for top tech talent far outpaces supply, the competition for bright minds within this space will continue to be fierce for years to come. So you really can’t go wrong no matter which path you choose.