The machine learning field has made significant progress over the last decade, offering solutions in domains as varied as banking (fraud detection), e-commerce (recommendation systems), and medicine (tumor detection). Even so, a machine learning-based approach is not always a feasible way to solve the problem at hand.
In this article, we will discuss machine learning’s limitations and when it is best to avoid using it.
5 key limitations of machine learning
There are a number of limitations and concerns in using machine learning to solve a variety of problems. We have summarized the top five below:
- Ethics. We are slowly moving into a stage called “dataism,” in which humans trust data and algorithms more than their own insights. This raises questions about the ethics of machine learning algorithms and systems. Some questions are still unanswered: how should a self-driving car react in a potentially fatal collision, and who is to blame if a self-driving car kills someone? In the future, we might be able to choose the ethical framework a self-driving car follows.
- Data. Machine learning systems, especially deep learning models, are data-hungry, although some recent progress has been made toward training deep learning models with only a few examples. They need a large amount of labeled data to train and to provide useful insights and predictions. As the size of the architecture increases, so does the data requirement. For example, GPT-3 was trained on data drawn from a corpus of roughly 499 billion tokens, and a single training run has been estimated to cost approximately $4.6 million. Another problem is data quality. Poor-quality data can significantly reduce a model’s accuracy or lead to risky predictions. For example, if a breast cancer detection dataset consists mostly of X-rays from white women, a model trained on it can be biased when reading X-rays of Black women.
- Interpretability. Interpretability is a major issue with machine learning, especially deep learning algorithms. For example, suppose you work at a finance firm and your boss asks you to build a model that detects fraudulent transactions and to report metrics such as recall and accuracy. In that setting, the model should also be able to justify its classifications. A deep learning algorithm can achieve good accuracy and recall on this task but might be unable to explain its decisions.
- Deterministic systems. Machine learning is also being applied in deterministic domains such as computational fluid dynamics (CFD). The traditional approach of numerically solving the governing equations of physics can take a long time, sometimes months, to produce final results. A deep learning system applied to this domain might produce results much faster, but it does not understand the laws of physics. For example, it might give correct final answers while intermediate fields like density take negative values, which is physically impossible.
- Reproducibility. Reproducibility is a growing issue in the machine learning field because the code and testing methodology behind new models are often not transparent. New models developed in research labs are being implemented in real-world applications at a fast pace. However, these models can fail to perform in the real world despite their state-of-the-art performance in research papers. Reproducibility helps practitioners across industries implement the same model and find hidden problems sooner. A lack of reproducibility can prevent models from being assessed for bias, safety, and robustness.
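To make the deterministic-systems point above more concrete, here is a minimal sketch of a sanity check on a model's output. It assumes the model returns a list of density values, one per grid cell. The function name `find_unphysical_cells` and the sample numbers are hypothetical and exist only for illustration.

```python
from typing import List, Sequence


def find_unphysical_cells(density: Sequence[float]) -> List[int]:
    """Return the indices of cells whose predicted density is negative."""
    return [i for i, rho in enumerate(density) if rho < 0.0]


# Hypothetical density values (kg/m^3) predicted by a learned surrogate model.
predicted_density = [1.21, 1.18, -0.03, 1.05, -0.40]

bad_cells = find_unphysical_cells(predicted_density)
if bad_cells:
    print(f"Unphysical negative density in cells: {bad_cells}")
else:
    print("All predicted densities are non-negative.")
```

A check like this does not teach the model any physics. It only flags predictions that a traditional solver would never produce, so that they can be reviewed or rejected.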
2 instances when you should (definitely) not use machine learning
Below are two situations where machine learning is not a good fit.
- Solving less complex problems. Machine learning algorithms, specifically deep learning algorithms, are useful for finding complex relationships and hidden patterns in data consisting of many interdependent variables. For less complicated problems, if a rule-based system performs comparably to a machine learning system, it is advisable to avoid the machine learning system.
- Lack of labeled data and in-house expertise. Most deep learning models require labeled data and an expert team to train the models and put them into production. It is advisable not to use deep learning algorithms to deliver a project if you don’t have enough labeled data and a dedicated team. For example, let’s say that you are developing a model that detects illegal listings on an e-commerce company’s website, and the operations team has identified some keywords that help find such listings. Given these constraints, you might start with a rule-based approach that uses those keywords. Later, once the required resources are available, a second version could add an image classification system along with some text models.
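The rule-based first version described above could look something like the following sketch. The keyword list, the listing IDs and text, and the function names are all hypothetical. In practice, the operations team would supply the real keywords.

```python
import re
from typing import Dict, Iterable, List

# Hypothetical keywords; the operations team would provide the real list.
FLAGGED_KEYWORDS = ["counterfeit", "replica", "unlicensed"]


def build_pattern(keywords: Iterable[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive, whole-word pattern for all keywords."""
    escaped = (re.escape(keyword) for keyword in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def find_flagged_keywords(text: str, pattern: "re.Pattern[str]") -> List[str]:
    """Return the distinct keywords found in the text, lowercased and sorted."""
    return sorted({match.group(0).lower() for match in pattern.finditer(text)})


pattern = build_pattern(FLAGGED_KEYWORDS)

# Hypothetical listings keyed by listing ID.
listings: Dict[str, str] = {
    "A1": "Brand-new leather wallet, ships in 2 days",
    "B2": "Replica designer watch, looks like the real thing",
}

flagged: Dict[str, List[str]] = {}
for listing_id, text in listings.items():
    hits = find_flagged_keywords(text, pattern)
    if hits:
        flagged[listing_id] = hits

print(flagged)
```

A system like this needs no labeled data, and every flag can be traced back to the keyword that triggered it. The trade-off is that it misses listings that avoid the known keywords, which is where the later image and text models would help.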
How do you know when to use machine learning, and when not to?
The easiest way around this question is to abide by a simple rule: Don't build a machine learning model where a simpler approach might succeed just as well.
Sometimes, a company might prefer an interpretable model over a more accurate one that is harder to interpret (e.g., deep learning), although a lot of research is attempting to address this very issue in deep learning. Before you start, ask yourself: does the problem you're trying to solve require that your model be interpretable? If it does, favor a simpler, interpretable model even at some cost in accuracy. If it doesn't, a more accurate but less interpretable model may be an option.
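As a rough illustration, the rule of thumb in this section can be written as a small decision helper. Everything here is an assumption made for the sketch: the function name `choose_approach`, the idea of comparing a single validation score, and the `min_gain` threshold of 0.02, which you would set to whatever improvement justifies the extra complexity in your own project.

```python
def choose_approach(
    baseline_score: float,
    ml_score: float,
    needs_interpretability: bool,
    min_gain: float = 0.02,  # assumed threshold; tune it for your project
) -> str:
    """Suggest an approach from validation scores and interpretability needs."""
    # If machine learning does not clearly beat the simpler baseline, keep the baseline.
    if ml_score - baseline_score < min_gain:
        return "simple approach"
    # If decisions must be explained, prefer a model that can be inspected.
    if needs_interpretability:
        return "interpretable model"
    return "complex model"


# Hypothetical scores for a rule-based baseline and a machine learning model.
print(choose_approach(0.90, 0.91, needs_interpretability=False))
print(choose_approach(0.70, 0.90, needs_interpretability=True))
```

The checks are ordered to match the article's rule: start from the simplest approach and move to something more complex only when the gain justifies it.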
Is machine learning engineering the right career for you?
Knowing machine learning and deep learning concepts is important—but not enough to get you hired. According to hiring managers, most job seekers lack the engineering skills to perform the job. This is why more than 50% of Springboard's Machine Learning Career Track curriculum is focused on production engineering skills. In this course, you'll design a machine learning/deep learning system, build a prototype, and deploy a running application that can be accessed via API or web service. No other bootcamp does this.
Our machine learning training will teach you linear and logistic regression, anomaly detection, and cleaning and transforming data. We’ll also teach you the most in-demand ML models and algorithms you’ll need to know to succeed. For each model, you will first learn how it works conceptually, then the applied mathematics necessary to implement it, and finally how to train and test it.
Find out if you're eligible for Springboard's Machine Learning Career Track.