{"id":14640,"date":"2020-09-25T22:46:34","date_gmt":"2020-09-26T05:46:34","guid":{"rendered":"https:\/\/www.springboard.com\/?p=14640"},"modified":"2022-10-13T09:33:14","modified_gmt":"2022-10-13T16:33:14","slug":"when-not-to-use-ml","status":"publish","type":"post","link":"https:\/\/www.springboard.com\/blog\/data-science\/when-not-to-use-ml\/","title":{"rendered":"When Should You Not Use Machine Learning?"},"content":{"rendered":"\n<p>The machine learning field has made significant progress over the last decade, offering solutions for almost all kinds of domains, like banking (fraud detection), e-commerce (recommendation system), and medical applications (tumor detection). Although machine learning provides many solutions, it is not always feasible to incorporate a machine learning-based approach for solving the problem at hand.<\/p>\n\n\n\n<p>In this article, we will discuss the limitations of machine learning and when it is best to avoid using it.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">5 key limitations of machine learning<\/h2>\n\n\n\n<p>There are a number of limitations and concerns in using machine learning to solve a variety of problems. We have summarized the top five below:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Ethics.<\/strong> We are slowly moving into the stage called \u201c<a href=\"https:\/\/en.wikipedia.org\/wiki\/Dataism\" target=\"_blank\" rel=\"noreferrer noopener\">dataism<\/a>,\u201d which means humans trust data and algorithms more than their personal insights. This can raise questions against the ethics of machine learning algorithms and systems. Some situations are still unanswered\u2014like how should a self-driving car react in a fatal collision, who to blame if a self-driving car kills someone, etc. <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/what-does-a-data-scientist-do\/\" target=\"_blank\" data-type=\"post\" data-id=\"24427\" rel=\"noreferrer noopener\">Data Scientists<\/a>, ML Engineers, Deep Learning Researchers are working around the clock to crack these challenges.  In the future, we might get a choice to select an ethical framework in a self-driving car.<\/li><li><strong>Data.<\/strong> Machine learning systems, especially deep learning models, have been data-hungry (recently, some progress has been made towards training deep learning models with few sample examples). They need a good amount of labeled data to train and provide useful insights and predictions. As the size of the architecture increases, so does the data requirement. For example, <a href=\"https:\/\/en.wikipedia.org\/wiki\/GPT-3\" target=\"_blank\" rel=\"noreferrer noopener\">GPT3<\/a> was trained on 499 billion tokens and <a href=\"https:\/\/lambdalabs.com\/blog\/demystifying-gpt-3\/\" target=\"_blank\" rel=\"noreferrer noopener\">cost approximately $4.6 million<\/a>. Another problem lies with the quality of the data. The poor quality data can significantly reduce the model\u2019s accuracy or give risky predictions. For example, if data created for breast cancer detection uses X-rays from mostly white women, the model trained on this dataset can be biased for predictions reading X-rays of Black women. In such cases, the more accurate the data gathered via predictive analysis (which comes within the domain of <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" rel=\"noreferrer noopener\">data science<\/a>), the more convenient the outcome will be.<\/li><li><strong>Interpretability.<\/strong> Interpretability is a major issue with machine learning, especially deep learning algorithms. For example, suppose you are working in a finance firm and your boss asks you to build a model to detect fraudulent transactions, along with some metrics like recall and accuracy. In that case, your model should be able to justify its classifications. A deep learning algorithm can have good accuracy and recall for this task\u2014but might fail to validate its decisions.<\/li><li><strong>Deterministic system<\/strong>. Machine learning algorithms find applications in a deterministic domain, such as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Computational_fluid_dynamics\" target=\"_blank\" rel=\"noreferrer noopener\">Computational Fluid Dynamics (CFD)<\/a>. The traditional methodology to solve governing equations of physics can take a long time\u2014sometimes months\u2014to get final results. A deep learning system applied to this domain might be able to get results in a short period of time, but it doesn\u2019t understand the laws of physics. For example, it might give correct final answers, but intermediate fields like density can have negative values that are not possible according to physics laws.<\/li><li><strong>Reproducibility.<\/strong> Reproducibility is a growing issue in the machine learning field due to a lack of transparency for code and testing methodology for models being developed. New models developed in research labs are being implemented in real-world applications at a fast pace. However, these models can fail to perform in the real world despite their state-of-the-art performance in research papers. Reproducibility can help different industries and practitioners to implement the same model and find any hidden problems sooner. Lack of reproducibility can prevent models from being assessed for bias, safety, and robustness.<\/li><\/ul>\n\n\n<div class=\"bg-leaf-50 p-4 my-3\"><h4 class=\"fw-bold text-center\">Get To Know Other\tData Science Students<\/h4><div class=\"row row-cols-1 row-cols-lg-3\"><div class=\"col\"><div class=\"card success-story-card h-100 d-flex justify-content-between mb-0\"><div class=\"flex-grow-1 text-center\"><a class=\"d-inline-block rounded-circle\" href=\"\/success\/abby-morgan\" style=\"width:125px;height:125px;overflow:hidden\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/res.cloudinary.com\/springboard-images\/image\/upload\/v1654205000\/Student%20Success\/Abby_Morgan.jpg\" alt=\"Abby Morgan\" style=\"object-fit:contain;max-width:170px;height:125px\" \/><\/a><p class=\"fw-bold mb-0\">Abby Morgan<\/p><p class=\"text-muted lh-1\">Data Scientist at NPD Group<\/p><\/div><div class=\"w-100 d-block d-md-none mt-3\"><\/div><p class=\"mb-0 mx-auto text-center\"><a class=\"btn btn-primary mx-auto\" href=\"\/success\/abby-morgan\">Read Story<\/a><\/p><\/div><\/div><div class=\"col d-none d-md-block\"><div class=\"card success-story-card h-100 d-flex justify-content-between mb-0\"><div class=\"flex-grow-1 text-center\"><a class=\"d-inline-block rounded-circle\" href=\"\/success\/leoman-momoh\" style=\"width:125px;height:125px;overflow:hidden\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/res.cloudinary.com\/springboard-images\/image\/upload\/v1629203191\/Student%20Success\/Leoman_Momoh_125x125.png\" alt=\"Leoman Momoh\" style=\"object-fit:contain;max-width:170px;height:125px\" \/><\/a><p class=\"fw-bold mb-0\">Leoman Momoh<\/p><p class=\"text-muted lh-1\">Senior Data Engineer at Enterprise Products<\/p><\/div><p class=\"mb-0 mx-auto text-center\"><a class=\"btn btn-primary mx-auto\" href=\"\/success\/leoman-momoh\">Read Story<\/a><\/p><\/div><\/div><div class=\"col d-none d-md-block\"><div class=\"card success-story-card h-100 d-flex justify-content-between mb-0\"><div class=\"flex-grow-1 text-center\"><a class=\"d-inline-block rounded-circle\" href=\"\/success\/jonah-winninghoff\" style=\"width:125px;height:125px;overflow:hidden\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/res.cloudinary.com\/springboard-images\/image\/upload\/v1680561342\/Jonah_Winninghoff.png\" alt=\"Jonah Winninghoff\" style=\"object-fit:contain;max-width:170px;height:125px\" \/><\/a><p class=\"fw-bold mb-0\">Jonah Winninghoff<\/p><p class=\"text-muted lh-1\">Statistician at Rochester Institute Of Technology<\/p><\/div><p class=\"mb-0 mx-auto text-center\"><a class=\"btn btn-primary mx-auto\" href=\"\/success\/jonah-winninghoff\">Read Story<\/a><\/p><\/div><\/div><\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">2 instances when you should (definitely) not use machine learning<\/h2>\n\n\n\n<p>Below are two examples where machine learning is not feasible.<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><strong>Solving less complex problems.<\/strong> Machine learning, specifically deep learning algorithms, are useful for finding complex relationships and hidden patterns in data consisting of many interdependent variables. For less complicated problems, if the rule-based system is giving performance comparable to a machine learning system, then it is advisable to avoid the use of a machine learning system.<\/li><li><strong>Lack of labeled data and in-house expertise.<\/strong> Most deep learning models require labeled data and an expert team to train the models and put them in production. It is advisable not to use deep learning algorithms to deliver projects if you don\u2019t have enough labeled data and a dedicated team. For example, let\u2019s say that you are developing a model that detects illegal listings from the e-commerce company website. The operation team has determined some keywords to help find illegal listings. Due to the mentioned constraints, you might go with a rule-based approach using keywords to detect illegal listings, and later, as the second version of the model, you can implement an image classification system along with some text models to detect illegal listings once required resources are acquired.<\/li><\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">How do you know when to use machine learning, and when not to?<\/h2>\n\n\n\n<p>The easiest way around this question is to abide by a simple rule: Don&#8217;t build a machine learning model where a simpler approach might succeed just as well.<\/p>\n\n\n\n<p>Sometimes, a company might prefer to train a model that is interpretable vs. a more accurate one that might be more difficult to interpret (e.g. deep learning). However, a lot of research is taking place to attempt to address this very issue in deep learning. Before you start, ask yourself: does the problem you&#8217;re trying to solve require that your model be interpretable? If not, you have your answer. <\/p>\n\n\n\n<p class=\"rm has-background\" style=\"background-color:#efeff6\"><strong>Since you\u2019re here\u2026<br><\/strong>Curious about a career in data science? Experiment with our <a rel=\"noreferrer noopener\" href=\"https:\/\/www.springboard.com\/resources\/guides\/data-science-process\/\" target=\"_blank\">free data science learning path<\/a>, or join our <a rel=\"noreferrer noopener\" href=\"https:\/\/www.springboard.com\/courses\/data-science-career-track\/\" target=\"_blank\">Data Science Bootcamp<\/a>, where you\u2019ll get your tuition back if you don&#8217;t land a job after graduating. We\u2019re confident because our courses work \u2013 check out our <a rel=\"noreferrer noopener\" href=\"https:\/\/www.springboard.com\/success\/\" target=\"_blank\">student success stories<\/a> to get inspired.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The machine learning field has made significant progress over the last decade, offering solutions for almost all kinds of domains, like banking (fraud detection), e-commerce (recommendation system), and medical applications (tumor detection). Although machine learning provides many solutions, it is not always feasible to incorporate a machine learning-based approach for solving the problem at hand. [&hellip;]<\/p>\n","protected":false},"author":100,"featured_media":19005,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_eb_attr":"","_eb_data_table":"","footnotes":""},"categories":[67],"tags":[],"marketing_tags":[],"class_list":{"0":"post-14640","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science"},"acf":[],"_links":{"self":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/14640"}],"collection":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/users\/100"}],"replies":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/comments?post=14640"}],"version-history":[{"count":3,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/14640\/revisions"}],"predecessor-version":[{"id":35018,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/14640\/revisions\/35018"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media\/19005"}],"wp:attachment":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media?parent=14640"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/categories?post=14640"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/tags?post=14640"},{"taxonomy":"marketing_tags","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/marketing_tags?post=14640"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}