{"id":14630,"date":"2020-08-28T05:41:11","date_gmt":"2020-08-28T12:41:11","guid":{"rendered":"https:\/\/www.springboard.com\/?p=14630"},"modified":"2022-10-13T09:37:51","modified_gmt":"2022-10-13T16:37:51","slug":"scikit-learn","status":"publish","type":"post","link":"https:\/\/www.springboard.com\/blog\/data-science\/scikit-learn\/","title":{"rendered":"Scikit-Learn: A Complete Guide With a Logistic Regression Example"},"content":{"rendered":"\n<p>Scikit-Learn is a machine learning library that includes many supervised and unsupervised learning algorithms. It is often the first stop for data scientists and machine learning engineers who want to build a first machine learning model or set a benchmark for further experiments. This is handy because you don\u2019t always need complex and computationally expensive deep learning algorithms to model your data.<\/p>\n\n\n\n<p>In this article, we will focus on logistic regression and its implementation on the MNIST dataset using Scikit-Learn, a free, open-source machine learning library for Python.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is Scikit-Learn logistic regression used for?<\/h2>\n\n\n\n<p>There are two primary problems in supervised machine learning: regression and classification (as a rough guide, almost 70% of problems in <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" rel=\"noreferrer noopener\">data science<\/a> are classification problems). Despite its name, logistic regression is a classification algorithm, not a regression algorithm (the term is a &#8220;fake friend&#8221;). It is used for problems such as determining whether a tumor is malignant or benign, or identifying automotive types. 
It is essential for ML engineers and <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/what-does-a-data-scientist-do\/\" target=\"_blank\" data-type=\"post\" data-id=\"24427\" rel=\"noreferrer noopener\">data scientists<\/a> to have a clear understanding of logistic regression.<\/p>\n\n\n\n<p>In simple terms, logistic regression is the process of finding the best possible plane (decision boundary, Figure 1) that separates the classes under consideration. Because this boundary is linear, the model works best when the classes are approximately linearly separable.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/e007327b4c2565bf8c4f8c6488615447\/79e48\/screen-shot-2020-11-27-at-12.47.19-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/e007327b4c2565bf8c4f8c6488615447\/5a190\/screen-shot-2020-11-27-at-12.47.19-pm.png\" alt=\"screen shot 2020 11 27 at 12 47 19 pm\" title=\"screen shot 2020 11 27 at 12 47 19 pm\"\/><\/a><\/figure>\n\n\n\n<p><em>Figure 1: Sample decision plane in 2D (Source: jeremyjordan.me)<\/em><\/p>\n\n\n\n<p>Since linear regression is a fundamental building block of machine learning, we\u2019ll use this concept as a jumping-off point to explain the mathematics of logistic regression.<\/p>\n\n\n\n<p>The main difference between linear regression and logistic regression is the output function. 
Linear regression uses a linear function that outputs continuous values in any range, whereas logistic regression uses a sigmoid function that limits outputs to the range between zero and one.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Sigmoid function or logistic function<\/strong><\/li><\/ul>\n\n\n\n<p>Mathematically, the sigmoid function can be described as:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/a8dea81d2fa815eb5691a54f9335bac3\/c7c3c\/screen-shot-2020-11-27-at-12.47.51-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/a8dea81d2fa815eb5691a54f9335bac3\/c7c3c\/screen-shot-2020-11-27-at-12.47.51-pm.png\" alt=\"sigmoid function\" title=\"sigmoid function\"\/><\/a><\/figure>\n\n\n\n<p>This limits the output to the range between zero and one, as shown in Figure 2.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/084d045e88af8e7c0b9beac0bbf58c5b\/1bba8\/screen-shot-2020-11-27-at-12.24.03-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/084d045e88af8e7c0b9beac0bbf58c5b\/1bba8\/screen-shot-2020-11-27-at-12.24.03-pm.png\" alt=\"screen shot 2020 11 27 at 12 24 03 pm\" title=\"screen 
shot 2020 11 27 at 12 24 03 pm\"\/><\/a><\/figure>\n\n\n\n<p><em>Figure 2: Sigmoid function. (Source: Wikipedia)<\/em><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Hypothesis<\/strong><\/li><\/ul>\n\n\n\n<p>For linear regression, the hypothesis function can be written as:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/264df0e05b07158dca0b5c0360b9e4ab\/18872\/screen-shot-2020-11-27-at-12.10.54-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/264df0e05b07158dca0b5c0360b9e4ab\/18872\/screen-shot-2020-11-27-at-12.10.54-pm.png\" alt=\"linear regression hypothesis function\" title=\"linear regression hypothesis function\"\/><\/a><\/figure>\n\n\n\n<p>This is a simple linear function (a straight line). It can be modified for logistic regression as:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/19fb684aa826dd83c89d173b5625af9c\/00b70\/screen-shot-2020-11-27-at-12.12.49-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/19fb684aa826dd83c89d173b5625af9c\/00b70\/screen-shot-2020-11-27-at-12.12.49-pm.png\" alt=\" simple linear function\" title=\" simple linear function\"\/><\/a><\/figure>\n\n\n\n<p>Hence:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/e22867b06f53af6b7f8dd02aeb1165fb\/7cb89\/screen-shot-2020-11-27-at-12.13.55-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/e22867b06f53af6b7f8dd02aeb1165fb\/7cb89\/screen-shot-2020-11-27-at-12.13.55-pm.png\" alt=\"screen shot 2020 11 27 at 12 13 55 pm\" title=\"screen shot 2020 11 27 at 12 13 55 pm\"\/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><a 
href=\"https:\/\/www.springboard.com\/library\/static\/b5c8925461dfd6a431c18a215158582b\/f1d1f\/screen-shot-2020-11-27-at-12.52.12-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/b5c8925461dfd6a431c18a215158582b\/f1d1f\/screen-shot-2020-11-27-at-12.52.12-pm.png\" alt=\"notations\" title=\"notations\"\/><\/a><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Cost function<\/strong><\/li><\/ul>\n\n\n\n<p>In simple terms, the cost function measures the performance of any given machine learning model with respect to data under consideration. This cost function is used to optimize the parameters of the machine learning model after each iteration, during the training phase, to get more accurate predictions.<\/p>\n\n\n\n<p>The cost function for logistic regression is given by:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/c586c338ba36c95363a19e543070956c\/fa2f5\/screen-shot-2020-11-27-at-12.54.16-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/c586c338ba36c95363a19e543070956c\/fa2f5\/screen-shot-2020-11-27-at-12.54.16-pm.png\" alt=\"screen shot 2020 11 27 at 12 54 16 pm\" title=\"screen shot 2020 11 27 at 12 54 16 pm\"\/><\/a><\/figure>\n\n\n\n<p>This can be further simplified to:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/f113f22272703077c18083419657a8df\/ef3e1\/screen-shot-2020-11-27-at-12.18.39-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/f113f22272703077c18083419657a8df\/ef3e1\/screen-shot-2020-11-27-at-12.18.39-pm.png\" alt=\"screen shot 2020 11 27 at 12 18 39 pm\" title=\"screen shot 2020 11 27 at 12 18 39 pm\"\/><\/a><\/figure>\n\n\n\n<p>This cost function is also known as negative log-likelihood loss or cross-entropy 
loss.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/beb5a1e12e2c3c49ed7143fd88b9769a\/321ea\/screen-shot-2020-11-27-at-12.19.37-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/beb5a1e12e2c3c49ed7143fd88b9769a\/321ea\/screen-shot-2020-11-27-at-12.19.37-pm.png\" alt=\"screen shot 2020 11 27 at 12 19 37 pm\" title=\"screen shot 2020 11 27 at 12 19 37 pm\"\/><\/a><\/figure>\n\n\n\n<p><em>Figure 3: Cost function (Source: ResearchGate)<\/em><\/p>\n\n\n\n<p>Figure 3 depicts the cost function. When \u201cy\u201d is one and \u201ch\u201d is zero (blue line), the cost is high, severely penalizing the machine learning model. When \u201cy\u201d is one and \u201ch\u201d is also one (blue line), the cost is zero, meaning there is no penalty for a correct prediction. Similarly, when \u201cy\u201d is zero and \u201ch\u201d is one (red line), the penalty is high, whereas when \u201cy\u201d is zero and \u201ch\u201d is also zero, the penalty is zero.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/b9c30e051cbe34f6dbde8d30c4374690\/2c288\/screen-shot-2020-11-27-at-12.51.04-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/library\/static\/b9c30e051cbe34f6dbde8d30c4374690\/2c288\/screen-shot-2020-11-27-at-12.51.04-pm.png\" alt=\"screen shot 2020 11 27 at 12 51 04 pm\" title=\"screen shot 2020 11 27 at 12 51 04 pm\"\/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><a href=\"https:\/\/www.springboard.com\/library\/static\/9d07a63de4803b2c0e8cbcd7ff09071e\/f7616\/screen-shot-2020-11-27-at-12.25.10-pm.png\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" 
src=\"https:\/\/www.springboard.com\/library\/static\/9d07a63de4803b2c0e8cbcd7ff09071e\/f7616\/screen-shot-2020-11-27-at-12.25.10-pm.png\" alt=\"screen shot 2020 11 27 at 12 25 10 pm\" title=\"screen shot 2020 11 27 at 12 25 10 pm\"\/><\/a><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Implementing logistic regression on the MNIST dataset<\/h2>\n\n\n\n<p>In this section, we will implement logistic regression on the <a href=\"https:\/\/www.openml.org\/d\/554\" target=\"_blank\" rel=\"noreferrer noopener\">MNIST dataset<\/a>. The MNIST dataset is a well-known benchmark dataset in the machine learning community. It consists of labeled pictures of handwritten digits. All images are squares of 28 x 28 pixels. The labels range from zero to nine, which makes this a multinomial logistic regression problem.<\/p>\n\n\n\n<p>By default, Scikit-learn takes care of the implementation: it treats the problem as binary or multinomial depending on the number of labels present in the dataset.<\/p>\n\n\n\n<p>The code for implementing logistic regression with Scikit-learn on the MNIST dataset can be found <a href=\"https:\/\/github.com\/mayank311996\/cheatsheets\/blob\/master\/blogs\/SB\/Scikit-learn-A-Complete-Guide-With-a-Logistic-Regression-Example\/sklearn_logistic_regression.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. This includes a detailed implementation of the logistic regression model with Scikit-learn.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Scikit-Learn is a machine learning library that includes many supervised and unsupervised learning algorithms. To date, Scikit-Learn is the first stop for most data scientists and machine learning engineers to build their first machine learning model or set a benchmark for further experiments. 
This is handy because you don\u2019t always need complex and computationally expensive [&hellip;]<\/p>\n","protected":false},"author":100,"featured_media":19011,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_eb_attr":"","_eb_data_table":"","footnotes":""},"categories":[67],"tags":[],"marketing_tags":[],"class_list":{"0":"post-14630","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science"},"acf":[],"_links":{"self":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/14630"}],"collection":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/users\/100"}],"replies":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/comments?post=14630"}],"version-history":[{"count":4,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/14630\/revisions"}],"predecessor-version":[{"id":35023,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/14630\/revisions\/35023"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media\/19011"}],"wp:attachment":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media?parent=14630"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/categories?post=14630"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/tags?post=14630"},{"taxonomy":"marketing_tags","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/marketing_tags?post=14630"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}