{"id":10349,"date":"2023-09-27T12:12:25","date_gmt":"2023-09-27T19:12:25","guid":{"rendered":"https:\/\/www.springboard.com\/?p=10349"},"modified":"2023-10-06T05:22:50","modified_gmt":"2023-10-06T12:22:50","slug":"machine-learning-gpt-3-open-ai","status":"publish","type":"post","link":"https:\/\/www.springboard.com\/blog\/data-science\/machine-learning-gpt-3-open-ai\/","title":{"rendered":"OpenAI GPT-3: Everything You Need to Know [Updated]"},"content":{"rendered":"\n<p>OpenAI\u2019s latest model has gone viral again. Much like its predecessor, there is no stopping to the buzz that OpenAI\u2019s latest model GPT-3 is creating around the internet. As experts praise the model for its intuitive capabilities which range from writing articles to generating code, many experts including the founder of OpenAI have called out the hype \u201cway too much\u201d. The timing of the release lines up OpenAI\u2019s new business model of commercialising its AI through API. Unarguably, OpenAI GPT-3 draws the attention of data science enthusiasts towards the idea of artificial general intelligence taking shape. In this blog, we present the various aspects of GPT-3, from the details of building blocks to the performance of the model and the possibilities of what all it can do.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">OpenAI GPT-3: What is GPT-3?<\/h2>\n\n\n\n<p><em>Generative Pre-trained Transformer 3<\/em> (<strong>GPT-3<\/strong>) is a language model that leverages deep learning to generate human-like text (output). Not only can it produce text, but it can also generate code, stories, poems, etc. 
For these reasons, it has become a hot topic in natural language processing (NLP), an essential sub-branch of <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" rel=\"noreferrer noopener\">data science<\/a>. <\/p>\n\n\n\n<p>GPT-3 was introduced by OpenAI in May 2020 as the successor to its previous language model (LM), GPT-2. It is both bigger and better than GPT-2. In fact, with around 175 billion trainable parameters, the full version of OpenAI GPT-3 was the largest language model trained at the time of its release. This 72-page <a rel=\"noreferrer noopener\" aria-label=\"research paper (opens in a new tab)\" href=\"https:\/\/arxiv.org\/pdf\/2005.14165.pdf\" target=\"_blank\">research paper<\/a>** describes in great detail the features, capabilities, performance and limitations of the model. <\/p>\n\n\n\n<p><em>**We will refer to the research paper as \u201cthe paper\u201d at multiple places in this blog. <\/em><\/p>\n\n\n\n<p>In this blog, we have included some illustrations directly from the paper and focused on the key highlights to save you from having to read through the whole paper. <\/p>\n\n\n\n<p><em>GPT-3 is a very large language model. Given some input text, it can probabilistically determine what tokens from a known vocabulary will come next. <\/em>Before we go ahead and see what makes GPT-3 so special, let\u2019s first understand what a language model is. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What are Language Models (LMs)?<\/h2>\n\n\n\n<p>Simply put, language models are statistical tools to predict the next word(s) in a sequence. In other words, a language model is a probability distribution over sequences of words. 
Language models have many applications, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Part of Speech (PoS) Tagging<\/li>\n\n\n\n<li>Machine Translation<\/li>\n\n\n\n<li>Text Classification<\/li>\n\n\n\n<li>Speech Recognition<\/li>\n\n\n\n<li>Information Retrieval<\/li>\n\n\n\n<li>News Article Generation<\/li>\n\n\n\n<li>Question Answering, etc.<\/li>\n<\/ul>\n\n\n\n<p>A popular word-embedding method used in NLP is Word2Vec, which was introduced in 2013. The real boost to language models came in 2017 with the arrival of the \u201ctransformer\u201d. You can read more about \u201cattention\u201d and the \u201ctransformer\u201d in the <a href=\"https:\/\/arxiv.org\/pdf\/1706.03762.pdf\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"paper in which it was proposed (opens in a new tab)\">paper in which it was proposed<\/a>. Or leave us feedback and we will cover it for you in one of our blogs!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What Makes OpenAI GPT-3 Different?<\/h2>\n\n\n\n<p>The first thing that stands out about GPT-3 is its sheer number of trainable parameters, which the paper puts at 10x more than any previous non-sparse language model. <\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"934\" height=\"592\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trainable-parameters.jpg\" alt=\"OpenAI GPT-3\" class=\"wp-image-35714\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trainable-parameters.jpg 934w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trainable-parameters-380x241.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trainable-parameters-380x241.jpg 420w\" sizes=\"(max-width: 934px) 100vw, 934px\" \/><\/figure>\n\n\n\n<p>In general, the more parameters a model has, the more data is required to train the model. 
As per the creators, the OpenAI GPT-3 model has been trained on about 45 TB of text data from multiple sources, including Wikipedia and books. The datasets used to train the model are shown below:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong><em>Dataset <\/em><\/strong><\/td><td><strong>Quantity (tokens)<\/strong><\/td><td><strong>Weight in training mix<\/strong><\/td><td><strong>Epochs elapsed when training for 300B tokens<\/strong><\/td><\/tr><tr><td><em>Common Crawl (filtered) <\/em><\/td><td>410 billion<\/td><td>60%<\/td><td>0.44<\/td><\/tr><tr><td><em>WebText2 <\/em><\/td><td>19 billion<\/td><td>22%<\/td><td>2.9<\/td><\/tr><tr><td><em>Books1 <\/em><\/td><td>12 billion<\/td><td>8%<\/td><td>1.9<\/td><\/tr><tr><td><em>Books2 <\/em><\/td><td>55 billion<\/td><td>8%<\/td><td>0.43<\/td><\/tr><tr><td><em>Wikipedia <\/em><\/td><td>3 billion<\/td><td>3%<\/td><td>3.4<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The <strong><em>Common Crawl<\/em><\/strong><em> <\/em>corpus contains petabytes of data collected over 8 years of web crawling. The corpus contains raw web page data, metadata extracts and text extracts with light filtering. <\/p>\n\n\n\n<p><strong><em>WebText2<\/em><\/strong><em> <\/em>is the text of web pages from all outbound Reddit links from posts with 3+ upvotes. <\/p>\n\n\n\n<p><strong><em>Books1 &amp; Books2<\/em><\/strong><em> <\/em>are two internet-based books corpora. <\/p>\n\n\n\n<p><strong><em>Wikipedia<\/em><\/strong><em> <\/em>pages<em> <\/em>in the English language are also part of the training corpus. <\/p>\n\n\n\n<p>The third column in the table, \u201cWeight in training mix\u201d, refers to the fraction of examples during training that are drawn from a given dataset. 
<\/p>\n\n\n\n<p>A major issue in training such large models on so much data from the internet is that they have the capacity to memorise content, which can contaminate downstream evaluation because the model may already have seen the test data. The creators of GPT-3 took some measures to avoid overlap between training and test data, but a bug in the filtering caused some of the data to leak. As mentioned in the paper, the team <strong>could not retrain the model<\/strong> due to the high cost associated with the training. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">OpenAI GPT-3 Architecture<\/h2>\n\n\n\n<p>GPT-3 is not one single model but a family of models. Each model in the family has a different number of trainable parameters. The following table shows each model, its architecture and its corresponding parameters:<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"942\" height=\"286\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/gpt-3.jpg\" alt=\"GPT-3 model family: sizes, architectures and parameters\" class=\"wp-image-35710\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/gpt-3.jpg 942w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/gpt-3-380x115.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/gpt-3-380x115.jpg 420w\" sizes=\"(max-width: 942px) 100vw, 942px\" \/><\/figure>\n\n\n\n<p>In fact, the OpenAI GPT-3 family of models is based on the same transformer-based architecture as the GPT-2 model, including the modified initialisation, pre-normalisation and reversible tokenisation, with the exception that it uses alternating dense and locally banded sparse attention patterns. 
<\/p>\n\n\n\n<p>The largest version, GPT-3 175B or simply \u201cGPT-3\u201d, has 175B parameters, 96 attention layers and a batch size of 3.2M.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"855\" height=\"331\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/open-gpt-3.jpg\" alt=\"OpenAI GPT-3\" class=\"wp-image-35713\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/open-gpt-3.jpg 855w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/open-gpt-3-380x147.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/open-gpt-3-380x147.jpg 420w\" sizes=\"(max-width: 855px) 100vw, 855px\" \/><figcaption class=\"wp-element-caption\">Original Transformer Architecture<\/figcaption><\/figure>\n\n\n\n<p>Shown in the figure above is the original transformer architecture. As mentioned before, OpenAI GPT-3 is based on a similar architecture, only much larger. While language models like BERT use the Encoder to generate embeddings from the raw text which can be used in other machine learning applications, the GPT family uses the Decoder half, so they take in embeddings and produce text. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Accuracy \/ Performance \/ Numbers of OpenAI GPT-3<\/h3>\n\n\n\n<p>The various tasks that any language model can perform depend on how it is fine-tuned\/updated. With GPT-3, many of the NLP tasks discussed earlier can be done without any fine-tuning, gradient or parameter updates, which makes this model <strong>Task-Agnostic<\/strong>. So OpenAI GPT-3 can perform tasks with very few or no examples\/demonstrations (or \u201cshots\u201d, as they are better known). Before we dive into the numbers, let\u2019s first understand the concept of Zero\/One\/Few shot tasks with respect to the model and see how one can interact with the model using a few examples. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"819\" height=\"754\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/zero-shot.jpg\" alt=\"OpenAI GPT-3 Shot Task Interaction\" class=\"wp-image-35715\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/zero-shot.jpg 819w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/zero-shot-380x350.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/zero-shot-380x350.jpg 420w\" sizes=\"(max-width: 819px) 100vw, 819px\" \/><\/figure>\n\n\n\n<p>Above figure shows the three settings in which GPT-3 can perform the task of translating from English to French. <\/p>\n\n\n\n<p>The <strong>Few-shot (FS)<\/strong> setting is kind of similar to how we go about training a machine learning model where we give some inputs and corresponding outputs to a model and then expect the model to perform on an unseen input. However, the difference here is that unlike a normal <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/14-essential-machine-learning-algorithms\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/blog\/data-science\/14-essential-machine-learning-algorithms\/\" rel=\"noreferrer noopener\">ML algorithm<\/a>, the model does not do any weight updates. It just infers on the basis of the \u201cshots\u201d that it has been fed. One typically feeds in between 10-100 shots for one such setting (as per the paper). <\/p>\n\n\n\n<p><strong>One-Shot (1S)<\/strong> setting is the same as FS except that only one example\/demo\/context is fed to the model in addition to the last context(which is the task). <\/p>\n\n\n\n<p><strong>Zero-Shot (0S)<\/strong> is when there is no context allowed except for the last (which is the task). 
This kind of setting is \u201c<strong>unfairly hard<\/strong>\u201d as it could be difficult for even humans to understand what the task is with no example or demonstration. <\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"801\" height=\"423\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/close-up-of-a-map.jpg\" alt=\"A close up of a map\" class=\"wp-image-35716\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/close-up-of-a-map.jpg 801w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/close-up-of-a-map-380x201.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/close-up-of-a-map-380x201.jpg 420w\" sizes=\"(max-width: 801px) 100vw, 801px\" \/><\/figure>\n\n\n\n<p>The above image shows the accuracy of the OpenAI GPT-3 model while performing the Zero-shot, One-shot and Few-shots tasks along with the number of parameters and shots for a simple task  to remove random symbols from a word. Now, let\u2019s have a look at how the models (175B params to 125M params) perform at some well known (benchmarked) tasks. All results cited (quoted) from the paper. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">OpenAI GPT-3 Language Modelling<\/h3>\n\n\n\n<p>\u201cOur largest model sets a new SOTA on PTB by a substantial margin of 15 points, achieving a perplexity of 20.50. Note that since PTB is a traditional language modelling dataset it does not have a clear separation of examples to define one-shot or few-shot evaluation around, so we measure only zero-shot.\u201d<\/p>\n\n\n\n<p>The team calculated 0S perplexity on the Penn Tree Bank dataset. <\/p>\n\n\n\n<h4 class=\"wp-block-heading\">LAMBADA<\/h4>\n\n\n\n<p>The LAMBADA dataset basically tests a model\u2019s capability to predict the last word of sentences which require reading a paragraph of context. 
<\/p>\n\n\n\n<p>\u201c&#8230;in a zero-shot setting GPT-3 achieves 76% on LAMBADA, a gain of 8% over the previous state of the art.\u201d<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"733\" height=\"460\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/lambada-1.jpg\" alt=\"LAMBADA dataset\" class=\"wp-image-35718\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/lambada-1.jpg 733w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/lambada-1-380x238.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/lambada-1-380x238.jpg 420w\" sizes=\"(max-width: 733px) 100vw, 733px\" \/><figcaption class=\"wp-element-caption\"><em>GPT-3 results for FS setting on LAMBADA, Source: paper<\/em><\/figcaption><\/figure>\n\n\n\n<p>\u201c&#8230;GPT-3 2.7B outperforms the SOTA 17B parameter Turing-NLG in this setting (FS), and GPT-3 175B advances the state of the art by 18%&#8230;\u201d <\/p>\n\n\n\n<h4 class=\"wp-block-heading\">HellaSwag<\/h4>\n\n\n\n<p>\u201cThe HellaSwag dataset involves picking the best ending to a story or set of instruction. GPT-3 achieves 78.1% accuracy in the one-shot setting and 79.3% accuracy in the few-shot setting, outperforming the 75.4% accuracy of a fine-tuned 1.5B parameter language model but still a fair amount lower than the overall SOTA of 85.6% achieved by the fine-tuned multi-task model ALUM.\u201d<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">StoryCloze<\/h4>\n\n\n\n<p>The StoryCloze 2016 dataset involves selecting the correct ending sentence for five-sentence long stories. \u201cHere GPT-3 achieves 83.2% in the zero-shot setting and 87.7% in the few-shot setting (with K = 70). 
This is still 4.1% lower than the fine-tuned SOTA using a BERT based model [LDL19] but improves over previous zero-shot results by roughly 10%.\u201d<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">OpenAI GPT-3: Closed Book Question Answering<\/h3>\n\n\n\n<p>This task tests the ability of OpenAI GPT-3 to answer questions about broad factual knowledge. GPT-3 was tested on three different QA datasets. The results for the same are shown in the table below:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"745\" height=\"195\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/cell-phone-description-of-closed-book.jpg\" alt=\"OpenAI GPT-3: Closed Book Question Answering\" class=\"wp-image-35719\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/cell-phone-description-of-closed-book.jpg 745w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/cell-phone-description-of-closed-book-380x99.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/cell-phone-description-of-closed-book-380x99.jpg 420w\" sizes=\"(max-width: 745px) 100vw, 745px\" \/><figcaption class=\"wp-element-caption\"><em>Results on three Open Domain QA Tasks, Source: paper<\/em><\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"702\" height=\"456\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trivia-qa-1.jpg\" alt=\"A close up of a map\n\nDescription automatically generated\" class=\"wp-image-35721\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trivia-qa-1.jpg 702w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trivia-qa-1-380x247.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/trivia-qa-1-380x247.jpg 420w\" sizes=\"(max-width: 702px) 100vw, 702px\" 
\/><\/figure>\n\n\n\n<p>The figure above shows GPT-3\u2019s performance on the TriviaQA dataset. It can be observed how the performance grows with size and how 1S and FS settings beat 0S and match + exceed the SOTA on the task. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">OpenAI GPT-3: Language Translation<\/h3>\n\n\n\n<p>Although GPT-3\u2019s training data comprised of &gt; 90% English text it did include some foreign language text. Following graph (taken from the paper) summarises the performance of GPT-3 on the language translation task. <\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"717\" height=\"464\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/translation-multi-bleu-1.jpg\" alt=\"Translation (Multi-BLEU)\" class=\"wp-image-35723\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/translation-multi-bleu-1.jpg 717w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/translation-multi-bleu-1-380x246.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/translation-multi-bleu-1-380x246.jpg 420w\" sizes=\"(max-width: 717px) 100vw, 717px\" \/><figcaption class=\"wp-element-caption\"><em>GPT-3 translation performance in FS setting on 6 language pairs<\/em><\/figcaption><\/figure>\n\n\n\n<p>\u201cFor the three input languages studied, GPT-3 significantly outperforms prior unsupervised NMT work when translating into English but underperforms when translating in the other direction.\u201d<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Winograd-Style Tasks<\/h3>\n\n\n\n<p>The Winograd Schemas Challenge involves determining which word a pronoun refers to, when the pronoun is grammatically ambiguous but semantically unambiguous to a human. 
<\/p>\n\n\n\n<p>\u201cOn Winograd GPT-3 achieves 88.3%, 89.7%, and 88.6% in the zero-shot, one-shot, and few-shot settings respectively, showing no clear in-context learning but in all cases achieving strong results just a few points below state-of-the-art and estimated human performance.\u201d<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Common Sense Reasoning<\/h3>\n\n\n\n<p>Three datasets were considered for this task. The first dataset, PhysicalQA (PIQA), asks common sense questions about how the physical world works and is intended as a probe of grounded understanding of the world. \u201cGPT-3 achieves 81.0% accuracy zero-shot, 80.5% accuracy one-shot, and 82.8% accuracy few-shot (the last measured on PIQA\u2019s test server). This compares favourably to the 79.4% accuracy prior state-of-the-art of a fine-tuned RoBERTa.\u201d<\/p>\n\n\n\n<p>There are a few more results in the paper for tasks like reading comprehension, SuperGLUE, NLI, and synthetic and qualitative tasks (arithmetic, word scrambling and manipulation, SAT analogies, news article generation, learning and using novel words, correcting English grammar). Let\u2019s pick one of the most interesting tasks, News Article Generation. 
<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">News Article Generation<\/h3>\n\n\n\n<p>The release of GPT-2\u2019s largest model was briefly on hold due to the controversy of it being capable of generating fake news. GPT-3 model was able to generate news articles that are practically indistinguishable from the real ones. One of the experiments showed that for the 175B model, humans were able to distinguish fake articles with only 52% accuracy. <\/p>\n\n\n\n<p>Here are some of the examples of the fake news article generated by GPT-3 along with the accuracy that the human participants achieved in being able to distinguish it. <\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"721\" height=\"457\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-1.jpg\" alt=\"Article for which human's had lowest accuracy in identifying (accuracy 12%), Source: paper\" class=\"wp-image-35724\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-1.jpg 721w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-1-380x241.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-1-380x241.jpg 420w\" sizes=\"(max-width: 721px) 100vw, 721px\" \/><figcaption class=\"wp-element-caption\"><em>Article for which human&#8217;s had lowest accuracy in identifying (accuracy 12%), Source: paper<\/em><\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"735\" height=\"417\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-2.jpg\" alt=\"News Article Generation: human participants to identify\" class=\"wp-image-35725\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-2.jpg 735w, 
https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-2-380x216.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/new-article-generation-2-380x216.jpg 420w\" sizes=\"(max-width: 735px) 100vw, 735px\" \/><figcaption class=\"wp-element-caption\"><strong><em>Article generated by GPT-3 that was most easy for human participants to identify (accuracy 61%), Source: paper<\/em><\/strong><\/figcaption><\/figure>\n\n\n\n<p>The plot below shows the human ability to detect model generated fake news articles. <\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"717\" height=\"469\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/human-ability-to-detect-model-generated-articles.jpg\" alt=\"human ability to detect model generated fake news articles\" class=\"wp-image-35726\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/human-ability-to-detect-model-generated-articles.jpg 717w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/human-ability-to-detect-model-generated-articles-380x249.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/human-ability-to-detect-model-generated-articles-380x249.jpg 420w\" sizes=\"(max-width: 717px) 100vw, 717px\" \/><\/figure>\n\n\n\n<p>It can be observed from the plot above that the ability to distinguish fake article decreases as the model size increases. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How Can We Get Our Hands on the Model?<\/h3>\n\n\n\n<p>You can\u2019t simply download the model or train it on your own even if you have the infrastructure. OpenAI has built an API which is accessible through a waiting list. You can visit their site and <a rel=\"noreferrer noopener\" aria-label=\"join the waiting list (opens in a new tab)\" href=\"https:\/\/beta.openai.com\/\" target=\"_blank\">join the waiting list<\/a>. 
In fact, you can go to the demo section of <a href=\"https:\/\/beta.openai.com\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\" (opens in a new tab)\">https:\/\/beta.openai.com<\/a> and try out some demos yourself to get a fair idea of how some of the use-cases work. <\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"733\" height=\"394\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/text-generation-ai-1.jpg\" alt=\"Text Generation Demo\" class=\"wp-image-35728\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/text-generation-ai-1.jpg 733w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/text-generation-ai-1-380x204.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/text-generation-ai-1-380x204.jpg 420w\" sizes=\"(max-width: 733px) 100vw, 733px\" \/><figcaption class=\"wp-element-caption\"><em>Demo section at <a href=\"https:\/\/beta.openai.com\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\" (opens in a new tab)\">https:\/\/beta.openai.com<\/a><\/em><\/figcaption><\/figure>\n\n\n\n<p>If you select the Q&amp;A task and click on \u201cSee cached response\u201d button, you will get the following result:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"712\" height=\"271\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa.jpg\" alt=\"Q&amp;A task\" class=\"wp-image-35729\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa.jpg 712w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa-380x145.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa-380x145.jpg 420w\" sizes=\"(max-width: 712px) 100vw, 712px\" \/><\/figure>\n\n\n\n<p>So if you were to do a task like the one shown above, you would need to write a code similar to the one 
shown below:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa-2.jpg\" alt=\"Q&amp;A-open AI\" class=\"wp-image-35730\" style=\"width:796px;height:499px\" width=\"796\" height=\"499\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa-2.jpg 796w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa-2-380x238.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/qa-2-380x238.jpg 420w\" sizes=\"(max-width: 796px) 100vw, 796px\" \/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/openai.com\/api\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/openai.com\/api\/\" rel=\"noreferrer noopener\">OpenAI<\/a><\/figcaption><\/figure>\n\n\n\n<p>As you can observe in the code snippet above, the API is provided with 5 contexts and the last Q is the task that the model needs to complete. It needs to predict what words will follow \u2018<strong>A:<\/strong>\u2019.<\/p>\n\n\n\n<p>Since the waiting list is just too long and it could be a while before you get your hands on your own API key, we have some examples gathered from the web on how machine learning enthusiasts and <em><a href=\"https:\/\/www.springboard.com\/blog\/data-science\/what-does-a-data-scientist-do\/\" target=\"_blank\" data-type=\"post\" data-id=\"24427\" rel=\"noreferrer noopener\">data scientists<\/a><\/em> are using the model for different applications. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Natural Language to SQL generation<\/h3>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"550\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">Another line of work to be &quot;aided&quot; &#8211; data, business analytics. 
Apparently GPT-3 is able to abstract and generate SQL queries. Tech acceleration is taking off rapidly. Industry application will follow. Video courtesy: <a href=\"https:\/\/twitter.com\/faraaznishtar?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">@FaraazNishtar<\/a><a href=\"https:\/\/twitter.com\/hashtag\/GPT3?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#GPT3<\/a> <a href=\"https:\/\/twitter.com\/hashtag\/OpenAI?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#OpenAI<\/a> <a href=\"https:\/\/twitter.com\/hashtag\/Tech?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#Tech<\/a> <a href=\"https:\/\/twitter.com\/hashtag\/SQL?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#SQL<\/a> <a href=\"https:\/\/twitter.com\/hashtag\/Analytics?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#Analytics<\/a> <a href=\"https:\/\/t.co\/sJcRdymxi5\">pic.twitter.com\/sJcRdymxi5<\/a><\/p>&mdash; Generalist Lab (@Generalist_Lab) <a href=\"https:\/\/twitter.com\/Generalist_Lab\/status\/1286385514795409410?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">July 23, 2020<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>The ability to convert natural language queries into SQL statements is a testament to how AI is bridging the gap between human language and technical database queries. This kind of technology isn&#8217;t just about understanding natural language; it&#8217;s also about understanding the intricacies of database languages and structures. 
For those looking to further explore the backend side of web applications and the intricate dance between databases and servers, diving into a <a href=\"https:\/\/www.springboard.com\/blog\/software-engineering\/best-full-stack-bootcamps\/\" target=\"_blank\" rel=\"noreferrer noopener\">full stack developer bootcamp<\/a> can offer a holistic view.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Creating ToDo List Apps (and other apps)<\/h3>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"550\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">I built a todo list app simply by describing it to GPT-3.<br><br>It generated the React code for a fully functioning app within seconds.<br><br>I&#39;m becoming more impressed and aware of its capabilities every single day. <a href=\"https:\/\/t.co\/QGrClar03s\">pic.twitter.com\/QGrClar03s<\/a><\/p>&mdash; Sharif Shameem (@sharifshameem) <a href=\"https:\/\/twitter.com\/sharifshameem\/status\/1284421499915403264?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">July 18, 2020<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Python Code Generation from Natural Text<\/h3>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"550\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\"><a href=\"https:\/\/twitter.com\/hashtag\/GPT3?src=hash&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">#GPT3<\/a> generates your API<br><br>I have been exploring the capabilities of GPT-3. 
Built a demo app on top of it that lets you generate Flask (Python) API code just by describing the functions in English.<br><br>Check it out \ud83d\udc47<br><br>Thanks to <a href=\"https:\/\/twitter.com\/OpenAI?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">@openai<\/a> and <a href=\"https:\/\/twitter.com\/gdb?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">@gdb<\/a> for providing me access. <a href=\"https:\/\/t.co\/bNcRoAHWLQ\">pic.twitter.com\/bNcRoAHWLQ<\/a><\/p>&mdash; Samanyou Garg (@SamanyouGarg) <a href=\"https:\/\/twitter.com\/SamanyouGarg\/status\/1295039749221097472?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">August 16, 2020<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>You can find many more examples online of applications that enthusiasts with API access are building with GPT-3. <\/p>\n\n\n\n<p>Python Code Generation from Natural Text demonstrates GPT-3&#8217;s proficiency with Python, a leading language known for its simplicity and power. As Python evolves, both novices and experts benefit from its vast ecosystem. Those intrigued by its capabilities might find insights in various <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/best-data-science-bootcamps\/\" target=\"_blank\" rel=\"noreferrer noopener\">data science bootcamps<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Limitations of OpenAI GPT-3<\/h3>\n\n\n\n<p>The creators of GPT-3 themselves acknowledge that the model has weaknesses and makes silly mistakes. In particular, its text synthesis can suffer from repetition, contradictions, and loss of coherence over long passages. However, this is not too different from other language models. The architecture also imposes a fundamental limitation on the model. The GPT-3 model is an autoregressive language model and not a bidirectional one (like BERT). 
So GPT-3 is better suited to tasks that rely on \u201cin-context\u201d learning than to those that depend on \u201cfine-tuning\u201d. <\/p>\n\n\n\n<p>Shown below are the accuracy results of GPT-3 models on arithmetic tasks. The smaller models perform poorly even on simple single-digit or double-digit arithmetic, and accuracy on arithmetic with four or more digits is low. <\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"684\" height=\"441\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/arithemeticfew-shot.jpg\" alt=\"Arithmetic tasks\" class=\"wp-image-35731\" srcset=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/arithemeticfew-shot.jpg 684w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/arithemeticfew-shot-380x245.jpg 380w, https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2022\/10\/arithemeticfew-shot-380x245.jpg 420w\" sizes=\"(max-width: 684px) 100vw, 684px\" \/><figcaption class=\"wp-element-caption\"><em>GPT-3 results on Arithmetic tasks with FS setting, Source: paper<\/em><\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Summary<\/h4>\n\n\n\n<p>To summarise:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>GPT-3 is a very large language model (the largest to date) with about 175B parameters.<\/li>\n\n\n\n<li>It is trained on about 45TB of text data from different datasets.<\/li>\n\n\n\n<li>The model itself has no knowledge as such; it is just good at predicting the next word(s) in the sequence. It is not designed to store or retrieve facts. <\/li>\n\n\n\n<li>It produces more fluent and human-like text outputs.<\/li>\n\n\n\n<li>You don\u2019t need task-specific datasets to accomplish a task using GPT-3. It is \u201cTask-Agnostic\u201d.<\/li>\n\n\n\n<li>You cannot download or retrain the model. You need an API key, which you can get by joining the waitlist. 
It has \u201cclosed-API\u201d access.<\/li>\n\n\n\n<li>It is good mostly for English language tasks.<\/li>\n\n\n\n<li>Longer outputs from the model tend to degrade.<\/li>\n\n\n\n<li>The outputs can be biased and abusive. <\/li>\n\n\n\n<li>There are known contaminations in the benchmark experiments, which have been called out clearly in the paper. <\/li>\n<\/ul>\n\n\n\n<p>Even with the API still in closed beta and a long waiting list, the AI and data science community is quite excited about the potential and power of the model and how artificial general intelligence (AGI) is evolving. However, if we are to learn from the issues associated with GPT-2, we need to be more careful and responsible with what we create using this model. <\/p>\n\n\n\n<p class=\"rm has-background\" style=\"background-color:#efeff6\"><strong>Since you\u2019re here\u2026<\/strong> Are you interested in this career track? Investigate with our free guide to <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/what-does-a-data-scientist-do\/\" data-type=\"post\" data-id=\"24427\">what a data professional <em>actually<\/em> does<\/a>. When you\u2019re ready to build a CV that will make hiring managers melt, join our <a href=\"https:\/\/www.springboard.com\/courses\/data-science-career-track\/\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/courses\/data-science-career-track\/\" target=\"_blank\" rel=\"noreferrer noopener\">Data Science Bootcamp<\/a> which will help you land a job or your tuition back!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI\u2019s latest model has gone viral again. Much like its predecessor, there is no stopping to the buzz that OpenAI\u2019s latest model GPT-3 is creating around the internet. 
As experts praise the model for its intuitive capabilities which range from writing articles to generating code, many experts including the founder of OpenAI have called out [&hellip;]<\/p>\n","protected":false},"author":85,"featured_media":10357,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_eb_attr":"","_eb_data_table":"","footnotes":""},"categories":[67],"tags":[1473,1474],"marketing_tags":[],"class_list":{"0":"post-10349","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science","8":"tag-ai","9":"tag-artificial-intelligence"},"acf":[],"_links":{"self":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/10349"}],"collection":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/users\/85"}],"replies":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/comments?post=10349"}],"version-history":[{"count":4,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/10349\/revisions"}],"predecessor-version":[{"id":50303,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/10349\/revisions\/50303"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media\/10357"}],"wp:attachment":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media?parent=10349"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/categories?post=10349"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/tags?post=10349"},{"taxonomy":"marketing_tags","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\
/wp-json\/wp\/v2\/marketing_tags?post=10349"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}