{"id":23577,"date":"2020-06-15T15:28:00","date_gmt":"2020-06-15T22:28:00","guid":{"rendered":"https:\/\/www.springboard.com\/blog\/?p=23577"},"modified":"2023-06-28T00:33:50","modified_gmt":"2023-06-28T07:33:50","slug":"predictive-text-generation","status":"publish","type":"post","link":"https:\/\/www.springboard.com\/blog\/data-science\/predictive-text-generation\/","title":{"rendered":"Deep Learning Project Ideas: Text Generation Using Recurrent Neural Networks (RNNs) and Transformers in NLP"},"content":{"rendered":"\n<p>When I began typing the title of this article, \u201ctext generation using recurrent n\u2026\u201d, the tool I\u2019m typing on, Google Docs, began automatically completing my sentences. In this case, it accurately suggested recurrent neural networks! If you\u2019ve ever used Gmail compose or even Google search, this wouldn\u2019t surprise you, because predictive text has been in vogue for a very long time and is widely used across industries.<\/p>\n\n\n\n<p>In this blog post, Springboard\u2019s machine learning mentor Raghav Bali offers inspiration for deep learning projects and walks you through creating your own predictive text generation model with recurrent neural networks and transformers. Raghav is a senior data scientist at UnitedHealth Group, where he designs and implements machine learning, AI and deep learning-based solutions for healthcare and insurance. 
Before UnitedHealth Group, he worked at American Express and Intel, building enterprise-level intelligent solutions.<\/p>\n\n\n\n<p><em>For further reading, <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/data-scientist-job-description\/\" data-type=\"post\" data-id=\"2371\">check out the data scientist job description here<\/a> and <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" data-type=\"post\" data-id=\"2291\">learn more about data science<\/a>.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding Language to Build Recurrent Neural Networks<\/h2>\n\n\n\n<p>For humans, language is an integral part of existence. We use language to communicate thoughts and ideas every single day, and we instinctively understand what language is. Before getting machines to understand it, let\u2019s first define language in abstract terms: a collection of letters, used in specific settings or contexts, to create words (a vocabulary) following a set of rules (a grammar).&nbsp;<\/p>\n\n\n\n<p>It takes us humans many years of learning to communicate in a language. This is simply because languages are complex and constantly evolving. Rules of spelling and grammar aren\u2019t standard, there are multiple ways to communicate the same thing, and the same word might mean different things in different contexts. For example, take the sentence, \u201cThe bowler made a batsman duck.\u201d What does this mean? Does it mean that the bowler made the batsman bend to avoid being hit by the ball? Or that the bowler got the batsman\u2019s wicket at zero? Or that he converted him into a quacking duck?<\/p>\n\n\n\n<p>Let\u2019s see another example. \u201cThe stolen painting was found by the tree.\u201d Did the tree find the painting? Or was the painting left near the tree?<\/p>\n\n\n\n<p>As people, we understand communication through a complex process which takes into account several tangible and intangible aspects. 
Sarcasm, lingo, hashtags, etc. are processed easily by the human brain, but they can be difficult for machines to handle. But we have to try. Today, let\u2019s look at a deep learning project idea that you can use as a basis for building more complex models that improve on existing systems.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding Recurrent Neural Networks<\/h2>\n\n\n\n<p>A recurrent neural network (RNN) is a type of neural network designed for sequential data: it processes its input one step at a time and feeds information from each step into the next.&nbsp;<\/p>\n\n\n\n<p>Take the visual below, for instance. In this case, you\u2019ll notice that the input for h2 is not just x2, but also y1, which is the output of the previous step. We use this for natural language processing and text generation applications because language is typically a sequence. When people speak, words take meaning based on previous words, and sentences take meaning from previous sentences. 
With RNNs, that context is carried forward from one step to the next.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"357\" height=\"142\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2020\/06\/predictive-text-generation.png\" alt=\"predictive text generation\" class=\"wp-image-46238\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">How to Build a Language Model<\/h2>\n\n\n\n<p>We can build language models at many levels: word-level, phrase-level or, as we are going to do today, character-level. Basically, a character is any letter, digit, punctuation mark, etc., and the model uses the characters it has seen so far to predict the next character. We are doing this primarily because character-level language modelling gives you a finite and manageable vocabulary. You can go on to build word-level models on the same principles as well.<\/p>\n\n\n\n<p>We will use <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/keras-vs-tensorflow\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/blog\/data-science\/keras-vs-tensorflow\/\" rel=\"noreferrer noopener\">TensorFlow 2.0 with Keras<\/a> as the high-level library. We\u2019ll use gated recurrent units (GRUs), which are better than plain RNNs at managing long-range dependencies and vanishing gradients. We are taking the book \u2018The Adventures of Sherlock Holmes\u2019 from Project Gutenberg as our input dataset.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Step 1: Pre-processing<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Import the required libraries from TensorFlow.&nbsp;<\/li>\n\n\n\n<li>Set the data path to the book in Project Gutenberg. 
<\/li>\n\n\n\n<li>Download the book using Gutenberg\u2019s standard API.&nbsp;<\/li>\n\n\n\n<li>Prepare the text by performing basic clean-up.\n<ul class=\"wp-block-list\">\n<li>Identify unique character count\/vocabulary size: In this case, it is 96, because we\u2019re doing character-level analysis. This would have been significantly higher if we went with a word-level language model.<\/li>\n\n\n\n<li>Perform character-to-integer mapping: Give every unique character a corresponding integer, since the model works on numbers rather than text. You will also need the reverse mapping for decoding the output.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Please note that because we are performing text generation at the character level, we don\u2019t have to perform activities that we typically do in NLP, such as stop-word removal.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Step 2: Data preparation<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Create sequences with a max length of 100.&nbsp;<\/li>\n\n\n\n<li>Create batches of sequences, also of fixed size; in this case, we are using a batch size of 64.&nbsp;<\/li>\n\n\n\n<li>Run a quick shuffle.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Step 3: Prepare the model<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Using Keras\u2019 sequential API, prepare a model with one embedding layer. You can increase the number of layers if you wish, but this will increase the training time.<\/li>\n\n\n\n<li>Define your vocabulary size, embedding dimensions and RNN units.&nbsp;<\/li>\n\n\n\n<li>Set up callbacks.<\/li>\n\n\n\n<li>Train the dragon (in this case, your language model) for 64 epochs.&nbsp;<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Step 4: Text generation<\/h4>\n\n\n\n<p>Try a context input such as \u2018Watson you are\u2019 and see what your model predicts. 
In our case, it says, \u201cin the street,\u201d which is a meaningful and grammatically correct prediction. However, as we move further and further away from the context, predictions lose quality and turn into gibberish. You can continue to optimise the model to improve prediction accuracy.<\/p>\n\n\n\n<p>To get a hands-on demonstration of how we built this model, watch Raghav\u2019s session on YouTube: <a aria-label=\" (opens in a new tab)\" href=\"https:\/\/www.youtube.com\/watch?v=vSN5Tn38ZIc\" target=\"_blank\" rel=\"noreferrer noopener\">Text Generation using RNNs and Transformers in NLP<\/a>. He also explores decoding strategies such as greedy decoding, beam search, sampling, top-k sampling and top-p (nucleus) sampling, as well as the encoder-decoder architecture and more!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When I began typing the title of this article, \u201ctext generation using recurrent n\u2026\u201d, the tool I\u2019m typing on, Google Docs, began automatically completing my sentences. In this case, it accurately suggested recurrent neural networks! 
If you\u2019ve ever used Gmail compose or even Google search, this wouldn\u2019t surprise you, because the predictive text has been [&hellip;]<\/p>\n","protected":false},"author":100,"featured_media":46655,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_eb_attr":"","_eb_data_table":"","footnotes":""},"categories":[67],"tags":[],"marketing_tags":[],"class_list":{"0":"post-23577","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science"},"acf":[],"_links":{"self":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/23577"}],"collection":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/users\/100"}],"replies":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/comments?post=23577"}],"version-history":[{"count":4,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/23577\/revisions"}],"predecessor-version":[{"id":46656,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/23577\/revisions\/46656"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media\/46655"}],"wp:attachment":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media?parent=23577"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/categories?post=23577"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/tags?post=23577"},{"taxonomy":"marketing_tags","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/marketing_tags?post=23577"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{
rel}","templated":true}]}}