{"id":2590,"date":"2019-01-15T11:12:20","date_gmt":"2019-01-15T05:42:20","guid":{"rendered":"https:\/\/www.springboard.com\/?p=2590"},"modified":"2023-07-08T18:18:25","modified_gmt":"2023-07-09T01:18:25","slug":"ggplot2-in-r-tutorial","status":"publish","type":"post","link":"https:\/\/www.springboard.com\/blog\/data-science\/ggplot2-in-r-tutorial\/","title":{"rendered":"Ggplot2 Function Cheat Sheet and R Tutorial"},"content":{"rendered":"\n<p>The ggplot2 package, created by Hadley Wickham, provides a fast and efficient way to produce good-looking data visualizations that you can use to&nbsp;derive and communicate insights from your data sets. The package was designed to help you create all different types of data graphics in R, including histograms, scatter plots, bar charts, box plots, and density plots. This textbook has&nbsp;numerous examples of visualizations&nbsp;you can build in ggplot2.<\/p>\n\n\n\n<p>The ggplot2 package offers a powerful graphics language for creating elegant and complex plots.&nbsp;Originally based on Leland Wilkinson\u2019s&nbsp;The Grammar of Graphics, ggplot2 allows you to create graphs that represent both univariate and multivariate numerical and categorical data in a straightforward manner. Grouping can be represented by color, symbol, size, and transparency. The creation of trellis plots (i.e., conditioning), graphs that show relationships between different variables, is relatively simple.<\/p>\n\n\n\n<p>In recent years, ggplot2\u2019s popularity has grown exponentially. Due to its popularity, the functionalities built into this package have increased \u2014 which might be overwhelming for someone getting started with ggplot2. 
So I created this ggplot2 tutorial and cheat sheet to help you learn the basic functionalities of ggplot2.<\/p>\n\n\n\n<p>This is a quick tour of the basics of ggplot2, enough so that you can create beautiful visualizations in R.<\/p>\n\n\n\n<p>You can use it as a handy reference, or cheat sheet, if you have&nbsp;<i>just&nbsp;<\/i>started your data science journey with ggplot2 in R, or as a guide to what you need to do when you\u2019re looking to create a specific data visualization in R.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">ggplot2 Cheat Sheet of Essential Functions<\/h2>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.springboard.com\/blog\/wp-content\/uploads\/2017\/03\/ggplotv4-cheat-sheet-791x1024.jpg\" alt=\"ggplot2 Cheat Sheet\" class=\"wp-image-5940\"\/><\/figure>\n\n\n\n<p><em>Here is a\u00a0<a href=\"https:\/\/ddf46429.springboard.com\/uploads\/resources\/1495744514_ggplotv4.docx.pdf\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/ddf46429.springboard.com\/uploads\/resources\/1495744514_ggplotv4.docx.pdf\" rel=\"noreferrer noopener\">downloadable version<\/a>\u00a0as a PDF\u00a0in case you want to have it handy with you as you navigate ggplot2 and data visualization in R.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">ggplot2 R Tutorial: Bar Charts<\/h2>\n\n\n\n<p><em><strong>The GitHub repository containing all of the code in this ggplot2 in R tutorial can be found <a href=\"https:\/\/github.com\/Rogerh91\/Springboard-Blog-Tutorials\/blob\/master\/ggplot2%20in%20R%20Tutorial%20Bar%20Graphs\/code.R\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/strong><\/em><\/p>\n\n\n\n<p>When I get my hands on a new dataset, I often want to take a quick look at the shape of the data and at preliminary results before developing my research any further. While many tutorials offer easy ways of plotting data in one way or another, few tutorials lead you through the first steps of data exploration in R. This ggplot2 in R tutorial will help you make sense of large datasets and give you a framework to do some exploratory graphing of your own.<\/p>\n\n\n\n<p><em><strong>Related<\/strong>:\u00a0<a href=\"https:\/\/www.springboard.com\/blog\/data-science\/free-public-data-sets-data-science-project\/\" target=\"_blank\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/blog\/data-science\/free-public-data-sets-data-science-project\/\" rel=\"noreferrer noopener\">19 Free Public Data Sets for Your Project<\/a><\/em><\/p>\n\n\n\n<p>This ggplot2 in R tutorial assumes that you have already installed R, an IDE of your choice (I use RStudio), as well as the ggplot2 package. All these programs and packages are easy to access and free to install, so if you don\u2019t have them already, you can use this guide to <span class=\"c0\"><a class=\"c7\" href=\"https:\/\/cran.r-project.org\/doc\/manuals\/r-release\/R-admin.html\" target=\"_blank\" rel=\"noopener\">figure out how to get started<\/a>. 
Jupyter with R is one of the most intuitive ways to <a href=\"http:\/\/blog.revolutionanalytics.com\/2015\/09\/using-r-with-jupyter-notebooks.html\" target=\"_blank\" rel=\"noreferrer noopener\">start with R<\/a> if you don&#8217;t have anything installed. You can install ggplot2 and other libraries using the <a href=\"http:\/\/www.dummies.com\/programming\/r\/how-to-install-and-load-ggplot2-in-r\/\" target=\"_blank\" rel=\"noreferrer noopener\">install.packages command in R<\/a>.\u00a0<\/span><\/p>\n\n\n\n<p>For the rest of the tutorial, I will be working on a sample dataset obtained from The Metropolitan Museum of Art in New York City. This dataset contains a set of metadata for all the artworks housed in the museum\u2019s collection, and can be found on <span class=\"c11\"><a class=\"c7\" href=\"https:\/\/github.com\/metmuseum\/openaccess\" target=\"_blank\" rel=\"noopener\">GitHub<\/a><\/span><span class=\"c2\">&nbsp;thanks to the Met Museum\u2019s Open Access Initiative. &nbsp;<\/span><\/p>\n\n\n\n<p><span class=\"c2\">First things first: make sure your libraries are installed, then load them by adding the following lines at the top of your script.&nbsp;<\/span><\/p>\n\n\n\n<p><span class=\"c3\"><em>library(ggplot2)<\/em><\/span><\/p>\n\n\n\n<p><em>library(dplyr)<\/em><\/p>\n\n\n\n<p><em>library(reshape2)<\/em><\/p>\n\n\n\n<p><span class=\"c2\">You shouldn\u2019t get any errors after running the code above if the packages have been installed correctly.<\/span><\/p>\n\n\n\n<p><span class=\"c2\">Now, let\u2019s read in the Metropolitan dataset, which is a raw CSV file.<\/span><\/p>\n\n\n\n<p><em>met.collection &lt;- read.csv(file = &quot;~\/Documents\/Springboard-Blog\/Springboard-Blog-Tutorials\/data\/MetObjects.csv&quot;)<\/em><\/p>\n\n\n\n<p><strong>Make sure you change the file path here to whatever it is on your computer! Here&#8217;s a quick guide to how to import <a href=\"http:\/\/rprogramming.net\/read-csv-in-r\/\" target=\"_blank\" rel=\"noopener\">CSVs into R<\/a>. 
You may also have to work with git-lfs (Git Large File Storage), an extension for versioning large files, to get the CSV file we&#8217;re working with, as it exceeds 200 MB in file size. Here&#8217;s a <a href=\"https:\/\/stackoverflow.com\/questions\/34181356\/git-lfs-where-are-the-file-stored-how-to-get-them\" target=\"_blank\" rel=\"noreferrer noopener\">short tutorial <\/a>on that.\u00a0<\/strong><\/p>\n\n\n\n<p><span class=\"c2\">After R has ingested the table (it may take a while!), we can move to one of my favorite R functions: summary()!<\/span><\/p>\n\n\n\n<p><span class=\"c3\"><em>summary(met.collection)<\/em><\/span><\/p>\n\n\n\n<p><span class=\"c2\">summary() is a great function because it looks at every column in your dataset and returns a compact set of statistics about it. If the column is made of numeric values, it will return the minimum, first quartile, median, mean, third quartile, and maximum of the column\u2019s values, plus a count of missing values if there are any.<\/span><\/p>\n\n\n\n<p><span class=\"c2\">If your data is composed of strings stored as factors (such as in our case), summary() returns counts for the most frequent values within a column. Note that since R 4.0.0, read.csv() keeps strings as character vectors by default, in which case summary() only reports the column\u2019s length, class, and mode; passing stringsAsFactors = TRUE to read.csv() restores the counts. The summary() function makes for a great first step for any exploratory data analysis using R.<\/span><\/p>\n\n\n\n<p><span class=\"c2\">I decided to use the summary() function to narrow down where I should explore the data, since the dataset has 43 columns in total!<\/span><\/p>\n\n\n\n<p><span class=\"c2\">This analysis got me to three interesting columns: which countries artists are from (their nationality), which cities they are from, and a column that collected the number of artworks associated with a particular artist. 
While a lot of the top-scoring values are obvious (the Met is an American museum, after all), some of the more interesting values are found in other columns, such as \u201cCity.\u201d Paris, for instance, is the top-scoring city for artworks across the whole collection, beating New York by a fairly wide margin, which suggests that Paris is a particularly great place to meet talented artists.<\/span><\/p>\n\n\n\n<p><span class=\"c2\">Exploratory graphs of a few of these columns could help us find trends in the dataset that are ripe for further exploration. Let\u2019s start with a bar plot of artists\u2019 nationalities found in the Met Collection.<\/span><\/p>\n\n\n<p>[code lang=&#8221;r&#8221; toolbar=&#8221;true&#8221; title=&#8221;Bar Plot of Artists Nationalities&#8221;]<br \/>\n# Count how often each nationality label occurs<br \/>\nnationality &lt;- data.frame(table(met.collection$Artist.Nationality))<br \/>\n# Sort the labels from most to least frequent<br \/>\nnationality &lt;- nationality[order(nationality$Freq, decreasing = TRUE), ]<\/p>\n<p># Keep rows 2 to 11, skipping the top row<br \/>\ndf &lt;- nationality[2:11, ]<br \/>\nggplot(df, aes(x = Var1, y = Freq)) +<br \/>\ngeom_bar(stat = &quot;identity&quot;, color = &quot;black&quot;, fill = &quot;grey&quot;) +<br \/>\nlabs(title = &quot;Frequency by Country\\n&quot;, x = &quot;\\nCountry&quot;, y = &quot;Frequency\\n&quot;) +<br \/>\ntheme_classic() +<br \/>\n# Rotate the x-axis labels so they stay readable<br \/>\ntheme(axis.text.x = element_text(angle = 90, hjust = 1))<br \/>\n[\/code]<\/p>\n\n\n\n<p><span class=\"c2\">The above code creates a frequency table of all elements found in the \u201cArtist.Nationality\u201d column in the dataframe, and then sorts it in descending order of frequency. 
I then grab the top ten occurring terms (rows 2 through 11, skipping the top row) and plot them as a bar graph, rotating the x-axis labels by 90 degrees to make them readable.<\/span><\/p>\n\n\n\n<p>The resulting graph indicates two things: 1) the Met Collection is primarily an American collection, with some affinity for French artists; 2) the Nationality labels need to be cleaned, especially duplicate labels, so that the results can be more easily read.<\/p>\n\n\n\n<p><span class=\"c2\">Let\u2019s see if we can add nuance to the nationality data above by looking at the most popular cities of origin for the Met Collection Archives:<\/span><\/p>\n\n\n<p>[code lang=&#8221;r&#8221; toolbar=&#8221;true&#8221; title=&#8221;Bar Plot of Artists Cities&#8221;]<br \/>\n# Count how often each city occurs, then sort from most to least frequent<br \/>\ncity &lt;- data.frame(table(met.collection$City))<br \/>\ncity &lt;- city[order(city$Freq, decreasing = TRUE), ]<\/p>\n<p># Keep rows 2 to 11, skipping the top row<br \/>\ndf &lt;- city[2:11, ]<br \/>\nggplot(df, aes(x = Var1, y = Freq)) +<br \/>\ngeom_bar(stat = &quot;identity&quot;, color = &quot;black&quot;, fill = &quot;grey&quot;) +<br \/>\nlabs(title = &quot;Frequency by City\\n&quot;, x = &quot;\\nCity&quot;, y = &quot;Frequency\\n&quot;) +<br \/>\ntheme_classic() +<br \/>\ntheme(axis.text.x = element_text(angle = 90, hjust = 1))<br \/>\n[\/code]<\/p>\n\n\n\n<p><span class=\"c2\">Wow! Paris really does a number on New York and London. 
Venice, often regarded as a disproportionately rich source of visual art, lags far behind the big cultural capitals.<\/span><\/p>\n\n\n\n<p><span class=\"c2\">Finally, after all of this geographic analysis, it might be worth knowing what time frame or period predominates in the Met Collection:<\/span><\/p>\n\n\n<p>[code lang=&#8221;r&#8221; toolbar=&#8221;true&#8221; title=&#8221;Bar Plot of Art Timeframes&#8221;]<br \/>\n# Count how often each object date occurs, then sort from most to least frequent<br \/>\ndate &lt;- data.frame(table(met.collection$Object.Date))<br \/>\ndate &lt;- date[order(date$Freq, decreasing = TRUE), ]<\/p>\n<p># Keep rows 3 to 11, skipping the top two rows<br \/>\ndf &lt;- date[3:11, ]<br \/>\nggplot(df, aes(x = Var1, y = Freq)) +<br \/>\ngeom_bar(stat = &quot;identity&quot;, color = &quot;black&quot;, fill = &quot;grey&quot;) +<br \/>\nlabs(title = &quot;Frequency by Date\\n&quot;, x = &quot;\\nDate&quot;, y = &quot;Frequency\\n&quot;) +<br \/>\ntheme_classic() +<br \/>\ntheme(axis.text.x = element_text(angle = 90, hjust = 1))<br \/>\n[\/code]<\/p>\n\n\n\n<p>The code above plots the most frequent object dates. The Met\u2019s collection is primarily composed of 19<span class=\"c6\">th<\/span>&nbsp;and 18<span class=\"c6\">th&nbsp;century artworks, coming either from America or from Europe (most coming from France or Italy). There seems to be a passing interest in art from ancient Egypt or Greece, but not much else by way of non-classical European artworks.<\/span><\/p>\n\n\n\n<p id=\"h.gjdgxs\">Proper data visualization is essential in the field of <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/data-science-definition\/\" target=\"_blank\" rel=\"noreferrer noopener\">data science<\/a>. <span class=\"c2\">Through the use of R&#8217;s summary function and the ggplot2 library, we&#8217;ve started breaking down a large data set and looking for various insights in this ggplot2 in R tutorial. 
That work is never finished in a proper data analysis: we urge you to take this ggplot2 in R tutorial and use it to break down insights you&#8217;d like to see.<\/span> Furthermore, ggplot2 is quickly becoming a popular data visualization package among <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/what-does-a-data-scientist-do\/\" target=\"_blank\" data-type=\"post\" data-id=\"24427\" rel=\"noreferrer noopener\">data scientists<\/a> and data analysts.<\/p>\n\n\n\n<p>Companies are no longer just collecting data. They\u2019re seeking to use it to outpace competitors, especially with the rise of AI and advanced analytics techniques. Between organizations and these techniques are the data scientists, the experts who crunch numbers and translate them into actionable strategies. The future, it seems, belongs to those who can decipher the story hidden within the data, making the role of data scientists more important than ever.<\/p>\n\n\n\n<p>In this article, we\u2019ll look at 10 careers in data science, covering the roles and responsibilities of each and how best to land that specific job. Whether you\u2019re more drawn to the creative side or interested in the strategy planning part of data architecture, there\u2019s a niche for you.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Is Data Science A Good Career?<\/h2>\n\n\n\n<p>Yes. Besides offering competitive salaries, the field sees growing demand for data scientists, as they have an enormous impact on their organizations. 
It\u2019s an interdisciplinary field that keeps the work varied and interesting.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">10 Data Science Careers To Consider<\/h2>\n\n\n\n<p>Whether you want to change careers or land your first job in the field, here are 10 of the most lucrative data science careers to consider.<\/p>\n\n\n\n<div class=\"wp-block-essential-blocks-pro-data-table\"><div class=\"eb-parent-wrapper eb-parent-eb-data-table-cabj7 \"><div class=\"eb-data-table-cabj7 eb-data-table-wrapper\"><div class=\"eb-data-table-wrapper-inner\" data-post-id=\"13385\" data-block-id=\"eb-data-table-cabj7\" data-hide-header=\"false\" data-fixed-header=\"false\" data-show-pagination=\"false\" data-show-search=\"false\" data-fixed-header-scroll-height=\"300\"><\/div><\/div><\/div><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Scientist<\/h3>\n\n\n\n<p>Data scientists represent the foundation of the data science department. At the core of their role is the ability to analyze and interpret complex digital data, such as usage statistics, sales figures, logistics, or market research, depending on the field they operate in.<\/p>\n\n\n\n<p>They combine their computer science, statistics, and mathematics expertise to process and model data, then interpret the outcomes to create actionable plans for companies.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>A data scientist\u2019s career starts with a solid mathematical foundation, whether it\u2019s interpreting the results of an A\/B test or optimizing a marketing campaign. 
Data scientists should have programming expertise (primarily in Python and R) and strong data manipulation skills.&nbsp;<\/p>\n\n\n\n<p>Although a university degree is not always required on top of on-the-job experience, data scientists typically need several <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/best-data-science-courses\/\">data science courses<\/a> and certifications that demonstrate their expertise and willingness to learn.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>The average salary of a data scientist in the US is <a href=\"https:\/\/www.glassdoor.com\/Salaries\/data-scientist-salary-SRCH_KO0,14.htm\" target=\"_blank\" rel=\"noopener\">$156,363<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Analyst<\/h3>\n\n\n\n<p>A data analyst explores the nitty-gritty of data to uncover patterns, trends, and insights that are not always immediately apparent. They collect, process, and perform statistical analysis on large datasets and translate numbers and data to inform business decisions.<\/p>\n\n\n\n<p>A typical day in their life can involve using tools like Excel or SQL and more advanced reporting tools like Power BI or Tableau to create dashboards and reports or visualize data for stakeholders. As a result, they have a unique skill set that allows them to act as a bridge between an organization&#8217;s technical and business sides.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>To become a data analyst, you should have basic programming skills and proficiency in several data analysis tools. 
A lot of data analysts turn to specialized courses or <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/best-data-science-bootcamps\/\">data science bootcamps<\/a> to acquire these skills.&nbsp;<\/p>\n\n\n\n<p>For example, Coursera offers courses like Google&#8217;s Data Analytics Professional Certificate or IBM&#8217;s Data Analyst Professional Certificate, which are well-regarded in the industry. A bachelor&#8217;s degree in fields like computer science, statistics, or economics is standard, but many data analysts also come from diverse backgrounds like business, finance, or even social sciences.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>The average base salary of a data analyst is <a href=\"https:\/\/www.indeed.com\/career\/data-analyst\/salaries\" target=\"_blank\" rel=\"noopener\">$76,892<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Business Analyst<\/h3>\n\n\n\n<p>Business analysts often have an essential role in an organization, driving change and improvement. That\u2019s because their main role is to understand business challenges and needs and translate them into solutions through data analysis, process improvement, or resource allocation.&nbsp;<\/p>\n\n\n\n<p>A typical day as a business analyst involves conducting market analysis, assessing business processes, or developing strategies to address areas of improvement. They use a variety of tools and methodologies, like SWOT analysis, to evaluate business models and their integration with technology.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>Business analysts often have related degrees, such as BAs in Business Administration, Computer Science, or IT. 
Some roles might require or favor a master\u2019s degree, especially in more complex industries or corporate environments.<\/p>\n\n\n\n<p>Employers also value a business analyst\u2019s knowledge of project management principles like Agile or Scrum and the ability to think critically and make well-informed decisions.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>A business analyst can earn an average of <a href=\"https:\/\/www.indeed.com\/career\/business-analyst\/salaries\" target=\"_blank\" rel=\"noopener\">$84,435<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Database Administrator<\/h3>\n\n\n\n<p>The role of a database administrator is multifaceted. Their responsibilities include managing an organization&#8217;s database servers and application tools.&nbsp;<\/p>\n\n\n\n<p>A DBA manages, backs up, and secures the data, making sure the database is available to all the necessary users and is performing correctly. They are also responsible for setting up user accounts and regulating access to the database. DBAs need to stay updated with the latest trends in database management and seek ways to improve database performance and capacity. As such, they collaborate closely with IT and database programmers.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>Becoming a database administrator typically requires a solid educational foundation, such as a BA degree in data science-related fields. Nonetheless, it\u2019s not all about the degree because real-world skills matter a lot. Aspiring database administrators should learn database languages, with SQL being the key player. 
They should also get their hands dirty with popular database systems like Oracle and Microsoft SQL Server.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>Database administrators earn an average salary of <a href=\"https:\/\/www.indeed.com\/career\/database-administrator\/salaries\" target=\"_blank\" rel=\"noopener\">$77,391<\/a> annually.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Engineer<\/h3>\n\n\n\n<p>Successful data engineers construct and maintain the infrastructure that allows the data to flow seamlessly. Besides understanding data ecosystems day to day, they build and oversee the pipelines that gather data from various sources to make it more accessible for those who need to analyze it (e.g., data analysts).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>Data engineering is a role that demands not just technical expertise in tools like SQL, Python, and Hadoop but also a creative problem-solving approach to tackle the complex challenges of managing massive amounts of data efficiently.&nbsp;<\/p>\n\n\n\n<p>Usually, employers look for credentials like university degrees or advanced data science courses and bootcamps.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>Data engineers earn a whopping average salary of <a href=\"https:\/\/www.glassdoor.com\/Salaries\/data-engineer-salary-SRCH_KO0,13.htm\" target=\"_blank\" rel=\"noopener\">$125,180<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Database Architect<\/h3>\n\n\n\n<p>A database architect\u2019s main responsibility involves designing the entire blueprint of a data management system, much like an architect who sketches the plan for a building. They lay down the groundwork for an efficient and scalable data infrastructure.&nbsp;<\/p>\n\n\n\n<p>Their day-to-day work is a fascinating mix of big-picture thinking and intricate detail management. 
They decide how data is stored, consumed, integrated, and managed by different business systems.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>If you\u2019re aiming to excel as a database architect but don\u2019t necessarily want to pursue a degree, you could start honing your technical skills. Become proficient in database systems like MySQL or Oracle, and learn data modeling tools like ERwin. Don\u2019t forget programming languages such as SQL, Python, or Java.&nbsp;<\/p>\n\n\n\n<p>If you want to take it one step further, pursue a credential like the Certified Data Management Professional (CDMP) or the <a href=\"https:\/\/www.springboard.com\/courses\/data-science-career-track\/\">Data Science Bootcamp by Springboard<\/a>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>Database architecture is a very lucrative career. A database architect can earn an average of <a href=\"https:\/\/www.glassdoor.com\/Salaries\/data-architect-salary-SRCH_KO0,14.htm\" target=\"_blank\" rel=\"noopener\">$165,383<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Machine Learning Engineer<\/h3>\n\n\n\n<p>A machine learning engineer experiments with various machine learning models and algorithms, fine-tuning them for specific tasks like image recognition, natural language processing, or predictive analytics. Machine learning engineers also collaborate closely with data scientists and analysts to understand the requirements and limitations of data and translate these insights into solutions.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>As a rule of thumb, machine learning engineers must be proficient in programming languages like Python or Java, and be familiar with machine learning frameworks like TensorFlow or PyTorch. 
To successfully pursue this career, you can either pursue a degree or enroll in courses and follow a self-study approach.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>Depending heavily on the company&#8217;s size, machine learning engineers can earn between <a href=\"https:\/\/www.glassdoor.com\/Salaries\/machine-learning-engineer-salary-SRCH_KO0,25.htm\" target=\"_blank\" rel=\"noopener\">$125K and $187K<\/a> per year, making this one of the <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/careers-in-ai\/\">highest-paying AI careers<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Quantitative Analyst<\/h3>\n\n\n\n<p>Quantitative analysts are essential for financial institutions, where they apply mathematical and statistical methods to analyze financial markets and assess risks. They are the brains behind complex models that predict market trends, evaluate investment strategies, and assist in making informed financial decisions.&nbsp;<\/p>\n\n\n\n<p>They often deal with derivatives pricing, algorithmic trading, and risk management strategies, requiring a deep understanding of both finance and mathematics.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>This data science role demands strong analytical skills, proficiency in mathematics and statistics, and a good grasp of financial theory. It always helps if you come from a finance-related background.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>A quantitative analyst earns an average of <a href=\"https:\/\/www.glassdoor.com\/Salaries\/quantitative-analyst-salary-SRCH_KO0,20.htm\" target=\"_blank\" rel=\"noopener\">$173,307<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Mining Specialist<\/h3>\n\n\n\n<p>A data mining specialist uses their statistics and machine learning expertise to reveal patterns and insights that can solve problems. 
They sift through huge amounts of data, applying algorithms and data mining techniques to identify correlations and anomalies. In addition, data mining specialists help organizations predict future trends and behaviors.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>If you want to land a career in data mining, you should possess a degree or have a solid background in computer science, statistics, or a related field.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>Data mining specialists earn <a href=\"https:\/\/www.glassdoor.com\/Salaries\/data-mining-specialist-salary-SRCH_KO0,22.htm\" target=\"_blank\" rel=\"noopener\">$109,023<\/a> per year.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data Visualization Engineer<\/h3>\n\n\n\n<p>Data visualization engineers specialize in transforming data into visually appealing graphical representations, much like data storytellers. A big part of their day involves working with data analysts and business teams to understand the data\u2019s context.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">General Requirements<\/h4>\n\n\n\n<p>Data visualization engineers need a strong foundation in data analysis and proficiency in programming languages often used in data visualization, such as JavaScript, Python, or R. A valuable addition to their existing experience is some expertise in design principles, which helps them create clear visualizations.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Average Salary<\/h4>\n\n\n\n<p>The average annual pay of a data visualization engineer is <a href=\"https:\/\/www.glassdoor.com\/Salaries\/data-visualization-engineer-salary-SRCH_KO0,27.htm\" target=\"_blank\" rel=\"noopener\">$103,031<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Resources To Find Data Science Jobs<\/h2>\n\n\n\n<p>The key to finding a good data science job is knowing where to look without procrastinating. 
To make sure you leverage the right platforms, read on.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Job Boards<\/h3>\n\n\n\n<p>When hunting for data science jobs, both niche job boards and general ones can be treasure troves of opportunity.&nbsp;<\/p>\n\n\n\n<p>Niche boards are created specifically for data science and related fields, offering listings that cut through the noise of broader job markets. Meanwhile, general job boards can have hidden gems and opportunities.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Online Communities<\/h3>\n\n\n\n<p>Spend time on platforms like Slack, Discord, GitHub, or IndieHackers, as they are spaces to share knowledge, collaborate on projects, and find job openings posted by community members.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Network And LinkedIn<\/h3>\n\n\n\n<p>Don\u2019t forget about socials like LinkedIn or Twitter. The LinkedIn Jobs section, in particular, is a useful resource, offering a wide range of opportunities and the ability to directly reach out to hiring managers or apply for positions. Try not to rely on the \u201cEasy Apply\u201d option alone, as you\u2019ll be competing with thousands of other applicants.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">FAQs about Data Science Careers<\/h2>\n\n\n\n<p>We answer your most frequently asked questions.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Do I Need A Degree For Data Science?<\/h3>\n\n\n\n<p>A degree is not a set-in-stone requirement to <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/learn-data-science-without-degree\/\">become a data scientist<\/a>. It\u2019s true many data scientists hold a bachelor\u2019s or master\u2019s degree, but these just provide foundational knowledge. It\u2019s up to you to pursue further education through courses or bootcamps or work on projects that enhance your expertise. 
What matters most is your ability to demonstrate proficiency in data science concepts and tools.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Does Data Science Need Coding?<\/h3>\n\n\n\n<p>Yes. Coding is essential for data manipulation and analysis, especially knowledge of programming languages like Python and R.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Is Data Science A Lot Of Math?<\/h3>\n\n\n\n<p>It depends on the career you want to pursue. Data science involves quite a lot of math, particularly in areas like statistics, probability, and linear algebra.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What Skills Do You Need To Land an Entry-Level Data Science Position?<\/h3>\n\n\n\n<p>To land an entry-level job in data science, you should be proficient in several areas. As mentioned above, knowledge of programming languages is essential, and you should also have a good understanding of statistical analysis and machine learning. Soft skills are equally valuable, so make sure you\u2019re acing problem-solving, critical thinking, and effective communication.<\/p>\n\n\n\n<p class=\"rm has-background\" style=\"background-color:#efeff6\"><strong>Since you\u2019re here\u2026<\/strong>Are you interested in this career track? Investigate with our free guide to <a href=\"https:\/\/www.springboard.com\/blog\/data-science\/what-does-a-data-scientist-do\/\" data-type=\"post\" data-id=\"24427\">what a data professional <em>actually<\/em> does<\/a>. 
When you\u2019re ready to build a CV that will make hiring managers melt, join our <a href=\"https:\/\/www.springboard.com\/courses\/data-science-career-track\/\" data-type=\"URL\" data-id=\"https:\/\/www.springboard.com\/courses\/data-science-career-track\/\" target=\"_blank\" rel=\"noreferrer noopener\">Data Science Bootcamp<\/a> which will help you land a job or your tuition back!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The ggplot2 package, created by Hadley Wickham, provides a fast and efficient way to produce good-looking data visualizations that you can use to&nbsp;derive and communicate insights from your data sets. The package was designed to help you create all different types of data graphics in R, including histograms, scatter plots, bar charts, box plots, and [&hellip;]<\/p>\n","protected":false},"author":23,"featured_media":2601,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_eb_attr":"","_eb_data_table":"","footnotes":""},"categories":[67],"tags":[],"marketing_tags":[],"class_list":{"0":"post-2590","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-data-science"},"acf":[],"_links":{"self":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/2590"}],"collection":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/users\/23"}],"replies":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/comments?post=2590"}],"version-history":[{"count":3,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/2590\/revisions"}],"predecessor-version":[{"id":47763,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/posts\/2590\/revisions\/47763"}],"wp:featur
edmedia":[{"embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media\/2601"}],"wp:attachment":[{"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/media?parent=2590"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/categories?post=2590"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/tags?post=2590"},{"taxonomy":"marketing_tags","embeddable":true,"href":"https:\/\/www.springboard.com\/blog\/wp-json\/wp\/v2\/marketing_tags?post=2590"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}