You’ve developed the technical skills, you’re confident in your ability to wrangle data, and you’re ready to put your curiosity and creativity to work. So, how do you convince a hiring manager that you’re the data scientist for the job? Cue: the data science portfolio.
There’s a catch-22 that industry newcomers face: you need a job to gain experience, but no one will give you a job unless you can show that you already have experience. Fortunately, the world of data science has a workaround that helps both newcomers and industry veterans: the data science portfolio. Like other professional portfolios, it not only tells a hiring manager what you’ve worked on but also shows what you’re capable of. The best part? You don’t need to have an existing data science job to get started.
Like many professional portfolios, a data science portfolio helps establish credibility by showcasing the projects an applicant has worked on. Strong data science portfolios typically highlight an applicant’s technical skills, creativity in devising research questions, ability to analyze data and draw insights, willingness to collaborate with others, and ability to clearly communicate their findings to audiences that might not come from a technical background.
The data science portfolio offers a quick path to building trust with a hiring manager and proving that an applicant has what it takes to do the job. For industry newcomers, a strong portfolio can make up for a lack of employment experience or conventional higher education. For those who have existing industry experience, it’s an effective way for hiring managers to separate the wheat from the chaff.
While many data scientist portfolios showcase projects from current or previous jobs, the beauty of data science is that there is no shortage of public datasets available for newcomers to use for their own projects. In fact, getting creative with public datasets can be a great way to stand out from the competition. William Chen, a data science manager at Quora, said at Kaggle’s CareerCon that he likes seeing projects where people have taken the initiative to explore interesting datasets and find novel results. “I love projects where people show that they are interested in data in a way that goes beyond homework assignments,” Chen said.
If your data science skills are polished and you’re ready to make your mark on the industry, then it’s time to put together a portfolio that will persuade hiring managers that you’ve got what it takes. Below are a few steps that can help you build a strong portfolio.
Check job listings
To build a portfolio for the job you want, start by understanding the skills you will need to showcase in order to impress a hiring manager. Whether your dream job emphasizes machine learning, data cleaning, or data visualization tools, or requires its data scientists to be confident presenters and communicators, a job listing on LinkedIn or Glassdoor will typically make clear the skills and experience worth highlighting in your portfolio.
Generate project ideas
Most data science projects begin with a problem in need of solving or a question in need of answering, and end with a data scientist expertly teasing out actionable insights from troves of (usually) unruly data. When generating your own project ideas, it helps to think about the issues you’re curious about and how engaging with relevant datasets might shed light on a problem. You can also find inspiration in other people’s projects by visiting data science communities such as Towards Data Science, Kaggle, and GitHub.
Choose your messy dataset
You don’t need to be an industry insider to get your hands on messy datasets. Many online repositories offer free and open access to large amounts of public data spanning categories from agriculture and earth science to education and transportation. Some useful resources include:
Clean and analyze
Once you’ve chosen a research question and a dataset to work with, it’s time to clean up the data, which often means unifying multiple data files into one consistent format, and then analyze it to find an answer or significant insights. This step presents opportunities for you to showcase creative thinking and problem-solving by exploring multiple angles, finding supplemental data sources, and using compelling visualizations.
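As a rough illustration of what unifying and cleaning multiple data files can involve, here is a minimal Python sketch that uses only the standard library. The two CSV snippets, the column names, and the helper functions (`load_rows`, `clean_row`, `unify`) are all invented for this example. A real project would read actual files and would likely lean on a library such as pandas.

```python
import csv
import io

# Two made-up CSV "files" with inconsistent headers and messy values.
FILE_A = """City,Year,Riders
 boston ,2019,"1,200"
Chicago,2019,950
"""
FILE_B = """city_name,yr,ridership
Boston,2020,1100
chicago,2020,
"""

# Map each source's column names onto one shared schema (hypothetical names).
COLUMN_MAP = {
    "City": "city", "city_name": "city",
    "Year": "year", "yr": "year",
    "Riders": "riders", "ridership": "riders",
}


def load_rows(text):
    """Read one CSV source and rename its columns to the shared schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [{COLUMN_MAP[k]: v for k, v in row.items()} for row in reader]


def clean_row(row):
    """Normalize one row; return None if the rider count is missing."""
    riders = row["riders"].replace(",", "").strip()
    if not riders:
        return None
    return {
        "city": row["city"].strip().title(),
        "year": int(row["year"]),
        "riders": int(riders),
    }


def unify(*sources):
    """Combine all sources into one list of cleaned rows."""
    cleaned = []
    for text in sources:
        for row in load_rows(text):
            result = clean_row(row)
            if result is not None:
                cleaned.append(result)
    return cleaned


rows = unify(FILE_A, FILE_B)
for row in rows:
    print(row)
```

Dropping rows with missing values is only one possible choice. Whatever you decide, documenting the decision and the reasoning behind it is part of what makes a cleaning project persuasive.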
Make a good impression
Clear, concise, and comprehensive documentation is key when it comes to putting together an impressive portfolio. It’s a good idea to have several projects that showcase the breadth of your skills and interests, each hosted somewhere accessible, such as GitHub or your own blog or website, where the code is visible, the process is well documented, and the project is put in context.
As you acquire additional data science skills, work on compelling and increasingly complex projects, and solve new problems, make sure your portfolio evolves alongside you. A portfolio is a calling card; to stay competitive, keep it up to date.
The goal of your portfolio is to show off your range of skills, so it’s important to include case studies that highlight your different capabilities. Below are a few common project types that can make for a robust portfolio.
Data Cleaning Project
Cleaning up messy datasets is a big part of a data scientist’s job. A data cleaning project, in which you find a messy dataset, identify an interesting question or angle to explore, clean up the data, and then perform basic analysis to find insights, will show a hiring manager that you’ve got a strong technical foundation.
Data Storytelling Project
A natural extension of the data cleaning project, the data storytelling project is focused on the actionable insights gleaned from making sense of a dataset. It involves finding connections and correlations within the data and determining how they fit into a narrative. For example, you could use data on high school graduation rates to show the effect of funding cuts, which could lead to recommendations for changes in education spending.
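To make the idea of finding correlations concrete, here is a small Python sketch that computes a Pearson correlation coefficient between school funding and graduation rates. The numbers are made up purely for illustration, and `pearson_r` is a hypothetical helper written for this example, not part of any particular library.

```python
from math import sqrt

# Made-up numbers for illustration only: per-student funding (in thousands
# of dollars) and graduation rate (percent) for six hypothetical districts.
funding = [8.0, 9.5, 10.0, 11.5, 12.0, 14.0]
grad_rate = [71.0, 74.0, 78.0, 80.0, 83.0, 88.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(funding, grad_rate)
print(f"Correlation between funding and graduation rate: {r:.2f}")
```

A strong correlation is only the starting point for the story. It does not show on its own that funding changes caused the change in graduation rates, so a convincing narrative should also address other possible explanations.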
Machine Learning Project
Also known as an end-to-end project, a machine learning project is an opportunity to show off technical prowess because it requires you to build operational systems and algorithms that can accept data inputs and generate an output. It is also a chance to show that you understand how a system or algorithm might be used by a business or organization, and that you can build models and tools with an organization’s goals in mind.
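As a very small sketch of a system that accepts data inputs and generates an output, the Python example below implements a k-nearest-neighbors classifier with the standard library. The customer-churn framing, the training data, and the function names are all assumptions made for illustration. A portfolio project would use a real dataset and most likely an established library such as scikit-learn.

```python
from collections import Counter
from math import dist

# Made-up training data: (hours_active_per_week, support_tickets) -> label.
TRAINING = [
    ((1.0, 4.0), "churn"),
    ((2.0, 5.0), "churn"),
    ((1.5, 3.0), "churn"),
    ((8.0, 1.0), "stay"),
    ((9.0, 0.0), "stay"),
    ((7.5, 2.0), "stay"),
]


def predict(features, training=TRAINING, k=3):
    """Label a new point by majority vote among its k nearest neighbors."""
    nearest = sorted(training, key=lambda item: dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(examples, training=TRAINING, k=3):
    """Fraction of held-out examples the model labels correctly."""
    correct = sum(predict(x, training, k) == y for x, y in examples)
    return correct / len(examples)


# A tiny held-out set, also made up, used to check the model on unseen data.
HELD_OUT = [((1.2, 4.5), "churn"), ((8.5, 0.5), "stay")]

print(predict((2.0, 4.0)))
print(accuracy(HELD_OUT))
```

Even in a toy example like this, evaluating on examples the model has not seen and tying the prediction to a decision an organization cares about (here, which customers might leave) mirrors what a hiring manager will look for in a full project.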
Data Science Blogging
Another way to establish credibility and show a hiring manager that you know what you’re talking about is to write explanatory posts, either on your own professional blog or on a platform that data scientists frequent. In an interview on the Mode Analytics Blog, David Robinson, chief data scientist at DataCamp, said blogging can show that you understand and can communicate complex concepts. “The most effective strategy for me was doing public work,” Robinson said. “I blogged and did a lot of open source development late in my PhD, and these helped give public evidence of my data science skills.”