We interviewed Akshay Mahajan, a Data Scientist at Branch, to discuss his career, a typical workday at Branch, and his recommendations for learning Data Science.
The full video Q&A is below, and here are some of the highlights.
How did you get started in data science?
I went to UC San Diego and studied neuroscience, and for the first two years of college I had a really intense interest in finance. So I was doing internships in private equity, investment banking and the like, and saw an opportunity there to speed up my workload with scripting. So I started dipping my toes into Python and, come my junior year, decided to make the full switch into data science. I got a data science internship at PG&E, the Northern California utility, and worked on some really interesting projects there. After that I came back to school and did some software engineering internships at Learning Quality, an ed-tech startup down in San Diego, and realized that a good blend of all this would be product management: somebody who’d done finance, somebody who’d done data science, and who’d studied design. Blending the three, I applied to Branch as a product manager and joined. Within a month I realized there was such a need for data science here that I transitioned to data science.
What does Branch do?
Branch is a unicorn mobile startup. I’ll give you all the buzzwords: it is a unicorn mobile startup focused on mobile linking, mobile attribution, and mobile search. To break each of the three down: mobile linking is if you’ve ever engaged with an Airbnb email on your phone where you see homes near you in San Francisco. If you click on that and are routed directly into the app, that’s Branch in the backend. So for all these top 20,000 or 30,000 apps, Branch is powering the underlying linking infrastructure. The second, mobile attribution, is for the mobile marketers who are deploying these campaigns, whether the email campaigns I just mentioned or the ads you see on Facebook. With Branch, those marketers can get a unified sense of how their campaigns are performing across all channels and platforms.
And so we give them a dashboard that lets them do this. The third aspect is mobile search. Throughout this entire process of building a linking infrastructure, Branch has been able to index in-app content. That means that, the same way Google is able to do PageRank across the web, Branch is able to gain a sense of the popularity of in-app content and distribute that for a sort of Spotlight-style search for Android. That’s the current product offering we’re moving toward.
What is a typical workday like?
My typical workday is usually one of three things. It can be me focused on data pipelines, making sure that I have the underlying infrastructure I need to get the analytics that I want. The second thing I could be doing is scripting, say writing something in Python, be it pandas, PySpark, or something like that, with a general goal in mind: let me answer this business question, or build out a notebook that I can share with my team so they can answer these questions. The third thing is dashboarding and reporting, be that building a data model in Looker or dashboarding in Tableau. I find myself blurring those three lines, with the interaction with business teams and coworkers mixed in throughout.
What’s your favorite and least favorite thing to do as a data scientist?
Let’s start with the bad news first. My least favorite is going to be writing the pipelines, as I mentioned. Sometimes I wish the analytics infrastructure was already available to me rather than me having to build it. Though I am a firm believer that all data scientists should be able to write their own ETL, because it gives you more respect for where your data comes from and helps you understand the data quality. My favorite thing is probably sharing insights with people on the business teams and the product teams. When you become a data scientist, you’ll realize there’s a real opportunity to show value from your own company’s data to your nontechnical teams. And seeing them ponder and strategize on the data that you’ve provided and come out with actionable insight is just a great feeling, knowing that you were involved.
What’s one essential tool that you can’t live without?
Pandas, hands down. When I’m working with smaller datasets, I honestly sometimes forgo Excel and just use pandas. That has become my go-to, but at Branch we’re often working with larger data volumes, so I end up using the PySpark API: load up a Spark cluster and then spin up a PySpark notebook. That tends to be my normal workflow. But when I get the chance to use pandas, I think having one unified API for data loading, manipulation, and visualization, all with one-line functions, is just great.
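To make the "one unified API" point concrete, here is a minimal sketch of loading and manipulating a small dataset with pandas. The campaign data and column names are made up for illustration and are not Branch data.

```python
import io

import pandas as pd

# Hypothetical campaign data, invented purely for this example.
RAW_CSV = """channel,clicks,installs
email,120,30
ads,200,20
email,80,10
ads,100,25
"""

# Loading: one call turns CSV text (or a file path) into a DataFrame.
df = pd.read_csv(io.StringIO(RAW_CSV))

# Manipulation: total clicks and installs per channel, plus a derived rate.
summary = df.groupby("channel")[["clicks", "installs"]].sum()
summary["install_rate"] = summary["installs"] / summary["clicks"]

print(summary)

# Visualization is similarly a single call, e.g. summary["install_rate"].plot.bar(),
# but it requires matplotlib to be installed, so it is left out of this sketch.
```

The same groupby-and-aggregate shape carries over to PySpark for larger volumes, although the exact method names differ.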
What types of things do hiring managers look for when hiring data scientists at Branch?
I think hiring data scientists across the industry is generally a pretty vague and nascent area, right? But what we’ve realized here at Branch is that we have a need for a kind of full-stack data scientist. Obviously every company looks for this, but what you want is somebody who has good analyst skills, can look into data, dive deep into patterns, and share insights with the business. You also want someone who can write their own ETL, right? Someone who knows Python and SQL and functional programming in general. For certain data science roles, we want people with a pure focus, say NLP-focused people for our search team, but across the business we want people who slant more towards having a good sense of engineering and analytics capabilities.
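As a rough illustration of what "writing your own ETL" with Python and SQL can look like at the smallest scale, here is a sketch using only the standard library. The event data, table name, and cleaning rules are hypothetical and do not describe Branch’s actual pipelines.

```python
import sqlite3

# Extract: hypothetical raw click events with messy channel names and a bad row.
RAW_EVENTS = [
    {"campaign": "email_sf_homes", "channel": "Email ", "clicks": "12"},
    {"campaign": "fb_spring", "channel": "facebook", "clicks": "7"},
    {"campaign": "email_sf_homes", "channel": "email", "clicks": "5"},
    {"campaign": "fb_spring", "channel": "Facebook", "clicks": None},
]


def transform(rows):
    """Normalize channel names, cast clicks to int, and skip rows with no count."""
    for row in rows:
        if row["clicks"] is None:
            continue  # a data-quality decision: drop rather than guess
        yield (row["campaign"], row["channel"].strip().lower(), int(row["clicks"]))


def run_etl(rows):
    """Load the cleaned rows into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (campaign TEXT, channel TEXT, clicks INTEGER)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?, ?)", transform(rows))
    conn.commit()
    return conn


conn = run_etl(RAW_EVENTS)

# Once loaded, the business question becomes a plain SQL query.
totals = conn.execute(
    "SELECT channel, SUM(clicks) FROM clicks GROUP BY channel ORDER BY channel"
).fetchall()
print(totals)
```

Even in a toy version like this, the transform step forces a decision about what to do with a missing click count, which is the kind of data-quality awareness the answer above describes.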
What’s some tactical advice you have for someone who’s looking to break into data science?
There are two major pathways I see. One is self-learning, which is YouTube. YouTube is the greatest resource I have found for anything data science-related. There are channels that will summarize research papers in two minutes. There are channels that focus on data science content and how to become a data scientist, like Springboard. So there’s a lot you can learn about the industry and the roles there. But when it comes down to learning how to solve practical business problems with programming languages, which is the core job of a data scientist, I think using Kaggle, or building your own portfolio by working with dummy datasets and publishing that on the web, is the best way to get your name out there. It shows that you are not somebody who just learns things from a research perspective: you’re somebody who applies them practically, who has solved these problems and made the work available on GitHub and the public web for people to audit.
I think the second option is signing up for a bootcamp. If you really are in a place in your life where you don’t have access to the opportunities you want, a bootcamp is a great place to restart, and you get to learn the curriculum inside and out. From what I’ve seen at Branch, we hire a bunch of people from bootcamps. The people there are experts. I think that is a great place to re-skill and get a start in data science if you don’t know where to start.