Thanks to the internet, aspiring data scientists can easily learn key concepts from elite universities. For instance, students who complete our Data Analysis learning path will take MIT’s algorithms course and Stanford’s machine learning course. They’ll listen to the same lectures, complete the same assignments, and ultimately learn the same concepts.
The experience, of course, isn’t exactly the same. Most online courses don’t give direct access to professors and don’t provide the unstructured learning that plays a major role in university life. Students studying statistics or computer science at a university have their core coursework complemented by optional seminars, interactions with their professors, and the occasional late-night inebriated conversation with peers. Through these casual interactions, they gain exposure to an eclectic mix of topics ranging from the history and culture of the field to the newest techniques being developed by researchers. The result is a holistic understanding that guides decisions about future coursework and careers.
Although it’s difficult to replace in-person interactions, we’ve found ample online resources that provide the same type of learning by osmosis. The podcasts we’ve listed below will keep you up to date on the topics being discussed by top data scientists, ranging from cutting-edge techniques to allegations of cheating on major contests. If you follow the Twitter accounts we’ve listed, you’ll get a glimpse of what those data scientists are thinking about day to day, while the newsletters we recommend are the same ones that they’re reading.
No matter how much or how little you know about data science, you should listen to Talking Machines. Each episode starts with an overview of a powerful concept like Markov Chain Monte Carlo or collaborative filtering. The majority of each episode is then devoted to an interview with a prominent data scientist like Andrew Ng or Kevin Murphy. The podcast is hosted by Harvard professor Ryan Adams and journalist Katherine Gorman. Ryan in particular has a remarkable ability to explain complicated ideas in simple terms while still keeping the podcast interesting for advanced listeners. Listen here.
To continue the university metaphor, Partially Derivative is a late-night conversation with slightly inebriated friends. The comparison is particularly apt as the hosts start each episode by telling the audience what beers they’re drinking. The show self-describes as “the show about data, data science, drinking, and awesomeness,” and there’s plenty of laughter as well. The podcast started as an overview of recent data-related articles, but the hosts have since added interviews with guests. They focus on cool applications of data science, like predicting which Game of Thrones characters will die or understanding what makes Indian food taste good. Despite the light-hearted banter, the hosts are serious data scientists and give real insight into the articles they cover. Listen here.
Udacity’s data science podcast is structured as a conversation between hosts Katie Malone and Ben Jaffe. The episodes, which tend to be brief and entertaining, cover a variety of topics ranging from neural nets to careers in data science. Overall, this is the most accessible podcast we’ve encountered. Listen here.
If we were to compare our podcasts to experiences at a university, the O’Reilly Data Show is equivalent to sitting in on a graduate-level seminar. The target audience is practicing data scientists and machine learning researchers rather than aspiring data scientists, and episodes have titles like “The tensor renaissance in data science” or “Coming full circle with Bigtable and HBase.” There’s a lot to learn in each episode, but newcomers to data science may have to put in some effort to understand what’s being discussed. Listen here.
Twitter is roughly the equivalent of eavesdropping in the faculty lounge. You might not understand everything, but you get to hear what data scientists are thinking about day to day and you’ll learn a lot along the way. We’ve focused on selecting accounts that are active and relevant, which means that we’ve excluded some important data scientists whose Twitter accounts are inactive.
Data Elixir is a beautifully curated newsletter. Each edition of the weekly newsletter typically has two or three items in each of several categories: news articles, tools and techniques, resources, jobs, and data visualization. The newsletter’s content and website both feel very clean and uncluttered; the articles have clearly been carefully selected. The curator, Lon Riesberg, has a knack for finding good articles that haven’t been covered as well in other places. Subscribe here.
Data Science Weekly has a similar feel to Data Elixir with selected items grouped by category. Like Data Elixir, it contains a mix of news articles and more technical resources and is well curated. It also provides a good mix of articles from around the internet, rather than just the best articles from a single site. Subscribe here.
KDnuggets has a twice-weekly newsletter that summarizes some of the best content from the high volume of articles published on its website. It’s less aggressively curated than Data Elixir or Data Science Weekly, but it’s still a must-read. Subscribe here.
Like O’Reilly’s podcast, the O’Reilly Data Newsletter is oriented towards professionals and big data practitioners. It’s a bit more accessible than the podcast, though, in part because readers can pick and choose which articles are most interesting to them. The newsletter typically includes 10-12 articles each week, and it’s arguably the most influential newsletter in the data science community. Subscribe here.