Data science is a great profession—but it can also be a lonely one. For many newly qualified data scientists, the first few years out of college are a real challenge. Without the camaraderie and friendship provided by your peers, it can quickly begin to feel isolating.
However, data science is actually an inherently social profession. Part of a data scientist’s role is to work alongside internal and external stakeholders to deliver key insights; outside of that, many data scientists like to stay connected to their peers working in similar roles as a way to discover new ideas and methodologies.
One of the best ways data scientists network with each other is through data science communities. In this post, we’ll run through the most popular data science communities and show you how you can get involved.
Technically, Kaggle is a coding tool rather than a community per se, but the Google-owned software is so popular among data scientists that it now boasts one of the largest data science communities in the world.
Kaggle was built mainly as a tool for assembling and evaluating teams across competitions. It allows you to find and publish datasets, explore and build models in a web-based environment, and share your work with other scientists and engineers. You can even enter competitions directly from the software, which offers over 50,000 public datasets and 400,000 public notebooks to help you tackle even the most complex challenges.
Another extremely useful feature of Kaggle is that it provides free access to NVIDIA K80 GPUs in kernels. This benchmark shows that enabling a GPU for your kernel results in a 12.5X speedup during the training of a deep learning model. There are loads of great resources on how to start using Kaggle in this way, including this guide.
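To put that 12.5X figure in perspective, here is a quick back-of-the-envelope sketch in Python. The function name and the 100-minute CPU training time are made up for illustration; only the 12.5X factor comes from the benchmark mentioned above, and the speedup you actually see will depend on your model and data.

```python
def gpu_training_time(cpu_minutes: float, speedup: float = 12.5) -> float:
    """Estimate GPU training time from a CPU training time and a speedup factor.

    The default of 12.5 is the figure quoted from the benchmark; treat it as
    an assumption, not a guarantee for your own workload.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return cpu_minutes / speedup


# Hypothetical example: a model that takes 100 minutes to train on a CPU.
cpu_minutes = 100.0
gpu_minutes = gpu_training_time(cpu_minutes)
print(f"CPU: {cpu_minutes:.0f} min, GPU: {gpu_minutes:.0f} min")
```

In other words, at that rate a training run that would eat up most of an afternoon on a CPU could finish in a coffee break.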
Best of all, the Kaggle community now has more than 3 million active members who can use Kaggle to share their work and expertise.
It might look a bit dated in 2020, but the IBM Data Science Community web pages are one of the best sources around for expert-level insight into today’s pressing data science challenges. This community has been around for far longer than many of the others on this list, and that means that some of the legends of the industry can be found blogging, podcasting, and even answering direct questions.
In short, the IBM Data Science Community is a great place to visit if you are seeking specific guidance from an industry pro. It’s a little less newbie-friendly than some of the other communities here, but that’s not necessarily a bad thing.
Reddit might seem like a strange choice if you are looking for a professional network, being better known for sharing memes than addressing complex scientific questions. However, in some cases, it’s actually better to use social networks rather than professional ones, because you can communicate more freely with your peers.
There are plenty of resources for data scientists on a number of key subreddits, such as r/datascience, r/dataisbeautiful, and r/MachineLearning. r/dataisbeautiful is a largely visual subreddit; r/datascience is more introductory; and r/MachineLearning is more for hardcore discussion of papers. The advantage of discussing your problems on an anonymous network is that your employer will never know what you actually think of their new project.
Open Data Science is a community organized around particular, high-level projects. This site aims to act as a bridge between people working in all the different sub-areas of data science: not just engineers and scientists, but also developers and students. Members of the community can suggest and create projects, which are as diverse as the members themselves, and then invite others to work collaboratively toward a solution.
Data Science Central is arguably the largest data science community out there, at least in terms of the raw number of contributors it attracts. The bar to entry is low and nearly anybody can post, which means content quality varies. This is the place to go if you are looking to get on top of the latest trends in the industry or to hear about new jobs before they are officially announced.
Data Science Central is based around a forum system, but also includes an editorial platform for experts to share their knowledge via personal blogs. There is also a pretty extensive suite of social interaction tools to make connecting with peers easy.
Data Community DC (DC2 for short) is run on a slightly different model from the other communities on this list. Instead of being a purely volunteer-run affair, DC2 is a non-profit, based in Washington DC, that aims to promote and progress the work of data scientists in the US.
To this end, DC2 offers a range of services to contributors. The core aim of the community is to encourage education in data science, so it has put established professionals in touch with local schools, where they participate in events that are designed to inspire the next generation of data scientists. The community also has a lot to offer those outside the capital, though: DC2 has six meetup groups with over 5,000 unique members, a board of 12 people, a blog, occasional workshops, and plans for bigger events in the future.
Stack Exchange often gets forgotten about by data scientists, who tend to see it as a place for developers to discuss the intricacies of new web frameworks. In reality, however, the community is now so big, and so diverse, that it covers an enormous range of tech subjects, data science included.
In addition, in 2020 most data scientists are also coders. And if you have a question about coding, need someone to show you how to achieve something, or even want to hire a developer to build some software for you, Stack Exchange should be your first port of call.
The Data Science Society is an initiative of graduate students at Berkeley and is one of the most exciting data science communities around at the moment. This community has taken the lead in encouraging minorities and women to train in data science. This is the place to go if you are looking to find people who share your values and politics, while also sharing the technical and scientific skills needed to make a difference.
Driven Data is focused on using data science to create a better world. This community puts a number of key stakeholders in touch with each other: not just data scientists and coders, but also activists and lawyers, all of whom are working toward socially progressive goals.
It’s never been more important to be part of the data science community, whether this is to generate new project ideas, improve your skills, or simply to stay sane in these strange times.
This post was written by Brian Skewes.