Compete, practice, and earn with these 3 competition platforms

Photo by Serghei Trofimov on Unsplash

One of the most important aspects of learning data science is actually putting your learnings into practice. Data science is, in my opinion, something that is best learned by doing.

If you are learning data science outside of formal education or work experience, it can be tricky to find places to practice the skills that you are learning. Machine learning competitions offer one very effective platform for the practical application of data science techniques.

Participating in data science competitions can help you to learn, earn money and can also provide a portfolio of projects that could help you to land your…


…and what you should do instead

Photo by Nadine Shaabana on Unsplash

As the job of Data Scientist has become increasingly popular over the last few years, helped by being crowned “sexiest job of the 21st century”, Master's degree courses specialising in data science have sprung up. The cost of these courses can range anywhere from $30,000 to $100,000, not including the cost of living whilst you take the time out to study.

My own route into data science did not include any formal education. I taught myself the skills I needed through self-study using free or very low-cost resources widely available online. …


Hands-on Tutorials

How to add robustness to your notebook code

Photo by Kevin Ku on Unsplash

Jupyter notebooks have a somewhat poor reputation in the wider programming community. Joel Grus’ famous “I don’t like notebooks” talk, which he bravely gave at JupyterCon in 2018, covered many of the reasons why. Typically notebooks are seen to promote poor coding practices as they can be difficult to version control, often rely on cells being run in a specific order to return correct results and can be tricky to test and debug.

“I don’t like notebooks”, by Joel Grus

However, Jupyter notebooks are still the number one choice for most data scientists when it comes to performing tasks such…


Learn NLP for free with these fantastic resources

Photo by Leonardo Toshiro Okubo on Unsplash

Natural language processing, or NLP, combines the fields of linguistics, computer science and artificial intelligence to enable machines to extract meaning and insights, and to make predictions, from text data.

The majority of modern organisations collect large quantities of text data from their customers and business operations in the form of emails, chats, phone calls and many other interactions. This data can be invaluable in developing a deeper understanding of the problems that customers are facing, and insights from this data can help to solve some of these problems as well as automate and optimise customer facing business processes.

It is therefore…


Getting started with data analysis using this useful tool

Photo by Camylla Battani on Unsplash

Data analysis is fundamentally about finding answers to questions with data. When we perform some calculation or compute a statistic for a set of data it is usually not enough to do that across the entire dataset. Instead we will usually want to split the data into groups, perform the computation and then compare the results across different groups.
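The split, compute and compare pattern described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, since the tool the title refers to is not named in this excerpt. The session records, the channel names and the conversion_rate_by_group helper are all hypothetical and invented for the example.

```python
from collections import defaultdict

# Hypothetical session records: (channel, converted) pairs invented for illustration.
sessions = [
    ("email", True),
    ("email", False),
    ("search", True),
    ("search", True),
    ("search", False),
    ("search", False),
    ("social", False),
    ("social", False),
]


def conversion_rate_by_group(records):
    """Split records by group key, then compute the conversion rate per group."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for group, converted in records:
        totals[group] += 1
        conversions[group] += int(converted)
    return {group: conversions[group] / totals[group] for group in totals}


# The single overall figure hides the differences between groups.
overall_rate = sum(converted for _, converted in sessions) / len(sessions)

# Computing the same statistic per group makes the comparison possible.
rates = conversion_rate_by_group(sessions)

print(f"overall: {overall_rate:.3f}")
for channel, rate in sorted(rates.items()):
    print(f"{channel}: {rate:.3f}")
```

Comparing the per-group rates against the overall rate is what shows whether one group is pulling the overall number down.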


Let’s say we were a digital marketing team investigating the potential reasons behind a recent decline in conversion rate. Looking at conversion rate as a whole over time would be…


Introducing a curated list of free resources for learning data science

Photo by vnwayne fan on Unsplash

Over the last few years I have written several articles about learning data science using online resources. During my own learning journey I have identified some of the best free or low-cost material available for learning data science.

I recently spent some time consolidating this list into this GitHub repository so that it can be used as a quick reference for anyone who wants to expand their data science skills. …


Getting Started

… explained in plain English

Photo by ThisisEngineering RAEng on Unsplash

Statistics is “a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data”. Throw programming and machine learning into the mix and you have a pretty good description of the core skills for data science.

Statistics is used in almost all aspects of data science. It is used to analyse, transform and clean data, and to evaluate and optimise machine learning algorithms. It is also used in the presentation of insights and findings.

The field of statistics is extremely broad and determining what exactly you need to learn and in what order can be difficult. Additionally…


What are they, what are the options and why do we need them?

Photo by Lewis Ngugi on Unsplash

The Python programming language has many different versions. Similarly, Python libraries have multiple versions and work with specific versions of Python, and most of them depend on other packages to run. These required packages are known as a set of dependencies.

Every data science project that you undertake is likely to require its own unique set of third-party Python packages. Virtual environments act as self-contained environments encapsulating the Python version and all dependencies for a project. Creating a new virtual environment is one of the first steps that is usually taken when starting any new data science project.
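As a rough sketch of what creating such an environment involves, the example below uses Python's built-in venv module, which is one of several available options and not necessarily the one this article goes on to recommend. The create_project_env helper and the ".venv" folder name are assumptions made for illustration, and the environment is built in a temporary directory so that nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create a virtual environment in a hypothetical '.venv' folder of the project."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps the sketch fast; pass with_pip=True to bootstrap pip as well.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Build the environment inside a throwaway directory for demonstration purposes.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_env(Path(tmp))
    # pyvenv.cfg records which Python installation the environment is based on.
    config_text = (env_dir / "pyvenv.cfg").read_text()

print(config_text)
```

The pyvenv.cfg file is what ties the environment to a specific Python version. Packages installed while the environment is active are stored inside its own folder and not in the system-wide installation.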

Creating a new…


Learn all the statistics you need for data science for free

Photo by Daniel Schludi on Unsplash

Statistics is a fundamental skill that data scientists use every day. It is the branch of mathematics that allows us to collect, describe, interpret, visualise, and make inferences about data. Data scientists will use it for data analysis, experiment design, and statistical modelling.

Statistics is also essential for machine learning. We will use statistics to understand the data prior to training a model. When we take samples of data for training and testing our models we need to employ statistical techniques to ensure fairness. …


and why the data science generalist will triumph

Photo by Markus Winkler on Unsplash

When I started learning data science a few years ago, most job ads requested a PhD, or at the very least a master's, in maths, statistics or a similar subject as an essential requirement.

Over the last couple of years, things have evolved, driven by the development of machine learning libraries that abstract away much of the complexity behind the algorithms, and by a realisation that practically applying machine learning to solve business problems requires a set of skills that are not usually acquired through academic study alone. …

Rebecca Vickery

Data Scientist | Writer, Speaker, Founder DatAcademy | www.rebecca-vickery.com | www.linkedin.com/in/rebecca-vickery
