What is Machine Learning? Going beyond the buzzwords.
Machine learning is the new linear regression. But what is machine learning, and how can it help fuel success in your business? We talk to Jacqueline to find out a little more.
Hey Jacqueline, can you tell us a bit about yourself?
After finishing my master’s in Econometrics, I started working as a consultant at PwC. After a while, I missed the university and decided to do a PhD in a slightly different field to broaden my horizons. My focus was on evolutionary algorithms, an optimization technique inspired by natural evolution. I happened to apply it to robotics, but the technique is more general than that. In my free time, I work out (tennis and fitness) and I love to take all kinds of courses, such as cooking, wine, photography, guitar, diving, etc.
So, let’s get straight to it, what is machine learning?
Machine learning techniques all have the same goal: to explain patterns in data and, if necessary, predict them. Recommender systems such as those of booking.com and bol.com, for example, try to derive from data which hotel or product they can best recommend in order to increase the chances of loyal customers and further purchases. Chatbots create relevant answers and suggestions by categorizing the text that the user provides.
What about Artificial intelligence, how is this different from machine learning?
Chatbots try to categorize the text provided by different users so that they can give a relevant answer. Artificial intelligence is when these machine learning systems can make useful recommendations, or even have what feels like a genuine conversation.
Should every business be using machine learning techniques?
Well, that depends on the end goal of the business. The fact that an organisation has data does not necessarily mean that machine learning techniques will contribute something. Often, visualising the data is already enough to improve business operations, simply because data gives us insight into the current state of affairs. A dashboard with visualisations or a simple data analysis is all that is needed for this. This may not be machine learning, but it can often be a huge boost and helping hand for business operations.
So, what is then the role of a data scientist?
It starts with collecting the available data and carefully checking and visualizing it. In interactive sessions with the client, questions will arise, and possibly other data sources will turn up that can be used in the next steps. At this stage the work is data visualisation, and the “scientist” in data scientist is not really present. That is why the role is often also called data analyst.
When a business wants to explain certain measurement data (when is a marketing campaign successful?) and, as a next step, wants to predict (how many new customers will this campaign lead to?), we can make use of machine learning techniques. The role then revolves more around data science.
How do existing machine learning techniques differ?
The only way to answer this question is to dive a little deeper into how machine learning works. If you already have a good understanding of neural networks, then you’ll know that linear regression is the same as a neural network with no hidden layers and no activation function. We can conclude that regression is a machine learning technique. But newer techniques within machine learning, such as deep learning and random forests, work a little differently from the more traditional models.
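To make the link between regression and neural networks concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not part of the interview: the toy data, the function names, the learning rate and the number of epochs are all made up. It fits the same straight line twice, once with the closed-form least squares formula and once as a single "neuron" (one input, one output, no hidden layer, no activation function) trained by gradient descent.

```python
# Toy data, made up for illustration: y is roughly 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def fit_ols(xs, ys):
    """Closed-form least squares fit of y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def fit_single_neuron(xs, ys, lr=0.05, epochs=5000):
    """Network with no hidden layer and no activation function:
    the output is w*x + b, trained by gradient descent on the
    mean squared error. lr and epochs are assumed values."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w_ols, b_ols = fit_ols(xs, ys)
w_nn, b_nn = fit_single_neuron(xs, ys)
print("least squares:", w_ols, b_ols)
print("single neuron:", w_nn, b_nn)
```

On this toy data both approaches should arrive at practically the same slope and intercept, because they minimise the same squared error. Only the route to the answer differs.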
The more traditional regression models assume certain statistical properties of the underlying data. This allows statements to be made about the significance of the coefficients and therefore about the correctness of the model. The resulting model is easy to understand, but a lot of work (not to mention time) is required to make sure these assumptions are met.
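As a rough sketch of what a statement about the significance of a coefficient looks like, the Python snippet below computes the slope of a simple regression together with its standard error and t-statistic. The data and the function name are made up for illustration. The calculation is only meaningful under the classical assumptions (independent errors with constant variance, roughly normally distributed), which is exactly the checking work described above.

```python
import math

# Toy data, made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def slope_with_t_statistic(xs, ys):
    """Fit y = slope*x + intercept by least squares and return
    the slope, its standard error and its t-statistic."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom
    # (two parameters were estimated: slope and intercept).
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    residual_variance = sum(r * r for r in residuals) / (n - 2)

    std_error = math.sqrt(residual_variance / sxx)
    t_statistic = slope / std_error
    return slope, std_error, t_statistic


slope, std_error, t_statistic = slope_with_t_statistic(xs, ys)
print("slope:", slope)
print("standard error:", std_error)
print("t-statistic:", t_statistic)
```

A t-statistic that is large in absolute value suggests that the slope differs from zero. How large is large enough depends on the degrees of freedom and the chosen significance level.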
Newer techniques such as random forests do pay attention to the importance of the different variables, but they focus more on the predictive power of the final model. For this reason, the model is trained on a training set and then tested on a test set, that is, data the model has not seen before. In contrast to a more understandable regression model, a random forest produces a more complicated model that is not easy to interpret. An advantage of these techniques is that they can be implemented very quickly using the many available open source libraries.
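The train-and-test idea can be sketched in a few lines of plain Python. This is a heavily simplified illustration, not a real random forest: it averages a small ensemble of one-split trees ("decision stumps"), each fitted on a bootstrap resample of the training data. The data, the function names, the 150/50 split and the number of trees are all assumptions made for this example. In practice one would normally use an open source library instead.

```python
import random


def make_data(n, rng):
    """Made-up data: y jumps from about 1 to about 5 at x = 5."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = (5.0 if x > 5 else 1.0) + rng.gauss(0, 0.3)
        data.append((x, y))
    return data


def fit_stump(sample):
    """Find the single split on x with the lowest squared error."""
    pts = sorted(sample)
    best = None
    for i in range(1, len(pts)):
        if pts[i - 1][0] == pts[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pts[:i]]
        right = [y for _, y in pts[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        err = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if best is None or err < best[0]:
            threshold = (pts[i - 1][0] + pts[i][0]) / 2
            best = (err, threshold, mean_l, mean_r)
    return best[1:]


def fit_ensemble(train, n_trees, rng):
    """Bagging: fit each stump on a bootstrap resample."""
    return [fit_stump([rng.choice(train) for _ in train])
            for _ in range(n_trees)]


def predict(ensemble, x):
    """Average the predictions of all stumps."""
    votes = [ml if x <= t else mr for t, ml, mr in ensemble]
    return sum(votes) / len(votes)


def mse(ensemble, data):
    """Mean squared error of the ensemble on the given data."""
    return sum((predict(ensemble, x) - y) ** 2
               for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(200, rng)
rng.shuffle(data)
train, test = data[:150], data[150:]  # the test set is never used for fitting

ensemble = fit_ensemble(train, n_trees=25, rng=rng)
train_mse = mse(ensemble, train)
test_mse = mse(ensemble, test)
print("training error:", train_mse)
print("test error:", test_mse)
```

The error on the test set is the number that matters here: it estimates how well the model predicts data it has not seen, which is what these techniques optimise for.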
Working with a data scientist
If this interview has sparked your interest in what machine learning can do for your business, we’d advise you to work with a freelance data scientist who can do a quick scan. The data scientist can apply a few techniques to your existing data, so that in the end it is clear what value a data department can add to your organisation.