Cedric Roll, Chief Technology Officer, 8 Securities
Machine learning is the ability of computers to learn without being explicitly programmed. Well, that sounds very much like a living organism to me, and more specifically it sounds like us.
At birth we are bootstrapped with minimal programmed responses (reflexes) to survive our environment. As a newborn, when I am hungry I scream; when presented with a baby bottle I suck. The rest is unknown and has to be discovered. Until we develop higher communication abilities with other humans and can acquire the solutions to more complex problems, we observe, try, get feedback and adapt.
Machine learning replicates this very human and native process: acquire data, identify correlations, build a response model, get feedback, get more data and adapt the model until it fits its environment.
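The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any particular product's method: the data is synthetic (generated from a made-up rule, y = 2x + 1, plus a little noise), the model is a single straight line, and the names `train_online`, `samples`, `learning_rate` and `epochs` are hypothetical.

```python
import random


def train_online(samples, learning_rate=0.05, epochs=200):
    """Fit y ~ w * x + b by repeatedly responding, getting feedback and adapting."""
    w, b = 0.0, 0.0  # the model starts out knowing nothing
    for _ in range(epochs):
        for x, y in samples:
            prediction = w * x + b           # respond with the current model
            error = prediction - y           # feedback: how wrong was the response?
            w -= learning_rate * error * x   # adapt the slope
            b -= learning_rate * error       # adapt the offset
    return w, b


# Hypothetical observations: the hidden rule is y = 2x + 1, blurred by small noise.
rng = random.Random(0)
xs = [i / 10 for i in range(-10, 11)]
samples = [(x, 2 * x + 1 + rng.uniform(-0.1, 0.1)) for x in xs]

w, b = train_online(samples)
print(f"learned model: y = {w:.2f} * x + {b:.2f}")
```

Nobody tells the program the rule. It starts with a wrong model and, one observation at a time, nudges it in whichever direction reduces the error, which is the "try, get feedback and adapt" cycle in miniature.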
The net result is machines which are able to interact with us like never before: they can recognize our faces, understand our speech, understand our meaning, read our handwriting, alert us to fraud, know if we are upset or happy, recommend friendships in social networks, make product suggestions, and predict if our health is at risk.
Now, humans are bad at choosing between many options (The Paradox of Choice, Schwartz). When we purchase online we have a short attention span, and we are afflicted by a multitude of cognitive biases (i.e. we don’t always make rational decisions, and it is hard to program for the irrational).
Many successful online businesses are addressing those challenges by implementing machine learning as their core engine of revenue growth and customer retention. Imagine you walk into a store and the sales assistant tells you, "I know which products you want", and he gets it right! That rarely happens on the high street (most of the time I tell them to go away), but it happens every day online when interacting with machines.
Many technical solutions are available to implement machine learning algorithms. However, a technical solution is not the starting point. First and foremost, what is the problem you are trying to solve? Is it a machine learning problem? What are the features of this problem (correlated data set attributes with a causal relationship to the outcome)? Do you need to discover those features? How will you acquire and prepare this data? How will you test the results? Is your model generic enough?
Identifying real correlations is not as easy as it sounds. Here are some spurious correlations for you: US spending on science is correlated with the number of suicides by suffocation, and US oil imports from Norway are correlated with the number of car crashes with trains. Not everything is as it seems. The data might move in sync, but is there a real cause-and-effect relationship?
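To see how easily this happens, here is a minimal sketch with two invented series that have nothing to do with each other except that both grow over time. The numbers are synthetic and the names (`pearson`, `spending`, `accidents`, `differences`) are hypothetical; the only point is that a shared trend is enough to produce a high correlation.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def differences(series):
    """Year-on-year changes, which removes a steady upward trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


rng = random.Random(42)
years = range(30)

# Two unrelated, made-up quantities that both happen to rise year after year.
spending = [10 + 2 * t + rng.uniform(-3, 3) for t in years]
accidents = [100 + 5 * t + rng.uniform(-8, 8) for t in years]

r_levels = pearson(spending, accidents)
r_changes = pearson(differences(spending), differences(accidents))

print(f"correlation of raw values:          {r_levels:.2f}")
print(f"correlation of year-on-year changes: {r_changes:.2f}")
```

The raw values correlate strongly because both series climb. Once the trend is removed by looking at year-on-year changes, what remains is independent noise, so the apparent relationship has nothing left to rest on. A check like this does not prove causation, but it is a cheap way to catch correlations that only reflect a common trend.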
Once this is understood, the appropriate algorithm can be implemented: supervised or unsupervised learning, logistic regression, recommenders, classifiers, clustering, and many more.
Machine learning is not just a problem-solving mechanism; it is the opportunity to build more and more meaningful relationships between the machine and the human.