The Art of Data Science: Chapter Eleven
Hello, and wish you a happy and prosperous 2018!
Sending out the newsletter on this day gives me the opportunity to simultaneously set a New Year resolution. While I've resolved several things related to other areas of my life, I have one resolution for this newsletter as well - I'll make it periodic. I hope to send out 12 editions this year, one each month. Let's see if I can keep this up!
Getting on to the real stuff...
A new way of playing chess
The most significant development in "artificial intelligence" (loosely defined) since the last edition of this newsletter went out was the announcement of Google Deepmind's AlphaZero beating Stockfish 8, a powerful chess engine, in a 100-game match.
While chess engines have been around for a long time (remember that Deep Blue beat Garry Kasparov back in 1997), the "problem" is that most chess-playing programmes have been just that - engines. They've relied on superior computation to evaluate positions to a high "depth" (number of moves ahead) and then pick the move that gives the best outcome. While this search has been boosted with some cleverness such as alpha-beta pruning and tabu search, at heart it has been a "brute force solution", not so much intelligence.
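To make the "brute force plus cleverness" idea concrete, here is a minimal sketch of minimax search with alpha-beta pruning on a hand-made game tree. The tree is a toy stand-in for real move generation and position evaluation - everything about it is illustrative, not how any actual engine represents chess.

```python
def alphabeta(node, depth, alpha, beta, maximizing):
    """Return the minimax value of `node`, skipping ("pruning") branches
    that provably cannot affect the final decision."""
    if depth == 0 or not isinstance(node, list):
        return node  # a leaf: a static evaluation of the position
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # the minimizing player will avoid this branch
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:   # the maximizing player will avoid this branch
                break
        return value

# A tiny tree: inner lists are positions, numbers are leaf evaluations.
tree = [[3, 5], [6, [9, 2]], [1, 2]]
print(alphabeta(tree, 3, float("-inf"), float("inf"), True))  # 6
```

The pruning changes nothing about the answer - it only avoids exploring lines that a competent opponent would never allow, which is exactly why it is a speed-up rather than added "intelligence".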
Unlike these engines (and like its go compatriot AlphaGo Zero, also developed by the same team), AlphaZero uses a method of artificial intelligence called "reinforcement learning". The concept behind this is that rather than calibrating a model based on a set of given inputs and outputs, the model is allowed to "explore" and make choices at random. For each such exploration, there is an associated "reward", and the exploration-reward combination is used to slowly calibrate the model.
In a way, reinforcement learning tries to imitate how intelligent creatures (not necessarily human) learn - by trying different things and then responding to incentives. While it is conceptually simple, implementation in a lot of cases is not particularly easy.
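The explore-and-reward loop described above can be sketched with tabular Q-learning, one of the simplest reinforcement learning methods, on a made-up five-state "corridor" where the agent earns a reward only on reaching the right end. All the names and numbers here are illustrative choices, not anything AlphaZero-specific.

```python
import random

N_STATES, ACTIONS = 5, (-1, +1)          # move left or move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
lr, gamma, epsilon = 0.5, 0.9, 0.2       # learning rate, discount, exploration

random.seed(0)
for _ in range(500):
    s = 0
    while s != N_STATES - 1:
        # explore at random sometimes, otherwise exploit current estimates
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # the exploration-reward combination slowly calibrates the model
        best_next = max(Q[(s_next, b)] for b in ACTIONS)
        Q[(s, a)] += lr * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy should be "always move right" (+1).
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

Note that nobody ever tells the agent which move is good at which state - it discovers that purely from the rewards its random explorations happen to collect, which is the conceptual point above in miniature.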
AlphaZero started with only the rules of the game, including termination conditions (checkmate, the same position occurring thrice, stalemate, etc.). It then started playing against itself (remember that for a computer it's possible to play a game on both sides in an honest manner), and the result of each game was fed back into the program as a "reward" (it is assumed in such cases that the result of a game is a function of the agent's actions during the game).
And then AlphaZero kept on playing itself, continuously updating its decision-making system as a function of results, until it was good enough to beat, in particular conditions, one of the strongest chess engines around.
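The shape of that self-play loop can be sketched on a much humbler game - a Nim-like toy where players alternately take 1 or 2 stones from a pile of 10, and whoever takes the last stone wins. Both sides play from the same value table, and the game's final result is fed back as the reward for every move made. This is purely illustrative; it is not AlphaZero's actual algorithm, which uses deep networks and tree search rather than a lookup table.

```python
import random

PILE = 10
value = {}   # value[(pile, move)] ~ estimated chance of winning after `move`

def pick(pile, explore=0.1):
    """Choose a move: mostly greedy on the value table, sometimes random."""
    moves = [m for m in (1, 2) if m <= pile]
    if random.random() < explore:
        return random.choice(moves)
    return max(moves, key=lambda m: value.get((pile, m), 0.5))

random.seed(1)
for _ in range(20000):
    pile, history, player = PILE, [], 0
    while pile > 0:                    # both sides played by the same policy
        m = pick(pile)
        history.append((pile, m, player))
        pile -= m
        player = 1 - player
    winner = 1 - player                # the player who took the last stone
    for p, m, who in history:          # feed the result back as the reward
        reward = 1.0 if who == winner else 0.0
        old = value.get((p, m), 0.5)
        value[(p, m)] = old + 0.1 * (reward - old)

# Leaving the opponent a multiple of 3 wins this game, so from a pile of
# 10 the learned policy should come to prefer taking 1.
print(pick(10, explore=0.0))
```

The point of the miniature is the same as in the real thing: the system's only teacher is its own (steadily improving) older self.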
A lot of literature about AlphaZero talks about how it "became a strong system with only 4 hours of training". Now, given the quality of computers on which AlphaZero was developed, the time taken to master the game is meaningless - had the computing resources been greater, it would have achieved the same result in a shorter amount of time. Instead, the metric to track is the number of games AlphaZero had to play against itself in order to achieve its level of mastery (Deepmind hasn't disclosed this number).
The other thing to keep in mind is that while a system that kept playing against itself to become a master at the game sounds sexy, the real challenge here is in terms of problem formulation and representation. Chess is a rather complex game with a really large number of possible positions. Moreover, for small changes in position, the way to play the game changes. In this sense, the challenge for Deepmind would've been in formulating the play - how do you represent the board, the moves and the payoffs? It would have taken tremendous effort on that front to get the results that we've seen.
I have a blogpost on the topic. Grandmaster Daniel King has some excellent analysis for a couple of games (this and this) from the 100-game match.
Deep Learning and AI Predictions
I stumbled upon the November/December edition of the MIT Technology Review, and it turns out to be a special edition on Artificial Intelligence. There are quite a few articles on the "fairness" of AI, and one by Samanth Subramanian on how AI is transforming the Indian IT Services sector. Apart from this, perhaps thanks to nudges from the Numbers Rule Your World blog (highly recommended, btw), I found two very interesting pieces.
One is about Deep Learning, and whether there's too much in the "AI world" riding on this one method. Now for some history - the theory behind training Artificial Neural Networks more than 2 or 3 "layers deep" (hence deep learning) was developed back in the 1980s by Geoffrey Hinton (interviewed for this article) et al, but Computer Scientists had put it in cold storage because it needed too much data and too much processing power to be of any use.
Even when I was studying CS (early 2000s) the method was largely dismissed, but sometime later that decade, processing became cheap enough and data became plentiful enough that the method actually started to show results. A 2012 paper by Hinton et al conclusively showed that deep learning systems outperformed other algorithms in image recognition.
The most obvious application, of course, has been in Computer Vision - have you noticed how good Google Photos has got at identifying people? Or how voice assistants such as Alexa or Siri are so effective? There are other related applications in medical imaging (again a Vision problem) and Natural Language Processing.
The problem, of course, is that Machine Learning is not Magic. Neural Networks and Deep Learning systems are essentially large mathematical solutions, and there is only so much intelligence that can be built into them. From the article:
Neural nets are just thoughtless fuzzy pattern recognizers, and as useful as fuzzy pattern recognizers can be—hence the rush to integrate them into just about every kind of software—they represent, at best, a limited brand of intelligence, one that is easily fooled.
The other problem with deep learning that the article discusses is the large sets of data required to train such networks. Mathematically, it approximately works like this - the "deeper" a network, the more the degrees of freedom (free parameters) that need to be calibrated. While that gives more flexibility in discerning patterns that would otherwise be hard to determine, it means that without sufficient data, the pattern recognition can be incomplete, and possibly inadequate.
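A back-of-the-envelope calculation shows how quickly depth inflates the parameter count in a fully connected network: each layer contributes a weight matrix plus a bias vector. The layer widths below are made up for illustration.

```python
def n_parameters(layer_sizes):
    """Total weights and biases in a dense network with the given widths."""
    return sum(n_in * n_out + n_out             # weight matrix + bias vector
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

shallow = [784, 128, 10]             # one hidden layer
deep = [784, 512, 512, 512, 10]      # three hidden layers

print(n_parameters(shallow))  # 101770
print(n_parameters(deep))     # 932362
```

Nearly a tenfold jump in degrees of freedom from two extra hidden layers - and each of those parameters needs to be pinned down by data, which is why the deeper network demands so many more training examples.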
As a result, while companies such as Google or Amazon, which have access to tonnes of data, can realistically find sufficient data to train their deep nets, smaller companies looking to use such techniques for business applications might trip up.
Another article, again recommended by Numbers Rule Your World, is about the "seven deadly sins of AI predictions". Read the whole thing, and the commentary by Numbers Rule Your World, but the biggest takeaway for me was the phrase "suitcase words" - words or phrases that can mean different things to different people. Coined by AI pioneer Marvin Minsky, it pretty much encapsulates a lot of things I claim to do - "machine learning", "artificial intelligence", "data science", "analytics"! Marketing when you're doing something that's covered by a suitcase word isn't easy!
In any case, read the article about the Seven Deadly Sins. It's quite insightful.
Elsewhere, research by Siddharth Garg et al of New York University has shown that in a sense neural networks are fragile, and this can pose a security risk, especially when these networks are being used in critical applications such as self-driving cars.
Das Reboot
This is not normally the kind of book you'd see being recommended in a Data Science newsletter, but I found enough in Raphael Honigstein's book on the German football renaissance in the last 10 years for it to merit a mention here.
So the story goes that prior to the 2014 edition of the Indian Premier League (cricket), Kolkata Knight Riders had announced a partnership with tech giant SAP, and claimed that they would use "big data insights" from SAP's HANA system to power their analytics. Back then, I'd scoffed, since I wasn't sure the amount of data generated in all cricket matches till then was big enough to merit "big data analytics".
As it happens, the Knight Riders duly won that edition of the IPL. Perhaps coincidentally, SAP entered into a partnership with another champion team that year - the German national men's football team, and Honigstein dedicates a chapter of his book to this, and other, partnerships, and the role of analytics in helping the team's victory in that year's World Cup.
If you look past all the marketing spiel ("HANA", "big data", etc.), what SAP did was to group data, generate insights and present them to the players in an easily consumable format. In the football case, they developed an app where players could watch videos of specific opponents in action, and easily review certain kinds of their own mistakes. And so on. Nothing particularly fancy; just simple data put together in a nice, easy-to-consume format.
A couple of money quotes from the book. One on what makes for good analytics systems:
‘It’s not particularly clever,’ says McCormick, ‘but its ease of use made it an effective tool. We didn’t want to bombard coaches or players with numbers. We wanted them to be able to see, literally, whether the data supported their gut feelings and intuition. It was designed to add value for a coach or athlete who isn’t that interested in analytics otherwise. Big data needed to be turned into KPIs that made sense to non-analysts.’
And this one on how good analytics can sometimes invert hierarchies, and empower the people on the front to make their own good decisions rather than always depend on direction from the top:
In its user-friendliness, the technology reversed the traditional top-down flow of tactical information in a football team. Players would pass on their findings to Flick and Löw. Lahm and Mertesacker were also allowed to have some input into Siegenthaler’s and Clemens’ official pre-match briefing, bringing the players’ perspective – and a sense of what was truly relevant on the pitch – to the table.
A lot of business analytics is just about this - presenting the existing data in an easily consumable format. There might be some statistics or machine learning involved somewhere, but ultimately it's about empowering the analysts and managers with the right kind of data and tools. And what SAP's experience tells us is that it may not be that bad a thing to tack on some nice marketing on top!
Hiring data scientists
I normally don't click through on articles in my LinkedIn feed, but this article about the churn in senior data scientists caught my eye enough for me to click through and read the whole thing. I must admit to some degree of confirmation bias - the article reflected my thoughts a fair bit.
Given this confirmation bias, I'll spare you my commentary and simply put in a few quotes:
Many large companies have fallen into the trap that you need a PhD to do data science, you don’t.
Not to mention, I have yet to see a data science program I would personally endorse. It’s run by people who have never done the job of data science outside of a lab. That’s not what you want for your company.
Doing data science and managing data science are not the same. Just like being an engineer and a product manager are not the same. There is a lot of overlap but overlap does not equal sameness.
Most data scientists are just not ready to lead the teams. This is why the failure rate of data science teams is over 90% right now. Often companies put a strong technical person in charge when they really need a strong business person in charge. I call it a data strategist.
I have worked with companies that demand agile and scrum for data science and then see half their team walk in less than a year. You can’t tell a team they will solve a problem in two sprints. If they don’t have the data or tools it won’t happen.
So that's it for this edition. As usual, please do forward and share, and keep the feedback coming!
Cheers
Karthik