(Note – for a good introduction to the history and current state of AI, see my colleague Frank Chen’s presentation here.)
In the last couple of years, magic started happening in AI. Techniques started working, or started working much better, and new techniques have appeared, especially around machine learning (‘ML’), and when those were applied to some long-standing and important use cases we started getting dramatically better results. For example, the error rates for image recognition, speech recognition and natural language processing have collapsed to close to human rates, at least on some measurements.
So you can say to your phone: ‘show me pictures of my dog at the beach’ and a speech recognition system turns the audio into text, natural language processing takes the text, works out that this is a photo query and hands it off to your photo app, and your photo app, which has used ML systems to tag your photos with ‘dog’ and ‘beach’, runs a database query and shows you the tagged images. Magic.
There are really two things going on here – you’re using voice to fill in a dialogue box for a query, and that dialogue box can run queries that might not have been possible before. Both of these are enabled by machine learning, but they’re built quite separately, and indeed the most interesting part is not the voice but the query. In fact, the important structural change behind being able to ask for ‘Pictures with dogs at the beach’ is not that the computer can find it but that the computer has worked out, itself, how to find it. You give it a million pictures labelled ‘this has a dog in it’ and a million labelled ‘this doesn’t have a dog’ and it works out how to work out what a dog looks like. Now, try that with ‘customers in this data set who were about to churn’, or ‘this network had a security breach’, or ‘stories that people read and shared a lot’. Then try it without labels (‘unsupervised’ rather than ‘supervised’ learning).
Today you would spend hours or weeks in data analysis tools looking for the right criteria to find these, and you’d need people doing that work – sorting and resorting that Excel table and eyeballing for the weird result, metaphorically speaking, but with a million rows and a thousand columns. Machine learning offers the promise that a lot of very large and very boring analyses of data can be automated – not just running the search, but working out what the search should be to find the result you want.
That is, the eye-catching demos of speech interfaces or image recognition are just the most visible demos of the underlying techniques, but those have much broader applications – you can also apply them to a keyboard, a music recommendation system, a network security model or a self-driving car. Maybe.