[You can read this on the BBC News website – sorry it’s late being posted here. But one day I’ll be late too..]
While Google is as secretive about its internal processes and systems as Apple is about product development, every now and then senior people post articles on the official Google blog and offer their thoughts on the development of the web.
In the latest posting two Google engineers, Alfred Spector and Franz Och, look at how search strategies will benefit from the faster computers, greater volumes of data and better algorithms we are likely to see in the next decade, speculating that “we could train our systems to discern not only the characters or place names in a YouTube video or a book, for example, but also to recognise the plot or the symbolism.”
Instead of just analysing hypertext links, they say, the systems would observe, record and learn from all sorts of behaviour and interpret “objects, nuances, intentions, meanings, and other deep conceptual information.”
They foresee better search and seamless translation between languages, which will benefit Google. It will also, Spector and Och are keen to point out, help the rest of us by improving the quality of scientific research and the analysis of large data sets.
Conceptual search sounds particularly appealing at the moment as I am spending almost every waking moment working on the Cambridge Film Festival, where I am a trustee.
Being able to find “Polish films about an ice skater with a body double and a sad ending” with a single search would be a real boon (and yes, it’s a real question), even if the price is letting Google rifle through every bit of digitised data for all eternity.
While I am not as sanguine as Google’s engineers about the prospects of solving many of the fundamental problems that have faced the artificial intelligence community for almost half a century simply by throwing faster hardware and bigger disks at them, the blog post reveals something of the unbounded optimism that Google still possesses.
Give us enough data, they seem to be saying, and we will show you your deepest desires and make them real. Google does not ask for belief in an unseen deity or faith in the supernatural, but it does want to be trusted like a priest, or a parent.
Google and other search engines have developed vast data stores that can keep track of the contents of a significant part of the whole web and store most of the transaction data too. Each year the Google Zeitgeist shows how search trends change, because the company can look at every query we make and analyse what web search expert and blogger John Battelle calls “the database of intentions”.
They also keep this data for a long time, sometimes in anonymised form but still there, and as we saw when AOL released what they thought was anonymous data on people’s searches, it is easier than it looks to link a person to a pattern of behaviour.
It will get a lot easier if there is some form of ‘intelligent’ or concept-based search, too. ‘Find me someone who likes Derek Jarman films, lives in Cambridge and shows an obsessive interest in Google’ would probably nail me.
The acquisition and analysis of data is a continuing project, but this has significant implications in a changing world populated by mortal humans, because Google won’t lose interest in me when I’m dead.
At the recent dConstruct conference in Brighton author Steven Johnson talked about his book The Ghost Map, an exploration of the way that John Snow and Henry Whitehead discovered the source of the 1854 cholera outbreak in London and prompted the building of modern sewerage systems.
The breakthrough came when Snow plotted the deaths from cholera on a map of the area and drew a line around the homes that took water from the pump on Broad Street. Once he had done that, the correlation became obvious.
In his talk Johnson joked that the people shown on the map were a ‘social network of dead people’ and that he should have called his book The Wisdom of Dead Crowds in homage to James Surowiecki.
The people on Snow’s maps were just dots, but today’s social networks have far richer portraits of the dead who populate them.
My friend Anne was slightly freaked out when she got a note on Facebook to tell her that the online status of a friend who had died recently had been updated. It turned out that her friend’s brother had taken over the profile as an online memorial.
Last.fm and Apple know what music I listen to, Facebook knows who my friends are, Dopplr knows where I’m going and Brightkite knows where I am. And of course Twitter knows what I’m thinking.
At the same time I am leaving a mark with every link I have made from my blog to someone else’s writing and every time I have clicked on one item from a page of search results to find a video or book or song, because it affects the rankings those pages get in other people’s search results.
Taken as a whole these stored interactions and connections carve out a Bill-shaped space on the network, and this space will only slowly fade away after I die.
If Google and the other search engines index it more fully, as they plan to do, then it could last a very long time and my ghost will haunt the network for many years.
Of course I won’t be alone. Not only my friends but also my ancestors will be joining me. We are rapidly reaching the point where information that is not digitised – and therefore searchable – simply stops being useful or used. Fortunately more and more old books, newspapers and other paper records are being scanned, digitised and made available online.
Mormons believe that the dead can be baptised into their religion and that doing so removes an obstacle to their salvation. Perhaps the search engines can perform a similar service for reputations rather than souls.