The Attribution Engine. (or The Who Dunnit Engine)

Could what I call the "Attribution Engine" soon be upon us? Since other names for it could be "The Who Did What Engine" or maybe "The Who Dunnit Engine", you might start to get an inkling of what it is, and that it may affect anonymity. Indeed it could strip away a lot of anonymity so far taken for granted... retrospectively.

What is the Attribution Engine? It is one logical progression of existing search engines like Google or Bing. You can already do a bit of attribution with existing search engines by searching for someone's name and finding all manner of things they have done or are connected with. I'm not saying those few huge corporations running the search engines will, or even will want to, develop their engines in the way I'm about to describe, but it is one possible direction and it's all technically possible.

The logical progression will probably start as follows: one day, one or more of the search engine corporations may add a facility to search for faces. Finding faces in photos is already a well-tried and improving technique. It is used, for example, to scan live feeds from CCTV cameras to spot the faces of known shoplifters in department stores, or known card counters in casinos, and it is probably used on the social media site you frequent to find faces of your friends and suggest tagging them in photos you upload. Your face is recognized by a simple set of biometrics, basically just a few numbers, like how far apart your eyes are compared to how far they are from the tip of your nose and the corner of your mouth. So the social media site you use may already hold those biometrics about your face.
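To make the "just a few numbers" idea concrete, here is a minimal sketch in Python. It assumes some other tool has already located the eyes, nose tip and mouth corner in the photo; the landmark names and pixel coordinates are invented for illustration, and real systems generally use far richer measurements than two ratios.

```python
from math import dist

def face_ratios(landmarks):
    """Turn landmark (x, y) points into ratios that do not depend on photo size."""
    left, right = landmarks["left_eye"], landmarks["right_eye"]
    eye_gap = dist(left, right)
    # Midpoint between the eyes, used as the reference point for the other distances.
    mid_eyes = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    eye_to_nose = dist(mid_eyes, landmarks["nose_tip"])
    eye_to_mouth = dist(mid_eyes, landmarks["mouth_corner"])
    return (eye_to_nose / eye_gap, eye_to_mouth / eye_gap)

# Hypothetical pixel coordinates for one face.
face = {
    "left_eye": (30, 40),
    "right_eye": (70, 40),
    "nose_tip": (50, 70),
    "mouth_corner": (65, 76),
}
# The same face photographed at twice the resolution.
bigger = {name: (x * 2, y * 2) for name, (x, y) in face.items()}

ratios = face_ratios(face)
ratios_bigger = face_ratios(bigger)
```

Because everything is divided by the distance between the eyes, the two photos give the same pair of numbers, which is what makes them usable as a search key.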

So the technology to recognize faces is already well established. When web crawlers gradually scour the World Wide Web page by page for their search engines, as well as storing key text from web pages for people to search against, they could be made to search photos for faces too. That would take longer than indexing the text, so faces would take longer to get onto the search engine database, but make no mistake, it is entirely possible.

Once the web crawlers have been doing their work for a few months you will be able to use the search engine to search for faces. To search for someone's face you would simply upload a photo of them to the search engine, which would analyse it, calculate the biometrics of the face, and search for those numbers in its database, bringing up a list of web pages in just the same way as it brings up a list of web pages when you type in someone's name. In the least sinister situation you would know the person's name and so, if social media sites allow it, most of the time you would only have to type the name to search for their face, because the search engine could 'ask' social media sites for the already calculated facial biometrics of that person.
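The lookup described above could be sketched roughly like this. The index contents, the URLs and the `max_distance` cut-off are all made up for illustration; a real search engine would need a far faster way of comparing than checking every stored face in turn.

```python
from math import dist

# Hypothetical index built by the crawler: page -> biometric numbers of a face found there.
index = {
    "example.org/party.html": (0.75, 0.98),
    "example.org/staff.html": (0.74, 0.97),
    "example.org/other.html": (0.60, 1.20),
}

def search_faces(query, index, max_distance=0.05):
    """Return pages whose stored numbers are close to the query, best match first."""
    hits = sorted((dist(query, numbers), url) for url, numbers in index.items())
    return [url for distance, url in hits if distance <= max_distance]

# Numbers calculated from the uploaded photo (assumed values).
results = search_faces((0.75, 0.975), index)
```

Here `results` lists the party page first and the staff page second, and leaves out the page whose face is nothing like the query.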

Of course, just as you get a lot of stuff you didn't want when you search for a name now, the same will apply to the face search. With the face search it will be due to distorted, out-of-focus, or low-resolution photos... or people with very similar faces. These unwanted results in the list are a bit annoying, but you'll whiz through them even faster than the unwanted results from a text-based search because your brain is naturally super quick and skilled at recognizing faces. Also, just like with searching for a name, you could add another bit of info to the search request that narrows down the search and gets rid of many unwanted results: instead of just searching for "fred bloggs" or his photo, you could search for "fred bloggs mycity university" or his photo plus "mycity university". (And as you will read later, this narrowing-down effect is one of the principles that will make the Attribution Engine deadly accurate).

The face search would be the first major step in the development of the Attribution Engine and the first opening of a huge can of worms. You have probably already had a tiny peek into the can of worms by searching for your own name or that of a prospective employee. There is a lot of junk there, like embarrassing pictures of over-enthusiastic drinking sessions from your earlier years, things that you had forgotten about, things that in the pre-internet age would never have seen the light of day again except to be viewed by close friends who were probably there at the time anyway... so you can see that a little bit of anonymity has been lost in the internet age already. Also, if you're a creative person, or you once had a stab at being one, all your previous efforts (which you may now consider to be sub-standard) will probably pop up too, albeit some way down the search list. But maybe, after being initially mortified by the unearthing of your earlier exploits, you may embrace the experience and decide it's all just part of the rich experience of being a 21st Century human, and draw comfort from the fact that most people will have a similar back-catalogue of cringe-worthy images and text anyway.

...But this first (face searching) step in Attribution Engines takes the unearthing process to a deeper level (one of many ever deeper and more revealing levels as Attribution Engines get more and more sophisticated). The can of worms is properly opened for the first time, because some people will have used assumed names for their more sensitive contributions to the internet, or just not included a name at all, and thought this would provide them with anonymity. They may have been blogging, for example, expressing some controversial opinions or making libellous comments, and alongside they have probably posted some of their day-to-day photos. Others may have been using, shall we say, risqué dating sites, posting risqué photos and text. But unfortunately, in the future, The Attribution Engine will be able to attribute these once-thought-anonymous pieces of text to people. It will be able to find out who done 'em. For example, a large corporation (most of which are already very proactive and sometimes over-zealous about protecting their image and brand) would be able to search (as they can now) for their company name being mentioned in any libellous articles, but these articles may well have accompanying photos with faces in them... so the corporations in future would be able to find out who is behind some anonymous articles by running the photos they find through the face search.

Even if you are not the libellous or controversial blogger, the face search will unearth even more of the irksome stuff of your misspent youth than you currently get when typing your name into a search engine, because even if you appear in the background of some obscure photo not even taken by anyone you know, it will still find you there. And it does not even have to be a photo taken in the internet age... if a photo or its accompanying article is interesting then there is someone somewhere scanning it in and putting it on the internet, e.g. because it is historically interesting, or it is of interest to a specialist group, or... basically, due to human nature, anything personal, intimate, or embarrassing.

The next step in Attribution Engine development would be to use a people map. By "people map" I mean a map of associations between people (or in Facebook terminology, who's friends with who). They say there are only six degrees of separation between any two people on this planet, no matter how dissimilar or how far apart geographically, and a people map would indeed show how true that is. If The Attribution Engine was given access to the established people maps that already exist in social media sites then it could eliminate a lot of the mistakes it would make in face searches. For example, it can see how likely it is that it has correctly identified your face by consulting the people map: if you are in the photo with other faces of people you know or have worked with, or there is some associated text that contains the name of a friend of yours, then it can be a lot more confident that it has identified your face correctly and put that search result closer to the top of the list it displays. It can work the other way too: if it often 'sees' your face together with another person's (and it can be reasonably sure that other facts like date and country confirm it) it can update the people map with the new connection between you, even if it has been secret up until then, which may in turn connect other people, or whole groups of people.
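Both directions of that idea can be sketched in a few lines. The names, the starting score and the `boost` value are invented for illustration, and adding a fixed amount per friend is just the simplest rule I could think of, not how any real site does it.

```python
# Hypothetical people map: person -> set of known associates.
people_map = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
    "dave": set(),
}

def rerank(candidate, face_score, others_in_photo, people_map, boost=0.1):
    """Raise a face-match score for each known associate also seen in the photo."""
    known = people_map.get(candidate, set()) & set(others_in_photo)
    return min(1.0, face_score + boost * len(known))

def add_connection(people_map, a, b):
    """Record a newly observed association in both directions."""
    people_map.setdefault(a, set()).add(b)
    people_map.setdefault(b, set()).add(a)

# A shaky 0.6 match for alice becomes more believable with two friends in shot.
score_with_friends = rerank("alice", 0.6, ["bob", "carol", "dave"], people_map)
score_alone = rerank("dave", 0.6, ["bob"], people_map)

# Working the other way: dave keeps turning up next to bob, so link them.
add_connection(people_map, "dave", "bob")
```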

The next step in the march of Attribution Engines will strip away anonymity from people who were even more careful: people who made sure that neither their own face or name, nor any faces and names at all, were included with their controversial articles, tweets and blogs. Because in the next step The Attribution Engine starts to go beyond photos: next it looks at phraseology and uncommon words. Any text written on a web page or blog has a certain signature in the way you phrase things, the local slang you use, unusual words, etc. For example, a slightly misquoted saying is sometimes quite particular to just one family. And a certain slang word may only be said by people from a small locality. Some in-jokes and specialized greetings may even be unique to one group of friends. So imagine the Attribution Engine looking at an anonymous blog article: when it has identified all the phraseology aspects of it, imagine it drawing them on a Venn Diagram of all the people in the world and their phraseology. The more factors (circles) you draw on it, the smaller the common overlapping area gets, narrowing it down to a small group or an individual person. It can deduce who knows who, and thus who people are.
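The Venn Diagram picture is really just set intersection, which can be sketched like this. The traits and the person labels (`p1`, `p2` and so on) are invented, and it assumes the engine already knows which people show which trait.

```python
# Hypothetical data: trait -> set of people known to show that trait (one circle each).
trait_users = {
    "says 'nowt'": {"p1", "p2", "p3", "p4"},
    "misquotes a proverb the same way": {"p2", "p3"},
    "uses a private in-joke": {"p3", "p7"},
}

def narrow_down(traits_in_text, trait_users):
    """Intersect the circles for every trait spotted in the anonymous text."""
    candidates = None
    for trait in traits_in_text:
        users = set(trait_users.get(trait, set()))
        candidates = users if candidates is None else candidates & users
    return candidates or set()

two_traits = narrow_down(
    ["says 'nowt'", "misquotes a proverb the same way"], trait_users
)
three_traits = narrow_down(list(trait_users), trait_users)
```

With two traits the overlap still holds two people; adding the third circle leaves only `p3`.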

Take me for example: I'll spell verbs like "rationalize" ending in "ize" and not "ise" despite being British and despite every other British person I know spelling such words "ise". (This is because, when I was young, while looking in an Oxford English Dictionary I noticed it said that "ize" was their preferred spelling, so I adopted it, and no amount of social convention to the contrary has made me change). Also, I use capital letters and punctuation quite consistently (although not necessarily totally correctly) even in very rushed informal messages like tweets and texts (which by the way also identifies me as being born before about 1975). So if it looks at an article and sees it has "ize" spellings (but otherwise British phraseology) and consistent capitals plus punctuation, then the number of people who could have written the article has been reduced a hundredfold, and every time you compare another criterion you may reduce the number of people tenfold or whatever... even if a criterion can only reduce the number of possible authors by a few percent it's worth doing because the reduction keeps accumulating (keeps narrowing down). It's not quite that simple, because not every trait may be present in every article, so the computer algorithm would have to use weightings: go more on high probabilities rather than dead certainties.
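Here is the accumulating arithmetic as a rough sketch. The starting pool of 60 million and the fractions for the second and third traits are my own assumed figures, purely to show how quickly the numbers fall; the second function is one simple guess at what "weightings" could look like, where a missing trait adds nothing rather than ruling anyone out.

```python
def remaining_authors(population, keep_fractions):
    """Multiply the pool by the fraction of people each matching trait keeps."""
    for fraction in keep_fractions:
        population *= fraction
    return population

# Assumed figures: 60 million possible authors; "ize" plus careful punctuation
# keeps 1 in 100; a second trait keeps 1 in 10; a weak third trait keeps 95 in 100.
pool = remaining_authors(60_000_000, [1 / 100, 1 / 10, 0.95])

def weighted_score(author_traits, text_traits, weights):
    """Add up the weights of traits the author and the text share."""
    return sum(weights[trait] for trait in author_traits & text_traits)

# Hypothetical weights: rarer traits count for more.
weights = {"ize spellings": 2.0, "full punctuation": 1.0, "local slang": 3.0}
score = weighted_score(
    {"ize spellings", "full punctuation"},   # what is known about one author
    {"ize spellings", "local slang"},        # what was spotted in the article
    weights,
)
```

Three traits take the pool from 60 million to about 57,000, and even the weak 95-in-100 trait removes a few thousand candidates.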

In fact, if The Attribution Engine is not given access to a people map (by social media platforms) it can build its own (albeit initially a much less accurate one) by using the "Venn Diagram" method mentioned above. The difference is it won't have a name for you, for that little overlapping area of all the circles that represents an individual person. Instead, to start with, it will just give you a number and try to resolve your name later by searching for the 'signature' it has built up. For example, if it were to find your signature used by just one Facebook user then it's done: found you, no trouble (like shooting fish in a barrel)... but if it can't find you like that then, as I said before, it will often just use probabilities from a number of sources, and provided they are above a certain percentage... it's still nailed you.
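A rough sketch of "give you a number now, resolve the name later". The signature number, the user names, the traits and the 0.8 threshold are all hypothetical, and measuring overlap as a plain fraction of shared traits is just one simple choice.

```python
def resolve_name(signature, known_users, threshold=0.8):
    """Return a name only if exactly one known user clears the threshold."""
    if not signature:
        return None
    matches = [
        name
        for name, traits in known_users.items()
        if len(signature & traits) / len(signature) >= threshold
    ]
    return matches[0] if len(matches) == 1 else None

# Hypothetical data: signature number 1047 has no name attached yet.
signatures = {1047: {"ize spellings", "full punctuation", "says 'nowt'"}}
known_users = {
    "user_a": {"ize spellings", "full punctuation", "says 'nowt'", "likes cricket"},
    "user_b": {"full punctuation"},
}

name = resolve_name(signatures[1047], known_users)
```

Only `user_a` shows enough of the signature, so number 1047 gets that name; had two users cleared the threshold the function would return nothing and the engine would have to keep collecting.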

Further steps Attribution Engines can take are to analyse videos and start to gather even more about you, like your voice print (like a fingerprint for your voice) or your gait (the way you walk).

When AI gets a lot better, after looking at a lot of videos and photos of you it will even know what clothes you wore at different periods in your life. If it was unsure about a connection it found about you before from a photo, perhaps because it could not 'see' your face fully, then if it 'sees' your favourite stripy scarf: bang, that pushes the probability of it being you way up, it's nailed you.
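The scarf effect can be sketched with Bayes' rule in its odds form. The 30% starting figure and the factor of 20 are assumed numbers chosen only to show the size of the jump.

```python
def update_probability(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

# Assumed numbers: a partly hidden face gives only a 30% chance it is you,
# and the stripy scarf is taken to be 20 times more likely to appear in a
# photo of you than in a photo of somebody else.
posterior = update_probability(0.30, 20)
```

With those figures the probability goes from 30% to 60/67, a little under 90%, from one extra clue.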

Now it knows everything it needs to, in your 'signature', to attribute anything you do to you. Your 'signature' is like a dossier of metrics: facial biometrics, phraseology idiosyncrasies, who you know, usernames you have used, politics, areas of interest, things you hate, where you live and where you have been, your voice print, your gait, and so on. Any attribute that can be stored as a few numbers or a few words it will collect and keep in your signature.
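As a data structure, such a signature might look something like the sketch below. Every field name is my own invention and only some of the attributes listed above are included; the point is just that each one boils down to a few numbers or words.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Signature:
    """One person's dossier; all field names here are illustrative."""
    person_id: int                      # just a number until a name is resolved
    name: Optional[str] = None
    face_biometrics: tuple = ()
    phrasing_traits: set = field(default_factory=set)
    associates: set = field(default_factory=set)
    usernames: set = field(default_factory=set)
    places: set = field(default_factory=set)

    def merge(self, other):
        """Fold another partial signature for the same person into this one."""
        self.name = self.name or other.name
        self.face_biometrics = self.face_biometrics or other.face_biometrics
        for attr in ("phrasing_traits", "associates", "usernames", "places"):
            getattr(self, attr).update(getattr(other, attr))

# Two partial dossiers that turn out to describe the same person.
from_blog = Signature(1047, phrasing_traits={"ize spellings"}, usernames={"anon42"})
from_photo = Signature(1047, name="Fred Bloggs", face_biometrics=(0.75, 0.975))
from_blog.merge(from_photo)
```

After the merge the anonymous blog's dossier carries both the username and the name and face from the photo.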

OK. You may think that before this goes too far there will be public pressure to outlaw keeping these signatures for everyone, and hence the searches they would enable. You may be right. But outlawing these things will of course only stop reputable companies (i.e. ones who obey regulations) from storing or using the signatures... large criminal organizations, and probably national security agencies, will ignore this law. If you live in a relatively free country like the UK it's not much of a concern if the security agency stores and searches these things, because even with huge data centres and supercomputers they only really have time to worry about people who are a threat to the state or to life and limb, so we may as well let them use the facilities to save them having to pretend they are not doing it. And if you allow the security agencies to use the searches you may as well let the police, provided they obey some simple rules like not being able to use these searches unless it is part of an official investigation.

The large criminal organizations would be the worry: if you have stopped the legitimate corporations making these facilities available to the general population, the criminal organizations gain an advantage over ordinary citizens, and they would have free rein to blackmail and intimidate the general public, or manipulate politicians or civil servants or police.

So, because everyone's past will become transparent, we will really need a package of measures like amnesties for very minor past crimes, and just a general mutual agreement to "judge not, lest ye be judged" and "live and let live" for things that are not criminal but just embarrassing or shaming. This "live and let live" policy may not be too difficult to achieve. It really depends on how many people have done something embarrassing in their life, and if it's as high a percentage as I think then it's a done deal... it will be especially easy in predominantly Christian countries because of the repentance, redemption, and tolerance principles of the religion (assuming they have not become blind to these things due to fundamentalism). I cannot comment on other religions as I am only familiar with Christianity; I don't know whether other religions would be better or worse. In predominantly atheist countries, where the "if it's not hurting anyone and it's legal" principle goes (in theory), it should not be too bad either.

The big worry is non-democratic countries. The Attribution Engine will make life increasingly intolerable in dictatorships with Stasi-like states, and so there will be increasingly large numbers of refugees fleeing from these countries as the Attribution Engine kicks in, because those who have strayed outside the strict political or moral position of the state are rapidly identified. Brexit arguments aside, we (Britain) may have to stop all normal immigration and the only type of immigration will have to be refugees... any skill requirements of industry or commerce will have to be met from the domestic population, including this large pool of refugees. Other free countries may have to adopt a similar policy.

Although the Iraq War, and the only partially successful (and far more bloody than hoped) Arab Spring, turned the world against the aspiration of turning all the world's countries into democracies, the Attribution Engine may see world opinion change again... back to that democratic aspiration, as life in dictatorships becomes increasingly intolerable.

The only small hope for the people of these strict dictatorships is that the Attribution Engine will uncover such a large proportion of the population as having strayed politically that the state will realize that it is futile to try and control what people think.

[This document: The Attribution Engine V1.1 Very rough draught V1.0 posted V1.0: (c) J.Prestidge 2012. Updated: 5-Jun-2018 (was called The Attributation Engine but the word Attribution is a good fit so no need for a new word) ]