Google’s Algorithm Goal is Human-Like Perception
Image Credit: Wikipedia
With the advent of Panda, followed by the Penguin algorithm, Google changed from a strict input/output algorithm into a company that believes machine learning is the path to a continually improving search engine. Google wants its algorithm to perceive the world as humans do, even if that means introducing false positives or negatives. Below, I delve into what this means and its implications.
Humans are Great at Finding Patterns
Before diving into Google, let’s talk a little bit about humans. Babies grow up learning to recognize objects that seem similar to something they have learned before. A child taught about a red truck will in turn realize that a green semi-truck with a different sound is still a truck. People shopping for the freshest watermelon will look, feel, and listen for different qualities, oftentimes without being able to fully explain why they chose the one they did.
The pattern analysis that humans naturally perform is what Google wants to emulate beyond just image search; Google wants the top results to reflect how a human would qualitatively choose sites #1-10. Google strives to avoid either having humans at Google manually choose positions or having a rigid input/output algorithm decide them, and instead is creating an algorithm that mimics human pattern recognition.
Yet Humans Make Mistakes
Image Credit: StatsToDo
Ironically, the desire for the perfect algorithm (a robot that can perceive like a human) will lead to many false positives, with websites getting burned just as many already have from Panda (the recent rollback of some sites impacted by Panda shows this). Still, Google has moved forward with pattern analysis (machine learning) based, in my opinion, on two beliefs:
- Humans are generally comfortable with pattern analysis among other humans, so having a robot do it is just as acceptable.
- A robot that can do pattern analysis will eventually be (or already is) better at it than humans, and thus while there may be false positives, there will be fewer than if a human were making the judgment.
The potential problem with number one is that humans may not actually be comfortable with algorithms that even their own engineers cannot explain, for how easy would it be to code something nefarious in and claim it was the algorithm’s doing, not a human’s intervention?
Google [Hearts] the NSA
Image Credit: Silicon Valley Watcher
An example of people generally not being comfortable with this in today’s climate stems from the NSA bulk-collecting metadata to find patterns. Part of the unease stems from the secrecy involved (which helps create conspiracy theories), part from mass data collection that runs contrary to US constitutional law, and part from the view that it judges people as guilty until proven innocent.
This is the path Google has gone down with Penguin. Google used to find proof that the paid or bad links pointing to your site were acquired by you. With Penguin, Google turned the original “innocent until proven guilty” on its head into a very non-American style of law: guilty until proven innocent. Now if you match the pattern of buying links or of having low-quality links, you will be punished first, just as you would have been blamed by the SEO community anyway (how many SEOs do you see proclaiming company X bought links without really knowing that the company did so?).
Image Credit: IEEE Spectrum
I think number two is right that robots will eventually be better, but the uncertainty lies in how those who reverse-engineer Google’s pattern algorithms figure out ways to evade them. If the complexity becomes too difficult for engineers to handle, then tricking the pattern analysis becomes easier than realizing it is being tricked.
In the end, pattern analyses are never perfect (what human can claim 100% success at anything?) and false positives will occur in Google’s algorithm. The question SEOs should be asking Google is: what false positive rate does Google consider acceptable? If you run a pattern analysis, there will be innocent sites harmed incorrectly, and understanding how large an error is acceptable would give SEOs some sense of how well (or poorly) Google is mimicking human pattern recognition.
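To make the question concrete, here is a minimal sketch of what a false positive rate actually measures. The counts below are entirely hypothetical (Google publishes no such numbers); the point is only to show that even a small rate translates into many innocent sites when applied at web scale.

```python
# Hypothetical counts for innocent sites evaluated by a link-pattern filter.
# "Flagged" means the algorithm matched the site to a bad-link pattern.
false_positives = 50     # innocent sites wrongly flagged (hypothetical)
true_negatives = 9950    # innocent sites correctly left alone (hypothetical)

# False positive rate: the share of innocent sites that get wrongly punished.
fpr = false_positives / (false_positives + true_negatives)
print(f"False positive rate: {fpr:.1%}")  # prints "False positive rate: 0.5%"
```

Even a 0.5% rate, harmless-sounding in isolation, would mean one in every two hundred innocent sites takes a ranking hit, which is exactly why the acceptable-error threshold matters to SEOs.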