Independent and intellectual thoughts ranging from China, SEO, Analytics, and other international topics
10 Nov
The Blackhats have finally caught on (publicly at least) to the fact that Google is not doing a very good job with scraping sites, much less authority sites that copy and paste content from your own site. Interestingly enough, Quadzilla has noted that Yahoo, on the other hand, is doing a relatively good job with its search results:
In the meantime, while Google (and in fact all the Search engines) continues to serve up scraper after scraper - Scrape away! It’s the blackhat technique that just won’t die and continues to drive traffic and make money.
What’s even worse is that scraping is treated as a nearly acceptable thing to do, since according to Google, they cannot determine whether or not you permitted those sites (say, the Associated Press) to re-post your content.
It’s the same concept when you see Aaron Wall posting his own content around the web nearly verbatim and having the copies rank quite well. This tolerance for duplicate content is rather ridiculous from a user perspective, but from Google’s perspective it spares them the burden of determining what is “truth” and what is a good search result.
But the question remains, what happens when the two need to be one and the same?
Update (11/13/2008):
SEO ROI has found a site in particular already taking advantage of this nasty aspect on Aimclear:
Trademark Productions steal other people’s content, edit it for the sake of passing through search engine duplicate content filters, and try to pass themselves off as experts you should trust? They’re stealing from Aimclear, Clickz and others.
Here’s the fun kicker from the Sphinn post comment by sockmoney to highlight the problem still going on today:
I battled a site last year that had copied over 10,000 pages of content from my site. It was brought to my attention by a competitor who was also being scraped.
I wrote a program that would crawl their site, pull the dup URL, and report on it against the matching page on my site.
Now keep in mind, my site is 10 years old, the scraper site was less than one year old.
I filed my report as a DMCA violation with Google (my crawler was only able to match 6,000+ pages copied, so that is what I sent to Google in the format they required).
Google contacted the site owners, they countered and said they have broken no copyright laws. Google said we cannot do anything else, sorry. In my mind, Google should be able to see my content was there first, and they do in most cases, so why can’t they assign a scraper penalty to the site(s)? Instead, they choose in this case to let their site continue to grow “acting” like a legit site with legit content, but all along they were simply existing off our content.
My only option was to hire an attorney. I was advised that copyright law is a Federal law, and that it would require going to Federal court, which would cost me a minimum of 30-50k in legal fees.
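For the curious, the kind of matching sockmoney describes (pair each copied page with the original it came from, then report the pairs) could be sketched roughly as below. This is not his program, which was never published. It is a minimal sketch that assumes you have already crawled both sites into dictionaries of URL to page text, and it swaps in a common technique, word shingles compared with Jaccard similarity. The function names, the 5-word shingle size, and the 0.5 threshold are all my own assumptions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=5):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def report_duplicates(my_pages, their_pages, threshold=0.5):
    """Pair each suspect page with its best-matching original page.

    Both arguments are hypothetical inputs mapping URL -> page text.
    Returns (their_url, my_url, score) tuples for every suspect page
    whose best match scores at or above the (assumed) threshold.
    """
    report = []
    for their_url, their_text in their_pages.items():
        best_url, best_score = None, 0.0
        for my_url, my_text in my_pages.items():
            score = similarity(my_text, their_text)
            if score > best_score:
                best_url, best_score = my_url, score
        if best_url is not None and best_score >= threshold:
            report.append((their_url, best_url, best_score))
    return report
```

Comparing every page against every other page like this is slow for 10,000 pages, so a real tool would need an index or hashing step first. Also note that a similarity score says nothing about who published first, which is exactly the part sockmoney wanted Google to weigh.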
11 Feb
Here’s my comment on Slashdot about the UK Times Online article:
You can follow three paths as a search engine (in simplistic terms):
1) Show everything–this implies crap sites (*coughs* boingboing), great sites (*coughs* /.), malware sites (3221.com), search results sites, etc. thereupon your results are fully awful, but absolutely representative of what a search engine is “supposed” to show by previous comments, and thus get banned in China thereby showing nothing.
2) Do as you are told–obviously not as fun and cries of shenanigans and submissions are there, but then you get to show more results to people around the world who otherwise would just be filled with pure propaganda.
3) Do your own thing–“hitting the corner of the ping-pong table”, barely get by with regulations without getting punished.
Guess what? None of those are illegal to do under any international law at this point in time (although I recall some events within the US on trying to sue sites that just link to other pages, but nothing for the international arena) and certainly nothing illegal to show or not show within the US for political sites.
Remember, this is a corporation, not a government, so there is no “right” that you have for them to “display” your site in “their” index.
At least all algorithmically anyway.
3 Feb
Google issued what could be seen as a general warning to Microsoft over the hostile bid for all of Yahoo:
Google said Sunday that Microsoft’s proposed $44.6 billion takeover of Yahoo could pose a number of potential threats to competition that need to be examined by policymakers around the world.
Google said in a blog post on its Web site that given Microsoft’s anti-competitive conduct in the past and its continued dominance in the technology industry, the proposed transaction could pose threats to “innovation and openness” on the Internet. But Google’s broadly worded concerns lacked detailed claims about the anticompetitive effects of the deal, and the company did not ask federal regulators to take any specific actions at this time.
From a branding perspective, I can see why Google would respond to the bid publicly, but I personally don’t see Google having to worry about a takeover of Yahoo, for the following reasons:
27 Dec
I have always enjoyed pointing out to people how to get free Chinese music from Baidu, then showing them how you can get it from Yahoo! China, but not Yahoo! US. It was really only a matter of time before the big record labels caught on and did something about this:
Yahoo! China lost their appeal to the Beijing Higher People’s Court who upheld a lower court’s ruling in April that the company had violated copyright laws. Yahoo! China has insisted all along that it only provides links to websites for music search results and they should not be held liable for content provided by those third-party web sites.
And of course, Baidu gets preferential treatment:
Meanwhile, Baidu.com successively won the first and second round of their trial. Seven label companies filed the lawsuit against Baidu.com for infringement of their music copyrights. Baidu.com, like Yahoo! China had been insisting that the responsibility lied in the third party websites that provided the illegal music downloads. The local court in Beijing ruled that the music download service offered by Baidu.com was in fact legal.
Early this year, Baidu and EMI signed a strategic partnership deal for online music streaming and download services. Baidu is now authorized to stream EMI Chinese music on its music search channel. EMI Music, the world’s largest independent music company, will share the revenues generated by the advertising.
Goes to show how far nationalism and market strength can go towards helping keep various services for search engines (in case you didn’t know, a lot of Chinese citizens use Baidu for music).
18 Dec
Google announced a few days ago its vision to mash together a Wikipedia-like site with a Squidoo layout, essentially aiming for a competitive online encyclopedia that has ads. Beyond the horrible name, this product may come to haunt Google in the long run, becoming a turning point in its perceived status from an honest company to a monopolistic corporation similar to Microsoft.
Google Knol could essentially take down the major content providers such as Wikipedia, Squidoo, Hubpages, Yahoo Answers, etc., as it will naturally be ‘algorithmically’ favored by the grand ‘artificial intelligence’ of Google, just as YouTube currently is for videos. Competition for ad revenue will drive a lot of people to copy millions of pieces of text from across the web, creating duplicate content issues that Google still cannot detect through its ‘artificial intelligence’ (particularly with RSS feeds), in turn creating complaints of copyright infringement.
11 Dec
If there was ever a way to push online companies to support Net Neutrality and really push against ISPs, this situation would certainly qualify:
That’s right, you see Rogers Yahoo! High-Speed Internet showing up on Google Canada in three spots. How dumb can the ISPs be to not think about these kinds of situations (even if it’s just a test)?
7 Dec
Google currently provides a very nice and free analytics program that does wonders in helping online marketers analyze their performance somewhat accurately (far better than Nielsen TV ratings or magazine tracking, in my opinion). That said, I’ve come across clients that refuse to use Google Analytics because they are afraid of Google potentially becoming their competitor and using the analytics data against them. Kevin Gold at Search Marketing Standard (a great magazine in my opinion, by the way) feels that for the most part Google isn’t going to use its analytics for evil purposes:
[T]he other day I overheard a conversation about Google and how advertisers will not use their Analytics or Optimizer products because of the “big brother” fear. At other times, I have heard website owners claim that Google wiped out their business overnight through some diabolical plot to support other “preferred” websites.
Who knows…certainly I suppose undercurrents lurk mysteriously in the unknowns…it seems to be happening in politics lately where bribes, influences and other pressures force less than ethical decisions. But I am not sure I believe the big brother fear or the other conspiracy theories. The bigger you get the greater level of scrutiny you receive from governmental and watch dog groups.
For me, I’ll happily use Google Analytics, Optimizer and any other quality product they launch to make my clients more and more money and leave the conspiracy theorists to their own beliefs.
I agree with Kevin on happily using Google Analytics. It truly is a great product (and Google Optimizer looks nice as well), but one should always keep a watchful eye on a powerful corporation, just as you would on the government or any powerful person, to see what it could possibly do with all that data. One would be very naive otherwise.
Let’s take a look at some other examples in order to give an idea of how things could go wrong:
And yet, Google wants to expand further into online data storage?
It may seem like Google is the “Don’t Be Evil” company at this point in time, but what happens on the day when there is new management? New management may no longer be concerned about the scrutiny, and may think it can get away with misusing the data it has.
Oh, and just because a corporation or government is bigger does not mean it will be more likely to avoid trouble–otherwise the United Nations would be the cleanest government in the world.
3 Dec
Not as many people use Yahoo as once did (although around 20% is still a decent number), but here are the top searches:
- Britney Spears
- WWE
- Paris Hilton
- Naruto
- Beyonce
- Lindsay Lohan
- RuneScape
- Fantasy Football
- Fergie
- Jessica Alba
I’m pleasantly surprised that RuneScape is on there next to some very uninteresting and stupid topics, but then again I do have a bias towards games and am very anti-pop stuff.
I personally like the category sections more:

Admittedly, if this was a list for Google, I’d be going into far more detail, but with it being Yahoo… eh.