Google: New Algos or "SEO Filter"?
The recent changes to Google results have created a lot of buzz within the SEO community. Many theories have been discussed and debated. The "SEO Filter" discussion seems to be the one that holds water, or rather, has the fewest leaks.
However, there are anomalies within the results that don't point to a single "theory" being totally true. My belief is
that we are seeing newly acquired technology and PageRank modifications
in play.
This article makes use of a number of opinions by
people whom I personally regard highly as SEOs who do ongoing research and are members of
a forum which I know to be pretty reliable. Foremost I'd like to thank all
the IHY moderators especially Dan0
(Dan Thies), who pointed to the TSPR paper and the author's status with Google,
Alan Perkins for a concise
description of how TSPR seems to be used with a query and a special thanks to
Bernard Ertl, whose knowledge of his
competitors and research of a Florida SERP were very important in bringing Bob and
me to the conclusions we have about the recent Google update. I'm not
so sure Bob is in total agreement with our research but... we'll see, won't we
grasshopper!
The "SEO Filter" Theory
Barry Lloyd, AKA
MakeMeTop, was the first to advance
an
explanation for some wild moves downward in the results.
The theory is that sites that had used link text and reciprocal link campaigns to manipulate results were being penalized. Or were they penalized?
Barry also pointed out two articles which
served to explain a number of the changes which were being reported in the forums. The
Hilltop
paper written by Krishna Bharat and George A. Mihaila, and the
CIRCA
Technology white paper by Applied Semantics (a company recently acquired by Google and the creators of the technology behind AdSense)
were cited by Barry as possibly related to what has been seen in the Google
SERPs of late.
Hilltop made a lot of sense to me, and its ideas
seemed to be reflected in the results. The paper and one of its authors were in the news after a conference, and its approach negated the type of "manipulation" I personally felt had been too prominent in the results.
That manipulation appeared to be at the centre of what the filters seemed to be
filtering. Or was it really a filter?
The technique which I refer to is something I like to call "optimization through promotion." Basically, a link campaign is
run, requesting specific link text and descriptions. In most of the sites I looked at the pages linked to were optimized for that text. Hence, the theory was that an "SEO Filter" was in play and that the old results could be found through added text in the query. The
"-adsfs -adsf"
or "-googlegoo" query supposedly shows the "SEO Filter" at work. Or does it?
Other SERPs demonstrate that it could just be a filter for "SEO over-optimization," tripped by having
overly high keyword density (commonly found in bad spam
implementations of cloaking and keyword stuffing), over-optimization of a title or headings and a host of other well-known SEO techniques.
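As a thought experiment, the kind of density check this theory implies could be sketched as follows. Everything here is an assumption made for illustration: the phrase-matching rule and the 0.15 threshold are mine, not Google's actual filter.

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """Fraction of the words in `text` that belong to occurrences of `phrase`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    # Count every position where the full phrase appears.
    hits = sum(
        1 for i in range(len(words) - n + 1)
        if words[i:i + n] == phrase_words
    )
    return (hits * n) / len(words) if words else 0.0

# Hypothetical threshold: pages above it would be flagged "over-optimized".
OVER_OPTIMIZED = 0.15

page = ("cheap widgets cheap widgets buy cheap widgets "
        "our cheap widgets are the best cheap widgets")
density = keyword_density(page, "cheap widgets")
print(f"{density:.2f}", density > OVER_OPTIMIZED)
```

A stuffed page like the toy example above scores far over any sane threshold, which is exactly the "stupid over-doer" case a filter would catch easily.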
This particular theory seemed to hold some water, but not enough for me to
accept it as a complete explanation for the results Google is displaying.
A good example of this is the
Examination of a 'Florida' SERP discussion and SERP
analysis prepared by Bernard, a moderator at IHY. I found this to be anomalous
compared to the other SERPs I'd seen supposedly affected by the "SEO Filter". It filtered the site but the linking part of the theory didn't hold water because
it was linked to by a number of the "experts" identified in the SERP. In my
opinion, it was an anomaly because it possibly was only penalized for an SEO technique. Or was it penalized?
Perhaps it was the high keyword density alone. Links were normal and most used the company name
rather than the optimized phrase in the anchor text. That is not to say that even a single inbound link with the matching text couldn't trip the filter. That very well could be the case if it is in fact an "SEO Filter,"
but as some have noted this would be very easy to manipulate by malicious webmasters.
In the final analysis it was this SERP and others like it which helped me to
better understand what was taking place in the Google results. In my final
conclusions I will try to shed some light on this SERP.
Applied Semantics Influence
Another commonly discussed topic is the seemingly conspicuous division of the algorithmic results and
ads into commercial and informational categories. This has raised a lot of eyebrows with the constant
rumor of Google going public, the theory being that it is a money grab. I can only hope this isn't the case because I've yet to see that work out in a positive result for anyone, stockholders or users!
One thing that I have noticed in the backlinks I watch is the sudden creep of
ads running on sites into the backlinks counted in PR calculations. As far as I
can tell, that is new, and distressing. One site I watched rise up the SERPs suddenly died. It was running
run-of-site ads on some obscure directories. I'm watching a couple of others in the same situation to see whether they are filtered now, or just disappearing with refreshes of the database.
If you read the Applied Semantics paper you'll note some discussion of stemming. In
my opinion, the stemming that is being reported and the belief that Google is
now stemming is not always reflected in the results. It is still selective stemming,
but with a larger "dictionary" being used.
This is likely influenced by the stemming used in the Applied Semantics technology.
I believe that the enlarged stemming dictionary explains the "optimisation" and "optimization" change. In the past they
were separated, but now you find both together.
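To illustrate what "selective stemming with a larger dictionary" would mean in practice, here is a minimal sketch. The variant groups and the dictionary-lookup approach are assumptions for illustration, not Google's implementation: only terms listed in the dictionary get conflated, which is what makes the stemming selective rather than universal.

```python
# Hypothetical variant groups; only terms in these groups are conflated.
STEM_GROUPS = [
    {"optimization", "optimisation", "optimizing"},
    {"engine", "engines"},
]

# Build a variant -> canonical-form lookup from the groups.
STEM_DICT = {}
for group in STEM_GROUPS:
    canonical = sorted(group)[0]
    for variant in group:
        STEM_DICT[variant] = canonical

def normalize_query(query: str) -> list:
    """Map each query term to its canonical form if it is in the
    dictionary; leave every other term untouched."""
    return [STEM_DICT.get(term, term) for term in query.lower().split()]

print(normalize_query("search engine optimisation"))
print(normalize_query("search engine optimization"))
# Both spellings normalize to the same terms, so both retrieve the
# same results -- while words outside the dictionary are unaffected.
```

Enlarging the dictionary, as the Applied Semantics technology appears to encourage, means more groups get merged over time, which would explain why "optimisation" and "optimization" now surface together.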
While researching with Bob some terms with which I'm very
familiar, we came across some SERPs which did show stemming being
used. Bob is in Boston, Massachusetts and I'm in
Toronto, Ontario. We both saw different ads, and most importantly the stemming was
different depending on the ads which were being displayed. This seems to be some
proof of my theory that stemming is a result of influences from the Applied
Semantics ad serving technology. There has also been the rumor that this
technology is in early release and is only being used in cases where the user's
physical location invokes use of the technology or possibly different parts of
the technology.
The best indication of this is the idea that if "true"
stemming were being implemented, Google would be plural-insensitive. In the cases
I looked at, it was not, at least at first. Every day I see new instances of Google stemming, but nothing has changed:
you can't optimize a page for stemmed results; you still need to target them on
different pages (my favorite method), and watch the SE referrers to determine
which is most important. I go on the premise that most people are looking for
more than one result, so they query using plurals, but that is neither a proven nor a
researched technique. Call it an educated guess.
Is it Topic-Sensitive PageRank?
Dan0, a moderator at IHY, pointed out the
Topic-Sensitive PageRank
paper by Taher H. Haveliwala, which has some similarities to and overlaps with the Hilltop paper, and would if implemented result in
"experts" regaining prevalence in the results. Note in the early days of Google
this was the case. It gained popularity quickly with researchers for this reason.
In the paper, Haveliwala, an employee of
Google since October, discusses the use of topics to improve PageRank:
"To yield more accurate search results,
we propose computing a set of PageRank vectors, biased using a set of
representative topics, to capture more accurately the notion of importance with
respect to a particular topic. By using these (precomputed) biased PageRank
vectors to generate query-specific importance scores for pages at query time, we
show that we can generate more accurate rankings than with a single, generic
PageRank vector. For ordinary keyword search queries, we compute the
topic-sensitive PageRank scores for pages satisfying the query using the topic
of the query keywords. For searches done in context (e.g., when the search query
is performed by highlighting words in a Web page), we compute the
topic-sensitive PageRank scores using the topic of the context in which the
query appeared."
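The quoted approach can be sketched in miniature: precompute one biased PageRank vector per topic, then combine the precomputed vectors at query time. The toy link graph, the topic-to-page assignments, and the query-topic weights below are all made up for illustration; the actual paper biases the vectors using the 16 top-level ODP categories.

```python
import numpy as np

# Tiny toy link graph: page i links to the pages in links[i].
links = {0: [1, 2], 1: [2], 2: [0], 3: [2, 4], 4: [3]}
N = len(links)

def pagerank(personalization, d=0.85, iters=100):
    """Power iteration with a teleport (personalization) vector v:
    r = d * A r + (1 - d) * v -- the biased-PageRank form in the paper."""
    v = personalization / personalization.sum()
    r = np.full(N, 1.0 / N)
    for _ in range(iters):
        new = (1 - d) * v
        for page, outs in links.items():
            for out in outs:
                new[out] += d * r[page] / len(outs)
        r = new
    return r

# Precompute one biased PageRank vector per topic. The topic -> page
# assignments here are invented for illustration.
topics = {"sports": [0, 1], "finance": [3, 4]}
biased = {t: pagerank(np.isin(np.arange(N), pages).astype(float))
          for t, pages in topics.items()}

# At query time, combine the precomputed vectors weighted by the
# probability that the query belongs to each topic (toy weights here).
query_topic_probs = {"sports": 0.8, "finance": 0.2}
scores = sum(p * biased[t] for t, p in query_topic_probs.items())
print(np.argsort(-scores))  # pages ranked for this query's topic mix
```

The key design point is that the expensive part (the per-topic vectors) is precomputed offline, so the query-time work is just a weighted sum, which is what makes the scheme practical at search-engine scale.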
Interesting that the last sentence mentions
using "context in which the query appeared" and Applied Semantics' technology is
also contextual. I see a theme here.
Of particular interest is this statement later
in the article:
"By making PageRank topic-sensitive, we avoid the problem of heavily linked
pages getting highly ranked for queries for which they have no particular
authority [3]. Pages considered important in some subject domains may not be
considered important in others, regardless of what keywords may appear either in
the page or in anchor text referring to the page".
Are the present results being caused by a
filter or a new algorithm? If TSPR is truly what we're seeing, it seems apparent
that it is a new algorithm. It was noted in the IHY staffroom discussion that
keywords could be easily mistaken for topics in this context.
Some clues to TSPR usage are found in the SERPs
and Google's directory. After a short disappearance the ODP category links are
back in the results for many sites. Or are they really ODP categories? Perhaps
they are TSPR topics and descriptions.
One of the anomalies I've noticed coming up in these results
is that sites with no ODP category for the page in the results seem to slip past "non-relevant" incoming-link filters.
These sites have large numbers of
irrelevant links, many using optimized link text. A term for which I watch the SERP closely
is "search engine optimization." For obvious reasons, at any given time
I know
the top 30 and when they last appeared there, have done link research on them,
and have a pretty good idea how they got there and stay there.
Because of my work with SeoPros, let's just say this SERP gets extra attention and
has been a bit of a hobby for more than a few years. Yeah, I know, get a life.
There are two sites that are slipping through
the cracks, both of which have one thing in common: they either don't have an ODP listing
(never a good sign for an SEO), or the URL shown in the Google directory results is the one from the Google algo search
SERP rather than the URL in the ODP listing. When you look in the Google directory, which is supposed to be a
clone of ODP, the instances that I and Bernard found give a good explanation of most of the
questioned ranks. Any result that is stemmed, or could be stemmed, is likely to
be fluctuating for a while.
One additional clue is a site I had been watching,
which in my opinion should have tripped the filter by adding the link text
phrase to the title. It had an effect, but a positive rather than a negative one. In my opinion, if the "SEO Filter" is a reality, this should have affected the rank negatively. Instead the site rose to #1, up from #5-10ish, then dropped to #2 a few days later.
This was after the phrase had not been in the title for weeks prior to this
update. In fact, I noticed the change after checking my notes and confirming a
discussion that Bob and I had a while back about the optimization of the page,
or, ummm, lack thereof... says a little about the link text. Or does it?
Another clue to the adjustment of PR using the implementation in the
paper is how ODP categories are influencing results. Note that sites which place well, and those below them, seem to be influenced by the ODP category in which they are included. This is an indication that, as the
paper suggests, the categories of the ODP are being used to identify "experts" for a topic.
Nothing definitive here; I'll let you make up your own mind about this. In my
opinion, if ODP is being used this way, it's because of the quality of the editors and the meta editors
overseeing them. The paper also discusses adversarial editors and specifically mentions
the ODP, but this could just as well be Google itself at present. All Google
really requires is the initial ODP dump and they have all they need to use TSPR.
Who knows? The Google directory may be a reflection of the topics and experts. Funny,
Google's directory does look an awful lot like the algo search results.
Conclusions
In the end, the anomalies in the Florida SERP that Bernard, I, and others had
been discussing were, in my opinion and that of others, influenced by the
"authority" pages among which the sites sat, and the PR associated with those
"authority" pages. In other words, good old PR
as we have always known it! Bernard believes there is at least one anomaly
remaining; however, IMO, his data is slightly skewed by "non-authoritative"
links. I used link: in my research and found that the quality of the incoming
links seemed to smooth out the anomalies. IMO, www.domain.com -site:www.domain.com
is showing links Google PR has never counted, as many are less than PR4 or have
whitebars.
These changes have been ongoing since April,
with the explanation that new "filters" for inappropriate behavior or unwanted
manipulation were being added. The new "filters"
were rumored to be for hidden text and links. In my opinion, only human review can accurately detect hidden text and links, and that seemed to be borne out by the results of the new "filters":
they caught the stupid over-doers, but not the informed, calculated implementations.
In my opinion, SEOs' fear of penalties and the seemingly adversarial
relationship they have with engines is part of the reason for the "SEO Filter"
being pointed to as the source of the changes when all it really was is Google
trying to make itself better for its users!
The Goo?
The goo, as it has come to be known, could just be
Google giving you what they think you want, namely, the old index. You're
searching with negative phrases so it's using the old data because the new
Google doesn't do that yet. I have my opinion of Google's present results, and
search engines determine what's unwanted manipulation, not me, or anyone else I
know.
Hilltop?
Interesting, but if it is Hilltop, in my
opinion there is another shoe to drop.
Although current SERPs appear Hilltopesque, it is either only partially implemented to date or the real change is reflected in the
TSPR
paper. I lean toward a new Google employee overseeing implementation on a grander scale.
Haveliwala's
paper was researched using Stanford's index of about 120 million pages. Guess
what else was based on the same Stanford index -- Google, circa 1998.
"SEO Filter"?
There are indications that there is some sort of linking filter in place,
however, it could also be the result of the new algorithm. For some engines, the siege mentality is greater than for others. I have always believed that Google was the least adversarial
toward SEOs.
But then again I might be just sayin' that to get on their good side.
As Dan0 has pointed out a few times, the "SEO
filter" theory is a subjective conclusion.
There is plenty of evidence that a lot of the sites thought to be filtered for SEO were just subject to the changes in
the algo and the designation of "experts"
and topics.
There is no absolute proof of this, but in my opinion it fits with past actions by
Google. Google seldom bans, preferring to adjust ranking by using PR. The "SEO Filter" seems
more like a ban for the optimized term than an adjustment.
In short, do what we did in the old days. It never stopped working; it's just a
little slower. In the end, nothing Google, or any SE for that matter, does can
change that: you still have links to your good content, and the directories are
always looking to improve topics. Good content and a long-term plan are what's
left standing after all is said and done; always has been, always will be.
Google just gave SEOs another lesson in why!
See Ya' at The Top
Da' Tmeister
Edited by
Bob Gladstein, with
some research by the
generous moderators, owner Doug Heil, and Members of the
IHY forums who willingly share their collective knowledge, without which
I'd probably still be scratchin' my noggin too!