A view on Google’s
Patent: Information Retrieval Based on Historical
Data
by: Peter Faber
Google doesn’t stop innovating
its search engine, and where others
try to follow, Google is not just 1 step ahead,
but 10 steps ahead. Its latest innovation,
which may actually have been in place for a
year or longer, can be found in the patent
“Information Retrieval Based on Historical
Data.”
The abstract of the patent is:
“A system identifies a document and obtains
one or more types of history data associated
with the document. The system may generate a
score for the document based, at least in part,
on the one or more types of history data.”
This article aims to
give a simplified representation of this patent
and recommends the SEO techniques most likely
to obtain high rankings,
with a specific focus on links. This article
is the opinion of the writer, and following the
recommendations in it is done at your own risk.
Google’s search results
have become increasingly difficult to explain,
and many theories have been developed about what
is going on. The most popular is the “sandbox”
theory, which says that a new site
is put in a virtual sandbox and has to wait
until it has aged before obtaining high rankings.
This patent has some excellent information that
can explain this phenomenon.
Information Retrieval
The information that Google’s invention
claims to retrieve from the
historical data is:
Age/Time
Change
Trends
A score is calculated from the 3 factors above.
This score can then, at least partially, be used
to rank the selected pages.
Historical Data
The patent describes a huge
amount of historical data. The following is
an overview of most items for which historical
data can be measured:
Pages/sites
Links
Anchor Texts
Content
Query
Traffic
Ranking
User
Domain
Ranking Based On Information Retrieved From
Historical Data
The patent describes in considerable
detail how selected pages are ranked
based on the information retrieved from historical
data. This section describes the basic logic
applied.
Age/Time
For each kind of historical data, a date
of inception is used to determine 4 important
values:
Age
Average Age
Date
Average Date
These factors can be determined for pages, links,
anchor text, content, topics, queries, etc.
Comparing the age or date of a page to the average
of the site for example tells the search engine
if this information is relatively new or old.
Comparing the average age or
date of a page to the average age or date of
all pages selected for a query (keyword phrase)
tells the search engine if the page is relatively
new or old. This information can be used to
rank the selected pages.
Comparing to an average has
the advantage that no preset set of rules
is needed to determine the ranking of a page.
For one query 6 months may be considered new
(product descriptions, for example), while for
another query 6 days may be considered old (news
items, for example). It all depends on the average
age.
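The comparison to an average can be sketched in a
few lines of Python. This is only an illustration
of the idea, not the method from the patent: the
function name relative_age, the sample dates and
the use of a simple ratio are all assumptions made
for this example.

```python
from datetime import date


def relative_age(inception: date, peers: list[date], today: date) -> float:
    """Age of a document divided by the average age of its peers.

    Values below 1.0 mean the document is newer than average,
    values above 1.0 mean it is older than average.
    (Hypothetical helper; the patent does not prescribe this formula.)
    """
    if not peers:
        raise ValueError("at least one peer document is required")
    age = (today - inception).days
    average_age = sum((today - d).days for d in peers) / len(peers)
    if average_age == 0:
        raise ValueError("all peers were created today; ratio is undefined")
    return age / average_age


today = date(2005, 6, 1)

# Product pages: the peers are about two years old, so six months is "new".
product_peers = [date(2003, 6, 1), date(2003, 1, 1), date(2004, 1, 1)]
product_score = relative_age(date(2004, 12, 1), product_peers, today)

# News items: the peers are one to three days old, so six days is "old".
news_peers = [date(2005, 5, 31), date(2005, 5, 30), date(2005, 5, 29)]
news_score = relative_age(date(2005, 5, 26), news_peers, today)
```

With these made-up dates the six-month-old product
page scores well below 1.0, while the six-day-old
news item scores well above it, even though the
news item is far younger in absolute terms.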
This same logic applies to links.
In order to determine how popular a page or
site is, the average age of all back links tells
the search engine if the popularity of the page
is recent or not. It makes sense that if most
back links were obtained 4 years ago and
hardly anybody has been interested in linking
to this page/site since then, the page
is not as popular as the existing back links
would suggest.
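A minimal sketch of this idea, assuming the search
engine stores a first-seen date for every back
link. The function names, the one-year window and
the sample data are hypothetical and only serve
to illustrate the reasoning.

```python
from datetime import date


def average_link_age_days(first_seen: list[date], today: date) -> float:
    """Mean age, in days, of a page's back links."""
    return sum((today - d).days for d in first_seen) / len(first_seen)


def recent_share(first_seen: list[date], today: date, window_days: int = 365) -> float:
    """Fraction of back links first seen within the last window_days."""
    recent = sum(1 for d in first_seen if (today - d).days <= window_days)
    return recent / len(first_seen)


today = date(2005, 6, 1)

# Hypothetical page: nine links from four years ago, one from last month.
links = [date(2001, 6, 1)] * 9 + [date(2005, 5, 1)]

stale_average = average_link_age_days(links, today)
stale_recent = recent_share(links, today)
```

Here the average link age is several years and
only a small share of the links is recent, which
is the situation described above: the raw link
count overstates how popular the page is today.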
The patent even goes as far
as determining age factors for the anchor texts
of links.
Change

Information changes over time.
Opinions change, knowledge changes, popularity
changes, etc. As mentioned before, a page
that was popular 4 years ago may be totally
forgotten now, but still have most of the backlinks
that were obtained when the page actually was
popular. However, if this page all of a sudden
becomes popular again, and new back links start
showing up, the average age of the backlinks
will remain high. This will prevent the page
from ranking high.
Detecting changes is crucial
to give old information the chance to rank high
again. Consequently, the lack of change can
be a reason to lower the rank of a page.
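The difference between an average and a change
can be shown with a small calculation. The sketch
below uses made-up link ages, and the 30-day
window and the factor of 2 are arbitrary
assumptions, not values from the patent.

```python
def links_per_day(ages_days: list[int], start: int, end: int) -> float:
    """Back links gained per day among links aged start <= age < end days."""
    count = sum(1 for age in ages_days if start <= age < end)
    return count / (end - start)


# Hypothetical page: 100 links from about four years ago, 10 from this month.
ages = [1460] * 100 + [15] * 10

# The average barely moves: it stays at roughly 1329 days.
average_age = sum(ages) / len(ages)

# The rate of new links tells a different story.
recent_rate = links_per_day(ages, 0, 30)        # last 30 days
baseline_rate = links_per_day(ages, 30, 1500)   # everything before that

# Assumed rule of thumb: flag a change when the recent rate is more
# than twice the long-term rate.
changed = recent_rate > 2 * baseline_rate
```

The average age alone would still mark the page as
old, while the comparison of link rates picks up
that something has changed recently.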
Trends
Even though comparing to averages
is a great way to get information about freshness,
it fails to recognize smaller events like a
sudden increase in popularity of a page. Though
detecting changes does help to recognize smaller
events, more information can be obtained by
detecting trends.
Sudden increases in popularity
can be caused by seasonal events like Christmas
or the Super Bowl. For this reason the search
engine will try to determine trends within pages,
links, anchor text, content, topics, queries,
etc. Detecting trends makes it possible to rank
pages higher that would not be ranked high with
the standard ranking methods or by comparing
to average ages or dates. Google has recognized
here a very important fact about information: its
relevance and importance are temporary.
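One simple way to express a trend is the slope of
a line fitted through a series of measurements.
The patent does not say how trends are computed,
so the least-squares slope below is a swapped-in
technique chosen for this example, and the weekly
counts are invented.

```python
def slope(values: list[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...).

    Needs at least two values.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical weekly counts of queries on a topic.
steady = [100, 102, 98, 101, 99, 100]     # no trend
rising = [100, 120, 150, 190, 240, 300]   # run-up to a seasonal event

steady_slope = slope(steady)
rising_slope = slope(rising)
```

The steady series gives a slope close to zero,
while the rising series gives a clearly positive
slope, which is the kind of signal that could lift
a page in the weeks before a seasonal event.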
Detecting Spam Using Historical
Data
Having all kinds of historical
data available makes it possible to detect search
engine spam. Unexpected events that happen to
a site can be an indication of spam. Obviously
a strong improvement in 1 single factor would
not be a direct indication of spam. Generally
multiple factors show strange behavior
when a site is using spam to increase rankings.
It would not be in Google’s interest to
penalize a site for advertising. However, excessive
advertising on sites/pages that are totally
unrelated will not do your site any good.
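The idea that spam shows up in several factors at
once can be sketched as follows. The factor names,
the 5-fold threshold and the minimum of 2 factors
are hypothetical; the patent gives no such
numbers.

```python
def looks_suspicious(
    growth: dict[str, float], jump: float = 5.0, min_factors: int = 2
) -> bool:
    """True when at least min_factors grew by more than jump-fold.

    growth maps a factor name to this period's value divided by the
    previous period's value (assumed to be measured elsewhere).
    """
    jumped = [name for name, ratio in growth.items() if ratio > jump]
    return len(jumped) >= min_factors


# One factor spikes: could simply be a successful advertising campaign.
one_spike = {"back_links": 8.0, "anchor_text_changes": 1.1, "traffic": 1.2}

# Several factors spike together: worth a closer look.
many_spikes = {"back_links": 8.0, "anchor_text_changes": 6.5, "traffic": 1.2}

single_flag = looks_suspicious(one_spike)
multi_flag = looks_suspicious(many_spikes)
```

A single strong improvement is not flagged, in
line with the reasoning above, while simultaneous
jumps in several factors are.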
Recommendations
Nothing has changed with regard to
links. This patent pretty much confirms what
we at www.textlinkbrokers.com already knew and
have been explaining to our customers.
The following recommendations can be helpful:
Keep links related
Related links matter; unrelated
links can be considered spam.
Build links on a continuous,
moderate basis
As the patent describes, the
average age of your backlinks should not be
too high. It is therefore wise to continue adding
backlinks to secure a reasonable average age
of all your backlinks. How many you need to
add over time depends on your market.
Be better than the average
It is very important to be better
than the average, but don’t overdo it.
That would be expensive and unnecessary.
Focus on seasonal events
A good way to increase the success
of your website is to set up text link campaigns
for seasonal events. Start your advertising
campaign 2 to 3 months before the actual event
to give Google the time to find the links and
update your site’s information with it.
After the event you can let these links go again.
Spread links over multiple sites
(unique backlinks)
A very important factor is the
number of unique websites in your backlinks.
Google seems to put a strong emphasis on this
factor.
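To check how well your own backlinks are spread,
you can count the distinct sites they come from.
The sketch below treats each hostname (with a
leading "www." removed) as one website, which is a
simplifying assumption; the URLs are placeholders.

```python
from urllib.parse import urlsplit


def unique_linking_hosts(backlink_urls: list[str]) -> set[str]:
    """Distinct hostnames among the given backlink URLs."""
    hosts = set()
    for url in backlink_urls:
        host = urlsplit(url).hostname  # lower-cased, None if missing
        if host:
            hosts.add(host.removeprefix("www."))
    return hosts


# Placeholder URLs: three links, but only two distinct sites.
backlinks = [
    "http://www.example.com/page-a",
    "http://example.com/page-b",
    "http://www.example.org/links",
]
hosts = unique_linking_hosts(backlinks)
```

Three links from two sites count as two unique
backlinks here, so adding more links from a site
that already links to you does not raise this
number.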
About The Author
Peter Faber is an Internet marketing consultant
working for http://www.textlinkbrokers.com,
an SEO company specializing in link building.
He has his own personal blog at http://www.seo-works.com.