<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Checking your supplemental page count</title>
	<atom:link href="http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/</link>
	<description>Terrible writing and mere conjecture</description>
	<pubDate>Fri, 25 Jul 2008 20:52:47 +0000</pubDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: Introducing a new SEO term: Supplemental-Only &#187; JLH Design Blog</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-4914</link>
		<dc:creator>Introducing a new SEO term: Supplemental-Only &#187; JLH Design Blog</dc:creator>
		<pubDate>Tue, 10 Jul 2007 21:42:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-4914</guid>
		<description>[...] &#8220;My site went supplemental&#8221; A great discourse took place right on this blog (read the comments) that of course didn&#8217;t get a lot of airplay, but had great [...]</description>
		<content:encoded><![CDATA[<p>[...] &#8220;My site went supplemental&#8221; A great discourse took place right on this blog (read the comments) that of course didn&#8217;t get a lot of airplay, but had great [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sebastian</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-286</link>
		<dc:creator>Sebastian</dc:creator>
		<pubDate>Fri, 09 Mar 2007 19:05:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-286</guid>
		<description>I'm sooo sorry!</description>
		<content:encoded><![CDATA[<p>I&#8217;m sooo sorry!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JLH</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-285</link>
		<dc:creator>JLH</dc:creator>
		<pubDate>Fri, 09 Mar 2007 18:39:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-285</guid>
		<description>How can we make wild-ass-guesses if you are going to start injecting logic into the equation?</description>
		<content:encoded><![CDATA[<p>How can we make wild-ass-guesses if you are going to start injecting logic into the equation?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sebastian</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-284</link>
		<dc:creator>Sebastian</dc:creator>
		<pubDate>Fri, 09 Mar 2007 18:34:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-284</guid>
		<description>Identical timestamps could be explained with crawler optimization. If one crawler has fetched a page it's put in a cache where every process can get a copy for its own purposes. That's no proof for the one database theory, and it doesn't prove that there are two databases;)</description>
		<content:encoded><![CDATA[<p>Identical timestamps could be explained with crawler optimization. If one crawler has fetched a page it&#8217;s put in a cache where every process can get a copy for its own purposes. That&#8217;s no proof for the one database theory, and it doesn&#8217;t prove that there are two databases;)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JLH</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-280</link>
		<dc:creator>JLH</dc:creator>
		<pubDate>Wed, 07 Mar 2007 18:05:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-280</guid>
		<description>Excellent points by all so far.  As an aside, this is why I have published my nofollow policy: the comments here have added valuable content to this page and should, at a minimum, offer links to the authors.

Truly I stopped worrying about supplementals a while back around the bidcrappy time when the site: operator went to hell.  But it's an issue that comes up often in discussion and with your insights I'm getting a clearer view on it.  

As satisfying as it is to answer somebody's question with, "don't worry about pagerank, just build your site for users and you'll get links naturally," it really doesn't put the questioner at ease any.  Same thing goes with, "My site went supplemental, what should I do?"  Perhaps the right answer is "don't worry about it; all pages are actually in the supplemental index, so work on your site and garner natural links so that more pages stay in the regular index."  But again, that's not very satisfying to the average panicking pickle jar art salesman on the web.  Having concrete examples like the ones Halfdeck showed, and clinical observations from John, goes a long way toward putting someone's mind at ease.

I'm not sure if this is totally accurate, but I look at the supplemental index as a prioritizing tool for Google.  The entire web is growing faster than they can keep up.  By keep up I mean keeping up with a FRESH crawl of each page and even space in the SERPs.  Of course there isn't a finite number of search queries, as any combination of words could be used, but after a while I'm sure Google has a statistical hold on what the top 90% of searches are for.  Within those searches they only need to show 1000 results, so in essence there is a finite number of results available, which they are constantly working on improving.  The biggest subset of the web, however, is going to be that remaining 10% (numbers just made up by me for clarity), which is pretty much an infinite number of queries and possible results.

On the other side of things, the internet itself is growing exponentially.  With a CMS like WordPress proliferating on the web, anybody can publish a 100 page site in minutes.  When Larry and Sergey put this whole Google thing together in their dorm room it took some effort to publish a site; there were large obstacles to getting in the game.  Now domains are $1.99 and hosting is cheaper.  Sure, they add new data centers and new crawlers and have geniuses working for them who can scale the database up so that you get a search result in 0.0245 seconds, but there is a limit to their growth potential based on hardware, bandwidth, and database updates.

Given the nature of a relatively finite number of 'popular' searches and an infinite amount of internet growth, coupled with limitations on crawling capacity growth, they had to come up with a solution.  That solution, in my opinion, is the supplemental index.  It's a status given to all "discovered" urls that have any value whatsoever, meaning a link at some point in time had pointed to the page.  Granted, the page may not come up for searches often, may be gone, may actually have little or no value, but it's their duty as a good search engine to at least keep it in the index.  However, they don't have the resources to update it as often as the washingtonpost's home page, nor should they.  Thus was born the crawler priority, with supplemental being the lowest priority.

I don't have a clue where the cut-offs are but would imagine that every url that is deemed worthy of being indexed is given a crawler priority.  Some are updated daily, others weekly etc.  All urls are in the supplemental crawl priority which is months rather than days or weeks.  

Now here's where it gets interesting from my point of view.  If what I said above is even remotely true, it's a scalable solution to a point, but it requires the judgement of a computation to decide when to crawl pages.  A judgement that in my mind is pretty bad at times.

Looking at my own server logs I am amazed at some of the pages that get crawled regularly, but equally confused at some of them that don't.  Of course, if you want to you can play with site structure and external links to improve a page's crawling, but is that really a productive use of one's time?

To this end, I offer a proposal to the Google team that I'm sure none will read, but I'm going to do it anyway because this is my blog dammit!

1)  Remove the green supplemental thing.  The average searcher has no idea what the heck it means anyway and it does nothing but infuriate the webmasters that do care.  You can still have the supplemental priority and all, I just don't see the point of the label.  I think it's Google's way of having a disclaimer that the URL hasn't been crawled in a while so they can't say if it's going to even look like the cached copy.

2) For a given site I'm sure locked away somewhere in a database is a number that's calculated based on the amount of pages in the site and the crawl priority for each page.  Call this crawl load.  The crawl load is calculated somehow like this:

Say you have a 100 page site.  Google has given values to each of those 100 pages, fictitiously let's say that 10 are to be crawled weekly, 30 are to be crawled bi-weekly, 30 are to be crawled monthly, and the remaining 30 are to be crawled every 3 months.  This would give us a calculated crawl load of (10 x 4 x 3) + (30 x 2 x 3) + (30 x 3) + (30 x 1) = 420 or 420 crawls per given 3 month period.  

3)  Now we get into sitemaps, which brought us all together in the first place...the webmaster gets to give a hint of what they think is a crawl priority based on their knowledge of the site.  For instance my T&#038;C page, the contact page, and all of the product descriptions haven't changed in 9 months so I'll put those at the lowest of the scale, say 10.  However I update my blog daily and have a news page that updates every 3 days or so, so I prioritize those higher.

4)  Google then calculates what your crawl load score is based on your recommendations; if it's lower than theirs they use it.  If it's higher than theirs they use theirs.

5)  Let this fact be known and you'll have webmasters all around the world scrambling to optimize their crawl priorities such that their most important pages are crawled regularly and the ones that are the most static are not.  It would be a win-win: Google can concentrate on crawling new sites more with the found bandwidth, and the site owner would be happy because all of their caches would reflect what's actually on the page.

Now this is just a dream of mine and carries no weight at all, but if you're still reading, thanks.</description>
		<content:encoded><![CDATA[<p>Excellent points by all so far.  As an aside, this is why I have published my nofollow policy: the comments here have added valuable content to this page and should, at a minimum, offer links to the authors.</p>
<p>Truly I stopped worrying about supplementals a while back around the bidcrappy time when the site: operator went to hell.  But it&#8217;s an issue that comes up often in discussion and with your insights I&#8217;m getting a clearer view on it.  </p>
<p>As satisfying as it is to answer somebody&#8217;s question with, &#8220;don&#8217;t worry about pagerank, just build your site for users and you&#8217;ll get links naturally,&#8221; it really doesn&#8217;t put the questioner at ease any.  Same thing goes with, &#8220;My site went supplemental, what should I do?&#8221;  Perhaps the right answer is &#8220;don&#8217;t worry about it; all pages are actually in the supplemental index, so work on your site and garner natural links so that more pages stay in the regular index.&#8221;  But again, that&#8217;s not very satisfying to the average panicking pickle jar art salesman on the web.  Having concrete examples like the ones Halfdeck showed, and clinical observations from John, goes a long way toward putting someone&#8217;s mind at ease.</p>
<p>I&#8217;m not sure if this is totally accurate, but I look at the supplemental index as a prioritizing tool for Google.  The entire web is growing faster than they can keep up.  By keep up I mean keeping up with a FRESH crawl of each page and even space in the SERPs.  Of course there isn&#8217;t a finite number of search queries, as any combination of words could be used, but after a while I&#8217;m sure Google has a statistical hold on what the top 90% of searches are for.  Within those searches they only need to show 1000 results, so in essence there is a finite number of results available, which they are constantly working on improving.  The biggest subset of the web, however, is going to be that remaining 10% (numbers just made up by me for clarity), which is pretty much an infinite number of queries and possible results.</p>
<p>On the other side of things, the internet itself is growing exponentially.  With a <acronym title="Content Management System">CMS</acronym> like WordPress proliferating on the web, anybody can publish a 100 page site in minutes.  When Larry and Sergey put this whole Google thing together in their dorm room it took some effort to publish a site; there were large obstacles to getting in the game.  Now domains are $1.99 and hosting is cheaper.  Sure, they add new data centers and new crawlers and have geniuses working for them who can scale the database up so that you get a search result in 0.0245 seconds, but there is a limit to their growth potential based on hardware, bandwidth, and database updates.</p>
<p>Given the nature of a relatively finite number of &#8216;popular&#8217; searches and an infinite amount of internet growth, coupled with limitations on crawling capacity growth, they had to come up with a solution.  That solution, in my opinion, is the supplemental index.  It&#8217;s a status given to all &#8220;discovered&#8221; urls that have any value whatsoever, meaning a link at some point in time had pointed to the page.  Granted, the page may not come up for searches often, may be gone, may actually have little or no value, but it&#8217;s their duty as a good search engine to at least keep it in the index.  However, they don&#8217;t have the resources to update it as often as the washingtonpost&#8217;s home page, nor should they.  Thus was born the crawler priority, with supplemental being the lowest priority.</p>
<p>I don&#8217;t have a clue where the cut-offs are but would imagine that every url that is deemed worthy of being indexed is given a crawler priority.  Some are updated daily, others weekly etc.  All urls are in the supplemental crawl priority which is months rather than days or weeks.  </p>
<p>Now here&#8217;s where it gets interesting from my point of view.  If what I said above is even remotely true, it&#8217;s a scalable solution to a point, but it requires the judgement of a computation to decide when to crawl pages.  A judgement that in my mind is pretty bad at times.</p>
<p>Looking at my own server logs I am amazed at some of the pages that get crawled regularly, but equally confused at some of them that don&#8217;t.  Of course, if you want to you can play with site structure and external links to improve a page&#8217;s crawling, but is that really a productive use of one&#8217;s time?</p>
<p>To this end, I offer a proposal to the Google team that I&#8217;m sure none will read, but I&#8217;m going to do it anyway because this is my blog dammit!</p>
<p>1)  Remove the green supplemental thing.  The average searcher has no idea what the heck it means anyway and it does nothing but infuriate the webmasters that do care.  You can still have the supplemental priority and all, I just don&#8217;t see the point of the label.  I think it&#8217;s Google&#8217;s way of having a disclaimer that the <acronym title="Uniform Resource Locator">URL</acronym> hasn&#8217;t been crawled in a while so they can&#8217;t say if it&#8217;s going to even look like the cached copy.</p>
<p>2) For a given site I&#8217;m sure locked away somewhere in a database is a number that&#8217;s calculated based on the amount of pages in the site and the crawl priority for each page.  Call this crawl load.  The crawl load is calculated somehow like this:</p>
<p>Say you have a 100 page site.  Google has given values to each of those 100 pages, fictitiously let&#8217;s say that 10 are to be crawled weekly, 30 are to be crawled bi-weekly, 30 are to be crawled monthly, and the remaining 30 are to be crawled every 3 months.  This would give us a calculated crawl load of (10 x 4 x 3) + (30 x 2 x 3) + (30 x 3) + (30 x 1) = 420 or 420 crawls per given 3 month period.</p>
<p>3)  Now we get into sitemaps, which brought us all together in the first place&#8230;the webmaster gets to give a hint of what they think is a crawl priority based on their knowledge of the site.  For instance my T&#038;C page, the contact page, and all of the product descriptions haven&#8217;t changed in 9 months so I&#8217;ll put those at the lowest of the scale, say 10.  However I update my blog daily and have a news page that updates every 3 days or so, so I prioritize those higher.</p>
<p>4)  Google then calculates what your crawl load score is based on your recommendations; if it&#8217;s lower than theirs they use it.  If it&#8217;s higher than theirs they use theirs.</p>
<p>5)  Let this fact be known and you&#8217;ll have webmasters all around the world scrambling to optimize their crawl priorities such that their most important pages are crawled regularly and the ones that are the most static are not.  It would be a win-win: Google can concentrate on crawling new sites more with the found bandwidth, and the site owner would be happy because all of their caches would reflect what&#8217;s actually on the page.</p>
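<p>The crawl load arithmetic in point 2 and the "use whichever is lower" rule in point 4 can be sketched in a few lines of Python. This is only an illustration of the proposal as worded in this comment, not anything Google is known to do. The names crawl_load and effective_crawl_load, the frequency labels, and the webmaster_view numbers are all made up for the example; the per-quarter multipliers use the same rounding as the worked example (4 weeks per month, 3 months per period).</p>

```python
# Crawls per page in one 3-month period, using the comment's own rounding
# (4 weeks per month, 3 months per period). Illustrative values only.
CRAWLS_PER_QUARTER = {
    "weekly": 4 * 3,
    "biweekly": 2 * 3,
    "monthly": 3,
    "quarterly": 1,
}


def crawl_load(pages_by_frequency):
    """Total crawls per 3-month period for a {frequency: page_count} mapping."""
    return sum(
        CRAWLS_PER_QUARTER[freq] * count
        for freq, count in pages_by_frequency.items()
    )


def effective_crawl_load(google_load, webmaster_load):
    """Point 4: the webmaster's hint is honored only if it lowers the load."""
    return min(google_load, webmaster_load)


# The 100 page site from the worked example.
google_view = {"weekly": 10, "biweekly": 30, "monthly": 30, "quarterly": 30}

# Hypothetical sitemap hint for the same 100 pages (assumed numbers).
webmaster_view = {"weekly": 5, "biweekly": 15, "monthly": 30, "quarterly": 50}

google_load = crawl_load(google_view)
webmaster_load = crawl_load(webmaster_view)
print(google_load, webmaster_load, effective_crawl_load(google_load, webmaster_load))
```

<p>With the example split, crawl_load(google_view) reproduces the 420 crawls per 3 month period worked out above, and the assumed webmaster hint comes out lower, so under point 4 the hint would be the one used.</p>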
<p>Now this is just a dream of mine and carries no weight at all, but if you&#8217;re still reading, thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron Pratt</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-278</link>
		<dc:creator>Aaron Pratt</dc:creator>
		<pubDate>Wed, 07 Mar 2007 13:30:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-278</guid>
		<description>All but the first link in the above image for Matt Cutts are duplicates so they should be in supplemental. Maybe cuz the title had the word "crappy" in it the algorithm saw that post as low value. ;)</description>
		<content:encoded><![CDATA[<p>All but the first link in the above image for Matt Cutts are duplicates so they should be in supplemental. Maybe cuz the title had the word &#8220;crappy&#8221; in it the algorithm saw that post as low value. <img src='http://www.jlh-design.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Halfdeck</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-277</link>
		<dc:creator>Halfdeck</dc:creator>
		<pubDate>Wed, 07 Mar 2007 13:05:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-277</guid>
		<description>Good questions John (and interesting point, Softplus).

Here's something Matt Cutts said that's often quoted (and worth quoting often):

"so when Bigdaddy didn’t select pages from a site, that would expose more supplemental results for a site."

Underline "expose." He didn't say pages "go" supplemental or they were "tagged" supplemental. He said more supplemental results would be "exposed" - as if they are usually hidden or masked.</description>
		<content:encoded><![CDATA[<p>Good questions John (and interesting point, Softplus).</p>
<p>Here&#8217;s something Matt Cutts said that&#8217;s often quoted (and worth quoting often):</p>
<p>&#8220;so when Bigdaddy didn’t select pages from a site, that would expose more supplemental results for a site.&#8221;</p>
<p>Underline &#8220;expose.&#8221; He didn&#8217;t say pages &#8220;go&#8221; supplemental or they were &#8220;tagged&#8221; supplemental. He said more supplemental results would be &#8220;exposed&#8221; - as if they are usually hidden or masked.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JohnMu</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-276</link>
		<dc:creator>JohnMu</dc:creator>
		<pubDate>Wed, 07 Mar 2007 08:02:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-276</guid>
		<description>The supplemental and the main index are filled with duplicates: you cannot "count" the number of pages (that's why the "about"-count is usually so far off) since the count (from the "database") is way off all the time.  If your site has a number of URLs in the index (take a shop for example), set up a Google Custom Search Engine. Adjust the filters using very fine changes: you'll see how all the different variations of the pages show up step by step (the more you filter out of the index). In the end, I saw that the average (dynamic) page in a forum was indexed over 10x with different variations of the URL, but in the index (with the site:-query) it was shown only once. Which count is the right one?

This is multiplied by using a plugin which does things that you do not know or cannot reproduce. This is a general problem with all tools that cannot show how a result was found: how do you know it's doing it "right" (especially when there is no obviously right way to do it)? Where is it getting its data from? And on Google: which datacenter - or does it not matter?

These things make the supplemental index so crazy - nobody can get a real grip on them :-). 

The question is - do we really need a grip on the items in the supplemental index or should the average webmaster concentrate on getting things into the main index (regardless of whether or not they're in the supplemental index)?</description>
		<content:encoded><![CDATA[<p>The supplemental and the main index are filled with duplicates: you cannot &#8220;count&#8221; the number of pages (that&#8217;s why the &#8220;about&#8221;-count is usually so far off) since the count (from the &#8220;database&#8221;) is way off all the time.  If your site has a number of URLs in the index (take a shop for example), set up a Google Custom Search Engine. Adjust the filters using very fine changes: you&#8217;ll see how all the different variations of the pages show up step by step (the more you filter out of the index). In the end, I saw that the average (dynamic) page in a forum was indexed over 10x with different variations of the <acronym title="Uniform Resource Locator">URL</acronym>, but in the index (with the site:-query) it was shown only once. Which count is the right one?</p>
<p>This is multiplied by using a plugin which does things that you do not know or cannot reproduce. This is a general problem with all tools that cannot show how a result was found: how do you know it&#8217;s doing it &#8220;right&#8221; (especially when there is no obviously right way to do it)? Where is it getting its data from? And on Google: which datacenter - or does it not matter?</p>
<p>These things make the supplemental index so crazy - nobody can get a real grip on them :-). </p>
<p>The question is - do we really need a grip on the items in the supplemental index or should the average webmaster concentrate on getting things into the main index (regardless of whether or not they&#8217;re in the supplemental index)?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JLH</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-271</link>
		<dc:creator>JLH</dc:creator>
		<pubDate>Wed, 07 Mar 2007 05:11:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-271</guid>
		<description>Excellent point Halfdeck, and duly noted (by the crossing out of my assumption in the post).  Not that it matters much, but you got me thinking: is supplemental really a separate database or just a status in the same database?

So, using the query I used to find the page that I highlighted as supplemental, we see a cache of:
http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/seo-mistakes-crappy-doorway-pages/+site:www.mattcutts.com+***+-view:adghasdtrb&#038;hl=en&#038;ct=clnk&#038;cd=5&#038;gl=us

Using your query the cache of the page is:

http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/seo-mistakes-crappy-doorway-pages/+SEO+mistakes+crappy&#038;hl=en&#038;ct=clnk&#038;cd=1&#038;gl=us&#038;client=firefox-a

Both have the same date stamp and the same cache:gYwvJd0xNv0J (whatever that is).

So IF (and that's a big if) the same cache ID means that it's the same file, then that one file is represented in both the supplemental and non-supplemental indexes.

I'd imagine that being supplemental is just a status that any page is automatically assigned, along with all the fun that comes with that status, such that it's crawled at the less frequent supplemental crawl rate and updated during the supplemental refresh.  Now a page may also have regular index status, which of course supersedes the supplemental status.

Here's another question.  Are there degrees of being purely supplemental?  That is, the page is not in the regular index at all, but there are different levels of supplementalization (new word), such that some pages are crawled every 8 weeks, others every 3 months, etc.

Coming out of supplemental would then be better defined as actually just being added to the regular index, as the page never actually leaves supplemental.  Of course if a page does leave the supplemental index then it's pretty much gone, which is much worse than being supplemental.

So for all of those crying to get out of the supplemental index, be careful what you wish for; you just may get it.  What you should really be pining for is to get back into the regular index.</description>
		<content:encoded><![CDATA[<p>Excellent point Halfdeck, and duly noted (by the crossing out of my assumption in the post).  Not that it matters much, but you got me thinking: is supplemental really a separate database or just a status in the same database?</p>
<p>So, using the query I used to find the page that I highlighted as supplemental, we see a cache of:<br />
<a href="http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/seo-mistakes-crappy-doorway-pages/+site:www.mattcutts.com+***+-view:adghasdtrb&#038;hl=en&#038;ct=clnk&#038;cd=5&#038;gl=us" >http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/seo-mistakes-crappy-doorway-pages/+site:www.mattcutts.com+***+-view:adghasdtrb&#038;hl=en&#038;ct=clnk&#038;cd=5&#038;gl=us</a></p>
<p>Using your query the cache of the page is:</p>
<p><a href="http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/seo-mistakes-crappy-doorway-pages/+SEO+mistakes+crappy&#038;hl=en&#038;ct=clnk&#038;cd=1&#038;gl=us&#038;client=firefox-a" >http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/seo-mistakes-crappy-doorway-pages/+<acronym title="Search Engine Optimizer">SEO</acronym>+mistakes+crappy&#038;hl=en&#038;ct=clnk&#038;cd=1&#038;gl=us&#038;client=firefox-a</a></p>
<p>Both have the same date stamp and the same cache:gYwvJd0xNv0J (whatever that is).</p>
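<p>The comparison above can be repeated mechanically. The following Python sketch pulls the token that sits between "cache:" and the next colon out of the q parameter of each URL and checks whether the two match. The helper name cache_id and the variable names are made up for this example, and what the token actually identifies on Google's side is not known here; a match only shows that both URLs point at the same cache key.</p>

```python
import re
from urllib.parse import urlsplit, parse_qs


def cache_id(cache_url):
    """Return the token between 'cache:' and the next ':' in the q parameter, or None."""
    query = parse_qs(urlsplit(cache_url).query)
    q = query.get("q", [""])[0]
    match = re.match(r"cache:([^:]+):", q)
    return match.group(1) if match else None


# The two cache URLs quoted in this comment.
supplemental_query_url = (
    "http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/"
    "seo-mistakes-crappy-doorway-pages/+site:www.mattcutts.com+***+-view:adghasdtrb"
    "&hl=en&ct=clnk&cd=5&gl=us"
)
regular_query_url = (
    "http://72.14.205.104/search?q=cache:gYwvJd0xNv0J:www.mattcutts.com/blog/"
    "seo-mistakes-crappy-doorway-pages/+SEO+mistakes+crappy"
    "&hl=en&ct=clnk&cd=1&gl=us&client=firefox-a"
)

same_cache = cache_id(supplemental_query_url) == cache_id(regular_query_url)
print(cache_id(supplemental_query_url), same_cache)
```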
<p>So IF (and that&#8217;s a big if) the same cache ID means that it&#8217;s the same file, then that one file is represented in both the supplemental and non-supplemental indexes.</p>
<p>I&#8217;d imagine that being supplemental is just a status that any page is automatically assigned, along with all the fun that comes with that status, such that it&#8217;s crawled at the less frequent supplemental crawl rate and updated during the supplemental refresh.  Now a page may also have regular index status, which of course supersedes the supplemental status.</p>
<p>Here&#8217;s another question.  Are there degrees of being purely supplemental?  That is, the page is not in the regular index at all, but there are different levels of supplementalization (new word), such that some pages are crawled every 8 weeks, others every 3 months, etc.</p>
<p>Coming out of supplemental would then be better defined as actually just being added to the regular index, as the page never actually leaves supplemental.  Of course if a page does leave the supplemental index then it&#8217;s pretty much gone, which is much worse than being supplemental.</p>
<p>So for all of those crying to get out of the supplemental index, be careful what you wish for; you just may get it.  What you should really be pining for is to get back into the regular index.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Halfdeck</title>
		<link>http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-270</link>
		<dc:creator>Halfdeck</dc:creator>
		<pubDate>Wed, 07 Mar 2007 04:40:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.jlh-design.com/2007/03/checking-your-supplemental-page-count/#comment-270</guid>
		<description>JLH, a supplemental status isn't an either/or situation, where a page is either supplemental or is in the main index. For example, run this query

http://www.google.com/search?q=SEO+mistakes+crappy&#38;ie=utf-8&#38;oe=utf-8&#38;aq=t&#38;rls=org.mozilla:en-US:official&#38;client=firefox-a

and you'll see that the TBPR 4 page is in the main index AND in the supplemental index. In other words, what you found in no way proves that TBPR is inaccurate to the degree that the SERP might imply.

This is also why some people, including myself, always use quotes when we say that a page "goes supplemental." We believe that all pages of a site are in the supplemental index. But the supplemental copy of a page is masked when that page makes it into the main index. Conversely, when Google drops a page out of the main index, then its shadow in the supplemental index is revealed. That doesn't mean the page "turned" supplemental. That record in the supplemental database was always there.</description>
		<content:encoded><![CDATA[<p><acronym title="John Honeck">JLH</acronym>, a supplemental status isn&#8217;t an either/or situation, where a page is either supplemental or is in the main index. For example, run this query</p>
<p><a href="http://www.google.com/search?q=SEO+mistakes+crappy&amp;ie=utf-8&amp;oe=utf-8&amp;aq=t&amp;rls=org.mozilla:en-US:official&amp;client=firefox-a" >http://www.google.com/search?q=<acronym title="Search Engine Optimizer">SEO</acronym>+mistakes+crappy&amp;ie=utf-8&amp;oe=utf-8&amp;aq=t&amp;rls=org.mozilla:en-US:official&amp;client=firefox-a</a></p>
<p>and you&#8217;ll see that the TBPR 4 page is in the main index AND in the supplemental index. In other words, what you found in no way proves that TBPR is inaccurate to the degree that the SERP might imply.</p>
<p>This is also why some people, including myself, always use quotes when we say that a page &#8220;goes supplemental.&#8221; We believe that all pages of a site are in the supplemental index. But the supplemental copy of a page is masked when that page makes it into the main index. Conversely, when Google drops a page out of the main index, then its shadow in the supplemental index is revealed. That doesn&#8217;t mean the page &#8220;turned&#8221; supplemental. That record in the supplemental database was always there.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
