27th February 2007

Webmaster World Finally Goes down for Cloaking?

UPDATE 3/3/07: In the interest of being honest with the reader, I must admit that the premise of this post was shown to be wrong. It looks like there is a known bug with the site: command that is unrelated to the issue of cloaking or being deindexed. HOWEVER, my timeline still holds, and Matt Cutts finally addresses the issue in a fair and open post here. Hopefully this issue will be fully fixed soon.

Unedited article starts below:

The discussion about some sites like Webmasterworld and the NY Times being able to show Googlebot their sites’ content and then require a log-in from an actual person has been going on for some time now. Most notable, in my opinion, has been Philipp Lenssen at Google Blogoscoped with posts like WebmasterWorld Cloaking?, Google Allows Misrepresented Result Snippets, and Does Google Allow Cloaking When They Like the Site?

The issue has been raised at the official Google webmaster forum as well in such threads as Subscription Cloaking and SEO indexing Legalities.

It’s been brought up multiple times on Matt Cutts’ blog, namely in a post where he tries to explain how one can sign up for WMW for free.

Late last week Matt Cutts at least addressed the issue in this completely unrelated thread about selling links, where he states:

…you mention cloaking and WMW, but you may not know that I told the administrator of WMW earlier this year that the site would be removed if it met the definition of cloaking. As the result, the administrator of WMW made code changes to WMW. I haven’t circled back around to check on that issue recently, but any site that violates our quality guidelines can be removed from our index, and that would include WMW if we found that it violated our quality guidelines.

Adam Lasnik did mention a while back in November that they were going to be announcing a solution to subscription-based indexing, but we haven’t heard much about it since.

Today WMW appears to be on its way out of the Google index, with just 350 URLs listed and most of them old; today’s results are saved here. It’s tough to tell, as they don’t allow caching of the site.

Is there a pending announcement coming soon regarding subscription based indexing? Is WMW finally being called to task and having to play by the same rules? Perhaps the deindexing of WMW is just a prelude to Matt’s aforementioned blog post.
I’m sure of one thing, that this isn’t over, and we’ll see more regarding this subject soon.

I’ll try to update the situation as soon as more information is known.

If you liked this post please buy me a beer. Thanks.

posted in Google, Webmastering | 1 Comment

23rd February 2007

You’ve got 3 seconds

I use the Internet mainly the old-fashioned way: I surf it. Oh sure, I’ll use a search engine when I need to find something, but usually I just surf. It’s like a conversation with your Grandma, going from one subject to the next no matter how remotely they may be related. Besides, lately if you search for something, all you are going to find is wiki pages anyway.

So as an Internet consumer I’ve gotten pretty good at making snap judgements. Usually within three seconds I know if I’m going to continue on with the site I landed on or if I’m going to hit the back button. With that in mind I’ve come up with this list of 23 things that I consider signals of crap and time to move on.

  1. Delusions of Grandeur – Any site that claims to be “your best”, “the best source for…”, or “the most complete…” usually is not. As a matter of fact, the sites that are indeed the best and most complete don’t find the need to trumpet it.

  2. What year is it? – If I see an outdated copyright date, or an update date over six months old, I move on. If you don’t have the time to update your site, I’ll go to the library to read old information.

  3. 300 baud - I browse with a 5 meg broadband connection. If your site is laden with pictures, or is hooked up to the web with a 300 baud voice modem, I’ll back-button you before the 3 seconds are up.

  4. 1994 – Using elements that were popular about a dozen years ago is a true signal of crap. These include guest-books, hit counters, and banner ads.

  5. My Eyes! - You giant font people know who you are. You’ve got sites that look like they were built with AOL’s site builder in 1994: giant headers and giant fonts that allow about 3 words per sentence.

  6. My Eyes! (part 2) - Awful color schemes drive me nuts. Design is one thing, but I’d rather see a white background with black font than try to read your green text on a black background or a nasty pink background.

  7. What the hell? – Kind of related to #6, but the unreadable sites blow my mind. These are usually the ones with black backgrounds and really small fonts. Do these webmasters ever actually visit their own sites?

  8. My Ears! - Anything that automatically plays a song is done in seconds; this includes ESPN playing their stupid commercials on load-up. If you’ve got a midi song, then I know for sure the site hasn’t been attended to in a few decades.

  9. Vegas baby! – I’m not sure if it’s a symptom of FrontPage or the like, but what in the world would make someone think scrolling banners, flashing text, or blinking ads would make me want to view the site? I shouldn’t have to feel like I’m having a seizure while viewing your site.

  10. Flash This. – Ok, I understand you artsy design types like to have your pretty pictures and all, which is great, but for the love of Pete, make it an option. I will not load up and view your flash introduction EVER EVER EVER, so you’d better give me an html link right off the bat or I’m not going any further.

  11. Size does Matter - Apparently some web designers think we are all browsing on our 32″ flat screen monitors, but in reality most of us are not. If I have to use a scroll bar at the bottom to get your page into view, I’m outta there!

  12. Look who’s talking now? - This one is more about the content, but one of my pet peeves is changing the person in a paragraph.

  13. F7 - Spell check your document please. It’s a computer, for heaven’s sake; it’s not like you’ve got to use whiteout.

  14. Don’t tell me what to use – The suggestion that this site is “best viewed with…” drives me up a wall. Make the site viewable with the big browsers, and those odd balls that write their own browser can learn to deal with it. There isn’t some fool out there using his Netscape Navigator who hasn’t heard of Internet Explorer and is just looking for the opportunity you presented to download it.

  15. Web 0.1 beta – Any reference to now defunct products like Alta Vista, HotBot, Netscape, or Lycos is a pure sign of junk.

  16. D for Dumb? - I think way back when we were all new to the Internet and AOL was the big player, it was a neat idea to tell people how to bookmark things with CTRL-D, but we’ve moved past that level of remedial education, and if you haven’t, well, the web has passed you by.

  17. Water, Water Everywhere – Part of web design is not only having something to say and finding a pretty way to say it; it’s also your responsibility to organize the information for the reader. A homepage with 500 links to other pages just screams that this person has no clue how to organize information, and is a clear signal that the information contained within is just as disorganized. Give me navigation, not 500 decisions to make.

  18. War and Peace – The home page is supposed to be just that, HOME, not the entire freaking site. If your home page is 1200 inches long, well, you need to learn how to break things down into categories, subjects, and pages.

  19. MFA - Ever since the introduction of pay per click the web has turned to crap. MFA refers to sites that are made for AdSense. The key to being a successful AdSense publisher is to have information that just barely scratches the surface of the subject and leaves your viewer wanting more; such sites are not motivated to provide the answers, as then you won’t click the ads. I’m happy to announce I don’t even pay for my hosting with my AdSense income on this blog.

  20. Above the fold - You don’t see the Wall Street Journal put the lead story on page 3, do you? No, the important stuff, the meat, should be right on top, viewable without having to scroll or click. If all you’ve got above the fold is your header or navigation links, I’m gone. Sell me on the site first, then I’ll figure out how to go deeper in.

  21. All Boxed In – If you are still using frames on your site, well, I’ve got news for you: you are behind the times. If you’ve got the greatest site in the world and I find the cure for cancer on one of your frames, how the heck am I going to tell anyone about it? I can’t link to it, as they’ll just get your home page.

  22. No more snippets - Google does a fine job of creating snippets for me when I search for things. If your only contribution to the Internet is a two sentence snippet about some other site and a link, well then, I should have been there in the first place, not wasting my time looking at your page.

  23. Product reviews, puuleezze. - There may have been some well meaning affiliates a few years ago, but they’ve all been run out of town by the RSS feed generated affiliate sites. If I see an Amazon link box with a “review” around it, I’ll move on to the writer of the review and perhaps the actual merchant, who will have more information.


posted in Webmastering | 0 Comments

23rd February 2007

Just one more and I’m going home

Odie got fired for ordering cups that were too big, which reminded me of an image I got in email a few decades ago. It has to be one of the funniest pictures I’ve ever seen; I just love the exasperated look of the fella on the right.

One more


posted in humor | 0 Comments

20th February 2007

Beware of site submission scams

I just got an email which prompted a thought. Beware of scams out there that want to charge you for submission to search engines. First off, submitting to a search engine is pretty much not needed anymore; if there’s a link to your site, they’ll find it. Submission is an old hold-out from when search engines were nothing more than directories with a keyword search enabled.

This particular email came from www DOT googlesubmission DOT info. Notice they aren’t even indexed themselves!

The text of the email is:

DO YOU WANT YOUR WEB SITE ON GOOGLE

GOOGLE can guarantee the inclusion of your site on the most significant and well-known search engine in the world within less than 15 working days

When you visit the site, you find out it costs $199 to submit your site to Google. You can add your URL to Google for free anyway.

I know a few Googlers have read this blog in the past, and if there is anything they can do, I hope they at least keep this domain out of their index.

By the way it’s a newly registered domain in Turkey. In case they disappear, I’ve saved a screen shot: Google Submission


posted in Google, Webmastering | 3 Comments

20th February 2007

Updated Google Home Page

Ok, they didn’t ask for my input, but I thought I’d put together my spin on what a good re-design would look like for Google. Being the giver that I am, I won’t charge one cent for my new design if they decide to use it.  It’s a subtle difference but a great improvement in my opinion.

Google Home Page


posted in Google | 5 Comments

19th February 2007

Why doesn’t my site rank for my keywords?

As the title of this post suggests, I’d like to take a look at probably one of the most common questions asked by fellow webmasters. So you’ve got your site listed in all of the major search engines and managed to not go supplemental in Google. Congratulations, you are 3% there! Now comes the hard stuff.

Generally the conversation starts out with the webmaster asking whether or not they have tripped the [insert penalty here]. Penalties are often rumored explanations for patterns that thousands of observing webmasters come up with in watching the search engines. Such penalties have gotten some traction with names like the sandbox, duplicate content, -30, -950, or thin affiliates. It’s reassuring to people to think that they have simply made a mistake, such as forgetting their META description, so their site is thrown into oblivion, or that they used a keyword density greater than xx.xx%, so that must be the root cause. Many people spend way too much energy A) trying to understand the flavor-of-the-month penalty, B) deciding if they suffer from it, and C) looking for the magic bullet that will get them out of this mysterious penalty’s grasp.

If you’ve seen the movie Contact, written by Carl Sagan, you’ll remember a principle it cites called Occam’s razor. Basically this states, “All things being equal, the simplest solution tends to be the best one.” Put another way, “when you hear hoofbeats, think horses, not zebras.” Assuming most of my audience doesn’t live on the Serengeti, this will be germane. Rather than worry about the above penalties, maybe the simplest answer is that the site is suffering from the MSSA Penalty. The My Site Sucks Ass (MSSA) penalty is probably the penalty most often doled out by the big three: Google, Yahoo, and TSFKNAMSN (the search engine formerly known as MSN).

If your site sucks-ass you need to ask yourself one question: if I show up in position 950 for “my keyword”, why are the 949 sites before me better? Considering that Google is a multibillion dollar company started by virtual geniuses and staffed with more PhDs than most universities, perhaps sometimes, just sometimes, they actually get it right and correctly rank sites in order of their importance and quality.

Another fact that is often overlooked is that just as you are trying to rank for “your keyword”, so are thousands and thousands of other sites, and the distinct possibility exists that they have recently made their sites better than yours. The nice thing about search engines is that they make your research easy for you. They list, in order from 1 to 1000, the pages that they consider the best for said keyword. The information about which ones are considered better than others is right there. Certainly there are other reasons beyond your control, such as the new spamming technique of the week that will propel a site to the top temporarily, but if you are serious about this you’re not looking for meteoric climbs but a solid base on which to build.

Bottom line is this: continue to improve your own site with fresh content and modern presentation, and worry less about the changing search engine atmosphere. As the engines continue to do their job better, if your site is truly better than the ones above it, you will eventually get there. On the other hand, if your site sucks-ass and Google just figured that out, you may be seen in the forums and newsgroups asking “is Google broke?”


posted in Webmastering | 0 Comments

18th February 2007

Strangest referral ever…

If you spend some time going through your server logs you will find some informative trends about how people find your site. You will also see some pretty humorous stuff. I learned today that I am #1 out of 612,000 for the phrase, “what are those little pigs in google’s logo for?” Now I’m not even sure what this means, nor do I understand why I would be related to it, but I’m number 1!

Screen shot saved for record.
The longtail of search


posted in Site News, Webmastering | 0 Comments

17th February 2007

The boys

Here’s a picture of my boys Connor and Colin, it was taken about November of 2006. We went sledding today, right below the bluff you see at the top of this page.

Connor and Colin, age 6 and 2.


posted in Personal | 0 Comments

16th February 2007

Nothing to see here…

There’s nothing to see here, nothing at all, just a link that I don’t want anyone to follow, which is why it’s got nofollow on it. Who do you think will follow it? The logs are set and we’ll be watching. PLEASE FOR THE LOVE OF ALL THINGS PURE, no one link to it! It’s tough to run a public experiment when the public could ruin the experiment, isn’t it?

Any bets on who will follow the link? Bot or human?

updated 2/17/07 9:43 am, so far we’ve had a few curious people and these bots:

1) Mozilla/5.0+(compatible;+SnapPreviewBot;+en-US;+rv:1.8.0.9)+Gecko/20061206+Firefox/1.5.0.9
2) Mozilla/5.0+(compatible;+Yahoo!+Slurp;+http://help.yahoo.com/help/us/ysearch/slurp)
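
For anyone who wants to sort their own log entries into "bot" and "probably human", here is a minimal Python sketch. It assumes the user-agent strings look like the two above, with '+' standing in for spaces, and the list of tell-tale substrings (BOT_HINTS) is my own illustrative guess, not a complete registry of crawlers. A bot that lies about its user agent will sail right past it.

```python
# Substrings that suggest a crawler. This list is an illustrative
# assumption, not an authoritative or complete registry of bots.
BOT_HINTS = ("bot", "slurp", "crawler", "spider")


def normalize_user_agent(raw_user_agent: str) -> str:
    """Turn the '+'-separated form seen in the log back into spaces."""
    return raw_user_agent.replace("+", " ").strip()


def is_probable_bot(raw_user_agent: str) -> bool:
    """Guess whether a logged user-agent string belongs to a bot."""
    user_agent = normalize_user_agent(raw_user_agent).lower()
    return any(hint in user_agent for hint in BOT_HINTS)


if __name__ == "__main__":
    logged = [
        "Mozilla/5.0+(compatible;+SnapPreviewBot;+en-US;+rv:1.8.0.9)"
        "+Gecko/20061206+Firefox/1.5.0.9",
        "Mozilla/5.0+(compatible;+Yahoo!+Slurp;"
        "+http://help.yahoo.com/help/us/ysearch/slurp)",
    ]
    for entry in logged:
        label = "bot" if is_probable_bot(entry) else "probably human"
        print(label, "|", normalize_user_agent(entry))
```

Substring matching is crude, but for a small experiment like this one it is enough to separate the self-declared crawlers from everybody else.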


posted in Personal, Webmastering | 2 Comments

15th February 2007

Is Page Rank Finite?

An interesting dialog took place on the Google Webmaster Group. Since we can’t get any credit for it there, and only a finite number of people will read it, I thought I’d transpose it here to save it from falling out of sight on the group.

The conversation is as follows.

Admin Aaron

I am wondering is PR is finite? Websites that do not do link building and obtain backlinks in a natural organic way often have very few incoming links which means lower pagerank.

I have a few blogs in the renewable energy area where I find, review and write about companies, people and technology I believe will encourage the growth of alternative energy.

I am generous and link out to others but sinse the site is new and relatively unknown I am not yet getting many links to bring in more pagerank. In other words I am giving more PR out than I am getting.

The question:

I recently started using the nofollow tag on external links with the belief that pagerank is finite and I only have so much to share. Am I shooting myself or those that I link to in the foot? Is this an incorrect assessment of how to pass PR?

I believe this to be a highly relevant question and example of why nofollow tag needs even further explanation.

Thank you if you feel comfortable with answering this Adam Lasnik or others…

Aaron

Sebastian

Aaron, you shoot yourself in both feet. PageRank hoarding is a sin. PR is a sexy fay you should not pass, that’s way to rough. Let her flood your site, let her come and go as she decides, don’t even think of her when you deploy links. And please remove the f** nofollow values in your rel attributes, that’s unethical and counter productive. Really :)
Here is more info:

Sebastian

Admin Aaron

So if I have a new blog with pretty much no PR is makes sense to link out to all my friends? I don’t know Sebastian, in an ideal search engine world that would be great but I am not so sure pagerank in infinite. I have also been thinking the way you do for a long time but now I am not so sure man.

The way I have been using the nofollow is if I write about a grat subject on another site but that site is filled with paid links I use the nofollow because I do not want to pass PR to a bunch of spam sites selling links BUT this recent idea is driving me nuts. ;-(

Admin Aaron

Yo Google, please enable an edit feature for when we spell stuff wrong we can go back and fix it. Just allow people to edit within a given time period… like 5 minutes? Also how about a spell checker for us dumb arses? Thanks! =P

JLH

Aaron, you bring up a good point. A while back, when some sort of penalty was being doled out, Adam repeatedly said here and in other forums that one factor may be “Over Optimization.” To me a site that has all external links nofollowed is ripe to be picked up by an algorithm for over-optimization because of the un-naturalness of it. It’s a strong signal of one of two things: 1) this guy doesn’t trust anyone he’s linking to, or 2) this guy is trying to hoard page rank. Neither is a signal that should help a site rank for anything.

Back in the day search engines ranked based on the text only, keyword counts, etc. People started repeating “free porn” 900 times on a page. Then the engines started looking at the META data, and people started stuffing that. Links were good, so people started selling them. In all cases Google reacted and penalized the offending sites. Nofollow was introduced as a tool to help eliminate the value that blog comments, message boards, forums, etc. were having on the ranking of sites. If you’re the site owner and didn’t write the content, you have a way of saying “I cannot vouch for what’s there.” I would bet that if someone tries to turn that around and use it to IMPROVE their page rank, they’ll get busted. Just as using your keywords in sentences is good, but repeating them 100 times and abusing the H tags is bad.

Another thing to consider is that, from the inherent nature of the “web”, sites are not meant to be islands but rather connected to the whole internet by both incoming and outgoing links. A site is judged not only by the content and anchor text of incoming links but by the topical nature of the sites linking in and the topical sites that you link to. If you have all nofollow links you are missing a key ingredient in telling Google what the site is about.

I don’t have any insider knowledge, but given enough time patterns will emerge in the mass of data that they pore over every day, and I wouldn’t doubt a signal they will look at is your linking habits and the use/abuse of the nofollow tag. PR hoarders will be penalized, as it’s a breakdown in the natural flow of the web. It’s like the wiki debate: if they don’t trust any of their resources, then they shouldn’t be trusted as one themselves. Of course they’ve got enough momentum that they’re probably impervious to any sort of algo change, but smaller sites are not.

So where is the line between SEO and SEOO (search engine over-optimization)? I’m not sure, but as Adam also says, “does it pass the smell test?” And saying, “I changed all my links to nofollow so I don’t leak page rank” smells like you are trying to game the system and artificially influence Google’s ranking, which in my experience they frown upon and react to.

Personally, I always have nofollow links highlighted in my browser with a bright red box. If I stumble upon a site that has too many, I move on, and I wouldn’t doubt Google takes that stance sometime in the future. I just don’t trust the nofollow actions yet, and I think it’s going to get worse. If you really want a bunch of sites that you link to not to be followed for ranking purposes (affiliate links, selling links for traffic only, etc.), I’d put those behind a redirect that is blocked in robots.txt. That way Google isn’t going to follow them at all because they aren’t going to see them, and they are not going to be able to hold it against you because there is no possible good that can come from a page that is not crawlable. As far as a site with no external links, I’d say that’s just as shady as disabling the back button. The site is no longer a part of the web, but only a town where all the streets going to it are one way, and after a while people will notice that no one returns from that town, and no one will go there anymore. Now I doubt Adam will/can come on here and say that they penalize or don’t penalize or give credit for the use of nofollow, as commenting on the actual ranking is probably not allowed in the least bit, but these are my two cents :)
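
To make the redirect-behind-robots.txt idea a bit more concrete, here is a small Python sketch using the standard library’s urllib.robotparser. The /go/ directory, the example.com URLs, and the rule set are all made up for illustration. All this checks is what a well-behaved crawler that honors robots.txt would be allowed to fetch; it says nothing about how any search engine actually scores such links.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every outbound redirect lives under /go/,
# and that whole directory is disallowed for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /go/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/go/partner-1",  # redirect script, blocked
        "http://example.com/about.html",    # ordinary page, allowed
    ):
        print(url, "->", can_crawl(ROBOTS_TXT, "Googlebot", url))
```

The point of routing everything through one directory is that a single Disallow line covers every outbound redirect, so there is nothing to remember on a per-link basis.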

Sebastian

In the case of linking to a site plastered with affiliate links and ads, where is the point to write about them in the first place, if you don’t like them? Well, given that this site has a page hosting valuable information, wouldn’t it be the search engine’s job to allow or disallow incoming PageRank at this site? Why should you pre-qualify links for Google’s ranking algo? This attitude smells like self-censorship based on fear, and you shouldn’t fear Google when you link out.

Applying nofollow crap to affiliate links is a completely other story. I have no problem with Google’s position that affiliate links should not carry Googlejuice, but just the human traffic. I say “nofollow crap” because rel=nofollow is IMHO not the suitable instrument to achieve this particular effect, or call it favour to the engines. I use rel=nofollow with affiliate links to save my own ass, although it’s Google’s job to decide whether a particular link should carry weight or not, and although I do think that rel=nofollow is not the right thing to use in this case.

All these thoughts and countless confused discussions are results of the somewhat hapless implementation of this crawler directive, and its ongoing semantic morphing. Sure, rel=nofollow is a generic mechanism to do a ton of things which all make sense. But it’s a geeky tool for search geeks, not a suitable tool for webmasters, editors, or publishers.

Ok, back to PageRank. Think of PR as a statistical approach to emulate Joe A. Surfer’s behaviour, where the middle initial stands for Average. PR just tries to follow Joe’s footsteps on the Web, sometimes guessing whether Joe will click link A or B, thus deducting both possible click-paths’ scores equally. PR is a model of the Web, kinda road map from any point to any point, weighting each and every path/lane’s quality by scoring the destination. And because it’s just a model, its weight as ranking factor is not that important as the toolbar suggests. The PR hysteria reminds me of the Beatles BTW. Sure PR is great, it’s sexy and all that, but it doesn’t show the (whole) big picture one has to think of when it comes to optimizing contents. Hence optimize for visitors, remember the visitors pull out the plastic, not the bots, create broad and easy to walk paths through your contents all leading to your signup forms. PR will follow the visitors to honor your efforts not only with green pixels. IOW since today’s PR is Google’s secret sauce, and there are other important link cargos, it just makes not much sense to speculate about PR distribution. PR is addicting and distracting, hence you should ignore it to get your job done.

Sebastian
http://sebastians-pamphlets.com/

Admin Aaron

I agree with a lot of what you guys say but let me ask a few more questions:

JLH - If Google punishes sites that hord PR why does Wikipedia (educational), Amazon (affiliate) and other eCommerce sites doing so well in search engines?

If you look at http://solar.rain-barrel.net and any other of my sites you will see that I am doing nothing to game search engines at all but also notice the PR, how much do I have to share? >> Is PR finite?

<< Sebastian - You said” “In the case of linking to a site plastered with affiliate links and ads, where is the point to write about them in the first place, if you don’t like them?”

Say I find a farmer/inventor who doesn’t have a website but his new wind turbine was written about on a local news site that is littered with paid links and it’s “SEO” is linking out with hidden text to viagra spam. I want to write about this man who is offering something unique but the only reference I can find is on this local spammy news site. Wouldn’t this be a case when the nofollow tag should be used?

Here is another one: Say I write about a cool company that is looking to market complete solar arrays but when I review the company I find that they have multiple sites and I just do not have time to analyse their intent, wouldn’t it be safer to use a nofollow here?

I do not spam or use the nofollow to spam BUT if I understand it correctly what it does is break the connection between you and the possible bad neighborhoods that can have indirect negative impact on a websites “trust” or whatever you want to call it.

One more thought: Did you guys ever think that our debates actually build better algorithms? Are there people in white coats with clip boards taking notes? You guys got all kind of interesting ideas, that’s fo’ sho’, thanks! :)

Sebastian

If you link to linkworthy content you should not use rel=nofollow. If Google doesn’t think the linked site is worth indexing, that’s fine, but it’s not your problem. You’ve done your job by providing your visitors with external content you honestly think is interesting for the crowd. Trying to guess what Google could think with the intention to castrate a link when Google might dislike the target sounds just not right, and it makes no sense, and Google doesn’t encourage you to handle rel=nofollow this way. This goes for both of your examples.

Also, not every site carrying tons of ads is a bad neighborhood, think of high ranked sites like SEW for example.

Would you link to my articles (I know you did it already)? Oups, I should have told you that Google (AdSense) pays my hosting fees …

Sounds weird, eh?

Sebastian

JLH

Great discussion so far, probably since we are all just throwing out ideas and no one has taken a stance, so I’ll continue best I can.

I’m not sure if they do or would punish or even consider the use of nofollow in a site’s ranking, but for the examples (wiki, Amazon, eBay, etc.) the hard part is that they already have a ton of momentum as far as traffic, links, etc., so they would be unharmed by such a filter. I think where you’d see it is the average Joe’s site with 18 incoming links and 125 pages in the index.

Back to the finite PR discussion. I’m trying to get my head around what the question is. Are we concerned that, given a page’s actual (not toolbar) page rank, it only has so much PR that can go around? So if you’ve got a page with PR5 and you have 3 external links, do those links each receive 1/3 of your linking power, or if you’ve got 100 links, do they each see 1/100th of the linking power? From what I understand, the actual PR of a page is related only to the incoming links and has nothing to do with on-page factors or linking habits. A page with 100 external links and a given X incoming links will achieve the same PR as a blank page with no text and no links receiving the same quantity/quality of X incoming links.

Now the question becomes: does using nofollow on a link reduce that drain on the linking power of the page? If the above is true and our page with 100 links were passing 1/100th of its linking power to each page, then if we changed half of them to nofollow, would the remaining ones get 1/50th of the linking power while all 100 still get the traffic flow? I’m not sure it’s as straightforward as that, as PR passing has as much to do with being on topic as with the actual PR of the page. We’ve all seen sites that cannot even get indexed when they say that they’ve got hundreds of links; well, further inspection shows that they are all junk links, exchanged directories, off-topic, etc. This has got to have the same effect on the linking page. Given the same page with 100 links, 50 of them off-topic, does the page simply not pass PR through those links, which also means they don’t drain the PR passing to the other links? Or does the opposite happen, where it actually costs you since you chose to link to a bad neighborhood or an off-topic arena?
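
To put some numbers on the 1/100th versus 1/50th question, here is a toy Python sketch of the textbook random-surfer PageRank iteration, where each page splits its score evenly among the links it passes rank through. This is only the publicly described model, not Google’s actual ranking; the five-page graph and its names are invented; and treating a nofollowed link as if it simply did not exist is an assumption made for the sake of the experiment.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: links maps each page to the pages it passes rank to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page's score is split evenly among its followed links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "hub" links to four pages and every link is followed.
ALL_FOLLOWED = {"hub": ["a", "b", "c", "d"], "a": [], "b": [], "c": [], "d": []}
# Same pages, but the links to "c" and "d" are nofollowed (modelled as absent).
HALF_NOFOLLOWED = {"hub": ["a", "b"], "a": [], "b": [], "c": [], "d": []}

if __name__ == "__main__":
    for name, graph in (("all followed", ALL_FOLLOWED),
                        ("half nofollowed", HALF_NOFOLLOWED)):
        ranks = pagerank(graph)
        print(name, {page: round(score, 4) for page, score in ranks.items()})
```

In this toy model the scores always add up to 1, so PR is finite in the sense that it is a fixed pie being divided. The hub’s own score is the same in both graphs because it depends only on what flows in, and pages "a" and "b" each get a bigger slice once "c" and "d" stop receiving one. Whether a real search engine hands the saved share to the remaining links is exactly the open question in this thread.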

My theory is that they currently just treat a nofollow link as an “ignore function” such the link is no longer a valid link, even though it appears to be to the general public. Google doesn’t even appear to crawl said links, on the other hand Yahoo will crawl them for darn sure, I do not know if they consider the link in their ranking as I don’t spend much time analysing what some 3rd rate search engine does, other than send terribly targeted visitors.

My cynicisim on nofollow also comes from the evolution of it. Originally introduced to try to reduce comment spam, which it didn’t, now has evolved into protecting links for money etc. I find it strange that they got all the CMS out there to institute in for automatic use on blog comments for example, but there is no discussion on removing it. If I write a blog post and 50 people comment on that efficitivly writing my content for me, some of those comments will probably contain good links that are germane to the conversation and should not be nofollow any more as they can only help my page rank for its rightful terms. But there is no mention of that at all.

I’ve taken it upon myself to institute a nofollow policy on my own blog. It’s fluid and may change with time, but the key component right now is that I’ve used a plugin to make nofollow comment links turn into real links after 14 days. It gives me two weeks to judge each comment, and if I see a link I don’t like I’ll break it; otherwise I let it mature and become a normal link that adds to the page and discussion. I’m not so sure that my policy is in line with Google’s ideal, but until they give further directions it’s the route I want to go.
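The 14-day rule itself is simple enough to sketch. The plugin is not named here and this is not its code; `render_comment_link` and `REVIEW_WINDOW` are hypothetical names, and the snippet only shows the idea of dropping `rel="nofollow"` once a comment has survived the review period.

```python
from datetime import datetime, timedelta
from html import escape

# Review period described in the post; the constant name is made up.
REVIEW_WINDOW = timedelta(days=14)


def render_comment_link(url, text, posted_at, now):
    """Render a comment link, nofollowed until it is 14 days old."""
    is_mature = now - posted_at >= REVIEW_WINDOW
    rel = "" if is_mature else ' rel="nofollow"'
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


posted = datetime(2007, 2, 1)
print(render_comment_link("http://example.com/", "a site", posted, datetime(2007, 2, 5)))
print(render_comment_link("http://example.com/", "a site", posted, datetime(2007, 2, 15)))
```

A link the blog owner decides to break during the two weeks would simply be removed from the comment and never reach this function.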

The blanket statement that all paid links should be nofollowed does trouble me a bit, based on history more than on statements by pundits like Matt Cutts. I agree in principle with what he says, in that you shouldn’t be able to influence the search engine results because you went out and bought 500 links from probloggers out there, but then again they don’t have a problem with Yahoo’s $299 review fee, do they? That is based on the fact that Yahoo doesn’t guarantee inclusion. Well, if I’m the expert on solar energy and you want me to write an article about your service that may or may not contain a link to your site, who’s to say that I didn’t receive 25 other offers from other solar energy sites that I declined? In that case I think a nofollow is a misapplication of the tag. Sponsorships are a part of this society and I don’t think they can be controlled with a policy like this. Should the US government require a blanket statement after every product endorsement on TV, a warning label that says Tiger is paid by Nike to use their golf balls so please don’t accept his endorsement because of that? No, we leave it up to the general public to make their own decisions. Well, in the case of a paid review/posting the endorser is the writer and the general public is Google. Their job is to consider the source and consider the subject and decide if the endorsement carries any weight. Just as I’d rather take my golf ball advice from Tiger than Al Gore, Google should consider a paid review on solar energy by one of the leading experts in the field more important than my sister’s “I love Cats” blog. They are asking us to do their job of judging links and pages, which of course just lends itself to even more manipulation than the tag was originally instituted to contend with.

Speaking of which we should really take this somewhere where someone is at least getting credit for the writing, as I’m not getting any link love or clicks on the ads to the right of this post!

Admin Aaron

Well, to be safe I do not do anything online that is “paid” other than reviewing a few products to pay my hosting fee, using amazon.com iframes and Google AdSense that monetize a few dollars a day. I’ve got a couple of directories that scare people away with their high price and extremely strict guidelines (note: know of any good sites? submissions are free, shhh).

Great points, but one question about the Yahoo! directory: people keep using it as an example and I am not so sure it is passing link love at all. In fact, any directory that is paid would be violating Google’s guidelines, correct?

Anyone know of a way or tool that shows if a “paid” directory is passing PageRank? This would be a good way to determine if Google is treating everyone equally, which is another accusation I often read on blogs.

Also, how about Google just update the guidelines to show proper uses of the nofollow tag?

Just a few more ideas…

JLH

I’m not sure on the yahoo directory as I’m not paying diddly doo for anyone to review my site, but Google still references it by name in their guidelines.

“Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites.”

http://www.google.com/support/webmasters/bin/answer.py?answer=35769

Whether that’s the free stuff or the paid one I don’t know. And don’t get me started on that scam of the ODP; anything that claims openness needs to have transparency, and they have none. If you submit a site for inclusion and it gets shelved because the editor is your competitor, that should be made public, as should the reasons a site wasn’t included.

I’d suppose John Softplus has some experimental data to back up Yahoo’s linking power.


mpilatow

The official Google policy on buying links seems to be that it is fine as long as you are buying links for traffic and not for PR or ranking purposes. How they can determine intent is debatable. Some sites make it clear that they are selling links for the purpose of PR, and many others do not mention it at all, but you have to know that most of the people who have bought links did it to improve their SE ranking and PR score.

JLH

I stumbled upon an interesting item; I wonder if it’s related to this subject.

http://www.google.com/support/webmasters/bin/answer.py?answer=34397&c…

Most notably:

Although Google crawls billions of pages, it’s inevitable that some sites will be missed. When our spiders miss a site, it’s frequently for one of the following reasons:

* The site isn’t well connected through multiple links to other sites on the web.
* The site launched after Google’s most recent crawl was completed.
* The design of the site makes it difficult for Google to effectively crawl its content.
* The site was temporarily unavailable when we tried to crawl it or we received an error when we tried to crawl it. You can use Google webmaster tools to see if we received errors when trying to crawl your site.

I’m looking at the first one in particular, where they seem to insinuate that your site may not be included in the index if it doesn’t include “multiple links to other sites”. Would NOFOLLOW on all external links constitute this?

Also, note to Google Editors: the second one seems a bit dated, as Google no longer runs one big crawl and a general new index push; it’s continuous. So if you want to fix that up and give me a link on your homepage that would be great.

wreilly

Is that just a typo, and should it be “from other sites…”?

Interesting discussion, but why does it matter? PR doesn’t seem to determine SERP position. And no one but the bot minders knows anyone’s actual PR.

Sebastian

Forgot to add:

You cannot lose PageRank by linking out. Google gives you PR with the duty to spread it. With every new link on a page you lower the portion of PR it sends to each of its link destinations, but all the PR you’ve earned from your inbounds is assigned to your pages. PR is somewhat sticky. This goes for internal links as well as external links. The only way to lower a page’s PR is to lower the PR of at least one page linking to it. Well, Google can nullify PR, but that’s done with the worst offenders only, and it can take away a page’s ability to bequeath PR, which you mostly won’t spot because its PR doesn’t change.

PR hoarding is creating a black hole, like Wikipedia did recently. Such a black hole sucks in PR, but distributes PR to its own pages only. BTW Wikipedia asked Google for permission to become a black hole before they nofollowed all external links, so they have got some special treatment with regard to PR hoarding “penalties”. Since Wikipedia links didn’t carry much weight before IMO, this black hole should affect only pages living off Wikipedia inbounds.

Sebastian

JLH

Agreed on ranking, PR is less of an influence, but keeping your site crawled on a regular basis has everything to do with PR, which is my only concern. And if you ever try to launch a new site, having a little PR to start the crawling doesn’t hurt either.

Sebastian

OK, time to debunk some PR myths.

John, add a damping factor and a few gimmicks to your post and you’re spot on.

{Aaarrrggg … Googlegroups has stolen my cursor again}

PR has absolutely nothing to do with relevancy, so unrelated or off topic links carry the same PR as on topic links.

{Google I want my cursor back!}

Google encouraging Yahoo paid as well as unpaid listings doesn’t have much to do with the payment itself, but with the editorial character of the directory. So Y! links are treated exactly the same way as DMOZ links. BTW you may have spotted that some ODP categories aren’t bothered with PR; these links don’t pass PR, probably because the editors are your competitors.

{Google return my cursor or I reveal that you’ve scanned the brains of all ODP editors to figure out who’s linking honestly and who does not!}

Crawling schedules are more or less completely driven by PageRank.

High PR values result in frequent crawls, low PR makes Ms. Googlebot lazy. Laying out milk and cookies attracts and holds Ms. Googlebot, she loves cookies. There are other factors like freshness and source scores involved when it comes to fetching brand new stuff.

{Nasty Google, I do miss my cursor … pleeeaaase! }

“Page” in PageRank stands for Larry Page, not Web page.

The PR of the whole Web is 1, not 42.

Sebastian out to hunt for a new cursor

Admin Aaron

LOL, thanks for all the info today, but I must say, none of us knows for sure yet whether PageRank is finite and can be used up like fuel. :)

Sebastian

Definitively finite, PR cannot leave the Web.


JLH

So the law of conservation of PR applies? PageRank can neither be created nor destroyed. So PR is more like energy, which is just converted and transferred, than like currency, which can be produced, spent, wasted?
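The energy analogy can be checked on a toy graph. This is a deliberately stripped-down sketch that assumes no damping factor and that every page has at least one outbound link; the function name `pass_rank` and the three-page graph are invented. It shows only that when each page hands its rank to the pages it links to, the total across all pages does not change.

```python
def pass_rank(rank, links):
    """One round of every page splitting its rank equally among its outlinks.

    Simplified on purpose: no damping, and every page must link somewhere.
    """
    new_rank = {page: 0.0 for page in rank}
    for page, targets in links.items():
        for target in targets:
            new_rank[target] += rank[page] / len(targets)
    return new_rank


# Hypothetical three-page web; the total rank starts at 1.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
rank = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

for _ in range(5):
    rank = pass_rank(rank, links)

print(rank)                # individual values shift from round to round
print(sum(rank.values()))  # the total stays at 1, up to float rounding
```

Individual pages gain and lose rank each round, but nothing is created or used up.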

softplus

So who’s going to mirror this discussion in their blog so that we can link to it? :-)

Great stuff, guys.

Is pagerank diluted as you add more content to your website? Is pagerank diluted as Google adds more filetypes to the index? Can issues with numerical accuracy result in “chaotic” pagerank fluctuations?

John

If you liked this post please buy me a beer. Thanks.

posted in GWHG, Google, PageRank, Webmastering | 6 Comments

13th February 2007

Those Wacky Belgians

A story hit the wires today that Google lost a court battle in Belgium regarding their use of newspapers’ material and something to do with copyright….yada yada…ho hum… that’s not the story there.

The real meat is toward the end of the article, way down at the bottom, below the fold:

In the future, the court said it would be up to copyright owners to get in touch with Google to complain if the site was posting content that belonged to them. Google would then have 24 hours to withdraw the content or face a daily fine of $1,295.

Since Google doesn’t accept emails, and calling their 1-800 number gets you into an endless switchboard game, how exactly are our Belgian friends supposed to contact Google? Does commenting on Matt Cutts’ blog count?

I think the writer of the story missed the boat on this; the real headline should be “Google is going to open a customer service department”, or “Google to invest in: Telephones”, or “Google Employees to get email!”.

The fact that they have 24 hours to not only receive a notification but also RESPOND is just hilarious; I’m not going to hold my breath on that one.

posted in Google | 0 Comments

13th February 2007

PageRank Discussion

If you frequent webmaster forums, discussion groups, or blogs, you’ll more than likely see post after post, and question after question, regarding PageRank (PR). For those who don’t know, PageRank is the little green bar up in your Google toolbar that “is Google’s measure of the importance of this page.” The scale is a simple 11 numbers, 0 through 10. Being that it’s the only visible indicator from Google, it’s often obsessed about.

For a discussion about PageRank, we first need to look at the definition by Google.

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.”

Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don’t match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page’s content (and the content of the pages linking to it) to determine if it’s a good match for your query.

Top 13 things that won’t affect your PageRank

Though that seems to sum it up pretty well, I still often see numerous questions regarding a site’s page rank going up or down. To start with, let’s cover some common elements that have ABSOLUTELY NOTHING to do with a page’s PR calculation.

1. Content
William Shakespeare himself could come back to life and write for your site and Google’s page ranking won’t give you one point for the quality of your content. You can spend 20 hours a day for 5 years writing thousands and thousands of unique, highly relevant, beautiful pages and your PageRank will not change. Conversely you can have blank pages and your PageRank will not change.
2. Titles
Changing your page titles from just the name of the site to a unique description of each page will not improve your PageRank. Erasing all of your titles will not decrease your PageRank.
3. Description
Including a highly detailed and extremely accurate description of your page will not affect your PageRank. If, on the other hand, you accidentally remove all your descriptions and replace them with the number “7”, your PageRank will not suffer.
4. Keywords
Whether you use META keywords on your page or not, or your keywords are stuffed, rotten, or totally irrelevant has nothing to do with PageRank.
5. Google Adwords, Adsense, Analytics
The use of any of Google’s other services, including AdWords, AdSense, or Analytics, has nothing to do with PageRank. Whether you spend a million dollars a day with AdWords, receive a million dollars a day from AdSense, or check your stats every 10 minutes has absolutely nothing to do with your PageRank.
6. Sitemaps
Using Google’s webmaster tools and submitting a sitemap or unsubmitting one will not affect your PageRank.
7. Sitemap Errors
If Google’s webmaster tools show you have no errors or 500 errors in your sitemap, your PageRank will not be hurt or helped.
8. Server Problems
If your server goes down for a week or has never been down since Al Gore invented the internet, your PageRank will remain the same.
9. Competitors
Your competitors’ use of sneaky redirects, hidden text, AdWords, sitemaps, the color green, linking schemes, SEO budget, cloaking, affiliate links, or page rank will have no effect on your PageRank.
10. Robots.txt
Having or not having a robots.txt that works or doesn’t work will not help or hurt your PageRank. You can block Google from your site entirely and you will still get PageRank.
11. Duplicate Content
Whether you write all of your own content, have a scraper write it, have someone else copy it, or publish the same page 500 times, your PageRank will not be harmed or increased.
12. Age of Domain
If you just bought your domain or got it from Al Gore back in the seventies, it has no bearing on your PageRank.
13. Images/Flash/Java/CSS
You may have the ugliest site in the world or the most beautiful flash laden java enabled page with pictures taken by a professional photographer and it will have no effect on your PageRank.

The point is that only the links and the quality of the links are considered in your PageRank calculation; that’s it, nothing else. All of the above items may help get you links, which will increase your PageRank, but they will not be considered in the calculation. You may find an old crappy site that doesn’t seem to have any links but has high PageRank, but you have to consider that its links may have been there for 10 years.

PageRank Factors

Now that we’ve got what doesn’t count out of the way, let’s consider the remaining factors that do.

1. Quantity of links
Of course any calculation must consider the sheer quantity of links. More links is better than fewer links. Fewer links is worse than more links.
2. Quality of links
Links from higher ranking pages are given more weight than lower ranking pages.
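
Both factors show up in the formula from the original published PageRank paper. Google’s production calculation is not public, so the following is an illustration only: the damping value of 0.85 is the figure commonly quoted from that paper, and the page names and the tiny link graph are made up.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterate the published PageRank formula over a small link graph.

    links maps each page to the list of pages it links to. Pages with no
    outbound links spread their rank evenly over all pages, a common
    convention that is assumed here.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


web = {
    "x1": ["popular"],
    "x2": ["popular"],
    "x3": ["popular"],
    "popular": ["lucky"],  # one link from a well-linked page
    "y1": ["unlucky"],     # one link from a page nobody links to
}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy web, "popular" outranks "unlucky" because it has three inbound links instead of one (quantity), and "lucky" outranks "unlucky" even though each has a single inbound link, because its link comes from a higher ranking page (quality).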

Unknowns

Most often when one asks, “what happened to my page rank?” or “why don’t I have any PageRank?” it is because of these unknowns.

1. What is a link?
We don’t know exactly what Google considers a good link. Some sites that link may have had their link power taken away. Other links like NOFOLLOW links don’t carry any weight. Even in the webmaster tools the links shown may not carry any weight. Using Yahoo’s site explorer to find links to your site has nothing to do with what Google considers.
2. The calculation
Even if we definitively knew which links Google considered, we still wouldn’t know how they calculate how much weight each one passes on. It has to have something to do with the page rank of the linking page, the quantity of links on that page, and other factors, but the exact calculation is unknown.
3. Internal PageRank
The page rank you see in the toolbar is not updated often, only a few times a year, and even then pretty poorly. If you check multiple data centers you will see that even Google isn’t on the same page as to what a page’s PageRank is. The internal PageRank, however, is a continuous function that’s updated while they crawl and index the web.
4. The Scale
There are a few billion more pages today than there were in 1998 when this whole page rank mess started, yet it is still ranked 0 through 10. Back in 1998 maybe you only needed 100 links to get to a page rank of 6, but today you may need 10,000 links. As the size of the web grows the threshold for the next level in PageRank changes.
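
The moving threshold in the last point can be illustrated with a guess at how a 0 through 10 value might be derived. Google has never published this mapping; a logarithmic scale measured against the strongest page on the web is a common assumption, and the function `toolbar_bucket`, the `base` of 8, and the scores below are all invented for the sketch.

```python
import math


def toolbar_bucket(score, top_score, base=8.0):
    """Map a raw link score to 0-10 relative to the best score on the web.

    Assumption for illustration: each step down the scale means a score
    roughly `base` times smaller than the step above it.
    """
    if score <= 0:
        return 0
    steps_below_top = math.floor(math.log(top_score / score, base))
    return max(0, min(10, 10 - steps_below_top))


same_page = 100.0
print(toolbar_bucket(same_page, top_score=1e6))  # smaller web
print(toolbar_bucket(same_page, top_score=1e8))  # larger web, lower value
```

Under this assumption a page whose links have not changed at all slides down the scale simply because the top of the web has moved further away from it.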

Conclusion

PageRank is simply a calculation based on the quantity and quality of links that Google considers to a page. The exact calculation is unknown (to us outsiders) and is ever changing. Most important to know, however, is that a page with a page rank of 2 can and will outrank a page with a PageRank of 6 in a search results page. Given that simple fact, if you’ve read to the bottom of this page then you’ve spent too much time already thinking about PageRank; spend more time on the basics, like making sure your site is crawlable.

Update 2/14/07

In response to Monika M’s comment below, I thought I’d show an example of a lower PageRank page outranking a higher one. There’s a great little Firefox extension for SEO by Aaron Wall that will let you see the PageRank and other important data about a page right in the search results. With this extension turned on I ran a search for “webmaster” on Google. You will notice that the #1 spot has a page rank of 6, outranking #2, which is Google itself with a page rank of 9.

Webmaster Search

Regarding the second part of your comment, I think non-reciprocated links are probably the BEST links you can get pointing to you, as they help define the site as an authority. Think of the great page ranked sites out there with millions of links TO them and very few that are reciprocated (Ebay, CNN, Amazon, etc). I wouldn’t lump all reciprocated links together as bad, however; if the two sites relate to each other then it’s a good thing. If you’re selling jewelry and De Beers links to you on their home page, I wouldn’t be afraid to mention it on your site. Who links to you, as well as who you link to, helps establish the keywords that your site will rank for, along with on-page content. It’s this simple fact that was taken advantage of when people started Google Bombing. If a site is on a theme related to yours, linking to them will help your site. This is why reciprocal linking just for the sake of getting a link is usually bad if it’s off topic. Unless you really want to rank for “Free Ringtones” or “Cheap Mortgages”, having them link to you and you linking to them is only bound to take your site’s theme off course. That being said, don’t forget that there is virtually nothing anyone can do external to your site to harm your rankings. So if you find a bunch of links from some off-topic site to yours, it’s not going to hurt you; as soon as you link to them, however, you’ve joined their “neighborhood”, which may not be such a good thing.

posted in Google, PageRank, Webmastering | 8 Comments

7th February 2007

Google Webmaster Help(ers)

The Google Webmaster Group’s description states, “Welcome to the Google Webmaster Help group, where you’ll find answers from expert users and even Google employees.” Many, many people contribute on a regular basis to a positive experience for the visitors of the group. These regular contributors go mostly unrewarded except for the occasional thank you from someone. This is my own very small effort to at least reward them with a link to their sites that isn’t nofollow’ed or obfuscated by some javascript redirect. I’ll try to update this very post weekly so as to let it gain some permanent PR and as a little incentive for everyone’s continued excellent efforts.

Top posters this month as of 2/7/07

  1. dockarl: Matthew James
  2. hosting.alandoherty.net: Alan
  3. Robbo: Vince
  4. softplus: John
  5. Red Cardinal: Richard Hearne
  6. cass-hacks: Craig
  7. Admin Aaron: Aaron
  8. N-H-P
  9. MrGamma
  10. JLH: John

All time

  1. softplus: John
  2. webado: Christina
  3. cristina
  4. Phil Payne
  5. JLH: John
  6. Sebastian
  7. Rick1
  8. djc: Dori
  9. surf_doggie: Earl
  10. MrGamma

Thanks again, and I look forward to more inspired discussion,

John

JLH

posted in GWHG | 9 Comments

5th February 2007

Updated Webmaster Guidelines

Philipp Lenssen points out that Google Slightly Adjusts Webmaster Guidelines on his blog/forum, where I am a much lesser contributor. In his post Philipp wonders, “I wonder if there’s any deeper meaning to this…?”, in relation to their inclusion of three simple words, “or otherwise penalized.”

This is only my opinion, but I think this change has a lot to do with the ever changing world of fighting web spam. While on the surface it may appear like they are letting up on spam by not completely deindexing a site and just penalizing it, I think quite the opposite. In the old days if a site was banned you may get Gray Barred, or have your page rank disappear along with your pages. Then the Gray Bar disappeared. Recently this change has been noticed in dozens of forums regarding the -30 penalty, the -950 penalty, or the “q” factor.

Part of any gray or blackhat activity has to be experimentation. Push the envelope and see where the limits are, push back and see what you can get away with. By reducing the site-wide ban and focusing more on penalties that are nearly undetectable, Google has removed the feedback element from this testing procedure. They’ve in effect added more confusion to the mix. No longer will it be as easy as seeing no results from the site: command to know if you’ve been caught; a penalty may just look like a symptom of low ranking pages. The simplest cure for low ranking pages is to improve them, which everyone wants anyway!

I’ve said it before and I’ll say it again: if Google wants to expand its fight on spam there are two simple things they could do. #1) Show each and every page of a site when someone uses the site: command, no matter if it’s a piece-of-crap or not, and #2) just do not return those pages in any natural search results. Doing these two things will keep the spam sites scratching their heads more than learning the aspects of the algorithm.

posted in Google, Webmastering | 3 Comments

5th February 2007

Google Webmaster Tools Expands Its Focus

Google announced the expansion of the functionality of the Webmaster tools. As they stated in the opening sentence, “You asked, and we listened…”

I’ve logged in and checked out the feature and am pretty impressed so far. It’s very organized and has an easy interface. I have several observations on the setup, the concept, and the future of this and other tools.

    They clearly state that they still don’t show all known links, I’m sure many will miss this salient point and still complain.

    Links are shown that clearly don’t affect page rank, as many of my own links are from forums and blog comments that are “nofollow”.

    I see the fact that the expanded information is only available to a verified and logged-in site owner as a benefit to the site owner. No longer will your competition be able to see where you are getting your links, and thus your page value, from. They of course can still use the more public information such as on Yahoo! and MSN, but with those links you really don’t know what Google sees.

    The competitive webmaster will find it harder to target niches and the competition, as it won’t be as easy to target the links of sites that outrank you.

    To bring the process full circle Google needs to completely disable the Link command in the standard engine. Meanwhile I’d welcome a link to the webmaster console sign-up page along with a disclaimer that the function is purposely disabled and that they don’t show ALL known links.

    This renewed attempt at communication between Google and Webmasters has benefits for all involved beyond the information exchanged. I believe it’s breeding a culture of mutual exchange that will further evolve to help clean up spam in the index. You cannot institute a standard over night and have to build up to it, but I can see a future where all sites need to be registered, verified, and in compliance with all webmaster guidelines to be indexed. It will be so commonplace that it will be mutually accepted as a process required to be listed, like submission services were in the 90s.

    Another step that will probably need to be taken further is the Webmaster Identification process. Currently it is a fairly anonymous procedure that really only involves being able to supply an email address and the ability to drop a file on the server. Connecting webmaster tools with other services such as AdWords, AdSense, Analytics, and Gmail will help contain webmasters to one account. As it is I’m sure a certain percentage of webmasters have multiple accounts for various reasons, such as an organizational element, but also for much more nefarious purposes like trying to hide their own link network. The day all sites have to be registered with the REAL identity of the owner or organization is the day that many link networks will come crashing down.

    There is still more work to be done: notification of bad links that point to 404 pages would be a huge help, and a crawl schedule for pages would be another. We all know that some pages are relegated to the supplemental index for whatever magic soup reason, but it would be beneficial to Google, the Webmasters, and its users if we had that information. My pages that are only going to be crawled quarterly would not get nearly as much attention as the ones I know are displaying fresh search results.

    And finally, they’ve come a long way since the inception of the original sitemaps program, and aren’t showing any sign of slowing down.

posted in Google, Webmastering | 0 Comments
