GWA is the new net SUV

If the Google Web Accelerator breaks your web application, here are a few ways to protect it from this little sucker:

From the GWA Webmaster FAQ:

Can I specify which links Google Web Accelerator will prefetch on my pages?

Yes, you can. For each link you'd like us to prefetch, simply add the following snippet of code somewhere in your page's HTML source code:

<link rel="prefetch" href="http://url/to/get/">

The href value should be the actual URL you want prefetched. Google will prefetch this page, and when your users click on this link, that page will load more quickly.

You can learn more about the <link> tag on the Mozilla website.

Also worth knowing: the GWA will not prefetch secure pages, so any URL under https is safe.

If you want to block the GWA at the Apache level, see this tip, which boils down to putting one of the following in a .htaccess file or your Apache configuration:

If you want to redirect GWA users to an explanation page (here gwa-forbidden.html) use:
RewriteEngine on
RewriteBase /
RewriteCond %{REMOTE_ADDR} ^(72\.14\.192\.|72\.14\.194\.)
RewriteCond %{REQUEST_URI} !^/gwa-forbidden\.html$
RewriteRule ^.*$ /gwa-forbidden.html [L]

If you want to send a 403 FORBIDDEN error use:
RewriteEngine on
RewriteBase /
RewriteCond %{REMOTE_ADDR} ^(72\.14\.192\.|72\.14\.194\.)
RewriteRule ^.*$ - [F]

Though it would be better to send a 412 PRECONDITION FAILED rather than a 403, and mod_security would be a good tool for this, using either of the following rule sets (blocking by IP or blocking by HTTP header):

SecFilterSelective "REMOTE_ADDR" "^72\.14\.192\." "deny,log,status:412"
SecFilterSelective "REMOTE_ADDR" "^72\.14\.194\." "deny,log,status:412"

or
SecFilterSelective "HTTP_X_MOZ" "prefetch" "deny,log,status:412"
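
If the check has to live in the application rather than in Apache, the same IP test can be sketched in a few lines of Python. This is only an illustration: the function name `is_gwa_address` and the `GWA_NETWORKS` list are made up for this example, and the two /24 networks simply mirror the 72.14.192.* and 72.14.194.* prefixes used in the rules above, which Google may change at any time.

```python
import ipaddress

# Assumption: these two /24 networks mirror the 72.14.192.* and 72.14.194.*
# prefixes from the rules above. They are not an official or complete list.
GWA_NETWORKS = [
    ipaddress.ip_network("72.14.192.0/24"),
    ipaddress.ip_network("72.14.194.0/24"),
]


def is_gwa_address(remote_addr):
    """Return True if remote_addr falls inside one of the listed networks."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a parsable IP address
    return any(addr in net for net in GWA_NETWORKS)
```

The application would call `is_gwa_address()` with the client address it got from the web server and answer with a 403 or 412 when it returns True.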

Another way to filter proxy requests at the Apache level, without relying on IP ranges (which Google can modify pretty easily) is to detect the "X-moz: prefetch" header (tip from jpack's comment, which also provides a way to log proxied requests to a separate file):

RewriteEngine On
SetEnvIfNoCase X-Forwarded-For .+ proxy=yes
SetEnvIfNoCase X-moz prefetch no_access=yes

# block pre-fetch requests with X-moz headers
RewriteCond %{ENV:no_access} yes
RewriteRule .* - [F,L]

# write out all proxy requests to another log
CustomLog logs/ursite.com-access_log combined env=!proxy
CustomLog logs/ursite.com-proxy_log combined env=proxy

In PHP, one could do a test like this: if (isset($_SERVER['HTTP_X_MOZ']) && strtoupper($_SERVER['HTTP_X_MOZ']) == 'PREFETCH') ...
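
For Python applications, the same header test could be wrapped in a small WSGI middleware. This is a minimal sketch, not a tested production filter: `block_prefetch` and `demo_app` are hypothetical names, and it assumes, as above, that prefetch requests carry an "X-moz: prefetch" header, which WSGI exposes as `HTTP_X_MOZ`.

```python
def block_prefetch(app):
    """WSGI middleware: answer prefetch requests with 403 Forbidden."""
    def middleware(environ, start_response):
        # WSGI exposes the "X-moz" request header as HTTP_X_MOZ.
        if environ.get("HTTP_X_MOZ", "").strip().lower() == "prefetch":
            body = b"Prefetching is not allowed here.\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Anything else goes through to the wrapped application untouched.
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    # Hypothetical application, used only to exercise the middleware.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


application = block_prefetch(demo_app)
```

Swapping the status line for "412 Precondition Failed" would give the same behaviour as the mod_security rules above.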

For Ruby on Rails applications, see How to show Google's Web Accelerator the door in Rails.

For ColdFusion, see: Use CF to block problems with Google Accelorator.

For some context and perspective about the issues brought by the GWA, and mainly the purists' take that the issue comes from broken web applications that rely on GET when they should be using POST, see:

My own take on this is that although it is indeed recommended not to implement any destructive or otherwise data-modifying action over an HTTP GET request, the reality is that there are tons of web applications out there that implement such actions using regular links (e.g. Google's Blogger or even its own API!). And the very first reason that comes to mind for doing so is that it's not possible to design a POST request that looks like a regular link without resorting to JavaScript. I particularly subscribe to Jarkko's comment here:

The spec says that developers shouldn't use GET, it doesn't say they are violating the specs if they do. Actually it's specifically said that there can be valid reasons to disobey these recommendations.

I sincerely admit that we as web app developers have a lot to learn from this episode but I still think you're distorting the discussion by bashing 37signals for this. It would be understandable if web application development would start from ground zero today. But it isn't. There's a whole sea of existing applications in the web that will be bitten by this and it's just plain nonsense going around screaming that it's your own fault.

As soon as people start using GWA and wreaking havoc in this imperfect world, they'll just be mad at Google and stop using the Accelerator. That's hardly what Google wants and as it's impossible for them to fix all the broken web apps in the world, there's realistically only one option left for them.

For another (bad) metaphor, this is about the same as leaving all the safety equipment away from a car because "if everyone obeys the traffic rules and laws, there will be no accidents".

But besides the reality check, my other problem with GWA is that it's not a good net citizen -- in fetching objects that most probably will not be displayed by visitors, it's wasting bandwidth and server resources. To me, GWA is the equivalent of an SUV on the net: it gives some sense of comfort to its users at the expense of others' resources.

7 Comments

Thanks a lot for the information !

I'm a dirty little web developer, I have GET active links almost everywhere, I even use that backdoor to hack my own applications, boo the dirty little web developer I am !

Frankly, I don't blame me too hard for that. But... let's face the situation. I'm gonna wash my hands, and until I'm clean, I'll apply your tips.

So, thank you very much, indeed, again.

For the SUV agreed with you, but it's also the case of most search engines. As you don't know if the page you index will ever be used and searched for.

For breaking applications, I understand the concerns, but I disagree completely with the outcome as "We should not do this and that, because we will break what people have done". Joe Gregorio said it very well in his piece.

When there's a bug, I fix it. And Joe is someone who's very practical and doesn't apply standards for the sake of standards. He's also an extremely good developer.

Karl, I see the theory (good) and the reality (bad). The reality principle makes me lean towards the conclusion that when you're a company like Google, it's not a good move to roll out something that wreaks havoc on web applications, even when you know they're not following a standard, especially when you know they're not following a standard by a very LARGE margin. It's exactly the same stance as if I had decided years ago to serve pages as application/xhtml+xml and tell IE users to go to hell. Even the W3C sometimes admits that reality and theory are two different things:

http://www.w3.org/2003/01/xhtml-mimetype/

I particularly like the chapter titled "But some browsers don't know about application/xhtml+xml." if you see what I mean.

* First you cite the draft of an article. Read the top
http://www.webstandards.org/learn/askw3c/sep2003.html. And I know the article because it has been written by the W3C QA Team. We do NOT say for example "use XHTML 1.1, with text/html"

* Second serving XHTML 1.0 as text/html is a legal usage of XHTML. Your example is moot :)))

* Third, it's definitely not the same level of consequences. SAFE actions (GET vs POST) are something fundamental for the Web. Would you accept to follow an unsafe action because it's more practical? (I think I remember a post from you about a meeting where the guy wanted an unsafe action, because it was more practical for him.)

The fact is that the GWA is somehow a good thing, suddenly people are becoming aware of the bad design for their own applications (that could be abused not by Google but more malicious users) and for users about the way some web sites are designed. Phishing for example, was IMHO, a very good thing for the partial understanding of secure usage of the Web. You can't rely on URI for being sure about a web service and that's good.

Bah, maybe the example is not good (BTW, I found it through Google and it ranked #2 for the "application/xhtml+xml" request I think, not bad for a draft!).

While I agree with you that the heads up is a very good thing, I stand by my point that it was not a responsible thing to do from Google, notwithstanding my opinion that this thing is a waste of resources, hence my SUV comparison.

A little additional information on the non-difference between SOAP and REST

