Charles Arthur On Technology

A spammer in the works

Wednesday 08 December 2004 01:00 GMT
As you read these words, an electronic rain is falling endlessly around the Web, falling onto the millions of blogs set up by eager, well-meaning people who want to air their views and let other people comment on them.

The rain falls on the most praised and the most ignominious, the most important and the most trivial. It's generated by spammers, who have only one purpose - which they follow with the fixedness of any parasite - and that is to boost the Google rankings of their spam sites.

Here's how it works. You start a blog. You decide that it would be good to allow your readers to comment on the things you say; after all, they've read it, so why not let them give their insight (which might be greater than yours)? So you make it feasible for anyone who reads the page to write a comment. No registration, no clicking on a link. Just type some text, click on a button, and there's the comment.

But here's the reality: pretty soon a spammer will find your blog and begin posting junk on it, using automated systems working far faster than any number of people.

Take my own as a typical example. The first post was on Wednesday 7 July. The first comment came on a post made two days later. The first attempted spam came on 20 July, attempting to "comment" on an old post. The content was junk - multiple links to a site selling cigars, US visas, and an online flower shop. It came from a broadband PC in Israel, one I'm sure had been taken over by a hacker and hired out to a spammer to run a program that would post spam onto blogs.

That was the first drop of rain. Now it's a steady drizzle. Last weekend, the various defences I have against junk comments blocked about 1,000 attempts to post spam. They come from all over the world: Korea, Australia, Britain, the US, a Bulgarian ISP (or one of its customers) and what appears to be the ministry of something-or-other ("Ministerstvo spravedlnosti") in the Czech Republic.

How do I know where the attempts come from? The blog server records the IP address - in effect, the caller ID of the computer trying to post the spam - which can then be compared against a global database of ISPs and which addresses they provide services to.
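That lookup can be sketched in a few lines of Python. This is only an illustration of the idea, not the blog server's actual software: the allocation table is a hypothetical stand-in for the global database, the address ranges are the reserved documentation networks, and the owner names are invented.

```python
import ipaddress

# Hypothetical allocation table standing in for the global database.
# The ranges are reserved documentation networks (RFC 5737) and the
# owner names are invented for illustration.
ALLOCATIONS = [
    (ipaddress.ip_network("192.0.2.0/24"), "Example ISP, Israel"),
    (ipaddress.ip_network("198.51.100.0/24"), "Example ISP, Bulgaria"),
    (ipaddress.ip_network("203.0.113.0/24"), "Example ministry, Czech Republic"),
]


def lookup_owner(ip_text):
    """Return the owner of the network containing ip_text, or None."""
    address = ipaddress.ip_address(ip_text)
    for network, owner in ALLOCATIONS:
        if address in network:
            return owner
    return None


if __name__ == "__main__":
    # An address recorded by the server, checked against the table.
    print(lookup_owner("192.0.2.17"))
```

A real lookup would query a registry's records rather than a hard-coded list, but the principle is the same: find the block of addresses that contains the caller.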

What's more interesting is why spammers want to post irrelevant rubbish onto blogs, even to posts that are no longer visible. They have two reasons, both to do with search engines. Google treats blogs as more important than "normal" websites, because blog content changes so much more quickly. A blog might have new posts perhaps a dozen times a day, with fresh links to websites that had previously been overlooked. So the "Googlebot" (the software program that sniffs around the Web to see where links are being made) often returns to blogs. Secondly, the Googlebot looks for changed information on the website; even though a comment might have been made on a post that nobody is reading, the webserver tells the Google index that something has happened there, and Google adds the comment and its associated links to the index.

Result, for the spammer: an unprotected blog is a splendid way to promote yourself in Google's index to push pointless (but profitable) pursuits such as online poker or "dieting" drugs. Many people don't realise their blogs are being used in this way. Try a search on Google using the phrase "A professor of classics at McGill University and the author of Autobiography of Red". Wow! That's 3,380 hits about Anne Carson! (You know, the professor of classics at McGill! What do you mean, you've never heard of her?) Odd that this precise phrase should turn up so much? Ah, but have a closer look. The great majority of links are to spam sites: the phrase (lifted from Amazon) had a few extra hyperlinks to spam sites added, and was posted to thousands of comment boxes in blogs all over the Web. To see how bad it can get, have a look at one such "polluted" blog post, at the "Cowbell Chronicles" (www.ineedmorecowbell.com/blog/000069.html). This idle thought posted in 2002 has a comments area that is a repository for a multitude of spam artists.

This electronic rain is flooding all the corners of the Web. Complaining is about as effective as shouting at rainclouds: one passes, another one appears. Many people try complaining to ISPs, but as soon as an ISP shuts down one "compromised" machine, a dozen more crop up: always Windows machines, usually taken over by viruses such as MyDoom, which were crafted specifically to create a host of "zombie" machines.

The parasitic economics of spam apply here. The spammer piggybacks on the blogger's bandwidth, upping the costs of running a blog by increasing the length of comments pages, and by calling up the comments page to auto-post commercial junk on it. Even if your comments page is only 2 Kbytes, having it called up by spammers 200,000 times a year means a bandwidth bill for an extra 400 Mbytes.
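The sum behind that figure is simple enough to check. The sketch below assumes decimal units (1 Mbyte = 1,000 Kbytes), which is what the round number implies; the function name is made up for the example.

```python
def extra_bandwidth_mbytes(page_kbytes, requests_per_year):
    """Yearly bandwidth, in Mbytes, spent serving one page to spammers.

    Assumes decimal units: 1 Mbyte = 1,000 Kbytes.
    """
    return page_kbytes * requests_per_year / 1000


if __name__ == "__main__":
    # A 2 Kbyte comments page called up 200,000 times a year.
    print(extra_bandwidth_mbytes(2, 200_000))  # prints 400.0
```

Longer comments pages, swollen by the spam itself, push the figure up in direct proportion.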

What sort of shelter is there from this deluge? Some bloggers don't accept comments. Some force contributors to register. Some force them to enter a series of numbers or letters that aren't machine-readable. Some require an e-mail and send a message with a hyperlink to click before the comment can appear. Others, including me, have programs that analyse the content of would-be comments for "spam words". It's not perfect; the other day I found my own attempts at a comment on my blog blocked. Just as with e-mail, spam is throttling one of the great communications media enabled by the internet, and particularly the Web.
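A "spam words" filter of the kind mentioned above might look something like this. It is a minimal sketch, not the program running on any particular blog: the word list and the link limit are hypothetical values chosen for illustration.

```python
import re

# Hypothetical settings: a real filter would use a longer, regularly
# updated list and a limit tuned to the blog's genuine comments.
SPAM_WORDS = {"poker", "cigars", "visas", "diet"}
MAX_LINKS = 2


def looks_like_spam(comment):
    """Flag a comment that contains a listed word or too many links."""
    text = comment.lower()
    words = set(re.findall(r"[a-z]+", text))
    link_count = len(re.findall(r"https?://", text))
    return bool(words & SPAM_WORDS) or link_count > MAX_LINKS


if __name__ == "__main__":
    print(looks_like_spam("Cheap cigars at http://example.com"))  # prints True
    print(looks_like_spam("Great post, thanks"))                  # prints False
    # A genuine comment that happens to use a listed word is blocked too.
    print(looks_like_spam("I lost at poker last night"))          # prints True
```

The last case shows why such filters are imperfect: a legitimate comment that mentions a flagged word is rejected along with the junk.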

Who's to blame? Not just the spammers. Blame must also rest on the search engines. By not finding better ways to prevent spammers gaming its index, Google, the most popular search engine, is allowing the problem to worsen. Allowing bloggers to report offending spam sites, and then removing the sites from Google, might be one step. It would be a challenge to operate, but anything that throws a spammer onto the back foot must be good.

A final criticism goes to Microsoft, for creating a consumer operating system of astonishing insecurity. If Windows 98 (the first version of Windows written when Microsoft was properly aware of the internet and networks) and its successors had been written with security in mind, there would be far fewer "compromised" Windows machines being used in this way. Sure, there would be some insecure machines, but not the legions drizzling rubbish upon those of us who want to engage in dialogue without being interrupted by parasites.

www.charlesarthur.com/blog
