Network: Web Design: Finding an irritation bypass when in search of Marx

IT IS universally agreed that finding information on the Web can be damn irritating. It's not impossible, of course. If it were, no one would use it (and we'd be out of jobs).

Most people, both novice and experienced Net users, will start their quest for information with one of the many search engines. They think of a description of what they are seeking - say, a Marx Brothers movie clip from A Night at the Opera - type in "Marx Brothers", hit "search" and are whisked away to what is usually a massive list of Web pages that might, just might, have information that is relevant to them.

Naturally, if your website has that famous movie clip of Groucho trying to stuff the entire crew of the cruise ship into his cabin, then you want it placed towards the top, if not at the very top, of the search results list. Yet you are potentially competing against thousands of other Web pages that have the key words "Marx" and "Brothers" on them, some of which might be extolling the virtues of the proletariat rather than containing comedic genius.

So, how do you separate Karl Marx's philosophies from Groucho Marx's shtick? Vladimir Lenin's revolution from John Lennon's "Revolution"? How do you make sure that the content you have gets found by the people who need it?

The place to start is by understanding how the search engines your potential visitors use actually work.

Different flavours

Although the outcome is often the same - a list of search results - there are really two different types of search engines on the Web: crawlers and directories. These two methods differ primarily in the ways that they gather the data from which they create their index of sites, which is then searched.

Crawlers: Crawlers, such as AltaVista or Excite, use a program called a spider, which "crawls" through the Web, indexing pages along the way. Visitors can then search through the results that the spider finds. However, if a change is made to a Web page, the spider has to crawl through that page again before the change is detected. The World Wide Web is a really big place, so it might take a while for the spider to get back again.
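As a rough illustration of why a crawler's results can lag behind the live Web, here is a minimal sketch in Python. The three-page "web", the `.example` addresses and the `crawl` function are all invented for this example; a real spider fetches pages over the network and is far more elaborate.

```python
from collections import deque

# Hypothetical miniature "web": address -> (page text, outgoing links).
WEB = {
    "a.example": ("Marx Brothers film clips", ["b.example", "c.example"]),
    "b.example": ("Karl Marx and the proletariat", ["a.example"]),
    "c.example": ("A Night at the Opera", []),
}

def crawl(start, web):
    """Follow links breadth-first from `start`; return a snapshot {address: text}."""
    snapshot = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in snapshot or url not in web:
            continue  # already visited, or a dead link
        text, links = web[url]
        snapshot[url] = text
        queue.extend(links)
    return snapshot

index = crawl("a.example", WEB)

# The page changes after the spider has been through...
WEB["c.example"] = ("Groucho stateroom scene", [])
stale = index["c.example"]        # the snapshot still holds the old text

# ...and only a second crawl picks up the change.
index = crawl("a.example", WEB)
fresh = index["c.example"]
```

The snapshot is only as current as the spider's last visit, which is the delay described above.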

Directories: Unlike the active crawlers, passive directories require website creators (or whoever wants to do it) to register a site in their index. The advantage of a directory, such as Yahoo! or DMOZ, is that it is far more selective about what content is indexed, so searches tend to be more focused and produce more accurate results. However, directories are also harder to keep up to date, especially if a site has to be checked by a human being before entry. The other great advantage of a directory is that searchers can bypass the search engine altogether and find what they are looking for by selecting from lists of increasingly specific topics.
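Browsing a directory by topic can be pictured as walking down a tree of categories. The sketch below is only an illustration: the category names, the site addresses and the `browse` function are invented, and no real directory is claimed to be organised this way.

```python
# Hypothetical directory: nested topics, with registered sites at the leaves.
DIRECTORY = {
    "Entertainment": {
        "Movies": {
            "Comedy": ["marx-brothers-clips.example"],
        },
    },
    "Society": {
        "Politics": {
            "Marxism": ["karl-marx-essays.example"],
        },
    },
}

def browse(directory, path):
    """Follow a list of topic names down the tree and return what is found there."""
    node = directory
    for topic in path:
        node = node[topic]  # raises KeyError if the topic does not exist
    return node

subtopics = sorted(browse(DIRECTORY, ["Entertainment"]))
sites = browse(DIRECTORY, ["Entertainment", "Movies", "Comedy"])
```

Because Groucho and Karl sit in different branches, the visitor never has to type "Marx" at all.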

Hybrid Search Engines: Several search engines, for example, Yahoo!, will allow you to search indexes created both by crawlers and directories simultaneously. This allows you to deploy the advantages of both techniques at the same time.

The parts of a search engine

Whether the search engine uses a crawler or a registration directory to get its data, they all have at least two parts in common: the index and the search software.

The Index: All of the content that gets crawled by the spider, and/or all of the entries in the directory, get placed into the index. If the search engine uses a spider, then this massive database can contain every page that has been crawled, making it in effect a copy of the part of the Web the spider has visited. If the search engine uses a directory, then only the titles, URLs, and descriptions of Web pages are included in the index.

The Search Software: When a visitor uses a search engine, they first enter one or more keywords. The index is then sifted through by search software which matches the key word(s) to Web pages and ranks them in order of relevance. So, how does the search software make the crucial decision as to which pages are more relevant, and thus closer to the top of the list, than others?
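One common way to build such an index is an "inverted index", which maps each word to the pages that contain it. The sketch below assumes that approach purely for illustration; the pages, addresses and function names are invented, and it only matches pages without ranking them.

```python
import re
from collections import defaultdict

# Hypothetical pages the spider has collected: address -> page text.
PAGES = {
    "groucho.example": "The Marx Brothers in A Night at the Opera",
    "karl.example": "Karl Marx on the proletariat",
    "grimm.example": "The Brothers Grimm fairy tales",
}

def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map every word to the set of pages it appears on."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
both = search(index, "Marx Brothers")
marx_only = search(index, "Marx")
```

Searching for "Marx" alone returns both Groucho's and Karl's pages, which is exactly the problem ranking has to solve.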

Ranking Web sites

Most search engines that use a crawler to gather the massive amounts of data they search determine relevancy by following a set of rules that stay more or less consistent across products. If someone is using the search engine to find the words "Marx Brothers", the search engine will check to see:

Which pages have these words in the

Which pages one or both of the words appear on.

How close to the top the words appear, assuming that the closer the words are to the beginning of a page the more relevant that page is.

How frequently the words appear on the page.

How close the words appear together.

And after considering all of these criteria, it produces the list of sites in order of relevancy. Well, almost.
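To make the last four checks concrete, here is a toy scoring function in Python. It is a sketch under loud assumptions: the pages, the addresses, the function names and the decision to simply add the four measures together are all invented for illustration, and no real search engine is claimed to weigh them this way.

```python
import re

def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page_text, query):
    """Toy relevance score; the equal weighting is an arbitrary assumption."""
    words = tokenize(page_text)
    terms = list(dict.fromkeys(tokenize(query)))  # query words, duplicates removed
    positions = {t: [i for i, w in enumerate(words) if w == t] for t in terms}
    found = [t for t in terms if positions[t]]
    if not found:
        return 0.0
    # How many of the query words appear on the page.
    coverage = len(found) / len(terms)
    # How close to the top the first match is (1.0 means the very first word).
    first = min(positions[t][0] for t in found)
    earliness = 1.0 - first / len(words)
    # How frequently the words appear, relative to the length of the page.
    frequency = sum(len(positions[t]) for t in found) / len(words)
    # How close together the first two matched words appear.
    proximity = 0.0
    if len(found) > 1:
        gap = min(abs(i - j)
                  for i in positions[found[0]]
                  for j in positions[found[1]])
        proximity = 1.0 / gap
    return coverage + earliness + frequency + proximity

def rank(pages, query):
    """Return addresses with a non-zero score, most relevant first."""
    scored = [(score(text, query), url) for url, text in pages.items()]
    return [url for s, url in sorted(scored, reverse=True) if s > 0]

# Hypothetical pages.
PAGES = {
    "groucho.example": "Marx Brothers clips: the Marx Brothers in A Night at the Opera",
    "karl.example": "Karl Marx wrote about the proletariat",
    "grimm.example": "Fairy tales collected by the Brothers Grimm",
    "cooking.example": "Cooking with garlic",
}

ranking = rank(PAGES, "Marx Brothers")
```

With these invented pages the Groucho page comes first, because it contains both words, early, often and side by side.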

Secret ingredients

While all of the major search engines follow this basic recipe, if all search engines worked in exactly the same way, then we would only need one search engine. Some crawlers index more pages than others, while many directories will use human beings to evaluate submitted websites. All search engines will put their own spin on searching to differentiate themselves from the competition.

Next week, I'll go further in depth into some of the secret ingredients that different search engines use to find your site. And then over the following weeks I'll be taking a look at how to optimise a site for searching, and some of the resources online to help you get a handle on the search engine monster.

Jason Cranford Teague is the author of 'DHTML For the World Wide Web'. If you have questions, you can find an archive of his column at Webbed Environments (www.webbedenvironments.com) or e-mail him at jason@webbedenvironments.com
