Network: Web Design: Finding an irritation bypass when in search of Marx

IT IS universally agreed that finding information on the Web can be damn irritating. It's not impossible, of course. If it were, no one would use it (and we'd be out of jobs).

Most people, both novice and experienced Net users, will start their quest for information with one of the many search engines. Say they are seeking a Marx Brothers movie clip from A Night at the Opera: they type in "Marx Brothers", hit "search" and are whisked away to what is usually a massive list of Web pages that might, just might, have information that is relevant to them.

Naturally, if you have that famous movie clip of Groucho trying to stuff the entire crew of the cruise ship into his cabin, then you want your website placed towards the top, if not at the very top, of the search results list. Yet you are potentially competing against thousands of other Web pages that have the key words "Marx" and "Brothers" on them, some of which might be extolling the virtues of the proletariat rather than containing comedic genius.

So, how do you separate Karl Marx's philosophies from Groucho Marx's shtick? Vladimir Lenin's revolution from John Lennon's "Revolution"? How do you make sure that the content you have gets found by the people who need it?

The place to start is by understanding how the search engines your potential visitors use actually work.

Different flavours

Although the outcome is often the same - a list of search results - there are really two different types of search engines on the Web: crawlers and directories. These two methods differ primarily in the ways that they gather the data from which they create their index of sites, which is then searched.

Crawlers: Crawlers, such as AltaVista or Excite, use a program called a spider, which "crawls" through the Web, indexing pages along the way. Visitors can then search through the results that the spider finds. However, if a change is made to a Web page, the spider has to crawl through that page again before the change is detected. The World Wide Web is a really big place, so it might take a while for the spider to get back again.
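To make the spider's behaviour concrete, here is a minimal sketch in Python. It is not how AltaVista or Excite actually work: the "Web" is a small made-up dictionary, the addresses are hypothetical, and the function name crawl is my own. It does show the two points above: the spider follows links from page to page, and its copy of a page goes stale until it crawls that page again.

```python
from collections import deque

# A toy "Web" (hypothetical addresses): each page has some text and a list of links.
TOY_WEB = {
    "a.example/home": ("Marx Brothers film clips", ["a.example/opera", "b.example/karl"]),
    "a.example/opera": ("A Night at the Opera cabin scene", ["a.example/home"]),
    "b.example/karl": ("Karl Marx and the proletariat", []),
}

def crawl(start_url, web):
    """Follow links from start_url and return a snapshot {url: text} of every page reached."""
    snapshot = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in snapshot or url not in web:
            continue  # already visited, or a dead link
        text, links = web[url]
        snapshot[url] = text   # store what the page said at crawl time
        queue.extend(links)    # visit the pages it links to later
    return snapshot

snapshot = crawl("a.example/home", TOY_WEB)

# The page changes after the spider has been through...
TOY_WEB["a.example/opera"] = ("Groucho stuffs the crew into his cabin", ["a.example/home"])

# ...but the snapshot still holds the old text until the next crawl.
stale_text = snapshot["a.example/opera"]
fresh_text = crawl("a.example/home", TOY_WEB)["a.example/opera"]
```

Here stale_text is still the old wording, while fresh_text reflects the change, which is why a freshly edited page can take a while to show up correctly in search results.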

Directories: Unlike the active crawlers, passive directories require website creators (or whoever wants to do it) to register a site in their index. The advantage of a directory, such as Yahoo! or DMOZ, is that it is far more selective as to what content is indexed, so searches tend to be more focused and produce more accurate results. However, directories are also harder to keep up to date, especially if a site has to be checked by a human being before entry. The other great advantage of a directory is that searchers can bypass the search engine altogether, and find what they are looking for by selecting from lists of increasingly specific topics.
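Browsing a directory by topic can be pictured as walking down a tree. The sketch below is only an illustration: the categories, the site addresses and the browse function are all invented for the example and do not reflect how Yahoo! or DMOZ organise their listings.

```python
# A hypothetical directory: topics nest inside topics, and the innermost
# level holds the registered sites.
DIRECTORY = {
    "Entertainment": {
        "Movies": {"Comedy": ["marxbrothers.example"]},
        "Music": {"Rock": ["lennon.example"]},
    },
    "Society": {
        "Politics": {"Marxism": ["karlmarx.example"]},
    },
}

def browse(directory, path):
    """Follow a list of topic choices and return whatever sits at that level."""
    node = directory
    for topic in path:
        node = node[topic]  # raises KeyError if the topic does not exist
    return node

top_level = sorted(browse(DIRECTORY, []))
comedy_sites = browse(DIRECTORY, ["Entertainment", "Movies", "Comedy"])
```

Each choice narrows the subject, so someone after Groucho never has to wade through Karl: the two sit on entirely different branches.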

Hybrid Search Engines: Several search engines, for example, Yahoo!, will allow you to search indexes created both by crawlers and directories simultaneously. This allows you to deploy the advantages of both techniques at the same time.

The parts of a search engine

Whether a search engine uses a crawler or a registration directory to get its data, it has at least two parts in common with all the others: the index and the search software.

The Index: All of the content that gets crawled by the spider, and/or all of the entries in the directory, gets placed into the index. If the search engine uses a spider, then this massive database can contain every page that has been crawled, making it in effect a carbon copy of the part of the Web the spider has visited. If the search engine uses a directory, then only the titles, URLs, and descriptions of Web pages are included in the index.

The Search Software: When a visitor uses a search engine, they first enter one or more keywords. The index is then sifted through by search software which matches the key word(s) to Web pages and ranks them in order of relevance. So, how does the search software make the crucial decision as to which pages are more relevant, and thus closer to the top of the list, than others?

Ranking Web sites

Most search engines that use a crawler to gather their massive amounts of data determine relevancy by following a set of rules that stays more or less consistent across products. If someone is using the search engine to find the words "Marx Brothers", the search engine will check to see:

Which pages have these words in the

Which pages one or both of the words appear on.

How close to the top the words appear, assuming that the closer the words are to the beginning of a page the more relevant that page is.

How frequently the words appear on the page.

How close the words appear together.

And after considering all of these criteria, it produces the list of sites in order of relevancy. Well, almost.
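As a rough illustration of how such rules might combine, the Python sketch below scores a page on four of the checks: whether the words are present, how early they appear, how often they appear, and how close together they are. The equal weighting and the function names score and rank are my own assumptions; real search engines keep their actual formulas to themselves.

```python
import re

def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page_text, keywords):
    """Return a relevance score; higher means more relevant. Weights are arbitrary."""
    words = tokenize(page_text)
    keys = list(dict.fromkeys(k.lower() for k in keywords))  # drop duplicate keywords
    positions = {k: [i for i, w in enumerate(words) if w == k] for k in keys}
    found = [k for k in keys if positions[k]]
    if not found:
        return 0.0
    presence = len(found) / len(keys)                    # one or all of the words?
    first = min(positions[k][0] for k in found)
    earliness = 1.0 - first / len(words)                 # closer to the top is better
    frequency = sum(len(positions[k]) for k in found) / len(words)
    proximity = 0.0
    if len(found) >= 2:
        # Smallest distance between occurrences of two different keywords.
        gap = min(abs(i - j)
                  for a in found for b in found if a < b
                  for i in positions[a] for j in positions[b])
        proximity = 1.0 / gap                            # adjacent words score highest
    return presence + earliness + frequency + proximity

def rank(pages, keywords):
    """pages: {url: text}. Return the urls ordered from most to least relevant."""
    return sorted(pages, key=lambda url: score(pages[url], keywords), reverse=True)

pages = {
    "groucho.example": "Marx Brothers clips: the Marx Brothers in A Night at the Opera",
    "karl.example": "Essays on the proletariat by Karl Marx and his brothers in arms",
}
results = rank(pages, ["Marx", "Brothers"])
```

Both pages contain both words, but the first has them at the very top, more often, and side by side, so it is listed ahead of the second.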

Secret ingredients

While all of the major search engines follow this basic recipe, if they all worked in exactly the same way, then we would only need one search engine. Some crawlers index more pages than others, while many directories will use human beings to evaluate submitted websites. All search engines will put their own spin on searching to differentiate themselves from the competition.

Next week, I'll go further in depth into some of the secret ingredients that different search engines use to find your site. And then over the following weeks I'll be taking a look at how to optimise a site for searching, and some of the resources online to help you get a handle on the search engine monster.

Jason Cranford Teague is the author of 'DHTML For the World Wide Web'. If you have questions, you can find an archive of his column at Webbed Environments (www.webbedenvironments.com) or e-mail him at jason@webbedenvironments.com
