Search engines are searchable databases built by software "robots," or spiders, that roam the World Wide Web looking for pages. The pages these spiders return are compiled into a database that attempts to provide a picture of what material is available on the web. When you search one of these databases, the search engine returns the sites that most closely match your search query.
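The idea of a spider-built database that is then matched against a query can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the page URLs and text, and the function names build_index and search, are invented for the example, and "closest match" is simplified to "contains the most query words."

```python
from collections import defaultdict

# Hypothetical pages a spider might have returned (URL -> page text).
PAGES = {
    "http://example.com/cats": "cats and kittens care guide",
    "http://example.com/dogs": "dogs care guide",
    "http://example.com/catalog": "library catalog search guide",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs ordered by how many query words they contain."""
    hits = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            hits[url] += 1
    # Most matching words first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda url: (-hits[url], url))

INDEX = build_index(PAGES)
print(search(INDEX, "cats care"))
```

Here the page about cats is listed first because it contains both query words, while the page about dogs contains only one.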
Building the Search Engine
Ranking Criteria
Ranking criteria are the factors that a search engine uses to determine the order in which results are displayed. Every search engine is different: each uses different criteria and covers different sites. Each criterion is given a level of importance, which helps the search engine return the sites most relevant to the search.
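One simple way to picture criteria with different levels of importance is a weighted sum. The sketch below assumes three made-up criteria (title_match, body_matches, inbound_links) with made-up weights; the names, numbers, and pages are invented for illustration and do not describe any actual search engine.

```python
# Hypothetical criteria and weights, invented for illustration only.
WEIGHTS = {"title_match": 3.0, "body_matches": 1.0, "inbound_links": 0.5}

def score(page):
    """Weighted sum of a page's criterion values."""
    return sum(WEIGHTS[name] * page[name] for name in WEIGHTS)

def rank(pages):
    """Return page names from highest to lowest score."""
    return sorted(pages, key=lambda name: score(pages[name]), reverse=True)

# Hypothetical measurements for three pages against one query.
PAGES = {
    "a.example": {"title_match": 1, "body_matches": 2, "inbound_links": 4},
    "b.example": {"title_match": 0, "body_matches": 5, "inbound_links": 2},
    "c.example": {"title_match": 0, "body_matches": 1, "inbound_links": 1},
}

for name in rank(PAGES):
    print(name, score(PAGES[name]))
```

Changing the weights changes the order, which is one reason two search engines can return the same sites in different positions.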
Search Engine Issues:
What they don't Search
For more on how Search Engines work see:
Interesting Info about Search Engines:
Truncation: Truncation searches on the beginning of a word and matches any ending. For instance, the truncated word cat will find cats, category, catalog, etc. As this example shows, you should be careful when using truncation, because a short stem can match many unrelated words. Truncation is most useful for catching plurals.
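Truncation behaves like a "starts with" test. A minimal Python sketch, assuming a small made-up word list and a hypothetical helper named truncation_search:

```python
def truncation_search(stem, vocabulary):
    """Return every word that begins with the truncated stem, ignoring case."""
    stem = stem.lower()
    return sorted(word for word in vocabulary if word.lower().startswith(stem))

# Hypothetical vocabulary drawn from indexed pages.
WORDS = ["cat", "cats", "category", "catalog", "dog", "concatenate"]

print(truncation_search("cat", WORDS))
```

The stem cat matches cat, catalog, category, and cats, but not concatenate, because truncation only leaves the ending of the word open. A longer stem narrows the match.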
EXTRA FUN INFO!!
A study of the web's structure, five times larger than any attempted previously, reveals that it isn't the fully interconnected network that we've been led to believe. The study suggests that the chance of being able to surf between two randomly chosen pages is less than one in four. Researchers from three Californian groups at IBM's Almaden Research Center in San Jose, the Altavista search engine in San Mateo and Compaq Systems Research Center in Palo Alto have analysed 200 million web pages and 1.5 billion hyperlinks. Their results, which will be presented next week at the World Wide Web 9 Conference in Amsterdam, indicate that the web is made up of four distinct components.

A central core contains pages between which users can surf easily. Another large cluster, labeled 'in', contains pages that link to the core but cannot be reached from it. These are often new pages that have not yet been linked to. A separate 'out' cluster consists of pages that can be reached from the core but do not link to it, such as corporate websites containing only internal links. Other groups of pages, called 'tendrils' and 'tubes', connect to either the in or out clusters, or both, but not to the core, whereas some pages are completely unconnected. To illustrate this structure, the researchers picture the web as a plot shaped like a bow tie with finger-like projections.

(from Nature 405, 113 (2000) © Macmillan Publishers Ltd.)
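The core, 'in', and 'out' groups can be worked out on a toy link graph by following links forwards and backwards from a page known to be in the core. The sketch below is not the researchers' method: the six-page graph and the names reachable, reverse, and bow_tie are invented for illustration, and the core is assumed to be the set of pages that can both reach and be reached from the chosen seed page. Tendrils, tubes, and unconnected pages are lumped together as "other".

```python
def reachable(graph, start):
    """All pages reachable from start by following links (including start)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverse(graph):
    """Same pages, with every link pointing the opposite way."""
    rev = {page: [] for page in graph}
    for src, targets in graph.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def bow_tie(graph, seed):
    """Split pages into core / in / out / other relative to a seed core page."""
    forward = reachable(graph, seed)             # pages the seed can surf to
    backward = reachable(reverse(graph), seed)   # pages that can surf to the seed
    core = forward & backward
    pages = set(graph) | {d for targets in graph.values() for d in targets}
    return {
        "core": core,
        "in": backward - core,
        "out": forward - core,
        "other": pages - forward - backward,
    }

# Hypothetical six-page web: page -> pages it links to.
LINKS = {
    "new": ["a", "tendril"],  # links into the core, nothing links to it
    "a": ["b"],
    "b": ["a", "corp"],
    "corp": [],               # reachable from the core, links nowhere
    "tendril": [],            # hangs off 'in', never reaches the core
    "island": [],             # completely unconnected
}

print(bow_tie(LINKS, "a"))
```

In this toy graph, a and b form the core, new falls in the 'in' cluster, corp in the 'out' cluster, and tendril and island are left over.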