-
What Search Engine ‘Spiders’ Are And How They Work
Posted on March 16th, 2010 No commentsSometimes referred to as ’spiders’ or crawlers, automated search engine robots seek out web pages for the user. Just how do they accomplish this and is this of importance? What is the real purpose of these robots?
These spiders actually have a rather limited scope of understanding and power available to them, far less than you would think considering they’re minions of such great and mighty names as Google and Yahoo. There’s a lot of things out of the scope of their understanding, such as frames, visuals such as movies or pictures, and scripting via java. Nor can they peek into parts of sites protected by passwords, or click buttons. Well, that’s what they can’t do. What CAN they do?
Spiders are able to determine the content of your page by looking at the visible text, the HTML code, and links. Based on the words it finds, the spider determines what the site is about using a complex algorithm to determine what is and isn’t important. Spiders also collect links from websites to follow later, which allows them to effectively hop from site to site to site. Since the entire internet is made up of links between websites, the robots use them to make their way through the internet as they search.
Submitting a new URL to a search engine adds this URL to the queue which the spiders are due to ‘crawl’ or visit. However, even if a URL isn’t submitted directly, the spiders usually find it through links from other websites. If you build link popularity, this will help the spiders find you faster. When the robots arrive, they’ll check your site for a file called ‘robots.txt,’ which will tell them what areas of the website they are not allowed to visit. Off-limits files may include things like binaries or other information that the spiders need not report back.
When the robots return, the information they gathered is assimilated into the search engine’s database. Through a complex algorithm, this data is interpreted and web sites are ranked according to how relevant they are to various topics that would be searched for. Some of the bots are quite easy to notice – Google’s is the appropriately-named Googlebot, where Inktomi utilizes a more ambiguous bot named Slurp. Others may be difficult to identify at all.
There may be robots that you do not want to visit your website such as aggressive bandwidth grabbing robots and others. The ability to identify individual robots and the number of their visits is useful. Information on the undesirable robots is helpful also. IP names and addresses of search engine robots are listed at the end of this article in a resources section. These robots read the pages on your website by visiting your page and looking at the text that is visible on the page, and then looks at the source code tags such as title tags, meta tags and others. They look at the hyperlinks on your page. From these links, the search engine robot can determine what your page is about. Each search engine has its own algorithm to determine what is important. Information is indexed and delivered to the search engine’s database according to how the robot has been set up through the search engine.
The search engine sorts the information that has been delivered to the databases which has become a part of the search engine and directory ranking process. This allows it to display the results. Databases are updated periodically. Robots visit you regularly to find any changes to your pages so that the latest information will be available. The way in which the search engine is set up determines how the number of visits you get is calculated. This can vary with different search engines. If your website is down or experiencing a large amount of traffic, the robot may not be able to access the page they are trying to visit. The website may not be re-indexed when this occurs. This depends on how frequently your site is visited by the robot. In the hope that your site will be accessible again, the robot will re-visit your site to see if it has become accessible.
Justin Harrison is an internationally recognised Internet Marketing expert who provides world class SEO Services to website owners. For more information visit: http://www.seorankings.co.za
Related posts
- How Expert SEO Website Design Assists Our Site To Rank Higher
- Article Ezine Submission: How To Get High Page Rank Fast
- Doctor Video; How To Go About It
- Powerful Dental Web Sites Entice More Clients
- How Doctor Video Marketing Can Help Your Practice
Shared Post
news, seo internet marketing, Search Engine Marketing, search engine optimization, search marketing, seo, seo servicesLeave a reply
blog.moneyblink.com
IM Money Maker


Get You 100% FREE Copy Of Submit My Site Software!