Search engine robots - How they work, what they do (Part 1)
by Daria Goetsch
Why Isn't My Website In The Search Engine?
If your site isn't found in the search engines, it is probably because the robots couldn't deal with it. The cause could be as simple as the robots not being able to find the site, or it may be a more complicated issue, such as the robots not being able to crawl the site or work out what your pages are about.
Submitting your site to the major search engines will help with the "can't find it" problem. Even having links pointing back to your site can be enough to attract the search engine robots. Google, for example, suggests that you may not have to submit your pages; they will find your site if you have a link pointing back to it from at least one other site on the web.
If the robots can find your site but can't make sense of it, then you may need to look at the content and technology used on your pages. Frames, Flash, dynamically generated pages, and invalid HTML source code can cause problems when the search engine robot tries to access your web pages. While some search engines are beginning to be able to index dynamically generated pages and Flash (e.g. Google and AllTheWeb), use of some of these technologies can hinder your ability to be indexed by the search engine robots.
Text in images cannot be read by the search engine robots. Using ALT image text is an important way to help the robots "read" your images. Websites with extensive images rely heavily on ALT text to present their content.
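One way to check your own pages for this is to list the images that have no ALT text. The following is a minimal sketch using Python's standard html.parser module; the class name, function name and sample markup are made up for illustration, and it treats an empty ALT attribute the same as a missing one.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no usable ALT text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # A valueless attribute is reported as None, so fall back to "".
        alt = attributes.get("alt") or ""
        if not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images a robot could not 'read'."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Hypothetical sample markup.
sample_html = """
<img src="logo.gif" alt="Acme Widgets logo">
<img src="banner.gif">
<img src="spacer.gif" alt="">
"""
print(images_missing_alt(sample_html))  # ['banner.gif', 'spacer.gif']
```

An empty ALT attribute can be a deliberate choice for purely decorative images such as spacers, so treat the output as a list to review rather than a list of errors.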
How Do I Get The Most Out Of Indexing?
If you know what to "feed" the spidering robots, you will improve your chances of ranking well in the search engines.
Having a website full of good content is the major factor. Search engines exist to serve their visitors, not to rank your website. You need to be sure to present yourself in your site in the way that will be most useful to the search engine visitor. Each search engine has its own idea of what is important in a page, but they all value text highly. Making sure that the text on your pages includes your most important keyword phrases will help the search engine evaluate the content of those pages.
Making sure that you have good title and meta tags will further assist the search engines in understanding what your page is about. If the text on the page is about widgets, the title is about widgets, and the meta tags are about widgets, the search engine will have a pretty good idea that you are all about widgets. When their visitors search for widgets, the search engines know to list your site in the results.
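As a rough self-check of the "widgets everywhere" idea, you can test whether a keyword phrase appears in a page's title and meta tags. This is only a sketch: the names HeadTagReader and keyword_coverage and the sample page are invented here, and a simple substring match says nothing about how any particular search engine weighs these tags.

```python
from html.parser import HTMLParser


class HeadTagReader(HTMLParser):
    """Records the <title> text and the name/content pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            name = attributes["name"].lower()
            self.meta[name] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def keyword_coverage(html, phrase):
    """Report where the phrase occurs (case-insensitive substring match)."""
    reader = HeadTagReader()
    reader.feed(html)
    phrase = phrase.lower()
    return {
        "title": phrase in reader.title.lower(),
        "description": phrase in reader.meta.get("description", "").lower(),
        "keywords": phrase in reader.meta.get("keywords", "").lower(),
    }


# Hypothetical page: the keywords meta tag does not match the rest.
sample_page = """
<html><head>
<title>Blue Widgets from Acme</title>
<meta name="description" content="Hand-made blue widgets for every budget.">
<meta name="keywords" content="gadgets, gizmos">
</head><body><h1>Blue Widgets</h1></body></html>
"""
print(keyword_coverage(sample_page, "blue widgets"))
# {'title': True, 'description': True, 'keywords': False}
```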
A sitemap page is a very good way of giving the search engine robot every opportunity to reach your website pages. Since robots click through the links of your web pages, make sure that at least your most important pages are included in the sitemap; you may even want to include all your pages there, depending on the size of your site. Be sure to add a link to the sitemap page from each page on your site.
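If your page list lives in a script or database, the sitemap page itself can be generated. Here is a small sketch, assuming a hypothetical list of (URL, link text) pairs; it produces plain text links, which are the kind of links robots can follow.

```python
from html import escape


def build_sitemap_page(pages):
    """Return a plain HTML sitemap page with one text link per page."""
    items = "\n".join(
        f'<li><a href="{escape(url)}">{escape(title)}</a></li>'
        for url, title in pages
    )
    return (
        "<html><head><title>Site Map</title></head><body>\n"
        "<h1>Site Map</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>"
    )


# Hypothetical page list: (URL, link text) pairs.
pages = [
    ("/index.html", "Home"),
    ("/products.html", "Products"),
    ("/widgets.html", "Widgets"),
]
sitemap_html = build_sitemap_page(pages)
print(sitemap_html)
```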
Another important consideration is that of keeping all of your pages within a small number of "clicks" from your top page. Many robots will not follow links more than two or three levels deep, so if your "widgets" page can only be reached from your home page by following multiple links (e.g. home page >> about us page >> products page >> widgets page), the robot may not crawl deep enough to get to the widgets page.
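To see how many clicks each page is from your top page, you can run a breadth-first search over your site's links. The sketch below assumes you already have a map of which page links to which; the page names mirror the example above, and the limit of two clicks is an assumption you should adjust.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks from the home page to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link map: each page lists the pages it links to.
site_links = {
    "home": ["about"],
    "about": ["products"],
    "products": ["widgets"],
    "widgets": [],
}

MAX_CLICKS = 2  # assumed limit; robots differ
depths = click_depths(site_links, "home")
too_deep = [page for page, depth in depths.items() if depth > MAX_CLICKS]
print(too_deep)  # ['widgets']

# A sitemap page linked from the home page shortens the path.
site_links["home"].append("sitemap")
site_links["sitemap"] = ["about", "products", "widgets"]
print(click_depths(site_links, "home")["widgets"])  # 2
```

Pages that never appear in the result cannot be reached by following links from the home page at all, which is worth checking as well.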
Testing Your Website For Search Engine Robot Accessibility
To get an idea of just what the search engine robot "sees" on your page, you can look at the Sim Spider tool. You may be surprised at how different your site looks to the robot. You can find this tool at http://www.searchengineworld.com/cgi-bin/sim_spider.cgi
You will see text and ALT image text show up in the results. If your entire website is built in Flash, you will see nothing at all because robots don't understand Flash movies.
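You can approximate this text-only view yourself. The sketch below is not how Sim Spider works internally (that is not described here); it simply keeps the page text and ALT text and ignores everything else, using invented names and sample pages.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keeps page text and image ALT text; drops scripts and styles."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            alt = dict(attrs).get("alt") or ""
            if alt.strip():
                self.chunks.append(alt.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self._skip_depth:
            self.chunks.append(text)


def robot_view(html):
    """Return the pieces of text a simple robot could read from the page."""
    viewer = TextOnlyView()
    viewer.feed(html)
    viewer.close()  # flush any text still buffered by the parser
    return viewer.chunks


# Hypothetical pages: one text-based, one built entirely in Flash.
text_page = '<h1>Widgets</h1><img src="w.gif" alt="Blue widget"><p>We sell widgets.</p>'
flash_page = '<object data="site.swf" type="application/x-shockwave-flash"></object>'

print(robot_view(text_page))   # ['Widgets', 'Blue widget', 'We sell widgets.']
print(robot_view(flash_page))  # []
```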
The Bottom Line
When it comes to search engine robots, think simply. Lots of good content and text, hyperlinks the robots can follow, optimization of your pages, topical links pointing back to your site, and a sitemap will help ensure the best results when the robots come visiting.
SpiderSpotting - Search Engine Watch
List of robots and protocols for setting up a robots.txt file.
Tutorials, forums and articles about Search Engine spiders and Search Engine Marketing.
Articles and resources about tracking Search Engine spiders.
Sim Spider Search Engine Robot Simulator
Search Engine World has a spider that simulates what the Search Engine robots read from your website.
About the author
Daria Goetsch is the founder and Search Engine Marketing Consultant for Search Innovation Marketing (http://www.searchinnovation.com), a Search Engine Promotion company serving small businesses. She has specialized in search engine optimization since 1998, including three years as the Search Engine Specialist for O'Reilly & Associates, a technical book publishing company.
Copyright © 2002-2004 Search Innovation Marketing. http://www.searchinnovation.com All Rights Reserved.