Monday, August 9th, 2010 at 2:04 pm
Search Engine basics
Crawler-based search engines (SEs) use special software to automatically and regularly visit websites in order to create and supplement their giant Web page repositories.
This software is referred to as a “bot”, “robot”, “crawler”, or “spider”. All these terms denote the same concept. Adding your site manually to a search engine’s listing is not enough; your site must also be set up so that the “spider” can scan it properly. Settings, typically a robots.txt file or a robots meta tag, can stop the search engine from going into your site. Make an error in those settings and the search engines, and the world, may never find your site.
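The settings that keep a spider out are commonly written in a robots.txt file at the root of the site. Here is a minimal sketch, using Python's standard urllib.robotparser module, of how one might check what such a file allows. The file contents, the "ExampleBot" name, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption, not from any real site).
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"

# One stray rule like this shuts every well-behaved spider out of the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for rules in (ROBOTS_TXT, BLOCK_ALL):
        for path in ("/index.html", "/private/notes.html"):
            url = "http://example.com" + path
            print(path, can_crawl(rules, "ExampleBot", url))
```

Running a check like this after every change to robots.txt is a cheap way to catch the kind of mistake that hides a site from search engines.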
After a spider has found a page to scan, it retrieves this page via HTTP (like any ordinary Web surfer who types a URL into a browser’s address field and presses “enter”). Just like any human visitor, the crawling software leaves a record of its visit on your server, so you can find out how often the search engine has been to your site. If the search engine stays away for a bit, that is no problem, but if it drops out for a month you may need to perform some maintenance.
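The record a spider leaves usually ends up in the Web server's access log. As a rough sketch, assuming the widely used "combined" log format, the following pulls out the visit times of one crawler by looking for its name in the user-agent field. The log lines, IP addresses, and the "ExampleBot" name are invented for illustration; your own log format may differ.

```python
import re
from datetime import datetime

# Hypothetical log lines in the "combined" format (assumed, not real traffic).
SAMPLE_LOG = [
    '192.0.2.10 - - [02/Aug/2010:08:15:00 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '198.51.100.7 - - [09/Aug/2010:13:59:30 +0000] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"',
    '192.0.2.10 - - [09/Aug/2010:14:04:00 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
]

# Timestamp, request, status, size, referrer, user agent.
LOG_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_visits(lines, marker):
    """Return sorted visit times for requests whose user agent contains marker."""
    visits = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and marker.lower() in match.group("agent").lower():
            stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            visits.append(stamp)
    return sorted(visits)


if __name__ == "__main__":
    visits = crawler_visits(SAMPLE_LOG, "examplebot")
    print("visits:", len(visits), "most recent:", visits[-1])
```

Comparing the most recent visit time with today's date tells you whether the spider has merely paused or has been gone long enough to investigate.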
Your Web server returns the HTML source code of your page to the spider. The spider then reads it (this process is referred to as “crawling” or “spidering”), and this is where the difference between a human visitor and crawling software begins. The human reads the screen and moves on, while the spider extracts the data for immediate use.
Spiders concentrate on the content of the meta tags and the text in the web page. Where a user would be impressed by a fancy Flash display, the spider can’t see it. The result can be that even though you have the most mind-blowing animation on your web site, the spiders won’t give you any credit for the content. The spider assigns value based on the position of the text on the page. Repeating the same keyword at the top and bottom of the page (this is called prominence; I’ll cover that later this week) lets the spider know it’s important.
Images draw the user into the content of the page, but the spider has no idea what the content of an image is. Unless your web site designer has a handle on SEO principles, your fancy graphics won’t help your site at all. Each image tag can carry an ALT description that 99% of users will never see, but the spider reads it for keywords. The spider sees things users will never look for and uses that information to draw users to your site.
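To get a feel for what a spider takes away from a page, here is a small sketch using Python's standard html.parser module. It collects the meta tags, the image ALT descriptions, and the visible text, and ignores everything else. The SpiderView class and the sample page are hypothetical; real crawlers are far more elaborate.

```python
from html.parser import HTMLParser

# Hypothetical page used only for illustration.
PAGE = """<html><head>
<title>Blue Widgets</title>
<meta name="description" content="Hand-made blue widgets">
<meta name="keywords" content="widgets, blue">
<style>h1 { color: blue; }</style>
</head><body>
<h1>Blue Widgets</h1>
<img src="widget.png" alt="A blue widget">
<p>We sell widgets.</p>
</body></html>"""


class SpiderView(HTMLParser):
    """Collects meta tags, image ALT text, and visible text from HTML."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.alt_texts = []
        self.text = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "img" and attrs.get("alt"):
            self.alt_texts.append(attrs["alt"])
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def spider_view(html):
    """Parse html and return the populated SpiderView."""
    view = SpiderView()
    view.feed(html)
    view.close()
    return view


if __name__ == "__main__":
    view = spider_view(PAGE)
    print(view.meta)
    print(view.alt_texts)
    print(view.text)
```

Notice that the image itself contributes nothing except its ALT text, which is exactly why that description matters.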
SEO (search engine optimization) makes your page more search-engine friendly. The optimization is mostly oriented towards crawler-based engines, which are the most popular on the Internet. The search engine basics presented here represent the tip of the iceberg in terms of knowing how to format a page to get the best results.
Spiders work by copying your site into a large repository of Web pages called a search engine index. The data is stored in the index in a way that makes it possible to quickly determine whether a page is relevant to a particular query and to pull it out for inclusion in the results page shown in response to that query. The process of placing your page in the index is referred to as “indexing”. After your page has been indexed, it will appear on search engine results pages for the words and phrases most common on the indexed Web page. Its position in the list, however, may vary. Getting your web page to the top of the first results page is the goal of every SEO. Sometimes all it takes is a few small changes to the concentration of keywords on the page so that the search engine ranks the right words the way you want it to.
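One common way to make lookups that fast is an inverted index, which maps each word to the pages that contain it. The sketch below is a toy version under that assumption; the page URLs and texts are made up, and this post does not describe how any particular search engine actually stores its index.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> extracted text.
PAGES = {
    "/widgets.html": "Blue widgets and red widgets for sale",
    "/about.html": "About our widget shop",
    "/contact.html": "Contact us for sale prices",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build a mapping of word -> {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index


def lookup(index, word):
    """Return the sorted URLs of pages containing word."""
    return sorted(index.get(word.lower(), {}))


if __name__ == "__main__":
    index = build_index(PAGES)
    print(lookup(index, "widgets"))
    print(lookup(index, "sale"))
```

Because the work of reading every page is done once at indexing time, answering a query becomes a dictionary lookup rather than a scan of the whole repository.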
Later, when someone searches the engine for particular terms, your page will be pulled out of the index and included in the search results. The search engine now applies a sophisticated technique to determine how relevant your page is to these terms. It considers many on-page and off-page factors, and the page is given a certain position, or rank, among the other results found for the surfer’s query. This process is called “ranking”.
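Real ranking formulas are proprietary and weigh many factors, so the following is only a toy illustration of the idea: it scores each page by a single on-page factor, the share of its words that match the query, and sorts the results. The pages, the rank function, and the scoring rule are all assumptions made for this example.

```python
import re

# Hypothetical indexed pages: URL -> extracted text.
PAGES = {
    "/a.html": "Blue widgets and more blue widgets",
    "/b.html": "Widgets for sale in many colors",
    "/c.html": "Contact us",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(pages, query):
    """Return URLs ordered by the fraction of page words matching the query."""
    terms = set(tokenize(query))
    scores = {}
    for url, text in pages.items():
        words = tokenize(text)
        hits = sum(1 for word in words if word in terms)
        if hits:
            scores[url] = hits / len(words)
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


if __name__ == "__main__":
    print(rank(PAGES, "blue widgets"))
```

Even this crude score shows why small changes to the wording of a page can shift its position for a given query.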
While search engine basics just scratch the surface, a little knowledge can help your standings immensely.