Goals of this lesson
.. Define site architecture
.. Discover why site architecture is important
.. Learn about key elements of site architecture
.. Learn how each element impacts search engines
.. Learn what to do to avoid compatibility issues
Site architecture consists of your site’s navigation structure, the look and
feel of the site (site design), the page layout and the structure of various
elements on your page. Basically it is the entire framework that supports your
web site content.
Search engine rankings are impacted by site architecture decisions made in the
web site design and development stage. By using search engine unfriendly design,
you can sabotage your own site’s ability to rank well and be found by your target
audiences.
.. Using site design components that are compatible with search engine spiders.
.. Having content that is useful and relevant to searchers and other sites (for
linking purposes).
.. Ensuring your site design does not contain elements that negatively impact
a search engine’s ability to index it.
.. Using a navigation and linking structure that encourages regular indexing
by search engines.
Elements of Site Architecture That Impact Search Ranking
• Directory structure
• File naming
• Page file extensions
• Navigation menus
• Multiple entry points
• Heading tags
• Robots META Tag / Robots.txt file
• Error trapping
• Site maps
• Cascading Style Sheets (CSS)
• Server Side Includes (SSI)
• Text content
• Image maps / graphics
• ALT IMG tags
• Intro / splash pages
• Dynamic content
• Tables
• Frames
• Flash
Now let’s go through these elements one by one.
Directory Structure
It is widely believed that many search engines only index web sites to a depth
of two levels or a maximum of 50-60 files. Therefore, you should try to keep important
content that you want indexed in the top two directory levels of your site, e.g. www.site.com/level1/level2/page.htm.
See our diagram below:

If you have a very large site and need more of your pages and deeper content indexed, it is recommended that you utilize the Trusted Feed and Paid Inclusion services provided by some engines. We’ll look at these in more detail later in this course.
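The two-level guideline described in this section can be checked with a short script. The following is a minimal Python sketch, not a search engine rule: the function name `directory_depth` and the `MAX_DEPTH` constant are illustrative assumptions, and the second URL is a made-up example.

```python
from urllib.parse import urlparse

MAX_DEPTH = 2  # the two-level guideline from this lesson, not a documented limit


def directory_depth(url: str) -> int:
    """Return how many directories sit between the domain and the file."""
    # urlparse needs a scheme to separate the host from the path
    if "://" not in url:
        url = "http://" + url
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Treat the last segment as a file name when it contains a dot
    if segments and "." in segments[-1]:
        segments = segments[:-1]
    return len(segments)


for url in ["www.site.com/level1/level2/page.htm",
            "www.site.com/a/b/c/d/page.htm"]:
    depth = directory_depth(url)
    status = "ok" if depth <= MAX_DEPTH else "too deep"
    print(f"{url}: depth {depth} ({status})")
```

Running a check like this over a list of your own URLs is one way to spot important pages that sit deeper than you intended.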
File Naming
Search engines index the content of your web page and image file names. To contribute
to your page’s relevancy weight for those pages, you can include keywords and
phrases in those file names. For example, if you were optimizing a jewelry site,
instead of using a page name of catalog7.htm, use gold-jewelry.htm. Instead
of naming an image photo43.jpg, use gold-chain.jpg.
Remember that many search engines cannot distinguish the individual words in a file
name unless they are separated by a hyphen. That’s why you see many companies
using hyphenated domain names.
So if you want search engines to pick up the keywords in your file names, instead
of using www.site.com/usedcarparts.htm, use www.site.com/used-car-parts.htm.
Don’t use underscores, because these are reportedly not recognized as a word separator
by many engines.
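The hyphenated naming convention above is easy to automate. This is a minimal Python sketch under stated assumptions: the helper name `keyword_file_name` is hypothetical, and it simply lowercases a keyword phrase and joins the words with hyphens.

```python
import re


def keyword_file_name(phrase: str, extension: str = "htm") -> str:
    """Turn a keyword phrase into a lowercase, hyphen-separated file name."""
    # Replace every run of non-alphanumeric characters (spaces, underscores,
    # punctuation) with a single hyphen, then trim hyphens from both ends
    slug = re.sub(r"[^a-z0-9]+", "-", phrase.lower()).strip("-")
    return f"{slug}.{extension}"


print(keyword_file_name("Used Car Parts"))     # used-car-parts.htm
print(keyword_file_name("gold_chain", "jpg"))  # gold-chain.jpg
```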
Page File Extensions
The file extensions of web pages (e.g. www.site.com/page.html, www.site.com/page.asp)
indicate the type of technology used to build a given page. Various web design
technologies can affect a site’s search engine compatibility, with flat HTML-based
sites considered the most search engine friendly and certain Content Management
Systems (CMS) causing indexing problems.
No matter what technology was used to build your site, make sure your site pages are indexable and supported by the major search engines.
Navigation Menus
Common navigation elements you’ll find on a web site include:

The disadvantages of these graphical navigation elements are:
• Not all search engines index ALT IMG text
• Not all search engines can follow graphic links
• These take longer to download than text links
The solution is to use keyword-filled text links instead of, or in addition to, graphical navigation menus. We’ll go into this in more detail in Lesson 3.
Multiple Entry Points
Here is the layout of a typical site:

With this type of architecture, visitors can only reach some content from the top level. Relevant product content is buried a few levels deep, requiring many additional “clicks” to reach. The other disadvantage is that search engines may not index deep content.
Here is the layout of a site with multiple entry points:

In this case, visitors can reach most content from the top level navigation menu. Search engines can also reach and index more content easily, creating multiple entry points for site visitors. Relevant product content is quickly accessible, requiring fewer “clicks” to reach.
When optimizing your site, think of the reverse pyramid and assume EVERY page
on your site is a point of entry:

How to create a site with multiple entry points:
.. Make each page stand-alone in terms of navigation & layout.
.. Optimize each page for different keywords and phrases so you can target a
wider range of visitors. Use unique Title and META Tags for each page, based
on these keywords and phrases.
.. Create a consistent look and feel for all pages.
.. Enable search engines to index all relevant content.
.. Enable visitors to find what they are looking for within a few clicks.
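To see how many clicks a page sits from the home page, you can model the site as a set of links and count the shortest path to each page. The sketch below is illustrative only: the two site layouts and the page names are hypothetical, and `click_depths` is an assumed helper name that performs a standard breadth-first search.

```python
from collections import deque


def click_depths(links: dict, home: str) -> dict:
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: the product page is only reachable through a chain of pages
deep_site = {
    "home": ["about", "catalog"],
    "catalog": ["category"],
    "category": ["product"],
}
# Same pages, but the top-level menu links to most content directly
flat_site = {
    "home": ["about", "catalog", "category", "product"],
    "catalog": ["category"],
    "category": ["product"],
}

print(click_depths(deep_site, "home")["product"])  # 3
print(click_depths(flat_site, "home")["product"])  # 1
```

Pages that are missing from the result cannot be reached by following links from the home page at all, which is worth checking before worrying about depth.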
Heading Tags
H1, H2 and H3 heading tags are HTML elements that mark up the headings on a page. They can
contain other tags, such as <font>
(e.g. <h1 align="center" style="margin-top: 1px"><font face="Arial" size="3">Search Engines</font></h1>).
Search engines index these and often place more relevancy “weight” on the content of the heading tags as they are usually used to denote headings of importance.
Take advantage of this by using H1, H2 and H3 tags instead of graphical headings.
Use heading tags to break up your page copy and make it easier to read for your
visitors. Include your most important search keywords and phrases within the
heading text. More information on the Heading Tag can be found here.
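To review which heading text a page actually exposes, you can extract the H1, H2 and H3 content with Python's standard `html.parser` module. This is a minimal sketch, assuming well-formed headings that are not nested inside each other; the class and function names and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text inside h1, h2 and h3 elements."""

    def __init__(self):
        super().__init__()
        self.headings = []    # list of (tag, text) pairs
        self._current = None  # heading tag currently open, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._current = tag
            self._parts = []

    def handle_data(self, data):
        if self._current:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headings.append((tag, "".join(self._parts).strip()))
            self._current = None


def find_headings(html: str):
    collector = HeadingCollector()
    collector.feed(html)
    return collector.headings


page = ('<h1 align="center"><font face="Arial" size="3">Search Engines</font></h1>'
        "<p>Body copy</p><h2>Gold Jewelry</h2>")
print(find_headings(page))
```

Comparing the extracted text against your target keywords and phrases shows quickly whether your headings are doing any work.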
Robots META Tag
The Robots META tag is a tag within the HTML code of a page that tells search
engine robots whether they should index that page and follow the links on it.
It enables webmasters to specify any pages they want kept out of the
search engine indices (e.g., order forms and guest books).
In the HTML code of a web site, a sample Robots META Tag looks like this:
<meta name="robots" content="index, follow">
You can instruct a search engine not to index a page by changing the content
of the tag to “noindex, follow” or “noindex, nofollow” if you don’t want it
to follow links on the page either. More information on the Robots META Tag
can be found here.
The Robots META tag goes into the HTML code, between the <head> tags.
Not all search engines support this tag, but it is good practice to include
one, unless you are going to use the Robots Exclusion Protocol, in which case
it’s not necessary.
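To confirm which instructions a page is giving to robots, you can read the Robots META tag back out of its HTML. The sketch below uses Python's standard `html.parser` module; the names `RobotsMetaParser` and `robots_directives` and the sample order form page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Find the content of <meta name="robots" ...> in a page."""

    def __init__(self):
        super().__init__()
        self.directives = None  # stays None when the page has no Robots META tag

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives = [d.strip().lower()
                               for d in content.split(",") if d.strip()]


def robots_directives(html: str):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


order_form = ('<html><head><title>Order Form</title>'
              '<meta name="robots" content="noindex, follow"></head>'
              "<body>...</body></html>")
print(robots_directives(order_form))  # ['noindex', 'follow']
```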
Robots Exclusion Protocol (Robots.txt File)
The Robots Exclusion Protocol, commonly referred to as the Robots.txt file,
is another method to allow web site administrators to instruct visiting robots
which parts of their site should not be visited and indexed.
When a search robot visits a web site, it first checks for the existence of http://www.site.com/robots.txt. If it can find this document, it will analyze and obey the content of that file.
Robots.txt files contain the following information:
User-agent: *
Disallow: /
These lines are used to instruct a particular robot or user agent to avoid certain
directories or pages within a site. The asterisk in the “User-agent” line indicates
all robots, but if you only wanted to prevent a certain robot from indexing your site,
you could put the name of that robot in this line instead. A “Disallow” value of a
single slash, as shown above, tells the robot to avoid the entire site.
The “Disallow” line is where you would list files and folders that you didn’t
want indexed. For example, most site administrators would not want the content
of the site’s cgi-bin indexed. To instruct robots to avoid indexing this directory,
you would use the following in your robots.txt file:
User-agent: *
Disallow: /cgi-bin/
More information on the Robots Exclusion Protocol can be found here.
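You can test a set of robots.txt rules before publishing them. The following sketch uses the `urllib.robotparser` module from Python's standard library; the robot name `ExampleBot` and the page URLs are made up for illustration, and the rules are written with no space inside the `/cgi-bin/` path.

```python
from urllib.robotparser import RobotFileParser

# Rules that keep all robots out of the cgi-bin directory
rules = """\
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "ExampleBot" is a hypothetical robot name; it falls under the "*" rule
print(parser.can_fetch("ExampleBot", "http://www.site.com/cgi-bin/form.pl"))   # False
print(parser.can_fetch("ExampleBot", "http://www.site.com/gold-jewelry.htm"))  # True
```

This only tells you how Python's parser reads the file. Individual search engine robots may interpret edge cases differently.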
Error Trapping
How often have you seen the type of error message below?

This type of error appears when you type in a non-existent web page address in a browser, or you click on a link from a search engine or other site that leads to a page that no longer exists.
This results in:
.. lost traffic
.. brand dilution
.. tarnished reputation
Unfortunately, this type of Page Not Found error is a common occurrence when
moving or editing web site content. For example, if a web page file location
changes from www.site.com/product.htm to www.site.com/folder/product.htm or
a news article page becomes archived after a certain date and no longer available
to the public, then an error page will appear when searchers look for that original
page.
This results in the loss of “legacy listings” or previous ranking positions
for those pages in the search engines. But this doesn’t have to mean lost traffic!
The solution is to create a custom 404 Error page so you NEVER lose traffic.
Click here to see an example of a custom 404 error page. Depending on your site’s server setup,
you place your custom error page within your public HTML folder or your server’s
navigation folder, so that it appears whenever anyone
requests a page that no longer exists.
On your custom error page, you should direct visitors back to your home page or your site map so they can easily find what they were originally seeking.
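How you configure a custom error page depends on your web server, so the following is only a sketch of the idea using Python's built-in `http.server` module. The `PAGES` dictionary, the page paths and the wording of the error page are all hypothetical. Note that the response still carries the 404 status code, so robots can tell the page is missing, while visitors get links back to the home page and site map.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content, keyed by request path
PAGES = {
    "/": "<h1>Home</h1>",
    "/site-map.htm": "<h1>Site Map</h1>",
}

NOT_FOUND_PAGE = (
    "<h1>Sorry, we could not find that page</h1>"
    '<p>Try our <a href="/">home page</a> or '
    '<a href="/site-map.htm">site map</a>.</p>'
)


def build_response(path: str):
    """Return (status code, HTML body) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, NOT_FOUND_PAGE


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000):
    """Start the demonstration server (blocks until interrupted)."""
    HTTPServer(("localhost", port), SiteHandler).serve_forever()
```

Calling `serve()` and requesting a non-existent address such as `/product.htm` in a browser would show the custom page instead of the default error message.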
Site Maps
A site map is a page containing links to every other page on your site,
with a short description of what each page contains. Site maps are a useful tool
that enables your visitors to quickly find the content they are looking for.
Site maps can be a great way to create lots of keyword-filled text links which
contribute towards a site’s search relevancy AND link popularity. They are also
a good way to ensure each and every page of your site is within easy indexing
reach by search engine robots.
Search engine spiders LOVE site maps because of all the links they provide to
indexable areas of your site. Assume EVERY page on your site is a point of entry
and include a link to your site map on each page. Notice all the keywords used
within the link text? These contribute to your site’s search relevancy.
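A site map of this kind can be generated from a simple list of pages rather than maintained by hand. The sketch below is one possible approach, not a required format: the function name `build_site_map` and the jewelry page entries are hypothetical.

```python
from html import escape


def build_site_map(pages):
    """Build an HTML list with one keyword-filled text link per page."""
    items = []
    for url, link_text, description in pages:
        items.append(
            f'<li><a href="{escape(url, quote=True)}">{escape(link_text)}</a>'
            f" - {escape(description)}</li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Hypothetical pages: (address, link text, short description)
pages = [
    ("/gold-jewelry.htm", "Gold Jewelry", "Our full range of gold jewelry"),
    ("/gold-chains.htm", "Gold Chains", "Chains in a variety of lengths"),
]
print(build_site_map(pages))
```

Because the link text comes from your page list, it is easy to review in one place whether each link uses the keywords you are targeting.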
Cascading Style Sheets (CSS)
Cascading Style Sheets (CSS) are a simple mechanism for adding style (e.g. fonts,
colors, spacing) to Web documents.
If you are optimizing your web site for search engine compatibility, it is highly
recommended that you use Cascading Style Sheets. They offer the advantage of
storing style-related code off the page, meaning that the code
relating to how a page should look is stored in a separate file. This reduces
the amount of code on each page and therefore the amount of code needing to be
indexed by search engines.
Because there is less code for the robots to wade through, your more important
page content is indexed faster by the engines, displayed more quickly to visitors
and more pages are able to be stored and displayed in search results. More information
on Cascading Style Sheets can be found here.
Server Side Includes (SSI)
A Server Side Include (SSI) is a directive, written as an HTML comment, that instructs a web server to
dynamically generate elements of a web page prior to displaying the completed
page in a browser or to a search engine robot.
Server Side Includes are a handy way to include up-to-the-minute information
in your web page, for example, the current date and time. They also allow you
to display consistent areas of your pages (e.g. the navigation menu) without reproducing
the code for these areas on every page.
To ensure web pages built with SSI are search engine compatible, you need to
ensure that the text on the page is optimized for target keywords and that the
navigation scheme is indexable. Because SSI tends to slow the download speed
of pages, you should also ensure the page doesn’t take too long to create, which
may cause search engine spiders to “time out” while indexing them and leave
your site. More information on Server Side Includes can be found here.
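Real Server Side Includes are processed by the web server itself, but the idea can be simulated in a few lines. The sketch below imitates what happens to an Apache-style `<!--#include virtual="..." -->` directive before the page is sent out. The function name `expand_includes`, the fragment path `/includes/nav.htm` and the sample page are hypothetical, and only this one directive is handled.

```python
import re

# Matches an Apache-style include directive and captures the fragment path
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(page: str, fragments: dict) -> str:
    """Replace each include directive with the matching fragment."""
    # Unknown fragments are replaced with an empty string in this sketch
    return INCLUDE.sub(lambda m: fragments.get(m.group(1), ""), page)


fragments = {
    "/includes/nav.htm": '<a href="/">Home</a> | <a href="/site-map.htm">Site Map</a>',
}
page = ('<body>\n<!--#include virtual="/includes/nav.htm" -->\n'
        "<h1>Gold Jewelry</h1>\n</body>")
print(expand_includes(page, fragments))
```

The output is the complete page, with the navigation links in place of the comment. This is what a browser or search engine robot receives, which is why the included navigation still needs to be made of indexable text links.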
.. Text Content
.. Image Maps / Graphics
.. ALT IMG Tags
.. Intro / Splash Pages
.. Dynamic Content
.. Tables
.. Frames
.. Flash
The remaining site architectural elements listed above will be discussed in
detail in later lessons.
Using search engine compatible site architecture has a number of benefits:
.. Higher rankings on the search engines
.. Increased visitor traffic to your site
.. Control over what search terms your site is found under
.. More qualified targets visiting your site, leading to an increase in sales
.. Higher quality, more relevant results in search engines
.. An edge over your competitors
.. Assisting search engine users to find what they’re looking for
.. A higher return on your web site investment
[Congratulations! You’ve reached the end of Lesson 2 – Please take your Review
Quiz when ready]