Web pages are written in one or another dialect of a language called the HyperText Markup Language, or just HTML. It is just that: a language. It has a grammar and a syntax.
As with any language, if you make mistakes in your grammar and syntax, sometimes--often, to be honest--they will not materially interfere with communication. Modern "clients"--browsers--are expressly designed to tolerate many of the common faults that sloppy web-page makers commit in their haste or ignorance or laziness.
Search-engine robots are also--one must presume, though no one who is not one of their masters knows the details--fairly tolerant of bad HTML. But, as a webmaster interested in SEO, it behooves you to avoid all chances, however small, that something could interfere with searchbots properly reading your pages, so you want to be sure that each page is in 100% correct HTML. (And, as XHTML--which is inherently very much less tolerant of faults--becomes more and more nearly necessary, it is all the more important to have "clean" pages to convert from.)
(Of course, sound HTML also maximizes the chances that your visitors will see your page correctly in their browsers, visitors being something the SEO-fixated webmaster sometimes forgets are the whole point of doing SEO; moreover, there are web apps that must have clean HTML to work with, such as readers for the blind.)
You do not need to be an infallible HTML genius to get your pages 100% clean. All you need do is submit each to a so-called "third-party validator", which is a piece of software run by a trustworthy third party, which will scan your pages and either validate them or report exactly what defects it has found. The W3C--essentially the "ruling body" of the web--makes available, free, an excellent such validator.
But manually submitting pages for validation, especially from a large site, or several sites, would be a massively tedious process. Fortunately, that process can be automated, and this package is an implementation of such an automated submission, using the W3C validator.
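As a rough illustration of what automated submission involves (this is not the package's own code, which is PHP), here is a minimal Python sketch. It assumes the legacy W3C Markup Validator endpoint at validator.w3.org/check and its documented X-W3C-Validator-Status response header; the function names classify and check_page, and the User-Agent string, are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the legacy W3C Markup Validator endpoint, which reports its
# verdict in the X-W3C-Validator-Status header (Valid / Invalid / Abort).
VALIDATOR = "https://validator.w3.org/check"


def classify(status_header):
    """Map the validator's status header to 'valid', 'invalid', or 'retry'."""
    value = (status_header or "").strip().lower()
    if value == "valid":
        return "valid"
    if value == "invalid":
        return "invalid"
    # A missing header or "Abort" means the validator was busy or could not
    # fetch the page, so the page should simply be tried again later.
    return "retry"


def check_page(page_url, timeout=30):
    """Ask the validator about one public page URL (makes a network request)."""
    request = Request(
        VALIDATOR + "?" + urlencode({"uri": page_url}),
        method="HEAD",
        headers={"User-Agent": "validate-sketch/0.1"},  # hypothetical name
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            return classify(response.headers.get("X-W3C-Validator-Status"))
    except OSError:
        # Covers URLError, HTTPError, and socket timeouts.
        return "retry"
```

Treating anything other than a clear "Valid" or "Invalid" as "retry" keeps a busy or unreachable validator from being reported as a page failure.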
This package is completely free.
One installation of Validate can and will handle as many distinct web sites as you may have on the same server it resides on, and will issue a separate report page for each site. Validate also allows you to specify particular directories for each site that you do not want validated: that is to save time in the validation process by not trying to validate directories full of, for example, test or backup pages not on display to the public.
Validate takes a few minutes for a one-time install and can thereafter be run as a scheduled "cron" job as often as you like (I recommend daily, overnight, so that you can make changes in your pages at any time and know that, at worst, any errors will be reported no later than the next morning).
As presently configured (this may change in later releases), Validate will submit all pages in directories it is allowed into that have an extension of .htm, .html, or .shtml (it does not submit PHP scripts, though those, too, are "web pages", and I may add that as an option if there is a call for it).
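The selection rule just described (only certain extensions, and never inside excluded directories) can be sketched in a few lines of Python. This is an illustration only, not Validate's PHP code; the function name collect_pages, the excluded_dirs parameter, and the case-insensitive extension match are assumptions.

```python
import os

# Extensions named in the text above; matching case-insensitively is an assumption.
PAGE_EXTENSIONS = (".htm", ".html", ".shtml")


def collect_pages(site_root, excluded_dirs=()):
    """Return sorted site-relative paths of the pages under site_root.

    excluded_dirs holds directory paths relative to site_root, e.g. "backup".
    """
    excluded = {os.path.normpath(d) for d in excluded_dirs}
    pages = []
    for current, dirnames, filenames in os.walk(site_root):
        rel_dir = os.path.relpath(current, site_root)
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(rel_dir, d)) not in excluded
        ]
        for name in filenames:
            if name.lower().endswith(PAGE_EXTENSIONS):
                pages.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(pages)
```

Pruning the directory list during the walk, rather than filtering results afterwards, is what saves the time: an excluded directory full of backups is never read at all.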
For each site checked, Validate creates a web-page report listing all pages that did not validate. Each such listing is actually a click-on link to the W3C validator, so you can directly and immediately investigate how and where a page is failing.
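To make the idea of a report of click-on links concrete, here is a small Python sketch that turns a list of failing page URLs into an HTML list, with each entry pointing at the validator. It is not the report format Validate actually emits; the legacy validator URL form (check?uri=...) and the names validator_link and report_html are assumptions.

```python
from html import escape
from urllib.parse import quote

# Assumption: the legacy W3C Markup Validator accepts the page address
# as a percent-encoded "uri" query parameter.
VALIDATOR_CHECK = "https://validator.w3.org/check?uri="


def validator_link(page_url):
    """Build a link that re-runs the validator on page_url when clicked."""
    return VALIDATOR_CHECK + quote(page_url, safe="")


def report_html(site_name, failed_urls):
    """Return an HTML fragment listing every page that failed validation."""
    items = "\n".join(
        '<li><a href="{}">{}</a></li>'.format(
            escape(validator_link(url)), escape(url)
        )
        for url in failed_urls
    )
    return "<h1>Pages that failed validation: {}</h1>\n<ul>\n{}\n</ul>".format(
        escape(site_name), items
    )
```

Because each link re-submits the page when clicked, the report always shows the page's current state, not the state at the time of the overnight run.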
There are only two requirements: your site's server must be capable of running PHP, which virtually all are; and it must either have PHP's "safe mode" Off or provide CGI wrapping for PHP scripts, one or the other of which is the case on almost all servers.
(Here is the "Validate" v. 0.21 tool package as a single ZIP file.)
Change History:
0.10: initial release
0.11: fixed silly mistake in installer that was making a defective cronval.php file.
0.12: trivial fix to validate.php to eliminate a harmless but annoying double-slash in link URLs.
0.13: fix to silly error in cronval.php.
0.14: fix to another silly error in cronval.php.
0.20: improved handling when validator busy or otherwise unavailable.
0.21: trifling change to improve file-read/store speeds.
To link to this page, please copy and paste this exact code:
<strong><a href="http://seo-toys.com/validate-seo-tool/validate-tool.shtml">the "Validate" SEO Tool</a></strong>
This site is one of The Owlcroft Company family of web sites.
All content copyright ©2004 - 2008 by The Owlcroft Company