The "Validate" SEO (Search Engine Optimization) Tool

Providing Search Engines With "Clean" Pages


What is a "Clean" Page (and Why Does it Matter?)

Web pages are written in one or another dialect of a language called the HyperText Markup Language, or just HTML. It is just that: a language. It has a grammar and a syntax.

As with any language, if you make mistakes in your grammar and syntax, sometimes--often, to be honest--they will not materially interfere with communication. Modern "clients"--browsers--are expressly designed to tolerate many of the common faults that sloppy web-page makers commit in their haste, ignorance, or laziness.

Search-engine robots are also--one must presume, though no one who is not one of their masters knows the details--fairly tolerant of bad HTML. But as a webmaster interested in SEO, it behooves you to avoid all chances, however small, that something could interfere with searchbots properly reading your pages, so you want to be sure that each page is in 100% correct HTML. (And, as XHTML--which is inherently very much less tolerant of faults--becomes more and more nearly necessary, it is all the more important to have "clean" pages to convert from.)

(Of course, sound HTML also maximizes the chances that your visitors will see your page correctly in their browsers, visitors being something the SEO-fixated webmaster sometimes forgets are the whole point of doing SEO; moreover, there are web apps that must have clean HTML to work with, such as readers for the blind.)

You do not need to be an infallible HTML genius to get your pages 100% clean. All you need do is submit each to a so-called "third-party validator", which is a piece of software run by a trustworthy third party, which will scan your pages and either validate them or report exactly what defects it has found. The W3C--essentially the "ruling body" of the web--makes available, free, an excellent such validator.

But manually submitting pages for validation, especially from a large site, or several sites, would be a massively tedious process. Fortunately, that process can be automated, and this package is an implementation of such an automated submission, using the W3C validator.
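As a rough illustration of what automating one submission can look like, here is a minimal Python sketch (not the package's own PHP code). It assumes the W3C validator is reachable at `https://validator.w3.org/check` and reports its verdict in an `X-W3C-Validator-Status` response header; the function names `classify` and `check_page` are hypothetical, and anything other than a clear "Valid" or "Invalid" is treated as "try again later".

```python
import urllib.parse
import urllib.request

# Assumed endpoint of the W3C validator; not taken from the package itself.
VALIDATOR_ENDPOINT = "https://validator.w3.org/check"


def classify(status_header):
    """Map the validator's status header to 'valid', 'invalid', or 'retry'."""
    if status_header is None:
        # No verdict at all: the validator may be busy or unavailable.
        return "retry"
    status = status_header.strip().lower()
    if status == "valid":
        return "valid"
    if status == "invalid":
        return "invalid"
    # Any other value (for example "Abort") is treated as "no verdict yet".
    return "retry"


def check_page(page_url, timeout=30):
    """Ask the validator about one public page URL and return the verdict."""
    query = urllib.parse.urlencode({"uri": page_url})
    request = urllib.request.Request(VALIDATOR_ENDPOINT + "?" + query, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Assumed header name; adjust if the service reports differently.
            return classify(response.headers.get("X-W3C-Validator-Status"))
    except OSError:
        # Network errors, HTTP errors, and timeouts all mean "try again later".
        return "retry"
```

Keeping the header interpretation in `classify`, separate from the network call, makes it easy to decide in one place what counts as a failed page and what merely means the validator could not be reached.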

This package is completely free.


Features

One installation of Validate will handle as many distinct web sites as you have on the server where it resides, and will issue a separate report page for each site. Validate also lets you specify, for each site, particular directories that you do not want validated; that saves time in the validation process by not trying to validate directories full of, for example, test or backup pages not on display to the public.

Validate takes a few minutes for a one-time install and can thereafter be run as a scheduled "cron" job as often as you like (I recommend daily, overnight, so that you can make changes in your pages at any time and know that, at worst, any errors will be reported no later than the next morning).

As presently configured (this may change in later releases), Validate will submit all pages, in the directories it is allowed into, that have an extension of .htm, .html, or .shtml (it does not submit PHP scripts, though those, too, are "web pages", and I may add that as an option if there is a call for it).
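The selection rule just described (three extensions, minus any excluded directories) can be sketched in a few lines of Python. This is an illustration only, not the package's PHP code: the function name `find_pages`, the format of `excluded_dirs` (paths relative to the site root), and the case-insensitive extension match are all assumptions of the sketch.

```python
import os

# The three extensions the text says are submitted; everything else is skipped.
PAGE_EXTENSIONS = (".htm", ".html", ".shtml")


def find_pages(site_root, excluded_dirs=()):
    """Return sorted page paths, relative to site_root, that would be submitted.

    excluded_dirs holds directory paths relative to site_root (a hypothetical
    format; the real tool's configuration syntax is not shown here).
    """
    excluded = {os.path.normpath(d) for d in excluded_dirs}
    pages = []
    for current, dirs, files in os.walk(site_root):
        rel_dir = os.path.relpath(current, site_root)
        # Prune excluded directories in place so os.walk never descends into them.
        dirs[:] = [
            d for d in dirs
            if os.path.normpath(os.path.join(rel_dir, d)) not in excluded
        ]
        for name in files:
            # Case-insensitive match is a choice made for this sketch.
            if name.lower().endswith(PAGE_EXTENSIONS):
                pages.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(pages)
```

For example, `find_pages("/path/to/site", excluded_dirs=["backup", "test"])` would skip everything under those two directories, which is where the time saving comes from.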

For each site checked, Validate creates a web-page report listing all pages that did not validate. Each such listing is actually a click-on link to the W3C validator, so you can directly and immediately investigate how and where a page is failing.
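To show how a report entry can double as a validator link, here is a small Python sketch. It assumes the validator accepts a page address in a `uri` query parameter at `https://validator.w3.org/check`; the function names and the list-item markup are hypothetical and are not taken from the package's actual report format.

```python
import html
from urllib.parse import urlencode

# Assumed endpoint of the W3C validator.
VALIDATOR_ENDPOINT = "https://validator.w3.org/check"


def validator_link(site_base_url, page_path):
    """Build a click-through validator URL for one page of a site."""
    # Join with exactly one slash so the page URL never contains a double slash.
    page_url = site_base_url.rstrip("/") + "/" + page_path.lstrip("/")
    return VALIDATOR_ENDPOINT + "?" + urlencode({"uri": page_url})


def report_entry(site_base_url, page_path):
    """Return one HTML list item linking a failed page to the validator."""
    link = html.escape(validator_link(site_base_url, page_path), quote=True)
    return '<li><a href="{0}">{1}</a></li>'.format(link, html.escape(page_path))


print(validator_link("http://example.com/", "/docs/guide.htm"))
# prints: https://validator.w3.org/check?uri=http%3A%2F%2Fexample.com%2Fdocs%2Fguide.htm
```

Percent-encoding the page address keeps any characters in it from being mistaken for part of the validator's own query string.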


System Requirements:

There are only two: your site's server must be capable of running PHP, which virtually all are; and it must either have PHP's "safe mode" Off or provide CGI wrapping for PHP scripts, one or the other of which is the case on almost all servers.


(Here is the "Validate" v. 0.21 tool package as a single ZIP file.)



Change History:

0.10: initial release

0.11: fixed silly mistake in installer that was making a defective cronval.php file.

0.12: trivial fix to validate.php to eliminate a harmless but annoying double-slash in link URLs.

0.13: fix to silly error in cronval.php.

0.14: fix to another silly error in cronval.php.

0.20: improved handling when validator busy or otherwise unavailable.

0.21: trifling change to improve file-read/store speeds.

To link to this page, please copy and paste this exact code:
<strong><a href="http://seo-toys.com/validate-seo-tool/validate-tool.shtml">the "Validate" SEO Tool</a></strong>


All content copyright ©2004 - 2010 by The Owlcroft Company