Webmasters Weaponry
Machinehead Software

Link Checker

Do you see any dead internal links on my site? ;-)

The Machinehead Webmasters Weaponry (WMW) link tester is an offline link tester that lets webmasters automatically test an entire site for broken links. Normally 'links' means links to other web pages (usually href="pagename.html"), but WMW allows you to define your own link types (e.g. href=", src=", background=", code=", @import url(" etc.) and will scan for all of these (anything that points to another file).
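
As a rough illustration of scanning for user-defined link types, here is a minimal Python sketch. It is not WMW's own code: the function name find_links and the rule that a target ends at the next double quote are assumptions made for this example. The link types are the ones listed above.

```python
import re

# Link types copied from the examples in the text; a user could add more.
LINK_TYPES = ['href="', 'src="', 'background="', 'code="', '@import url("']

def find_links(page_source, link_types=LINK_TYPES):
    """Return (link_type, target) pairs for every link found in the page."""
    found = []
    for link_type in link_types:
        # Assumption: the target runs from the end of the prefix
        # up to the next double quote.
        pattern = re.escape(link_type) + r'([^"]*)"'
        for match in re.finditer(pattern, page_source, flags=re.IGNORECASE):
            found.append((link_type, match.group(1)))
    return found
```

Results come back grouped by link type, in the order the types are listed, rather than in page order.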

When you click the 'test links' button you are offered four options: Current file, Current directory, Entire Tree, and Spider. All of these options use exactly the same link testing code, but the pages that actually get tested vary. The current file option is a useful quick check on the currently selected page, providing only three lists: Dead links, Alien links and Good links. It would be folly to provide a list of orphaned files for this or the current directory option; typically this list would be long and meaningless.

The second pair of options are for link testing the entire site, providing a full analysis. The difference between 'entire tree' and 'spider' is easy to explain: Entire tree tests every page in every directory of your web site, whereas the spider only tests pages that can be found by following links, i.e. just the visible part of your site. The spider takes the page that is currently selected in WMW's file browser to be the starting point. Typically you get a bigger list of orphaned files with the spider.
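
The difference between the two full-site options can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the site is modelled as a dictionary mapping each page to the pages it links to, and the function names are invented for the example.

```python
from collections import deque

def pages_for_entire_tree(site):
    """Entire tree: every page on disk is tested, linked to or not."""
    return sorted(site)

def pages_for_spider(site, start_page):
    """Spider: only pages reachable by following links from start_page."""
    seen = {start_page}
    queue = deque([start_page])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            # Follow a link only if it points at a page we have not visited.
            if target in site and target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(seen)

# A made-up four-page site; old/draft.html is not linked from anywhere.
site = {
    "index.html": ["about.html", "products.html"],
    "about.html": ["index.html"],
    "products.html": [],
    "old/draft.html": ["index.html"],
}
```

Here the entire tree option would test all four pages, while the spider started at index.html would never reach old/draft.html, which is why the spider tends to report more orphans.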

When you run the full link analysis (Entire tree or Spider) WMW gives you five lists: Dead Links, Alien Links, Good Links, Orphaned Files and Unique Links.

Dead Links

Dead or broken links are usually caused by spelling mistakes in the page, i.e. the filename or file path is incorrect, or the file simply doesn't exist.

WMW also checks for some standard 'windows weeny' errors, such as accidental use of backslash '\' characters in a file path (instead of forward slash characters '/') and case sensitivity errors. These additional checks are particularly useful if you are developing your site under Windows but the site is hosted on a Linux or UNIX web server. If your site contains these types of error it will appear to work just fine when you test your local copy with a web browser, but the links won't work correctly once uploaded to a non-Windows web server.
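
A minimal Python sketch of these two checks follows. It assumes the checker already holds the set of site-relative file paths with their exact case; the function name and the wording of the error labels are illustrative and not taken from WMW.

```python
def check_local_link(link_text, existing_files):
    """Return an error label for a local link, or "" if the link is good.

    existing_files is a set of site-relative paths written with '/'
    separators and the exact case used on disk.
    """
    if "\\" in link_text:
        return "Windows path separator"
    if link_text in existing_files:
        return ""
    # A match that only succeeds when case is ignored works on Windows
    # but fails on a case-sensitive Linux or UNIX server.
    lowered = {name.lower() for name in existing_files}
    if link_text.lower() in lowered:
        return "Case sensitivity"
    return "File not found"
```

For example, with a file stored as images/Logo.gif, a link to images/logo.gif is reported as a case sensitivity error rather than as a missing file.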

WMW also provides a user-editable list of invalid characters and will flag any URL containing one of the listed characters as a dead link of type 'invalid character'.
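
The invalid character check could look something like the Python sketch below. The particular characters in INVALID_CHARACTERS are an invented example of what a user might list, not WMW's defaults.

```python
# Hypothetical user-edited list: space, angle brackets, pipe and asterisk.
INVALID_CHARACTERS = " <>|*"

def invalid_characters_in(url, invalid=INVALID_CHARACTERS):
    """Return the listed characters that occur in the URL (empty if none)."""
    return [ch for ch in invalid if ch in url]
```

A non-empty result would mark the link as dead with the error type 'invalid character'.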

Good, Dead and Alien Links General

The first three lists are all very similar in appearance. Each list has four columns:
Source Page: This is the page in which the link was found. Double click to open this page in your default browser.
Link Text: This is the text of the file's name and path as it appears in the source page. Right click and select 'copy link text to clipboard' to copy this text to the Windows clipboard so it can be searched for easily within the source of your web page. Select 'open link' to open the linked file using the default application registered for the file type (good and alien links only, not dead links!). WMW doesn't test alien links for you, only local links (relative AND absolute references to the home domain), but if you have several links pages linking to other people's sites it's handy to have them all in one list where each can be tested in turn.
Type: This is the type of link, e.g. href=", src=", background=" etc.
Error: A short error message identifying the type of bad link, e.g. File not found, cAse SensitivITY, Windows path separator \ etc. This column is always blank in the lists of good and alien links.

In all of the link checker's lists (except 'aliens') the source file paths can be toggled between DOS and WWW formats, e.g. c:\directory\file.html or http://domain/directory/file.html. Double clicking the list item will open either the local (offline) version or the WWW (online) version as appropriate, using your default web browser.
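
Converting between the two formats is a matter of swapping the local root directory for the home URL and changing the path separators. The Python sketch below shows one way to do it; the function names, the local root c:\site and the URL http://domain/ are placeholders chosen for the example.

```python
from pathlib import PureWindowsPath

def dos_to_www(dos_path, local_root, home_url):
    """Map a local Windows path under local_root to its URL under home_url."""
    relative = PureWindowsPath(dos_path).relative_to(PureWindowsPath(local_root))
    return home_url.rstrip("/") + "/" + relative.as_posix()

def www_to_dos(url, local_root, home_url):
    """Map a URL under home_url back to the local Windows path."""
    relative = url[len(home_url.rstrip("/")) + 1:]
    return str(PureWindowsPath(local_root, *relative.split("/")))
```

PureWindowsPath is used so that the conversion behaves the same whichever operating system the script runs on.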

Alien Links

Alien links are usually links to other domains, but this list might also include difficult-to-spot misspellings of the home domain. For example, http://www.machinehead-software.co.uk/ resolves correctly as a link to my home page, but http://www.machinhead-software.co.uk/ does not and would appear in my list of alien links. These kinds of hard-wired links (absolute references) are difficult to spot when testing offline with a web browser, because the browser will attempt to go online and retrieve the version on the server instead.
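
One simple way to separate alien links from local ones is to compare the host name in the link with the home domain, as in this Python sketch. The function name is_alien is made up for the example, and the exact-match rule is an assumption; WMW may treat things like a missing 'www.' prefix differently.

```python
from urllib.parse import urlparse

def is_alien(link_text, home_domain):
    """True if the link names a host other than the home domain."""
    host = urlparse(link_text).netloc.lower()
    # Relative links have no host at all, so they are always local.
    return host != "" and host != home_domain.lower()
```

Because the comparison is exact, the misspelled domain in the example above is classed as alien even though it differs from the home domain by a single letter.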

Good Links

The list of good links is optional and only worth displaying for statistical purposes (e.g. if you want to know how many have been found). In a typical web site this list can be rather long and might cause a noticeable delay when you close the link checker window. However, in order to compile the other lists WMW must test all links, including the good ones, and the list of good links is essential to enable WMW to create a list of orphaned files.

Orphaned Files

Orphans are files that aren't linked to by any of your web pages. Most web hosts place a limit on the amount of space that you can use on the server's hard disk, so this is a good way to identify files that can be safely deleted in order to stay within your server space limits.

The list should be used with care, though: if you haven't defined every link type that your pages use, or haven't run the full analysis, files that are in fact linked to will be reported as orphans and the list could be longer than expected!
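
In outline, the orphans list is the set of all files minus the set of files that some good link points to. The Python sketch below is an assumed reconstruction of that idea, with an invented function name and file names, and it also shows how an undefined link type inflates the list.

```python
def orphaned_files(all_files, good_links):
    """Return the files that no good link points to.

    good_links is a list of (source_page, target_file) pairs.
    """
    linked = {target for _source, target in good_links}
    return sorted(set(all_files) - linked)

all_files = ["index.html", "about.html", "logo.gif", "old.html"]

# Links found when both href=" and src=" are defined as link types.
links_all_types = [
    ("index.html", "about.html"),
    ("about.html", "index.html"),
    ("index.html", "logo.gif"),
]

# Links found when only href=" is defined: the image link is missed.
links_href_only = links_all_types[:2]
```

With all link types defined only old.html is an orphan, but with href=" alone logo.gif is wrongly listed too, so deleting everything on the list would break the home page.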

The orphans list is slightly different from the others. Its four columns are File, Size, Type and Date, where:
File is the full Windows/DOS path and file name.
Size is the file's size in KB.
Type is the type of file, e.g. HTML Document.
Date is the date and time when the file was last modified.

Double click to open the appropriate file using the default application associated with its file type. Right click and select Delete or Remove to permanently delete the selected files from your local copy of your web site, or simply remove them from the list, as appropriate. You can also rename a file from here, or copy a file, or just its text, to the Windows clipboard.

You can automatically create two flavours of server-side deletion script. If your site is hosted on a Windows server, click 'Windows' and WMW writes a DOS batch file to the Windows clipboard that you can run from the home directory of your site (locally or on the server) to delete all of the listed and selected files. Click the Linux button to create a Linux/UNIX shell script version.
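
To give an idea of what such scripts might contain, here is a Python sketch that builds both flavours from a list of site-relative paths. The exact commands WMW writes are not documented here, so the use of del and rm, and the function names, are assumptions.

```python
import shlex

def windows_delete_script(paths):
    """Build a DOS batch file that deletes each site-relative path."""
    lines = ["@echo off"]
    for path in paths:
        # Batch files expect backslashes; quotes protect spaces in names.
        lines.append('del "' + path.replace("/", "\\") + '"')
    return "\r\n".join(lines) + "\r\n"

def unix_delete_script(paths):
    """Build a Linux/UNIX shell script that deletes each site-relative path."""
    lines = ["#!/bin/sh"]
    for path in paths:
        # shlex.quote makes names with spaces or shell characters safe.
        lines.append("rm -- " + shlex.quote(path))
    return "\n".join(lines) + "\n"
```

Both scripts use relative paths, so they are meant to be run from the home directory of the site, as described above.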

Unique Files/Links

The last tab shows a list of unique verified files along with the link count, i.e. the number of links pointing to each file. By default the list contains files of all types, including .jpg and .gif files, and also includes internal links within a page (e.g. http://domain/directory/file.html#internal_link).

To filter the list so that it only contains web pages click 'Web Pages Only'. To filter out the '#links' click 'Unique files only'. The total and link frequency will be automatically adjusted when these items are removed.
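
The counting and filtering can be sketched in Python as follows. Two things here are assumptions rather than documented WMW behaviour: that web pages are recognised by an .html or .htm extension, and that with 'Unique files only' a link to file.html#section is counted towards file.html rather than dropped.

```python
from collections import Counter

WEB_PAGE_EXTENSIONS = (".html", ".htm")  # assumed definition of a web page

def unique_links(targets, web_pages_only=False, unique_files_only=False):
    """Count how many links point at each unique target."""
    counts = Counter()
    for target in targets:
        file_part = target.split("#", 1)[0]
        if unique_files_only:
            # Assumption: fold '#links' into the file they belong to.
            target = file_part
        if web_pages_only and not file_part.lower().endswith(WEB_PAGE_EXTENSIONS):
            continue
        counts[target] += 1
    return counts

targets = ["index.html", "index.html#top", "logo.gif", "about.html", "index.html"]
```

With no filters, index.html and index.html#top are counted separately; with both filters on, the image disappears and index.html collects all three links.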

When the #links have been removed click the 'create site map' button to auto generate a site map web page similar to this one. The listings in this version of the page are grouped by directory, then within each directory grouping the pages are ranked in order of link popularity. The site map creator offers a range of styles and customisations. Click here to learn more about the site map creator.
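
The grouping and ranking described above can be illustrated with a short Python sketch. It takes a mapping of site-relative paths to link counts and is only a guess at the ordering logic; the function name and the alphabetical tie-break are assumptions.

```python
import posixpath
from collections import defaultdict

def group_for_site_map(link_counts):
    """Group pages by directory, most linked-to pages first in each group."""
    groups = defaultdict(list)
    for path, count in link_counts.items():
        groups[posixpath.dirname(path)].append((path, count))
    return {
        # Rank by descending link count, then alphabetically for ties.
        directory: sorted(pages, key=lambda item: (-item[1], item[0]))
        for directory, pages in sorted(groups.items())
    }
```

Pages in the site's home directory end up under the empty directory name, which sorts ahead of every subdirectory.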

Link Testing Service

If you can't wait for me to get WMW finished and released, you could try using my link testing service. Click here for more details.


Page Design by Nigel Jones and the Machinehead Programming Team Machinehead Software