Introduction
------------
I imagine most people who have downloaded LinkExaminer have a pretty good
idea of what it is and what it does, but on the off chance that you don't: simply
put, it scans a website and returns information about the pages on the site to
help whoever runs it. How does it help? First, it shows you some of the more
obvious things, such as links that are broken, both internally on your site and
externally. LinkExaminer also gives insight into SEO properties, such as the
presence of page titles, keywords, etc., and provides the ability to generate
completely custom reports based on the scans performed.
The Basics of Use
-----------------
Using LinkExaminer is pretty straightforward - launch the application, choose
'Set URL' from the Scan menu, enter your main URL, click 'Start' and you're off
and running! The amount of time it takes to scan a site can vary widely based
on the options that are selected, so keep that in mind. It's also possible to
abort a scan, tweak the configuration, and then resume from where you stopped -
just be aware that if you do this, any changes made won't affect pages that were
already scanned.
When entering your URL into LinkExaminer, it's usually a good idea to be
as explicit as possible - the program will try to make its best guess for
anything missing, but if you tell it, then it doesn't need to guess.
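For example (using a made-up address), rather than entering something like:

testurl.com

it's better to spell the whole thing out:

http://www.testurl.com/

so that the protocol (http vs. https) and the exact host name don't have to be
guessed at.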
What does it scan?
------------------
The current version can scan any type of HTTP and HTTPS content, so basically
anything that a webserver can transmit. The parsing engine is able to fully
understand HTML and should do an intelligent job with anything you throw at
it. CSS is not quite as fully understood by the parser but it shouldn't have
any issues extracting links - that being said, it won't be able to easily tell
whether or not particular images are used on a page (so it will check them
all). JavaScript is barely supported - enough to extract any simple links
that might be present, but not sophisticated enough to be able to interpret
the code in any meaningful way. The code for handling SSL is provided by the
excellent OpenSSL library; all other code was developed internally.
The Main Display
----------------
On the main display (listview) you see the results of the scan as it progresses.
The background of each line also indicates its current processing state - red
indicates an error, green means everything is fine, and the shades of grey
(from dark to light) show how far along a link is: not yet processed, retrieved,
or parsed and inserted. A right-click context menu is also available to give
additional information about selected rows; more detail about what each function
does is covered later in this documentation.
It's worth noting that while the report and sitemaps are generally not affected
by anything removed or sorted on the main listview, the CSV and text file exports
save data as it appears in the main display. This is also true of the view
filters if they are currently in use.
Columns Explained
-----------------
Hopefully the columns themselves are relatively self-explanatory, but for the
sake of clarity, here's what each of them means. :)
URL
The full URL that this row is referring to
HTTP Code
This is the HTTP code returned by the server. Codes greater than 900,
shown with parentheses () around them, are internal error codes
generated by the harvester (not the server).
HTTP Message
This is simply the human-readable version of the HTTP code.
Internal
This indicates whether the link checker considers the link internal or
external to the site being scanned. What constitutes an internal link
can be tweaked in the configuration.
Robots.txt
This indicates whether or not a URL is matched by a rule in the site's
robots.txt file. Depending on the configuration settings, these links
may or may not be examined.
NoFollow
This indicates whether or not a link contains the NoFollow attribute -
this tells the search engines not to index or assign any weight to the
link.
Dynamic
This indicates whether or not a link (or page) looks to be dynamically
generated. It's also possible for a scripted page that returns static
content to be considered dynamic.
Relative
This indicates whether or not a link was relative rather than explicit -
an explicit link includes the full domain name, while a relative link
contains just a path or a location relative to where the user is on the
site.
SEO
This shows whether or not some basic SEO components are missing from
a page. Currently it checks to make sure that a page title, meta
keywords and meta description are all present - if they are not, then
it displays which are missing.
Title
This is either the text of the first link going into the page, or once
a page is harvested it becomes the title text from the page (if
available).
Depth
This is the minimum number of links that need to be clicked on to get
to this particular page.
In
This is the number of inbound links pointing to this page.
Out
This is the number of outbound links on the page. Depending on config
settings this may include duplicates.
Content Type
This displays the Content-Type header returned by the webserver.
Size
The size (in K) of the link's contents.
Last Modified
The last date the file was marked as modified by the server.
Link Type
This shows what types of links point into this page. For example, if
the page is an image file, the link types might be CSS and IMG.
Duration
This is the time (in fractional seconds) it took to get the page.
Similarity
This shows how similar this page is to other pages, if similarity is
turned on.
As with most listviews, you can rearrange the column headers as well as sort by
a particular column by clicking on it. These changes will be saved if you have
'Save column info' turned on in the configuration.
The Status Bar
--------------
Across the bottom of the main dialog is the status bar, which gives you
some up-to-the-second statistics on what's going on. The first section tells
you what the application is currently doing, and the second shows the URL
that's currently set as active. The third shows how many links per second
(lps) are being scanned on average, followed by a listing of exactly how many
pages are new, how many remain to be harvested, and the total count. The
final section shows how much memory the application is currently using - this
can be a helpful heads-up when you're scanning a larger site or are caching
the harvested pages.
The Exporters
-------------
The following exporters get their data from what is currently being displayed
in the main dialog. This means that if you have the view set to show only
errors, then they will only export errors. Likewise, if you have some kind of
sorting applied, that will also be reflected in the exported data.
Text File (URLs)
This is pretty simple, it's just the URL, each one on its own line.
CSV file
Suitable for importing into other applications, this contains almost
everything displayed in the main dialog.
The next exporters get their data directly from the internal database maintained
by LinkExaminer, so they are not influenced by most things you'd do in the
dialog. The only notable exception is that if you remove a page (via the right-
click menu), it's removed from these reports as well. All of these exporters
are generated from user-editable templates, so it's possible to make your own;
if you're interested in that, check out the Custom Templates section.
Sitemap (XML)
This is a template-driven sitemap generator; the default template
produces the type of sitemap used by search engines like Google
(a sketch of that format appears after this list).
Report (HTML)
This is a template-driven HTML report that can be highly customized
to include what you want. The default report just contains the basic
items that most other link checkers report.
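To give a sense of what the sitemap exporter is aiming at, here's a rough sketch
of the standard sitemap format that search engines like Google accept (shown
with a made-up URL and date, not actual LinkExaminer output):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.testurl.com/</loc>
    <lastmod>2011-01-01</lastmod>
  </url>
</urlset>

The default template emits something along these lines, with one <url> entry
per page.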
View Source
-----------
This right-click option is only available when Caching is enabled under the
Parser configuration. With this, you can view the raw HTML source that was
downloaded from the site.
View Link Details
-----------------
This is probably the option you'll be using the most - it gives you a view of
how links flow into and out of a page. The top section shows all of the
inbound pages that link in, along with the type of link they used to reference
the page and their depth in the website hierarchy.
The bottom pane contains all of the outbound links from the page, with similar
additional information to the inbound links. You can also right-click on any
link to open it in a browser or copy the URL to the clipboard.
View Parser HTML
----------------
The parser HTML view lets you see how the HTML parsing engine views the page -
if for some reason it doesn't seem to be able to find a link, then looking at
this is a good way to get a sense for what might be going wrong. This is the
actual parser output (as opposed to the page source), so it will automatically
format it as well as break up tags from content.
The first column shows which line in the source code the corresponding HTML
came from - this is helpful for jumping to the exact place in your code if you
spot something wrong. The second column is the tag depth, which is similar to
the link depth except that it tracks tag opens and closes within this parsed
page only. You can use this to spot whether your page may be missing a closing
tag, etc. - if everything is fine, you should start at depth 0 and, when you
scroll to the bottom, end at depth 0; anything else means there are either too
many closes or not enough to match the actual HTML tags. The third column is
how the parser interprets the chunk of code: whether it's a tag, a close, a
single (a tag which either doesn't expect a close or has one specified at the
end of the tag), or text. The final column is the parsed HTML output in a
formatted way, so it looks like what you'd expect to see.
As with View Source, this option is only available if you have Caching turned
on in the parser configuration.
View Parser Content
-------------------
This viewer lets you get a sense of how a search engine sees your page - it
shows only the extracted content, removed from the HTML. Some of this may not
be quite as readable as the HTML would be, because it also includes the alt
text from links, images, etc.
Configuration: General
----------------------
Save column info
With this turned on, any changes made to the columns in the main
display (such as order or size) will be saved.
Auto scan after URL entry
When you enter a URL, the scanner will automatically start.
Clear data before scan
When a scan is started, all the previous data is removed.
Colorize Listview
This toggles the background coloring on the main display.
Display after processing
With this turned on, links will only show up on the main display after
they have been harvested, instead of as they are discovered.
Refresh
This changes the interval at which the main display is updated. If you
have a particularly large site (more than a million pages), then you
might want to increase this, but in most cases 1 second is fine.
Double-click action
With this you can customize what happens on the main listview when you
double-click on a row.
Configuration: Exclude
----------------------
The exclusion rules are a powerful aspect of the scanning engine. If you would
like to exclude certain files or directories from the scan, then you can do so
here. Multiple rules can be specified; each rule should be on its own line. A
test URL section is provided below the rules, which lets you see whether a
particular URL would be excluded as well as which rule (or rules) it matched.
The exclusion engine itself is more like a mini search engine, and it has some
understanding of URL structure built in. For instance, let's say we had the URL:
http://www.testurl.com/gallery/favorites/thumbnails.php
If you added the rule:
favorites
Then it would exclude this URL - but if you just had the word "favorite"
without the "s" at the end, then it would NOT match. This is because it
understands that certain things like slashes (/), underscores (_), etc. are
indications of word boundaries. Now, let's say that you wanted it to match
either case, such as favorite or favorites; there are three ways you could
do this:
Two rules:
favorite
favorites
Character Wildcard:
favorite?
Wildcard:
favorite*
The first way is obvious: just explicitly list the two words that you want to
match. The second example uses the character wildcard, the question mark (?);
this will consider any character, or the absence of a character, as a match.
The character wildcard doesn't need to appear at the end - it can appear at the
beginning or in the middle, and you can use as many as you'd like. The final
method is to use the normal wildcard (*), which says that as long as the first
part matched, it doesn't care about anything after it. So, if it ran into the
word "favoritely" (yes, I know it's not a word), then it would NOT match with
the character wildcard, but it WOULD match with the normal wildcard. The
wildcard (*) can be used either at the start or the end of the search term, and
you could put one on either (or both) ends of the word, but never in the middle.
Now, let's say that you want to exclude everything in the favorites directory
EXCEPT the thumbnails; you could do that with the following:
favorites -thumbnails
You can use the plus (+) or minus (-) sign to indicate if something must exist
or must not exist in the match. So, in this instance it sees that favorites
exists, but then it also sees that thumbnails exists - but thumbnails has the
minus (must not) at the start, so the rule doesn't match. So, you might be
wondering if this means that "favorites" is the same as "+favorites", right?
The answer is: sort of. :) In this case, there's no difference between the
plus (must) version and nothing at all, but let's say you used the following
rule:
favorite favorites -thumbnails
This is effectively saying that it can match EITHER favorite or favorites and
must not match thumbnails. So if there are multiple things that you might
want to match against, you can just add them in the same rule.
There are a couple of other rules, but they're a bit less likely to be used.
The first is the number wildcard (#), which functions just like the character
wildcard except that it only matches numbers (see the example at the end of
this section). You can also use a caret (^) if you need to match only against
the start or the end of the URL. For example, if you wanted to exclude any
GIFs, you could use the rule:
*.gif^
You could also leave off the "*." and it would still match, since it considers
periods to be word breaks. Finally, you can use quotes to encapsulate phrases,
such as:
"happy cat"
Otherwise it would see that as two words: happy OR cat. Ultimately, what you
do with these is really limited only by your imagination.
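As one final example of the number wildcard mentioned above (a sketch based on
the character-wildcard behavior described earlier), the rule:

image#

should match image1 or image2 appearing in a URL, but not imageX, since the #
only stands in for digits.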
Configuration: Harvester
------------------------
Obey robots.txt
A website can include a robots.txt file that limits where automated
systems (such as this one) may go. With this turned on, LinkExaminer
will skip whatever a search engine would skip (see the short example
after this configuration list).
Obey NoFollow
It's possible in HTML to tag a link as nofollow - search engines still
follow the link, they just don't assign any weight to it or track the
fact that you link to it. With this checked, the harvester won't
follow these links at all.
Check external links
With this checked, it will check that external links are valid, but it
will not add any links found on the page to the scan.
Allow parent walk
If you start in a subdirectory, such as
http://localhost/blog/index.htm
then by default the engine will not go any higher or follow any link
that doesn't start with '/blog'. With this turned on, it will walk back
to the root and include everything.
Consider URL case sensitive
In most cases you won't want this turned on, but if you do, then it
will consider index.htm to be different than index.HTM, and will check
both links.
Subdomains internal
With this turned on, subdomains will be considered internal; so, if
you were scanning localhost, it would also consider image.localhost
or files.localhost to be internal.
Accept cookies
With this turned on, cookies are accepted and persist throughout the
scan. Cookies are specific to each thread and not shared.
Max depth
This is the maximum depth that will be walked, if you want it to go
as deep as possible, it should be set to all 9's.
Max redirects
When a redirection points to another redirection, the chain is
considered "redirection depth", and this value limits how many chained
redirections will be followed. In general, you don't want this to be
too large, to avoid redirection loops.
Max links
You can limit the maximum number of links that will be scanned; once
the number is met the scan will stop.
Threads
This is the number of concurrent threads that will be used to retrieve
links. In most cases, the default of 10 is fine, but if you have CPU
and bandwidth to spare, you can increase this to shorten your scan
times.
Retry count
If it is unable to connect to a page, how many times it will attempt to
retry before going on to the next link.
Retry timeout
How long (in seconds) it will wait before it considers a request to
have failed.
Max filesize (k)
This is the maximum amount of data (in kilobytes) it will download.
User Agent
This is the user agent that is reported by the harvester while it scans
pages. It's recommended that this remains the default.
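To illustrate the two 'obey' options above, here's what the relevant markup
generally looks like on the web side (these are the standard conventions, with
made-up paths, not anything specific to LinkExaminer). A robots.txt file that
blocks all crawlers from a directory:

User-agent: *
Disallow: /private/

and a link tagged as nofollow:

<a href="http://www.testurl.com/page.htm" rel="nofollow">some link</a>

With 'Obey robots.txt' turned on, URLs under /private/ would be skipped; with
'Obey NoFollow' turned on, the tagged link would not be followed.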
Configuration: Parser
---------------------
Cache pages
With this turned on, a copy of all HTML pages downloaded will be saved
in memory. This enables some of the advanced features that rely on
content analysis.
Count duplicate links
With this turned on, if a page links to the same URL twice, that counts
as two 'hits' for the target page; with it turned off, it would only be
counted once.
Walk forms
With this enabled, it will attempt to walk any form fields it finds.
Spaces in URL as '+'
Some webservers want to see spaces as '+' instead of %20; if this is
the case (it normally won't be), then check this.
Skip graphic elements
With this turned on, anything linked through a graphic tag, such as
IMG, will not be added to the scan list.
Ignore anchors
With this turned on, it will not check whether anchors exist on a page.
If this is turned off and your site uses anchors extensively, your link
count can increase radically.
Identify duplicate pages
With this turned on, a SHA1 is calculated for each page (the RAW HTML)
and it is used to find identical content.
Last-Modified <5m is dynamic
This helps the harvester to identify dynamic content - if the Date and
the Last-Modified date reported by the server are within 5 minutes of
each other, then it's considered dynamic.
Find similar pages
This requires caching to be turned on, and it increases memory usage
considerably. Once the scan completes, it will compare the contents of
all the HTML pages and report the maximum similarity encountered as
well as how many pages were close to the same.
No Last-Modified is dynamic
If a webserver responds without a Last-Modified date, the page will be
considered dynamic.
Similar spread
This is the percentage tolerance in matches necessary to be considered
a hit in the similarity analysis.
Configuration: Export
---------------------
Auto save report when done
With this turned on, when a scan has been completed it will automatically
save a copy (using the auto save filename).
Auto open report
With this enabled, when a report is generated it will automatically open
it in the browser.
Add date/time to filename
This automatically adds the current date and time to the filename, so
reports won't be overwritten.
Sitemap template
This specifies what sitemap template to use when generating sitemaps.
Report template
This specifies what report template to use when generating reports.
Custom Templates
----------------
Some of the exporters can be customized using a template/scripting system. The
template system itself was originally developed with HTML in mind, so it looks
similar to HTML (especially when generating HTML), which also makes it well
suited for XML (although it can really be used for any text-based format).
The templating system is actually very simple - one section of the page is
designated as the layout and the rest are scripted sections. The layout
section is called first, and it (unsurprisingly) controls how the page is laid
out. Each template section is contained inside an HTML comment and starts with
BEGINPAGE: followed by the name of the section.
Here's an example of the layout section:
<!-- BEGINPAGE:Layout -->
Another common type of section is a loop section - these are sections that
are looped over for each member of a list. They typically start with a Header
section, followed by a Member, and close with a Footer. When this is actually
generated, it will output the Header once, the Member as many times as there
are members in the list, and the Footer when it's done. If you would like to
alternate the format (to alternate the background color, etc.), then you can
have multiple Member sections followed by a number, for example:
<!-- BEGINPAGE:PageListHeader -->
<!-- BEGINPAGE:PageListMember1 -->
<!-- BEGINPAGE:PageListMember2 -->
<!-- BEGINPAGE:PageListFooter -->
In this case it would do Header, Member1, Member2, Member1, ...etc..., Footer.
If you don't want multiple Member sections, then just leave the number off.
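To make that concrete, here's a minimal sketch of how a layout and a loop
section might fit together - this isn't one of the shipped templates, just an
illustration using the section names and variables described elsewhere in this
document:

<!-- BEGINPAGE:Layout -->
<html>
<body>
<h1>{Website.URL}</h1>
{Display.PageList}
</body>
</html>
<!-- BEGINPAGE:PageListHeader -->
<ul>
<!-- BEGINPAGE:PageListMember -->
<li>{Page.URL} ({Page.HTTPCode})</li>
<!-- BEGINPAGE:PageListFooter -->
</ul>

When generated, the Header outputs the opening <ul> once, the Member line
repeats for every page in the list, and the Footer closes it again.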
Advanced Scripting
------------------
The HTML Report and Sitemap Generator both have their own template-driven
scripting system - this means that it's possible to make custom reports with
a focus on whatever you're interested in. While it is a simple system, it
does allow you a large amount of flexibility.
The system-level variables are the most basic of the variables; they just
print out some high-level information about the system itself:
{System.CurrentTime}
This displays the current time. Time is handled specially in the
scripting system, so it's possible to format it in a variety of ways.
{System.Version}
This displays the program's name and version information
When it comes to actually generating some output, you use the {Display.*}
functions - these run through all the members in special ways based on the
settings you've chosen.
{Display.OverallSummary}
This displays the overall summary information
{Display.PageList}
This iterates through the list of pages
{Display.SiteMap}
This generates a sitemap
{Display.LinkInList}
This iterates through any pages linking into the current page
{Display.LinkOutList}
This iterates through any pages linked from the current page
{Display.HTTPCodeSummary}
This prints a summary of the HTTP codes
{Display.ContentTypeSummary}
This prints a summary of the content types
When calling these, it's also possible to change which page template they will
use when outputting information - in this way it's easy to customize the way
things are output. When the page template is overridden, it will always look
for 4 sections: Empty, Header, Member and Footer. So, if you were using
{Display.PageList}, the default sections would be PageListEmpty, PageListHeader,
etc. - but if you call {Display.PageList-Example} instead, it will use the page
templates ExampleEmpty, ExampleHeader, etc. You can then tweak those for their
specific application - pretty cool, eh?
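For example (a sketch reusing the naming convention just described, not
anything from the shipped templates), you could render the page list a second
time in a more compact form with:

{Display.PageList-Example}

together with these sections somewhere in the same template:

<!-- BEGINPAGE:ExampleHeader -->
<p>
<!-- BEGINPAGE:ExampleMember -->
{Page.URL}<br>
<!-- BEGINPAGE:ExampleFooter -->
</p>

The call pulls its Header, Member and Footer from the Example* sections instead
of the PageList* ones.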
Also, if there isn't a predefined Display with that name (such as SiteMap,
PageList, etc.), it will be treated as a template section, so you can call one
of your own sections directly. You can see this in 'report-template.htm': at
the end of the Layout section you'll notice it doesn't include the body and
html closing tags, but instead calls {Display.Footer}. This then looks for the
template section Footer, which is at the bottom of the document - this is done
so that the entire page is contained in the HTML document and looks (somewhat)
correct in the browser even when it hasn't been generated.
Options are special in that they don't actually print anything; they're used
to change the way a display request is processed:
{Option.Reset}
Reset all the options
{Option.SetPageListFilter}
Probably the most powerful and most frequently used, this lets you
filter which pages are returned - so anything put as a parameter to
this WILL NOT be displayed. Most of the filter types have positive
and negative modes, and you can use multiple ones in a filter by
using commas between each term. For example:
{Option.SetPageListFilter-NoError}
Will skip any pages that don't have an error - so effectively returning
an error list. Now, let's say you only wanted to show internal errors -
just change it to this:
{Option.SetPageListFilter-NoError,External}
See? Now we're filtering out pages that don't have errors and pages
that are external. Here's a list of all the filters:
Redirect
NoRedirect
Error
NoError
Internal
External
RobotBlocked
NoRobotBlocked
NoFollow
NoNoFollow
Relative
Literal
Dynamic
Static
{Option.SetPageListContentTypeAllow}
This limits the returns to only the content type specified, so it
works differently than the filter. For example:
{Option.SetPageListContentTypeAllow-text/html}
will only return HTML files. As with the filter, multiple content
types can be specified, separated by commas.
{Option.SetPageListBiggerThan}
Pretty straightforward - only pages bigger than the number (specified
in kilobytes) will be returned.
{Option.SetPageListSmallerThan}
Have a guess? Only pages smaller than the number are returned
{Option.SetPageListSlowerThan}
As with the above options, this is related to the transfer time, and
is in fractional seconds - so to only return pages that took more than
1.5 seconds, you would do:
{Option.SetPageListSlowerThan-1.5}
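Putting a few of these together (a sketch using only the options documented
above), a report section that lists large internal HTML pages with errors
might look like:

{Option.Reset}
{Option.SetPageListFilter-NoError,External}
{Option.SetPageListContentTypeAllow-text/html}
{Option.SetPageListBiggerThan-100}
{Display.PageList}

The reset clears anything left over from earlier calls, the filter drops
error-free and external pages, the content type allow keeps only text/html,
the size option keeps pages over 100k, and the final line actually renders
the list.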
When running any of the Summary-type subqueries, they will return their
contents into the following variables:
{Summary.Value}
If there is a value associated with this summary
{Summary.String}
If there is a string associated with the summary
{Summary.Count}
The number of matches to this particular summary row
{Summary.Percent}
The percentage of this row (basically count/total)
{Summary.Total}
The total number of entries related to this summary
{Summary.SizeAverage}
The average filesize (if appropriate)
{Summary.SizeTotal}
The total filesize (if appropriate)
It's important to point out that not all the summary variables will be filled in
with valid info - it depends on which summary is being generated. It won't hurt
anything to try using one - it just won't contain any information.
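As a sketch of how these are typically consumed (assuming the HTTPCodeSummary
display follows the same Header/Member/Footer naming pattern described earlier,
which isn't spelled out here), a summary table might look like:

{Display.HTTPCodeSummary}

<!-- BEGINPAGE:HTTPCodeSummaryHeader -->
<table>
<!-- BEGINPAGE:HTTPCodeSummaryMember -->
<tr><td>{Summary.Value}</td><td>{Summary.String}</td><td>{Summary.Count}</td><td>{Summary.Percent}%</td></tr>
<!-- BEGINPAGE:HTTPCodeSummaryFooter -->
</table>

In this case {Summary.Value} would presumably carry the HTTP code,
{Summary.String} its human-readable message, and {Summary.Count} and
{Summary.Percent} how often it occurred.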
Now, when iterating through relationships, each pass will load the contents of
the page referenced - so if you're going through LinkInList, all of the {Page.*}
variables will be related to the LinkIn, not to the page it's linking to. There
are also a couple of relationship-specific members:
{Relationship.Hits}
This is the total number of relationship hits
{Relationship.LinkType}
This is the type of inbound link from the current page (a, img, etc)
{Relationship.InLinkCount}
This is the number of links coming into this page from others
{Relationship.OutLinkCount}
This is the number of links going out from this page to other pages
There are a couple of scan-level variables, these are things that deal with
the scan as a whole instead of an individual page:
{Website.URL}
This is the main URL that the scan was done on
{Website.MaxDepth}
This is the deepest depth of the walk
The following variables are all pretty straightforward; they map to what you
see in the listview, and their names describe what they are:
{Page.Count}
{Page.CurrentCount}
{Page.URL}
{Page.HTTPCode}
{Page.HTTPCodeString}
{Page.RawHTTPCode}
{Page.RawHTTPCodeString}
{Page.IsInternal}
{Page.IsRobotBlocked}
{Page.IsNoFollow}
{Page.IsDynamic}
{Page.IsRelative}
{Page.Title}
{Page.Depth}
{Page.Rank}
{Page.RedirectDepth}
{Page.Hits}
{Page.Links}
{Page.ComponentHits}
{Page.ClickableHits}
{Page.ContentType}
{Page.Size}
{Page.LastModifiedTime}
{Page.LinkType}
{Page.Duration}
{Page.SimilarityPercent}
{Page.SimilarityHits}
{Page.ParsedContent}
Scripting Time
--------------
As mentioned above, Time is handled slightly differently than normal variables,
and has all sorts of variations you can use. For example, {System.CurrentTime}
has the following variations:
{System.CurrentTime}
{System.CurrentTimeRFC822}
{System.CurrentTimeDate}
{System.CurrentTimeSecond}
{System.CurrentTimeSecondPad}
{System.CurrentTimeMinute}
{System.CurrentTimeMinutePad}
{System.CurrentTimeHour24}
{System.CurrentTimeHour24Pad}
{System.CurrentTimeHour}
{System.CurrentTimeHourPad}
{System.CurrentTimeAMPM}
{System.CurrentTimeDay}
{System.CurrentTimeDayPad}
{System.CurrentTimeMonth}
{System.CurrentTimeMonthPad}
{System.CurrentTimeMonthName}
{System.CurrentTimeMonthNameAbbreviation}
{System.CurrentTimeYear2}
{System.CurrentTimeYear4}
Most (if not all) should be pretty obvious, but basically any variable that
ends with the word Time probably has all these modifiers.
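For example, to stamp a report with a sortable date and time, you could combine
the padded pieces yourself:

Generated {System.CurrentTimeYear4}-{System.CurrentTimeMonthPad}-{System.CurrentTimeDayPad} at {System.CurrentTimeHour24Pad}:{System.CurrentTimeMinutePad}

which would come out as something like "Generated 2011-03-07 at 14:05"
(made-up values, of course).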
Scripting Modifiers
-------------------
Speaking of modifiers, it's possible to have the scripting system do a bit of
processing on any variable it sees. To do this, simply add a dash followed by
the modifier; so if you wanted the year {System.CurrentTimeYear4} to be
displayed with a comma (I'm not sure why), then you would write
{System.CurrentTimeYear4-Comma}. Here's a list of the modifiers currently
available:
-Checked
If the string immediately following the modifier matches the
variable, then it returns "CHECKED". Used for inputs.
-Selected
If the string immediately following the modifier matches the
variable, then it returns "SELECTED". Used for inputs.
-ESC
This does the escaping necessary for the string to be used inside
quotes.
-HTML
This does any escaping necessary for HTML, such as converting
greater than/less than, etc.
-CGI
This does the appropriate escaping to be sent as a parameter in a
URL.
-YESNO
This converts the variable 0/1 into Yes or No
-ONOFF
This converts the variable 0/1 into On or Off
-IsEmpty
This prints either "Yes" or "No" depending on whether the variable is
empty.
-Clip
This limits the length of a string to the number specified. If
it must clip the string, then it does so and adds "..." to the
end.
-Chop
This does the same thing as clip, but doesn't add the "..."
-Left
This is used with percentages (0-100); it returns what's left, so if
the percentage were 24%:
{Summary.Percent} would print 24
{Summary.Percent-Left} would print 76
-SizeConvert
This is a bit special, it takes 2 (or 3) parameters after it. The
first is the format of the incoming variable, the second is what
format to convert it to, and the third (optional) parameter is how
much precision it should use. For instance, let's say you have the
filesize in bytes of the page, which is 1234567:
{Page.Size} would print 1234567
{Page.Size-SizeConvertBK} would print 1206
{Page.Size-SizeConvertBM} would print 1
{Page.Size-SizeConvertBM3} would print 1.206
The types are:
B = byte b = bit
K = kilobyte k = kilobit
M = megabyte m = megabit
G = gigabyte g = gigabit
T = terabyte t = terabit
-Precision
Just like with SizeConvert, this can take a floating point variable
and reduce the number of digits to the right of the decimal.
-Approx
This changes an integer into an approximation
-Comma
This adds commas to an integer, so:
{Page.Size} would print 1234567
{Page.Size-Comma} would print 1,234,567
It should be pointed out that only one modifier at a time can be used - there
is no chaining of them, so make sure to choose carefully! :)
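As a quick sketch of where a couple of these tend to show up in an HTML
template (the checker URL here is made up purely for the example):

<a href="http://some.checker.example/?url={Page.URL-CGI}">{Page.Title-HTML}</a> ({Page.Size-Comma} bytes)

The -CGI modifier escapes the URL so it's safe as a query parameter, -HTML
escapes the title so it can't break the markup, and -Comma just makes the size
easier to read.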
Scripting Example
-----------------
So, let's try an example of adding something to the report-template.htm - open
it up in your favorite editor and add the following:
<h2>All:</h2>
{Option.Reset}
{Display.PageList}
Hopefully you have some idea of what this will do at this point - and if you
don't, take a look at the header tag, it might give you a clue. :) Just
generate a report, and it should now also include a listing of all the pages
that were harvested.
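If you'd rather list just the problem pages (a sketch combining this with the
filter option from earlier), change it to:

<h2>Errors:</h2>
{Option.Reset}
{Option.SetPageListFilter-NoError}
{Display.PageList}

and the generated report will include only the pages that came back with an
error.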
One quick tip for you: the template is reloaded each time you generate a
report. This means you can perform a scan once, then just work on the template
and regenerate the report over and over until you've got what you want.
Conclusion
----------
Hopefully you found this readme file useful (for the few who actually read
these) - my least favorite thing to write is documentation (my favorite being
code). I hope you enjoy LinkExaminer and that it serves you well...