org.dspace.app.sitemap
Class AbstractGenerator

java.lang.Object
  extended by org.dspace.app.sitemap.AbstractGenerator
Direct Known Subclasses:
HTMLSitemapGenerator, SitemapsOrgGenerator

public abstract class AbstractGenerator
extends Object

Base class for creating sitemaps of various kinds. A sitemap consists of one or more files which list significant URLs on a site for search engines to efficiently crawl. Dates of modification may also be included. A sitemap index file that links to each of the sitemap files is also generated. It is this index file that search engines should be directed towards.

Provides most of the required functionality; subclasses need only implement a few methods that supply the "boilerplate" and the text for each URL.

Typical usage:

   AbstractGenerator g = new FooGenerator(...);
   while (...) {
     g.addURL(url, date);
   }
   g.finish();
 

Author:
Robert Tansley

Field Summary
protected  int bytesWritten
          Number of bytes written to current file
protected  PrintStream currentOutput
          Output stream for the sitemap file currently being written
protected  int fileCount
          Number of files written so far
protected  File outputDir
          Directory files are written to
protected  int urlsWritten
          Number of URLs written to current file
 
Constructor Summary
AbstractGenerator(File outputDirIn)
          Initialize this generator to write to the given directory.
 
Method Summary
 void addURL(String url, Date lastMod)
          Add the given URL to the sitemap.
protected  void closeCurrentFile()
          Finish with the current sitemap file.
 int finish()
          Complete writing sitemap files and write the index file.
abstract  String getFilename(int number)
          Return the filename under which the sitemap with the given index should be stored.
abstract  String getIndexFilename()
          Get the filename the index should be written to.
abstract  String getLeadingBoilerPlate()
          Return the boilerplate at the top of a sitemap file.
abstract  int getMaxSize()
          Return the maximum size in bytes that an individual sitemap file should be.
abstract  int getMaxURLs()
          Return the maximum number of URLs that an individual sitemap file should contain.
abstract  String getTrailingBoilerPlate()
          Return the boilerplate at the end of a sitemap file.
abstract  String getURLText(String url, Date lastMod)
          Return marked-up text to be included in a sitemap about a given URL.
protected  void startNewFile()
          Start writing a new sitemap file.
abstract  boolean useCompression()
          Return whether the written sitemap files and index should be GZIP-compressed.
abstract  void writeIndex(PrintStream output, int sitemapCount)
          Write the index file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fileCount

protected int fileCount
Number of files written so far


bytesWritten

protected int bytesWritten
Number of bytes written to current file


urlsWritten

protected int urlsWritten
Number of URLs written to current file


outputDir

protected File outputDir
Directory files are written to


currentOutput

protected PrintStream currentOutput
Output stream for the sitemap file currently being written

Constructor Detail

AbstractGenerator

public AbstractGenerator(File outputDirIn)
Initialize this generator to write to the given directory. This must be called by any subclass constructor.

Parameters:
outputDirIn - directory to write sitemap files to
Method Detail

startNewFile

protected void startNewFile()
                     throws IOException
Start writing a new sitemap file.

Throws:
IOException - if an error occurs creating the file

addURL

public void addURL(String url,
                   Date lastMod)
            throws IOException
Add the given URL to the sitemap.

Parameters:
url - Full URL to add
lastMod - date the URL was last modified, or null if unknown
Throws:
IOException - if an error occurs writing

closeCurrentFile

protected void closeCurrentFile()
                         throws IOException
Finish with the current sitemap file.

Throws:
IOException - if an error occurs writing

finish

public int finish()
           throws IOException
Complete writing sitemap files and write the index file. Call this once, after all calls to addURL(String, Date) have been made; the generator is invalid afterwards and must not be reused.

Returns:
number of sitemap files written.
Throws:
IOException - if an error occurs writing

getURLText

public abstract String getURLText(String url,
                                  Date lastMod)
Return marked-up text to be included in a sitemap about a given URL.

Parameters:
url - URL to add information about
lastMod - date URL was last modified, or null if unknown or not applicable
Returns:
the mark-up to include
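
As an illustration of what an implementation might return, the following is a minimal sketch of a sitemaps.org-style entry builder. The class name UrlTextSketch and its helper methods are hypothetical and not part of DSpace; the sketch assumes the entry is XML and that a null lastMod simply omits the <lastmod> element.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of DSpace: shows one way getURLText could be written.
class UrlTextSketch {
    /** Escape the XML special characters so the URL is safe inside <loc>. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Build one <url> entry; when lastMod is null the <lastmod> element is omitted. */
    static String urlText(String url, Date lastMod) {
        StringBuilder sb = new StringBuilder("<url><loc>")
                .append(escapeXml(url))
                .append("</loc>");
        if (lastMod != null) {
            // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
            SimpleDateFormat w3c =
                    new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
            w3c.setTimeZone(TimeZone.getTimeZone("UTC"));
            sb.append("<lastmod>").append(w3c.format(lastMod)).append("</lastmod>");
        }
        return sb.append("</url>\n").toString();
    }
}
```

A subclass could delegate its getURLText(String, Date) to a helper of this shape; an HTML sitemap would return a link element instead.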

getLeadingBoilerPlate

public abstract String getLeadingBoilerPlate()
Return the boilerplate at the top of a sitemap file.

Returns:
The boilerplate markup.

getTrailingBoilerPlate

public abstract String getTrailingBoilerPlate()
Return the boilerplate at the end of a sitemap file.

Returns:
The boilerplate markup.
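
For a sitemaps.org XML sitemap, the leading and trailing boilerplate could look roughly like the sketch below. BoilerplateSketch and its method names are hypothetical; only the XML declaration and the urlset namespace come from the sitemaps.org protocol.

```java
// Hypothetical helper, not part of DSpace: the text that opens and closes one sitemap file.
class BoilerplateSketch {
    /** Text written once at the top of every sitemap file. */
    static String leading() {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n";
    }

    /** Text written once at the end of every sitemap file. */
    static String trailing() {
        return "</urlset>\n";
    }

    /** Convenience for illustration: a complete file body around the given entries. */
    static String wrap(String entries) {
        return leading() + entries + trailing();
    }
}
```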

getMaxSize

public abstract int getMaxSize()
Return the maximum size in bytes that an individual sitemap file should be.

Returns:
the size in bytes.

getMaxURLs

public abstract int getMaxURLs()
Return the maximum number of URLs that an individual sitemap file should contain.

Returns:
the maximum number of URLs.
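
The two limits above suggest a rollover check before each URL is written. This page does not state the exact rule AbstractGenerator applies, so the following is only a sketch under assumed semantics: a new file is needed when one more URL would exceed the URL limit, or when the next entry plus the trailing boilerplate would exceed the byte limit. RolloverSketch and its parameter names are hypothetical.

```java
// Hypothetical helper, not part of DSpace: one plausible rollover rule.
class RolloverSketch {
    /**
     * Decide whether the current file is full.
     *
     * @param urlsWritten    URLs already written to the current file
     * @param bytesWritten   bytes already written to the current file
     * @param nextEntryBytes size of the entry about to be written
     * @param trailerBytes   size of the trailing boilerplate still to be written
     * @param maxURLs        limit as returned by getMaxURLs()
     * @param maxSize        limit as returned by getMaxSize()
     */
    static boolean needsNewFile(int urlsWritten, int bytesWritten,
                                int nextEntryBytes, int trailerBytes,
                                int maxURLs, int maxSize) {
        boolean tooManyUrls = urlsWritten + 1 > maxURLs;
        // Sum in long arithmetic so values near Integer.MAX_VALUE cannot overflow.
        long projected = (long) bytesWritten + nextEntryBytes + trailerBytes;
        return tooManyUrls || projected > maxSize;
    }
}
```

For example, the sitemaps.org protocol allows at most 50,000 URLs per file, so a subclass targeting that format would return 50000 from getMaxURLs().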

useCompression

public abstract boolean useCompression()
Return whether the written sitemap files and index should be GZIP-compressed.

Returns:
true if GZIP compression should be used, false otherwise.
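
How the flag is applied is not described here. One common approach, shown as a sketch only, is to wrap the raw file stream in java.util.zip.GZIPOutputStream before handing it to a PrintStream. CompressionSketch and open are hypothetical names.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of DSpace: optional GZIP wrapping of an output stream.
class CompressionSketch {
    /**
     * Return a UTF-8 PrintStream over raw, compressed when compress is true.
     * Closing the returned stream also closes the GZIP layer, which writes the GZIP trailer.
     */
    static PrintStream open(OutputStream raw, boolean compress) throws IOException {
        OutputStream out = compress ? new GZIPOutputStream(raw) : raw;
        return new PrintStream(out, false, "UTF-8");
    }
}
```

With this shape the rest of the writing code is identical for both cases; only the filename extension and the wrapped stream differ.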

getFilename

public abstract String getFilename(int number)
Return the filename a sitemap at the given index should be stored at.

Parameters:
number - index of the sitemap file (zero is first).
Returns:
the filename to write the sitemap to.

getIndexFilename

public abstract String getIndexFilename()
Get the filename the index should be written to.

Returns:
the filename of the index.
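
A naming scheme for getFilename(int) and getIndexFilename() might look like the sketch below. The names "sitemap", "sitemap_index" and the ".gz" suffix are illustrative assumptions, not the names DSpace uses; FilenameSketch is hypothetical.

```java
// Hypothetical helper, not part of DSpace: an illustrative naming scheme.
class FilenameSketch {
    /** Name of the sitemap file with the given zero-based index. */
    static String filename(int number, boolean compress) {
        return "sitemap" + number + ".xml" + (compress ? ".gz" : "");
    }

    /** Name of the index file that links to every sitemap file. */
    static String indexFilename(boolean compress) {
        return "sitemap_index.xml" + (compress ? ".gz" : "");
    }
}
```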

writeIndex

public abstract void writeIndex(PrintStream output,
                                int sitemapCount)
                         throws IOException
Write the index file.

Parameters:
output - stream to write the index to
sitemapCount - number of sitemaps that were generated
Throws:
IOException - if an IO error occurs
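
For a sitemaps.org index, an implementation could emit one <sitemap> element per generated file, as in the following sketch. The baseUrl parameter and the "sitemap" + i + ".xml" naming are assumptions made for illustration; the real method receives only output and sitemapCount and must obtain the public URL of each file by other means. IndexSketch is a hypothetical name.

```java
import java.io.PrintStream;

// Hypothetical helper, not part of DSpace: writes a sitemaps.org-style index.
class IndexSketch {
    /**
     * Write an index that links to sitemapCount sitemap files.
     * baseUrl is an assumed extra parameter: the public URL prefix of the sitemap files.
     */
    static void writeIndex(PrintStream output, int sitemapCount, String baseUrl) {
        output.print("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        output.print("<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (int i = 0; i < sitemapCount; i++) {
            // Sitemap files are numbered from zero, matching getFilename(int).
            output.print("<sitemap><loc>" + baseUrl + "/sitemap" + i + ".xml</loc></sitemap>\n");
        }
        output.print("</sitemapindex>\n");
    }
}
```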


Copyright © 2010 DuraSpace. All Rights Reserved.