org.dspace.app.statistics
Class LogAnalyser

java.lang.Object
  extended by org.dspace.app.statistics.LogAnalyser

public class LogAnalyser
extends Object

This class performs the analysis of a given set of DSpace log files. Most input can be configured; run with the -help flag for full usage information. The output is a plain-text "aggregation" file, which the related ReportGenerator class can then use for display purposes.

Author:
Richard Jones

Field Summary
static String configFile
          the config file from which to configure the analyser
 
Constructor Summary
LogAnalyser()
           
 
Method Summary
static String[] analyseQuery(String query)
          Take a search query string and pull out all of the meaningful information from it, giving the results in the form of a String array, a single word to each element
static void createOutput()
          generate the analyser's output to the specified out file
static File[] getLogFiles(String logDir)
          get an array of file objects representing the passed log directory
static LogLine getLogLine(String line)
          split the given line into its relevant segments if applicable (i.e. the line matches the required regular expression).
static Integer getNumItems(Context context)
          get the total number of items in the archive at time of execution, ignoring all other constraints
static Integer getNumItems(Context context, String type)
          get the number of items in the archive which were accessioned between the provided start and end dates, with the given value for the DC field 'type' (unqualified)
static Integer increment(Map map, String key)
          increment the value of the given map at the given key by one.
static void main(String[] argv)
          main method to be run from command line.
static Date parseDate(String date)
          Take the standard date string requested at the command line and convert it into a Date object.
static void processLogs(Context context, String myLogDir, String myFileTemplate, String myConfigFile, String myOutFile, Date myStartDate, Date myEndDate, boolean myLookUp)
          using the pre-configuration information passed here, analyse the logs and produce the aggregation file
static void readConfig(String configFile)
          read in the given config file and populate the class globals
static void setParameters(String myLogDir, String myFileTemplate, String myConfigFile, String myOutFile, Date myStartDate, Date myEndDate, boolean myLookUp)
          set the passed parameters up as global class variables.
static void setRegex(String fileTemplate)
          set up the regular expressions to be used by this analyser.
static String unParseDate(Date date)
          Take the date object and convert it into a string of the form YYYY-MM-DD
static void usage()
          print out the usage information for this class to the standard out
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

configFile

public static String configFile
the config file from which to configure the analyser

Constructor Detail

LogAnalyser

public LogAnalyser()
Method Detail

main

public static void main(String[] argv)
                 throws Exception,
                        SQLException
main method to be run from command line. See usage information for details as to how to use the command line flags (-help)

Throws:
Exception
SQLException

processLogs

public static void processLogs(Context context,
                               String myLogDir,
                               String myFileTemplate,
                               String myConfigFile,
                               String myOutFile,
                               Date myStartDate,
                               Date myEndDate,
                               boolean myLookUp)
                        throws IOException,
                               SQLException
using the pre-configuration information passed here, analyse the logs and produce the aggregation file

Parameters:
context - the DSpace context object this occurs under
myLogDir - the passed log directory. Uses default if null
myFileTemplate - the passed file name regex. Uses default if null
myConfigFile - the DStat config file. Uses default if null
myOutFile - the file to which to output aggregation data. Uses default if null
myStartDate - the desired start of the analysis. Starts from the beginning otherwise
myEndDate - the desired end of the analysis. Goes to the end otherwise
myLookUp - force a lookup of the database
Throws:
IOException
SQLException
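
The optional myStartDate and myEndDate bounds described above can be illustrated with a small helper. This is a minimal sketch, not the class's actual code: the name inRange is hypothetical, a null bound is assumed to mean "no limit" on that side, and both bounds are assumed to be inclusive.

```java
import java.util.Date;

class DateRangeSketch {
    /**
     * Returns true when date lies within the optional bounds.
     * Assumption: a null start or end leaves that side of the range open,
     * and a date equal to either bound counts as inside the range.
     */
    static boolean inRange(Date date, Date start, Date end) {
        if (start != null && date.before(start)) {
            return false; // earlier than the requested start
        }
        if (end != null && date.after(end)) {
            return false; // later than the requested end
        }
        return true;
    }
}
```

A log line dated before the start bound would be skipped, while passing null for both bounds accepts every line.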

setParameters

public static void setParameters(String myLogDir,
                                 String myFileTemplate,
                                 String myConfigFile,
                                 String myOutFile,
                                 Date myStartDate,
                                 Date myEndDate,
                                 boolean myLookUp)
set the passed parameters up as global class variables. This has to be done in a separate method because the API permits either running from the command line with arguments or calling the processLogs method statically from elsewhere

Parameters:
myLogDir - the log file directory to be analysed
myFileTemplate - regex for log file names
myConfigFile - config file to use for dstat
myOutFile - file to write the aggregation into
myStartDate - requested log reporting start date
myEndDate - requested log reporting end date
myLookUp - requested look up force flag

createOutput

public static void createOutput()
generate the analyser's output to the specified out file


getLogFiles

public static File[] getLogFiles(String logDir)
get an array of file objects representing the passed log directory

Parameters:
logDir - the log directory in which to pick up files
Returns:
an array of file objects representing the given logDir
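
One plausible way to collect log files from a directory is to filter its entries by a file-name regex. The sketch below is an assumption-laden illustration: unlike the documented method it takes the regex as a second parameter (here called fileTemplate), it sorts the result, and it returns an empty array when the directory cannot be listed. None of these behaviours are confirmed by this page.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;
import java.util.regex.Pattern;

class LogFilesSketch {
    /** Lists the entries of logDir whose names fully match the fileTemplate regex. */
    static File[] getLogFiles(String logDir, String fileTemplate) {
        final Pattern namePattern = Pattern.compile(fileTemplate);
        File[] files = new File(logDir).listFiles(new FilenameFilter() {
            public boolean accept(File dir, String name) {
                return namePattern.matcher(name).matches();
            }
        });
        if (files == null) {
            // listFiles returns null when the path is not a readable directory
            return new File[0];
        }
        Arrays.sort(files); // File is Comparable; sorts by path name
        return files;
    }
}
```

For example, a hypothetical template such as "dspace\\.log.*" would pick up both dspace.log and dated variants like dspace.log.2010-03-01.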

setRegex

public static void setRegex(String fileTemplate)
set up the regular expressions to be used by this analyser. Mostly this exists to provide a degree of segregation and readability to the code and to ensure that you only need to set up the regular expressions to be used once

Parameters:
fileTemplate - the regex to be used to identify dspace log files

readConfig

public static void readConfig(String configFile)
                       throws IOException
read in the given config file and populate the class globals

Parameters:
configFile - the config file to read in
Throws:
IOException

increment

public static Integer increment(Map map,
                                String key)
increment the value of the given map at the given key by one.

Parameters:
map - the map whose value we want to increase
key - the key of the map whose value to increase
Returns:
an integer object containing the new value
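
The counting behaviour can be sketched as follows. This is an illustration only: the page does not say what happens for a key that is not yet in the map, nor whether the map itself is modified. The sketch assumes a missing key counts as zero, and it returns the new value without storing it, leaving the put to the caller.

```java
import java.util.Map;

class IncrementSketch {
    /**
     * Returns the value stored under key plus one.
     * Assumption: a key with no entry is treated as having a count of zero.
     * The map is not modified; the caller stores the result if desired.
     */
    static Integer increment(Map<String, Integer> map, String key) {
        Integer current = map.get(key);
        if (current == null) {
            return Integer.valueOf(1);
        }
        return Integer.valueOf(current.intValue() + 1);
    }
}
```

Under these assumptions a caller would write something like counts.put(key, increment(counts, key)) to record each occurrence.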

parseDate

public static Date parseDate(String date)
Take the standard date string requested at the command line and convert it into a Date object. Throws an error and exits if the date does not parse

Parameters:
date - the string representation of the date
Returns:
a date object containing the date, with the time set to 00:00:00
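
A minimal sketch of this kind of parsing with java.text.SimpleDateFormat is shown below. It assumes the command-line format is YYYY-MM-DD (the form documented for unParseDate), and it throws an IllegalArgumentException on bad input instead of exiting the JVM as the documented method does.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class ParseDateSketch {
    /** Parses a YYYY-MM-DD string into a Date at 00:00:00 in the default time zone. */
    static Date parseDate(String date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject impossible dates such as 2010-02-30
        try {
            return format.parse(date);
        } catch (ParseException e) {
            // The documented method reports the error and exits; this sketch throws instead.
            throw new IllegalArgumentException("Expected a date of the form YYYY-MM-DD: " + date, e);
        }
    }
}
```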

unParseDate

public static String unParseDate(Date date)
Take the date object and convert it into a string of the form YYYY-MM-DD

Parameters:
date - the date to be converted
Returns:
A string of the form YYYY-MM-DD
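
The reverse conversion can be sketched in the same way. This is an illustrative version, not the class's actual code; it formats in the JVM's default time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class UnParseDateSketch {
    /** Formats the date as YYYY-MM-DD using the default time zone. */
    static String unParseDate(Date date) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        return new SimpleDateFormat("yyyy-MM-dd").format(date);
    }
}
```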

analyseQuery

public static String[] analyseQuery(String query)
Take a search query string and pull out all of the meaningful information from it, giving the results in the form of a String array, a single word to each element

Parameters:
query - the search query to be analysed
Returns:
the string array containing meaningful search terms
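
To make "meaningful information" concrete, the sketch below lower-cases the query, strips punctuation, and drops common stop words. The stop-word list and the exact cleaning rules are hypothetical; this page does not say which words or characters the real analyser excludes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnalyseQuerySketch {
    // Hypothetical stop words; the real exclusions are not listed on this page.
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "of", "or", "the"));

    /** Returns the remaining words of the query, one per array element. */
    static String[] analyseQuery(String query) {
        // Collapse every run of non-letter, non-digit characters into a single space.
        String cleaned = query.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
        if (cleaned.isEmpty()) {
            return new String[0];
        }
        List<String> words = new ArrayList<String>();
        for (String word : cleaned.split(" ")) {
            if (!STOP_WORDS.contains(word)) {
                words.add(word);
            }
        }
        return words.toArray(new String[0]);
    }
}
```

With these assumed rules, a query such as The History of "DSpace" AND (archives) reduces to the three words history, dspace, and archives.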

getLogLine

public static LogLine getLogLine(String line)
split the given line into its relevant segments if applicable (i.e. the line matches the required regular expression).

Parameters:
line - the line to be segmented
Returns:
a LogLine object for the given line
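
Regex-based segmentation of a log line might look like the sketch below. Everything specific here is an assumption: the line layout (date, time, level, logger class, then "@ user:session:ip:action:params"), the ParsedLine holder class used in place of LogLine, and the choice to return null for a line that does not match. The pattern is compiled once as a constant, in the spirit of setRegex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineSketch {
    /** Hypothetical stand-in for LogLine, holding the extracted segments. */
    static final class ParsedLine {
        final String date, level, user, action, params;

        ParsedLine(String date, String level, String user, String action, String params) {
            this.date = date;
            this.level = level;
            this.user = user;
            this.action = action;
            this.params = params;
        }
    }

    // Assumed layout: "YYYY-MM-DD HH:MM:SS,mmm LEVEL class @ user:session:ip:action:params"
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2}) \\d{2}:\\d{2}:\\d{2},\\d{3} (\\w+)\\s+\\S+ @ "
            + "([^:]+):[^:]+:[^:]+:([^:]+):(.*)$");

    /** Returns the segments of a matching line, or null when the line does not match. */
    static ParsedLine getLogLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null; // e.g. a stack-trace line with no timestamp
        }
        return new ParsedLine(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5));
    }
}
```

Lines that are continuations of a previous entry, such as stack traces, simply fail the match and can be skipped by the caller.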

getNumItems

public static Integer getNumItems(Context context,
                                  String type)
                           throws SQLException
get the number of items in the archive which were accessioned between the provided start and end dates, with the given value for the DC field 'type' (unqualified)

Parameters:
context - the DSpace context for the action
type - value for DC field 'type' (unqualified)
Returns:
an integer containing the relevant count
Throws:
SQLException

getNumItems

public static Integer getNumItems(Context context)
                           throws SQLException
get the total number of items in the archive at time of execution, ignoring all other constraints

Parameters:
context - the DSpace context the action is being performed in
Returns:
an Integer containing the number of items in the archive
Throws:
SQLException

usage

public static void usage()
print out the usage information for this class to the standard out



Copyright © 2010 DuraSpace. All Rights Reserved.