# Advanced Configuration for Connector Manager v2.4.4 and Later #
Connector Manager version 2.4.4 moves most of the configuration
options that a Connector administrator may wish to set out of the
web application bean definition file and into a separate application
properties file.

This change makes the upgrade process much less painful for
administrators who have tailored the Connector Manager deployment
to their enterprise. The web application bean definition file is
redeployed during the upgrade process, overwriting any modifications
that an administrator may have made. However, the web application
properties file is not overwritten during an upgrade, so setting
common configuration properties in that file preserves
those customizations through future upgrades.

It is strongly suggested that administrators set only those advanced
properties whose values differ from the default values mentioned
below. This allows Google to tune the defaults in later releases,
to the benefit of everyone who has not explicitly overridden them.
## Setting or Modifying Advanced Configuration Properties ##
To set or modify Connector Manager Advanced Configuration
Properties, you must edit its `applicationContext.properties` file.
This [Java Properties](http://java.sun.com/j2se/1.5.0/docs/api/java/util/Properties.html) file
is a plain-text file in `ISO-8859-1` character encoding,
and must remain so when modified. The syntax for setting
property values must conform to the Java Properties specification.
The Connector Manager supports only plain-text properties files,
not XML-formatted properties files.

1. Shut down the Connector's Tomcat server.
2. Make a backup copy of the file `$TOMCAT_HOME/webapps/connector-manager/WEB-INF/applicationContext.properties`.
3. Edit the file `$TOMCAT_HOME/webapps/connector-manager/WEB-INF/applicationContext.properties`.
4. Make the necessary modifications (see below) and save the file.
5. Restart the Connector's Tomcat server.
6. Examine the logs in `$TOMCAT_HOME/logs`, looking for any errors that may have been generated by misconfigured properties.

Note: `$TOMCAT_HOME` represents the Apache Tomcat installation directory. For Connectors installed using the Google Connector Installer (GCI), this is the Tomcat directory in the Connector installation.
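
The backup step in the procedure above can be scripted. The following is a minimal Python sketch, not part of the Connector Manager: the helper names `backup_properties` and `is_ascii_only` are hypothetical, and the ASCII check is a deliberately conservative assumption (pure ASCII is always valid `ISO-8859-1`, but not every valid `ISO-8859-1` file is pure ASCII).

```python
import shutil
import time
from pathlib import Path


def backup_properties(path: str) -> Path:
    """Copy the properties file next to itself, adding a timestamp suffix."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, backup)  # copy2 also preserves the modification time
    return backup


def is_ascii_only(path: str) -> bool:
    """Return True if every byte is ASCII, a safe subset of ISO-8859-1."""
    return Path(path).read_bytes().isascii()
```

Run `backup_properties` with Tomcat stopped and before editing. Running `is_ascii_only` after saving can catch an editor that silently re-encoded the file.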

## Feed Connection Properties ##

### gsa.feed.protocol ###
The `gsa.feed.protocol` property specifies the URL protocol for
the feed host on the GSA. The supported values are `http` and `https`.
For example:

    gsa.feed.protocol=http

Since: 2.8.0

### gsa.feed.host ###
The `gsa.feed.host` property specifies the host IP address for the
feed host on the Google Search Appliance.
For example:

    gsa.feed.host=172.24.2.0

### gsa.feed.port ###
The `gsa.feed.port` property specifies the HTTP host port for the
feed host on the GSA.
For example:

    gsa.feed.port=19900

### gsa.feed.securePort ###
The `gsa.feed.securePort` property specifies the HTTPS host port
for the feed host on the GSA. This port is used if the
`gsa.feed.protocol` property is set to `https`.
For example:

    gsa.feed.securePort=19902

Since: 2.8.0
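
The relationship between the protocol and the two port properties can be sketched in a few lines of Python. This is an illustration only, not the Connector Manager's code: `feed_endpoint` is a hypothetical helper, and it assumes the properties have already been loaded into a plain dictionary.

```python
def feed_endpoint(props: dict) -> str:
    """Build scheme://host:port for the feed host from the loaded properties."""
    protocol = props["gsa.feed.protocol"]
    if protocol not in ("http", "https"):
        raise ValueError(f"unsupported gsa.feed.protocol: {protocol!r}")
    # The secure port applies only when the protocol is https.
    port_key = "gsa.feed.securePort" if protocol == "https" else "gsa.feed.port"
    return f"{protocol}://{props['gsa.feed.host']}:{props[port_key]}"
```

With the example values from this section, the `http` protocol selects port 19900 and the `https` protocol selects port 19902.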

### gsa.feed.validateCertificate ###
The `gsa.feed.validateCertificate` property specifies whether to
validate the GSA certificate when sending SSL feeds. If the GSA
certificate is installed in the Tomcat keystore, this should be
set to `true`; otherwise it must be set to `false`.
For example:

    gsa.feed.validateCertificate=false

Since: 2.8.0

### manager.locked ###
The `manager.locked` property is used to lock out the Admin Servlet
and prevent it from making further changes to the Feed Connection properties.
If it is set to `true` or is missing, the Servlet is
not allowed to update the Feed Connection properties.

NOTE: This property is automatically changed to `true` upon
successful update of the `gsa.feed.host` and `gsa.feed.port` properties when registering
a Connector Manager with a Google Search Appliance. Therefore, once the
Feed Connection properties are successfully updated by the Admin Servlet,
subsequent updates are locked out until the flag is manually
reset to `false`. For more information, see
Changing the GSA Feed Host.

    manager.locked=false

## Feed Logging Properties ##

### feedLoggingLevel ###
The `feedLoggingLevel` property controls the logging of the feed
record to a log file. The log record contains the feed XML
without the content data. Set this property to `ALL` to enable feed
logging, or to `OFF` to disable it. Customers and developers can use this
functionality to observe the feed record and metadata information
the Connector Manager sends to the Google Search Appliance.
The feed log contains most of the information fed to the Search
Appliance, but does not log the Document content.
For example:

    feedLoggingLevel=ALL

### feed.logging.FileHandler.pattern ###
The `feed.logging.FileHandler.pattern` property specifies the
location and naming convention used when generating feed logs.
The feed log filename pattern follows the
`java.util.logging.FileHandler` rules. The default pattern places feed logs in the `logs`
directory of the Tomcat installation for the Connector Manager, in files named
`google-connectors.feed*.log`.
For example:

    feed.logging.FileHandler.pattern=/var/logs/connectors/acme-connectors.feed%g.log

### feed.logging.FileHandler.limit ###
The `feed.logging.FileHandler.limit` property specifies an approximate
maximum size, in bytes, of any one feed log file before a new
feed log file is created. If this is zero, then there is no limit. The default limit is 50MB.
For example:

    feed.logging.FileHandler.limit=0

### feed.logging.FileHandler.count ###
The `feed.logging.FileHandler.count` property specifies how many feed
log files to cycle through. No more than this number of feed logs will be
maintained, with older logs being discarded as needed. The default feed
log count is 10.
For example:

    feed.logging.FileHandler.count=30
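
Taken together, the limit and count settings bound the disk space that feed logs can occupy. The Python sketch below estimates that bound. It is illustrative only: `max_feed_log_bytes` is a hypothetical helper, the result is approximate because the limit itself is approximate, and it assumes 1MB means 1024 * 1024 bytes, as the byte values elsewhere in this document suggest.

```python
from typing import Optional

MB = 1024 * 1024


def max_feed_log_bytes(limit_bytes: int, count: int) -> Optional[int]:
    """Approximate upper bound on retained feed log data, or None if unbounded."""
    if limit_bytes == 0:
        return None  # a limit of zero means a single log file may grow without bound
    return limit_bytes * count
```

With the default limit of 50MB and the default count of 10, the retained feed logs top out at roughly 500MB.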

### teedFeedFile ###
If you set the `teedFeedFile` property to the name of an existing
file, whenever the Connector Manager feeds content to the Search Appliance,
it writes a duplicate copy of the feed XML to the file specified by
the `teedFeedFile` property. Google Search Appliance customers and
third-party developers can use this functionality to observe the content
the Connector Manager sends to the Search Appliance and reproduce any
issue that may arise.
For example:

    teedFeedFile=/tmp/connector/CMTeedFeedFile

NOTE: The `teedFeedFile` contains all feed data sent to the
Search Appliance, including document content and metadata.
The `teedFeedFile` can therefore grow quite large very quickly.

## Feed Content Properties ##

### feed.timezone ###
The `feed.timezone` property defines the default time zone used
for Date metadata values for Documents. A `null` or empty string
indicates that the local time zone of the machine running the
Connector Manager should be used. The default feed time zone
is the local time zone of the Connector Manager. Standard Java `TimeZone`
identifiers may be used. For example:

    feed.timezone=America/Los_Angeles

If a standard `TimeZone` identifier is unavailable, then a custom
`TimeZone` identifier can be constructed as a `+/-hours[minutes]` offset
from GMT. Put any comment on its own line, because a Java Properties
file treats `#` as a comment marker only when it is the first non-blank
character on a line. For example:

    # GMT + 10 hours
    feed.timezone=GMT+10
    # GMT + 6 hours, 30 minutes
    feed.timezone=GMT+0630
    # GMT - 8 hours, 0 minutes
    feed.timezone=GMT-0800
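
To make the custom identifier format concrete, here is a minimal Python sketch that turns such an identifier into an offset from GMT. It is not the Connector Manager's parser: `parse_gmt_offset` is a hypothetical name, and the accepted forms and range checks (hours 0 to 23, minutes 0 to 59) are assumptions based on the Java `TimeZone` documentation for custom identifiers.

```python
import re
from datetime import timedelta

# GMT, a sign, one or two hour digits, then optional two-digit minutes.
_CUSTOM_ID = re.compile(r"GMT([+-])(\d{1,2})(?::?(\d{2}))?")


def parse_gmt_offset(zone_id: str) -> timedelta:
    """Return the offset from GMT described by a custom identifier."""
    match = _CUSTOM_ID.fullmatch(zone_id)
    if match is None:
        raise ValueError(f"not a custom GMT offset identifier: {zone_id!r}")
    hours = int(match.group(2))
    minutes = int(match.group(3) or 0)
    if hours > 23 or minutes > 59:
        raise ValueError(f"offset out of range: {zone_id!r}")
    offset = timedelta(hours=hours, minutes=minutes)
    return -offset if match.group(1) == "-" else offset
```

For instance, `parse_gmt_offset("GMT+0630")` yields an offset of six hours and thirty minutes, while any value with extra text after the offset is rejected.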

### feed.file.size ###
The `feed.file.size` property sets the target size, in bytes, of
an accumulated feed file. The Connector Manager tries to collect
many feed Documents into a single feed file to improve the
efficiency of sending data to the Google Search Appliance.
Specifying too small a value may result in many small feeds, which
might overrun the GSA's feed processor. However, specifying too
large a feed size reduces concurrency and may result in OutOfMemory
errors in the Java VM, especially if using multiple Connector instances.
The default target feed size is 10MB, which typically holds
50-100 fed documents.

    feed.file.size=10485760

### feed.document.size.limit ###
The `feed.document.size.limit` property defines the maximum
allowed size, in bytes, of a Document's content. Documents whose
content exceeds this size still have their metadata indexed,
but the content itself is not fed. The default value is
30MB, the maximum file size accepted by the Google Search Appliance.

    feed.document.size.limit=31457280
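
Both size properties take a raw byte count, so it helps to derive the value rather than type it by hand. A minimal Python sketch follows; `megabytes_to_bytes` is a hypothetical helper, and it assumes 1MB is 1024 * 1024 bytes, which matches the two example values above.

```python
def megabytes_to_bytes(megabytes: int) -> int:
    """Convert a size in MB (1024 * 1024 bytes each) to a byte count."""
    return megabytes * 1024 * 1024


# Print ready-to-paste property lines for the two documented defaults.
print(f"feed.file.size={megabytes_to_bytes(10)}")
print(f"feed.document.size.limit={megabytes_to_bytes(30)}")
```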

### feed.backlog.floor, feed.backlog.ceiling, feed.backlog.interval ###
The Feed Backlog properties are used to throttle back the
document feed if the Search Appliance has fallen behind in processing
outstanding feed items. The Connector Manager periodically polls the Search Appliance,
fetching the count of unprocessed feed items (the backlog count).
If the backlog count exceeds the ceiling value, feeding is paused.
Once the backlog count drops back down below the floor value, feeding
resumes.

    # Stop feeding the GSA if its backlog exceeds this value.
    feed.backlog.ceiling=10000
    # Resume feeding the GSA if its backlog falls below this value.
    feed.backlog.floor=1000
    # How often to check for feed backlog (in seconds).
    feed.backlog.interval=900
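
The floor and ceiling form a hysteresis band: between the two values, feeding simply stays in whatever state it was already in. The Python sketch below models that behavior as described above. It is not the Connector Manager's implementation, and the class name `BacklogThrottle` and its `update` method are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BacklogThrottle:
    """Pause feeding above the ceiling; resume only once below the floor."""

    floor: int = 1000
    ceiling: int = 10000
    paused: bool = False

    def update(self, backlog_count: int) -> bool:
        """Record one polled backlog count; return True if feeding may proceed."""
        if backlog_count > self.ceiling:
            self.paused = True
        elif backlog_count < self.floor:
            self.paused = False
        # Counts between floor and ceiling leave the current state unchanged.
        return not self.paused
```

In this model, `update` would be called once per `feed.backlog.interval`. The gap between floor and ceiling keeps feeding from flapping on and off when the backlog hovers near a single threshold.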

## Traversal Properties ##

### traversal.batch.size ###
The `traversal.batch.size` property defines the optimal number
of items to return in each repository traversal batch. The batch
size represents the size of the roll-back that occurs during a
failure condition. Batch sizes that are too small may incur
excessive processing overhead. Batch sizes that are too large
may produce OutOfMemory conditions within a Connector or result
in early termination of the batch if processing time exceeds the
`traversal.time.limit`. The default traversal batch size is 500 items.
For example:

    traversal.batch.size=1000

### traversal.poll.interval ###
The `traversal.poll.interval` property defines the number of
seconds to wait after a traversal of the repository finds no new
content before looking again. Short intervals allow new content
to be readily available for search, at the cost of increased
repository access. Long intervals add latency before new
content becomes available for search. By default, the Connector
Manager waits 5 minutes (300 seconds) before retraversing the
repository if no new content was found on the last traversal.
For example:

    traversal.poll.interval=900

### traversal.time.limit ###
The `traversal.time.limit` property defines the number of
seconds a traversal batch should run before gracefully exiting.
Traversals that exceed this time period risk cancellation.
The default time limit is 30 minutes (1800 seconds).
For example:

    traversal.time.limit=3600

### traversal.enabled ###
The `traversal.enabled` property is used to enable or disable
Traversals and Feeds for all connector instances in this
Connector Manager. Disabling Traversal is desirable when
configuring a Connector Manager deployment that only authorizes
search results. Traversals are enabled by default.

    traversal.enabled=false

Since: 2.6.0

## Miscellaneous Properties ##

### config.change.detect.interval ###
The `config.change.detect.interval` property specifies how often
(in seconds) to look for asynchronous configuration changes.
A value of 0 or less means never. For stand-alone deployments, a long
interval or never is probably sufficient. For clustered
deployments with a shared configuration store, 60 to 300 seconds
is probably sufficient. The default configuration change
detection interval is 0 (never).

    config.change.detect.interval=60

Since: 2.8.0