The Content Sync Module (CSM) of Director can be invoked and configured to create a list of HTTP or CIFS objects and folders for pre-population. After the initial scan, CSM can be used to rescan for updates to the content. Once the content list is created, it can be uploaded to Director, either on a schedule or manually. Director then pushes the list to each ProxySG in the Content Distribution list.
Installation requires a Windows or Linux workstation on which to install the CSM component. Managing the Director appliance requires a Java applet, which is downloadable from the Director appliance upon login.
The CSM operates by crawling a CIFS or HTTP server and tracking the time that each piece of content was last modified, then pushing the content list to the ProxySG appliances accordingly. It generates a list of files along with their last-modified times and keeps that list in a flat file rather than a true database. You can then push this list out to your SG network via Director. The appliances receive the list, delete the objects that should no longer be there, and download the other files.
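The rescan behaviour described above can be pictured as a comparison between two scan snapshots. The following is a minimal Python sketch, not CSM's actual implementation: the function name `diff_scans`, the dictionary layout (URL mapped to a last-modified string), and the sample URLs are all assumptions made for illustration.

```python
def diff_scans(previous, current):
    """Compare two scan snapshots (hypothetical layout: URL -> last-modified time).

    Returns (to_download, to_delete):
      to_download: URLs that are new or whose last-modified time changed
      to_delete:   URLs that were in the previous scan but are missing now
    """
    to_download = sorted(
        url for url, mtime in current.items()
        if previous.get(url) != mtime
    )
    to_delete = sorted(url for url in previous if url not in current)
    return to_download, to_delete


# Illustrative data only; these URLs and timestamps are made up.
previous_scan = {
    "http://example.com/a.html": "2012-02-01T10:00:00",
    "http://example.com/b.html": "2012-02-01T10:00:00",
    "http://example.com/old.html": "2012-01-15T08:30:00",
}
current_scan = {
    "http://example.com/a.html": "2012-02-01T10:00:00",  # unchanged
    "http://example.com/b.html": "2012-02-07T09:15:00",  # modified
    "http://example.com/c.html": "2012-02-07T09:20:00",  # new
}

to_download, to_delete = diff_scans(previous_scan, current_scan)
print("download:", to_download)
print("delete:", to_delete)
```

Keeping each snapshot as a simple URL-to-timestamp mapping mirrors the flat-file approach the article describes: no database is needed to decide what changed between two scans.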
Note: The Content Sync Module does not ship with Director, but is available as a separately downloadable module. This article assumes you have a working Director appliance and are able to log in to it using the Director Management Console (DMC) Java application. For details on how to install the CSM, see 000010802.
To create a job that crawls a website for content, follow these steps:
- Start CSM on your workstation, click File, and select New Job. You can give the job any name you want; this example uses the default name.
- Click the Schedule box, select the Recurring tab, and specify the Day of the Week and Time.
- Click Add > and confirm that your schedule appears under Result Schedule. (Alternatively, you can select the One Time tab for scheduling purposes.)
- Under the new job you just created, select Scan, then select the Crawl URLS radio button.
- The URL needs to be preceded by http:// as the example below shows.
- Select the radio buttons according to your preferences.
- Select Blue Coat Director under Jobs and enter the Director IP address and administrator credentials.
- You can specify the protocol connection method as either Telnet or SSHv2. (The CSM software uses the Expect software you installed previously to connect to your Director.)
- Select the Synchronize item under Jobs and click Enable Synchronization. Note: Synchronization allows the CSM software to update ProxySGs that hold content that has since changed.
- TIP: If you do not select this option, the CSM software will only download the content to your local host, and will NOT contact the Director appliance.
- Select "Synchronize all Devices and Groups": The default is to synchronize all devices and groups associated with Director.
- NOTE: The output for the scan is placed in the C:/Program Files/Expect-5.21/bin/Output folder. This folder is created only when you run your first scan.
- The scan may take up to 20 minutes to complete, depending on the website you choose.
- On a successful scan, the output log will look something like this, except for the enable password errors.
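The requirement in the steps above that the crawl URL be preceded by http:// can be checked before you create a job. The following is a minimal sketch using Python's standard `urllib.parse`; the function name `has_http_prefix` is hypothetical, and because this article does not say whether CSM accepts other schemes such as https, the sketch accepts only http.

```python
from urllib.parse import urlparse


def has_http_prefix(url):
    """Return True if the URL uses the http scheme and names a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme == "http" and bool(parsed.netloc)


# Illustrative candidates only; none of these come from the article.
for candidate in ("http://www.example.com/", "www.example.com", "ftp://example.com/"):
    print(candidate, has_http_prefix(candidate))
```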
Points to note:
The latest CSM log keeps track of the latest version of each object on the chosen website. In other words, the CSM does not download, cache, and push actual content out to the SG; it merely keeps track of what content there is to cache and hands that list off to each SG, so that the SG can download and cache the objects. The list is displayed in the tracking window when a job first starts to run, and a progress report is provided for every 500 objects scanned or crawled. The log is kept in the data directory where you installed CSM, identified by a timestamp. This log is in non-verbose mode and is called CsmGuiJobs.txt. You can change the default to verbose mode by using Tools > Options > Verbose Mode in the main window.
The CSM configuration file contains all the settings for one job. Each new job has its own configuration file (for example, csm001.cfg, csm002.cfg, and so on), located in C:/Program Files/Expect-5.21/bin/data. The first CSM configuration file for the job you create is titled csm.cfg. Each time the job is run, the csmXXX.cfg file is output in the data directory with a timestamp, so you can see what changes you made in each run of the job.
The CSM configuration file, called CSM001.cfg, is kept in the same folder and should not be edited directly. Most of the settings can be changed through the Management Console standard windows; a few can be changed only through the Advanced window of the Management Console. (These few settings generally do not need to be changed; the defaults are usually satisfactory.)
The recommended platform for the Content Sync Module and Expect is Microsoft Windows XP with Service Pack 3 installed. There are known problems when this software is installed on Windows 7 and on 64-bit Windows 2003 servers.
Frequently asked questions:
1: When I create a CIFS crawl job, what is the correct entry for the "Corresponding URL" box? If this option is left blank the job does not run, so what must I place here?
The Corresponding URL is used only when you are scanning directories. Because the Director appliance/SG network can distribute only URLs, we need to send out URLs. Each URL uses this syntax: "file://<SG IP address>
Here is sample output from a CSM job pulling files from the default 'Sample Pictures' directory on a Windows workstation.
Using username "admin".
Last login: Wed Feb 8 05:02:50 2012 from 10.125.48.32
Copyright (c) 1997-2010, BlueCoat Systems, Inc.
Welcome to SG-ME 184.108.40.206 #65441 2011.05.03-034023
DIrector # cli help disable
DIrector # line-vty length 0
DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/Winter.jpg" all
Command ID: 1328677458899394
DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/Water%20lilies.jpg" all
Command ID: 1328677459240196
DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/desktop.ini" all
Command ID: 1328677459538524
DIrector # content distribute url "file://10.125.0.51/Desktop.ini" all
Command ID: 1328677459845087
DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/Sunset.jpg" all
Command ID: 1328677460135670
DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/Thumbs.db" all
Command ID: 1328677460342931
DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/Blue%20hills.jpg" all
Command ID: 1328677460550356
DIrector # exit
Blue Coat Systems CSM/SG-ME 220.127.116.11 #32468 2008.01.30-083843 ended: Wed Feb 08 10:23:26 India Standard Time 2012
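When reviewing a transcript like the one above, it can help to pair each distributed URL with the Command ID that Director returned for it. The following is a minimal Python sketch under stated assumptions: the two regular expressions are derived only from the sample output in this article, the function name is hypothetical, and other Director versions may format these lines differently.

```python
import re

# Patterns for the two line types in the sample transcript (assumed format).
DISTRIBUTE_RE = re.compile(r'content distribute url "([^"]+)"')
COMMAND_ID_RE = re.compile(r'Command ID:\s*(\d+)')


def pair_urls_with_command_ids(transcript):
    """Return a list of (url, command_id) pairs found in a transcript string.

    Each URL is paired with the first 'Command ID' line that follows it.
    A URL with no following Command ID is paired with None.
    """
    pairs = []
    pending_url = None
    for line in transcript.splitlines():
        match = DISTRIBUTE_RE.search(line)
        if match:
            if pending_url is not None:
                pairs.append((pending_url, None))
            pending_url = match.group(1)
            continue
        match = COMMAND_ID_RE.search(line)
        if match and pending_url is not None:
            pairs.append((pending_url, match.group(1)))
            pending_url = None
    if pending_url is not None:
        pairs.append((pending_url, None))
    return pairs


# Lines copied from the sample output above.
sample = '''DIrector # content distribute url "file://10.125.0.51/Sample%20Pictures/Winter.jpg" all
Command ID: 1328677458899394
DIrector # content distribute url "file://10.125.0.51/Desktop.ini" all
Command ID: 1328677459845087
DIrector # exit'''

for url, command_id in pair_urls_with_command_ids(sample):
    print(command_id, url)
```

A URL that comes back paired with `None` is one for which no Command ID line was seen, which may be worth checking by hand.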
2: Why do we see URLs like this? "file://10.125.0.51/<file:///\\10.125.0.51\>"
This is caused by the Outlook HTML format, which automatically identifies such strings and converts them to hyperlinks. When the message is viewed in plain-text mode, both the text and the link are displayed in that form.
For more clarity here is an example:
If you have a directory (C:\MyDir\) with the contents listed below:
And you scan that directory (C:\MyDir\) using CSM with the Corresponding URL set to “file://testserver/”, then CSM generates and distributes the URLs below:
In other words, the directory you are scanning is replaced by the Corresponding URL.
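The substitution described above can be sketched in a few lines of Python. This is an illustration, not CSM's own code: the function name `to_distribution_url` and the file names in the usage example are hypothetical, and the percent-encoding of spaces as %20 is inferred from the sample output earlier in this article.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def to_distribution_url(scanned_dir, file_path, corresponding_url):
    """Replace the scanned directory prefix with the Corresponding URL."""
    # Path of the file relative to the scanned directory, e.g. Sub\File.jpg
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(scanned_dir))
    # Percent-encode each path segment (space -> %20), then join with "/".
    encoded = "/".join(quote(part) for part in relative.parts)
    return corresponding_url.rstrip("/") + "/" + encoded


# Hypothetical files under C:\MyDir\ (names chosen for illustration only).
files = [
    r"C:\MyDir\Winter.jpg",
    r"C:\MyDir\Sample Pictures\Water lilies.jpg",
]
for path in files:
    print(to_distribution_url("C:\\MyDir\\", path, "file://testserver/"))
```

`PureWindowsPath` is used so that the sketch handles backslash-separated paths the same way on any operating system.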
3: Does the Content Sync Module ( CSM) application create jobs on the Director appliance?
No, the CSM does not create a job on the Director appliance. A job runs only when triggered by the CSM application. Each time it runs, it uses Director CLI commands to execute the tasks on the Director appliance.
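The command sequence that CSM sends, as seen in the sample output earlier in this article, can be sketched as a simple list builder. This is a minimal Python illustration under stated assumptions: the setup lines and the `content distribute url "<URL>" all` syntax are copied from that sample output, while the function name and the `target` parameter are hypothetical and do not reflect a documented CSM or Director API.

```python
def build_director_session(urls, target="all"):
    """Build the CLI lines seen in the sample transcript, in order.

    The setup lines and the 'content distribute url' syntax come from the
    sample output in this article; the function itself is an illustration.
    """
    lines = ["cli help disable", "line-vty length 0"]
    for url in urls:
        # The URL is wrapped in double quotes, so it must not contain one.
        if '"' in url:
            raise ValueError(f"URL must not contain a double quote: {url!r}")
        lines.append(f'content distribute url "{url}" {target}')
    lines.append("exit")
    return lines


# URL copied from the sample output in this article.
session = build_director_session(
    ["file://10.125.0.51/Sample%20Pictures/Winter.jpg"]
)
print("\n".join(session))
```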
You can also use the Query option provided in the CSM to check the caching status of the URLs that you have distributed. Here is an example screenshot:
For details on how to create a job to scan a CIFS server, see 000010235
For a definition of what it means to crawl a webserver, see WIKI site.
For details on a known problem with CSM and timezone changes, see 000008918
For a list of ProxySG version compatibility with Director SGME 18.104.22.168, see 000013458
For details on what problems you may face launching the Director Management Console Java application, see 000016900
For details on helpful Director command-line syntax, see 000014637