DESCRIPTION

A server for converting documents from other formats to OOo (OpenOffice.org) format, generating PDF files, keeping unchangeable snapshots in PDF format, and getting and setting metadata. It is meant to be used together with the ERP5 Document business template, although it can be run standalone for any other client application.

The server has been tested with OOo 2.0.3, and everything works. Some files created in OOo 2.0.2 may cause a crash; this is a known issue. Files created in OOo 2.0.3 and MS Office normally go through with no problems.

The server returns a file in the requested format; HTML output is zipped because it often consists of more than one file.

INSTALLATION:

Patch your SimpleXMLRPCServer.py with this:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=893642&group_id=5470

Download the source code from svn. Create a writable tmp directory under your oood base dir.

If you want to run it remotely on a server without a display, use Xvfb (start it with Xvfb -ac :1, set the DISPLAY environment variable to :1, then run start.sh). To run smoothly it needs java-1.4.2-kaffe, which should be installed before installing the OOo rpms (otherwise you can probably activate it with command line switches).
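
The Xvfb steps above can also be scripted. The following is only a sketch: the helper name xvfb_setup is made up for this example, and the two-second pause and the location of start.sh are assumptions, so adjust them to your installation.

```python
import os


def xvfb_setup(display=":1", base_env=None):
    """Return the Xvfb command line and an environment with DISPLAY set.

    Hypothetical helper, not part of oood. It only prepares the values;
    nothing is started here.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display           # OOo will draw on the virtual display
    xvfb_cmd = ["Xvfb", "-ac", display]
    return xvfb_cmd, env


# Possible usage (assumes Xvfb is installed and start.sh is in the
# current directory):
#
#   import subprocess, time
#   cmd, env = xvfb_setup(":1")
#   xvfb = subprocess.Popen(cmd)
#   time.sleep(2)                      # assumed pause to let Xvfb come up
#   subprocess.call(["./start.sh"], env=env)
```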

SETUP:

Done in config.py. Important parts are:

- paths to programs

- loadtime - how long it takes for your machine to load an OOo instance. This matters for two reasons: when an instance is restarted, a script spawns a new one and then has to wait before trying to connect; and if another instance is started too soon (before the previous one has loaded completely), the two are merged into one.

- timeout - how long we wait for a call to an OOo instance to return. If the instance does not return within this time, we decide it has crashed; we kill it and start another one.
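
To illustrate what the timeout setting means, here is a minimal sketch of a call guarded by a timeout. It is not the code used in pool.py; call_with_timeout and slow_conversion are names invented for this example, and the real server also has to kill and restart the OOo process, which is left out here.

```python
import threading
import time


def call_with_timeout(func, timeout, *args):
    """Run func(*args) in a worker thread and wait at most `timeout` seconds.

    Returns (True, result) if the call finished in time and
    (False, None) if it did not. Hypothetical helper, not part of oood.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except Exception as exc:       # hand the error back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.daemon = True               # a stuck worker must not block exit
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # No answer in time: treat the instance as crashed. The worker
        # thread cannot be killed, it is simply left behind.
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["value"]


def slow_conversion(seconds):
    """Stand-in for a call to an OOo instance that takes `seconds` to return."""
    time.sleep(seconds)
    return "done"
```

With a timeout of 1 second, call_with_timeout(slow_conversion, 1.0, 0.01) reports success, while a call that sleeps longer than the timeout is reported as failed and the caller can go on to restart the instance.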

SETTING UP OOo INSTANCES

When run, the server starts a configured number of OOo instances as different users (user_0, user_1 etc), each having its own settings. By default, instances are run in "headless" mode (they don't produce visible windows). However, the first time an instance is run you have to go through the registration process, otherwise it won't start. The first time, you therefore have to start every instance in "normal" mode (by removing the "headless" parameter in start.py) and click through the registration dialogs. It is also highly recommended that you set up the Java path in the OOo options of every instance - this reduces the required loadtime from about half a minute to a few seconds.

USAGE:

Start your OOo instances by running start.py

Start the server by running serw.py; check the stdout and log.txt to see what is going on. If necessary, ctrl-c the server and rerun it.

    CAUTION: if one of the OOo instances has been restarted because of a timeout, it dies together with the server!
        so if you restart the server after there has been an OOo restart, rerun the start.py script.

TESTING:

Run test_worker.py to test the main tool.
Run test_server.py to test communication and xmlrpc interface.
Run test_timeout.py to check if the OOo gets properly restarted.

PORTABILITY

Not portable. Almost all the code can be run on any OS, except for the "rebuild" function in pool.py and the pid recording in start.py. These are used for killing and restarting an OOo instance; this is done in a somewhat unclean way, but it works. I don't know a better way to do that.

COMMON PROBLEMS

When there is a timeout and an OOo instance is restarted, the server may throw the exception "invalid literal for int" - this probably means that the output of the "ps" command is formatted differently on your system, and you have to experiment with the line
      pids=os.popen('ps -A -o pid,ppid | grep %d | cut -f 2 -d " "' % pid)
in pool.py.
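
One possible way around the formatting problem is to parse the "ps" output in Python instead of relying on "grep" and "cut". The sketch below assumes that the purpose of the line is to find the child processes of a given pid; child_pids is a name made up for this example and is not a function in pool.py.

```python
def child_pids(ps_output, parent_pid):
    """Return the pids whose parent is parent_pid.

    ps_output is the text printed by `ps -A -o pid,ppid`. Splitting on
    any whitespace makes the result independent of column padding, and
    comparing whole numbers avoids partial matches (1234 vs 12345).
    """
    children = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue                   # blank or unexpected line
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue                   # header line such as "PID PPID"
        pid, ppid = int(fields[0]), int(fields[1])
        if ppid == parent_pid:
            children.append(pid)
    return children


# Possible usage:
#
#   import os
#   output = os.popen('ps -A -o pid,ppid').read()
#   pids = child_pids(output, pid)
```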

If an OOo instance crashes, it is killed and restarted after a timeout; this is ok, but the xmlrpc server thread that called that instance remains, and so permanently locks one client thread. This is a known limitation of Python - threads cannot be killed or stopped from outside.

BG
