SReview

Installing SReview

This will install SReview in a way that is useful for a small conference; that is, one where you expect no more than a handful of talks.
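
The concrete steps depend on your platform. As a minimal sketch, assuming a Debian-based system and the package names used by SReview's Debian packaging (sreview-master, sreview-web, sreview-encoder, sreview-detect; verify these, the configuration path, and the service names against your distribution), a single-host setup could look roughly like this:

    # Install PostgreSQL and the SReview components on a single host
    # (package names are assumptions; adjust for your distribution):
    apt install postgresql sreview-master sreview-web sreview-encoder sreview-detect

    # SReview is configured through /etc/sreview/config.pm; at minimum,
    # point it at your database and your input and output directories,
    # then restart the services so they pick up the new settings
    # (the path and service names here are assumptions):
    $EDITOR /etc/sreview/config.pm
    systemctl restart sreview-web sreview-dispatch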

The above configuration should work, and will be sufficient for a small conference. The downside, however, is that only one backend process will be running at any given time. When a talk reaches the cutting state while no sreview-dispatch process is free, it may take a long time before one becomes available to handle it.

There are two ways to fix that:

Using multiple cores

In the default configuration, it is safe to run sreview-dispatch multiple times. Each instance will request one job, run it, and then request the next job.
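
For example, one dispatcher per CPU core could be started along these lines (a rough sketch; in practice you would normally let your init system or service manager supervise the processes):

    # Start one sreview-dispatch worker per CPU core; each worker picks
    # up one job, runs it, and then asks for the next one.
    for i in $(seq "$(nproc)"); do
        sreview-dispatch &
    done
    wait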

However, this does not allow prioritizing short-running jobs (like sreview-cut, which should never take more than a few minutes) over long-running ones (like sreview-transcode, which may easily take several hours). As a result, reviewers may have to wait a long time before the system produces the next cut for them to review, which is not ideal.

Using a distributed resource manager

A DRM like gridengine, SLURM, PBS, or Torque allows you to submit a job and have it run elsewhere. In such a configuration, you would configure SReview to submit jobs to the DRM system, and it would then be up to the DRM to decide where to run each job. For example, it could run high-I/O jobs (like sreview-cut) on the file server, where the files are directly available, while high-CPU jobs (like sreview-transcode) run on nodes with many CPU cores and reasonable network bandwidth, but not on the file server.
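
As an illustration, the commands that SReview runs for each state would each be wrapped in the DRM's submission command; whether that wrapping happens in SReview's configuration or in small wrapper scripts depends on your setup. A sketch using gridengine's qsub follows; the queue names match those created in the tutorial below, while the host name and the job arguments are placeholders:

    # Short, I/O-heavy cut job: submit to the high-priority queue and pin
    # it to the file server ("fileserver" is a placeholder host name;
    # "-b y" submits the command as a binary, "-V" exports the environment):
    qsub -b y -V -q hiprio.q -l hostname=fileserver sreview-cut "$TALK_ID"

    # Long, CPU-heavy transcode job: submit to the low-priority queue and
    # let gridengine pick any node with a free slot:
    qsub -b y -V -q lowprio.q sreview-transcode "$TALK_ID"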

Due to the increased flexibility in managing jobs that way, the author of SReview strongly recommends the use of a DRM system for most installations, even if SReview only runs on one system. However, because setting up a DRM system is a lot of work and can be fairly complicated, this is not the default mode of operation.

Since the author is most familiar with gridengine, a short tutorial on how to set up a gridengine-based system follows. Instructions for other DRM systems are welcome.
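
On a Debian-based system, such a setup could look roughly like the sketch below, using gridengine's qconf tool. The queue names, slot counts, and attribute details are assumptions to check against the gridengine documentation; in particular, the slot-wise subordinate_list form shown here depends on the gridengine version (see queue_conf(5)).

    # Install the gridengine master, execution daemon, and client tools:
    apt install gridengine-master gridengine-exec gridengine-client

    # Create the two queues; "qconf -aq" opens a queue template in an
    # editor where the relevant attributes are set.  With slot-wise
    # subordination on hiprio.q, at most <cores> jobs run unsuspended on
    # the host across both queues, and an incoming hiprio.q job suspends
    # the shortest-running ("sr") lowprio.q job instead of waiting.
    qconf -aq lowprio.q   # set: slots = <number of CPU cores>
    qconf -aq hiprio.q    # set: slots = <number of CPU cores>
                          #      subordinate_list = slots=<cores>(lowprio.q:0:sr)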

The above creates a gridengine environment with two queues, one called lowprio.q and one called hiprio.q. The system is configured such that gridengine will never allow more jobs to be running than you have CPU cores. However, if all CPUs are busy and a job is submitted in the hiprio.q queue, gridengine will send SIGSTOP to all processes started by the shortest-running job in the lowprio.q queue, and then allow the hiprio.q job to be started. Once the hiprio.q job finishes, the lowprio.q job will receive a SIGCONT and be allowed to continue.

To add more worker machines to the system, install gridengine-exec on all other machines in the network. Then, perform the following tasks on the master:
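
The details vary between gridengine versions, but the tasks typically amount to telling the master about the new host and adding it to both queues. A sketch using qconf, with the host name as a placeholder; depending on your gridengine packaging, you may also need to register the machine as an execution host (qconf -ae) or as a submit host (qconf -as):

    # Allow the new machine to talk to the master as an administrative
    # host (needed for its execution daemon to register itself):
    qconf -ah worker1.example.com

    # Add the new host to the host lists of both queues so that jobs can
    # be scheduled onto it:
    qconf -aattr queue hostlist worker1.example.com lowprio.q
    qconf -aattr queue hostlist worker1.example.com hiprio.q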

Help!

If you need more help, contact wouter: