How and when do I use coshell for distributed builds?
What is coshell?
With coshell, nmake can distribute jobs to other machines in
your network, allowing you to use free CPU time on multiple machines
for distributed builds.
With this feature, nmake can take advantage of extra processors
within a homogeneous local area network by:
- establishing shell coprocesses to machines within the network
- sending jobs to these shell coprocesses
coshell is not supported on all platforms. If there is
no nmake_root/bin/coshell command in your nmake
distribution then coshell is not supported for your machine.
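The check described above can be sketched as a small shell function. The function name coshell_supported is a hypothetical helper, and the argument stands in for your actual nmake_root directory.

```shell
# coshell_supported is a hypothetical helper name, not part of nmake.
# Usage: coshell_supported <nmake_root>
# Succeeds (exit status 0) only if an executable bin/coshell exists
# under the given nmake root directory.
coshell_supported() {
    [ -x "$1/bin/coshell" ]
}
```

For example, `coshell_supported "$NMAKE_ROOT" || echo "coshell is not supported here"` reports an unsupported distribution, assuming NMAKE_ROOT holds the path to your nmake installation.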
Should I use coshell or -j?
When you need to decrease your build times,
always try the -jn (jobs) option first. The jobs
option tells nmake to execute n parallel jobs on the
same machine. This gives you parallel building without coshell's overhead
of scheduling and remotely executing jobs, not to mention the coshell setup
requirements. However, if your build machine is too busy for -j to
be effective, coshell can be used to distribute parallel jobs to other machines
in your local network.
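As a minimal sketch of trying the jobs option first, the wrapper below runs nmake with a chosen number of parallel jobs on the local machine. The function name build_local and the default of 4 jobs are illustrative assumptions, not part of nmake.

```shell
# build_local is a hypothetical helper name, not an nmake command.
# Usage: build_local [n]
# Runs nmake on the local machine with n parallel jobs using the -jn
# option. The default of 4 is an arbitrary illustrative value.
build_local() {
    njobs=${1:-4}
    nmake -j"$njobs"
}
```

For example, `build_local 8` runs `nmake -j8`.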
coshell Pre-setup Checklist
Before attempting to use coshell, complete the following items.
Note: "machines" refers to machines in your network which
will be used for distributed nmake jobs.
- The clocks on all machines must be in sync. (Out of sync clocks will
result in inconsistent date stamps causing unnecessary recompiles later.)
- Mount the nmake product directory and the build file systems
on all machines.
- ksh88i or later must be installed on all machines.
- Screen locking and screen saver tools must have low CPU usage.
(If they use too much CPU then those machines will be busy and jobs
will not be sent to them.)
- The same version of the OS and compiler tools must be available on all
machines. All machines should be identical in this regard. This
is the only way to guarantee that the compiled output from the machines is
consistent.
- coshell must be able to rlogin to client machines without
being prompted for a password.
- There can be no interactive prompts in the .profile
on the client machines for the user running coshell.
- nmake_root/lib/ssd must have the same owned as the
- Check Makefiles for ordering issues. Make sure directories and targets
are built in the correct order and "-" is used appropriately
to make sure certain prerequisites are built before their targets.
For more information see page 10-13 of the
nmake User's Guide.
The following items must be done to set up the coshell environment.
For more information see page 10-19 of the
nmake User's Guide.
- Run the nmake_root/bin/genshare and
nmake_root/bin/genlocal commands once on the host machine.
- Edit the generated file named local.
- Set up the shell environment:
- export COSHELL=coshell
- export NPROC=<number of concurrent jobs>
- export SHELL=ksh (88i version or later)
- export FPATH=nmake_root/fun:$FPATH
- export CS_MOUNT_LOCAL=<tmp directory accessible
by all machines>
- PATH must contain the location of the coshell
command (located in nmake_root/bin)
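The shell environment settings above can be collected into a fragment like the following, suitable for sourcing before a build. This is only a sketch: the NMAKE_ROOT variable, its /opt/nmake default, the NPROC value of 4, and the /shared/tmp directory are placeholder assumptions to replace with values for your site.

```shell
# NMAKE_ROOT is a hypothetical variable standing in for nmake_root;
# /opt/nmake is a placeholder path, not a known install location.
NMAKE_ROOT=${NMAKE_ROOT:-/opt/nmake}

export COSHELL=coshell
export NPROC=4                     # number of concurrent jobs (example value)
export SHELL=ksh                   # must be ksh88i or later
# Append the old FPATH only if it is non-empty, so FPATH never ends
# with a colon.
export FPATH="$NMAKE_ROOT/fun${FPATH:+:$FPATH}"
export CS_MOUNT_LOCAL=/shared/tmp  # placeholder: tmp directory accessible by all machines
export PATH="$NMAKE_ROOT/bin:$PATH"  # makes the coshell command reachable
```

The `${FPATH:+:$FPATH}` expansion is a design choice for this sketch: it keeps an initially empty FPATH from producing a trailing colon.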
Reading coshell Output
Using coshell means nmake jobs are being processed in parallel.
Unfortunately, the output from these jobs also comes in parallel, which
means the output from the concurrent jobs will be mixed together.
For example, the output from a particular compile may not follow directly
after its compile command line. This makes it difficult to read the build
output, especially when tracing errors, since it is hard to match
command line executions with the corresponding output and errors.
Release lu3.2 introduces a new
Output Serialization feature.
The job of this feature is to organize this mixed-up output on the fly,
thus making the build output from concurrent jobs much more meaningful.
Releases prior to lu3.2 do not have this capability, and there is no
known solution to this problem for earlier releases. See the
lu3.2 Release Notes for more information.
- Make sure your environment variables such as FPATH and LD_LIBRARY_PATH
do not end with a colon (:). The extra colon may cause
a "server already running" or
"cannot connect to server" message.
- If you need to pass environment variables from the host process
to the distributed jobs use the COEXPORT environment variable on
- Never put coshell -r in Makefiles.
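The trailing-colon tip above can be checked before starting a build with a sketch like the one below. The function name ends_with_colon is a hypothetical helper, and the loop covers only the two variables named in the tip; add others as needed.

```shell
# ends_with_colon is a hypothetical helper name.
# Succeeds (exit status 0) if its argument ends with a colon.
ends_with_colon() {
    case "$1" in
        *:) return 0 ;;
        *)  return 1 ;;
    esac
}

# Warn about each listed variable whose value ends with a colon.
# Unset variables are treated as empty and produce no warning.
for name in FPATH LD_LIBRARY_PATH; do
    eval "value=\${$name-}"
    if ends_with_colon "$value"; then
        echo "warning: $name ends with a colon" >&2
    fi
done
```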
For more complete information on coshell refer to the
coshell(1) man page and
Chapter 10 of the
nmake User's Guide, which
includes a sample coshell session.
Last Update: Friday, 12-Aug-2016 10:44:48 EDT