By default, work_queue_factory will run as many workers as the indicated masters have tasks ready to run. If there are multiple masters, then enough workers will be started to satisfy their collective needs. For example, if there are two masters with the same project name, each with 10 tasks to run, then work_queue_factory will start a total of 20 workers.
If the number of needed workers increases, work_queue_factory will submit more workers to meet the desired need. However, it will not run more than a fixed maximum number of workers, given by the -W option.
If the need for workers drops, work_queue_factory does not remove them immediately, but waits for them to exit on their own. (This happens when the worker has been idle for a certain time.) A minimum number of workers will be maintained, given by the -w option.
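The scaling behavior described so far can be pictured with a small sketch. This is only an illustration of the rules as stated in the text, not the actual implementation: the function names are hypothetical, and the defaults are the ones listed for -w, -W and --workers-per-cycle in the options.

```python
def target_workers(tasks_ready_per_master, min_workers=5, max_workers=100):
    """Desired worker count: the collective need of all masters,
    kept between the -w minimum and the -W maximum."""
    needed = sum(tasks_ready_per_master)
    return max(min_workers, min(needed, max_workers))


def workers_to_submit(target, running, per_cycle=5):
    """New workers to submit in one cycle.

    Never negative: surplus workers are not removed, they are left
    to exit on their own once they have been idle long enough.
    """
    shortfall = max(0, target - running)
    if per_cycle < 1:  # a value below 1 disables the per-cycle limit
        return shortfall
    return min(shortfall, per_cycle)


# Two masters with 10 ready tasks each, as in the example above.
target = target_workers([10, 10])
first_cycle = workers_to_submit(target, running=0)
```

Here target comes out as 20, and with the default per-cycle limit the first cycle would submit 5 of those workers.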
If given the -c option, then work_queue_factory will consider the capacity reported by each master. The capacity is the estimated number of workers that the master thinks it can handle, based on the task execution and data transfer times currently observed at the master. With the -c option on, work_queue_factory will consider the master's capacity to be the maximum number of workers to run.
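As a rough sketch of the effect of -c, the need of each master can be thought of as capped by the capacity it reports. The dictionary keys tasks_ready and capacity are hypothetical names chosen for this illustration, and applying the cap per master before summing is an assumption; the text does not say how capacities from several masters are combined.

```python
def needed_workers(masters, use_capacity=False):
    """Sum the workers needed by all masters.

    With use_capacity=True (the -c option), a master never counts for
    more workers than the capacity it reports. Assumption: the cap is
    applied to each master separately.
    """
    total = 0
    for master in masters:
        need = master["tasks_ready"]
        if use_capacity:
            need = min(need, master["capacity"])
        total += need
    return total


# Hypothetical status reports from two masters.
masters = [
    {"tasks_ready": 50, "capacity": 12},
    {"tasks_ready": 10, "capacity": 40},
]
without_c = needed_workers(masters)
with_c = needed_workers(masters, use_capacity=True)
```

In this example the first master has far more ready tasks than it estimates it can serve, so with -c its contribution drops from 50 to 12 and the total need from 60 to 22.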
If work_queue_factory receives a terminating signal, it will attempt to remove all running workers before exiting.
| -M,--master-name <project> | Name of a preferred project. A worker can have multiple preferred projects. |
| -T,--batch-type <type> | Batch system type: local, condor, sge, pbs, torque, blue_waters, slurm, moab, cluster, amazon, mesos. (default is local) |
| --catalog <catalog> | Set catalog server to <catalog>. Format: HOSTNAME:PORT |
| -B,--batch-options <options> | Add these options to all batch submit files. |
| -w,--min-workers <workers> | Minimum workers running. (default=5) |
| -W,--max-workers <workers> | Maximum workers running. (default=100) |
| --workers-per-cycle <workers> | Maximum number of new workers per 30 seconds. (less than 1 disables limit, default=5) |
| --autosize | Automatically size a worker to an available slot (Condor and Mesos). |
| -c,--capacity | Use worker capacity reported by masters. |
| -P,--password <file> | Password file for workers to authenticate to master. |
| -t,--timeout <time> | Abort after this amount of idle time. |
| -C,--config-file <file> | Use the configuration file <file>. |
| -E,--extra-options <options> | Extra options that should be added to the worker. |
| --condor-requirements <str> | Manually set requirements for the workers as condor jobs. May be specified several times, with the expressions and-ed together (Condor only). |
| -S,--scratch <file> | Scratch directory. (default is /tmp/${USER}-workers) |
| --factory-timeout <n> | Set factory timeout to <n> seconds. (off by default) The factory exits when no master has been seen for this many seconds. |
| -d,--debug <flag> | Enable debugging for this subsystem. |
| -o,--debug-file <file> | Write debugging output to this file. By default, debugging is sent to stderr (":stderr"). You may specify logs be sent to stdout (":stdout"), to the system syslog (":syslog"), or to the systemd journal (":journal"). |
| --wrapper <prefix> | Wrap all commands with this prefix. |
| --wrapper-input <file> | Add this file needed by the wrapper. |
| --mesos-master <hostname> | Specify the host name of the mesos master node (for use with -T mesos). |
| --mesos-path <filepath> | Specify path to mesos python library (for use with -T mesos). |
| --mesos-preload <library> | Specify the linking libraries for running mesos (for use with -T mesos). |
| -h,--help | Show this screen. |
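The way repeated --condor-requirements options are and-ed together can be sketched as follows. The function name is hypothetical and the code only mirrors the combined form documented for this option; it is not taken from the factory's source.

```python
def combine_condor_requirements(expressions):
    """Wrap each expression in parentheses and join them with &&,
    giving a single Condor requirements statement."""
    if not expressions:
        return ""  # nothing to require
    joined = " && ".join(f"({expr})" for expr in expressions)
    return f"requirements = ({joined})"


statement = combine_condor_requirements(
    ['MachineGroup == "disc"', "has_matlab == true"]
)
```

For the two expressions above, statement is requirements = ((MachineGroup == "disc") && (has_matlab == true)).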
work_queue_factory -T condor -M barney

To maintain a maximum of 100 workers on an SGE batch system, do this:

work_queue_factory -T sge -M barney -W 100

To start workers such that the workers exit after 5 minutes (300s) of idleness:

work_queue_factory -T condor -M barney -t 300

If you want to start workers that match any project that begins with barney, use a regular expression:

work_queue_factory -T condor -M barney.\* -t 300

If running on condor, you may manually specify condor requirements:

work_queue_factory -T condor -M barney --condor-requirements 'MachineGroup == "disc"' --condor-requirements 'has_matlab == true'

Repeated uses of condor-requirements are and-ed together. The previous example will produce a statement equivalent to:

requirements = ((MachineGroup == "disc") && (has_matlab == true))

Use the configuration file my_conf:

work_queue_factory -Cmy_conf

my_conf should be a proper JSON document, as:

{
  "master-name": "my_master.*",
  "max-workers": 100,
  "min-workers": 0
}

Valid configuration fields are:

master-name
foremen-name
min-workers
max-workers
workers-per-cycle
task-per-worker
timeout
worker-extra-options
condor-requirements
cores
memory
disk
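A configuration file like my_conf can also be generated by a script. The sketch below checks the field names against the list above before serializing; the helper name is hypothetical, and the check covers names only, since the accepted value types are not described here.

```python
import json

# Field names copied from the list of valid configuration fields.
VALID_FIELDS = {
    "master-name", "foremen-name", "min-workers", "max-workers",
    "workers-per-cycle", "task-per-worker", "timeout",
    "worker-extra-options", "condor-requirements",
    "cores", "memory", "disk",
}


def dump_factory_config(settings):
    """Return the settings as JSON text, rejecting unknown field names."""
    unknown = sorted(set(settings) - VALID_FIELDS)
    if unknown:
        raise ValueError(f"unknown configuration fields: {unknown}")
    return json.dumps(settings, indent=2)


config_text = dump_factory_config(
    {"master-name": "my_master.*", "max-workers": 100, "min-workers": 0}
)
```

Writing config_text to a file named my_conf would give a document equivalent to the example above, which can then be passed with -C.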