cctools
work_queue.WorkQueue Class Reference

Python Work Queue object. More...

Inheritance diagram for work_queue.WorkQueue:

Public Member Functions

def __init__
 Create a new work queue. More...
 
def name
 Get the project name of the queue. More...
 
def port
 Get the listening port of the queue. More...
 
def stats
 Get queue statistics. More...
 
def stats_hierarchy
 Get worker hierarchy statistics. More...
 
def stats_category
 Get the task statistics for the given category. More...
 
def specify_category_mode
 Turn on or off first-allocation labeling for a given category. More...
 
def specify_category_autolabel_resource
 Turn on or off first-allocation labeling for a given category and resource. More...
 
def task_state
 Get current task state. More...
 
def enable_monitoring
 Enables resource monitoring of tasks in the queue, and writes a summary per task to the directory given. More...
 
def enable_monitoring_full
 As enable_monitoring, but it also generates a time series and a debug file. More...
 
def activate_fast_abort
 Turn on or off fast abort functionality for a given queue, for tasks in the "default" category and for tasks whose category does not set an explicit multiplier. More...
 
def activate_fast_abort_category
 Turn on or off fast abort functionality for a given queue. More...
 
def empty
 Determine whether there are any known tasks queued, running, or waiting to be collected. More...
 
def hungry
 Determine whether the queue can support more tasks. More...
 
def specify_algorithm
 Set the worker selection algorithm for queue. More...
 
def specify_task_order
 Set the order for dispatching submitted tasks in the queue. More...
 
def specify_name
 Change the project name for the given queue. More...
 
def specify_master_preferred_connection
 Set the preference for using hostname over IP address to connect. More...
 
def specify_min_taskid
 Set the minimum taskid of future submitted tasks. More...
 
def specify_priority
 Change the project priority for the given queue. More...
 
def specify_num_tasks_left
 Specify the number of tasks not yet submitted to the queue. More...
 
def specify_master_mode
 Specify the master mode for the given queue. More...
 
def specify_catalog_server
 Specify the catalog server the master should report to. More...
 
def specify_log
 Specify a log file that records the cumulative stats of connected workers and submitted tasks. More...
 
def specify_transactions_log
 Specify a log file that records the states of tasks. More...
 
def specify_password
 Add a mandatory password that each worker must present. More...
 
def specify_password_file
 Add a mandatory password file that each worker must present. More...
 
def specify_max_resources
 Specifies the maximum resources allowed for the default category. More...
 
def specify_category_max_resources
 Specifies the maximum resources allowed for the given category. More...
 
def specify_category_first_allocation_guess
 Specifies the first-allocation guess for the given category. More...
 
def initialize_categories
 Initialize first value of categories. More...
 
def cancel_by_taskid
 Cancel task identified by its taskid and remove from the given queue. More...
 
def cancel_by_tasktag
 Cancel task identified by its tag and remove from the given queue. More...
 
def shutdown_workers
 Shutdown workers connected to queue. More...
 
def blacklist
 Blacklist workers running on host. More...
 
def blacklist_with_timeout
 Blacklist workers running on host for the duration of the given timeout. More...
 
def blacklist_clear
 Remove host from blacklist. More...
 
def specify_keepalive_interval
 Change keepalive interval for a given queue. More...
 
def specify_keepalive_timeout
 Change keepalive timeout for a given queue. More...
 
def estimate_capacity
 Turn on master capacity measurements. More...
 
def tune
 Tune advanced parameters for work queue. More...
 
def submit
 Submit a task to the queue. More...
 
def wait
 Wait for tasks to complete. More...
 

Detailed Description

Python Work Queue object.

This class uses a dictionary to map between the task pointer objects and the work_queue.Task.

Constructor & Destructor Documentation

def work_queue.WorkQueue.__init__ (   self,
  port = WORK_QUEUE_DEFAULT_PORT,
  name = None,
  catalog = False,
  exclusive = True,
  shutdown = False 
)

Create a new work queue.

Parameters
self: Reference to the current work queue object.
port: The port number to listen on. If zero is specified, then the default is chosen, and if -1 is specified, a random port is chosen.
name: The project name to use.
catalog: Whether or not to enable catalog mode.
exclusive: Whether or not the workers should be exclusive.
shutdown: Automatically shutdown workers when queue is finished. Disabled by default.
See Also
work_queue_create - For more information about environment variables that affect the behavior of this method.

References work_queue.WorkQueue.__free_queue(), work_queue.WorkQueue._shutdown, work_queue.WorkQueue._stats, work_queue.WorkQueue._stats_hierarchy, work_queue.WorkQueue._task_table, work_queue.WorkQueue._work_queue, work_queue.WorkQueue.shutdown_workers(), work_queue_create(), work_queue_delete(), work_queue_specify_master_mode(), and work_queue_specify_name().
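
For example, a queue with a fixed port and a project name might be created as follows (a minimal sketch; the port number and project name here are arbitrary):

>>> from work_queue import WorkQueue
>>> q = WorkQueue(port=9123, name="myproject", catalog=True)
>>> print "listening on port %d" % q.port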

Member Function Documentation

def work_queue.WorkQueue.name (   self)

Get the project name of the queue.

Note: This is defined using the property decorator, so it must be called without parentheses (). For example:

>>> print q.name

References work_queue.WorkQueue._work_queue, and work_queue_name().

def work_queue.WorkQueue.port (   self)

Get the listening port of the queue.

Note: This is defined using the property decorator, so it must be called without parentheses (). For example:

>>> print q.port

References work_queue.WorkQueue._work_queue, and work_queue_port().

def work_queue.WorkQueue.stats (   self)

Get queue statistics.

Note: This is defined using the property decorator, so it must be called without parentheses (). For example:

>>> print q.stats

The fields in stats can also be individually accessed through this call. For example:

>>> print q.stats.workers_busy

References work_queue.WorkQueue._stats, work_queue.WorkQueue._work_queue, and work_queue_get_stats().

def work_queue.WorkQueue.stats_hierarchy (   self)

Get worker hierarchy statistics.

Note: This is defined using the property decorator, so it must be called without parentheses (). For example:

>>> print q.stats_hierarchy

The fields in stats_hierarchy can also be individually accessed through this call. For example:

>>> print q.stats_hierarchy.workers_busy

References work_queue.WorkQueue._stats_hierarchy, work_queue.WorkQueue._work_queue, and work_queue_get_stats_hierarchy().

def work_queue.WorkQueue.stats_category (   self,
  category 
)

Get the task statistics for the given category.

Parameters
self: Reference to the current work queue object.
category: A category name. For example:

>>> s = q.stats_category("my_category")
>>> print s

The fields in work_queue_stats can also be individually accessed through this call. For example:

>>> print s.tasks_waiting

References work_queue.WorkQueue._work_queue, and work_queue_get_stats_category().

def work_queue.WorkQueue.specify_category_mode (   self,
  category,
  mode 
)

Turn on or off first-allocation labeling for a given category.

By default, only cores, memory, and disk are labeled. Turn on/off other specific resources with specify_category_autolabel_resource. NOTE: autolabeling is only meaningful when task monitoring is enabled (enable_monitoring). When monitoring is enabled and a task exhausts resources in a worker, mode dictates how work queue handles the exhaustion:

Parameters
self: Reference to the current work queue object.
category: A category name. If None, sets the mode by default for newly created categories.
mode: One of category_mode_t:
  • WORK_QUEUE_ALLOCATION_MODE_FIXED Task fails (default).
  • WORK_QUEUE_ALLOCATION_MODE_MAX If maximum values are specified for cores, memory, or disk (e.g. via specify_category_max_resources or specify_memory), and one of those resources is exceeded, the task fails. Otherwise it is retried until a large enough worker connects to the master, using the maximum values specified, and the maximum values so far seen for resources not specified. Use specify_max_retries to set a limit on the number of times work queue attempts to complete the task.
  • WORK_QUEUE_ALLOCATION_MODE_MIN_WASTE As above, but work queue tries allocations to minimize resource waste.
  • WORK_QUEUE_ALLOCATION_MODE_MAX_THROUGHPUT As above, but work queue tries allocations to maximize throughput.

References work_queue.WorkQueue._work_queue, and work_queue_specify_category_mode().
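
As a sketch of how these pieces fit together (assuming monitoring is enabled first, as noted above; the category name "my_category" is purely an example):

>>> q.enable_monitoring()
>>> q.specify_category_mode("my_category", WORK_QUEUE_ALLOCATION_MODE_MAX)
>>> q.specify_category_max_resources("my_category", {'cores': 8, 'memory': 1024, 'disk': 10240})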

def work_queue.WorkQueue.specify_category_autolabel_resource (   self,
  category,
  resource,
  autolabel 
)

Turn on or off first-allocation labeling for a given category and resource.

This function should be used to fine-tune the defaults from specify_category_mode.

Parameters
self: Reference to the current work queue object.
category: A category name.
resource: A resource name.
autolabel: True/False for on/off.
Returns
1 if resource is valid, 0 otherwise.

References work_queue.WorkQueue._work_queue, and work_queue_enable_category_resource().

def work_queue.WorkQueue.task_state (   self,
  taskid 
)

Get current task state.

See work_queue_task_state_t for possible values.

>>> print q.task_state(taskid)

References work_queue.WorkQueue._work_queue, and work_queue_task_state().

def work_queue.WorkQueue.enable_monitoring (   self,
  dirname = None,
  watchdog = True 
)

Enables resource monitoring of tasks in the queue, and writes a summary per task to the directory given.

Additionally, all summaries are consolidated into the file all_summaries-PID.log

Returns 1 on success, 0 on failure (i.e., monitoring was not enabled).

Parameters
self: Reference to the current work queue object.
dirname: Directory name for the monitor output.
watchdog: If True (default), kill tasks that exhaust their declared resources.

References work_queue.WorkQueue._work_queue, and work_queue_enable_monitoring().
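
For example (the directory name "monitor_logs" is arbitrary):

>>> q.enable_monitoring("monitor_logs")
>>> # Or record summaries without killing tasks that exceed their declared resources:
>>> q.enable_monitoring("monitor_logs", watchdog=False)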

def work_queue.WorkQueue.enable_monitoring_full (   self,
  dirname = None,
  watchdog = True 
)

As enable_monitoring, but it also generates a time series and a debug file.

WARNING: Such files may reach gigabyte sizes for long running tasks.

Returns 1 on success, 0 on failure (i.e., monitoring was not enabled).

Parameters
self: Reference to the current work queue object.
dirname: Directory name for the monitor output.
watchdog: If True (default), kill tasks that exhaust their declared resources.

References work_queue.WorkQueue._work_queue, and work_queue_enable_monitoring_full().

def work_queue.WorkQueue.activate_fast_abort (   self,
  multiplier 
)

Turn on or off fast abort functionality for a given queue, for tasks in the "default" category and for tasks whose category does not set an explicit multiplier.

Parameters
self: Reference to the current work queue object.
multiplier: The multiplier of the average task time at which point to abort; if negative (the default), fast_abort is deactivated.

References work_queue.WorkQueue._work_queue, and work_queue_activate_fast_abort().
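
For example, a short sketch that aborts default-category tasks taking longer than three times the average task time, and then turns the feature off again:

>>> q.activate_fast_abort(3)
>>> q.activate_fast_abort(-1)   # deactivate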

def work_queue.WorkQueue.activate_fast_abort_category (   self,
  name,
  multiplier 
)

Turn on or off fast abort functionality for a given queue.

Parameters
self: Reference to the current work queue object.
name: Name of the category.
multiplier: The multiplier of the average task time at which point to abort; if zero, deactivate for the category; if negative (the default), use the multiplier of the "default" category (see activate_fast_abort).

References work_queue.WorkQueue._work_queue, and work_queue_activate_fast_abort_category().

def work_queue.WorkQueue.empty (   self)

Determine whether there are any known tasks queued, running, or waiting to be collected.

Returns 0 if there are tasks remaining in the system, 1 if the system is "empty".

Parameters
self: Reference to the current work queue object.

References work_queue.WorkQueue._work_queue, and work_queue_empty().

def work_queue.WorkQueue.hungry (   self)

Determine whether the queue can support more tasks.

Returns the number of additional tasks it can support if "hungry" and 0 if "sated".

Parameters
self: Reference to the current work queue object.

References work_queue.WorkQueue._work_queue, and work_queue_hungry().
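
A common submission pattern is to keep feeding the queue while it reports itself hungry; the sketch below assumes a caller-defined list of work_queue.Task objects named pending_tasks:

# Submit work only while the queue can usefully absorb more tasks.
while pending_tasks:
    if q.hungry():
        q.submit(pending_tasks.pop())
    else:
        q.wait(5)   # let some tasks finish before submitting more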

def work_queue.WorkQueue.specify_algorithm (   self,
  algorithm 
)

Set the worker selection algorithm for queue.

Parameters
self: Reference to the current work queue object.
algorithm: One of the following algorithms to use in assigning a task to a worker. See work_queue_schedule_t for possible values.

References work_queue.WorkQueue._work_queue, and work_queue_specify_algorithm().
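
For example, assuming the scheduling constants from work_queue_schedule_t (such as WORK_QUEUE_SCHEDULE_FILES) are exposed by the work_queue module:

>>> q.specify_algorithm(WORK_QUEUE_SCHEDULE_FILES)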

def work_queue.WorkQueue.specify_task_order (   self,
  order 
)

Set the order for dispatching submitted tasks in the queue.

Parameters
self: Reference to the current work queue object.
order: One of the following algorithms to use in dispatching submitted tasks to workers:

References work_queue.WorkQueue._work_queue, and work_queue_specify_task_order().

def work_queue.WorkQueue.specify_name (   self,
  name 
)

Change the project name for the given queue.

Parameters
self: Reference to the current work queue object.
name: The new project name.

References work_queue.WorkQueue._work_queue, and work_queue_specify_name().

def work_queue.WorkQueue.specify_master_preferred_connection (   self,
  mode 
)

Set the preference for using hostname over IP address to connect.

Use 'by_ip' to connect by IP address (standard behavior), or 'by_hostname' to use the hostname at the master.

Parameters
self: Reference to the current work queue object.
mode: A string indicating the preference, either 'by_ip' or 'by_hostname'.

References work_queue.WorkQueue._work_queue, and work_queue_master_preferred_connection().

def work_queue.WorkQueue.specify_min_taskid (   self,
  minid 
)

Set the minimum taskid of future submitted tasks.

Further submitted tasks are guaranteed to have a taskid larger or equal to minid. This function is useful to make taskids consistent in a workflow that consists of sequential masters. (Note: This function is rarely used). If the minimum id provided is smaller than the last taskid computed, the minimum id provided is ignored.

Parameters
self: Reference to the current work queue object.
minid: Minimum desired taskid.
Returns
Returns the actual minimum taskid for future tasks.

References work_queue.WorkQueue._work_queue, and work_queue_specify_min_taskid().

def work_queue.WorkQueue.specify_priority (   self,
  priority 
)

Change the project priority for the given queue.

Parameters
self: Reference to the current work queue object.
priority: An integer that represents the priority of this work queue master. The higher the value, the higher the priority.

References work_queue.WorkQueue._work_queue, and work_queue_specify_priority().

def work_queue.WorkQueue.specify_num_tasks_left (   self,
  ntasks 
)

Specify the number of tasks not yet submitted to the queue.

It is used by work_queue_factory to determine the number of workers to launch. If not specified, it defaults to 0. work_queue_factory considers the number of tasks as: num tasks left + num tasks running + num tasks read.

Parameters
self: Reference to the current work queue object.
ntasks: Number of tasks yet to be submitted.

References work_queue.WorkQueue._work_queue, and work_queue_specify_num_tasks_left().

def work_queue.WorkQueue.specify_master_mode (   self,
  mode 
)

Specify the master mode for the given queue.

Parameters
self: Reference to the current work queue object.
mode: This may be one of the following values: WORK_QUEUE_MASTER_MODE_STANDALONE or WORK_QUEUE_MASTER_MODE_CATALOG.

References work_queue.WorkQueue._work_queue, and work_queue_specify_master_mode().

def work_queue.WorkQueue.specify_catalog_server (   self,
  hostname,
  port 
)

Specify the catalog server the master should report to.

Parameters
self: Reference to the current work queue object.
hostname: The hostname of the catalog server.
port: The port the catalog server is listening on.

References work_queue.WorkQueue._work_queue, and work_queue_specify_catalog_server().
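
A sketch of advertising a queue to a catalog server, using WORK_QUEUE_MASTER_MODE_CATALOG as listed under specify_master_mode (the hostname, port, and project name here are illustrative):

>>> q.specify_master_mode(WORK_QUEUE_MASTER_MODE_CATALOG)
>>> q.specify_name("myproject")
>>> q.specify_catalog_server("catalog.example.org", 9097)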

def work_queue.WorkQueue.specify_log (   self,
  logfile 
)

Specify a log file that records the cumulative stats of connected workers and submitted tasks.

Parameters
self: Reference to the current work queue object.
logfile: Filename.

References work_queue.WorkQueue._work_queue, and work_queue_specify_log().

def work_queue.WorkQueue.specify_transactions_log (   self,
  logfile 
)

Specify a log file that records the states of tasks.

Parameters
self: Reference to the current work queue object.
logfile: Filename.

References work_queue.WorkQueue._work_queue, and work_queue_specify_transactions_log().
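
For example (the log file names are arbitrary):

>>> q.specify_log("wq_stats.log")
>>> q.specify_transactions_log("wq_transactions.log")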

def work_queue.WorkQueue.specify_password (   self,
  password 
)

Add a mandatory password that each worker must present.

Parameters
self: Reference to the current work queue object.
password: The password.

References work_queue.WorkQueue._work_queue, and work_queue_specify_password().

def work_queue.WorkQueue.specify_password_file (   self,
  file 
)

Add a mandatory password file that each worker must present.

Parameters
self: Reference to the current work queue object.
file: Name of the file containing the password.

References work_queue.WorkQueue._work_queue, and work_queue_specify_password_file().

def work_queue.WorkQueue.specify_max_resources (   self,
  rmd 
)

Specifies the maximum resources allowed for the default category.

Parameters
self: Reference to the current work queue object.
rmd: Dictionary indicating maximum values. For example:

>>> # A maximum of 4 cores is found on any worker:
>>> q.specify_max_resources({'cores': 4})
>>> # A maximum of 8 cores, 1GB of memory, and 10GB of disk are found on any worker:
>>> q.specify_max_resources({'cores': 8, 'memory': 1024, 'disk': 10240})

References work_queue.WorkQueue._work_queue, and work_queue_specify_max_resources().

def work_queue.WorkQueue.specify_category_max_resources (   self,
  category,
  rmd 
)

Specifies the maximum resources allowed for the given category.

Parameters
self: Reference to the current work queue object.
category: Name of the category.
rmd: Dictionary indicating maximum values. For example:

>>> # A maximum of 4 cores may be used by a task in the category:
>>> q.specify_category_max_resources("my_category", {'cores': 4})
>>> # A maximum of 8 cores, 1GB of memory, and 10GB of disk may be used by a task:
>>> q.specify_category_max_resources("my_category", {'cores': 8, 'memory': 1024, 'disk': 10240})

References work_queue.WorkQueue._work_queue, and work_queue_specify_category_max_resources().

def work_queue.WorkQueue.specify_category_first_allocation_guess (   self,
  category,
  rmd 
)

Specifies the first-allocation guess for the given category.

Parameters
self: Reference to the current work queue object.
category: Name of the category.
rmd: Dictionary indicating the first-allocation guess. For example:

>>> # A first-allocation guess of 4 cores for tasks in the category:
>>> q.specify_category_first_allocation_guess("my_category", {'cores': 4})
>>> # A first-allocation guess of 8 cores, 1GB of memory, and 10GB of disk:
>>> q.specify_category_first_allocation_guess("my_category", {'cores': 8, 'memory': 1024, 'disk': 10240})

References work_queue.WorkQueue._work_queue, and work_queue_specify_category_first_allocation_guess().

def work_queue.WorkQueue.initialize_categories (   filename,
  rm 
)

Initialize first value of categories.

Parameters
self: Reference to the current work queue object.
filename: JSON file with resource summaries.
rm: Dictionary indicating maximum values.

References work_queue.WorkQueue._work_queue, and work_queue_initialize_categories().

def work_queue.WorkQueue.cancel_by_taskid (   self,
  id 
)

Cancel task identified by its taskid and remove from the given queue.

Parameters
self: Reference to the current work queue object.
id: The taskid returned from submit.

References work_queue.WorkQueue._work_queue, and work_queue_cancel_by_taskid().
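
For example, assuming t is a work_queue.Task that was previously submitted:

>>> taskid = q.submit(t)
>>> q.cancel_by_taskid(taskid)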

def work_queue.WorkQueue.cancel_by_tasktag (   self,
  tag 
)

Cancel task identified by its tag and remove from the given queue.

Parameters
self: Reference to the current work queue object.
tag: The tag assigned to task using work_queue_task_specify_tag.

References work_queue.WorkQueue._work_queue, and work_queue_cancel_by_tasktag().

def work_queue.WorkQueue.shutdown_workers (   self,
  n 
)

Shutdown workers connected to queue.

Gives a best effort and then returns the number of workers given the shutdown order.

Parameters
self: Reference to the current work queue object.
n: The number to shutdown. To shut down all workers, specify "0".

References work_queue.WorkQueue._work_queue, and work_queue_shut_down_workers().

Referenced by work_queue.WorkQueue.__init__().

def work_queue.WorkQueue.blacklist (   self,
  host 
)

Blacklist workers running on host.

Parameters
self: Reference to the current work queue object.
host: The hostname of the host running the workers.

References work_queue.WorkQueue._work_queue, and work_queue_blacklist_add().

def work_queue.WorkQueue.blacklist_with_timeout (   self,
  host,
  timeout 
)

Blacklist workers running on host for the duration of the given timeout.

Parameters
self: Reference to the current work queue object.
host: The hostname of the host running the workers.
timeout: How long this blacklist entry lasts (in seconds). If less than 1, blacklist indefinitely.

References work_queue.WorkQueue._work_queue, and work_queue_blacklist_add_with_timeout().

def work_queue.WorkQueue.blacklist_clear (   self,
  host = None 
)

Remove host from blacklist.

Clears the entire blacklist if host is not provided.

Parameters
self: Reference to the current work queue object.
host: The hostname of the host. If None, clear the entire blacklist.

References work_queue.WorkQueue._work_queue, work_queue_blacklist_clear(), and work_queue_blacklist_remove().
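
For example (the hostnames are illustrative):

>>> q.blacklist("badnode.example.org")
>>> q.blacklist_with_timeout("flaky.example.org", 3600)
>>> q.blacklist_clear("badnode.example.org")
>>> q.blacklist_clear()   # clear the entire blacklist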

def work_queue.WorkQueue.specify_keepalive_interval (   self,
  interval 
)

Change keepalive interval for a given queue.

Parameters
self: Reference to the current work queue object.
interval: Minimum number of seconds to wait before sending new keepalive checks to workers.

References work_queue.WorkQueue._work_queue, and work_queue_specify_keepalive_interval().

def work_queue.WorkQueue.specify_keepalive_timeout (   self,
  timeout 
)

Change keepalive timeout for a given queue.

Parameters
self: Reference to the current work queue object.
timeout: Minimum number of seconds to wait for a keepalive response from worker before marking it as dead.

References work_queue.WorkQueue._work_queue, and work_queue_specify_keepalive_timeout().

def work_queue.WorkQueue.estimate_capacity (   self)

Turn on master capacity measurements.

Parameters
self: Reference to the current work queue object.

References work_queue.WorkQueue._work_queue, and work_queue_specify_estimate_capacity_on().

def work_queue.WorkQueue.tune (   self,
  name,
  value 
)

Tune advanced parameters for work queue.

Parameters
self: Reference to the current work queue object.
name: The name of the parameter to tune. Can be one of the following:
  • "asynchrony-multiplier" Treat each worker as having (actual_cores * multiplier) total cores. (default = 1.0)
  • "asynchrony-modifier" Treat each worker as having an additional "modifier" cores. (default=0)
  • "min-transfer-timeout" Set the minimum number of seconds to wait for files to be transferred to or from a worker. (default=300)
  • "foreman-transfer-timeout" Set the minimum number of seconds to wait for files to be transferred to or from a foreman. (default=3600)
  • "fast-abort-multiplier" Set the multiplier of the average task time at which point to abort; if negative or zero fast_abort is deactivated. (default=0)
  • "keepalive-interval" Set the minimum number of seconds to wait before sending new keepalive checks to workers. (default=300)
  • "keepalive-timeout" Set the minimum number of seconds to wait for a keepalive response from worker before marking it as dead. (default=30)
value: The value to set the parameter to.
Returns
0 on success, -1 on failure.

References work_queue.WorkQueue._work_queue, and work_queue_tune().
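
For example, using parameter names from the list above (the values are arbitrary):

>>> q.tune("asynchrony-multiplier", 1.5)
>>> q.tune("min-transfer-timeout", 600)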

def work_queue.WorkQueue.submit (   self,
  task 
)

Submit a task to the queue.

It is safe to re-submit a task returned by wait.

Parameters
self: Reference to the current work queue object.
task: A task description created from work_queue.Task.

References work_queue.WorkQueue._task_table, work_queue.WorkQueue._work_queue, and work_queue_submit().
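
For example, a minimal sketch assuming a work_queue.Task built from a shell command line:

>>> t = Task("echo hello > output.txt")
>>> taskid = q.submit(t)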

def work_queue.WorkQueue.wait (   self,
  timeout = WORK_QUEUE_WAITFORTASK 
)

Wait for tasks to complete.

This call will block until a task completes or the timeout has elapsed.

Parameters
self: Reference to the current work queue object.
timeout: The number of seconds to wait for a completed task before returning. Use an integer to set the timeout or the constant WORK_QUEUE_WAITFORTASK to block until a task has completed.

References work_queue.WorkQueue._task_table, work_queue.WorkQueue._work_queue, rmsummary.snapshots_count, and work_queue_wait().
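
A typical driver loop blocks on wait with a short timeout until the queue drains; the sketch below assumes the returned Task exposes id and return_status attributes:

# Poll for completed tasks five seconds at a time until everything is done.
while not q.empty():
    t = q.wait(5)
    if t:
        print "task %d finished with return status %d" % (t.id, t.return_status)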


The documentation for this class was generated from the following file: