allpairs_master uses the Work Queue system to distribute tasks among processors. Each processor utilizes the allpairs_multicore(1) program to execute the tasks in parallel if multiple cores are present. After starting allpairs_master, you must start a number of work_queue_worker(1) processes on remote machines. The workers will then connect back to the master process and begin executing tasks.
-p, --port <port> | The port that the master will be listening on. |
-e, --extra-args <args> | Extra arguments to pass to the comparison function. |
-f, --input-file <file> | Extra input file needed by the comparison function. (may be given multiple times) |
-o, --debug-file <file> | Write debugging output to this file. By default, debugging is sent to stderr (":stderr"). You may specify logs be sent to stdout (":stdout"), to the system syslog (":syslog"), or to the systemd journal (":journal"). |
-O, --output-file <file> | Write task output to this file. (default is standard output) |
-t, --estimated-time <seconds> | Estimated time to run one comparison. (default chosen at runtime) |
-x, --width <items> | Width of one work unit, in items to compare. (default chosen at runtime) |
-y, --height <items> | Height of one work unit, in items to compare. (default chosen at runtime) |
-N, --project-name <project> | Report the master's information to a catalog server under the project name <project>. |
-P, --priority <integer> | Priority. The higher the value, the higher the priority. |
-d, --debug <flag> | Enable debugging for this subsystem. (Try -d all to start.) |
-v, --version | Show program version. |
-h, --help | Display this message. |
-Z, --port-file <file> | Select port at random and write it to this file. (default is disabled) |
--work-queue-preferred-connection <connection> | Indicate preferred connection. Choose one of by_ip or by_hostname. (default is by_ip) |
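For example, a comparison program named compareit, given the files a and b, might print a line like this:

a b are 45 percent similar

To use the allpairs framework, create a file called set.list that lists each of your files, one per line: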
a
b
c
...

Because allpairs_master uses allpairs_multicore(1), make sure allpairs_multicore(1) is in your PATH before you proceed.

To run an All-Pairs workflow sequentially, start a single work_queue_worker(1) process in the background. Then, invoke allpairs_master:
% work_queue_worker localhost 9123 &
% allpairs_master set.list set.list compareit

The framework will carry out all possible comparisons of the objects, and print the results one by one (note that the first two columns are X and Y indices in the resulting matrix):
1 1 a a are 100 percent similar
1 2 a b are 45 percent similar
1 3 a c are 37 percent similar
...

To speed up the process, run more work_queue_worker(1) processes on other machines, or use condor_submit_workers(1) or sge_submit_workers(1) to start hundreds of workers in your local batch system.
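Because every result line begins with the X and Y indices, the output is easy to post-process with standard tools. The following is only a sketch: it assumes the results were saved to a file (for example with the -O option) and that the comparison program prints lines in the "percent similar" format shown above. The temporary file and the variable names results and best are made up for this illustration.

```shell
# Sample results in the format shown above, written to a scratch file.
results=$(mktemp)
cat > "$results" <<'EOF'
1 1 a a are 100 percent similar
1 2 a b are 45 percent similar
1 3 a c are 37 percent similar
EOF

# Skip the diagonal (X == Y) and keep the line with the highest
# percentage, which is the sixth whitespace-separated field.
best=$(awk '$1 != $2 && $6 + 0 > max { max = $6 + 0; line = $0 }
            END { print line }' "$results")
echo "$best"
```

The field number depends entirely on what your own comparison program prints, so adjust it to match.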
The following is an example of adding more workers to execute an All-Pairs workflow. Suppose your allpairs_master is running on a machine named barney.nd.edu. If you can log in to other machines, you could simply start worker processes on each one, like this:
% work_queue_worker barney.nd.edu 9123

If you have access to a batch system like Condor, you can submit multiple workers at once:
% condor_submit_workers barney.nd.edu 9123 10
Submitting job(s)..........
Logging submit event(s)..........
10 job(s) submitted to cluster 298.
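If you want to try the workflow before you have a real comparison program, a stand-in can be sketched in shell. The script below is an illustration only: the sample file contents, the scratch directory, and the similarity measure (the share of distinct lines that appear in both files) are all assumptions, and this compareit is a hypothetical placeholder rather than a program shipped with the tools. It also assumes the input files are non-empty.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Three tiny sample files, named a, b and c as in the examples above.
printf 'x\ny\nz\n' > a
printf 'x\ny\nw\n' > b
printf 'q\n' > c

# set.list names each file to compare, one per line.
printf '%s\n' a b c > set.list

# Hypothetical compareit: reports the percentage of distinct lines
# that the two files have in common.
cat > compareit <<'EOF'
#!/bin/sh
awk '
  NR == FNR {                       # first file: remember its distinct lines
    if (!($0 in first)) { first[$0] = 1; n1++ }
    next
  }
  !($0 in second) {                 # second file: classify each distinct line
    second[$0] = 1
    if ($0 in first) common++
    else only2++
  }
  END {
    total = n1 + only2              # distinct lines across both files
    pct = (total > 0) ? int(100 * common / total) : 100
    printf "%s %s are %d percent similar\n", ARGV[1], ARGV[2], pct
  }
' "$1" "$2"
EOF
chmod +x compareit

# Try the program by hand on one pair before running the full workflow.
./compareit a b
```

Once the program behaves as expected on a single pair, it can be passed to allpairs_master as the comparison function together with set.list, as in the invocation shown earlier.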