Description
This recipe shows how to configure a factory and a frontend to submit user jobs to a batch cluster via BOSCO.
Requirement | Description |
A functioning glideinWMS factory | The factory should be completely configured and functioning for Grid submissions. This ensures that the factory is running and working before any BOSCO configuration is added. |
A functioning glideinWMS frontend | The frontend should be completely configured and functioning for Grid submissions. The same reasoning as for the factory applies here. |
An account on the cluster submit host | A valid, current, enabled account that can access a submit host and submit to the cluster. Specifically, the private and public ssh keys are needed for submission. Once you have them, you can add the resource by invoking the "bosco_cluster --add" command. This can be invoked from any host, but we suggest doing it from the Frontend so that you don't need to transfer the ssh keys. See the BOSCO manual for more information on adding a BOSCO resource. |
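As a rough sketch of the last requirement, the snippet below assembles the "bosco_cluster --add" invocation for the resource used in the example entry of this recipe. The helper name bosco_add_command is hypothetical, and the positional form (user@host followed by the batch system) is an assumption that should be checked against the BOSCO manual for your version. The script only prints the command; it does not run it.

```python
import shlex

# Batch systems named in this recipe; the real list depends on the BOSCO version.
SUPPORTED_BATCH_SYSTEMS = ("pbs", "condor", "lsf", "sge")


def bosco_add_command(user, host, batch_system):
    """Build the argument list for adding a BOSCO resource (hypothetical helper)."""
    if batch_system not in SUPPORTED_BATCH_SYSTEMS:
        raise ValueError("unsupported batch system: " + batch_system)
    # Assumed form: bosco_cluster --add user@host batch_system
    return ["bosco_cluster", "--add", user + "@" + host, batch_system]


# Values taken from the gatekeeper and gridtype of the example entry.
cmd = bosco_add_command("cmsuser", "carvergrid.nersc.gov", "pbs")
print(shlex.join(cmd))
```

The same user@host string is what the factory entry later uses as its gatekeeper, and the batch system is the second word of its gridtype.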
Example BOSCO Factory Entry
<entry name="BOSCO_TEST_carver" auth_method="key_pair" enabled="True" gatekeeper="cmsuser@carvergrid.nersc.gov" gridtype="batch pbs" schedd_name="fermicloud199.fnal.gov" trust_domain="bosco" verbosity="std" work_dir="AUTO">
   <config>
      <max_jobs glideins="3" held="2" idle="1">
         <max_job_frontends></max_job_frontends>
      </max_jobs>
      <release max_per_cycle="20" sleep="0.2"/>
      <remove max_per_cycle="5" sleep="0.2"/>
      <restrictions require_glidein_glexec_use="False" require_voms_proxy="False"/>
      <submit cluster_size="10" max_per_cycle="100" sleep="0.2"/>
   </config>
   <allow_frontends></allow_frontends>
   <attrs>
      <attr name="CONDOR_ARCH" const="True" glidein_publish="False" job_publish="False" parameter="True" publish="False" type="string" value="default"/>
      <attr name="CONDOR_OS" const="True" glidein_publish="False" job_publish="False" parameter="True" publish="False" type="string" value="default"/>
      <attr name="GLEXEC_BIN" const="True" glidein_publish="False" job_publish="False" parameter="True" publish="True" type="string" value="NONE"/>
      <attr name="GLIDEIN_Site" const="True" glidein_publish="True" job_publish="True" parameter="True" publish="True" type="string" value="BOSCO_PBS"/>
      <attr name="USE_CCB" const="False" glidein_publish="True" job_publish="False" parameter="True" publish="True" type="string" value="True"/>
      <attr name="X509_CERT_DIR" const="True" glidein_publish="False" job_publish="True" parameter="True" publish="True" type="string" value="/osg/certificates"/>
   </attrs>
   <files></files>
   <submit_attrs>
      <submit_attr name="+remote_queue" value='"serial"'/>
   </submit_attrs>
   <infosys_refs></infosys_refs>
   <monitorgroups></monitorgroups>
</entry>
The important attributes of the entry stanza above are described below:
Name | Value | Description |
auth_method | "key_pair" | The key pair in this case refers to the ssh key pair installed to access the BOSCO resource (the remote cluster submit host). See Factory Configuration for a complete description. |
gatekeeper | "cmsuser@carvergrid.nersc.gov" | In the BOSCO case, the gatekeeper attribute is the username and hostname used to log in to the cluster and submit jobs. See Factory Configuration for a complete description. |
gridtype | "batch pbs" | It must be the keyword "batch" followed by the batch system used in the cluster, which must be one supported by HTCondor/BOSCO (e.g. pbs, condor, lsf, sge). See Factory Configuration for a complete description. |
trust_domain | "bosco" | The trust domain can be any arbitrary value. Both the factory and the frontend must be configured to use the same value of the trust_domain. In this example, "bosco" is the arbitrary value. See Factory Configuration for a complete description. |
work_dir | "AUTO" | The working directory that the pilot starts up in can be any one supported by the remote cluster or batch system. See Factory Configuration for a complete description. |
glideins | "3" | This is a hard limit on the number of glideins that the factory will submit to the remote batch system. For testing purposes, this example was restricted to 3 running glideins. See Factory Configuration for a complete description. |
held | "2" | This is a limit on the number of glidein requests that can be in the held state. If the number of held requests reaches this number, the factory will stop asking for more. For testing purposes, this number was set extremely low. See Factory Configuration for a complete description. |
idle | "1" | This is a limit on the number of glidein requests that can be in the idle state. Ordinarily, this attribute is used to determine "pressure" at a grid site. See Factory Configuration for a complete description. |
submit_attr | - | This element is used to specify RSL-equivalent information for gt2/gt5. The name and value of each configured submit attribute are put in the glidein's JDL before submission. For example, the configuration above submits glideins to a specific remote queue and results in the following line in the glidein's JDL: +remote_queue = "serial". See Factory Configuration for a complete description. |
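The rules in the table above can be illustrated with a short Python sketch that parses a trimmed copy of the example entry, checks the BOSCO-specific attributes, and derives the JDL line produced by each submit_attr. This is not how the factory itself validates its configuration; the helper names check_bosco_entry and jdl_lines are hypothetical, and the list of batch systems is limited to the ones named in this recipe.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example entry; only the attributes discussed above are kept.
ENTRY_XML = """
<entry name="BOSCO_TEST_carver" auth_method="key_pair"
       gatekeeper="cmsuser@carvergrid.nersc.gov" gridtype="batch pbs"
       trust_domain="bosco" work_dir="AUTO">
  <config>
    <max_jobs glideins="3" held="2" idle="1"/>
  </config>
  <submit_attrs>
    <submit_attr name="+remote_queue" value='"serial"'/>
  </submit_attrs>
</entry>
"""

# Batch systems named in this recipe; the full list depends on HTCondor/BOSCO.
KNOWN_BATCH_SYSTEMS = {"pbs", "condor", "lsf", "sge"}


def check_bosco_entry(entry):
    """Return a list of problems found in a BOSCO entry (hypothetical helper)."""
    problems = []
    parts = entry.get("gridtype", "").split()
    if len(parts) != 2 or parts[0] != "batch":
        problems.append('gridtype must be "batch" followed by a batch system')
    elif parts[1] not in KNOWN_BATCH_SYSTEMS:
        problems.append("unrecognized batch system: " + parts[1])
    if entry.get("auth_method") != "key_pair":
        problems.append('this recipe expects auth_method="key_pair"')
    user, sep, host = entry.get("gatekeeper", "").partition("@")
    if not (user and sep and host):
        problems.append("gatekeeper must look like user@host")
    return problems


def jdl_lines(entry):
    """Render each submit_attr as the 'name = value' line added to the JDL."""
    return [f'{attr.get("name")} = {attr.get("value")}'
            for attr in entry.iter("submit_attr")]


entry = ET.fromstring(ENTRY_XML.strip())
print(check_bosco_entry(entry))
print(jdl_lines(entry))
```

Running a check like this against a copy of the factory configuration before reconfiguring the factory is one way to catch a malformed gridtype or gatekeeper early.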
Example BOSCO Frontend Configuration
The only configuration needed for the frontend in this example is the credential setup. The credentials can be included in either the group credential definition or the global credential definition.
<credential absfname="/path/to/grid_proxy" security_class="frontend" trust_domain="OSG" type="grid_proxy"/>
<credential absfname="/path/to/bosco_key.rsa.pub" keyabsfname="/path/to/bosco_key.rsa" security_class="frontend" trust_domain="bosco" type="key_pair"/>
The important attributes of the credential stanza above are described below:
Name | Value | Description |
absfname | "/path/to/grid_proxy" | This is the full path to the file containing the grid proxy used to identify the glidein with the Frontend. See Frontend Configuration for a complete description. |
absfname | "/path/to/bosco_key.rsa.pub" | This is the full path to the file containing the public key installed on the BOSCO resource to allow ssh access. See Frontend Configuration for a complete description. |
keyabsfname | "/path/to/bosco_key.rsa" | This is the full path to the file containing the secret key used to access the BOSCO resource via ssh. See Frontend Configuration for a complete description. |
security_class | "frontend" | This is the security class that is defined for the other credentials on this frontend. See Frontend Configuration for a complete description. |
trust_domain | "bosco" | The trust domain can be any arbitrary value. Both the factory and the frontend must be configured to use the same value of the trust_domain. In this example, "bosco" is the arbitrary value. See Frontend Configuration for a complete description. |
type | "key_pair" | The key pair in this case refers to the public and secret keys that can be used to ssh to the BOSCO resource submit host. This must match the value specified in the factory for the credentials to be matched properly. See Frontend Configuration for a complete description. |
pilotabsfname | "/path/to/pilot_proxy" | A proxy for the pilot is required in all cases, even if proxies are not used to authenticate on the gatekeeper. This is because the proxy is used to establish secure communication between the pilot and the user collector. See Frontend Configuration for a complete description. |
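The matching rule described for trust_domain and type can be sketched in Python as follows. This only illustrates the rule stated in the table (same trust_domain, and a credential type equal to the factory entry's auth_method); the actual matching done by glideinWMS is more involved. The function name matching_credentials is hypothetical, and the credentials wrapper element is there only so the snippet parses as a single XML document.

```python
import xml.etree.ElementTree as ET

# The two example credentials, wrapped in one root element for parsing.
CREDENTIALS_XML = """
<credentials>
  <credential absfname="/path/to/grid_proxy" security_class="frontend"
              trust_domain="OSG" type="grid_proxy"/>
  <credential absfname="/path/to/bosco_key.rsa.pub"
              keyabsfname="/path/to/bosco_key.rsa" security_class="frontend"
              trust_domain="bosco" type="key_pair"/>
</credentials>
"""


def matching_credentials(credentials, trust_domain, auth_method):
    """Select credentials usable for a factory entry (hypothetical helper)."""
    return [cred for cred in credentials.iter("credential")
            if cred.get("trust_domain") == trust_domain
            and cred.get("type") == auth_method]


creds = ET.fromstring(CREDENTIALS_XML.strip())
# Values taken from the example factory entry: trust_domain="bosco", auth_method="key_pair".
matches = matching_credentials(creds, "bosco", "key_pair")
for cred in matches:
    print(cred.get("absfname"), cred.get("keyabsfname"))
```

With the example values, only the ssh key pair credential is selected; the grid proxy credential is skipped because both its trust domain and its type differ from the BOSCO entry.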