Submitting CMS-Connect jobs
Preparation (to be done once)
To submit jobs through CMS-Connect, you first have to sign up for the service, as explained on the "Introduction to CMS Connect" page. Registration is needed only once, but it usually takes more than a day to complete, so do it well in advance.
Once registered, you should be able to log into the interactive login node:
ssh username@login.uscms.org
and set up your SSH keys, as explained in the "CMS Connect Quickstart" page.
After these preliminary steps have been completed, you should be able to log into the interactive node and submit your Condor jobs.
Job submission
The following example submits a job explicitly to our Tier-2 at Purdue (substitute your own username wherever 'piperov' appears):
ssh piperov@login.uscms.org
[piperov@login ~]$ mkdir condor_test
[piperov@login ~]$ cd condor_test/
[piperov@login condor_test]$ mkdir log
Create a job submit description file named test.submit with these contents:
# The UNIVERSE defines an execution environment. You will almost always use VANILLA.
Universe = vanilla
# EXECUTABLE is the program your job will run. It's often useful
# to create a shell script to "wrap" your actual work.
Executable = short_test.sh
# arguments = --cpu 8 --vm 8
# ERROR and OUTPUT are the error and output channels from your job
# that HTCondor returns from the remote host.
# Error = job.error
# Output = job.output
Error = log/job.error.$(Cluster)-$(Process)
Output = log/job.output.$(Cluster)-$(Process)
# The LOG file is where HTCondor places information about your
# job's status, success, and resource consumption.
# Log = job.log
Log = log/job.log.$(Cluster)
# +ProjectName is the name of the project reported to the OSG accounting system
# +ProjectName="osg.ConnectTrain"
+ProjectName="cms.org.purdue"
+DESIRED_Sites="T2_US_Purdue"
+REQUIRED_OS = "rhel7"
Requirements = HAS_SINGULARITY == True
request_cpus = 4
request_memory = 8192
# QUEUE is the "start button" - it launches any jobs that have been
# specified thus far.
Queue 10
Then, create the short_test.sh shell script which will be executed at the site:
#!/bin/bash
# short_test.sh: a short discovery job
printf "Start time: "; /bin/date
printf "Job is running on node: "; /bin/hostname
printf "Job running as user: "; /usr/bin/id
printf "Job is running in directory: "; /bin/pwd
echo
echo "Working hard..."
printf "Date/Time before starting: "; /usr/bin/date
sleep ${1-15}
printf "Date/Time after finishing: "; /usr/bin/date
echo "Science complete!"
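The `sleep ${1-15}` line in the script relies on shell parameter expansion: the first command-line argument is used if one was passed, and 15 is used otherwise. The sketch below isolates that behaviour so it can be tried on the login node before submitting; `sleep_seconds` is a hypothetical helper introduced only for illustration and is not part of short_test.sh.

```shell
# sleep_seconds is a hypothetical helper, not part of short_test.sh;
# it only echoes what "${1-15}" expands to.
sleep_seconds() {
    echo "${1-15}"
}

sleep_seconds        # no argument given: prints 15
sleep_seconds 3      # explicit argument: prints 3
```

In the submit file, an uncommented line such as `arguments = 3` would pass 3 to the script as its first argument, so each job would sleep for 3 seconds instead of the default 15.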
Now you can submit this batch of 10 jobs through Condor:
[piperov@login condor_test]$ condor_submit test.submit
Submitting job(s)..........
10 job(s) submitted to cluster 6620377.
You can now check the status of your jobs with condor_q.
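For scripted workflows it can be convenient to capture the cluster ID rather than copy it by hand. The following is a minimal sketch, assuming the condor_submit message has the same form as the output shown above; `extract_cluster_id` is a hypothetical helper name, and the sample text stands in for real condor_submit output.

```shell
# extract_cluster_id is a hypothetical helper: it reads condor_submit
# output on stdin and prints the number that follows "submitted to cluster".
extract_cluster_id() {
    sed -n 's/.*submitted to cluster \([0-9][0-9]*\)\..*/\1/p'
}

# Sample text copied from the condor_submit output above.
sample='Submitting job(s)..........
10 job(s) submitted to cluster 6620377.'

cluster=$(printf '%s\n' "$sample" | extract_cluster_id)
echo "$cluster"      # prints 6620377
```

In a real session the same function could be fed directly, as in `cluster=$(condor_submit test.submit | extract_cluster_id)`, after which `condor_q "$cluster"` would limit the status listing to that cluster.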
When the jobs complete, you will find their stdout and stderr streams written to files in the log/ sub-directory. The files are named after the cluster ID and the job index within the cluster, for example:
job.output.6620377-0
job.error.6620377-0
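With ten or more jobs per cluster, checking each error file by hand gets tedious. As a rough sketch, assuming the log/job.error.<cluster>-<index> naming used in the submit file above, the hypothetical helper `nonempty_stderr` below lists only those error files that actually contain something. It has to be run from the directory that holds log/.

```shell
# nonempty_stderr is a hypothetical helper: given a cluster ID, it prints
# the stderr files of that cluster that are not empty.
nonempty_stderr() {
    for f in log/job.error."$1"-*; do
        # -s is true only for files that exist and have a size above zero
        if [ -s "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}

# Prints nothing if every job in the cluster finished with empty stderr.
nonempty_stderr 6620377
```

An empty result does not prove that the jobs succeeded, since a job can fail without writing to stderr, so the per-cluster log file is still worth a look.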
If you need to cancel a job, for example one that you submitted with the wrong parameters, you can use condor_rm.
To remove a single job of the cluster, use:
[piperov@login condor_test]$ condor_rm 6620377.9
Job 6620377.9 marked for removal
And if you want to remove the whole cluster of jobs:
[piperov@login condor_test]$ condor_rm 6620377
All jobs in cluster 6620377 have been marked for removal
There are more examples of submitting Condor jobs through CMS-Connect on the "Quick Condor Tutorial" page.