HTCondor

Before submitting a job, always create the three folders error, log and output. HTCondor does not create them for you.
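A minimal sketch of that preparation step, assuming the job is submitted from the current directory and that the submit file points to these three folder names:

```shell
# Create the three folders used by the submit file.
# -p makes the command safe to repeat: it does not fail if they already exist.
mkdir -p error log output
```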

To see why a job is on hold, use

condor_q -hold 16.0

(replace 16.0 with the ID of your job, in the form ClusterId.ProcId)

condor_q -analyze

Useful for checking why a job has been idle for so long.

condor_q -l 16.0

Prints all the attributes of the job.

condor_tail 16.0

Shows the standard output of a running job (it seems to show only the output of the latest command in the .sh file).
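The three condor_q checks above can be grouped in a small helper. This is only a sketch: the function name diagnose_job is hypothetical, and it simply runs the commands listed above for the job ID given as argument.

```shell
# diagnose_job is a hypothetical helper name, not an HTCondor command.
# Usage: diagnose_job 16.0
diagnose_job() {
    job_id="$1"
    if [ -z "$job_id" ]; then
        echo "usage: diagnose_job ClusterId.ProcId" >&2
        return 2
    fi
    if ! command -v condor_q >/dev/null 2>&1; then
        echo "condor_q not found: run this on a submit node" >&2
        return 1
    fi
    condor_q -hold "$job_id"      # hold reason, if the job is on hold
    condor_q -analyze "$job_id"   # why the job is idle / not matching
    condor_q -l "$job_id"         # full list of job attributes
}
```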

Some differences between CERN and CNAF HTCondor jobs

At CERN (lxplus), HTCondor jobs are submitted without should_transfer_files (presumably because there is a shared filesystem), and there is only one log file for the whole cluster (log/neutrino.$(ClusterId).log). Jobs are removed from the queue automatically after completion.
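A sketch of what such a submit file could look like, written from the shell. Only the log path comes from these notes; the file names neutrino.sub and neutrino.sh and the output and error paths are placeholders chosen for illustration.

```shell
mkdir -p error log output

# The quoted 'EOF' keeps the shell from expanding $(ClusterId) and $(ProcId),
# so they reach HTCondor unchanged.
cat > neutrino.sub <<'EOF'
# neutrino.sh is a placeholder for the actual job script
executable = neutrino.sh
output     = output/neutrino.$(ClusterId).$(ProcId).out
error      = error/neutrino.$(ClusterId).$(ProcId).err
# one log file shared by the whole cluster
log        = log/neutrino.$(ClusterId).log
queue 1
EOF

# Then, on the submit node:
# condor_submit neutrino.sub
```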

At CNAF, instead, HTCondor jobs are submitted with should_transfer_files enabled. There, jobs are not removed from the queue automatically after completion: I need to run condor_transfer_data -all to retrieve the output, and then condor_rm with the job number. If several jobs share the same log name, the file is overwritten rather than appended to, so each job needs its own log file (it may still be a good idea to concatenate them afterwards): log/corsika.$(ClusterId).$(ProcId).log
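A sketch of the CNAF variant, under the same caveats: corsika.sub, corsika.sh, the output and error paths, the when_to_transfer_output setting and the helper merge_logs are illustrative choices, not taken from an actual CNAF configuration. Only the log path and the two retrieval commands come from the notes above.

```shell
cat > corsika.sub <<'EOF'
# corsika.sh is a placeholder for the actual job script
executable              = corsika.sh
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output = output/corsika.$(ClusterId).$(ProcId).out
error  = error/corsika.$(ClusterId).$(ProcId).err
# one log file per job, so that nothing is overwritten
log    = log/corsika.$(ClusterId).$(ProcId).log
queue 1
EOF

# After the jobs have completed (16 is the example cluster number):
# condor_transfer_data -all
# condor_rm 16

# merge_logs is a hypothetical helper: it concatenates the per-job logs
# of one cluster into a single file. Files are joined in glob order,
# which is alphabetical (so 16.10 comes before 16.2).
merge_logs() {
    cluster="$1"
    cat log/corsika."$cluster".*.log > log/corsika_"$cluster"_merged.log
}
```

For example, merge_logs 16 writes log/corsika_16_merged.log. The merged file name deliberately does not match the corsika.16.*.log pattern, so it is never read as one of its own inputs.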
