Automation workflow docs: Trivial good health check
runghc.py
Submit ECAL automation jobs.
usage: runghc.py [-h] [--db DBNAME] [--notify NOTIFY]
[--campaign CAMPAIGN [CAMPAIGN ...]] [--logurl LOGURL]
[--wdir WDIR] [--eosdir EOSDIR] [--pd DATASET]
[--dtire DATATIRE]
{submit,resubmit,check} ...
Named Arguments
- --db
Database name, default is test db
Default: “ecal_online_test”
- --notify
Mattermost incoming webhook url for notifications
- --campaign
Processing campaign(s). Use “all” to select all campaigns in the db
- --logurl
Base url for the logs
Default: “https://ecallogs.web.cern.ch/”
- --wdir
Working directory
- --eosdir
Base path of output location on EOS
- --pd
Primary dataset
Default: “ExpressCosmics”
- --dtire
Data tier
Default: “FEVT”
subcommands
Select command to execute
- subcommand
Possible choices: submit, resubmit, check
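The top-level interface above can be approximated with a short argparse sketch. This is a reconstruction from the usage text only, not the tool's actual source: the `dest` names (`dbname`, `dataset`, `datatire`) are inferred from the metavars in the usage line, and the `None` defaults for options with no documented default are assumptions.

```python
import argparse


def build_parser():
    """Rebuild the documented top-level interface (illustrative, not the real source)."""
    parser = argparse.ArgumentParser(
        prog="runghc.py", description="Submit ECAL automation jobs."
    )
    parser.add_argument("--db", dest="dbname", default="ecal_online_test",
                        help="Database name, default is test db")
    # Options without a documented default are assumed to default to None.
    parser.add_argument("--notify", default=None,
                        help="Mattermost incoming webhook url for notifications")
    parser.add_argument("--campaign", nargs="+", default=None,
                        help='Processing campaign(s). "all" for all campaigns in the db')
    parser.add_argument("--logurl", default="https://ecallogs.web.cern.ch/",
                        help="Base url for the logs")
    parser.add_argument("--wdir", default=None, help="Working directory")
    parser.add_argument("--eosdir", default=None,
                        help="Base path of output location on EOS")
    parser.add_argument("--pd", dest="dataset", default="ExpressCosmics",
                        help="Primary dataset")
    # The flag is spelled --dtire in the usage text, so that spelling is kept.
    parser.add_argument("--dtire", dest="datatire", default="FEVT",
                        help="Data tier")
    subparsers = parser.add_subparsers(dest="subcommand",
                                       help="Select command to execute")
    for name in ("submit", "resubmit", "check"):
        subparsers.add_parser(name)
    return parser


args = build_parser().parse_args(
    ["--db", "ecal_online_test", "--campaign", "all", "--wdir", "/tmp/ghc", "check"]
)
print(args.subcommand, args.campaign, args.dataset, args.datatire)
```

In this sketch the global options come before the subcommand, matching the order shown in the usage line. The working directory `/tmp/ghc` is a placeholder.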
Sub-commands
submit
Process all runs marked as new in the automation db
runghc.py submit [-h] [--t0 | --lfn]
Named Arguments
- --t0
Read input files from T0 storage
Default: False
- --lfn
Use logical file names for input files
Default: False
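The `[--t0 | --lfn]` notation in the usage line means the two flags cannot be combined. A minimal sketch of how argparse expresses this, assuming both are plain boolean switches (consistent with the documented default of False); the helper `parse_submit` is hypothetical and exists only to make the rejection visible:

```python
import argparse

# Stand-alone parser for the submit options only (illustrative).
submit_parser = argparse.ArgumentParser(prog="runghc.py submit")
source = submit_parser.add_mutually_exclusive_group()
source.add_argument("--t0", action="store_true",
                    help="Read input files from T0 storage")
source.add_argument("--lfn", action="store_true",
                    help="Use logical file names for input files")


def parse_submit(argv):
    """Return the parsed namespace, or None if argparse rejects the arguments."""
    try:
        return submit_parser.parse_args(argv)
    except SystemExit:
        # argparse prints a usage error to stderr and exits on invalid input.
        return None


print(parse_submit(["--t0"]))
print(parse_submit(["--t0", "--lfn"]))
```

Passing neither flag is accepted and leaves both at False. Passing both is rejected at parse time, before any job is submitted.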
resubmit
Check for failed jobs and resubmit them
runghc.py resubmit [-h] [--lfn]
Named Arguments
- --lfn
Use logical file names for input files
Default: False
check
Check ongoing runs and mark them as done/failed if completed
runghc.py check [-h] [--max-retries MAX_RETRIES]
[--skipped-delay SKIPPED_DELAY]
Named Arguments
- --max-retries
Maximum number of tries for each individual job. -1 means no limit
Default: -1
- --skipped-delay
Number of days after which a task is considered stalled and is marked as skipped.
Default: 7
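The two `check` options can be read as simple predicates. The sketch below is one plausible interpretation, not the tool's implementation: the function names are hypothetical, and both the comparison used for the retry limit and the timestamp used to measure the delay are assumptions, since the help text does not specify them.

```python
from datetime import datetime, timedelta


def can_retry(tries, max_retries=-1):
    """True if a job may be tried again; a negative limit means no limit."""
    # Assumption: a job that has already been tried max_retries times is not retried.
    return max_retries < 0 or tries < max_retries


def is_stalled(last_update, now, skipped_delay=7):
    """True if a task has gone more than skipped_delay days without an update."""
    # Assumption: the delay is measured from the task's last recorded update.
    return now - last_update > timedelta(days=skipped_delay)


now = datetime(2024, 1, 9)
print(can_retry(tries=100))                 # default -1: never exhausted
print(can_retry(tries=3, max_retries=3))    # limit reached
print(is_stalled(datetime(2024, 1, 1), now))  # 8 days old, default delay of 7
```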