obiwan.runmanager.status

Monitors an obiwan production run using qdo

Classes

`QdoList`(outdir[, que_name, skip_succeed, …])	Queries the qdo db and maps log files to tasks and task status
`RunStatus`(tasks, logs)	Tallies which QDO_RESULTS actually finished, what errors occurred, etc.

Functions

`get_checkpoint_fn`(outdir, brick, rowstart)
`get_deldirs`(outdir, brick, rowstart[, …])	If slurm timeout or failed, logfile will exist in final dir but other outputs will be in interm dir.
`get_final_dir`(outdir, brick, rowstart[, …])	Returns paths like outdir/replaceme/bri/brick/rs0
`get_interm_dir`(outdir, brick, rowstart[, …])	Returns paths like outdir/bri/brick/rs0
`get_logdir`(outdir, brick, rowstart[, …])
`get_logfile`(outdir, brick, rowstart[, …])
`get_slurm_files`(outdir)

class obiwan.runmanager.status.QdoList(outdir, que_name='obiwan', skip_succeed=False, rand_num=None, firstN=None)

Queries the qdo db and maps log files to tasks and task status

Parameters:
outdir – obiwan outdir; the slurm.out files are located there
que_name – name of the qdo queue, i.e. the one given to qdo create que_name
skip_succeed – skip succeeded tasks; their number can be very large for production runs, which slows down the code

change_task_state(task_ids, to=None, modify=False, rm_files=False)

Changes the qdo state of the tasks listed in task_ids to pending, failed, etc.

Parameters:
to – change the qdo state to this value (pending, failed)
rm_files – delete the output files for that task
modify – actually do the modifications (fail-safe option)
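The fail-safe role of modify can be illustrated with a small dry-run sketch. This is not the obiwan implementation: `plan_state_change`, the `states` dict, and the `outputs` mapping are hypothetical stand-ins for the qdo queue and the task output files, used only to show how a call can report what it would do without changing anything unless modify is True.

```python
import os


def plan_state_change(states, task_ids, to=None, modify=False,
                      rm_files=False, outputs=None):
    """Hypothetical dry-run sketch; not the obiwan/qdo code.

    states  : dict mapping task id -> state string (stand-in for the qdo db)
    outputs : dict mapping task id -> list of output file paths (assumed)
    Returns the list of planned actions; applies them only if modify=True.
    """
    outputs = outputs or {}
    planned = []
    for tid in task_ids:
        if to is not None and states.get(tid) != to:
            planned.append(("set_state", tid, to))
        if rm_files:
            for path in outputs.get(tid, []):
                planned.append(("remove", tid, path))

    if modify:  # fail-safe: nothing is touched unless explicitly requested
        for action, tid, arg in planned:
            if action == "set_state":
                states[tid] = arg
            elif os.path.exists(arg):
                os.remove(arg)
    return planned


states = {1: "failed", 2: "succeeded"}
plan = plan_state_change(states, [1], to="pending")            # dry run
applied = plan_state_change(states, [1], to="pending", modify=True)
```

Under these assumptions the first call only returns the plan and leaves `states` untouched; the second call applies it.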

get_tasks_logs(): Gets tasks and logs for the three types of qdo status

class obiwan.runmanager.status.RunStatus(tasks, logs)

Tallies which QDO_RESULTS actually finished, what errors occurred, etc.

Args:
tasks: dict, each key is list of qdo tasks
logs: dict, each key is list of log files for each task

Defaults: regex_errs: list of regular expressions matching possible log file errors

get_logs_for_failed(regex='Other'): Returns log and slurm filenames for failed tasks labeled as regex
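As a rough illustration of how a list like regex_errs could label failed logs, the sketch below matches each log's text against a list of patterns and files anything unmatched under 'Other' (the default label of get_logs_for_failed). The patterns in `REGEX_ERRS`, the function `tally_errors`, and the in-memory `log_texts` dict are assumptions made for this example; the actual patterns used by RunStatus are not listed on this page.

```python
import re
from collections import defaultdict

# Hypothetical patterns; the real regex_errs defaults are not shown here.
REGEX_ERRS = [r"MemoryError", r"TIME LIMIT"]


def tally_errors(log_texts, regex_errs=REGEX_ERRS):
    """Group log names by the first error pattern that matches their text.

    log_texts : dict mapping log file name -> log contents (assumed input)
    Logs matching no pattern are grouped under the label 'Other'.
    """
    tally = defaultdict(list)
    for name, text in log_texts.items():
        for pattern in regex_errs:
            if re.search(pattern, text):
                tally[pattern].append(name)
                break
        else:  # no pattern matched this log
            tally["Other"].append(name)
    return dict(tally)


logs = {
    "log.a": "Traceback ...\nMemoryError",
    "log.b": "slurmstepd: CANCELLED DUE TO TIME LIMIT",
    "log.c": "something unexpected happened",
}
by_error = tally_errors(logs)
other_logs = by_error.get("Other", [])  # analogous to regex='Other'
```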

obiwan.runmanager.status.get_deldirs(outdir, brick, rowstart, do_skipids='no', do_more='no'): If slurm timeout or failed, logfile will exist in final dir but other outputs will be in interm dir. Returns a list of the directories for all of these.

obiwan.runmanager.status.get_final_dir(outdir, brick, rowstart, do_skipids='no', do_more='no'): Returns paths like outdir/replaceme/bri/brick/rs0

obiwan.runmanager.status.get_interm_dir(outdir, brick, rowstart, do_skipids='no', do_more='no'): Returns paths like outdir/bri/brick/rs0
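The path pattern outdir/bri/brick/rs0 can be sketched as follows. This is a guess at the layout rather than the obiwan code: it assumes `bri` is the first three characters of the brick name and that `rs0` is the string "rs" followed by rowstart, and it ignores do_skipids and do_more entirely. The function name `interm_dir_sketch` is hypothetical.

```python
import os


def interm_dir_sketch(outdir, brick, rowstart):
    """Build a path shaped like outdir/bri/brick/rs0.

    Assumptions (not confirmed by this page): 'bri' is brick[:3] and
    'rs0' is 'rs' + str(rowstart). do_skipids/do_more are not modeled.
    """
    return os.path.join(outdir, brick[:3], brick, "rs%d" % rowstart)


example = interm_dir_sketch("/scratch/obiwan_out", "1238p245", 0)
```

A final-dir variant would, under the same assumptions, insert one more path component (shown as `replaceme` above) directly after outdir.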