Francesco Pierfederici (Space Telescope Science Institute), Mike Swam (STScI), Gretchen Greene (STScI), Chris Sontag (STScI)
Abstract
The Open Workflow Layer (OWL) is an open source Workflow Management System developed at the Space Telescope Science Institute. OWL is being designed for James Webb Space Telescope (JWST) science data processing, with the Hubble Space Telescope (HST) serving as a test bed. It is also being seriously considered as a replacement for the aging OPUS system currently used for HST.
OWL is a thin Python layer on top of Condor, a widely used open source batch scheduling system developed at the University of Wisconsin. As such, OWL can transparently take advantage of the many features offered by Condor without having to re-implement them. Notable examples include the ability to:
Schedule jobs on heterogeneous clusters.
Dynamically add or remove compute resources.
Make use of public and/or private clouds.
The two main capabilities that OWL adds to Condor are a blackboard system and templated workflows.
Blackboards are in-memory data structures, usually persisted in a database, that hold both the instantaneous and the historical state of the system. OWL provides both a process-centric and a dataset-centric blackboard. Together, these allow developers and pipeline operators to easily see which pieces of data are being processed, which have already been processed, which machines are available, and what their instantaneous and historical load is. This functionality is essential for operations, as experience with HST science data processing clearly shows.
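The blackboard idea can be illustrated with a minimal sketch. It assumes a single SQLite table of status events; the class name, table layout, and method names are hypothetical and are not taken from OWL itself. The dataset-centric view returns the history of one dataset, while the process-centric view returns the latest status of every (dataset, step) pair.

```python
import sqlite3
import time


class Blackboard:
    """Toy blackboard (hypothetical, not OWL's API): an append-only event log."""

    def __init__(self, path=":memory:"):
        # SQLite stands in for whatever database persists the blackboard.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "dataset TEXT, step TEXT, host TEXT, status TEXT, ts REAL)"
        )

    def post(self, dataset, step, host, status):
        """Record a status change; earlier rows are kept as history."""
        self.db.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
            (dataset, step, host, status, time.time()),
        )
        self.db.commit()

    def dataset_view(self, dataset):
        """Dataset-centric view: full history of one dataset, oldest first."""
        rows = self.db.execute(
            "SELECT step, host, status FROM events "
            "WHERE dataset = ? ORDER BY rowid",
            (dataset,),
        )
        return rows.fetchall()

    def process_view(self):
        """Process-centric view: latest status of each (dataset, step) pair."""
        rows = self.db.execute(
            "SELECT e.dataset, e.step, e.host, e.status FROM events e "
            "WHERE e.rowid = (SELECT MAX(rowid) FROM events "
            "WHERE dataset = e.dataset AND step = e.step) "
            "ORDER BY e.rowid"
        )
        return rows.fetchall()
```

Keeping every event, rather than overwriting a status field, is what lets one table serve both the instantaneous and the historical queries in this sketch.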
Templated workflows allow processing to be customized at run-time, based on the characteristics of the data (e.g. some pieces of data do not need to go through some processing steps) and/or the environment (e.g. the availability of compute resources or the version of the software being used). They essentially make the mostly static Condor workflows fully dynamic.
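As a rough illustration of run-time customization, the sketch below renders a linear workflow in the style of a Condor DAGMan input file (JOB and PARENT ... CHILD lines), skipping steps that do not apply to a given dataset. The step names, the `exposure_s` field, the `.sub` file naming, and the `render_dag` function are all assumptions made for this example; the abstract does not describe OWL's actual template mechanism.

```python
def render_dag(steps, dataset):
    """Return DAGMan-style text containing only the steps that apply.

    `steps` is an ordered list of (name, predicate) pairs; each predicate
    receives the dataset description and decides whether the step runs.
    """
    selected = [name for name, applies in steps if applies(dataset)]
    # One JOB line per selected step (submit file names are hypothetical).
    lines = [f"JOB {name} {name}.sub" for name in selected]
    # Chain the selected steps in order: each one is the parent of the next.
    lines += [
        f"PARENT {parent} CHILD {child}"
        for parent, child in zip(selected, selected[1:])
    ]
    return "\n".join(lines)


# Hypothetical step list: the "dark" step is only needed for long exposures.
STEPS = [
    ("bias", lambda d: True),
    ("dark", lambda d: d["exposure_s"] > 60),
    ("flat", lambda d: True),
]

short_exposure_dag = render_dag(STEPS, {"exposure_s": 10})
long_exposure_dag = render_dag(STEPS, {"exposure_s": 120})
```

Here the short exposure yields a two-step workflow and the long exposure a three-step one, so the same template produces different workflows depending on the data.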
In the present paper we describe in detail the architecture of OWL, our design choices, and the rationale behind them.
Paper ID: P116