Overview of Pipeline Integration at Fermilab
- Tools and mechanisms for integration
- Data model and FITS/ASCII parameter file standard -- a web page that specifies exactly:
- Header keywords
- Table fields and format
- Documentation on meaning of keywords and field contents
- File naming conventions
- Usage (who creates the file, who uses the file)
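A data-model entry of this kind can be checked mechanically. The following is a minimal sketch, assuming a header has already been read into a plain dictionary; the keyword names (RUN, CAMCOL, VERSION) and table fields are hypothetical placeholders, not the contents of the actual standard.

```python
# Hypothetical data-model entry for one file type; the keyword and field
# names below are illustrative and are not taken from the actual standard.
DATA_MODEL = {
    "required_keywords": {"RUN": int, "CAMCOL": int, "VERSION": str},
    "table_fields": ["id", "ra", "dec"],
}


def check_header(header, model=DATA_MODEL):
    """Return a list of problems found in a header dict (empty if none)."""
    problems = []
    for key, expected_type in model["required_keywords"].items():
        if key not in header:
            problems.append(f"missing keyword {key}")
        elif not isinstance(header[key], expected_type):
            problems.append(
                f"keyword {key} should be {expected_type.__name__}"
            )
    return problems


def check_fields(columns, model=DATA_MODEL):
    """Return the table fields the model requires but the file lacks."""
    return [name for name in model["table_fields"] if name not in columns]
```

Keeping the specification in one data structure mirrors the idea of a single web page as the reference: file writers and file readers can both be checked against the same entry.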
- Source code control software: "CVS"
- Public domain -- multi-platform support
- Complete logging of changes -- ability to revert if bugs introduced
- Branching capability -- however, branches should be used only in certain
situations, since it appears to take expert users talking to each other to
make them work without confusion
- UPS/UPD binary distribution control
- Controls naming of executables on all supported platforms
- Allows easy distribution of executables on all supported platforms
- Initial setup/configuration of UPS/UPD system itself is somewhat difficult
- Tags in headers / filenames indicate version of software used and software
dependencies
- Which version of software was run on this input/output file
- Example -- Apply Calibrations needs inputs from photo, astrom, and
mtpipe -- outputs: tsObj files
- The 'rerun' tag
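The tagging idea in the Apply Calibrations example can be sketched as follows. This is an illustration only: headers are modeled as dictionaries, and the keyword names (VERSION, RERUN, and the per-pipeline `_VER` keywords) are assumptions made for the example rather than the keywords actually used.

```python
# Pipelines that feed the Apply Calibrations step, per the example above.
REQUIRED_INPUTS = ("photo", "astrom", "mtpipe")


def build_output_tags(input_headers, own_version, rerun):
    """Collect upstream version tags into the header of an output file.

    input_headers maps pipeline name -> header dict of its output file.
    The keyword names (VERSION, RERUN, <NAME>_VER) are hypothetical.
    """
    missing = [p for p in REQUIRED_INPUTS if p not in input_headers]
    if missing:
        raise ValueError(f"missing inputs: {', '.join(missing)}")

    # Record this step's own version and rerun, then every dependency.
    tags = {"VERSION": own_version, "RERUN": rerun}
    for name in REQUIRED_INPUTS:
        tags[f"{name.upper()}_VER"] = input_headers[name]["VERSION"]
    return tags
```

With tags like these written into each output file, anyone holding the file can tell which software versions produced it and which upstream versions it depended on.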
- Interface change mechanism
- Multi-platform UNIX code compilation / execution
- Bug Database 'Gnats'
- Excellent overall control of local bugs and enhancement requests
- Difficult to track 'cross system bugs' or 'bugs of unknown origin'
- Old change requests and non-critical bugs tend to pile up
- Regression Tests -- Testbed data
- Ensures that when pipelines are updated things don't break
- As bugs are fixed a test can be added to ensure that they stay fixed
- Hard work to add them -- so not always done
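One common shape for such a regression test is to rerun the pipeline on the testbed data and compare the result with a stored reference output. The sketch below assumes both outputs have been reduced to dictionaries of numeric measurements keyed by object id; the function name and the tolerance are illustrative choices, not part of the actual test suite.

```python
import math


def compare_to_reference(new_rows, reference_rows, rel_tol=1e-6):
    """Compare pipeline output on testbed data with a stored reference.

    Both arguments map an object id to a dict of numeric measurements.
    Returns a list of human-readable differences (empty means no change).
    """
    diffs = []
    for obj_id in sorted(set(new_rows) | set(reference_rows)):
        if obj_id not in new_rows:
            diffs.append(f"{obj_id}: missing from new output")
            continue
        if obj_id not in reference_rows:
            diffs.append(f"{obj_id}: not in reference")
            continue
        for field, ref_value in reference_rows[obj_id].items():
            new_value = new_rows[obj_id].get(field)
            if new_value is None or not math.isclose(
                new_value, ref_value, rel_tol=rel_tol
            ):
                diffs.append(f"{obj_id}.{field}: {new_value} != {ref_value}")
    return diffs
```

When a bug is fixed, adding the offending object to the reference set is one way to make sure the fix stays in place across later pipeline updates.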
- How Processed Data Gets into the OPDB
- Upstream Imaging Processing: Ops-prepare, MTPIPE, Stamp Collection, Astrometry, PSP, Frames, Photometric Calibration pipelines run, individual pipeline Q/A checks pass.
- Outputs of pipeline (but not corrected frames, atlas images, binned sky, masks) are stuffed into OPDB. (1.5 days per night's data)
- Merge with existing overlapping runs (2 hours per night's data)
- Cross run Q/A checks run (2 hours)
- Completed rectangular chunks on sky 'resolved' for Target Selection (4 hours)
- Target Selection Run -- handed to plates (6 hours)
- Imaging data files exported for import into SX. (6 hours)
- Upstream Spectroscopy Processing: Ops-prepare, Spectro-2D, Spectro-1D pipelines run
- Links made in OPDB between object spectrum and identical imaging object. (2 hours)
- Spectro objects exported to SX (science archive).
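The imaging step durations quoted above can be added up to estimate the end-to-end latency for one night of data. This is a rough sketch under two assumptions that the outline does not state: the steps run strictly one after another, and "1.5 days" means 36 wall-clock hours.

```python
HOURS_PER_DAY = 24  # assumption: "1.5 days" means 36 wall-clock hours

# (step, hours) for one night of imaging data, in the order listed above.
IMAGING_STEPS = [
    ("load pipeline outputs into OPDB", 1.5 * HOURS_PER_DAY),
    ("merge with overlapping runs", 2),
    ("cross-run Q/A checks", 2),
    ("resolve chunks for target selection", 4),
    ("target selection run", 6),
    ("export imaging files to SX", 6),
]


def cumulative_hours(steps):
    """Return (step, elapsed hours at the end of that step) pairs."""
    elapsed = 0.0
    result = []
    for name, hours in steps:
        elapsed += hours
        result.append((name, elapsed))
    return result


timeline = cumulative_hours(IMAGING_STEPS)
for name, elapsed in timeline:
    print(f"{elapsed:5.1f} h  {name}")
```

Under these assumptions, target selection finishes about 50 hours after loading begins and the SX export about 56 hours after, with the OPDB load itself accounting for 36 of those hours.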
- Human Intervention Steps
- Q/A at end of each pipeline
- During Final photometric calibrations/MTpipe reductions (hope to reduce)
- Q/A during overlapping runs
- Reprocessing of old data which no longer conforms to current data model
- Feedback to mountain
- File/Tape format problems, iop
- Calibration files
- Confirmation of 'done' for imaging stripe or plate -- There is a lag
- Feedback to pipeline developers
- Bug reports, wish requests, file format problems
- Feedback to future observing schedulers
- Which data are really good, and which need to be redone
- Imaging: Which sections of which stripes ok -- Time Lag
- Spectroscopy: Which plates are complete -- Time Lag
- How Data is determined to be Good
- Imaging -- 1.5'' seeing, matchups between columns
(but in practice everything up to FWHM 3'' is put through;
the pipeline crashes if the sky is too variable)
- Spectroscopy -- On mountain S/N per fiber plot
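The two imaging seeing thresholds above suggest a simple three-way classification. The sketch below is illustrative only: the category labels are invented for the example, and treating each threshold as inclusive is an assumption.

```python
NOMINAL_SEEING_ARCSEC = 1.5   # stated imaging requirement
PRACTICAL_FWHM_ARCSEC = 3.0   # limit actually put through in practice


def classify_imaging(fwhm_arcsec):
    """Classify an imaging field by its seeing (FWHM in arcseconds)."""
    if fwhm_arcsec <= NOMINAL_SEEING_ARCSEC:
        return "meets requirement"
    if fwhm_arcsec <= PRACTICAL_FWHM_ARCSEC:
        return "processed, below requirement"
    return "not processed"
```

A real check would also need the column-to-column matchups and some measure of sky variability, neither of which is quantified in this outline.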
- Move data to the output 'Data Products'
- Creation of preliminary 'calibration object files' tsObj files for
early science and early problem diagnosis on FNAL machines
- SCP/FTP shove of tsObj files to collaboration members (data volume, disk on remote end, version control)
- Loading of SX with preliminary 'calibrated object catalog' -- No atlas yet -- calibrated version control
- Success judged by the number of quality science papers written and by feedback
of important bugs to pipeline developers
- Creation of 'final' 'calibrated object files'
- SCP shove of final calibrated object files
- Loading of SX with final calibrated object catalog, access to atlas images and related spectra
- Creation of general public distribution CD-ROMs and/or Internet access site.
- Example: large area color GIF image maps -- also used internally
- Corrected frames moved to tape and/or tape robot, as are data which cannot be kept spinning for lack of disk space