REQUIREMENTS FOR SDSS DATA MANAGEMENT SYSTEMS
June 5, 1995
Revision 1 - February 26, 2004
[All requirements are subject to budgetary,
manpower, and technical limitations]
The data management systems shall maintain all processed data from the SDSS
and provide access by SDSS scientists and operators in order to maximize the
ease of the following:
- Operate the SDSS survey so as to maximize the efficiency of operations.
- Perform Quality Analysis operations on the data so as to ensure its
integrity. The operations will verify calibrations, target selection criteria
and classifications, and completeness & accuracy.
- Provide SDSS scientists with access to the data and tools to permit
selection of spectroscopic targets for certain categories (serendipity;
stars).
- Provide SDSS scientists with access to the data so as to enable scientific
analyses.
Requirements for the Science Archive
The science archive shall consist of:
- A science database that shall:
- a. Retain resolved calibrated object catalogs (photometric
CCD output) for two sky versions of the data: Best and Target.
- b. Retain parameters from spectroscopic pipeline
- c. Retain references to atlas images and extracted spectra
- d. Provide ability to carry out manual target selection for
certain target categories
- e. Provide ability for SDSS scientists to extract subsets of
retained data.
- f. Provide smooth transition to public distribution system.
- A set of files tracked by the science database.
- A set of files not tracked by the science database.
A public version of the science database shall be a snapshot of the Science
Archive. The public version will include two versions of the sky: Best and
Target. It will not include all runs obtained in the course of survey operations.
An enhanced goal is to create a "Runs" database in addition to Target and Best
versions. A Runs database would contain every imaging scan obtained over the
course of operations.
I. Input to Science Archive
- Survey Definition
- a. A description of the North Imaging survey area
- b. Survey progress: A description of sky inserted into database
to date
- Final Astrometric Calibration
- a. List of calibration coefficients on a frame-by-frame
basis.
- b. Position errors stored on an object-by-object basis.
- Final Photometric Calibration
- a. List of photometric calibration coefficients on a
frame-by-frame basis.
- Merged Object Lists
- a. A list of calibrated objects and parameters from the Frames
pipeline of photo
- b. A list of masks derived from object masks from the
Frames pipeline of photo.
- c. Run, Rerun and Field information.
- d. Star/Galaxy classifications
- e. Target selection flags
- f. Status flags
- g. Cross-identifications to other
catalogs
- Target Selection
- a. A list of all targetable objects with target selection
categories
- b. A list of all objects from (5a) selected as targets with
selection category
- c. Tiling flags for all objects in b.
- Spectroscopic Pipeline
- a. Redshifts and parameters of all targeted objects
- b. Tile and plate information.
- c. Primary target designation, to identify primary
measurements of targets for which multiple spectra have been
obtained.
- Enhanced goal: Scientist derived catalogs
- Enhanced goal: Other input catalogs
- Separate files tracked from Science Database
- a. Atlas Images
- b. 1-D spectra
- c. Corrected frames
- d. Masks
- e. Binned frames
- f. fpFieldStat
- g. psField
- TBD: Southern Survey
II. Functional Goals
- User will be able to carry out efficient queries to locate objects over
one or more ranges of the following attributes:
- a. Longitude or latitude in several spherical coordinates
- i) J2000 Ra and Dec
- ii) Galactic coordinates
- iii) Survey Coordinates
- iv) Any linear combination of the two coordinates
- b. Radius around a given point on the sky
- c. u' g' r' i' z' (One set of magnitudes per object)
- d. Any linear combination of c.
- e. Object radius
- f. Surface brightness formed by c and d.
- g. Star/Galaxy classification flag
- h. Target Selection Category
- i. Spectrum available flag
- j. Status and photo flags
- User will be able to carry out queries on any retained object parameter.
- Enhanced Goal: All calibrated quantities can be recomputed using
improved astrometric and photometric calibrations. Queries can be performed on
the recalibrated quantities.
- For all efficient queries, return an estimated number of objects to be
located.
- For all located objects, users shall be able to specify an arbitrary
subset of stored parameters to be returned plus the number of located
objects.
- Users shall be able to perform the following functions:
- a. Proxy queries [e.g., get all objects within a given radius of each
of 10,000 QSOs in my favorite catalog].
- b. Formulate new queries based on results of previous queries.
- Users shall be able to query for database metadata:
- a. List of tables
- b. List of attributes
- c. List of enumerated constants with text descriptors.
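The functional goals above can be illustrated with a minimal Python sketch. It is not the archive's interface: the function names, the dictionary-based catalog, and the column names (`ra`, `dec`, `g`, `r`) are all hypothetical, and the brute-force scan stands in for whatever indexing an efficient implementation would use. It shows a radius query around a point, a query on a linear combination of magnitudes, a proxy query over an external position list, and the return of a user-chosen subset of parameters together with the object count.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def linear_combination(obj, weights):
    """Weighted sum of magnitudes, e.g. {"g": 1, "r": -1} gives g - r."""
    return sum(w * obj[band] for band, w in weights.items())


def cone_search(catalog, ra, dec, radius_deg):
    """All objects within radius_deg of the point (ra, dec)."""
    return [o for o in catalog
            if angular_separation_deg(o["ra"], o["dec"], ra, dec) <= radius_deg]


def proxy_query(catalog, positions, radius_deg):
    """One cone search per (ra, dec) pair taken from an external list."""
    return [cone_search(catalog, ra, dec, radius_deg) for ra, dec in positions]


def select(catalog, predicate, columns):
    """Return (number of located objects, rows holding only the requested columns)."""
    rows = [{c: o[c] for c in columns} for o in catalog if predicate(o)]
    return len(rows), rows


# Hypothetical three-object catalog, used only for illustration.
catalog = [
    {"id": 1, "ra": 180.000, "dec": 0.000, "g": 18.2, "r": 17.5},
    {"id": 2, "ra": 180.010, "dec": 0.005, "g": 19.0, "r": 18.9},
    {"id": 3, "ra": 185.000, "dec": 2.000, "g": 20.1, "r": 19.0},
]

# Objects redder than g - r = 0.5, returning only two of the stored parameters.
count, rows = select(
    catalog,
    lambda o: linear_combination(o, {"g": 1, "r": -1}) > 0.5,
    ["id", "g"],
)

# Proxy query: neighbours of each position in a user-supplied list.
matches = proxy_query(catalog, [(180.0, 0.0), (185.0, 2.0)], 0.02)
```

A new query based on previous results (goal b under the user functions) would, in this sketch, simply pass the list returned by one call as the `catalog` argument of the next.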
III. Technical Goals
- User interface
- a. User interface shall be http-based.
- b. User interface shall communicate with a query support layer
via ASCII interface protocol.
- c. Data shall be returnable on the sockets in ASCII, HTML,
or XML format.
- d. User interface shall be documented.
- Data shall be stored in a system providing an industry-standard OSQL-like
interface to enable use of commercial products to provide alternative views
of the database.
- Distributability
- a. A master copy of all data shall be maintained (the Master
Science Archive)
- b. Capability shall be present to replicate all or part of the
Master Science Archive as local databases at SDSS institutions. Replication
may consist of:
- i) Science Database in its entirety
- ii) All or part of separate files tracked by Science Database
- iii) No capability shall be present to replicate an
arbitrarily selected subset of the science database beyond that described
by section 1.c of USER INTERFACE.
- iv) The institution requesting replication shall be responsible
for providing the hardware that the database and/or files will be copied
onto.
- d. No capability is required to be present to replicate all or
part of separate files not tracked by Science Database
- Security
- a. Master Science Archive shall be protected against corruption
by SDSS participant users
- b. Master Science Archive shall be protected against
unauthorized access by non-SDSS participants.
- c. Computer security policies and procedures of the institution
hosting the Master Science Archive shall be followed.
- Version Retention (NEW)
- a. Two prior data release versions of the Science Database,
in addition to the current release, should be maintained on-line.
- System Availability (NEW)
- a. 99% system availability to the end user of the public
version of the current data release.
- b. 95% system availability to the end user of the current
collaboration version of the Science Database.
- c. 95% system availability to the end user on prior
release versions of the Science Database.
- d. For 95% uptime systems:
- i) Fault response time should be within 16 hours.
- ii) Fault recovery time should be within 48 hours.
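To put the availability figures in concrete terms, the following Python sketch converts a percentage into an allowed downtime budget. The requirements do not state the period over which availability is measured; a 365-day year is assumed here, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # assumed measurement window; not stated in the requirements


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted in the period at the given availability."""
    return period_hours * (100 - availability_percent) / 100


def max_full_length_outages(availability_percent, recovery_hours,
                            period_hours=HOURS_PER_YEAR):
    """Number of outages that each last the full recovery time and still fit the budget."""
    budget = allowed_downtime_hours(availability_percent, period_hours)
    return int(budget // recovery_hours)


public_budget = allowed_downtime_hours(99)          # public current release
collab_budget = allowed_downtime_hours(95)          # collaboration and prior releases
worst_case_outages = max_full_length_outages(95, 48)
```

Under the one-year assumption, 99% availability allows about 87.6 hours of downtime and 95% allows 438 hours. If every fault on a 95% system took the full 48 hours to recover, the budget would cover nine such faults per year.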