Sloan Digital Sky Survey
Review of Observing Systems and Survey Operations

A "Day" as an SDSS Observer
Scot Kleinman
April 19, 2000

Upon first arriving at APO, we locate the day staff and find out what has been happening to the telescope during the day. This keeps us informed of what procedural changes may be necessary and gives us a heads-up as to where to look when/if things go wrong later in the night.  If there is time remaining before the daily phone-con (which establishes observing priorities for the evening), we will start up the observing console and scan our email for important items and do some quick aliveness checks of the various systems. After the phone-con, we continue our aliveness checks (per a standardized check-list) and go to the telescope to check out the systems there.

Once the telescope is checked out (and the instrument changed if necessary), we continue in the control room with our standard checks and email reading. (We typically get between 30 and 60 emails a day, with subjects ranging from who is not going to be in today at the Solar Obs. in Sunspot to new procedures for safe imager handling.) With no problems, our standard checkout takes about one hour.  Once it is close enough to twilight, we remove the enclosure from the telescope and do the few remaining setup items we could not accomplish in the enclosure.  We then go back to the control room and start the telescope slewing to our first field.

During the night, we have roughly twenty-five (25) windows we use on the observer console.  Here is a highlighted list of some of them:

SOP - Spectrograph Operating Program
   fSAOImage - guider display for spectrograph guider
   PGPlot - guider information/statistics plot

IOP - Imaging Operating Program
   skyGang - tells us if we are really pointed/tracking where we think we are
   PGPlot - updated plot of the active focus loop

Murmur - a continuously scrolling window with tons of useful information,
         but not very easy to follow.

Watcher - a program which aims to catch important information from the
          Murmur log and other sources and warn us if something is not right.

Interlocks Display - an interlocks status display from the Watcher

System Status Display - shows telescope position, instrument specifics, etc.

Servers - controllers that provide the information to the Watcher from the various subsystems

TPM Display - real-time display of some of the TPM data (e.g., telescope and mirror positions)

MCPMenu - controls the telescope from the MCP (position, flat field
          screens, calibration lamps, spectrograph slithead clamps, etc.)

TCC - Telescope controller: another continuously scrolling display that
      has useful information, but is sometimes hard to read.  Controls
      tracking, pointing, and focus of telescope and more.
   tccMon - a separate program designed to display the most useful of
            the TCC information in a human readable form

Titrax - time tracking software for observer activities

Weather:
  displayweath - a full-screen, updating display of current weather conditions (T, dew point, etc.)
  wx - an hour-by-hour summary of weather conditions used in the night log
  Netscape - to view local or offsite weather pages to check for clouds,
             fronts, etc.  (Netscape is also used to check online
             documentation and obtain other SDSS observing information)

Editor - for night log

Email - constantly trying to catch up on the day's email traffic

QA - there are no QA tools at this time

Once we stop observing, we attempt to run the endNight [SI]OP script, which prepares the data tapes and does some other housekeeping routines. This process can often last hours.  Once that's started, we put away the telescope, fill/replace the LN2 dewars as necessary, and finish the night log. Once endNight successfully completes, we prepare the tapes for shipping, finish any other details, and head home. By this time, we are often overlapping with the day staff and can exchange information on the night's work.

Of course, we also should have a description here for crisis management, but although the need frequently arises, there is no telling where/when such work will be needed.

Efficiency

Most of the observing tools we have get the job done, but few are to the point where they are easy to use, convenient, or efficient.  The [SI]OP programs let us write custom scripts inside them, so that will be an efficiency plus once we get beyond the current round of development and bug-fixing and can devote some time to exploiting it.  I want to highlight some other areas of inefficiency - some of which can be improved easily (and many are being discussed) and others that we are probably just going to have to live with:

Spectro inefficiency: high overhead due to diamondPoints, centering, focusing.
                      no feedback on whether continued observations will be useful.
                      (most of this is currently under investigation)

Imaging inefficiency: having to do multiple lskips.
                      the need for endDrift and goDrift between setup and
                       imaging runs kills eight minutes each scan.
                      variable setup time means sometimes we are starting too
                        early, while other times we miss the desired starting
                        point.
                      IOP more or less works, but needs to be optimized and made
                       more convenient for the observer.
                      endNight MUST work quickly, obviously and reliably.

Night Logs: we log many things which would be better done automatically.

Telescope enclosure: some engineering/setup tasks we can do in the
                      building; others we can't.
                     taking it off during the day heats the telescope up such that
                      we end up spending the first few night hours chasing the
                      seeing as things relax.
                     closing the building takes time and can be held up by a
                      number of things going wrong. Sooner or later, in an
                      emergency closing situation, we WILL get bitten by this.

Interlocks:  an important safeguard, but we often lose time figuring out which
              interlock tripped to prevent us from doing our current task.
              The feedback system should be natural and obvious.

Watcher: error messages are often opaque. This is not necessarily its fault,
          as it is not always being passed meaningful errors (a key problem
          in most of our software). There is also a danger of ignoring
          messages if they are too common or not critical.

Information overload: scrolling murmur and TCC not easy to use.  The Watcher
                       and tccMon help, but do not eliminate this problem.

QA - we do not yet have sufficient tools to monitor data quality during
     observations.  This is NOT good (and is being worked on, in part).

DA - system is much more reliable now than in the past, but depending
     on tape drives night after night can still be a source of data loss. We
     seem to experience higher than expected "bad" tape rates.

Documentation - there isn't enough. (Ellyne discussed this in more detail.)

Staffing

Dark runs seem to be averaging around 18 nights. We find it necessary to interact with the day staff before each observing night, and due to the endNight process and the advantages of meeting again with the day staff at the end of the night, our "day" is something like 4pm - 8am. Thus we typically run two shifts a night that are 9-10 hours each.  We need two observers per shift to be able to monitor the myriad systems mentioned above as well as to handle the night's problems.  This schedule provides some overlap between the teams during the shift change, but working a two-shift-a-night schedule with three observing teams has proven quite awkward and has resulted in occasional zombie-like observers.  This arrangement requires us to continuously switch from the first shift to the second shift (and back again) during a run.  It has been shown in many studies that constant shift changes lead to increased exhaustion rates and employee unhappiness. We can verify this. The current arrangement is not sustainable over a five-year survey, and data uniformity could eventually suffer.

In addition to the scheduled observing run, we have been needing about three nights at the beginning of the dark run to "shake out" new bugs in the systems (many of which have changed since the versions used in the previous dark run) and are now talking about adding an additional night of testing at the end of each run. This totals, then, about 22 nights (18 scheduled + 3 shake-out + 1 test) * 4 people/night * 9 hours/person = 792 hours/run.

At our current staffing level of 6 people working 40 hours/week, we have an available pool of approx. 960 hours/month.  Thus, after the 792 hours of observing, we are left with 168 hours, or about three (9hr) days of time available per observer. It is in these remaining three days that the observer can a) process what went right/wrong in the last run, b) make improvements to the operating software, instruments, documentation, etc., c) check and develop quality control and long-term monitoring projects of data and operating systems, not to mention d) engage in some scientific research with the data.
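
For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. The function and variable names are purely illustrative. It assumes one dark run per month and a four-week month (which is what the 960 hours/month figure implies), and it takes 9 hours as the shift length even though shifts actually run 9-10 hours.

```python
# Staffing arithmetic from the text. Names are illustrative only.
SCHEDULED_NIGHTS = 18     # average dark-run length
SHAKEDOWN_NIGHTS = 3      # start-of-run nights spent shaking out bugs
END_TEST_NIGHTS = 1       # proposed extra testing night at the end of a run
OBSERVERS_PER_NIGHT = 4   # two shifts of two observers each
HOURS_PER_SHIFT = 9       # lower end of the 9-10 hour shift
HOURS_PER_WEEK = 40
WEEKS_PER_MONTH = 4       # assumption implied by the 960 hours/month figure


def run_budget(staff=6):
    """Return the observer-hour budget for one dark run (one run per month assumed)."""
    nights = SCHEDULED_NIGHTS + SHAKEDOWN_NIGHTS + END_TEST_NIGHTS
    observing_hours = nights * OBSERVERS_PER_NIGHT * HOURS_PER_SHIFT
    pool_hours = staff * HOURS_PER_WEEK * WEEKS_PER_MONTH
    leftover_per_observer = (pool_hours - observing_hours) / staff
    return {
        "nights": nights,
        "observing_hours": observing_hours,
        "pool_hours": pool_hours,
        "leftover_hours_per_observer": leftover_per_observer,
        "leftover_days_per_observer": leftover_per_observer / HOURS_PER_SHIFT,
    }


if __name__ == "__main__":
    for key, value in run_budget().items():
        print(f"{key}: {value}")
```

Calling run_budget() with a different staff value shows how the leftover time per observer scales with the size of the observing group, under the same assumptions.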

Clearly, the hours just aren't there to allow the observers to take an active role in developing and improving the observing systems and tools. It is usually the case that the best astronomical instruments (be they hardware or software) are developed by the people who are actually going to use them. Despite having the required skill-set to implement the needed last 10% of systems development, the observers simply do not have the time to do so.  Most astronomers will also agree that the best data are taken by people who want to use the data and therefore the SDSS project wisely chose to hire scientists to perform the observations. Scientists are not likely to be happy if not given the chance to actually do science.

A fourth team of observers would both eliminate the need to switch shifts and allow more time for the activities mentioned above, which are simply falling through the cracks now.

It is one of our biggest frustrations that we feel we have the tools and skills to improve our operating environment and contribute to the scientific output of the SDSS, but do not have the necessary amount of time to do so. While we realize everyone suffers from a lack of available time, we believe that given more time, we can greatly benefit the SDSS and can improve the satisfaction we each derive from our jobs as well.



Review of Observing Systems and Survey Operations
Apache Point Observatory
April 25-27, 2000