Proceedings Volume 5493

Optimizing Scientific Return for Astronomy through Information Technologies

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 16 September 2004
Contents: 5 Sessions, 65 Papers, 0 Presentations
Conference: SPIE Astronomical Telescopes + Instrumentation 2004
Volume Number: 5493

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.

Sessions:
  • Operations Systems: Metrics and New Concepts
  • Virtual Observatories, Archives, and Data Mining
  • Operations Systems: Phase 1, Phase 2, and Scheduling
  • Data Processing Systems: Hardware and Algorithms
  • Operations Systems: Metrics and New Concepts
  • Virtual Observatories, Archives, and Data Mining
  • Data Quality Control
  • Operations Systems: Phase 1, Phase 2, and Scheduling
  • Data Processing Systems: Hardware and Algorithms
Operations Systems: Metrics and New Concepts
Software for the EVLA
Bryan J. Butler, Gustaaf van Moorsel, Doug Tody
The Expanded Very Large Array (EVLA) project is the next generation instrument for high resolution long-millimeter to short-meter wavelength radio astronomy. It is currently funded by NSF, with completion scheduled for 2012. The EVLA will upgrade the VLA with new feeds, receivers, data transmission hardware, correlator, and a new software system to enable the instrument to achieve its full potential. This software includes both that required for controlling and monitoring the instrument and that involved with the scientific dataflow. We concentrate here on a portion of the dataflow software, including: proposal preparation, submission, and handling; observation preparation, scheduling, and remote monitoring; data archiving; and data post-processing, including both automated (pipeline) and manual processing. The primary goals of the software are: to maximize the scientific return of the EVLA; provide ease of use, for both novices and experts; exploit commonality amongst all NRAO telescopes where possible. This last point is both a bane and a blessing: we are not at liberty to do whatever we want in the software, but on the other hand we may borrow from other projects (notably ALMA and GBT) where appropriate. The software design methodology includes detailed initial use-cases and requirements from the scientists, intimate interaction between the scientists and the programmers during design and implementation, and a thorough testing and acceptance plan.
Supporting observatory operations: the archive in the middle
Going from the astronomer submitting an observing proposal to the reception of the data from the corresponding observations requires quite a number of steps: TAC/OPC selection, Phase II, scheduling, observations, quality control, data pre-reduction and data delivery. In this contribution, the architecture of ESO's data flow, and in particular the evolution of the concept, role and even definition of the archive, is presented. Originally located at the tail end of the ESO data flow, where it played a mostly technical, operational role, the archive is now the central repository of information about the observations.
VLT interferometry in the data flow system
Pascal Ballester, Tom Licha, Derek J. McKay, et al.
Science interferometry instruments are now available at the Very Large Telescope for observations in service mode; the MID-Infrared interferometric instrument, MIDI, started commissioning and was opened to observations in 2003, and the AMBER 3-beam instrument will follow in 2004. The Data Flow System is the VLT end-to-end software system for handling astronomical observations from the initial observation proposal phase through to the acquisition, archiving, processing, and control of the astronomical data. In this paper we present the interferometry-specific components of the Data Flow System and the software tools which are used for the VLTI.
Science returns of flexible scheduling on UKIRT and the JCMT
Andrew J. Adamson, Remo P.J. Tilanus, Jane Buckle, et al.
The Joint Astronomy Centre operates two telescopes at the Mauna Kea Observatory: the James Clerk Maxwell Telescope, operating in the submillimetre, and the United Kingdom Infrared Telescope, operating in the near and thermal infrared. Both wavelength regimes benefit from the ability to schedule observations flexibly according to observing conditions, albeit via somewhat different "site quality" criteria. Both UKIRT and JCMT now operate completely flexible schedules. These operations are based on telescope hardware which can quickly switch between observing modes, and on a comprehensive suite of software (ORAC/OMP) which handles observing preparation by remote PIs, observation submission into the summit database, conditions-based programme selection at the summit, pipeline data reduction for all observing modes, and instant data quality feedback to the PI who may or may not be remote from the telescope. This paper describes the flexible scheduling model and presents science statistics for the first complete year of UKIRT and JCMT observing under the combined system.
Science Goal Monitor: science goal driven automation for NASA missions
Anuradha Koratkar, Sandy Grosvenor, John Jung, et al.
Infusion of automation technologies into NASA's future missions will be essential because of the need to: (1) effectively handle an exponentially increasing volume of scientific data, (2) successfully meet dynamic, opportunistic scientific goals and objectives, and (3) substantially reduce mission operations staff and costs. While much effort has gone into automating routine spacecraft operations to reduce human workload and hence costs, applying intelligent automation to the science side, i.e., science data acquisition, data analysis and reactions to that data analysis in a timely and still scientifically valid manner, has been relatively under-emphasized. In order to introduce science driven automation in missions, we must be able to: capture and interpret the science goals of observing programs, represent those goals in machine interpretable language; and allow spacecrafts' onboard systems to autonomously react to the scientist's goals. In short, we must teach our platforms to dynamically understand, recognize, and react to the scientists' goals. The Science Goal Monitor (SGM) project at NASA Goddard Space Flight Center is a prototype software tool being developed to determine the best strategies for implementing science goal driven automation in missions. The tools being developed in SGM improve the ability to monitor and react to the changing status of scientific events. The SGM system enables scientists to specify what to look for and how to react in descriptive rather than technical terms. The system monitors streams of science data to identify occurrences of key events previously specified by the scientist. When an event occurs, the system autonomously coordinates the execution of the scientist's desired reactions. Through SGM, we will improve our understanding about the capabilities needed onboard for success, develop metrics to understand the potential increase in science returns, and develop an "operational" prototype so that the perceived risks associated with increased use of automation can be reduced. SGM is currently focused on two collaborations: 1. Yale University's SMARTS (Small and Moderate Aperture Research Telescope System) observing program - Modeling and testing ways in which SGM can be used to improve scientific returns on observing programs involving intrinsically variable astronomical targets. 2. The EO-1 (Earth Observing-1) mission - Modeling and testing ways in which SGM can be used to autonomously coordinate multiple platforms based on a set of scientific criteria. In this paper, we will discuss the status of the SGM project focusing primarily on our progress with the SMARTS collaboration.
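As an illustration of the monitor-and-react loop described above, the following minimal sketch registers (condition, reaction) rules and triggers them on incoming measurements. It is purely schematic: the rule structure, field names and the flare-follow-up example are assumptions, not the SGM implementation.

    # Minimal sketch of a science-goal monitor: scientists register rules as
    # (condition, reaction) pairs; the monitor scans incoming measurements and
    # triggers the reaction when a condition is met. Names are illustrative only.
    from dataclasses import dataclass
    from typing import Callable, Iterable


    @dataclass
    class Rule:
        name: str
        condition: Callable[[dict], bool]   # "what to look for"
        reaction: Callable[[dict], None]    # "how to react"


    class GoalMonitor:
        def __init__(self, rules: Iterable[Rule]):
            self.rules = list(rules)

        def process(self, measurement: dict) -> None:
            # Check every registered science goal against the new data point.
            for rule in self.rules:
                if rule.condition(measurement):
                    rule.reaction(measurement)


    # Example goal: re-observe a variable target when it brightens by > 0.5 mag.
    def brightened(m: dict) -> bool:
        return m["baseline_mag"] - m["mag"] > 0.5


    def request_followup(m: dict) -> None:
        print(f"Requesting follow-up observation of {m['target']}")


    monitor = GoalMonitor([Rule("flare follow-up", brightened, request_followup)])
    monitor.process({"target": "V1234 Ori", "mag": 11.2, "baseline_mag": 12.0})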
The scientific output of the Hubble Space Telescope from objective metrics
Georges Meylan, Brett S. Blacker, Duccio Macchetto, et al.
After a decade of Hubble Space Telescope (HST) operations, observations, and publications, the Space Telescope Science Institute (STScI) decided it was pertinent to measure the scientific effectiveness of the HST observing programs. To this end, we have developed a methodology and a set of software tools to measure - quantitatively and objectively - the impact of HST observations on astrophysical research. We have gathered Phase I and Phase II information on the observing programs from existing STScI databases, among them the Multi-mission Archive at Space Telescope (MAST). We have gathered numbers of refereed papers and their citations from the Institute for Scientific Information (ISI) and the NASA Astrophysics Data System (ADS), cross-checking information and verifying that our information is as complete and reliable as possible. We have created a unified database with links connecting any specific set of HST observations to one or more scientific publications. We use this system to evaluate the scientific outcomes of HST observations according to type and time. In this paper, we present a few such HST metrics that we are using to evaluate the scientific effectiveness of the Hubble Space Telescope.
Moor: web access to end-to-end data flow information at ESO
Alberto Maurizio Chavan, Michele Peron, Judith Anwunah, et al.
All ESO Science Operations teams operate on Observing Runs, loosely defined as blocks of observing time on a specific instrument. Observing Runs are submitted as part of an Observing Proposal and executed in Service or Visitor Mode. As an Observing Run progresses through its life-cycle, more and more information gets associated with it: referee reports, feasibility and technical evaluations, constraints, pre-observation data, science and calibration frames, etc. The Manager of Observing Runs project (Moor) will develop a system to collect operational information in a database, offer integrated access to information stored in several independent databases, and allow HTML-based navigation over the whole information set. Some Moor services are also offered as extensions to, or complemented by, existing desktop applications.
Adapting the VLT data flow system for handling high-data rates
Stefano Zampieri, Michele Peron, Olivier Chuzel, et al.
The Data Flow System (DFS) for the ESO VLT provides a global approach to handle the flow of science related data in the VLT environment. It is a distributed system composed of a collection of components for preparation and scheduling of observations, archiving of data, pipeline data reduction and quality control. Although the first version of the system became operational in 1999 together with the first UT, additional developments were necessary to address new operational requirements originating from new and complex instruments which generate large amounts of data. This paper presents the hardware and software changes made to meet those challenges within the back-end infrastructure, including on-line and off-line archive facilities, parallel/distributed pipeline processing and improved association technologies.
SOAR remote observing: tactics and early results
Travel from North America to the 4.1m SOAR telescope atop Cerro Pachon exceeds $1000, and takes >16 hours door to door (20+ hours typically). SOAR aims to exploit best seeing, requiring dynamic scheduling that is impossible to accomplish when catering to peripatetic astronomers. According to technical arguments at www.peakoil.org, we are near the peak rate of depleting world petroleum, so can expect travel costs to climb sharply. With the telecom bubble's glut of optical fiber, we can transmit data more efficiently than astronomers and "observe remotely". With data compression, less than half of the 6 Mbps bandwidth shared currently by SOAR and CTIO is enough to enable a high-fidelity observing presence for SOAR partners in North America, Brazil, and Chile. We discuss access from home by cable modem/DSL link.
Remote observation and observation efficiency
More than three years have passed since Subaru Telescope started its Open Use operation. Currently, more than 60% of the total telescope time is spent on scientific observation. First, we define an index to measure how effectively the telescope is used. Using this index, we review the use of the telescope since 2000. Remote observation and queue observation are long-term goals of Subaru operation because they are believed to be a more efficient way to use the telescope and the available resources. Control and observation software has been designed and developed to support remote and queue observation. Currently, about 30% of the telescope time is used for remote observation. We discuss how much remote observation has contributed to making the use of the telescope effective.
A free market in telescope time?
As distributed systems become more and more diverse in application, there is a growing need for more intelligent resource scheduling. eSTAR is a geographically distributed network of Grid-enabled telescopes, using Grid middleware to provide telescope users with an authentication and authorisation method, allowing secure, remote access to such resources. The eSTAR paradigm is based upon this secure single sign-on, giving astronomers or their agent proxies direct access to these telescopes. This concept, however, involves the complex issue of how to schedule observations stored within physically distributed media on geographically distributed resources. The matter is complicated further by the varying constraints placed upon observations, such as timeliness, atmospheric and meteorological conditions, and sky brightness, to name a few. This paper discusses a free-market approach to this scheduling problem, where astronomers are given credit, instead of time, from their respective TAGs to spend on telescopes as they see fit. This approach will ultimately provide a community-driven schedule, give genuine indicators of the worth of specific telescope time and promote a more efficient use of that time, as well as demonstrating a 'survival of the fittest' type of selection.
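A minimal sketch of the credit-based idea described above follows: only observations whose constraints match the current conditions may bid for a slot, and the highest credit bid wins. The constraint fields, units and bidding rule are illustrative assumptions, not the eSTAR design.

    # Illustrative credit-market slot allocation (not the eSTAR implementation).
    from dataclasses import dataclass


    @dataclass
    class Observation:
        proposer: str
        max_seeing: float    # arcsec; worst seeing the programme will accept
        min_sky_dark: float  # mag/arcsec^2; sky must be at least this dark
        bid: float           # credits the proposer is willing to spend on this slot


    def select_winner(pool, seeing, sky_brightness):
        """Return the highest bidder whose constraints are met, or None."""
        eligible = [o for o in pool
                    if seeing <= o.max_seeing and sky_brightness >= o.min_sky_dark]
        return max(eligible, key=lambda o: o.bid, default=None)


    pool = [
        Observation("A", max_seeing=0.8, min_sky_dark=21.0, bid=120.0),
        Observation("B", max_seeing=1.5, min_sky_dark=19.5, bid=300.0),
    ]
    winner = select_winner(pool, seeing=1.1, sky_brightness=20.0)
    print(winner.proposer if winner else "no eligible bid")   # -> B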
Virtual Observatories, Archives, and Data Mining
The International Virtual Observatory Alliance: recent technical developments and the road ahead
Peter J. Quinn, David G. Barnes, Istvan Csabai, et al.
The International Virtual Observatory Alliance (IVOA: http://www.ivoa.net) represents 14 international projects working in coordination to realize the essential technologies and interoperability standards necessary to create a new research infrastructure for 21st century astronomy. This international Virtual Observatory will allow astronomers to interrogate multiple data centres in a seamless and transparent way, will provide new powerful analysis and visualisation tools within that system, and will give data centres a standard framework for publishing and delivering services using their data. The first step for the IVOA projects is to develop the standardised framework that will allow such creative diversity. Since its inception in June 2002, the IVOA has already fostered the creation of a new international and widely accepted, astronomical data format (VOTable) and has set up technical working groups devoted to defining essential standards for service registries, content description, data access, data models and query languages following developments in the grid community. These new standards and technologies are being used to build science prototypes, demonstrations, and applications, many of which have been shown in international meetings in the past two years. This paper reviews the current status of IVOA projects, the priority areas for technical development, the science prototypes and planned developments.
AstroGrid: powering science from multistreamed data
Nicholas A. Walton, Andrew Lawrence, Anthony E. Linde
The AstroGrid (http://www.astrogrid.org) project is developing a virtual observatory capability to support efficient and effective exploitation of key astronomical data sets of importance to the UK community. Its initial focus is providing the necessary data-grid infrastructure and data-mining tools to support data generated by projects such as WFCAM, VISTA, e-MERLIN, SOHO and Cluster. AstroGrid is a partnership formed by UK archive centres and astronomical computer scientists. Key capabilities of AstroGrid enable multi-disciplinary astronomy, making use of data streams from frontline astronomical instrumentation. This paper presents the development and deployment plans of AstroGrid, describing the products and capabilities already released through the fifth project iteration release. Use of these in early-adopter science programmes is noted. AstroGrid is a strongly science-driven project that aims to deploy relevant aspects of Grid and Data-Grid technologies. These are discussed here, while in-depth treatments of specific AstroGrid technological developments, such as the collaborative workspaces provided by MySpace, are given elsewhere in this conference. Finally, AstroGrid's close involvement in broader European initiatives, the Astrophysical Virtual Observatory (AVO) and the International Virtual Observatory Alliance (IVOA), is highlighted.
Virtual observatory standards in action
Mark George Allen, Sebastian Derriere, Francois Bonnarel, et al.
Interoperability in Virtual Observatories is based on standards for interchange formats, and protocols. Using real science cases, we present example VO tools based on the AVO prototypes and the CDS services, which implement the new and emerging VO standards. We discuss how these standards, such as VOTable, UCDs and data models enable interoperability between software components, and provide efficient and flexible means for data access. We show how UCDs are used in catalogue filtering and selection of relevant columns. We also demonstrate a simple yet powerful method for accessing data from image archives, customized data sets and local data within a single environment.
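The following sketch illustrates UCD-based column selection: columns are picked by the physical meaning encoded in their UCD rather than by archive-specific column names. The in-memory table layout and the UCD1-style strings are illustrative assumptions, not the AVO/CDS implementation.

    # Sketch: select catalogue columns by their UCD rather than by column name.
    # The table structure and UCD strings (UCD1-style) are illustrative only.
    catalogue = {
        "columns": [
            {"name": "RAJ2000", "ucd": "POS_EQ_RA_MAIN",  "values": [10.68, 83.82]},
            {"name": "DEJ2000", "ucd": "POS_EQ_DEC_MAIN", "values": [41.27, -5.39]},
            {"name": "Vmag",    "ucd": "PHOT_JHN_V",      "values": [3.4, 4.0]},
        ]
    }


    def columns_with_ucd(table, ucd_prefix):
        """Return all columns whose UCD starts with the requested prefix."""
        return [c for c in table["columns"] if c["ucd"].startswith(ucd_prefix)]


    # Pick out the positional columns regardless of how the archive named them.
    for col in columns_with_ucd(catalogue, "POS_EQ"):
        print(col["name"], col["values"])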
The Virtual Solar Observatory: status and initial operational experience
Frank Hill, Richard S. Bogart, Alisdair Davey, et al.
The Virtual Solar Observatory (VSO) is a bottom-up grassroots approach to the development of a distributed data system for use by the solar physics community. The beta testing version of the VSO was released in December 2003. Since then it has been tested by approximately 50 solar physicists. In this paper we will present the status of the project, a summary of the community's experience with the tool, and an overview of the lessons learned.
The European grid of solar observations
The European Grid of Solar Observations (EGSO) is a Grid test-bed that will change the way users analyze solar data. One of the major hurdles in the analysis of solar data is finding what data are available and retrieving those required. EGSO is integrating the access to solar data by building a Grid including solar archives around the world. The Grid relies on metadata and tools for selecting, processing and retrieving distributed and heterogeneous solar data. EGSO is also creating a solar feature catalogue giving for the first time the ability to select solar data based on phenomena and events. In essence, EGSO is providing the fabric of a virtual observatory. Since the first release of EGSO in September 2003, members of the solar community have been involved in product testing. The constant testing and feedback allows us to assure the usability of the system. The capabilities of the latest release will be described, and the scientific problems that it addresses discussed. EGSO is funded under the IST (Information Society Technologies) thematic priority of the European Commission's Fifth Framework Programme (FP5) – it started in March 2002 and will last for three years. The EGSO Consortium comprises 11 institutes from Europe and the US and is led by the Mullard Space Science Laboratory of University College London. EGSO is collaborating with other groups in the US who are working on similar virtual observatory projects for solar and heliospheric data with the objective of providing integrated access to these data.
Mining the LAMOST spectral archive
The Large sky Area Multi-Object fibre Spectroscopic Telescope will yield 10 million spectra of a wide variety of objects including QSOs, galaxies and stars. The data archive of one-dimensional spectra, which will be released gradually during the survey, is expected to exceed 1 terabyte in size. This archive will enable astronomers to explore the data interactively through a friendly user interface. Users will be able to access information related to the original observations as well as spectral parameters computed by means of an automated data-reduction pipeline. Data mining tools will enable detailed clustering, characterization and classification analyses. The LAMOST data archive will be made publicly available in the standard data format for Virtual Observatories and in a form that will be fully compatible with future Grid technologies.
The Large Binocular Camera image simulator: predicting the performances of LBC
Andrea Grazian, Adriano Fontana, Cristian De Santis, et al.
The LBC (Large Binocular Camera) Image Simulator is a package for generating artificial images in the typical FITS format. It operates on real or artificial images, simulating the expected performance of real instruments under several observing conditions (filter, air-mass, flat-field, exposure time) and creating images with the LBC instrumental artifacts (optical deformations, noise, CCD architecture). The simulator can also be used to produce artificial images for other existing and future telescopes, since its structure is very flexible. The main aim of LBCSIM is to support the development of pipelines and data analysis procedures able to cope with wide-field imaging and the fast reduction of huge amounts of photometric data. The software consists of three stand-alone programs written in C, using IRAF and running under Linux. The LBC Image Simulator is built with particular attention to Virtual Observatory and Data Grid applications. In this paper, we first describe the software, its performance, several tests carried out before the public release, and some examples for users. In particular, we compared the Hubble Deep Field South (HDFS) as seen by FORS1 with a simulated image and found that the agreement is good. Then, we use this software to predict the expected performance of the LBC instrument by means of realistic simulations of deep field observations with the LBT telescope.
Data modeling for virtual observatory data mining
Holger M. Jaenisch, James Handley, Albert Lim, et al.
We propose a novel approach for index-tagging Virtual Observatory data files with descriptive statistics, enabling rapid data mining and mathematical modeling. This is achieved by calculating, at data collection time, six standard moments as descriptive file tags (for example, <869.47 -3.27 41.37 602.25 10053.48 620.0042>). Data Change Detection Models are derived from these tags and used to filter databases for similar or dissimilar information such as stellar spectra, photometric data, images, and text. Currently, no consistent or reliable method for searching, collating, and comparing 2-D imagery exists. Traditionally, methods used to address these data problems are disparate and unrelated to text data mining and extraction. We explore the use of mathematical Data Models as a unifying tool set for enabling data mining across all data class domains.
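A minimal sketch of the moment-tagging idea follows; the particular set of six statistics and the distance measure are assumptions for illustration and may differ from the authors' choices.

    # Sketch of index-tagging a data file with six descriptive statistics.
    # The choice of moments here (mean, variance, skewness, kurtosis, minimum,
    # maximum) is illustrative; the paper's exact set may differ.
    import math


    def moment_tag(values):
        n = len(values)
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        std = math.sqrt(var)
        skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std else 0.0
        kurt = sum((v - mean) ** 4 for v in values) / (n * var ** 2) if var else 0.0
        return (mean, var, skew, kurt, min(values), max(values))


    # Tags from two files can then be compared directly, e.g. by Euclidean
    # distance, to filter a database for "similar" or "dissimilar" content.
    def tag_distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


    print(moment_tag([1.0, 2.0, 2.5, 3.0, 10.0]))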
Montage: a grid-enabled engine for delivering custom science-grade mosaics on demand
G. Bruce Berriman, Ewa Deelman, John C. Good, et al.
This paper describes the design of a grid-enabled version of Montage, an astronomical image mosaic service, suitable for large scale processing of the sky. All the re-projection jobs can be added to a pool of tasks and performed by as many processors as are available, exploiting the parallelization inherent in the Montage architecture. We show how we can describe the Montage application in terms of an abstract workflow so that a planning tool such as Pegasus can derive an executable workflow that can be run in the Grid environment. The execution of the workflow is performed by the workflow manager DAGMan and the associated Condor-G. The grid processing will support tiling of images to a manageable size when the input images can no longer be held in memory. Montage will ultimately run operationally on the Teragrid. We describe science applications of Montage, including its application to science product generation by Spitzer Legacy Program teams and large-scale, all-sky image processing projects.
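The workflow structure described above, independent re-projection tasks feeding a final co-addition, can be sketched generically as below. This is not the Pegasus or DAGMan interface; the function names and the use of a local process pool standing in for Grid execution are assumptions.

    # Generic sketch of the "abstract workflow" idea: one re-projection task per
    # input image, all feeding a final co-addition task.
    from concurrent.futures import ProcessPoolExecutor


    def reproject(image_name):
        # Placeholder for the per-image re-projection step.
        return f"{image_name}.reprojected"


    def coadd(reprojected_images):
        # Placeholder for background matching and co-addition into a mosaic.
        return f"mosaic({len(reprojected_images)} tiles)"


    def run_workflow(images, workers=4):
        # The re-projection jobs are independent, so they form the parallel
        # "pool of tasks"; the co-addition depends on all of them.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            tiles = list(pool.map(reproject, images))
        return coadd(tiles)


    if __name__ == "__main__":
        print(run_workflow([f"tile_{i}.fits" for i in range(8)]))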
Integrating existing software toolkits into VO system
Chenzhou Cui, Yong-Heng Zhao, Xiaoqian Wang, et al.
The Virtual Observatory (VO) is a collection of interoperating data archives and software tools. Taking advantage of the latest information technologies, it aims to provide a data-intensive online research environment for astronomers around the world. A large number of high-quality astronomical software packages and libraries are powerful and easy to use, and have been widely used by astronomers for many years. Integrating those toolkits into the VO system is a necessary and important task for VO developers. The VO architecture depends heavily on Grid and Web services, so the general VO integration route is "Java Ready – Grid Ready – VO Ready". In this paper, we discuss the importance of VO integration for existing toolkits and the possible solutions. We introduce two efforts in this field from the China-VO project, "gImageMagick" and "Galactic abundance gradients statistical research under grid environment". We also discuss what additional work should be done to convert a Grid service into a VO service.
Operations Systems: Phase 1, Phase 2, and Scheduling
Scheduling simulation in ALMA
Allen R. Farris
The scheduling subsystem within the ALMA software system is designed to manage the execution of approved observing projects. Since weather will play such an important role in science observations with ALMA, the telescope will operate primarily in a dynamic scheduling mode. Current environmental conditions together with the current state of the telescope itself will be used to algorithmically determine the best scheduling block to execute at any time. In addition to the on-line system, a simulation tool is a significant part of this system that will be used for short and long-term planning. It implements the fundamental concepts used to solve this problem of dynamic scheduling and is capable of using historical weather data. The architecture and function of this simulation tool are presented.
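A schematic dynamic-scheduling step in this spirit is sketched below: every executable scheduling block is scored against the current conditions and the best one is selected. The block attributes, the water-vapour constraint and the scoring function are illustrative assumptions, not the ALMA algorithm.

    # Illustrative dynamic-scheduling step (not the ALMA algorithm): score every
    # executable scheduling block against current conditions and pick the best.
    from dataclasses import dataclass


    @dataclass
    class SchedBlock:
        name: str
        priority: float           # scientific ranking, higher is better
        max_pwv_mm: float         # worst acceptable precipitable water vapour
        min_elevation_deg: float


    def best_block(blocks, pwv_mm, elevation_of):
        """Return the highest-scoring block observable right now, or None."""
        candidates = []
        for b in blocks:
            elev = elevation_of(b.name)
            if pwv_mm <= b.max_pwv_mm and elev >= b.min_elevation_deg:
                # Simple score: priority weighted by the weather margin.
                score = b.priority * (b.max_pwv_mm - pwv_mm + 1.0)
                candidates.append((score, b))
        return max(candidates, key=lambda t: t[0])[1] if candidates else None


    blocks = [SchedBlock("high-freq map", 9.0, 0.5, 40.0),
              SchedBlock("low-freq survey", 5.0, 3.0, 25.0)]
    elevations = {"high-freq map": 55.0, "low-freq survey": 30.0}
    print(best_block(blocks, pwv_mm=1.2, elevation_of=elevations.get).name)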
Resolving inherent planning and scheduling conflicts in HST's cycle 12
Ian J. E. Jordan, William M. Workman III, Tricia J. Royle, et al.
Introduction of the Large Proposal category for HST observing in Cycle 11 resulted in a significant migration toward multiple observing programs requiring 100 or more orbits on single target areas. While relatively benign in the inaugural Cycle, this policy created a formidable planning problem in Cycle 12 due to acceptance of several large programs with identical or closely located targets. The nature of this observing pool revealed shortcomings in the established processes for building an integrated HST science plan. Historically it has not been difficult to normalize individual programs within the overall HST observing plan. However, conflicts arising from competing demands and overlapping time windows in Cycle 12 necessitated compromises between programs at a more significant scale than experienced ever before. The planning tools and techniques needed to change rapidly in response, and communication both within the STScI and between the STScI and the affected observers was more crucial than ever before. Large and small-scale changes to major observing programs were necessary to create a viable integrated observing plan. This paper describes the major features of the Cycle 12 observing pool, the impact it had on the STScI front-end operations processes and how an executable Cycle 12 HST observing program was achieved.
Enhancing science program collaboration at Gemini
Kim Gillies, Shane Walker, Allan Brighton
At Gemini, support for observers during Phase 2 is a collaborative effort among individuals spread across four continents at many institutions. Short Phase 2 preparation periods necessitate close communication between observers and support personnel. Email alone has not been an adequate solution. For the 2003B semester, the Gemini Observing Tool has been extended to allow off-site investigators and national project office support personnel to directly access the science program database. The observer is able to keep up to date with changes by accessing his program at any time. Email notifications are generated automatically when activities occur in the science program lifecycle. This paper will give an overview of how this system, based upon Java and freely available open source software, provides these new capabilities.
Data Processing Systems: Hardware and Algorithms
VISTA data flow system: overview
James P. Emerson, Mike J. Irwin, Jim Lewis, et al.
Data from the two IR survey cameras WFCAM (at UKIRT in the northern hemisphere) and VISTA (at ESO in the southern hemisphere) can arrive at rates approaching 1.4 TB/night for of order 10 years. Handling the data rates on a nightly basis, and the volumes of survey data accumulated over time each present new challenges. The approach adopted by the UK's VISTA Data Flow System (for WFCAM & VISTA data) is outlined, emphasizing how the design will meet the end-to-end requirements of the system, from on-site monitoring of the quality of the data acquired, removal of instrumental artefacts, astrometric and photometric calibration, to accessibility of curated and user-specified data products in the context of the Virtual Observatory. Accompanying papers by Irwin et al and Hambly et al detail the design of the pipeline and science archive aspects of the project.
VISTA data flow system survey access and curation: the WFCAM science archive
Nigel C. Hambly, Robert G. Mann, Ian Bond, et al.
VISTA Data Flow System (VDFS) survey data products are expected to reach of order one petabyte in volume. Fast and flexible user access to these data is pivotal for efficient science exploitation. In this paper, we describe the provision for survey products archive access and curation which is the final link in the data flow system from telescope to user. Science archive development at the Wide Field Astronomy Unit of the Institute for Astronomy within the University of Edinburgh is taking a phased approach. The first phase VDFS science archive is being implemented for WFCAM, a wide-field infrared imager that has similar output to, but at a lower data rate than the VISTA camera. We describe the WFCAM Science Archive, emphasising the design approach that is intended to lead to a scalable archive system that can handle the huge volume of VISTA data.
VISTA data flow system: pipeline processing for WFCAM and VISTA
Mike J. Irwin, Jim Lewis, Simon Hodgkin, et al.
The UKIRT Wide Field Camera (WFCAM) on Mauna Kea and the VISTA IR mosaic camera at ESO, Paranal, with respectively 4 Rockwell 2kx2k and 16 Raytheon 2kx2k IR arrays on 4m-class telescopes, represent an enormous leap in deep IR survey capability. With combined nightly data-rates of typically 1 TB, automated pipeline processing and data management requirements are paramount. Pipeline processing of IR data is far more technically challenging than for optical data. IR detectors are inherently more unstable, while the sky emission is over 100 times brighter than most objects of interest, and varies in a complex spatial and temporal manner. In this presentation we describe the pipeline architecture being developed to deal with the IR imaging data from WFCAM and VISTA, and discuss the primary issues involved in an end-to-end system capable of: robustly removing instrument and night sky signatures; monitoring data quality and system integrity; providing astrometric and photometric calibration; and generating photon noise-limited images and astronomical catalogues. Accompanying papers by Emerson et al. and Hambly et al. provide an overview of the project and a detailed description of the science archive aspects.
The Gemini online data processing system
Shane Walker, Kim Gillies, Allan Brighton
Processing astronomical images is an inherently resource intensive procedure that is typically time consuming as well. At the same time, first order reductions are particularly important during the observing process since they can provide key quality assessment information. To resolve this conflict, the Online Data Processing (OLDP) system being commissioned at the Gemini Observatory automatically maps reduction sequences onto a cluster of servers during observing, taking advantage of available concurrency where possible. The user constructs a visual representation of the sequence for an observation using the Gemini Observing Tool. No constraints are placed upon the series of steps that comprise the sequence. At runtime, the OLDP reads the reduction sequence from the Observing Database and splits it into smaller pieces for simultaneous execution on the cluster. Recipe steps can be implemented in IRAF, shell scripts, or Java, and other types can be plugged into the architecture without modifying the core of the code base. This paper will introduce the Gemini OLDP and demonstrate how it utilizes modern infrastructure technology like Jini and JavaSpaces to achieve its goals.
The common pipeline library: standardizing pipeline processing
Derek J. McKay, Pascal Ballester, Klaus Banse, et al.
The European Southern Observatory (ESO) develops and maintains a large number of instrument-specific data processing pipelines. These pipelines must produce standard-format output and meet the need for data archiving and the computation and logging of quality assurance parameters. As the number, complexity and data output rate of instruments increase, so does the challenge of developing and maintaining the associated processing software. ESO has developed the Common Pipeline Library (CPL) in order to unify the pipeline production effort and to minimise code duplication. The CPL is a self-contained ISO-C library, designed for use in a C/C++ environment. It is designed to work with FITS data, extensions and meta-data, and provides a template for standard algorithms, thus unifying the look-and-feel of pipelines. It has been written in such a way as to make it extremely robust, fast and generic, in order to cope with the operation-critical online data reduction requirements of modern observatories. The CPL has now been successfully incorporated into several new and existing instrument systems. In order to achieve such success, it is essential to go beyond simply making the code publicly available, and also to engage in training, support and promotion. There must be a commitment to maintenance, development, standards-compliance, optimisation, consistency and testing. This paper describes in detail the experiences of the CPL in all these areas. It covers the general principles applicable to any such software project and the specific challenges and solutions that make the CPL unique.
Photometric flats: an essential ingredient for photometry with wide-field imagers
We discuss the challenges to photometry introduced by internal redistribution of light in wide-field imaging cameras with focal reducers. We have developed a simple least-squares procedure which can be used to determine the zero-point variations across the field. The method uses three orthogonally offset images of a reasonably dense stellar field, plus an image containing at least three standard stars scattered across the field. The method, which does not require rotating the instrument, has been applied to correct data from the Wide Field Imager at La Silla, where it has been shown to reduce a 12% centre-to-edge gradient to a ~2% rms variation across the field. A new method which can be used with data taken during non-photometric nights is also presented.
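The least-squares idea can be sketched as follows for the simplest case of a planar zero-point gradient constrained by two offset exposures; the paper's actual procedure (three orthogonal offsets, standard-star anchoring, higher-order terms) is more general.

    # Simplified sketch: repeated measurements of the same stars on offset
    # exposures constrain the zero-point surface Z(x, y), because the true
    # magnitude cancels in the difference m_ref - m_off = Z(ref) - Z(off).
    # A plane Z = a*x + b*y is assumed here; the paper's model is more general.
    import numpy as np

    rng = np.random.default_rng(1)
    a_true, b_true = 0.12, -0.05                  # synthetic zero-point gradient

    base = rng.uniform(0.0, 0.8, size=(50, 2))    # star positions, reference frame
    off_x = base + np.array([0.2, 0.0])           # exposure offset in x
    off_y = base + np.array([0.0, 0.2])           # exposure offset in y

    zp = lambda p: a_true * p[:, 0] + b_true * p[:, 1]
    noise = lambda: rng.normal(0.0, 0.005, 50)

    # Observed magnitude differences of the same stars between offset exposures.
    dm = np.concatenate([zp(base) - zp(off_x) + noise(),
                         zp(base) - zp(off_y) + noise()])
    A = np.vstack([base - off_x, base - off_y])   # design matrix of offsets

    coeffs, *_ = np.linalg.lstsq(A, dm, rcond=None)
    print("recovered gradient:", coeffs)          # approximately [0.12, -0.05]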
Operations Systems: Metrics and New Concepts
Lessons learned: the switch from VMS to UNIX operations for the STScI's Science and Mission Scheduling Branch
David S. Adler, William M. Workman III, Don Chance
The Science and Mission Scheduling Branch (SMSB) of the Space Telescope Science Institute (STScI) historically operated exclusively under VMS. Due to diminished support for VMS-based platforms at STScI, SMSB recently transitioned to Unix operations. No additional resources were available to the group; the project was SMSB's to design, develop, and implement. Early decisions included the choice of Python as the primary scripting language; adoption of Object-Oriented Design in the development of base utilities; and the development of a Python utility to interact directly with the Sybase database. The project was completed in January 2004 with the implementation of a GUI to generate the Command Loads that are uplinked to HST. The current tool suite consists of 31 utilities and 271 tools comprising over 60,000 lines of code. In this paper, we summarize the decision-making process used to determine the primary scripting language, database interface, and code management library. We also describe the finished product and summarize lessons learned along the way to completing the project.
Process control charts for dataflow operations of the ESO VLT
Wolfgang Hummel, Rachel Johnson, Andreas Jaunsen, et al.
The Data Flow Operations Group of ESO in Garching provides many aspects of data management and quality control for the VLT data flow. One of its main responsibilities is to monitor the performance of all operational instruments. We have investigated whether the statistical methods of process control can be applied to the quality control of the VLT instruments, and have analyzed the data flow in this respect. The efficiency of these statistical methods is found to be related to the calibration plan, which determines the sampling size and frequency of calibrations. We apply these principles to ISAAC health-check plots and give examples to demonstrate performance and limitations.
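A minimal Shewhart-style control-chart check, of the kind discussed above, is sketched below: control limits are derived from a stable reference period and later calibrations are flagged when they fall outside them. The QC parameter, reference values and three-sigma limits are illustrative assumptions.

    # Sketch of a Shewhart-style control chart for an instrument QC parameter:
    # limits are set from a reference (in-control) period and later calibrations
    # are flagged when they fall outside mean +/- 3 sigma.
    import statistics


    def control_limits(reference_values, nsigma=3.0):
        mean = statistics.fmean(reference_values)
        sigma = statistics.stdev(reference_values)
        return mean - nsigma * sigma, mean + nsigma * sigma


    def out_of_control(values, limits):
        lo, hi = limits
        return [(i, v) for i, v in enumerate(values) if not lo <= v <= hi]


    # Example: detector read-out noise (e-) from daily calibrations.
    reference = [4.1, 4.0, 4.2, 4.1, 3.9, 4.0, 4.1, 4.2, 4.0, 4.1]
    monitored = [4.1, 4.0, 4.3, 4.9, 4.1]

    limits = control_limits(reference)
    print("control limits:", limits)
    print("flagged points:", out_of_control(monitored, limits))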
Data interface control at the European Southern Observatory
Adam Dobrzycki, Nausicaa Delmotte, Nathalie Rossat, et al.
The European Southern Observatory (ESO) manages numerous telescopes which use various types of instruments and readout detectors. The data flow process at ESO's observatories involves several steps: telescope setup, data acquisition (science, calibration and test), pipeline processing, quality control, archiving, and distribution of data to the users. Well-defined interfaces are vital for the smooth operation of such complex structures. Moreover, the future expansion of ESO operations - such as the development of new observatories (e.g. ALMA) and support for the Virtual Observatory (VO) - will make the maintenance of data interfaces even more critical. In this paper we present an overview of the current status of the Data Interface Control process at ESO and discuss future expansion plans.
Data model applications for the SuperAGILE detection system
We present a modeling approach to describe the data involved in an astronomical space mission. The data design process is preliminary to the development of the software system for the SuperAGILE instrument on board the gamma-ray satellite AGILE. The model will be used to simplify team coding, improve the scientific return, and reinvest the results in future experiments.
Virtual Observatories, Archives, and Data Mining
Visualizing astronomy data using VRML
Brett Beeson, Michael Lancaster, David G. Barnes, et al.
Visualisation is a powerful tool for understanding the large data sets typical of astronomical surveys and can reveal unsuspected relationships and anomalous regions of parameter space which may be difficult to find programmatically. Visualisation is a classic information technology for optimising scientific return. We are developing a number of generic on-line visualisation tools as a component of the Australian Virtual Observatory project. The tools will be deployed within the framework of the International Virtual Observatory Alliance (IVOA), and follow agreed-upon standards to make them accessible to other programs and people. We and our IVOA partners plan to utilise new information technologies (such as grid computing and web services) to advance the scientific return of existing and future instrumentation. Here we present a new tool - VOlume - which visualises point data. Visualisation of astronomical data normally requires the local installation of complex software, the downloading of potentially large datasets, and very often time-consuming and tedious data format conversions. VOlume enables the astronomer to visualise data using just a web browser and plug-in. This is achieved using IVOA standards which allow us to pass data between Web Services, Java Servlet Technology and Common Gateway Interface programs. Data from a catalogue server can be streamed in eXtensible Markup Language format to a servlet which produces Virtual Reality Modeling Language output. The user selects elements of the catalogue to map to geometry and then visualises the result in a browser plug-in such as Cortona or FreeWRL. Other than requiring an input VOTable format file, VOlume is very general. While its major use will likely be to display and explore astronomical source catalogues, it can easily render other important parameter fields such as the sky and redshift coverage of proposed surveys or the sampling of the visibility plane by a rotation-synthesis interferometer.
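The core catalogue-to-geometry step can be sketched as below: selected catalogue columns are mapped to 3-D coordinates and written out as a VRML 2.0 PointSet. The column choice and coordinate scaling are assumptions for illustration, not the VOlume mapping.

    # Minimal sketch of the catalogue-to-VRML step: map selected catalogue
    # columns (here RA, Dec, redshift) to 3-D points and emit a VRML 2.0 PointSet.
    def rows_to_vrml(rows):
        points = ", ".join(f"{ra:.3f} {dec:.3f} {z * 100:.3f}"   # scale z for display
                           for ra, dec, z in rows)
        return ("#VRML V2.0 utf8\n"
                "Shape {\n"
                "  geometry PointSet {\n"
                "    coord Coordinate {\n"
                f"      point [ {points} ]\n"
                "    }\n"
                "  }\n"
                "}\n")


    catalogue_rows = [(150.12, 2.21, 0.021), (150.30, 2.05, 0.034)]
    print(rows_to_vrml(catalogue_rows))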
The AstroGrid MySpace service
Andrew C. Davenhall, Catherine L. Qin, G. Peter Shillan, et al.
MySpace is a component of AstroGrid, the Virtual Observatory infrastructure system being developed in the UK under the national e-Science programme. The MySpace service will provide both temporary and long-term storage for Virtual Observatory users. This work space will typically be used to hold results extracted from archives, but can hold any sort of data. In addition, the MySpace service will provide cache storage for distributed processes which are running on the user's behalf. The novel feature of the MySpace service is that, although the individual items are geographically dispersed, the user can access and navigate the work space seamlessly and easily, as though all the items were stored in a single location. MySpace is written in Java and deployed as a set of Web services. It is a fully integrated component of the AstroGrid system, but its modular nature means that it can be installed and used in isolation or, in principle, in conjunction with components from other Virtual Observatory projects. Functionality similar to that of MySpace is likely to be a common requirement in Virtual Observatory projects. MySpace is under active development and its current state and future plans are described.
The Simple Spectral Access protocol
The goal of the Simple Spectral Access (SSA) specification is to define a uniform interface to spectral data including spectral energy distributions (SEDs), 1D spectra, and time series data. In contrast to 2D images, spectra are stored in a wide variety of formats and there is no widely used standard in astronomy for representing spectral data; hence part of the challenge of specifying SSA was defining a general spectrophotometric data model as well as definitions of standard serializations in a variety of data formats including XML and FITS. Access is provided both to atlas (pre-computed) data and to virtual data which is computed on demand. The term simple in Simple Spectral Access refers to the design goal of simplicity in both implementing spectral data services and in retrieving spectroscopic data from distributed data collections. SSA is a product of the data access layer (DAL) working group of the International Virtual Observatory Alliance (IVOA). The requirements were derived from a survey among spectral data providers and data consumers and were further refined in a broad discussion in meetings and electronic forums, as well as by prototyping efforts within the European Astrophysical Virtual Observatory (AVO) and the US National Virtual Observatory (NVO).
APT: what it has enabled us to do
Brett S. Blacker, Daniel Golombek
With the development and operations deployment of the Astronomer's Proposal Tool (APT), Hubble Space Telescope (HST) proposers have been provided with an integrated toolset for Phase I and Phase II. This toolset consists of editors for filling out proposal information, an Orbit Planner for determining observation feasibility, a Visit Planner for determining schedulability, diagnostic and reporting tools, and an integrated Visual Target Tuner (VTT) for viewing exposure specifications. The VTT can also overlay HST's field of view on user-selected Flexible Image Transport System (FITS) images, perform bright object checks and query the HST archive. In addition to these direct benefits for the HST user, STScI's internal Phase I process has been able to take advantage of the APT products. APT has enabled a substantial streamlining of the process and of the software processing tools, which compressed the Phase I to Phase II schedule by three months, allowing observations to be scheduled earlier and thus further benefiting HST observers. Some of the improvements to our process include: creating a compact disc (CD) of Phase I products; printing all proposals on the day of the deadline; linking the proposals in Portable Document Format (PDF) with a database; and running all Phase I software on a single platform. In this paper we discuss the operational results of using APT for HST's Cycles 12 and 13 Phase I process and show the improvements for the users and for the overall process, which is allowing STScI to obtain scientific results with HST three months earlier than in previous years. We also show how APT can be and is being used for multiple missions.
The Large Magellanic Cloud as a testbed for the astronomical virtual observatory
We are carrying out a comprehensive study of massive star forming complexes in the Large Magellanic Cloud, through the study of ionized regions. Preliminary results for the nebula LHA 120-N 44C are presented here. We are blending i) the spectral and morphological information contained in images taken through selected filters that probe lines sensitive to factors such as excitation mechanisms or the hardness of the ionizing radiation, ii) the already existing photometry from the 2MASS near-infrared survey, and iii) multi-wavelength archived images retrieved from various locations. The merging of all these sources of information will allow us to establish a close link between the massive stars and the surrounding interstellar medium, and should help constrain the local star formation history and dynamical evolution of these ionized regions in the Large Magellanic Cloud. In this respect, the Astrophysical Virtual Observatory (AVO) prototype has proven to be a powerful tool for speeding up the discovery process.
Development of the Japanese Virtual Observatory (JVO) prototype
Masahiro Tanaka, Yoshihiko Mizumoto, Masatoshi Ohishi, et al.
The Japanese Virtual Observatory (JVO) project is being conducted by the National Astronomical Observatory of Japan (NAOJ). JVO aims at providing easy access to federated astronomical databases (especially SUBARU, Nobeyama and ALMA) and a data analysis environment using Grid technology. We defined JVOQL (JVO Query Language) for efficient retrieval of astronomical data from a federated database. We then constructed the first version of the JVO prototype in order to study technical feasibility, including the functionality of JVOQL and remote operations using the Globus toolkit. The prototype consists of several components: a JVO portal to accept users' requests described in JVOQL, a JVO Controller to parse them into individual query requests, and distributed database servers containing Suprime-Cam data from the Subaru telescope and 2MASS data. We confirmed that this prototype works to access a federated database. We have constructed a second version of the JVO prototype system to improve usability, which includes new user interfaces, efficient remote operations, and the introduction of analysis tools; in the course of this work, Grid services and an XML database have been employed. In this presentation we describe the design and structure of the new JVO prototype system.
Simulating instruments for mining uncalibrated archives
Bruno Voisin, Alberto Micol, Seathrun O'Tuairisg, et al.
As the astronomical community continues to produce deeper and higher resolution data, it becomes increasingly important to provide the scientist with tools that help mine the data and return only the scientifically interesting images. In the case of uncalibrated archives this task is especially difficult, because it is hard to know whether an interesting source can be seen in an image without actually looking. Here, we show how instrument simulation can be used to lightly process the database-stored image descriptors of the ESO Wide Field Imager (WFI) archive and compute the corresponding limiting magnitudes. The end result is a more scientific description of the ESO/ST-ECF archive contents, allowing a more astronomer-friendly archive user interface and hence increasing the archive's usability in the context of a Virtual Observatory. This method was developed to improve the Querator search engine of the ESO/HST archive, in the context of the EC-funded ASTROVIRTEL project, but it also provides an independent tool that can be adapted to other archives.
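A simplified, sky-noise-limited estimate of the limiting magnitude from stored image descriptors might look like the sketch below; the formula, descriptor names and default pixel scale are assumptions, not the actual Querator/WFI computation.

    # Simplified sky-noise-limited limiting-magnitude estimate from archived
    # image descriptors. Formula and parameter names are illustrative only.
    import math


    def limiting_magnitude(zeropoint, exptime_s, sky_rms_counts, fwhm_arcsec,
                           pixel_scale_arcsec=0.238, snr=5.0):
        """zeropoint: magnitude that yields 1 count/s in the detector."""
        # Pixels in a circular aperture of radius one seeing FWHM.
        npix = math.pi * (fwhm_arcsec / pixel_scale_arcsec) ** 2
        # Faintest total flux (counts) detectable at the requested S/N
        # against the sky noise alone.
        flux_limit = snr * sky_rms_counts * math.sqrt(npix)
        return zeropoint - 2.5 * math.log10(flux_limit / exptime_s)


    print(round(limiting_magnitude(zeropoint=24.5, exptime_s=300.0,
                                   sky_rms_counts=12.0, fwhm_arcsec=1.0), 2))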
Data Quality Control
A scheme of flat-fielding for LAMOST
Huoming Shi, Gang Wang
Flat-fielding is one of the most important data calibration procedures for the wide-field multi-fiber spectroscopy of LAMOST (the Large Sky Area Multi-Object Fiber Spectroscopic Telescope). LAMOST's unique optical design, wide field of view, significant telescope vignetting and unconventional enclosure structure present a challenge to its flat-fielding method. On the other hand, the spectrographs with their CCD detectors are fixed on the ground, which implies that there is almost no flexure during spectroscopic observation and thus provides a favorable factor for flat-fielding. Taking into account generally accepted principles for wide-field multi-fiber spectroscopy and the specific features of LAMOST, a scheme of flat-fielding is designed. It utilizes a combination of the multi-fiber lamp flat field and the offset sky flat field to calibrate all known telescopic and instrumental response non-uniformities. The lamp flat field will be used to calibrate the CCD pixel-to-pixel sensitivity variation, the fiber transmission as a function of wavelength, and possible spectrographic vignetting. Fiber-to-fiber throughput differences and telescopic vignetting will be corrected with the offset sky flat field. A few choices are proposed for the flat-field equipment. The flat division is described in the context of the data reduction pipeline.
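The two-component flat division described above can be sketched as follows, with the lamp flat carrying the small-scale response and the offset-sky flat carrying fiber-to-fiber throughput; array shapes and normalisations are assumptions for illustration, not the LAMOST pipeline.

    # Illustrative flat-division step combining a lamp flat (pixel-to-pixel
    # response, per-fiber wavelength dependence) and an offset-sky flat
    # (fiber-to-fiber throughput, vignetting).
    import numpy as np

    n_fibers, n_pix = 4, 1000
    rng = np.random.default_rng(0)

    science = rng.normal(1000.0, 30.0, (n_fibers, n_pix))     # extracted spectra
    lamp_flat = rng.normal(1.0, 0.02, (n_fibers, n_pix))      # small-scale response
    sky_throughput = np.array([1.00, 0.93, 1.05, 0.88])       # from offset sky flat

    # Normalise the lamp flat per fiber so it only carries the small-scale
    # shape, then divide out both components.
    lamp_norm = lamp_flat / lamp_flat.mean(axis=1, keepdims=True)
    calibrated = science / lamp_norm / sky_throughput[:, np.newaxis]

    print(calibrated.shape, calibrated.mean(axis=1).round(1))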
Quality control of VLT-VIMOS data
Paola Sartoretti, Carlo Izzo, Ralf M. Palsa, et al.
VIMOS is the Visible Multi-Object Spectrograph mounted at the Nasmyth focus of the 8.2m Melipal (UT3) telescope of the ESO Very Large Telescope. VIMOS operates with four channels in three observing modes: imaging, multi-object spectroscopy (MOS), and integral field spectroscopy. VIMOS data are pipeline-processed and quality-checked by the Data Flow Operations group in Garching. The quality check is performed in two steps. The first is a visual check of each pipeline product, which allows the identification of any potential major data problem, such as a failure in the MOS mask insertion or an over- or under-exposure. The second step is performed in terms of Quality Control (QC) parameters, which are derived from both raw and processed data to monitor the instrument performance. The evolution in time of the QC parameters is recorded in a publicly available database (http://www.eso.org/qc/). The VIMOS QC parameters include, for each of the four VIMOS channels, the bias level, read-out noise, dark current, gain factor, flat-field and arc-lamp efficiencies, resolution and rms of dispersion, sky flat-field structure, image quality and photometric zeropoints. We describe here some examples of quality checks of VIMOS data.
Quality control of VLT FLAMES/GIRAFFE data
Reinhard W. Hanuschik, Jonathan Smoker, Andreas Kaufer, et al.
GIRAFFE is a medium to high resolution spectrograph forming part of the complex multi-element fibre spectrograph FLAMES on the 8.2m VLT-UT2 telescope which also has a fibre link to the high-resolution spectrograph UVES. It has been operational since March 2003. GIRAFFE has been designed to be very stable and efficient. Here, first results concerning the Quality Control process are presented.
Quality control for UVES-fiber at the VLT-Kueyen Telescope
UVES-fiber is part of the FLAMES instrument mounted on the 8.2m Kueyen Telescope (UT2) of the ESO VLT. Up to eight single object fibers can be linked from the FLAMES focus to the red arm of the echelle spectrograph UVES. Science and calibration data are pipeline-processed by the Data Flow Operations group of ESO. Parameters to monitor the performance of the instrument are routinely extracted from calibration frames, stored into a database, and monitored over time. In addition to the Quality Control parameters already present for UVES in slit mode, several specific procedures had to be added in order to monitor the performance in the multi-object case. Particular attention is required for the positioning of the fibers on the detector and the transmission of the fibers. In this paper, we present details of the Quality Control process for UVES-fiber and results from the first year of operations.
Quality control of VLT NACO data
Danuta Dobrzycka, Wolfgang Hummel, Chris Lidman, et al.
The Nasmyth Adaptive Optics System (NAOS) and the High-Resolution Near IR Camera (CONICA) are mounted at the Nasmyth B focus of the Yepun (UT4) telescope of the ESO VLT. NACO (NAOS+CONICA) is an IR (1-5 micron) imager, spectrograph, coronagraph and polarimeter which is fed by NAOS, the first adaptive optics system installed on Paranal. NACO data products are pipeline-processed and quality-checked by the Data Flow Operations Group in Garching. The calibration data are processed to create calibration products and to extract Quality Control (QC) parameters. These parameters provide health checks and monitor the instrument's performance. They are stored in a database, compared to earlier data, trended over time and made available on the NACO QC web page, which is updated daily. NACO is an evolving instrument where new observing modes are offered with every observing period. Naturally, the list of QC parameters that are monitored evolves as well. We present the current QC parameters of NACO and discuss the general process of controlling data quality and monitoring instrument performance.
Chandra automated point source processing
We have implemented a system to automatically analyze Chandra x-ray observations of point sources for use in monitoring telescope parameters such as point spread function, spectral resolution, and pointing accuracy, as well as for use in scientific studies. The Chandra archive currently contains at least 50 observations of star cluster-like objects, yielding 5,000+ sources of all spectral types well-suited for cataloging. The system incorporates off-the-shelf tools to perform the steps from source detection to temporal and spectral analyses. Our software contribution comes from wrapper scripts to autonomously run each step in turn, verify intermediate results, apply any logic required to set parameters, decide best-fit results, merge in data from other catalogs and to format convenient text and web-based output. We will outline this processing pipeline design and challenges, discuss the scientific applications, and focus on its role in monitoring on-orbit observatory performance.
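The wrapper-script approach described above, running each off-the-shelf tool in turn and verifying intermediate results, can be sketched as a simple driver; the command and file names are placeholders, not the actual tools used.

    # Schematic pipeline driver: run each external tool in turn, verify that its
    # expected output exists before continuing, and surface failures.
    import subprocess
    from pathlib import Path


    def run_step(name, command, expected_output):
        print(f"[pipeline] running {name}: {' '.join(command)}")
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0 or not Path(expected_output).exists():
            raise RuntimeError(f"step '{name}' failed:\n{result.stderr}")
        return expected_output


    def process_observation(obsid):
        # Placeholder command names; substitute the real detection and
        # spectral-extraction tools used by the pipeline.
        src = run_step("detect sources",
                       ["detect_tool", f"{obsid}_evt2.fits", "-o", f"{obsid}_src.fits"],
                       f"{obsid}_src.fits")
        run_step("extract spectra",
                 ["spectra_tool", src, "-o", f"{obsid}_spec.fits"],
                 f"{obsid}_spec.fits")


    if __name__ == "__main__":
        process_observation("obs01234")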
Operations Systems: Phase 1, Phase 2, and Scheduling
Astronomer's proposal tool: the first two years of operation
Anthony J. Roman, Robert Douglas, Ronald Downes, et al.
The Astronomer's Proposal Tool (APT) is an integrated software package for the preparation of observing proposals and plans. It was developed by the Space Telescope Science Institute (STScI) to support Hubble Space Telescope (HST) observing, but it has also been designed so that other observatories can reuse it. The goal of APT is to provide a single user-friendly interface to much of the software that is needed by HST proposers and observers. APT was released in autumn 2002 and has since been used for two cycles of HST observing. This paper will illustrate some of the capabilities and functions of APT, including a graphical editor, a display of how science exposures and overhead activities fit into HST orbits, and a timeline of when an observation's constraints can and cannot be met during an observing cycle. Experiences of the first two years of APT operation will also be discussed. Some of the user feedback will be described, and APT's impact on HST observing program implementation work at STScI will be explained. Based on all of this operational experience, several changes have been made to the APT software. These changes will be described.
Phase I changes needed for planning HST large programs
Denise C. Taylor, David Soderblom, William M. Workman III, et al.
Over one-third of HST observing time in the past two cycles has been dedicated to proposals with allocations greater than 100 orbits. This has led to scheduling difficulties in HST's traditional two-phase proposal process. We describe the changes made to the Cycle 13 Phase I proposal process to assist users and planners later in Phase II. Some traditionally Phase II information is now requested with large proposals submitted in Phase I, so users (and planners) can determine the feasibility of scientific constraints when planning the large observations. Since HST proposers use the Astronomer's Proposal Tool (APT) for both phases, moving Phase II processing into the Phase I interface was more straightforward than would have been possible with RPS2 (the old Phase II tool). We will also describe the expected changes to internal procedures for planning these large proposals after Phase I acceptance.
Observation scheduling tools for Subaru Telescope
Toshiyuki Sasaki, George Kosugi, Robert Hawkins, et al.
Optimization of observation sequences is necessary to achieve high observing efficiency and reliability. We have implemented scheduling software in the Subaru Telescope observatory software system. The scheduling engine Spike, developed at STScI, is used with some modifications for Subaru Telescope. Since the last report at SPIE (Munich, 2000), new functions have been added to Spike: 1) optimized arrangement of an observation dataset, which consists of a target object and related calibrations, and 2) flexible scheduling of standard stars selected from a standard-star list, which is fed to Spike as part of the observation datasets. Observation datasets with the necessary information, prepared by an observer, are input to the scheduling tools and converted to Spike Lisp input forms. A schedule created by Spike is converted back into Subaru observation commands to be executed with the observation control system. These applications are operated through a Web-based display. We present the overall structure of the scheduling tools, with sample Subaru observation commands for target datasets and a resulting schedule.
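As a sketch of the conversion step mentioned above (the actual Spike Lisp input forms and keyword names are not reproduced here; the dictionary keys and s-expression layout below are assumptions for illustration only), an observation dataset could be serialized along these lines in Python:

    def to_sexpr(dataset):
        """Render an observation-dataset dictionary as a Lisp-style
        s-expression string (hypothetical keywords, not the real Spike form)."""
        fields = " ".join(f"(:{key} {value})" for key, value in dataset.items())
        return f"(observation-dataset {fields})"

    dataset = {
        "target": "NGC1068",
        "ra": "02:42:40.7",
        "dec": "-00:00:48",
        "exptime": 300,
        "calibrations": "(flat bias standard-star)",
    }

    print(to_sexpr(dataset))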
The EMIR observing program manager system science simulator
Johan Richard, Roser Pello, Thierry Contini, et al.
We present in this poster paper the Science Simulation aspects of the EMIR Observing Program Manager System (EOPMS). EMIR is a multi-slit near-IR spectrograph presently under development for the Gran Telescopio de Canarias (GTC). We present the scientific functionalities of the EOPMS and its ability to provide the user with the required information during the different observing phases. The exposure time calculator (ETC) and the Image Simulator (IS) will be described, focusing on some unique capabilities with respect to the presently available tools, such as the possibility of 2D spectra simulation and realistic 1D extraction.
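To give a flavour of what a 2D spectrum simulation involves (this is a generic numpy sketch, not the EOPMS Image Simulator; all parameter values are illustrative), a dispersed slit spectrum can be modelled as a wavelength-dependent source trace spread over a spatial profile, plus sky background and noise:

    import numpy as np

    nx, ny = 1024, 64                          # dispersion and spatial axes (pixels)
    wave = np.linspace(1.0, 2.5, nx)           # microns, toy near-IR range
    source = 200.0 * np.exp(-(wave - 1.6) ** 2 / 0.1)   # e-/s, toy continuum
    sky = 50.0                                 # e-/s/pixel, flat sky for simplicity
    exptime = 600.0                            # seconds

    # Spatial profile of the slit trace: a Gaussian of roughly 3-pixel FWHM
    y = np.arange(ny)
    profile = np.exp(-0.5 * ((y - ny / 2) / 1.3) ** 2)
    profile /= profile.sum()

    # Ideal 2D frame: outer product of spectrum and profile, plus sky background
    frame = np.outer(profile, source * exptime) + sky * exptime

    # Add photon (Poisson) noise and 10 e- rms read noise
    rng = np.random.default_rng(0)
    frame = rng.poisson(frame) + rng.normal(0.0, 10.0, size=frame.shape)

A realistic 1D extraction would then sum the noisy frame along the spatial profile and compare the recovered spectrum with the input.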
Observation planning tools for the ESO VLT interferometer
Now that the Very Large Telescope Interferometer (VLTI) is producing regular scientific observations, the field of optical interferometry has moved from being a specialist niche area into mainstream astronomy. Making such instruments available to the general community involves difficult challenges in modelling, presentation and automation. The planning of each interferometric observation requires calibrator source selection, visibility prediction, signal-to-noise estimation and exposure time calculation. These planning tools require detailed physical models simulating the complete telescope system - including the observed source, atmosphere, array configuration, optics, detector and data processing. Only then can these software utilities provide accurate predictions about instrument performance, robust noise estimation and reliable metrics indicating the anticipated success of an observation. The information must be presented in a clear, intelligible manner, sufficiently abstract to hide the details of telescope technicalities, but still giving the user a degree of control over the system. The Data Flow System group has addressed the needs of the VLTI and, in doing so, has gained some new insights into the planning of observations, and the modelling and simulation of interferometer performance. This paper reports these new techniques, as well as the successes of the Data Flow System group in this area and a summary of what is now offered as standard to VLTI observers.
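One of the core calculations behind such planning tools is the predicted fringe visibility of a source. For the standard uniform-disk model, V = |2 J1(x)/x| with x = πBθ/λ; the following Python/scipy snippet (a generic sketch, not the Data Flow System implementation) evaluates it:

    import numpy as np
    from scipy.special import j1

    def uniform_disk_visibility(baseline_m, diameter_mas, wavelength_m):
        """Fringe visibility of a uniform disk of angular diameter
        diameter_mas (milliarcsec) observed on baseline_m (metres)."""
        theta = diameter_mas * np.pi / (180.0 * 3600.0 * 1000.0)   # radians
        x = np.pi * baseline_m * theta / wavelength_m
        return np.abs(2.0 * j1(x) / x) if x > 0 else 1.0

    # Example: a 3 mas star on a 100 m baseline at 10 microns
    print(uniform_disk_visibility(100.0, 3.0, 10e-6))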
Robotic telescope scheduling: the Liverpool Telescope experience
The Liverpool Telescope (LT) is a fully robotic 2m telescope located on La Palma in the Canary Islands. It has been in operation since July 2003 and has just started (April 2004) initial robotic operations. In this paper we describe the implementation of the heuristic dispatch scheduler, its interaction with the Robotic Control System (RCS), details of performance metrics we intend to use and present some initial results.
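The essence of a heuristic dispatch scheduler is a scoring function evaluated over the currently observable groups, with the highest-scoring group dispatched next. The toy Python sketch below conveys the idea only; the actual LT scoring heuristics and metrics are not reproduced here:

    def score(group, now):
        """Toy heuristic: favour high priority, high altitude, and groups
        whose observing window is about to close (hours remaining)."""
        if not group["observable"]:
            return None
        urgency = 1.0 / max(group["window_closes"] - now, 0.1)
        return 10.0 * group["priority"] + group["altitude"] / 90.0 + urgency

    def dispatch(groups, now):
        """Return the best group to observe next, or None if nothing qualifies."""
        scored = [(score(g, now), g) for g in groups]
        scored = [(s, g) for s, g in scored if s is not None]
        return max(scored, key=lambda sg: sg[0])[1] if scored else None

    groups = [
        {"name": "grb-followup", "priority": 3, "altitude": 55,
         "observable": True, "window_closes": 2.0},
        {"name": "monitoring", "priority": 1, "altitude": 70,
         "observable": True, "window_closes": 6.0},
    ]
    print(dispatch(groups, now=0.0)["name"])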
Proposal and observing preparation for ALMA
A number of tools exist to aid in the preparation of proposals and observations for large ground- and space-based observatories (VLT, Gemini, and HST being examples). These tools have transformed the way in which astronomers use large telescopes. The ALMA telescope has a strong need for such a tool, but its scientific and technical requirements, and the nature of the telescope, present some novel challenges. In addition to the common Phase I (Proposal) and Phase II (Observing) preparation, the tool must support the needs of the novice alongside those of the expert in millimetre/sub-millimetre aperture synthesis astronomy. We must also provide support for the reviewing process, and must interface with and use the technical architecture underpinning the design of the ALMA Software System. In this paper we describe our approach to meeting these challenges.
Astronomer proposal tool exposure time calculator
The Astronomer Proposal Tool (APT) Exposure Time Calculator (ETC) is a generic Java library for performing ETC calculations. Currently it is primarily used by the web-based ETCs supporting Hubble Space Telescope (HST) proposals at the Space Telescope Science Institute (STScI). This paper describes the software architecture and the current and potential uses of this library.
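At the heart of any ETC is the CCD signal-to-noise equation; inverting it for the exposure time gives the kind of calculation such a library performs. A minimal Python sketch under standard assumptions (source rate S, sky rate B per pixel, read noise RN, npix pixels in the aperture), independent of the Java library described above:

    import math

    def exposure_time(snr, S, B, RN, npix):
        """Exposure time (s) needed to reach a given signal-to-noise ratio.
        Solves snr = S*t / sqrt(S*t + npix*B*t + npix*RN**2) for t, with
        S the source rate (e-/s), B the sky rate (e-/s/pixel) and RN the
        read noise (e- rms/pixel)."""
        a = S ** 2
        b = -snr ** 2 * (S + npix * B)
        c = -snr ** 2 * npix * RN ** 2
        return (-b + math.sqrt(b ** 2 - 4 * a * c)) / (2 * a)

    # Example: SNR=10 on a 0.5 e-/s source, 0.1 e-/s/pix sky, RN=4 e-, 25 pixels
    print(exposure_time(10.0, 0.5, 0.1, 4.0, 25))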
Data Processing Systems: Hardware and Algorithms
SIP: a modern flexible data pipeline
Pauline Barmby, Zhong Wang, Massimo Marengo, et al.
IRAC, the Infrared Array Camera on the Spitzer Space Telescope, generated well over 150,000 images during the in-orbit checkout and science verification phase of the mission. All of these were processed with SIP, the SAO IRAC Pipeline. SIP was created by and for the members of the IRAC instrument team at the Smithsonian Astrophysical Observatory, to allow short-timescale data processing and rapid-turnaround software testing and algorithm modification. SIP makes use of perl scripting and data mirroring to transfer and manage data, a mySQL database to select calibration data, and Python/numarray to process the image data; it is designed to run with no user interaction. SIP is fast, flexible, and robust. We present some 'lessons learned' from the construction and maintenance of SIP, and discuss prospects for future improvement.
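A minimal sketch of the calibration-selection step (using Python's built-in sqlite3 in place of SIP's mySQL database; the table layout and column names are invented for illustration) might look like this:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE caldb
                    (filename TEXT, caltype TEXT, channel INTEGER, mjd REAL)""")
    conn.executemany("INSERT INTO caldb VALUES (?, ?, ?, ?)", [
        ("dark_ch1_a.fits", "dark", 1, 52640.1),
        ("dark_ch1_b.fits", "dark", 1, 52655.4),
        ("flat_ch1_a.fits", "flat", 1, 52641.0),
    ])

    def best_calibration(caltype, channel, obs_mjd):
        """Pick the calibration file of the right type and channel
        that is closest in time to the observation."""
        row = conn.execute(
            """SELECT filename FROM caldb
               WHERE caltype = ? AND channel = ?
               ORDER BY ABS(mjd - ?) LIMIT 1""",
            (caltype, channel, obs_mjd)).fetchone()
        return row[0] if row else None

    print(best_calibration("dark", 1, 52652.0))   # -> dark_ch1_b.fits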
A pipeline for automatically processing and analyzing archival images from multiple instruments
Seathrun O'Tuairisg, Aaron Golden, Raymond F. Butler, et al.
To take advantage of the recent upsurge in astrophysical research applications of grid technologies, coupled with the increase in temporal and spatial coverage afforded to us by dedicated all-sky surveys and on-line data archives, we have developed an automated image reduction and analysis pipeline for a number of different astronomical instruments. The primary science goal of the project is the study of long-term optical variability of brown dwarfs, although the pipeline can be tailored to suit many varied astrophysical phenomena. The pipeline complements Querator, the custom search engine which accesses the astronomical image archives based at the ST-ECF/ESO centre in Garching, Germany. To increase our dataset we complement the reduction and analysis of WFI (Wide Field Imager, mounted on the 2.2-m MPG/ESO telescope at La Silla) archival images with the analysis of pre-reduced co-spatial HST/WFPC2 images and near-infrared images from the DENIS archive. Our pipeline includes CCD-image reduction, registration, astrometry, photometry, and image-matching stages. We present sample results from all stages of the pipeline and describe how we overcome problems such as missing or incorrect image meta-data, interference fringing, and poor image calibration files. The pipeline was written using tasks contained in the IRAF environment, linked together with Unix shell scripts and Perl, and the image reduction and analysis is performed on a 40-processor SGI Origin 3800 based at NUI, Galway.
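One recurring step in such a pipeline is verifying, and where possible repairing, image meta-data before reduction. The following is a hedged sketch of that check using astropy.io.fits rather than the IRAF/Perl tools described above; the keyword list and default values are purely illustrative:

    from astropy.io import fits

    # Keywords the downstream reduction expects, with fallback values
    # (None means the frame must be flagged for manual inspection).
    REQUIRED = {"FILTER": None, "EXPTIME": None, "GAIN": 2.0, "RDNOISE": 5.0}

    def check_header(filename):
        """Return a list of problems found; fill in safe defaults where allowed."""
        problems = []
        with fits.open(filename, mode="update") as hdul:
            hdr = hdul[0].header
            for key, default in REQUIRED.items():
                if key not in hdr:
                    if default is None:
                        problems.append(f"{filename}: missing {key}, flagged")
                    else:
                        hdr[key] = default
                        problems.append(f"{filename}: missing {key}, set to {default}")
        return problems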
Automated classification of x-ray sources in stellar clusters
Susan M. Hojnacki, Joel H. Kastner
The Chandra X-ray Observatory (CXO) is generating a tremendous amount of multi-dimensional X-ray data of exceptional quality. Currently, astronomers analyze these data one X-ray source at a time, via model-fitting techniques, to determine source physical conditions. More efficient methods of spectral and temporal classification would greatly benefit analysis of observations of rich fields of X-ray sources, such as stellar clusters. A combination of techniques from the fields of multivariate statistics and pattern recognition may provide new insight into, as well as an improvement in the speed and accuracy of, the classification of stellar X-ray spectra. We are adapting and applying such techniques, in the context of analysis of CXO and X-ray Multi-Mirror Mission (XMM-Newton) imaging spectroscopy of star formation regions, to group pre-main-sequence X-ray sources into clusters based on spectral attributes. An automated spectral classification technique for the Orion Nebula Cluster (ONC) population of greater than 1000 X-ray emitting young stars has been developed. As an initial test of the algorithm, deep CXO images of the ONC were analyzed. Clustering results are being compared with known optical, infrared, and radio properties of the young stellar population of the ONC, to assess the algorithm's ability to identify groups of sources that share common attributes.
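The grouping step can be illustrated with a generic clustering call on a matrix of spectral attributes, one row per source. The scikit-learn sketch below (k-means on standardized, invented features) is only indicative of the approach, not the authors' multivariate-statistics algorithm:

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # Toy feature matrix: one row per X-ray source, columns are spectral
    # attributes (e.g. soft/medium/hard band counts and a hardness ratio).
    rng = np.random.default_rng(1)
    features = np.vstack([
        rng.normal([100, 40, 10, -0.6], 5, size=(50, 4)),   # softer sources
        rng.normal([30, 50, 60, 0.4], 5, size=(50, 4)),     # harder sources
    ])

    X = StandardScaler().fit_transform(features)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(np.bincount(labels))    # membership of each spectral group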
An automated classification algorithm for multiwavelength data
Feature selection is an important step in the data-preprocessing stage of data mining; it improves the performance of data mining algorithms by removing irrelevant and redundant features. By positional cross-identification, multi-wavelength data for 1656 active galactic nuclei (AGNs), 3718 stars, and 173 galaxies were obtained from the optical (USNO-A2.0), X-ray (ROSAT), and infrared (Two Micron All-Sky Survey) bands. In this paper we apply a filter approach named ReliefF to select features from the multi-wavelength data. We then use a naive Bayes classifier to classify the objects with the selected feature subsets, and compare the results with and without feature selection, and with and without adding weights to features. The results show that the naive Bayes classifier based on ReliefF feature selection is robust and efficient for preselecting AGN candidates.
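A compact Python sketch of the approach follows: a simplified Relief-style weighting (the full ReliefF algorithm averages over the k nearest hits and misses per class, omitted here for brevity) followed by a Gaussian naive Bayes classifier from scikit-learn, with an invented toy feature matrix standing in for the multi-wavelength catalogue:

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    def relief_weights(X, y, n_iter=200, seed=0):
        """Simplified Relief: reward features that differ from the nearest
        miss and penalise features that differ from the nearest hit."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iter):
            i = rng.integers(n)
            dist = np.abs(X - X[i]).sum(axis=1)
            dist[i] = np.inf
            hit = np.argmin(np.where(y == y[i], dist, np.inf))
            miss = np.argmin(np.where(y != y[i], dist, np.inf))
            w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        return w / n_iter

    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0, 1, (100, 6)), rng.normal(1, 1, (100, 6))])
    y = np.array([0] * 100 + [1] * 100)

    w = relief_weights(X, y)
    selected = w > w.mean()                  # keep above-average-weight features
    clf = GaussianNB().fit(X[:, selected], y)
    print(selected, clf.score(X[:, selected], y))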
The realization of an automated data reduction pipeline in IRAF: the PhotMate system
Developments in imaging technology over the past decade have provided impetus toward the realization of automated data reduction systems within the astronomical community. These developments, in particular advances in CCD technology, have meant that the data volume associated with even modest observing programmes can reach gigabytes. We describe the development of an automated data reduction system for differential photometry called PhotMate. For reasons of reuse and interoperability, the system was developed entirely within the IRAF environment and now forms the backbone of our data analysis procedure. We discuss the methodologies behind its implementation and the use of IRAF scripts for the realization of an automated process. Finally, we place the effectiveness of such a system in context by reference to two recent observing runs at the Calar Alto observatory where we tested a new low-light-level (L3) CCD. It is our belief that this observing campaign is an important indicator of future trends in observational optical astronomy: as the cost of such devices decreases, their usage will increase, and with it the volume of data collectively generated, displacing large astronomical projects as the primary data generators.
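Once instrumental magnitudes are in hand, the core of differential photometry is comparison against an ensemble of constant stars on each frame. A minimal numpy sketch (independent of the IRAF-based PhotMate implementation; the magnitudes are toy numbers):

    import numpy as np

    # Instrumental magnitudes: rows are frames (epochs), columns are stars;
    # column 0 is the target, the rest are comparison stars.
    mags = np.array([
        [14.21, 13.50, 13.82, 14.05],
        [14.35, 13.52, 13.80, 14.06],
        [14.19, 13.49, 13.83, 14.04],
    ])

    target = mags[:, 0]
    ensemble = mags[:, 1:].mean(axis=1)      # mean comparison magnitude per frame

    diff_lc = target - ensemble              # differential light curve
    diff_lc -= diff_lc.mean()                # zero-centred
    print(diff_lc)

    # Comparing the comparison stars against each other estimates the noise floor
    check = mags[:, 1] - mags[:, 2:].mean(axis=1)
    print(check.std())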
Data reduction and analysis pipelines: the simple approach for the xFOSC family of instruments
We describe how instrument data-processing pipelines can be quickly and easily developed using the modularity, header-manipulation, and scripting features of the IRAF suite. Our illustrative case is the design of a simple IRAF-based reduction and analysis pipeline for the BFOSC instrument on the 1.52m Cassini Telescope at Loiano, run by the Osservatorio Astronomico di Bologna. On the basis of header keywords, raw frames are automatically processed in a series of steps: grouping by any observational parameter(s), CCD reduction, registration, coaddition, photometry, deconvolution, RGB-tricolour representation, and basic astrometry, with spectroscopy support partially implemented at present. In this way, FITS data can be automatically analysed from raw frames to an "end product" of final or near-final scientific quality while still "at the telescope", enabling much faster feedback. Since the xFOSC family of instruments produced by the Astronomical Observatory of Copenhagen, which includes BFOSC, shares an identical design and operation, it should be simple to adapt the pipeline to any of the ten FOSC instruments, such as DFOSC on the ESO/Danish 1.54m, ALFOSC on the Nordic Optical Telescope, and TFOSC on the new TT1 (Castelgrande) Telescope. We also aim to make it available for "on the fly" archival processing.
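The initial grouping step can be sketched in a few lines of Python: frames are binned by a tuple of header keywords before each reduction stage runs on every group. The keyword names below are illustrative; the pipeline itself relies on IRAF header-manipulation tasks:

    from collections import defaultdict

    def group_frames(headers, keys=("OBJECT", "FILTER", "EXPTIME")):
        """Group frame headers by a tuple of observational parameters."""
        groups = defaultdict(list)
        for filename, hdr in headers.items():
            groups[tuple(hdr.get(k) for k in keys)].append(filename)
        return groups

    headers = {
        "bfosc001.fits": {"OBJECT": "M51", "FILTER": "R", "EXPTIME": 120},
        "bfosc002.fits": {"OBJECT": "M51", "FILTER": "R", "EXPTIME": 120},
        "bfosc003.fits": {"OBJECT": "M51", "FILTER": "V", "EXPTIME": 180},
    }

    for params, files in group_frames(headers).items():
        print(params, "->", files)   # each group feeds CCD reduction and coaddition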
Experience in setting up a PC cluster
Ganghua Lin, Mei Zhang
In this paper we summarize our thinking and experience in setting up a PC cluster, in the expectation that it may be useful to those who intend to purchase a similar cluster in the near future.
Outlier detection in astronomical data
Astronomical data sets have experienced unprecedented and continuing growth in volume, quality, and complexity over the past few years, driven by advances in telescope, detector, and computer technology. Like many other fields, astronomy has become a very data-rich science. Information content is already measured in multiple terabytes, and even larger, multi-petabyte data sets are on the horizon. To cope with this data flood, the Virtual Observatory (VO) federates data archives and services, representing a new information infrastructure for 21st-century astronomy and providing a platform for scientific discovery. Data mining promises both to make the scientific utilization of these data sets more effective and more complete, and to open completely new avenues of astronomical research. Technological problems range from database design and federation to data mining and advanced visualization, leading to a new toolkit for astronomical research; similar challenges are encountered in other data-intensive fields today. Outlier detection, one of the four knowledge discovery tasks, is of great importance: the identification of outliers can often lead to the discovery of truly unexpected knowledge. In astronomy in particular, there is great interest in discovering unusual, rare, or unknown types of astronomical objects or phenomena, and outlier detection approaches for large data sets directly meet this need. In this paper we provide an overview of some techniques for the automated identification of outliers in multivariate data. Outliers often carry useful information: their identification is important not only for improving the analysis but also for indicating anomalies that may require further investigation. The techniques may be used during data preprocessing and also for preselecting candidates for special objects.
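A classical starting point for multivariate outlier detection is the Mahalanobis distance of each object from the sample mean; objects in the far tail of the corresponding chi-square distribution are flagged. The numpy/scipy sketch below illustrates just this one technique, on synthetic data, among the several approaches surveyed in the paper:

    import numpy as np
    from scipy.stats import chi2

    def mahalanobis_outliers(X, p=0.999):
        """Flag rows of X whose squared Mahalanobis distance exceeds the
        p-quantile of a chi-square distribution with d degrees of freedom."""
        diff = X - X.mean(axis=0)
        cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
        d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        return d2 > chi2.ppf(p, df=X.shape[1])

    rng = np.random.default_rng(3)
    X = rng.normal(size=(1000, 4))
    X[0] = [6.0, -5.0, 7.0, 6.0]              # an artificial outlier
    print(np.where(mahalanobis_outliers(X))[0])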
High-level parallel computing language
The High-level Parallel Computing Language (HPCL) combines the high performance of clusters with the ease of use of the high-level language Octave. An HPCL program runs concurrently in a set of virtual machines (VMs); HPCL programs are therefore machine independent. HPCL retains the elegance of current high-level languages, needing only one additional operator, @, to transfer data and commands among the VMs. HPCL is also compatible with the conventional high-level language: any sequential Octave program can run properly in the HPCL environment without modification. The realization of HPCL is briefly introduced in this report, including the main system components, execution strategy, and message transfer protocol.
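The flavour of the @ operator can be conveyed by a Python analogue in which commands and data shards are shipped to a pool of worker processes ("virtual machines") and the partial results gathered back. This is only an illustration of the idea; HPCL itself extends Octave and defines its own message transfer protocol:

    from concurrent.futures import ProcessPoolExecutor
    import math

    def run_on_vm(vm_id, func, arg):
        """Stand-in for sending a command and its data to virtual machine vm_id."""
        return vm_id, func(arg)

    def chunk_sum(xs):
        return sum(math.sin(x) for x in xs)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        shards = [data[i::4] for i in range(4)]          # one shard per "VM"
        with ProcessPoolExecutor(max_workers=4) as pool:
            futures = [pool.submit(run_on_vm, i, chunk_sum, shard)
                       for i, shard in enumerate(shards)]
            total = sum(f.result()[1] for f in futures)  # gather partial results
        print(total)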
Remote distributed pipeline processing of GONG helioseismic data: experience and lessons learned
Jean N. Goodrich, Shukur Kholikov, Charles Lindsey, et al.
The Global Oscillation Network Group (GONG) helioseismic network can create images of the farside of the Sun which frequently show the presence of large active regions that would otherwise be invisible. This ability to "see" through the Sun is of potential benefit to the prediction of solar influences on the Earth, provided that the data can be obtained and reduced in a timely fashion. Thus, GONG is developing a system to A) perform initial data analysis steps at six geographically distributed sites, B) transmit the reduced data to a home station, C) perform the final steps in the analysis, and D) distribute the science products to space weather forecasters. The essential requirements are that the system operate automatically around the clock with little human intervention, and that the science products be available no more than 48 hours after the observations are obtained. We will discuss the design, implementation, testing, and current status of the system.
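The home-station side of such a system reduces to a loop that watches for reduced data arriving from the remote sites, runs the final analysis, and checks each product against the 48-hour requirement. The Python sketch below is hypothetical: the directory, file-name pattern, and analysis call are placeholders, and file modification time stands in for the true observation-to-product latency:

    import time
    from pathlib import Path

    INCOMING = Path("/gong/incoming")       # hypothetical drop directory
    LATENCY_LIMIT = 48 * 3600               # seconds

    def final_analysis(path):
        """Placeholder for the final helioseismic analysis steps."""
        print(f"analysing {path.name}")

    def watch(poll_interval=300):
        seen = set()
        while True:
            for f in sorted(INCOMING.glob("site*_reduced_*.fits")):
                if f in seen:
                    continue
                final_analysis(f)
                age = time.time() - f.stat().st_mtime
                if age > LATENCY_LIMIT:
                    print(f"WARNING: {f.name} exceeds the 48-hour requirement")
                seen.add(f)
            time.sleep(poll_interval)

    if __name__ == "__main__":
        watch()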