Corral Framework: Trustworthy and Fully Functional Data Intensive Parallel Astronomical Pipelines [IMA]

Data processing pipelines are among the most common kinds of astronomical software. These programs are chains of processes that transform raw data into valuable information. In this work a Python framework for astronomical pipeline generation is presented. It features a design pattern (Model-View-Controller) on top of a SQL relational database capable of handling custom data models, processing stages, and result communication alerts, as well as producing automatic quality and structural measurements. This pattern provides separation of concerns between the user logic and data models and the processing flow inside the pipeline, delivering multiprocessing and distributed computing capabilities for free. For the astronomical community this means an improvement on previous data processing pipelines: the programmer no longer has to deal with the processing flow and parallelization issues, and can focus solely on the algorithms involved in the successive data transformations. This software as well as working examples of pipelines are available to the community at

Read this paper on arXiv…

J. Cabral, B. Sanchez, M. Beroiz, et al.
Mon, 23 Jan 17

Comments: 8 pages, 2 figures, submitted for consideration at Astronomy and Computing. Code available at this https URL
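
As a rough illustration of the idea in the Corral abstract above (processing stages that communicate only through a relational data model), the following is a minimal sketch using Python's standard sqlite3 module. It is not Corral's API: the table `observation`, the functions `load_raw`, `calibrate`, `alert`, and `run_pipeline`, and the "calibration" itself are all hypothetical names and logic chosen for this example.

```python
import sqlite3


def load_raw(conn, values):
    """Loader stage (hypothetical): insert raw measurements into the data model."""
    conn.executemany(
        "INSERT INTO observation (raw) VALUES (?)", [(v,) for v in values]
    )


def calibrate(conn):
    """Processing stage (hypothetical): handle only rows not yet calibrated."""
    rows = conn.execute(
        "SELECT id, raw FROM observation WHERE calibrated IS NULL"
    ).fetchall()
    for row_id, raw in rows:
        # Placeholder transformation; a real step would hold the science logic.
        conn.execute(
            "UPDATE observation SET calibrated = ? WHERE id = ?",
            (raw * 2.0, row_id),
        )


def alert(conn, threshold):
    """Alert stage (hypothetical): report ids whose calibrated value is high."""
    cursor = conn.execute(
        "SELECT id FROM observation WHERE calibrated > ? ORDER BY id",
        (threshold,),
    )
    return [row[0] for row in cursor]


def run_pipeline(values, threshold):
    """Run the stages in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observation ("
        "id INTEGER PRIMARY KEY, raw REAL NOT NULL, calibrated REAL)"
    )
    load_raw(conn, values)
    calibrate(conn)
    conn.commit()
    return alert(conn, threshold)


alerts = run_pipeline([1.0, 5.0, 10.0], threshold=9.0)
```

Because each stage selects its own input from the database rather than receiving it from the previous stage, the stages stay independent of one another, which is the kind of separation the abstract credits with making parallel execution possible.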

Porting the LSST Data Management Pipeline Software to Python 3 [IMA]

The LSST data management science pipelines software consists of more than 100,000 lines of Python 2 code. LSST operations will begin after support for Python 2 has been dropped by the Python community in 2020, and we must therefore plan to migrate the codebase to Python 3. During the transition period we must also support our community of active Python 2 users and this complicates the porting significantly. We have decided to use the Python future package as the basis for our port to enable support for Python 2 and Python 3 simultaneously, whilst developing with a mindset more suited to Python 3. In this paper we report on the current status of the port and the difficulties that have been encountered.

Read this paper on arXiv…

T. Jenness
Thu, 3 Nov 16

Comments: 4 pages, presented at Astronomical Data Analysis Software and Systems XXVI conference, Trieste, Italy, October 2016
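
To make the porting problem in the LSST abstract above concrete, here is a small dependency-free sketch of single-source code that behaves the same on Python 2 and Python 3. It does not use the future package named in the abstract; it only shows, by hand, two of the differences (text versus bytes, and integer division) that such a compatibility layer smooths over. The names `PY2`, `text_type`, `to_text`, and `mean` are hypothetical.

```python
import sys

PY2 = sys.version_info[0] == 2

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
# The conditional expression evaluates only the branch that applies,
# so the name `unicode` is never looked up on Python 3.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings if necessary."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


def mean(values):
    """Arithmetic mean that avoids Python 2 integer-division truncation."""
    values = list(values)
    return sum(values) / float(len(values))
```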

The offline software framework of the DAMPE experiment [IMA]

A software framework has been developed for the DArk Matter Particle Explorer (DAMPE) mission, a satellite-based experiment. The DAMPE software framework is mainly written in C++, while applications built on it are steered by Python scripts. The framework comprises four principal parts: an event data model, which contains all reconstruction and simulation information and is based on ROOT input/output (I/O) streaming; a collection of processing models, called algorithms, which are used to process the data of each event; common tools, which provide general functionality such as data communication between algorithms; and event filters. This article presents an overview of the DAMPE offline software framework and the major architectural design choices made during its development.

Read this paper on arXiv…

C. Wang, D. Liu, Y. Wei, et al.
Wed, 13 Apr 16

Comments: 5 pages, 2 figures
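
The division of labour described in the DAMPE abstract above (an event data model, algorithms that process each event, and event filters) can be sketched in a few lines of pure Python. This is only an illustration of the pattern, not the DAMPE framework, which the abstract says is mainly C++: the classes `Algorithm`, `Calibrate`, and `EnergyFilter`, the function `run`, the dictionary-based event, and the choice to apply filters after the algorithms are all assumptions made for this example.

```python
class Algorithm(object):
    """Base class (hypothetical): an algorithm updates one event in place."""

    def process(self, event):
        raise NotImplementedError


class Calibrate(Algorithm):
    """Convert a raw ADC count into an energy using a fixed gain."""

    def __init__(self, gain):
        self.gain = gain

    def process(self, event):
        event["energy"] = event["adc"] * self.gain


class EnergyFilter(object):
    """Event filter (hypothetical): keep events at or above a minimum energy."""

    def __init__(self, minimum):
        self.minimum = minimum

    def accept(self, event):
        return event["energy"] >= self.minimum


def run(events, algorithms, filters):
    """Event loop: run every algorithm on each event, then apply the filters."""
    kept = []
    for event in events:
        for algorithm in algorithms:
            algorithm.process(event)
        if all(f.accept(event) for f in filters):
            kept.append(event)
    return kept


events = [{"adc": 10}, {"adc": 100}]
kept = run(events, [Calibrate(gain=0.5)], [EnergyFilter(minimum=20.0)])
```

Here the event dictionary plays the role of the shared data model: algorithms never call each other and exchange information only through the event they are given.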

IVOA recommendation: Parameter Description Language Version 1.0 [IMA]

This document discusses the definition of the Parameter Description Language (PDL). In this language parameters are described in a rigorous data model. With no loss of generality, we will represent this data model using XML. PDL is intended to be an expressive language for self-descriptive web services, exposing the semantic nature of input and output parameters as well as all necessary complex constraints. PDL is a step forward towards true web services interoperability.

Read this paper on arXiv…

C. Zwolf, P. Harrison, J. Garrido, et al.
Tue, 29 Sep 15

Comments: N/A

Learning from FITS: Limitations in use in modern astronomical research [IMA]

The Flexible Image Transport System (FITS) standard has been a great boon to astronomy, allowing observatories, scientists and the public to exchange astronomical information easily. The FITS standard, however, is showing its age. When the standard was developed in the late 1970s, its authors made a number of implementation choices that, while common at the time, are now seen to limit its utility with modern data. The authors of the FITS standard could not anticipate the challenges which we are facing today in astronomical computing. Difficulties we now face include, but are not limited to, addressing the need to handle an expanded range of specialized data product types (data models), being more conducive to the networked exchange and storage of data, handling very large datasets, and capturing significantly more complex metadata and data relationships.
There are members of the community today who find some or all of these limitations unworkable, and have decided to move ahead with storing data in other formats. If this fragmentation continues, we risk abandoning the advantages of broad interoperability, and ready archivability, that the FITS format provides for astronomy. In this paper we detail some selected important problems which exist within the FITS standard today. These problems may provide insight into deeper underlying issues which reside in the format and we provide a discussion of some lessons learned. It is not our intention here to prescribe specific remedies to these issues; rather, it is to call the attention of the FITS and greater astronomical computing communities to these problems in the hope that it will spur action to address them.

Read this paper on arXiv…

B. Thomas, T. Jenness, F. Economou, et al.
Wed, 4 Feb 15

Comments: N/A

Architecture, implementation and parallelization of the software to search for periodic gravitational wave signals [CL]

The parallelization, design and scalability of the \sky code to search for periodic gravitational waves from rotating neutron stars are discussed. The code is based on an efficient implementation of the F-statistic using the Fast Fourier Transform algorithm. To perform an analysis of data from the advanced LIGO and Virgo gravitational wave detectors’ network, which will start operating in 2015, hundreds of millions of CPU hours will be required; a code that exploits the potential of massively parallel supercomputers is therefore mandatory. We have parallelized the code using the Message Passing Interface standard and implemented a mechanism for combining the searches at different sky positions and frequency bands into one extremely scalable program. The parallel I/O interface is used to avoid bottlenecks when writing the generated data to the file system. This allowed us to develop a highly scalable computation code, which would enable data analysis at large scales on acceptable time scales. Benchmarking of the code on a Cray XE6 system was performed to show the efficiency of our parallelization concept and to demonstrate scaling up to 50 thousand cores in parallel.

Read this paper on arXiv…

G. Poghosyan, S. Matta, A. Streit, et al.
Tue, 3 Feb 15

Comments: 11 pages, 9 figures. Submitted to Computer Physics Communications
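
The abstract above mentions combining searches over different sky positions and frequency bands into one parallel program. One simple way to divide such a grid among processes is a round-robin split, sketched below in plain Python. This is an assumed scheme for illustration, not the decomposition the authors implemented, and no MPI library is used: `rank` and `size` are passed in as ordinary integers, whereas in an MPI program they would come from the communicator. The function name `tasks_for_rank` is hypothetical.

```python
from itertools import product


def tasks_for_rank(rank, size, n_sky, n_bands):
    """Return the (sky position, frequency band) pairs assigned to one rank.

    Tasks are enumerated in a fixed order and dealt out round-robin, so
    every rank can compute its own share without any communication.
    """
    all_tasks = list(product(range(n_sky), range(n_bands)))
    return all_tasks[rank::size]


# Example: 3 sky positions x 2 frequency bands shared among 4 ranks.
assignment = [tasks_for_rank(r, 4, n_sky=3, n_bands=2) for r in range(4)]
```

With 6 tasks and 4 ranks, two ranks receive two tasks and two receive one; each task is assigned exactly once.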

SAMP, the Simple Application Messaging Protocol: Letting applications talk to each other [IMA]

SAMP, the Simple Application Messaging Protocol, is a hub-based communication standard for the exchange of data and control between participating client applications. It has been developed within the context of the Virtual Observatory with the aim of enabling specialised data analysis tools to cooperate as a loosely integrated suite, and is now in use by many and varied desktop and web-based applications dealing with astronomical data. This paper reviews the requirements and design principles that led to SAMP’s specification, provides a high-level description of the protocol, and discusses some of its common and possible future usage patterns, with particular attention to those factors that have aided its success in practice.

Read this paper on arXiv…

M. Taylor, T. Boch and J. Taylor
Wed, 7 Jan 15

Comments: 12 pages, 3 figures. Accepted for Virtual Observatory special issue of Astronomy and Computing
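
The hub-based pattern described in the SAMP abstract above can be illustrated with a toy in-process hub. This sketch is not the SAMP protocol or any SAMP library: it omits the transport, security, and metadata handling a real hub provides, and the class `ToyHub`, its methods, and the client names are hypothetical. The message type string `table.load.votable` is used only as a familiar-looking label.

```python
class ToyHub(object):
    """Toy message hub: clients subscribe to message types by name."""

    def __init__(self):
        # Maps client id -> {message type: callback}.
        self._subscriptions = {}

    def register(self, client_id):
        self._subscriptions[client_id] = {}

    def subscribe(self, client_id, mtype, callback):
        self._subscriptions[client_id][mtype] = callback

    def notify_all(self, sender_id, mtype, params):
        """Deliver a message to every other client subscribed to `mtype`."""
        recipients = []
        for client_id, handlers in sorted(self._subscriptions.items()):
            if client_id != sender_id and mtype in handlers:
                handlers[mtype](sender_id, params)
                recipients.append(client_id)
        return recipients


received = []
hub = ToyHub()
for name in ("catalogue-tool", "image-viewer", "table-viewer"):
    hub.register(name)

hub.subscribe(
    "table-viewer",
    "table.load.votable",
    lambda sender, params: received.append((sender, params["url"])),
)
recipients = hub.notify_all(
    "catalogue-tool", "table.load.votable", {"url": "http://example.org/t.xml"}
)
```

The sender never needs to know which applications are running; it hands the message to the hub, and only clients that declared an interest in that message type receive it.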

Virtual Observatory Publishing with DaCHS [IMA]

The Data Center Helper Suite DaCHS is an integrated publication package for building Virtual Observatory (VO) and Web services, supporting the entire workflow from ingestion to data mapping to service definition. It implements all major data discovery, data access, and registry protocols defined by the VO. DaCHS in this sense works as glue between data produced by the data providers and the standard protocols and formats defined by the VO. This paper discusses central elements of the design of the package and gives two case studies of how VO protocols are implemented using DaCHS’ concepts.

Read this paper on arXiv…

M. Demleitner, M. Neves, F. Rothmaier, et al.
Tue, 26 Aug 14

Comments: N/A

Your data is your dogfood: DevOps in the astronomical observatory [IMA]

DevOps is the contemporary term for a software development culture that purposefully blurs the distinction between software development and IT operations by treating “infrastructure as code.” DevOps teams typically implement practices summarised by the colloquial directive to “eat your own dogfood,” meaning that software tools developed by a team should be used internally rather than thrown over the fence to operations or users. We present a brief overview of how DevOps techniques bring proven software engineering practices to IT operations. We then discuss the application of these practices to astronomical observatories.

Read this paper on arXiv…

F. Economou, J. Hoblitt and P. Norris
Fri, 25 Jul 14

Comments: 7 pages, invited talk at Software and Cyberinfrastructure for Astronomy III, SPIE Astronomical Telescopes and Instrumentation conference, June 2014, Paper ID 9152-38

IVOA Recommendation: DALI: Data Access Layer Interface Version 1.0 [IMA]

This document describes the Data Access Layer Interface (DALI). DALI defines the base web service interface common to all Data Access Layer (DAL) services. This standard defines the behaviour of common resources, the meaning and use of common parameters, success and error responses, and DAL service registration. The goal of this specification is to define the common elements that are shared across DAL services in order to foster consistency across concrete DAL service specifications and to enable standard re-usable client and service implementations and libraries to be written and widely adopted.

Read this paper on arXiv…

P. Dowler, M. Demleitner, M. Taylor, et al.
Thu, 20 Feb 14
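
To show what "common parameters" shared across services might look like from a client's point of view, the sketch below builds a query URL that mixes two parameters named in the DALI standard (`RESPONSEFORMAT` and `MAXREC`) with a service-specific one. The base URL, the `ID` parameter, the values, and the helper `build_query_url` are hypothetical and chosen only for illustration; consult the standard itself for the exact semantics and allowed values.

```python
from urllib.parse import urlencode


def build_query_url(base_url, response_format=None, maxrec=None, **service_params):
    """Build a GET URL from common and service-specific parameters.

    `response_format` and `maxrec` map to the common RESPONSEFORMAT and
    MAXREC parameters; anything else is passed through unchanged as a
    service-specific parameter.
    """
    params = dict(service_params)
    if response_format is not None:
        params["RESPONSEFORMAT"] = response_format
    if maxrec is not None:
        params["MAXREC"] = str(maxrec)
    # Sort so the generated URL is deterministic.
    return base_url + "?" + urlencode(sorted(params.items()))


url = build_query_url(
    "http://example.org/dal/query",  # hypothetical service endpoint
    response_format="votable",
    maxrec=10,
    ID="obs-1",  # hypothetical service-specific parameter
)
```

Because the common parameters behave the same way for every conforming service, a helper like this could in principle be reused across different services by changing only the service-specific arguments.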

A practical approach to ontology-enabled control systems for astronomical instrumentation [IMA]

Even though modern service-oriented and data-oriented architectures promise to deliver loosely coupled control systems, they are inherently brittle as they commonly depend on a priori agreed interfaces and data models. At the same time, the Semantic Web and a whole set of accompanying standards and tools are emerging, advocating ontologies as the basis for knowledge exchange. In this paper we aim to identify a number of key ideas from the myriad of knowledge-based practices that can readily be implemented by control systems today. We demonstrate with a practical example (a three-channel imager for the Mercator Telescope) how ontologies developed in the Web Ontology Language (OWL) can serve as a meta-model for our instrument, covering as many engineering aspects of the project as needed. We show how a concrete system model can be built on top of this meta-model via a set of Domain Specific Languages (DSLs), supporting both formal verification and the generation of software and documentation artifacts. Finally we reason how the available semantics can be exposed at run-time by adding a “semantic layer” that can be browsed, queried, monitored etc. by any OPC UA-enabled client.

Read this paper on arXiv…

Date added: Tue, 22 Oct 13