The Durability and Fragility of Knowledge Infrastructures: Lessons Learned from Astronomy [CL]

Infrastructures are not inherently durable or fragile, yet all are fragile over the long term. Durability requires care and maintenance of individual components and the links between them. Astronomy is an ideal domain in which to study knowledge infrastructures, due to its long history, transparency, and accumulation of observational data over a period of centuries. Research reported here draws upon a long-term study of scientific data practices to ask questions about the durability and fragility of infrastructures for data in astronomy. Methods include interviews, ethnography, and document analysis. As astronomy has become a digital science, the community has invested in shared instruments, data standards, digital archives, metadata and discovery services, and other relatively durable infrastructure components. Several features of data practices in astronomy contribute to the fragility of that infrastructure. These include different archiving practices between ground- and space-based missions, between sky surveys and investigator-led projects, and between observational and simulated data. Infrastructure components are tightly coupled, based on international agreements. However, the durability of these infrastructures relies on much invisible work – cataloging, metadata, and other labor conducted by information professionals. Continual investments in care and maintenance of the human and technical components of these infrastructures are necessary for sustainability.

Read this paper on arXiv…

C. Borgman, P. Darch, A. Sands, et al.
Wed, 2 Nov 16

Comments: Paper presented at the 2016 Annual Meeting of the Association for Information Science and Technology, October 14-18, 2016, Copenhagen, Denmark. 10 pages

Science Learning via Participation in Online Citizen Science [IMA]

We investigate the development of scientific content knowledge of volunteers participating in online citizen science projects in the Zooniverse, including the astronomy projects Galaxy Zoo and Planet Hunters. We use econometric methods to test how measures of project participation relate to success in a science quiz, controlling for factors known to correlate with scientific knowledge. Citizen scientists believe they are learning about both the content and processes of science through their participation. We don’t directly test the latter, but we find evidence to support the former: that more actively engaged participants perform better in a project-specific science knowledge quiz, even after controlling for their general science knowledge. We interpret this as evidence of learning of science content inspired by participation in online citizen science.

Read this paper on arXiv…

K. Masters, E. Oh, J. Cox, et al.
Mon, 25 Jan 16

Comments: 32 pages (9 pages of Appendix material). Accepted for publication in the Journal of Science Communication (JCOM)

From Stars to Patients: Lessons from Space Science and Astrophysics for Health Care Informatics [IMA]

Big Data are revolutionizing nearly every aspect of modern society. One area where this can have a profound positive societal impact is the field of Health Care Informatics (HCI), which faces many challenges. The key idea behind this study is: can we use some of the experience and the technical and methodological solutions from fields that have successfully adapted to the Big Data era, namely astronomy and space science, to help accelerate the progress of HCI? We illustrate this with examples from the Virtual Observatory framework and the NCI EDRN project. Effective sharing and reuse of tools, methods, and experiences from different fields can save a lot of effort, time, and expense. HCI can thus benefit from the proven solutions to big data challenges from other domains.

Read this paper on arXiv…

S. Djorgovski, A. Mahabal, D. Crichton, et al.
Thu, 17 Dec 15

Comments: 3 pages, to appear in refereed Proc. IEEE Big Data 2015, IEEE press

From Thread to Transcontinental Computer: Disturbing Lessons in Distributed Supercomputing [IMA]

We describe the political and technical complications encountered during the astronomical CosmoGrid project. CosmoGrid is a numerical study on the formation of large scale structure in the universe. The simulations are challenging due to the enormous dynamic range in spatial and temporal coordinates, as well as the enormous computer resources required. In CosmoGrid we dealt with the computational requirements by connecting up to four supercomputers via an optical network and making them operate as a single machine. This was challenging, if only for the fact that the supercomputers of our choice are separated by half the planet: three of them are scattered across Europe and the fourth is in Tokyo. The co-scheduling of multiple computers and the ‘gridification’ of the code enabled us to achieve an efficiency of up to $93\%$ for this distributed intercontinental supercomputer. In this work, we find that high-performance computing on a grid can be done much more effectively if the sites involved are willing to be flexible about their user policies, and that having facilities to provide such flexibility could be key to strengthening the position of the HPC community in an increasingly Cloud-dominated computing landscape. Given that smaller computer clusters owned by research groups or university departments usually have flexible user policies, we argue that it could be easier to instead realize distributed supercomputing by combining tens, hundreds or even thousands of these resources.

Read this paper on arXiv…

D. Groen and S. Zwart
Tue, 7 Jul 15

Comments: Accepted for publication in IEEE conference on ERRORs

Crowdfunding Astronomy Outreach Projects: Lessons Learned from the UNAWE Crowdfunding Campaign [CL]

In recent years, crowdfunding has become a popular method of funding new technology or entertainment products, or artistic projects. The idea is that people or projects ask for many small donations from individuals who support the proposed work, rather than a large amount from a single source. Crowdfunding is usually done via an online portal or platform which handles the financial transactions involved. The Universe Awareness (UNAWE) programme decided to undertake a Kickstarter crowdfunding campaign centring on the resource Universe in a Box. In this article we present the lessons learned and best practices from that campaign.

Read this paper on arXiv…

A. Ashton, P. Russo and T. Heenatigala
Mon, 8 Dec 14

Comments: Published – Communicating Astronomy with the Public journal #16 (4 pages) (2014)

The first SPIE software Hack Day [IMA]

We report here on the software Hack Day organised at the 2014 SPIE conference on Astronomical Telescopes and Instrumentation in Montreal. The first ever Hack Day to take place at an SPIE event, the aim of the day was to bring together developers to collaborate on innovative solutions to problems of their choice. Such events have proliferated in the technology community, providing opportunities to showcase, share and learn skills. In academic environments, these events are often also instrumental in building community beyond the limits of national borders, institutions and projects. We show examples of projects the participants worked on, and provide some lessons learned for future events.

Read this paper on arXiv…

S. Kendrew, C. Deen, N. Radziwill, et al.
Thu, 7 Aug 14

Comments: To be published in Proc. SPIE volume 9152; paper will be available in the SPIE Digital Library via Open Access

10 Simple Rules for the Care and Feeding of Scientific Data [CL]

This article offers a short guide to the steps scientists can take to ensure that their data and associated analyses continue to be of value and to be recognized. In just the past few years, hundreds of scholarly papers and reports have been written on questions of data sharing, data provenance, research reproducibility, licensing, attribution, privacy, and more, but our goal here is not to review that literature. Instead, we present a short guide intended for researchers who want to know why it is important to “care for and feed” data, with some practical advice on how to do that.

Read this paper on arXiv…

Fri, 10 Jan 14