Astrophysics Code Sharing II: The Sequel
On Tuesday, 7 January 2014, at the 223rd AAS meeting in Washington, DC, the AAS Working Group on Astronomical Software (WGAS) and the Astrophysics Source Code Library (ASCL) sponsored a special session on code sharing as a follow-up to the splinter meeting “Astrophysics Code Sharing?” held at the 221st AAS meeting in Long Beach, California, a year earlier. The following is abridged from a longer report that appears on the ASCL website.
The standing-room-only session was moderated by Peter Teuben (University of Maryland), chairman of the ASCL Advisory Committee; Robert Hanisch (STScI), outgoing chair of the WGAS and also a member of the ASCL Advisory Committee, provided closing remarks.
A very brief summary of some main points of each presentation, along with its title, presenter, and a link to slides (as PDF files), is given here.
- Occupy Hard Drives: Making Your Work More Valuable by Giving It Away, Benjamin Weiner (University of Arizona)
Ben pointed out that time spent writing software represents an enormous sunk cost that is, unfortunately, not viewed as doing real work, though writing software is part of doing science. He stated that widely used software has enabled at least as much science as a new instrument would. He encouraged people to document their code for their own sake, to release it without worrying about bugs or other potential issues in the software, and to write software methods papers for journals. slides
- Maintaining a User Community for the Montage Image Mosaic Toolkit, Bruce Berriman (Caltech)
In this case study of Montage, Bruce stated that releasing software comes with a cost, but that it is still worth doing. Montage was developed under contract and was designed for ease of maintenance, modularity, and sustainability from the beginning. It is maintained primarily through volunteer effort, and in part through collaborations, e.g., with the LSST EPO team. He said the Caltech license under which Montage is distributed does not allow users to redistribute modified code, nor can Montage be included in other distributions such as Red Hat. He suggested that coders consider licensing carefully. slides
- Cloudy: Simulating the Non-Equilibrium Microphysics of Gas and Dust, and Its Observed Spectrum, Gary Ferland (University of Kentucky)
Gary discussed Cloudy, which, with over three decades of use, is the most mature of the three codes covered in this session. The code is autonomous and self-aware, providing warnings about what might have gone wrong when things do go wrong. Though the user community is broad, and participants in the summer schools that are held on the code have formed collaborations, a Yahoo! discussion forum for Cloudy has not been as successful as they had hoped. Cloudy was released as open access, with the most permissive license possible; Gary cited NSF as making this necessary since the code was developed with public grant funds. Students who work on the code get industry-standard programming experience, which is intended to help students gain employment after graduation. slides
- NSF Policies on Software and Data Sharing and Their Implementation, Daniel Katz (National Science Foundation)
Dan covered the NSF policies that govern software funded by the agency. Though some NSF panels are much more rigorous than others, it is expected that PIs will publish all significant findings, including data and software; he stated quite firmly that, according to the government, data include software. He also said that it is up to the community, via peer-review panels, to enforce these policies, that many core research programs don’t enforce them very well, and that the community determines what is and is not acceptable. This may be changing, however, as an Office of Science and Technology Policy memo on open data and OMB policies are pushing harder on open access. slides
- The Astropy Project’s Self-Herding Cats Development Model, Erik Tollerud (Yale University)
The newest of the three code projects highlighted is Astropy. Erik described the grassroots effort to self-organize the now ~60 contributors to the code base and said that it arose out of a common goal: to streamline astronomy tools written in Python, as having eight different packages to do the same thing means that seven-eighths of the effort was wasted. He stated that technology now exists that provides good support for such an effort, including GitHub to manage the work of many developers, Travis for testing code, and Sphinx for documentation, which is written as the code is written. He pointed out that agreement on the problem was the key to getting the effort to come together and that consensus, guidelines, and expectations make it work. slides
- Costs and Benefits of Developing Out in the Open, David W. Hogg (New York University)
David started out by saying that everything his group does is open — all papers, grant proposals, comments, and codes — and has been since 2005, and that this was a pragmatic, not an ethical decision. He stated that the negatives others give for not releasing code — getting scooped, embarrassment, time, email and support requests, licensing — are overplayed, and that since the public is paying for this, we should return the products we develop to them. He doesn’t know of a single case of someone getting scooped because he/she shared code. Rather, the benefits that sharing openly provides — establishing priority, visibility and goodwill, re-use and citations, feedback and bug-catching, and having the moral high ground — outweigh the overplayed negatives. slides
After David’s presentation, Peter opened the floor for questions and discussion, which was lively and wide-ranging, touching on enforcement of policies governing release of research products, lack of long-term stewardship of software, export-control restrictions, costs and benefits of sharing code, reasons people do not release software, assigning credit to those writing useful programs, licensing, and other topics. Discussion lasted approximately 40 minutes, after which Peter turned the podium over to Robert Hanisch for closing remarks.
Robert reiterated that software sharing is fundamental to the dissemination and validation of research results, and though there are carrots and sticks for software sharing, the sticks are not very strong. He also pointed out that nothing within the funding agencies offers support for software development and that there is a disconnect between national policy and implementation.
He also talked about opportunities for change; as of Sunday, 5 January, the Working Group on Astronomical Software has Frossie Economou as its new chair, and over the weekend the AAS Council had suggested that the WGAS be elevated from a working group to a division within the AAS. Having a division focused on software will provide more visibility for it, and on this hopeful note, the session ended.
A Few Thoughts
This is the fourth discussion session the ASCL has arranged; previous sessions include one at AAS 221 and one at each of the previous two ADASS meetings. Links to materials or discussion from previous sessions are available on the ASCL blog.
I believe that science should be as transparent as possible, that code release (absent ITAR and other truly compelling reasons), even if only for examination, not reuse, is part of this transparency, and that ultimately code release is better for code authors, especially if the astronomy community works together to make it better for them. Code sharing can make astronomy more efficient, too, which is especially important in the current financial climate.
I want to thank Peter for moderating the session; Bob for offering closing remarks; the most excellent Ben, Bruce, Gary, Erik, Dan, and David for presenting at this session; our wonderful volunteer, whose name I did not get, alas, for her great work and for counting the 149 attendees; the AAS for accepting the proposal in the first place; and the amazing people who sent this session literally around the world through their tweets. Thank you!
Editor, Astrophysics Source Code Library