There is nothing like a rainy and windy day in an isolated Swedish harbour to finish writing the last part of this series of texts, which correspond to a talk entitled “Histoire du traitement des données spatiales” given at the “Campus Spatial de l’Université Paris Diderot” on September 20, 2013. This was a personal account, closer to a random walk in the fog than to a true historical perspective on the subject.

 

The ISDC

 

The ISDC, originally the INTEGRAL Science Data Centre, which became the Data Centre for Astrophysics in its later years, was conceived when it became clear that ESA's gamma ray satellite INTEGRAL should be an observatory open to the world astronomical community. This decision was made very early in the mission design.

 


Fig. 1 "The Castle" in Ecogia, in the vicinity of Versoix near Geneva: the building into which the embryonic ISDC team moved in 1996.

The chronology of the ISDC history is the following:

1989 A group of high energy astrophysicists, of which I was a member, proposes to ESA a gamma ray observatory mission, INTEGRAL. The original concept included a data centre.

1993 ESA selects INTEGRAL as its second medium size mission, M2, within the Horizon 2000 programme. Teams form to define the instruments and the data centre. I lead the team defining the data centre, which is named the ISDC, INTEGRAL Science Data Centre.

1995 ESA selects the instruments that were to be the payload of INTEGRAL and the ISDC as the data centre.

1996 The ISDC moves from the Geneva Observatory to a building in Ecogia, on the outskirts of the small town of Versoix.

1999 The ISDC enters into discussions with the PLANCK mission consortium to see how the work invested in the framework of the INTEGRAL project could be used in the PLANCK data architecture. The Swiss confederation is slow to approve the funding, leading in the end to a proposal for a small contribution.

2001 One of the PLANCK instrument teams decides to use the ISDC expertise. This is approved by ESA and the Swiss funding authorities.

2002 Launch of INTEGRAL. The ISDC becomes operational. It is still operating 12 years into the mission.

2006 Astronomers at the Geneva observatory make extensive use of the ISDC experience to build an important section of the GAIA data analysis system.

2007 A collaboration between the institute for particle physics of the university of Geneva and a member of the ISDC staff leads to the start of the definition of a gamma ray polarimeter, POLAR, at ISDC.

2009 An important collaboration is established between the ISDC and SRON (NL) to develop hardware and data analysis capabilities for the Japanese-led X-ray astronomy mission ASTRO-H.

2009 The Swiss National Science Foundation funds CTA (Cherenkov Telescope Array) activities at ISDC.

2010 The successes of the ISDC in several collaborations well beyond the original INTEGRAL mission lead to the preparation of a strategy to establish the ISDC within the university of Geneva as a multi-mission data analysis centre serving the Swiss space science institutes and participating in a developing European coordination of such centres.

2011 ISDC staff, particle physicists and theoretical physicists in Geneva create CAP-Genève, a centre for astroparticle physics.

2012 Funding for data analysis activities for ESA's EUCLID mission starts at ISDC.

2012 The university of Geneva dismantles the ISDC. The centre no longer exists as such; only individual projects are considered under the observatory of Geneva.

The ISDC development

The ISDC was conceived in part to decrease the mission cost to ESA by using  national funding expected to be available for data centre related activities, and in part to use the experience and capacities available within the INTEGRAL user community. This competence had its origin partly in the EXOSAT observatory experience related in the first part of this series.

The INTEGRAL mission was approved by ESA in 1993 in a competitive process. A call for instruments and a data centre to be provided through national funding followed. While this was standard procedure in the case of instruments, it was the first time that a science data centre was included in the call.

The tasks that were expected from a data centre for INTEGRAL were to receive, process, analyse, distribute and archive all the science data. The requirements were spelled out by ESA together with those related to the instruments and issued within the call for proposals. The community organised itself to respond to this call. A consortium of 12 institutes (11 in Europe and one in the US) was formed to propose the ISDC as the data centre. Although the call was competitive, only one consortium submitted a proposal, federating the strengths of the whole interested community.

The INTEGRAL ground segment was designed so that ESA kept control of the uplink to the satellite and instruments and retained responsibility for the safety of the mission. The science data were to be directed as quickly as possible to the ISDC where they were to be processed. Fig. 2 shows the main elements of the ground segment.

 


 

Fig. 2 INTEGRAL ground segment. MOC: Mission Operations Centre, at the European Space Operations Centre (ESOC) in Darmstadt; ISOC: INTEGRAL Science Operations Centre at the European Space Astronomy Centre (ESAC) near Madrid.

 

INTEGRAL generates 120 kbits/s, 24 hours per day, all year round. Data "reduction" actually more than doubles the stored information in the course of the processing. While this rate appears small now (2014), it was significant with regard to the network and storage capabilities of a science institute at the time of conception around 1990. Options considered then to deal with the mass of data included large tape jukeboxes or disk storage. The first option was the safe one at the time. We nonetheless decided to trust that computing and network progress would continue at a reasonable pace, so that the technology would be available at a reasonable cost by the time of launch, and opted for disk storage. This was one of the few key decisions that allowed us to build a solid system capable of evolving with technology.
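To give a feeling for the volumes involved, here is a back-of-the-envelope sketch of the arithmetic. It is only an illustration, not the sizing calculation we actually made: it assumes 1 kbit = 1000 bits and 1 Tbyte = 10^12 bytes, ignores the short gaps in transmission, and takes an expansion factor of exactly 2 as a lower bound for the "more than doubled" processed data. The function names are mine, chosen for this example.

```python
# Rough estimate of the storage needed for a continuous telemetry stream.
# Assumptions (not from the mission documentation): decimal units,
# uninterrupted transmission, and an expansion factor of 2 as a lower bound.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25


def raw_bytes_per_year(rate_kbit_per_s: float) -> float:
    """Raw telemetry volume in bytes accumulated over one year."""
    bits_per_year = rate_kbit_per_s * 1000 * SECONDS_PER_DAY * DAYS_PER_YEAR
    return bits_per_year / 8


def stored_terabytes(rate_kbit_per_s: float, years: float,
                     expansion_factor: float) -> float:
    """Stored volume in Tbytes after processing has expanded the raw data."""
    total_bytes = raw_bytes_per_year(rate_kbit_per_s) * years * expansion_factor
    return total_bytes / 1e12


raw_tb_per_year = stored_terabytes(120, years=1, expansion_factor=1.0)
two_years_tb = stored_terabytes(120, years=2, expansion_factor=2.0)

print(f"Raw telemetry per year: {raw_tb_per_year:.2f} Tbytes")
print(f"Two years, doubled by processing: {two_years_tb:.2f} Tbytes")
```

Under these assumptions the raw stream amounts to a little under half a Tbyte per year, and two years of data doubled by processing come to just under 2 Tbytes, which is consistent with the four Tbytes installed at launch for two years of operation.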

The data processing and analysis of INTEGRAL involves three timescales. Data are first processed within seconds, automatically, using algorithms designed to detect gamma ray bursts (bright events that last from a fraction of a second to some hundred seconds), measure their positions on the sky and, also automatically, disseminate this information to the community. This very rapid processing allows astronomers to point instruments in several wavebands towards the gamma ray burst location as early as possible and, whenever possible, while the burst is still ongoing. A second processing is made within hours of data arrival at ISDC. The aim there is to detect new sources and unexpected variability from known sources. The intervention of astronomers in the process is possible on this timescale and ensures that no false results and expectations are generated. Here again the community is alerted to the results so that observations elsewhere, but also with INTEGRAL, can be organised to study the newly discovered phenomena. Finally the data are consolidated at ESA, telemetry problems that can be resolved are repaired, the data are sent, offline, to ISDC and a "final" analysis is performed prior to data shipment to the end user and archiving. "Final" is a poor qualifier, as a new processing of the data is performed whenever significant progress is made in the instrument calibration. This happens typically on a yearly timescale. The software is then updated to reflect the new instrument knowledge and all the data are reprocessed using the improved analysis.

INTEGRAL transmits data 24h per day (with the exception of a few hours every four days when the spacecraft is at perigee). The ISDC system is therefore working 24h per day, all year round. The system was designed so that it can run largely autonomously, sending alerts to mobile phones when abnormal conditions occur, either in the processing system or in the instruments. Special events, like gamma ray bursts, are also announced via telephone to the science operations staff of ISDC.

The development of the ISDC system took seven years, from 1995 to the launch in 2002. The team in charge of this development started with two people and grew to 40 people at ISDC. At least as many scientists and engineers worked within the instrument development teams to provide the instrument specific software that had to be incorporated in the ISDC system. Although all the problems that can occur between instrument builders and analysis developers mentioned in the first part of this series were met, and although (or maybe because) heated discussions resulted from some of the tensions, the development was highly successful. The system was ready and tested before launch and could be turned on as soon as the first signals from the satellite reached the ground. It has kept running ever since with only very minor interruptions. In addition to its use within the ISDC, the part of the system that deals with off-line data is widely distributed. It has been and is being used worldwide by astronomers not necessarily familiar with the mission to analyse data from observations they proposed, or from observations performed by others for which the data are publicly available. These astronomers contribute to making the best possible use of the facility.

One problem that appeared early in the design of the system was that many processes were to run simultaneously while using the same data and transferring information between them. This requires intensive interaction between the processes, a difficulty not well mastered at the start of the ISDC activities. We therefore started a technology development to create the tools we needed. Although this effort was led by an extremely competent engineer, we decided to stop it after a number of months and to rely exclusively on existing technologies. This was a second key decision for the ISDC. It is not advisable for a small team concerned with a data processing application, astrophysics in our case, to embark on developing computing technology of its own.

A further decision we took was to rely solely on freely available software, avoiding commercial solutions. This allowed us to distribute the tools we developed very widely. The granularity of the software was also important. Some had wanted a very highly integrated package, which we thought would be impossible to test and get working. Many more decisions were taken along the project, but these are the ones that I found shaped the result most.

At launch the ISDC was made of three buildings in Ecogia. In addition to the building into which the embryo of the team moved in 1996, a barn had been renovated to host the operational activities and a new pavilion built to host the offices needed for the 36 ISDC staff involved. For the months following launch, additional space was rented for the instrument team scientists and engineers who needed proximity to the data. 800'000 lines of code had been written and organised in 340 components, 10 "pipelines", and 1030 data templates. The software ran on 100 SUN computers under SOLARIS. Four networks, including a secured one for the operations, were set up. Four Tbytes of data storage were installed, enough for two years of operation. The whole complex was documented and under configuration control.

Hiring staff, hosting staff on leave from most of the institutes of the consortium, having a building renovated and another one built, dealing with the procedures of the University, the canton of Geneva, the Swiss confederation, population control and work permits, and establishing an accounting system and financial control all required a solid administration that also had to be created and invented from scratch. This is an integral part of the ISDC development.

There is considerable stress in developing a mission-critical system that serves a project on which hundreds of scientists and engineers have worked for years. This stress is amplified by the need to obtain funding regularly for this activity from sources that do not always understand the potential and the difficulties involved. The solidity of the funding is a key part of the success. The relevant Swiss authorities have proven, through their support of the ISDC, their dedication to supporting space science activities all the way to the generation of scientific results.

The stress on the data centre staff often induces scientists to turn their attention exclusively to project development, setting aside their own research, observations with existing instruments, and interpretation of their results. This trend must be countered, and a reasonable scientific activity must be kept, if the data analysis developers are to keep pace with the community outside. This again is essential for designing a system that serves users and not developers. The support of a science advisory body for the ISDC was most helpful in keeping this balance over the years.

Keeping a vibrant scientific enterprise in the midst of the project work is also a condition to ensure that once the satellite is launched the researchers will be able to exploit the data they have learned to process. The ISDC scientists did succeed in this part of their mission as well.  A very significant fraction of the INTEGRAL results have been and are being published by groups  that include one or several of us. This was the original aim of the whole enterprise.

Finally it must be stressed that all the work is done by real people and that the human factors are the dominant ones in the success.

 

The End

 

Despite the ISDC's success, the University dismantled it. What is left now is a number of colleagues who gathered significant expertise and can use it in various projects. But lost is the coherent strategy that could have placed the University and Switzerland in the midst of a network that is emerging to deal with the quickly expanding world of space data. Lost is also a clear space science strategy for our University. This decision wasted a large fraction of the hard work and talent that had been spent to acquire the credibility necessary to emerge on the world space science data scene.