Systems engineering and science projects: lessons from MeerKAT
Dynamic scientific research and rigorous formal engineering processes do not always make for the easiest partnerships, particularly in the development of large or ‘mega’ projects. Difficulties may arise between multiple stakeholders with conflicting interests, or there may be discrepancies between technologies and designs. Furthermore, some nine out of 10 such schemes overrun costs [1]. The MeerKAT [2] radio telescope project, however, has demonstrated that the application of systems engineering principles can be highly effective in overcoming some of these issues.
MeerKAT is currently under construction in the Karoo region of South Africa. When completed in 2017, it will consist of 64 interlinked receptors, each with a main reflector dish of 13.5 m diameter. The project is a pathfinder to the larger Square Kilometre Array (SKA), an international collaboration to build the world's largest radio telescope in Africa and Australia. However, MeerKAT is also a megaproject in its own right. It would therefore be prudent to capture any possible lessons from MeerKAT and to make them available to SKA [3].
At an early stage in MeerKAT's development, we (the project management team) noted various technical risks for each of the project's subsystems. For the digitizer, these risks included radio frequency interference/electromagnetic compatibility, environmental conditions (which could affect performance), and implementation (without the use of a control processor) of the 10Gb/s Ethernet interfaces. For the correlator beamformer (which receives the signals from all the individual antennas and combines them), the risks were the programming environment for the field-programmable gate arrays (FPGAs), the availability of a next-generation processing platform, and the performance of the 40Gb/s Ethernet. For the time and frequency reference subsystem, risks included the design of a synchronization solution, the effect of temperature on the distribution network, and the very long procurement cycles for hydrogen masers.
To manage these perceived risks, we used a systems engineering approach. Our team created a concept design that we used to solicit science proposals from the community. With those proposals in hand, we were able to give direction to the design effort and unambiguous guidance for any trade-off decisions, such as the array layout. To meet the transient science requirements, a dense core with fewer antennas further out was preferable, while the imaging requirements called for a more distributed array with longer baselines. We built a simulation tool to evaluate different array layouts, and were able to find a good compromise.
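The layout trade-off can be illustrated with a minimal sketch. This is not the MeerKAT simulation tool, whose internals are not described here. It only assumes that a layout can be summarized by the lengths of its baselines (the distances between antenna pairs), with short baselines favoring the dense-core case and long baselines favoring imaging. The function names, the random placement, the 48/16 antenna split, the disc radii, and the 1000 m threshold are all hypothetical values chosen for illustration.

```python
import itertools
import math
import random


def baseline_lengths(positions):
    """Return the length in metres of every antenna pair (baseline)."""
    return [math.dist(a, b) for a, b in itertools.combinations(positions, 2)]


def summarize(positions, short_limit_m=1000.0):
    """Summarize a layout by its share of short baselines and its longest one.

    short_limit_m is an assumed threshold, not a project requirement.
    """
    lengths = baseline_lengths(positions)
    short = sum(1 for d in lengths if d <= short_limit_m)
    return {
        "n_baselines": len(lengths),
        "short_fraction": short / len(lengths),
        "longest_m": max(lengths),
    }


def random_layout(n, radius_m, rng):
    """Place n antennas uniformly at random inside a disc (hypothetical layout)."""
    points = []
    for _ in range(n):
        r = radius_m * math.sqrt(rng.random())  # sqrt gives uniform area density
        theta = 2 * math.pi * rng.random()
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


rng = random.Random(0)  # fixed seed so the comparison is repeatable

# Candidate A: most antennas in a compact core, the rest spread further out.
dense_core = random_layout(48, 500.0, rng) + random_layout(16, 4000.0, rng)
# Candidate B: all antennas spread over the larger area.
distributed = random_layout(64, 4000.0, rng)

for name, layout in [("dense core", dense_core), ("distributed", distributed)]:
    print(name, summarize(layout))
```

Comparing the two summaries shows the tension described above: concentrating antennas in a core raises the share of short baselines, while spreading them out shifts the distribution toward longer ones. A real evaluation would score each layout against the prioritized science cases rather than against two summary numbers.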
The correlator beamformer is an evolutionary design based on KAT-7 [4], the precursor to MeerKAT, and was thus regarded as a lower-risk element of the project because we had in-house design and implementation experience readily available. Nonetheless, the correlator beamformer was dependent on improvements in processing technology performance (notably the FPGAs), in line with Moore's law. The other subsystems (the digitizer and the time and frequency reference system) were newly identified and without precedent in South Africa, and had requirements that were unique worldwide. The digitizer, which contains high-speed electronics, was mounted next to a very sensitive radio astronomy receiver in a dusty environment that was prone to large temperature variation. As a result, there was a high risk of self-induced radio frequency interference, and a high probability of encountering difficulties in stabilizing sensitive analog electronics over a wide operating temperature range. To compound these difficulties, the mitigation strategies we identified were limited and often resulted in increased design complexity, which would inevitably lead to challenges in achieving high availability of the system in its remote location. The time and frequency reference system required very precise timing, again in an uncontrolled environment, leading to high technical risk.
We considered the perceived risks in each subsystem and varied our approach to their development accordingly, while also adhering to a single systems engineering management plan. For the high-risk digitizer, we undertook a classic waterfall development process in order to avoid possible changes to requirements between stages. For the correlator beamformer, we allowed the requirements to remain more flexible for a longer time: instead of focusing on requirements analysis, as with the digitizer, we worked on additional modes based on the KAT-7 system. The time and frequency reference requirements also evolved over time as a result of changing scientific requirements. Here, the greater risks were in the changes to requirements. For instance, we defined mode concurrency after time allocation, and this had a significant impact on the cost of the correlator beamformer hardware. As expected, the risk profiles changed during execution of the project, and prompted us to make slight variations in our approach.
The digitizer (see Figure 1) started out as MeerKAT's highest-risk subsystem. However, it is now in full production with no remaining design risks. In contrast, the correlator beamformer (see Figure 2) still carries a number of open design risks as it approaches critical design review. The time and frequency reference system has not yet completed preliminary design review, mainly because of requirement uncertainty.
Drawing on the MeerKAT experiences, we can extract several key lessons for SKA. First, we have learned that the quality of requirements (completeness, correctness, clarity, and so forth) prior to starting construction and the discipline in managing those requirements are both critical for reliable construction planning. This will be even more important for SKA, which has a distributed and culturally diverse project team (unlike the collocated and well-integrated MeerKAT team). The second lesson we have learned is that complex projects result in complex design trade-offs. To empower a design team to make rational decisions, it is essential that the engineering requirements can be traced to a small set of well-prioritized science goals.
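As a rough illustration of the second lesson, the sketch below links each engineering requirement to the science goals that justify it, so that untraced requirements can be flagged and the rest ranked when a trade-off has to be made. It is an assumed data model, not the MeerKAT or SKA requirements database. The class names, requirement identifiers, requirement wording, and priority values are all hypothetical; only the two science cases (transient science and imaging) come from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScienceGoal:
    name: str
    priority: int  # 1 = highest priority (hypothetical ranking)


@dataclass
class Requirement:
    ident: str
    text: str
    goals: list = field(default_factory=list)  # science goals this traces to


def untraced(requirements):
    """Return the identifiers of requirements with no science goal behind them."""
    return [r.ident for r in requirements if not r.goals]


def rank_for_tradeoff(requirements):
    """Order traced requirements by the most important goal each one supports."""
    traced = [r for r in requirements if r.goals]
    return sorted(traced, key=lambda r: min(g.priority for g in r.goals))


# Illustrative data only: priorities and requirement wording are invented.
imaging = ScienceGoal("imaging", priority=1)
transients = ScienceGoal("transient science", priority=2)

requirements = [
    Requirement("REQ-1", "Dense core of antennas", [transients]),
    Requirement("REQ-2", "Long baselines in the outer array", [imaging]),
    Requirement("REQ-3", "Extra observing mode with no stated science driver"),
]

print("Untraced:", untraced(requirements))
print("Trade-off order:", [r.ident for r in rank_for_tradeoff(requirements)])
```

A requirement that appears in the untraced list is a candidate for removal or for a conversation with the science team. The ranked list gives a design team a defensible order in which to protect requirements when two of them conflict.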
Our future work will prioritize the completion of MeerKAT (due in late 2017). We are currently developing new processing platforms that are potentially useful to SKA, and from 2018 we will focus on construction of that project.
François Kapp is the subsystem manager for the digital backend team of the MeerKAT radio telescope in South Africa. He has also worked on South African space projects, and in the military and commercial sectors. Systems engineering has been a cornerstone of his background in research and development.