Proceedings Paper

Unveiling ALMA software behavior using a decoupled log analysis framework
Author(s): Juan Pablo Gil; Alexis Tejeda; Tzu-Chiang Shen; Norman Saez

Paper Abstract

ALMA Software is a complex distributed system installed on more than one hundred computers, which interacts with more than one thousand hardware device components. A normal observation follows a flow that interacts with almost the entire infrastructure in a coordinated way. The Software Operation Support team (SOFTOPS) comprises specialized engineers who analyze the generated software log messages on a daily basis to detect bugs and failures and to predict eventual failures. These log messages can reach up to 30 GB per day. We describe a decoupled, non-intrusive log analysis framework and the tools implemented to identify well-known problems, measure the time taken by specific tasks, and detect abnormal behavior in the system in order to alert the engineers to take corrective action. The main advantage of this approach, among others, is that the analysis itself does not interfere with the performance of the production system, which allows multiple analyzers to run in parallel. In this paper we describe the selected framework and show the results of some of the implemented tools.

Paper Details

Date Published: 18 July 2014
PDF: 7 pages
Proc. SPIE 9152, Software and Cyberinfrastructure for Astronomy III, 91521G (18 July 2014); doi: 10.1117/12.2055352
Juan Pablo Gil, ALMA (Chile)
Alexis Tejeda, National Radio Astronomy Observatory (United States)
Tzu-Chiang Shen, ALMA (Chile)
Norman Saez, ALMA (Chile)

Published in SPIE Proceedings Vol. 9152:
Software and Cyberinfrastructure for Astronomy III
Gianluca Chiozzi; Nicole M. Radziwill, Editor(s)

© SPIE.