Systems Seminar - CSE
Wizards and Fruitflies — Using Eternal and Ephemeral Overlays for Monitoring Distributed Systems
Many distributed applications running over emerging large-scale distributed infrastructures, such as p2p systems, PlanetLab, and Grids, require the capability to monitor and manage a large distributed set of nodes at run-time. While many solutions exist for network and infrastructure-level monitoring, there is a scarcity of management solutions that work at the level of the distributed application, e.g., for monitoring node availability and node count, finding the top-k loaded nodes, querying log files, etc. At the same time, much work has been done in the community on both persistent overlays, such as distributed hash tables (DHTs), and unstructured overlays (e.g., Gnutella). However, we find that two alternative, unexplored, and opposing extremes of the design spectrum, eternal and ephemeral overlays, are better suited to solving the above management tasks.
We present two systems, AVMON and MON, that seek to give distributed applications (and deployers) the ability both to monitor long-term availability histories of nodes in a distributed application and to query the group of nodes on the fly. AVMON is a scalable availability monitoring overlay that is resilient to selfish and colluding nodes. AVMON embodies the concept of an eternally persistent (hence eternal) overlay, where peering relationships between nodes, once established, remain forever. Our other system, MON, supports instant monitoring and management tasks using the novel concept of an on-demand and short-lived (hence ephemeral) overlay, which survives only for the duration of an individual management command. Both AVMON and MON are lightweight and fast in terms of memory, computation, bandwidth, and response time. Our mathematical analysis, trace-based simulations, and deployment atop PlanetLab all demonstrate the practical performance characteristics of these two approaches in systems containing hundreds to thousands of nodes. We touch briefly upon how to use AVMON for building availability-aware services, the usage of MON in PlanetLab, and other instances of eternal and ephemeral overlays we have studied.
Indranil Gupta is an assistant professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He received his PhD in Computer Science from Cornell University in 2004 and the NSF CAREER award in 2005. His research group, DPRG, works on distributed protocols and systems, with applications to large-scale distributed systems such as peer-to-peer systems and sensor networks. DPRG research is funded by several NSF grants, including multi-disciplinary ones. For more information on DPRG, visit http://kepler.cs.uiuc.edu