Chowkidar: A Health Monitor for Wireless Sensor Network Testbeds
Ohio State University, Columbus, United States
Wireless sensor network (WSN) testbeds are useful because they provide an environment in which it is easy to deploy experiments, configure them statically or dynamically, and gather performance information. Sensor data collected in the field can be replayed on nodes, and new ways to process the data can be tested easily. Testbeds are rapidly growing in size, to hundreds or thousands of devices, and testbed services are also becoming richer and more complex. Because of this size and complexity, faults can and do occur in these testbeds, affecting the outcomes of experiments. Awareness of testbed health status is important both to testbed administrators, who are charged with maintaining functional services, and to users, who prefer to use healthy devices and want to know whether any failures occurred during their experiments. Based on our experience with Kansei, a large WSN testbed at Ohio State, we identify use cases that motivate the design of Chowkidar, a health monitoring facility. Key among these are: monitoring as a service that operates independently of users to provide up-to-date testbed status information; monitoring of heterogeneous devices over a mixture of IP and non-IP networks; distinguishing between node and interface failures; and use of network dependency information to diagnose common-mode failures such as power supply or Ethernet hub failure. We then present a centralized and a distributed Chowkidar protocol that reliably monitor the health of large, heterogeneous WSN testbeds, and we experimentally compare their performance. We report on initial experiences and lessons learned from the integration of Chowkidar with Kansei. These include feedback from both testbed users and administrators, who have found Chowkidar to be a useful tool for improving the accuracy and efficiency of testbed experimentation and maintenance, and the need for well-defined policies to address issues such as minimizing interference with concurrently running experiments.