Janet, the publicly funded organisation that runs the UK’s research and education network, partnered with Tessella to develop Netsight®, a new centralised monitoring system that provides a higher level of visibility into the performance and availability of its UK-wide network, which comprises a national core and regional networks. At its heart is a relational database that holds 3.3 billion data rows, which can be accessed and analysed easily.
Background and Challenge
The Janet network is dedicated to the needs of research and education in the UK, connecting 18 million users across the UK’s research and education organisations to each other, as well as to the rest of the world. The range of activities facilitated by the Janet network allows individuals and organisations to extend the traditional boundaries of teaching, learning and research methods. For researchers, the high capacity of the Janet network backbone allows the linking of large data storage and high performance computing facilities at a national and international level.
To run the network effectively and ensure maximum uptime, Janet needs to monitor performance and availability constantly. However, because the previous monitoring system was spread across multiple servers in different locations, Janet found that it lacked resilience. Janet wanted a new approach to monitoring that would provide greater availability and deeper insight into trends and performance.
“Previously the system was replicated across 24 different UNIX machines at different locations around the network,” explains Alan Hames, Service Analysis Group Manager at Janet. “This was not only difficult to maintain but lacked resilience – if one machine went down there could be a big gap in the data collected for that part of the network.”
Janet explored a variety of industry-standard solutions but they did not meet its unique requirements. Instead, it approached Tessella to design and implement the new system through the government’s CATALIST tendering process.
“We looked at off-the-shelf products, but these were aimed at traditional IT environments and lacked the depth of permission controls we required,” adds Alan Hames. “Working with Tessella enabled us to build a bespoke solution that fitted our specific needs.”
Solution and Benefits
The result is Netsight®, a web-based system supporting a wide range of browsers that makes it possible for any user to see their data at the click of a link. Janet connects almost 1,000 organisations to the Internet via the 40 Gbit/s Janet backbone, and giving the organisations’ IT staff network visibility is a priority. In addition to viewing data, users can configure alerts on traffic conditions or connection availability and be e-mailed automatically in the event of a problem.
It uses a powerful relational database to store its data, and has been purpose-built with an emphasis on making that data available. Users can see graphs of network performance over any timescale, seeing last year’s data just as easily as the last hour’s, with the ability to zoom in on areas of interest with just the drag of a mouse.
Network measurements are performed by multiple remote servers before being fed to a central database in Bracknell. These remote servers are constantly monitored, and in the event of a problem the measurement jobs are automatically moved to ensure a consistent collection of data. The servers also have local storage capacity to capture data if they lose their connectivity.
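The two resilience behaviours described above, moving measurement jobs off a failed server and holding data locally during a connectivity loss, can be illustrated with a minimal Python sketch. It is not Netsight®'s implementation: the function `reassign_jobs`, the class `BufferedSender`, the least-loaded placement rule and the in-memory list standing in for local storage are all assumptions made for illustration.

```python
def reassign_jobs(assignments, healthy):
    """Move jobs from unhealthy servers onto the least-loaded healthy ones.

    assignments: dict of server name -> list of job names (hypothetical layout).
    healthy: set of server names currently passing their health checks.
    """
    if not healthy:
        raise ValueError("no healthy measurement servers available")
    # Healthy servers keep the jobs they already run.
    result = {name: list(assignments.get(name, [])) for name in healthy}
    # Collect jobs whose server has failed.
    orphaned = [job
                for server, jobs in assignments.items()
                if server not in healthy
                for job in jobs]
    for job in orphaned:
        # Assumed policy: place each job on the server with the fewest jobs.
        target = min(sorted(result), key=lambda name: len(result[name]))
        result[target].append(job)
    return result


class BufferedSender:
    """Hold measurements locally while the central database is unreachable."""

    def __init__(self, send):
        self._send = send    # callable that raises ConnectionError on failure
        self._pending = []   # in-memory stand-in for local disk storage

    def submit(self, measurement):
        self._pending.append(measurement)
        self.flush()

    def flush(self):
        """Send buffered measurements in order, stopping at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return  # keep the data and retry on a later flush
            self._pending.pop(0)

    @property
    def pending(self):
        return list(self._pending)
```

A real deployment would persist the buffer to disk and would need to handle duplicate delivery; the sketch only shows the ordering of the two steps.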
“Netsight® is a significant upgrade to previous offerings,” continues Alan Hames. “By gathering data from around the Janet network and compiling it into a central resilient database, Netsight® enables us to analyse a much greater depth and breadth of information, making it far easier to monitor trends and interrogate issues.”
Netsight® has been built from the ground up as a system capable of monitoring a national IP network. At Janet, Netsight® takes 250,000 measurements every hour and keeps more than three years of data online in the relational database. Consequently, the system depends on 3.5 TB of disk storage, which currently holds 3.3 billion data rows that can be accessed and analysed quickly and easily. The data is also archived indefinitely to simple files for future reference, with a simple mechanism for reloading files into the database as and when required. No data is ever deleted.
“The main problem you can get with a database of that size is primarily to do with speed because accessing and reading the data can become slow,” says Alan Hames. “Tessella went through a process of fine-tuning the software in order to make it as quick as possible to reach. One way in which Tessella has configured the platform to make it more streamlined is to pre-calculate data at different resolutions.”
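As a rough illustration of what pre-calculating data at different resolutions can mean, the Python sketch below averages raw samples into fixed-width time buckets and picks a resolution to suit the time span being graphed. The function names, the use of averaging, the bucket widths and the 500-point limit are assumptions for this example; the source does not describe how Netsight® computes or selects its resolutions.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each sample to the start of its bucket.
        start = timestamp - (timestamp % bucket_seconds)
        buckets[start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def pick_resolution(span_seconds, resolutions, max_points=500):
    """Choose the finest resolution that keeps a graph within max_points."""
    for seconds in sorted(resolutions):
        if span_seconds / seconds <= max_points:
            return seconds
    # Nothing is coarse enough, so fall back to the coarsest available.
    return max(resolutions)
```

With rollups stored ahead of time, a graph covering a year can read a few hundred coarse rows instead of scanning every raw measurement, while a graph covering an hour still reads the finest data.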
Administration was also a vital consideration for Janet. Netsight® uses a fine-grained permissions system built around users, user groups, folders and files, so permissions can be administered easily through a familiar interface. For example, an administrator may choose to allow a user to see a lot of information about one part of the network, whilst giving that same user little or no visibility of other parts of the network. Likewise, regional administrators can be created for different regions, who in turn manage the users and permissions within their own regional network.
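One way to picture a permissions model built around users, user groups and folders is the Python sketch below. It is a hypothetical model rather than Netsight®'s actual design: the `Permissions` class, the three access levels, the rule that the nearest enclosing folder wins, and the example paths and names are all assumptions.

```python
from dataclasses import dataclass, field

LEVELS = ["none", "view", "admin"]  # assumed levels, weakest to strongest


@dataclass
class Permissions:
    # group name -> set of member user names
    groups: dict = field(default_factory=dict)
    # (user or group name, folder path) -> access level
    grants: dict = field(default_factory=dict)

    def grant(self, principal, path, level):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.grants[(principal, path.rstrip("/") or "/")] = level

    def _principals(self, user):
        """The user's own name plus every group the user belongs to."""
        names = {user}
        names.update(g for g, members in self.groups.items() if user in members)
        return names

    def level_for(self, user, path):
        """Return the strongest grant on the nearest enclosing folder."""
        principals = self._principals(user)
        current = path.rstrip("/") or "/"
        while True:
            found = [self.grants[(p, current)]
                     for p in principals if (p, current) in self.grants]
            if found:
                return max(found, key=LEVELS.index)
            if current == "/":
                return "none"  # no grant anywhere up the tree
            # Step up to the parent folder.
            current = current.rsplit("/", 1)[0] or "/"
```

For example, granting a hypothetical group `north-admins` the `admin` level on `/regions/north` gives its members control of everything beneath that folder and nothing elsewhere, which mirrors the regional administrator arrangement described above.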
“Previously, managing the network was a full time job for one of my staff members but with the new system, it manages itself and there is rarely any downtime,” explains Alan Hames. “That means my staff can focus on adding more strategic value to the business instead of fire-fighting application issues.”
Netsight® uses a modular and extensible design for controlling measurements, and for collecting and displaying the data. Twenty different types of data are currently measured, and in simple cases more can be added via the administration pages with no change required to the underlying software.
“Netsight® is a resilient platform that provides easily extractable data that can be analysed to plan for future trends,” concludes Alan Hames. “It is a mission-critical system which our end-users are using enthusiastically, and which is supplying the data we need for monitoring and planning.”