Service Assurance and Why You Should Care

Organizations are increasingly faced with the requirement that their applications remain available and reliable all of the time. Some suppliers have responded to this requirement by presenting high availability or clustering solutions that are workable, but are focused on how applications were developed in the past rather than acknowledging that today’s applications are architected differently.

It is time to consider adding a new approach, service assurance, to the tools already in use to provide reliability and availability. This approach adds needed support for today’s distributed, multi-tier, multi-site applications while continuing to support past approaches.

What is service assurance?

Service assurance is having the tools to monitor all parts of an application, regardless of where it is running or on what type of system; detect issues with an application component or with how it interacts with other application components; present a view of what is happening to IT administrators in real time; and proactively address those issues before they have a chance to become problems. This means being able to examine all parts of a service. Performance and configuration data concerning components such as the servers, the operating system, network components, storage system, virtualization tools, application frameworks and the applications themselves must be readily available to the IT administrator.
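As a rough illustration of the idea, the sketch below rolls per-component readings up into a single service-level status. The component names, metrics and thresholds are hypothetical and are not taken from any particular product. A real tool would collect this data from agents or APIs rather than from a hard-coded list.

```python
from dataclasses import dataclass

# Severity ordering used to roll component states up to a service state.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


@dataclass
class Reading:
    component: str   # hypothetical component name, e.g. "web-01"
    layer: str       # e.g. "server", "network", "storage", "application"
    metric: str      # e.g. "cpu_percent"
    value: float
    warn_at: float   # assumed threshold: at or above this is a warning
    crit_at: float   # assumed threshold: at or above this is critical


def classify(reading: Reading) -> str:
    """Map one reading to "ok", "warning" or "critical"."""
    if reading.value >= reading.crit_at:
        return "critical"
    if reading.value >= reading.warn_at:
        return "warning"
    return "ok"


def service_status(readings):
    """Return the worst state seen and the list of components that are not ok."""
    worst = "ok"
    issues = []
    for r in readings:
        state = classify(r)
        if state != "ok":
            issues.append((r.component, r.metric, state))
        if SEVERITY[state] > SEVERITY[worst]:
            worst = state
    return worst, issues


# Illustrative data only: one reading per layer of a made-up service.
readings = [
    Reading("web-01", "server", "cpu_percent", 42.0, 80.0, 95.0),
    Reading("san-01", "storage", "disk_used_percent", 88.0, 85.0, 95.0),
    Reading("core-switch", "network", "packet_loss_percent", 0.1, 1.0, 5.0),
]
status, issues = service_status(readings)
print(status, issues)
```

The point of the roll-up is that the administrator sees one status for the service plus the specific components behind it, rather than a separate view for each system.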

The whole point of service assurance is to mitigate the impact of service disruptions on customers regardless of whether the source of the disruption is internal, such as a component failure, or external, such as a networking problem. This is increasingly difficult.

Tools that offer service assurance must be able to wend their way through complex application systems and prevent application slowdowns or application failures. These complex systems now include physical systems, virtual systems and even systems found in a cloud service provider’s data center. This can be a very challenging task because of today’s distributed, multi-tier application design.

Service assurance vs. HA/Fault tolerance

In the past, application availability and performance were assured using a different set of tools: server clusters or, where a failure was never acceptable, fault tolerant systems.

This approach worked because applications at that time were designed as monolithic blocks of code that ran on a single machine. So, running that application in several places managed by an HA or cluster software product was a good way to ensure application availability.

Today’s applications are built upon HA/Fault tolerant platforms as well as more highly distributed solutions. That means that service assurance tools must be able to examine what is happening in these environments as well as in other computing environments that are in use in the company’s IT infrastructure.

Distributed, multi-tier applications have become the norm

About 15 years ago, IT architects started using industry standard, x86-based systems as a platform for newly developed applications. Since these systems were, at the time, not as powerful as single-vendor UNIX-based systems or mainframes, it was necessary to segment the application into functions and deploy those functions on separate systems. Each layer of code was called a “tier.” These applications were known as multi-tier applications because a complex application was typically constructed using multiple application components.

As these application components continued to be enhanced and improved, they became known as application services. At this point in time, applications were really simple layers of software that invoked and orchestrated the functions offered by these application services.

Multiple instances of each of these application services were deployed in the organization’s network. Workload management software would balance the load across these different systems to both improve overall performance and increase levels of availability.
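A minimal sketch of that idea follows, assuming a simple round-robin policy that skips instances marked as down. The class, its methods and the instance names are hypothetical. Actual workload management products use richer policies, such as weighting by current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out instances in turn, skipping any that are marked down."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._healthy = set(self._instances)
        self._ring = cycle(self._instances)

    def mark_down(self, instance):
        self._healthy.discard(instance)

    def mark_up(self, instance):
        if instance in self._instances:
            self._healthy.add(instance)

    def next_instance(self):
        # Look at each instance at most once per call so that an
        # all-down pool raises an error instead of looping forever.
        for _ in range(len(self._instances)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


# Illustrative pool of three instances of one application service.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")
picks = [balancer.next_instance() for _ in range(4)]
print(picks)
```

Spreading requests across instances addresses performance, and skipping failed instances addresses availability, which are the two goals named above.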

It is easy to see that applications were now being hosted on a veritable herd of computers rather than on a single mainframe. Each application service might be hosted on a different operating system running on systems provided by different suppliers. So, Dell, HP or IBM systems might be running Windows, Linux or UNIX to support a given application service.

Enter virtual machine software

When organizations started to demand higher levels of system use to reduce the amount of non-productive time, application services began to be hosted in virtual machines.

It was almost as if IT organizations remembered the benefits of having everything in one place and tried to recreate mainframe computing using virtual servers to support a number of application services on a single physical machine.

Although industry standard systems supporting complex, virtualized computing environments can now address the same level of scalability as traditional mainframes, they are far more complex.

Critical distributed applications need service assurance

It is increasingly clear that tools developed to keep a single system or a small cluster running are no longer sufficient in today’s highly distributed, complex environment. A new set of tools, tools that offer service assurance rather than merely offering high availability, is needed. These tools must be able to examine and manage all of the components of the IT infrastructure, not just a few selected components, to achieve the goal of preventing service interruptions that have a negative impact on customers.

Tools offered by suppliers, such as Zenoss, can be the foundation of a complete service assurance initiative.
