New Cluster Support Now Available in Updated Solaris ZenPack

 

The Solaris operating system is known for its scalability, and many enterprise IT shops run their business-critical applications and services on Solaris-based servers. However, meeting application and service delivery SLAs depends upon the error-free functioning of not only the business-critical application itself, but also the underlying Solaris operating system that hosts the application.

Issues that impact application service delivery can’t always be strictly attributed to the application itself, and focusing on application performance metrics alone will not necessarily lead you to the source of a problem. While application monitoring is important, you also need to monitor the underlying Solaris host operating system.

Issues or errors at the operating system level often ripple out and affect the performance and availability of the applications that system hosts, which in turn can cause application or service degradation or outages. For example, if one or more operating system processes on a Solaris host consume too much of a critical resource, such as CPU, application performance suffers. To consistently meet your SLAs and keep the business happy, you must be able to proactively identify potential issues at both the application level and the operating system level, before service delivery is impacted.

Comprehensive Monitoring for Solaris Environments

Zenoss has provided unified monitoring and event management for many years in some of today’s largest, most complex enterprise datacenters. The IT Operations teams in these organizations use Zenoss to unify and automate the monitoring of everything in their heterogeneous physical, virtual, and cloud IT infrastructure, including the Solaris resources that run their business-critical applications.

Over the last several months, the Zenoss engineering team has been hard at work enhancing our Solaris monitoring capabilities to take advantage of new features in Solaris 11, including the addition of new cluster support. The team has also added several service impact and analytics capabilities requested by our customers.

Our recently updated Solaris ZenPack now provides the following support for Solaris environments:

  • Discovery and Modeling: Automatically discovers processors, file systems, interfaces, network routes, processor pools, IP services, and hard disks. Discovers and models Solaris zones, including zone file systems, network adapters, and dedicated CPUs. Discovers and models LDOMs, including LDOM virtual CPUs and virtual disk services. Also discovers and models Solaris clusters, including cluster nodes, device groups, resources and resource groups, switches and switch ports, and transport paths.
  • Performance Monitoring: Tracks device load, CPU, memory, and file system utilization. Also tracks inbound and outbound throughput and packets, processor pools, hard disk read/write rates, and more. For zones, collects metrics for zone CPU, memory, and swap utilization, zone file system size, and, for zone network adapters, incoming and outgoing packets and packet errors. For LDOMs, collects metrics for CPU utilization and volumes available for LDOM virtual disk services. For clusters, collects metrics for offline nodes, online nodes, and total nodes, as well as for quorum votes needed, votes possible, and votes present.
  • Service Impact and Analytics: When combined with Service Impact and Analytics, automatically adds service impact relationships. For more information about these service impact relationships, see the “Solaris Service Impact Relationships” section below. Also includes three new Solaris data domains, which support trend analysis for regular Solaris operations as well as Solaris cluster-specific and zone-specific trend analysis.
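
To make the cluster metrics listed under Performance Monitoring more concrete, here is a minimal Python sketch of how the node counts and quorum vote counts could be combined into a single health status. It is an illustration only, not part of the ZenPack or any Zenoss API: the `ClusterSample` class, the `quorum_status` function, and the three status labels are hypothetical names, and the thresholds are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSample:
    """One hypothetical reading of the cluster metrics named above."""
    nodes_online: int
    nodes_offline: int
    votes_needed: int    # quorum votes needed
    votes_present: int   # quorum votes currently present
    votes_possible: int  # quorum votes possible


def quorum_status(sample: ClusterSample) -> str:
    """Classify a sample as 'ok', 'at-risk', or 'critical' (assumed labels)."""
    margin = sample.votes_present - sample.votes_needed
    if margin < 0:
        # Fewer votes present than needed: quorum is lost.
        return "critical"
    if margin == 0 or sample.nodes_offline > 0:
        # Quorum holds, but there is no spare vote or a node is already down.
        return "at-risk"
    return "ok"


# Assumed example: a two-node cluster plus one quorum device (3 possible votes).
healthy = ClusterSample(nodes_online=2, nodes_offline=0,
                        votes_needed=2, votes_present=3, votes_possible=3)
one_node_down = ClusterSample(nodes_online=1, nodes_offline=1,
                              votes_needed=2, votes_present=2, votes_possible=3)

print(quorum_status(healthy))
print(quorum_status(one_node_down))
```

In practice, you would tune the thresholds to your own cluster topology; the point is simply that votes present versus votes needed gives an early warning before quorum is actually lost.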

Solaris Service Impact Relationships

When you use the latest Solaris ZenPack with Service Impact and Analytics, you can use the automatically added service impact relationships to see how:

  • A zone or LDOM failure affects related devices
  • A zone file system, network adapter, or dedicated CPU failure affects a related zone
  • An LDOM virtual CPU or LDOM virtual disk service failure affects related LDOMs

For clusters, you can use the automatically added service impact relationships to see how:

  • A node, NAS device, device group, or switch failure affects related devices
  • A resource group or device ID (DID) failure affects related nodes
  • A resource failure affects a related resource group
  • A switch port failure affects related switches
  • A node is affected by an associated device failure

For example, if you have a two-node cluster and one node goes down, Service Impact generates a service event and immediately updates the impact model, so you can quickly see that cluster performance is degraded and that the cluster is now at risk due to a node failure.
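
One way to picture the relationships listed above is as a directed graph in which each edge reads "a failure here affects that". The Python sketch below walks such a graph to list everything downstream of a failed component. It is a conceptual illustration under stated assumptions, not how Service Impact is implemented: the component names (`rs-web`, `rg-web`, `node-a`, `cluster-1`, and so on), the `IMPACTS` mapping, and the `affected_by` function are all hypothetical.

```python
from collections import deque

# Hypothetical impact edges: failed component -> components it affects.
# The shape follows the relationships listed above (resource -> resource
# group -> node -> device, and switch port -> switch -> device).
IMPACTS = {
    "port-1": ["switch-1"],
    "switch-1": ["cluster-1"],
    "rs-web": ["rg-web"],
    "rg-web": ["node-a"],
    "node-a": ["cluster-1"],
    "node-b": ["cluster-1"],
}


def affected_by(failed, impacts):
    """Return every component downstream of `failed`, in breadth-first order."""
    seen = {failed}
    order = []
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for target in impacts.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


print(affected_by("rs-web", IMPACTS))
print(affected_by("port-1", IMPACTS))
```

A breadth-first walk is used here only because it lists the nearest affected components first; any graph traversal would find the same set.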

The following image shows an example of the types of impact relationships Zenoss automatically creates.
[Image: Solaris Service Impact Model example]

Next Steps

It is imperative to continuously monitor the key applications and IT services running not only on your Solaris systems, but also on all of the other systems in your complex, hybrid IT infrastructure. Being able to see everything in your environment and know about crucial issues before they impact application and service delivery is key.

If you have Solaris in your datacenter, check out the new capabilities our latest version of the Solaris ZenPack provides. In particular, see how the service impact relationships now available can help you identify and address problems before users begin calling to complain.

New to Zenoss?

If you’re new to Zenoss, check out the following links, which provide more information about Zenoss and the Zenoss Service Dynamics platform: