Saturday, November 13, 2010

RHQ: Deleting Agent Plugins

Introduction
RHQ is an extensible management platform; however, the platform itself does not provide the management capabilities. For example, there is nothing built into the platform for managing a JBoss AS cluster. The platform is agnostic of the resources and resource types it manages. The management capabilities for resources like JBoss AS are provided through plugins. RHQ's plugin architecture allows the platform to be extended so that it can manage virtually any type of resource.

Plugin JAR files can be deployed and installed on an RHQ server (or cluster of servers); they can be upgraded, and they can even be disabled. They cannot, however, be deleted. In this post, we spend a little time exploring plugin management, from installing and upgrading plugins to disabling them. Then we look at my recent work on deleting plugins.

Installing Plugins
Plugins can be installed in one of two ways. The first involves copying the plugin JAR file to /jbossas/server/default/deploy/rhq.ear/rhq-downloads/rhq-plugins. Starting with RHQ 3.0.0, you can alternatively copy the plugin JAR file to /plugins, which is arguably easier given the much shorter path. The RHQ server periodically scans these directories for new plugin files. When a new or updated plugin is detected, the server deploys it. This approach is particularly convenient during development, when the RHQ server is running on the same machine on which I am developing. In fact, RHQ's Maven build is set up to copy plugins to a development server as part of the build process.
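The periodic scan described above can be pictured with a small sketch. This is not RHQ's actual scanner (RHQ itself is written in Java); the function names `file_md5` and `scan_plugin_dir`, and the choice of an MD5 content hash to tell an updated JAR from an unchanged one, are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def scan_plugin_dir(plugin_dir: Path, known: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare the JARs in plugin_dir against the hashes seen on the last scan.

    `known` maps a JAR file name to its previously seen hash and is updated
    in place. Returns the file names of new and updated JARs.
    (Hypothetical helper, not part of RHQ.)
    """
    changes: Dict[str, List[str]] = {"new": [], "updated": []}
    for jar in sorted(plugin_dir.glob("*.jar")):
        digest = file_md5(jar)
        previous = known.get(jar.name)
        if previous is None:
            changes["new"].append(jar.name)      # never seen before
        elif previous != digest:
            changes["updated"].append(jar.name)  # same name, new contents
        known[jar.name] = digest
    return changes
```

Calling `scan_plugin_dir` repeatedly with the same `known` dictionary mimics successive scans: a JAR is reported once when it first appears and again only when its contents change.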

The second approach to installing a plugin involves uploading the plugin file through the web UI. The screenshot below shows the UI for plugin file upload.

Deploying plugins through the web UI is particularly useful when the plugin is on a different file system than the one on which the RHQ server is running. It is worth noting that there is currently no API exposed for installing plugins through the CLI.

Upgrading Plugins
The platform not only supports deploying new plugins that previously have not been installed in the system, but it also supports upgrading existing plugins. From a user's perspective there really is no difference in upgrading a plugin versus installing one for the first time. The steps are the same. And the RHQ server, for the most part, treats both scenarios the same as well.

Installing a new or upgraded plugin does not affect any agents that are currently running. Agents have to be explicitly updated in one of a number of ways, including:
  • Restarting the agent
  • Restarting the plugin container
  • Issuing the plugins update command from the agent prompt
  • Issuing a resource operation for one of the above. This can be done from the UI or from the CLI.
  • Issuing a resource operation for one of the above from a server script.

Disabling Plugins
Installed plugins can be disabled. Disabling a plugin results in agents ignoring that plugin once the agent is restarted (or more precisely, when the plugin container running inside the agent is restarted). The plugin container will not load that plugin, which means resource components, discovery components, and plugin classloaders are not loaded. This results in a reduced memory footprint of the agent. It also reduces overall CPU utilization since the agent's plugin container is performing fewer discovery and availability scans.

Plugins can be disabled on a per-agent basis, allowing for a more heterogeneous deployment of agents. For instance, I might have a web server that is only running Apache and the agent that is monitoring it, while on another machine I have a JBoss AS instance running. I could disable the JBoss-related plugins on the Apache box, freeing up memory and CPU cycles. Likewise, I can disable the Apache plugins on the box running JBoss AS.

When a plugin is disabled, nothing is removed from the database. Any resources already in inventory from the disabled plugin remain in inventory. Type definitions from the disabled plugin also remain in the system.

Deleting Plugins
Recently I have been working on adding support for deleting plugins. Deleting a plugin not only deletes the actual plugin from the system, but also everything associated with it, including all type definitions and all resources of the types defined in the plugin. When disabling a plugin, the plugin container has to be explicitly restarted in order for it to pick up the changes. This is not the case when deleting plugins. Agents periodically send inventory reports to the server. If a report contains a resource of a type that has been deleted, the server rejects the report and tells the agent that it contains stale resource types. The agent in turn recycles its plugin container, purging its local inventory of any stale types and updating its plugins to match what is on the server. No type definitions, discovery components, or resource components from the deleted plugin will be loaded.
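The exchange between agent and server described above can be modeled in a few lines. This is a toy simulation under stated assumptions, not the code on the branch: the class names `Server`, `Agent`, and `StaleTypeError`, their methods, and the string return values are all hypothetical and exist only to make the flow concrete.

```python
class StaleTypeError(Exception):
    """Raised by the model server when a report mentions deleted types."""

    def __init__(self, stale_types):
        super().__init__("stale resource types: %s" % sorted(stale_types))
        self.stale_types = stale_types


class Server:
    """Tracks which resource types each deployed plugin defines."""

    def __init__(self, types_by_plugin):
        self.types_by_plugin = {p: set(t) for p, t in types_by_plugin.items()}

    def known_types(self):
        known = set()
        for types in self.types_by_plugin.values():
            known |= types
        return known

    def delete_plugin(self, plugin):
        # Deleting a plugin removes its type definitions as well.
        self.types_by_plugin.pop(plugin, None)

    def merge_inventory_report(self, report):
        # Reject the whole report if any resource has a deleted type.
        stale = {rtype for _name, rtype in report} - self.known_types()
        if stale:
            raise StaleTypeError(stale)
        return "accepted"


class Agent:
    def __init__(self, server, inventory):
        self.server = server
        self.inventory = list(inventory)  # (resource name, resource type)
        self.plugins = set(server.types_by_plugin)

    def send_inventory_report(self):
        try:
            return self.server.merge_inventory_report(self.inventory)
        except StaleTypeError as err:
            self.recycle_plugin_container(err.stale_types)
            return "recycled"

    def recycle_plugin_container(self, stale_types):
        # Purge stale resources and resync plugins with the server.
        self.inventory = [(n, t) for n, t in self.inventory
                          if t not in stale_types]
        self.plugins = set(self.server.types_by_plugin)
```

In this model, once `server.delete_plugin("jboss")` is called, the agent's next `send_inventory_report()` triggers a recycle with no manual restart, and the report after that is accepted again.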

Use Cases for Plugin Deletion
There are a number of motivating use cases for supporting plugin deletion. The most important of these might be the added ability to downgrade a plugin. But we will also see the benefits plugin deletion brings to the plugin developer.

Downgrading Plugins
We have already mentioned that RHQ supports upgrading plugins. It does not, however, support downgrading a plugin. There may be times in a production deployment, for example, when a plugin does not behave as expected or as desired, and users currently have no way to return to a previous version of that plugin. Plugin deletion makes this possible: deleting the plugin and then installing the earlier version effectively provides a way to roll back.
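A minimal sketch of delete-then-install as a rollback path follows. It rests on assumptions: the `PluginRegistry` class and `downgrade` function are hypothetical, versions are assumed to be dotted integers, and the model simply ignores an older version deployed over a newer one (how the real server reacts in that case is not covered here).

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '1.2.0' into (1, 2, 0)."""
    return tuple(int(part) for part in version.split("."))


class PluginRegistry:
    """Toy model of the plugins a server has deployed (name -> version)."""

    def __init__(self):
        self.installed = {}

    def deploy(self, name: str, version: str) -> bool:
        """Install or upgrade a plugin; refuse anything older than what is installed."""
        current = self.installed.get(name)
        if current is not None and parse_version(version) < parse_version(current):
            return False  # assumption: an older version is ignored
        self.installed[name] = version
        return True

    def delete(self, name: str) -> None:
        # In RHQ, deletion also removes the plugin's type definitions
        # and resources; this model only tracks the version.
        self.installed.pop(name, None)


def downgrade(registry: PluginRegistry, name: str, version: str) -> bool:
    """Roll back by deleting the plugin, then deploying the older version."""
    registry.delete(name)
    return registry.deploy(name, version)
```

The design point is that no special downgrade logic is needed: once the newer plugin is gone, installing the older one is just a first-time install.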

Working with Experimental Plugins
Working with an experimental plugin, or one that might not be ready for production use, carries certain risks. Some of those risks can be mitigated by disabling the plugin; however, the plugin still exists in the system, and its resources remain in inventory. Granted, those resources can be deleted easily enough, but there is still some margin for error, insofar as one might fail to delete all of the plugin's resources or accidentally delete the wrong ones. And there is no way to remove type definitions, such as metric definitions and operation definitions, without direct database access. Having the ability to delete a plugin along with all of its type definitions and all instances of those type definitions eliminates these risks.

Simplifying Plugin Development
A typical workflow during plugin development includes incremental deployments to an RHQ server as changes are introduced to the plugin. Many if not all plugin developers have run into situations in which they have to blow away their database due to changes made in the plugin (this normally involves changes to type definitions in the plugin descriptor). This slows down development, sometimes considerably. Deleting a plugin should prove much less disruptive to a developer's workflow than having to start with a fresh database installation, particularly when a substantial amount of test data has been built up in the database. To that end, I can really see the utility in a Maven plugin for RHQ plugin development that deploys the RHQ plugin to a development server. The Maven plugin could provide the option to delete the RHQ plugin, if it already exists in the system, before deploying the new version.

Conclusion
Development of the plugin deletion functionality is still ongoing, but I am confident that it will make it into the next major RHQ release. If you are interested in tracking the progress or experimenting with this new functionality, take a look at the delete-agent-plugin branch in the RHQ Git repo. This is where all of the work is currently being done. You can also check out this design document, which provides a high-level overview of the work involved.
