What is Plumbr overhead?
Like any other monitoring tool, Plumbr needs room for its internal data structures as well as some extra CPU cycles to analyze performance data. In addition, Plumbr Agent uses local disk space, and its communication with Plumbr Server requires some network bandwidth. The following sections explain these overheads, but as you will see, many of them are environment-specific.
For this reason, we recommend going through a testing phase before rolling Plumbr out to production. In the test, two runs of the application should be monitored under the same load for resource consumption, throughput and latency. One run should be carried out with Plumbr Agent attached and the other without the Agent. Comparing the two runs will give you the actual overhead in your environment.
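The comparison of the two test runs can be sketched as below. This is a minimal illustration, not a Plumbr tool: the function name `relative_overhead`, the metric names and all measurement values are hypothetical placeholders for whatever your own load test reports.

```python
def relative_overhead(baseline: float, with_agent: float) -> float:
    """Return the change from the baseline run as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (with_agent - baseline) / baseline * 100.0


# Hypothetical measurements from two runs under the same load.
baseline_run = {"cpu_pct": 50.0, "p99_latency_ms": 200.0, "throughput_rps": 1000.0}
agent_run = {"cpu_pct": 53.0, "p99_latency_ms": 204.0, "throughput_rps": 990.0}

# One relative figure per metric; a negative value means the metric decreased.
overheads = {
    metric: relative_overhead(baseline_run[metric], agent_run[metric])
    for metric in baseline_run
}

for metric, pct in overheads.items():
    print(f"{metric}: {pct:+.1f}%")
```

Higher CPU usage or latency shows up as a positive percentage, and lower throughput as a negative one, so each metric should be read in its own direction.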
In 95% of cases, Plumbr's heap memory overhead is below 11MB and its metaspace memory overhead is below 6MB.
Plumbr Agent also consumes extra CPU cycles. The exact numbers are more application-specific, but for 95% of applications the CPU overhead is below 8%. When measuring Plumbr Agent overhead in your environment, pay the most attention to peak CPU usage: as long as the CPU is not utilized to 100%, application throughput and latency will not suffer and end users will feel no impact.
In addition to the installation bundle size of 10MB, Plumbr Agent uses local disk space for two purposes:
- To store Agent log files in the logs/ folder next to the Agent’s plumbr.jar location in the filesystem. The disk space required for the logs depends on the application being monitored, but for 95% of applications the logs grow at a rate below 10MB/day. By default, the logs are rolled daily and stored for 30 days, after which they are deleted to save disk space.
- To store temporary data which is yet to be analyzed in the Agent or which has not yet been sent to the Server. During normal operation, no such data is stored on disk. The only situations in which local disk space is used for this purpose are:
  - When an OutOfMemoryError is detected or Plumbr is making a heap snapshot, Plumbr Agent captures statistical information from the heap and stores it on disk to be compressed and sent to the Server. Until compression and upload are complete, this temporary data may require at most as much disk space as the JVM heap size, although usually several times less.
  - When the Agent loses its connection to the Server, the monitored data is buffered on the Agent side until the connection is restored and the data can be sent to the Server. The amount of data buffered by the Agent is capped at 250MB.
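The disk figures above can be added up into a rough budget. The sketch below is only an estimate under stated assumptions: it uses the 10MB/day log rate, which holds for 95% of applications and is not a hard limit, and it assumes the heap statistics take the full heap size on disk. The function name `estimated_disk_mb` and the example heap size are hypothetical.

```python
# Figures taken from the text above (all in MB).
BUNDLE_MB = 10               # installation bundle
LOG_RATE_MB_PER_DAY = 10     # 95th-percentile log growth, not a hard limit
LOG_RETENTION_DAYS = 30      # default retention before logs are deleted
OFFLINE_BUFFER_CAP_MB = 250  # cap on data buffered while the Server is unreachable


def estimated_disk_mb(heap_size_mb: int) -> int:
    """Estimate disk usage assuming heap statistics take at most the heap size."""
    logs_mb = LOG_RATE_MB_PER_DAY * LOG_RETENTION_DAYS
    return BUNDLE_MB + logs_mb + heap_size_mb + OFFLINE_BUFFER_CAP_MB


# Hypothetical JVM started with a 4096MB heap.
print(estimated_disk_mb(4096))  # 4656
```

Because the heap statistics usually need several times less space than the heap itself, typical usage should stay well below this estimate.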
The bandwidth consumed by the Agent communicating with the Server correlates with the number of transactions monitored and the number of root causes detected. The more incoming transactions are monitored on the JVM and the more root causes impact those transactions, the more data is sent to the Plumbr Server. On average, every 1 million transactions monitored by Plumbr require 100MB to be sent over the wire. For example, when your JVM generates a million transactions per hour at a steady pace, the network consumption is about 28 KB/sec (100MB divided by 3,600 seconds).
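The same arithmetic can be applied to other transaction rates. The sketch below assumes decimal units (1MB = 1,000,000 bytes, 1KB = 1,000 bytes), which is what reproduces the 28 KB/sec figure, and it uses the average of 100MB per million transactions. The function name `average_bandwidth_kb_per_sec` is hypothetical.

```python
BYTES_PER_MILLION_TX = 100 * 10**6  # 100MB per million transactions, on average
SECONDS_PER_HOUR = 3600


def average_bandwidth_kb_per_sec(transactions_per_hour: float) -> float:
    """Average Agent-to-Server bandwidth in KB/sec for a steady transaction rate."""
    bytes_per_hour = transactions_per_hour / 1_000_000 * BYTES_PER_MILLION_TX
    return bytes_per_hour / SECONDS_PER_HOUR / 1000


# One million transactions per hour, as in the example above.
print(round(average_bandwidth_kb_per_sec(1_000_000)))  # 28
```

Since the real volume also depends on how many root causes are detected, treat the result as an average and not as a guaranteed ceiling.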