Manual

Introduction

What is Plumbr?

Plumbr is a Java performance monitoring solution that tracks end user experience and links poor user experience with an actual root cause in source code. Thus, the main benefit of Plumbr is measured in the reduced time it takes to find and fix performance issues.

The lengthy cycle of “detect – evaluate impact – try to reproduce – gather evidence – link evidence to source code” is eliminated completely thanks to Plumbr exposing end-to-end transparency to application performance.

Conceptually, Plumbr works by tracking incoming transactions and monitoring such transactions for performance. When performance is not sufficient, Plumbr exposes the root cause in the source code or configuration that causes the performance degradation.

How does Plumbr work?

Plumbr is packaged as a Java Agent you need to attach to the JVM. The Agent gets access to the Java bytecode to detect potential performance issues during application runtime. Such issues are sent to the Plumbr Server, which analyzes the issues and links the detected problems with the actual end user operation suffering from the particular problem.

Transactions

What are Transactions?

A transaction is an inbound operation captured at the JVM boundary. Transaction attributes include the service invoked, the identity of the user invoking the transaction, start/end timestamps and the outcome of the operation. Operations supported by Plumbr include HTTP requests to the Servlet API or the Play 2 framework, EJB 3.0 remote calls and Swing/AWT event listeners.

Monitoring transactions is the cornerstone of end user monitoring in Plumbr. Transactions resulting in a poor user experience are used to measure the impact of any performance issues detected. In practice, this means that all detected root causes are ranked by the number of transactions in which a particular root cause made end users unhappy.

As an example, consider an HTTP endpoint published by a typical web application. When it is consumed, Plumbr is able to capture a transaction such as the following:

11 Aug 2015 (18:54:14 – 18:54:22) HTTP POST; http://www.yourcompany.com/invoice/pay/1008112 -> HTTP 200 OK

From such an incoming HTTP request, Plumbr will capture:

  • The service consumed: invoice payment (invoice/pay)
  • Parameter(s): id (1008112)
  • Response: invoice successfully paid (HTTP 200 OK)
  • End time: 11 Aug 2015 18:54:22
  • Duration: 8,800ms

Plumbr also flags each transaction based on its outcome:

  • Successful – the transaction was completed and completed fast.
  • Slow – the transaction was slower than expected.
  • Failed – the transaction was not completed successfully.

Slow Transactions

Upon transaction completion, the duration of the transaction is calculated at the JVM boundaries from the moment when the Servlet/Filter received the incoming request until the response was flushed to the stream.

If the duration of the transaction is under a service-specific threshold, the transaction is considered to be completed successfully. On the other hand, if this threshold was exceeded, the transaction is flagged as slow. This threshold can be configured to match the specific performance requirements.

Failed Transactions

On transaction completion certain criteria are used to decide whether the transaction was completed successfully or failed to complete as expected. A failure is defined differently for different monitored technologies.

  • For web applications a failure is defined via HTTP status codes. Receiving a 500-series status code with the response indicates that the transaction has failed.
  • For EJB modules, failure is defined via Exception detection. All remote method invocations which result in an Exception being thrown are flagged as failed transactions.
  • For Swing/AWT applications, failure is based upon Exception detection. All ActionListener events which result in an Exception being thrown are flagged as failed transactions.
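The flagging rules for HTTP transactions described above can be sketched in a few lines of Java. This is only an illustration, not Plumbr's implementation: the class and method names are hypothetical, and checking for failure before slowness, as well as treating a duration exactly equal to the threshold as successful, are assumptions of this sketch.

```java
final class TransactionOutcome {

    enum Flag { SUCCESSFUL, SLOW, FAILED }

    /**
     * Flags an HTTP transaction from its status code and duration.
     * Assumption: a 500-series status wins over slowness.
     */
    static Flag flagHttp(int httpStatus, long durationMillis, long thresholdMillis) {
        if (httpStatus >= 500 && httpStatus <= 599) {
            return Flag.FAILED;                 // 500-series response: failed
        }
        return durationMillis > thresholdMillis
                ? Flag.SLOW                     // threshold exceeded: slow
                : Flag.SUCCESSFUL;              // completed within the threshold
    }
}
```

For instance, the invoice payment transaction shown earlier (HTTP 200, 8,800ms) would be flagged as slow when compared against a 5,000ms threshold.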

Excluding Transactions

To reduce noise in monitoring, certain transactions are ignored by Plumbr. By default, the transactions accessing commonly used static content are excluded from monitoring. The default exclusions include transactions consuming URLs ending with “png”, “jpeg”, “bmp”, “gif”, “js”, “css”, “ico”, “swf”, “woff”, “jpg”, “tmp”, “ttf”, “min” and “eot”.
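To make the default exclusion list concrete, here is a minimal sketch of such a filter. It reads "URLs ending with" literally as a suffix check on the URL path; ignoring the query string and matching case-insensitively are assumptions of this sketch, and the class name is hypothetical.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;

final class StaticContentFilter {

    // Suffixes taken from the default exclusion list above.
    private static final List<String> EXCLUDED_SUFFIXES = List.of(
            "png", "jpeg", "bmp", "gif", "js", "css", "ico",
            "swf", "woff", "jpg", "tmp", "ttf", "min", "eot");

    /** Returns true when the URL path ends with one of the excluded suffixes. */
    static boolean isExcluded(String url) {
        String path = URI.create(url).getPath();   // drops the query string
        if (path == null) {
            return false;
        }
        String lower = path.toLowerCase(Locale.ROOT);
        return EXCLUDED_SUFFIXES.stream().anyMatch(lower::endsWith);
    }
}
```

With this sketch, http://www.example.com/static/logo.png would be excluded, while http://www.example.com/invoice/pay/1008112 would still be monitored.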

Services

What are Services?

Transactions are grouped together by the endpoint that they consume. These endpoints are called services. Services are detected using two different approaches, both built on top of transactions detected from incoming HTTP requests:

  • Framework-based controllers. For widely adopted Java MVC frameworks such as Spring MVC and Struts, the identity of the service is represented as the class/method name of the controller invoked by the transaction.
  • URL pattern matching. For the transactions not processed by a supported MVC framework, the service is detected using the information encoded in the URL.

Services use the concept of Health to express how well a particular service is performing. Health is used by Plumbr to rank individual Services according to the ratio of successful transactions to total transactions using the following formula:

Health = successful transactions / total transactions
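As a worked example of the ratio described above, the following sketch computes Health from two counters. Expressing the result as a percentage is an assumption made for readability, and the class name is hypothetical.

```java
final class ServiceHealth {

    /** Share of successful transactions, expressed as a percentage (0 to 100). */
    static double healthPercent(long successful, long total) {
        if (total <= 0 || successful < 0 || successful > total) {
            throw new IllegalArgumentException("expected 0 <= successful <= total and total > 0");
        }
        return 100.0 * successful / total;
    }
}
```

A service with 950 successful transactions out of 1,000 would have a Health of 95% under this sketch.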

Detecting a Service from MVC

When the JVM monitored by Plumbr exposes its services via an MVC framework supported by Plumbr, the service name is extracted from the controller processing the transaction. For example, when an HTTP request such as

http://www.example.com/payments?actionId=payInvoice&invoiceId=411121

is mapped to and processed by a Struts controller, the service name is extracted from that controller. Such a service would then be visible in the Plumbr interface as something similar to:

com.example.payments.PaymentAction.execute()

The currently supported MVC frameworks include:

  • Spring MVC
  • Struts 1 & 2
  • GWT 2.x
  • JSF 1.1+
  • Vaadin 6+
  • Wicket

Adding support for more MVC controllers is a work in progress.

Detecting a Service from URL

An alternative to the controller-based approach is determining the service directly from the information encoded in the URL. URL-based service detection can be based on different URL tokens.

Detecting the service from a URL is best explained via an example. Consider the following incoming transaction, from which Plumbr is about to extract the service being consumed:

http://www.example.com/shop/cart/add/iPhone6

For transactions consuming this particular URL, the service is defined by the first three tokens in the path: /shop/cart/add/. The last token, identifying the product added to the shopping cart (iPhone6), is actually a parameter of the service and needs to be excluded by the service detection rules. This approach makes it possible to group together all transactions accessing the same /shop/cart/add service.

Because URL patterns vary from application to application, you might need to configure the rules to achieve the desired grouping. Your service detection rules need to take into account whether the service should be detected from URL segments, such as /shop/cart/add/ in the example above, or from a specific request parameter.

By default, Plumbr uses the following rules to group together transactions consuming the same service with different parameters encoded into URL tokens (but not in parameters):

  • The first token in the URL is always preserved.
  • Other URL tokens are matched to detect and replace the following patterns with placeholders:
    • Numbers
    • Dates
    • Email addresses
    • Alphanumericals (strings containing at least one number and any combination of letters and symbols , :, =, _, %, ., &, –).

Applying the default rules makes it possible to group together transactions such as the following two:

  • http://www.example.com/pay/invoice/2015-9822
  • http://www.example.com/pay/invoice/2013-1839

together under the same /pay/invoice service.
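The default rules above can be approximated with a small token-by-token normalizer. This is a sketch under stated assumptions, not Plumbr's actual rule engine: the regular expressions, the {param} placeholder text, the ISO-style date format and the use of a plain hyphen in the symbol set are all hypothetical choices made for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class UrlServiceDetector {

    private static final String PLACEHOLDER = "{param}";   // hypothetical placeholder text

    private static final Pattern NUMBER = Pattern.compile("\\d+");
    private static final Pattern DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");
    private static final Pattern EMAIL = Pattern.compile("[^@\\s]+@[^@\\s]+\\.[^@\\s]+");
    // At least one digit, otherwise letters, digits and a few symbols.
    private static final Pattern ALPHANUMERICAL =
            Pattern.compile("(?=.*\\d)[A-Za-z0-9,:=_%.&-]+");

    /** Returns the service key for a URL: first token kept, parameter-like tokens replaced. */
    static String serviceOf(String url) {
        String path = URI.create(url).getPath();
        List<String> tokens = new ArrayList<>();
        boolean first = true;
        for (String token : path.split("/")) {
            if (token.isEmpty()) {
                continue;                                   // skip leading or duplicate slashes
            }
            tokens.add(first || !isParameter(token) ? token : PLACEHOLDER);
            first = false;
        }
        return "/" + String.join("/", tokens);
    }

    private static boolean isParameter(String token) {
        return NUMBER.matcher(token).matches()
                || DATE.matcher(token).matches()
                || EMAIL.matcher(token).matches()
                || ALPHANUMERICAL.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(serviceOf("http://www.example.com/pay/invoice/2015-9822"));
        System.out.println(serviceOf("http://www.example.com/pay/invoice/2013-1839"));
    }
}
```

Both invoice URLs from the example map to the same key in this sketch, which is what allows the transactions to be grouped under a single service.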

Configuring Service Detection Rules

The option of customizing the service detection rules will be exposed to end users in November 2015. Until then, if the default detection rules do not match your application requirements, please contact us to apply the desired rules.

Configuring Service Threshold

If the duration of a transaction exceeds the threshold set for the service it consumes, the transaction is flagged as slow. By default, Plumbr sets a 5,000ms threshold for all detected services. You can change the default threshold via the Settings – Thresholds menu.

Having a single threshold for all services can trigger false alarms or cause information to be missed, as services exposed by a JVM can have different performance requirements. For example, an application might expose services with strict latency requirements, where transactions must complete in under 1,000ms. On the other hand, there can be services with more relaxed requirements, where response times of several minutes are considered tolerable.

Service-based thresholds allow you to set the threshold per individual service. A threshold specified for an individual service overrides the generic threshold. To set it, open the service in question. The Impact view shows the current behavior of the service in terms of latency distribution, and in the very same view you can change the threshold for this particular service.
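The override behavior can be modeled as a lookup that falls back to the generic threshold. The sketch below is only a conceptual model of that rule; the class, its methods and the service names used in the usage note are hypothetical and do not represent a Plumbr API.

```java
import java.util.HashMap;
import java.util.Map;

final class ServiceThresholds {

    private final long defaultMillis;
    private final Map<String, Long> perService = new HashMap<>();

    ServiceThresholds(long defaultMillis) {
        this.defaultMillis = defaultMillis;
    }

    /** Sets a threshold for one service, overriding the generic threshold. */
    void setThreshold(String service, long millis) {
        perService.put(service, millis);
    }

    /** Service-specific threshold if present, otherwise the generic one. */
    long thresholdFor(String service) {
        return perService.getOrDefault(service, defaultMillis);
    }

    boolean isSlow(String service, long durationMillis) {
        return durationMillis > thresholdFor(service);
    }
}
```

With a generic threshold of 5,000ms and a 1,000ms override for a hypothetical /search service, a 1,500ms transaction would be slow for /search but not for any other service.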

 

Applications

What are applications?

An application is an attribute of a transaction. This attribute is similar to the service and is used to group together transactions consuming the same application. By default, Plumbr detects the application from the domain, protocol and port combination used to access the functionality.

The following examples illustrate the concept:
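As one possible illustration, the protocol, domain and port combination can be derived from a request URL as sketched below. The class name is hypothetical, and leaving out the port when the URL does not state one explicitly is an assumption of this sketch.

```java
import java.net.URI;

final class ApplicationDetector {

    /** Builds an application key from the protocol, domain and (explicit) port of a URL. */
    static String applicationOf(String url) {
        URI uri = URI.create(url);
        String app = uri.getScheme() + "://" + uri.getHost();
        // getPort() returns -1 when the URL carries no explicit port.
        return uri.getPort() == -1 ? app : app + ":" + uri.getPort();
    }
}
```

Under this sketch, http://testbilling.example.com:8080/invoices/1 belongs to the application http://testbilling.example.com:8080, while http://billing.com and https://billing.com end up as two separate applications because their protocols differ.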

If you are not satisfied with how Plumbr application discovery works, you can rename, merge or split the existing applications based on the instructions in the following chapters.

Renaming applications

You might want to rename an existing application to something your team is used to. For example, Plumbr might have discovered your application via http://testbilling.example.com:8080 but your team is used to calling this application “Billing Test”.

Renaming an application is possible via the Plumbr user interface accessible at https://app.plumbr.io. In the application detail view you can click on the settings icon next to the application name and change the existing name to the one you prefer.

This change also updates the historical interactions referring to the previous application name. For example, if you had 10 interactions with the http://testbilling.example.com:8080 application and then renamed it to “Billing Test”, both the 10 existing interactions and all new interactions will refer to the “Billing Test” application.

Merging applications

You might want to merge different applications detected by Plumbr into a single application. There are two different ways of doing it.

You can merge applications via the user interface by renaming one or more applications so that they end up with the same name. For example, you might have applications http://billing.com and https://billing.com detected by Plumbr and wish to treat both as “Billing Live”. You can achieve this by renaming both of the applications in the UI to “Billing Live”. After this change, the previous interactions with these applications also refer to “Billing Live”.

When you have many applications to merge, the previous approach might not be suitable due to the sheer number of rename operations involved. This can happen when you expose your application via a different virtual host for each client, so that instead of accessing http://billing.com, each client has its own unique virtual host (http://company1.billing.com, http://company2.billing.com, etc.).

In such situations you can merge applications via configuration in the plumbr.properties file located next to the Plumbr Java Agent. In this file you can modify (or add, if not present) the appId parameter, which refers to the identity of the application you wish to use, similar to the following example:

appId=Billing Live

Notice that this change will not be applied to existing interactions – all the interactions formerly bound to different applications will still be referring to previous applications. Only the new interactions will be linked with the new application you specified.

Splitting applications

You might find yourself in a situation where you wish to split the application detected by Plumbr into two different applications. This can happen, for example, when Plumbr Agents capture data accessed via the http://localhost:8080 interface from different webapps deployed on different developer machines.

You can split applications only via configuration in the plumbr.properties file located next to the Plumbr Java Agent. In this file you can modify (or add, if not present) the appId parameter, which refers to the identity of the application you wish to use, similar to the following example:

appId=Billing Test

After this change all new interactions will be linked to the ID you chose. Note that the existing interactions are not affected by this change.

Users

What are Users?

Users are what the name implies: real users using the application. Each transaction monitored by Plumbr is linked with the identity of the user who triggered it. This makes it possible to monitor the experience of each individual user.

Plumbr uses a concept called Happiness to express how a particular user has experienced the application. Happiness is used by Plumbr to rank Users according to the ratio of successful transactions to total transactions using the following formula:

Happiness = successful transactions / total transactions

Currently the user identity is linked only with the transactions arriving over the HTTP protocol. Swing- or EJB-based transactions do not currently have the capability to acquire the user identity.

Distinguishing Users

Plumbr is capable of understanding exactly who is using the monitored application. The first part of this challenge is making sure Plumbr is able to distinguish one user from another. Plumbr does so either by being aware of a particular HTTP header used for authentication purposes or via a specific cookie named plumbr_user_tracker.

When HTTP headers are used, distinguishing and identifying users is based purely on the value of a particular HTTP header. No new cookies are created or sent to the users of such applications.

When the monitored application does not use HTTP headers for authentication/identification purposes, Plumbr falls back to the cookie-based approach. For requests not containing the particular HTTP authentication headers, Plumbr generates a cookie and injects it into the browser via the HTTP response. The presence of the cookie guarantees that all subsequent HTTP requests arriving from the same device/browser will be linked to the same cookie ID.

Cookies are thus used to distinguish between different users. Identifying the user is based on linking the cookie with the user identity stored on the server side. If the application stores the identity in an HTTP session, then whenever a user authenticates, Plumbr can capture the identity and link it with the cookie.

Note that Plumbr can discover users only from HTTP traffic. If you are using, for example, EJB calls or Swing events, Plumbr is not capable of using the aforementioned approaches.
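The cookie-based approach described in this section can be pictured with a small in-memory model: a tracker ID is issued when a request arrives without the cookie, and an identity is linked to that ID once the user authenticates. This is a conceptual sketch only. The class and method names are hypothetical, using a random UUID as the cookie value is an assumption, and none of it reflects how the Plumbr Agent is actually implemented.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class UserTrackerModel {

    static final String COOKIE_NAME = "plumbr_user_tracker";

    // Tracker ID (cookie value) mapped to the captured user identity.
    private final Map<String, String> identityByTrackerId = new ConcurrentHashMap<>();

    /** Reuses the cookie value sent by the browser, or issues a new one if it is missing. */
    String resolveTrackerId(String cookieValueOrNull) {
        return cookieValueOrNull != null
                ? cookieValueOrNull
                : UUID.randomUUID().toString();
    }

    /** Called once the identity is known, for example after authentication. */
    void linkIdentity(String trackerId, String identity) {
        identityByTrackerId.put(trackerId, identity);
    }

    /** Identity linked to the tracker ID, if one has been captured yet. */
    Optional<String> identityOf(String trackerId) {
        return Optional.ofNullable(identityByTrackerId.get(trackerId));
    }
}
```

Until linkIdentity is called, requests carrying the same tracker ID can be told apart from other users but remain anonymous; afterwards every request with that ID resolves to the linked identity.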

Identifying Users

Plumbr can identify the users of your application from incoming HTTP traffic, using either specific HTTP headers or a dedicated cookie linked with the value extracted from a specific HttpSession attribute on the server side.

When HTTP headers are used, distinguishing and identifying users is based purely on the value of a particular HTTP header. All Plumbr needs to be aware of in such cases is which HTTP header contains the required information.

When HTTP headers are not used, Plumbr falls back to using cookies. In this case, the identity of a user is extracted from a specific HTTP session attribute. The captured identity is linked to the cookie Plumbr planted in the browser (see the Distinguishing Users section for more on cookies). In this way all requests using a particular cookie are linked to the captured identity.

By default, Plumbr supports the following frameworks for identity capturing:

  • JWT Bearer tokens. If your application passes the identity of the user in the HTTP request headers using JWT Bearer tokens, Plumbr will use the value of the subject extracted from the token as the identity of the user.
  • Spring Security. If the application monitored by Plumbr is using the authentication built into the Spring Security library, Plumbr will extract the user’s identity from org.springframework.security.core.userdetails.UserDetails#getUsername().
  • Java Authentication and Authorization Service (JAAS). If the application Plumbr is monitoring is storing the Principal instances in the HTTP Session then Plumbr will extract the identity from java.security.Principal#getName().
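To illustrate the JWT case, the sketch below pulls the subject ("sub") claim out of a Bearer token using only the JDK. Several things here are assumptions rather than documented Plumbr behavior: that the token travels in a header value of the form "Bearer <token>", the class and method names, and the use of a regular expression instead of a real JSON parser. The sketch also does not verify the token signature.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JwtSubject {

    // Naive match for a string-valued "sub" claim; a JSON parser would be more robust.
    private static final Pattern SUB = Pattern.compile("\"sub\"\\s*:\\s*\"([^\"]*)\"");

    /** Extracts the subject from a header value such as "Bearer header.payload.signature". */
    static Optional<String> fromAuthorizationHeader(String header) {
        if (header == null || !header.startsWith("Bearer ")) {
            return Optional.empty();
        }
        String[] parts = header.substring("Bearer ".length()).trim().split("\\.");
        if (parts.length < 2) {
            return Optional.empty();
        }
        try {
            // The second segment of a JWT is the Base64URL-encoded JSON payload.
            byte[] decoded = Base64.getUrlDecoder().decode(parts[1]);
            Matcher m = SUB.matcher(new String(decoded, StandardCharsets.UTF_8));
            return m.find() ? Optional.of(m.group(1)) : Optional.empty();
        } catch (IllegalArgumentException e) {
            return Optional.empty();        // payload was not valid Base64URL
        }
    }
}
```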

Supporting these frameworks means that for ~45% of applications Plumbr is able to identify the users without any additional configuration. If Plumbr has not been able to detect the identity, you can help Plumbr locate it via the Identity Detection Rule menu. How to achieve this is explained in the following chapter.

Identity Detection Rules

If Plumbr has not been able to identify the users, you need to hint where the identity of the users is stored. You can do this by creating a new Identity Detection Rule that configures the location of the user identity.

Identity Detection Rules can look for the identity of users in two different locations:

If your application passes along the identity of the user via HTTP request headers, you need to configure an HTTP Header Rule. In this case, all you need to specify is the name of the HTTP header from which Plumbr will extract the tracking information. The header name would then be similar to the following:

X-User-Authentication

If your application is not using HTTP headers to pass along the identity, you need to configure a Session Attribute Rule instead. In that case you need to configure two parameters: Attribute name and Extraction path.

  • Attribute name is the name of the attribute in the session context storing the user’s identity. For example, Spring Security adds its security context under the SPRING_SECURITY_CONTEXT attribute. When the specified attribute is not found in a particular HTTP session, either because a wrong attribute name was provided or because the application user is not yet authenticated, the Plumbr Agent will not proceed to detect the user’s identity from the Extraction path.
  • In the Extraction path field you should define the exact path where the user identity is stored. For example, when Spring Security is used, the extraction path will be: getAuthentication().getPrincipal().getUsername().

Combining the two parameters allows Plumbr to look for the identity. Again using Spring Security as an example, combining the two examples above will result in Plumbr looking for:

session.getAttribute("SPRING_SECURITY_CONTEXT").getAuthentication().getPrincipal().getUsername();

Configuration: Example

To explain how you can configure the location of the User Identity let us check the following example. In this example application the successful authentication operation results in storing the User’s identity in the HTTP Session as:

request.getSession(true).setAttribute("USER_CONTEXT", new UserContext(ipAddress, username));

where request is an instance of javax.servlet.http.HttpServletRequest.

Let’s also assume that the UserContext class would be designed as:

public class UserContext {
    public String getIpAddress() {
        return ipAddress;
    }

    public User getUser() {
        return user;
    }

    private final String ipAddress;
    private final User user;

    public UserContext(String ipAddress, String username) {
        this.ipAddress = ipAddress;
        this.user = new User(username);
    }

    private final class User {
        private final String username;

        public String getUsername() {
            return username;
        }

        public User(String username) {
            this.username = username;
        }
    }
}

So we are adding an instance of UserContext to the session attributes under the key USER_CONTEXT. By default, the Plumbr Agent does not look for the identity in this location. To teach the Plumbr Agent how to extract the user identity in this case, we would specify the configuration as follows:

  • Attribute name: USER_CONTEXT
  • Extraction Path: getUser().getUsername()

Equipped with this knowledge, the Plumbr Agent will now monitor setAttribute() events in all HttpSession instances. Whenever such an event arrives and the attribute set is “USER_CONTEXT”, Plumbr starts capturing the identity.

The identity itself is extracted by invoking getUser().getUsername() on the UserContext object stored under “USER_CONTEXT” key.

If the Plumbr Agent fails to extract the identity with the defined Extraction Path, you will see the following banner in your application logs:

*****************************************************************
* Failed to extract user identity with path: getUser().username *
* Please check your configuration here:                         *
* https://portal.plumbr.eu/settings/identity-detection          *
*****************************************************************

Using the same example as above, the error message becomes clear. The username attribute in the User class is declared private and thus cannot be accessed. To fix this, change the Extraction Path to getUser().getUsername().
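One way to picture how an Extraction Path could be evaluated is with plain Java reflection, walking the path one step at a time. The sketch below is a guess at the general idea and not the Plumbr Agent's code: the ExtractionPathDemo class and its simplified nested UserContext and User stand-ins are hypothetical, and resolving steps without parentheses as public fields only is an assumption chosen so that the private username field fails in the same way as in the banner above.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class ExtractionPathDemo {

    /** Simplified stand-in for the User class from the example. */
    static final class User {
        private final String username;
        User(String username) { this.username = username; }
        public String getUsername() { return username; }
    }

    /** Simplified stand-in for the UserContext class from the example. */
    static final class UserContext {
        private final User user;
        UserContext(String username) { this.user = new User(username); }
        public User getUser() { return user; }
    }

    /** Walks a path such as "getUser().getUsername()" starting from the given object. */
    static Object extract(Object root, String path) {
        Object current = root;
        try {
            for (String step : path.split("\\.")) {
                if (step.endsWith("()")) {
                    // Method step: look up a public no-argument method by name.
                    String name = step.substring(0, step.length() - 2);
                    Method method = current.getClass().getMethod(name);
                    method.setAccessible(true);
                    current = method.invoke(current);
                } else {
                    // Field step: getField only finds public fields.
                    Field field = current.getClass().getField(step);
                    current = field.get(current);
                }
            }
            return current;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Failed to extract user identity with path: " + path, e);
        }
    }

    public static void main(String[] args) {
        Object context = new UserContext("alice");
        System.out.println(extract(context, "getUser().getUsername()"));
    }
}
```

In this sketch the path getUser().getUsername() resolves to the stored username, whereas getUser().username throws because the private field is not reachable.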

Root Causes

What are Root Causes?

When a transaction is flagged as slow or failed, Plumbr looks for the root cause of the underlying issue. Plumbr is able to explicitly track the problematic transactions to the actual root cause in the source code or configuration.

To be able to detect the root causes for slow transactions, Plumbr monitors the JVM for various potential performance issues and if the problems significantly contribute toward exceeding the threshold limit, links such issues to the transaction.

An example of a root cause being linked to a transaction is an expensive JDBC operation, contributing 9,000ms to the total of 11,000ms of the transaction time. By default a significant contribution is detected when a single operation (such as a lock contention event, expensive JDBC operation or GC pause) contributes more than 25% of time to the total duration of the request.
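The 25% rule can be checked with simple arithmetic. In the example above, 9,000ms out of 11,000ms is roughly 82% of the transaction time, well above the limit. A minimal sketch of that check follows; the class and method names are hypothetical.

```java
final class RootCauseContribution {

    // Default share of the transaction time above which an operation is significant.
    static final double SIGNIFICANT_SHARE = 0.25;

    /** True when a single operation took more than 25% of the transaction time. */
    static boolean isSignificant(long operationMillis, long transactionMillis) {
        if (transactionMillis <= 0) {
            return false;
        }
        return (double) operationMillis / transactionMillis > SIGNIFICANT_SHARE;
    }
}
```

By the same rule, an operation taking 2,000ms of an 11,000ms transaction (about 18%) would not be linked as a root cause.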

To detect root causes for failed transactions, Plumbr captures all the exceptions thrown while the transaction is being executed. When such an exception is detected in the context of a failed transaction, it is linked to the transaction as the root cause. An example of such a root cause is a NullPointerException triggered during a particular transaction, aborting the normal flow of operations and returning a 500-series error to the end user instead.

Slow JDBC Calls

The Plumbr Agent monitors every JDBC Type 3 and Type 4 driver detected in the application. This means that Plumbr supports almost every database vendor exposing the data storage via JDBC, including but not limited to the most widely used MySQL, Oracle, Postgres and IBM DB2 databases.

The Agent instruments all the JDBC calls which connect to databases via the Statement, PreparedStatement and CallableStatement APIs. When a call via such an API starts affecting the end user experience, the offending query is listed as a root cause, exposing the JDBC operation executed along with the call stack from the thread executing the query. In this way, you get access to the root cause of expensive JDBC operations down to the single line in the source code responsible for executing such queries.

In order to reduce noise and get a prioritized list of expensive database operations, Plumbr groups expensive operations triggered by the same root cause together, allowing you to rank the expensive operations based on the number of times they are detected.

 

Locked Threads

The Plumbr Agent monitors all JVM threads for lock contention events. Plumbr monitors both synchronized block/method access and java.util.concurrent locks.

For synchronized blocks/methods, Plumbr tracks the situations where a thread in the JVM executes code in a synchronized block or method and another thread tries to enter the same synchronized block/method.

For java.util.concurrent locks Plumbr will detect the situations where threads are forced to wait for events originating from the use of various java.util.concurrent classes, ranging from ReentrantLock to ArrayBlockingQueue.

When the wait time in either case exceeds a predetermined threshold, the root cause is exposed, containing the following:

  • How long the thread was forced to wait before getting access to the synchronized block/method.
  • The monitor used to lock the method/code block (for synchronized usage only).
  • The name and call stack from the thread trying to enter the synchronized block/method.
  • The name and a snapshot of the call stack of the thread whose code was running in the synchronized block. The snapshot of the call stack is taken when the waiting time for the blocked thread is about to exceed the configured threshold.

Having such information allows you to zoom in to the underlying root cause with the precision of a single line in the source code, skipping the tedious and complex process of troubleshooting concurrency issues. Notice that Plumbr also binds together similar lock contention events, allowing you to rank the severity of the performance issues based on the frequency of the underlying root cause.
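The synchronized-block scenario described in this section is easy to reproduce. The following self-contained sketch has one thread hold a monitor while a second thread tries to enter the same synchronized block, and measures how long the second thread was blocked. It only demonstrates the kind of contention event being discussed; it does not show how Plumbr detects it, and all names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

final class LockContentionDemo {

    private static final Object MONITOR = new Object();

    /** Returns how long (in ms) the calling thread was blocked entering the synchronized block. */
    static long measureBlockedMillis(long holdMillis) {
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (MONITOR) {
                lockHeld.countDown();              // signal that the monitor is now owned
                try {
                    Thread.sleep(holdMillis);      // simulate slow work inside the lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        holder.start();
        try {
            lockHeld.await();                      // wait until the holder owns the monitor
            long start = System.nanoTime();
            long acquired;
            synchronized (MONITOR) {               // blocks until the holder leaves the block
                acquired = System.nanoTime();
            }
            holder.join();
            return (acquired - start) / 1_000_000;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the lock", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Blocked for ~" + measureBlockedMillis(200) + " ms");
    }
}
```

In a real application the waiting thread would typically be serving a user transaction, so the blocked time adds directly to that transaction's duration.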

GC Pauses

The Plumbr Agent monitors all stop-the-world Garbage Collection pauses that take place in the JVM. If the duration of such a pause exceeds a configured threshold, an incident is created. In addition to the time and duration of the pause, a Plumbr incident contains insights that would help reduce either the duration or frequency of the long GC pauses, for example:

  • Plumbr captures memory snapshots, exposing the most memory-hungry data structures in memory. This allows you to proceed with trimming the most resource-hungry data structures.
  • Allocation and promotion rates exposed by Plumbr, along with the memory consumption in different memory pools, will give you clues about poorly allocated heap structures.

OutOfMemoryErrors

When memory leak detection does not spot any abnormal data structure growth that looks like a memory leak, the second line of defense is set to capture OutOfMemoryError events and analyze the contents of memory when such an event occurs. When the event is captured, the native code of the Plumbr Agent captures the snapshot of statistics from JVM memory and sends it to the Plumbr Server for analysis. When the analysis completes, an incident is created containing the following information:

  • What are the “fattest” data structures currently in memory (measured in MB).
  • What is currently referencing such data structures, blocking them from being GC-d.
  • Of what do these “fat” data structures consist.
  • Where were these data structures created.

Having this information allows you to quickly understand the most likely reason why the OutOfMemoryError was triggered. In the vast majority of cases, the culprit is staring right at you in one of the top three memory consumers.

The information might look somewhat similar to the dominator tree one could acquire via heap dumps, but at a closer look you will see that Plumbr exposes a lot more information than one could capture via heap dumps (such as the allocation points and the full reference chain). In addition, the relevant information is presented in a lot more user-friendly way, saving you from days spent trying to figure out why some byte arrays seem to occupy most of the heap inside your heap dumps.

Memory Leaks

Memory leak detection monitors all object creation and collection events in order to detect patterns indicating a certain data structure growth being triggered by a memory leak. When such a data structure is detected, Plumbr exposes the root cause equipping it with the following information:

  • The size of the leak (in MB) and the speed at which the leak is growing (in MB/h).
  • The objects that are leaking.
  • What is currently referencing the leaked objects, blocking them from being GC-d.
  • The line in source code where the leaking objects were created.

 

Exceptions

The Plumbr Agent monitors the creation of Exceptions during a transaction. Whenever a transaction is flagged as failed, the chronologically last Exception is linked to the transaction as a root cause. The Exception contains the full stack trace, allowing you to zoom in to the source code and quickly fix the underlying error.

Exceptions that do not affect any transactions or exceptions used to steer control flow are not exposed. Exceptions are grouped together into root causes by Exception class, so for example all ArrayIndexOutOfBoundsException instances would be grouped together as a single root cause. Different call stacks are visible from the root cause details to verify whether or not the source code would need patches in multiple locations.
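Grouping by Exception class can be sketched as a simple count per class name. This is an illustration of the grouping idea only, with a hypothetical class name, and says nothing about how Plumbr stores or ranks root causes internally.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class ExceptionGrouping {

    /** Counts captured exceptions per exception class name. */
    static Map<String, Long> countByClass(List<? extends Throwable> exceptions) {
        return exceptions.stream().collect(Collectors.groupingBy(
                e -> e.getClass().getName(),   // grouping key: the exception class
                TreeMap::new,                  // sorted map for stable output
                Collectors.counting()));
    }
}
```

Two ArrayIndexOutOfBoundsException instances and one NullPointerException would thus produce two groups, with counts of 2 and 1 respectively.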

 

Slow HTTP Calls

The Plumbr Agent monitors different HTTP client libraries used for connecting to remote systems over HTTP. When the HTTP calls to such remote endpoints start affecting the end user experience, the offending HTTP query is linked to user transactions as a root cause, exposing the outgoing HTTP request along with the call stack from the thread executing the query.

HTTP calls tend to perform poorly because the remote system does not respond to the call from the JVM quickly enough. To solve the problem, the system being accessed via HTTP needs to be tuned for latency. If this is not an option, caching the results can also be used to reduce the number of such operations.

The supported list of HTTP client libraries includes:

Slow MongoDB Operations

To detect expensive calls to a MongoDB instance, Plumbr monitors DBCollection and MongoCollection interface methods such as find() and findAndModify(). When a call via such an API starts affecting the end user experience, the operation is listed as a root cause exposing the MongoDB operation called along with the call stack from the thread executing the operation.

In order to reduce noise and get a prioritized list of expensive MongoDB operations, Plumbr groups expensive operations triggered by the same root cause together, allowing you to rank the expensive operations based on the number of times they are detected.

Plumbr supports and monitors both the 2.x and 3.x versions of MongoDB drivers.

Slow JDBC Connection Acquisition

Plumbr detects slow JDBC connection acquisition when JDBC connection retrieval via DataSource.getConnection() or DriverManager.getConnection() is affecting the end user experience. In this case Plumbr exposes the number of transactions affected along with the time these transactions were forced to wait for the connection retrieval.

Slow connection retrieval can be caused by any of the following:

  • Missing connection pool. Creating JDBC connections is expensive, so in this case consider pooling the connections.
  • Uninitialized connection pool. In cases where pooled connections are initialized lazily, the first requests to the empty pool are slow. Consider initializing the pool during application startup.
  • Under-provisioned connection pool. When the number of available connections in the pool is smaller than the demand, requests have to wait in a queue for a connection. Consider increasing the pool size to match the number of concurrent requests to the data source.
  • Leaking connection pool. If connections are not closed, the connection pool does not know that the connection is no longer being used by the borrower thread. To fix this, add pool-specific options to the pool configuration to spot leaks in the pool.
  • Testing connections. To keep unused connections in the pool from becoming stale, pool implementations often test the connection before handing it off to the executor thread. When the test query is expensive, this can result in poor performance. Consider simplifying or dropping the tests if possible.
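
For the leaking pool case, one common safeguard on the application side is try-with-resources, which closes the borrowed connection even when the work throws. The following is a minimal sketch and not part of Plumbr; the names ConnectionUse, SqlWork and withConnection are hypothetical and chosen only for illustration.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper (not part of Plumbr): always closes the borrowed connection. */
final class ConnectionUse {

    /** Work to run against a borrowed connection. */
    interface SqlWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    /**
     * Borrows a connection, runs the work, and closes the connection in all
     * cases. With a pooled DataSource, close() typically hands the connection
     * back to the pool instead of tearing it down.
     */
    static <T> T withConnection(DataSource dataSource, SqlWork<T> work) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            return work.apply(connection);
        }
    }
}
```

Routing all database access through a helper of this kind keeps the close() call in one place, so a forgotten close in application code cannot leak a pooled connection.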

Slow ResultSet Processing

Slow ResultSet Processing is detected when a result set fetched from the database over JDBC is processed in a way that affects end user experience. In such cases, Plumbr exposes the number of transactions affected along with the time those transactions were forced to wait for the JDBC result set processing.

To monitor the time it takes to process the result set, the Plumbr Agent monitors the cumulative duration of each java.sql.ResultSet.next() iteration. When the cumulative time of the iterations starts impacting end user experience, the Slow ResultSet Processing root cause is created. This root cause exposes the query whose results were processed along with the call stack through which the results were processed.

Slow ResultSet Processing is usually detected when processing large result sets received from the database. To improve the situation, consider either switching to more fine-grained queries or using database-backed paging to limit the size of the result sets.

Transaction Snapshots

Plumbr is capable of monitoring for a large number of specific root causes explicitly. Unfortunately, the variety of technologies used in the real world means that the number of ways a particular piece of code can perform poorly is effectively unlimited. A fallback is therefore needed to cover the cases where the specific root cause cannot be determined. In such situations, the Plumbr Agent captures a snapshot from the suspicious transaction.

This snapshot is captured at the moment when 50% of the transaction threshold is exceeded. The snapshot is linked to the transaction if the transaction is eventually flagged as slow. When the transaction does not exceed the slow threshold, the captured snapshot is discarded.
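
The capture and retention rule described above can be modelled in a few lines. This is an illustrative sketch only, not the Agent's implementation; the class name SnapshotPolicy, its method names and the use of milliseconds are assumptions.

```java
/** Illustrative model of the snapshot rule (hypothetical, not Plumbr's code). */
final class SnapshotPolicy {
    private final long slowThresholdMillis;

    SnapshotPolicy(long slowThresholdMillis) {
        this.slowThresholdMillis = slowThresholdMillis;
    }

    /** A snapshot is taken once the transaction has run for more than 50% of the threshold. */
    boolean shouldCapture(long elapsedMillis) {
        return elapsedMillis * 2 > slowThresholdMillis;
    }

    /** The snapshot is kept only if the finished transaction exceeded the slow threshold. */
    boolean shouldKeep(long totalDurationMillis) {
        return totalDurationMillis > slowThresholdMillis;
    }
}
```

With a hypothetical threshold of 2,000 ms, a transaction that finishes in 1,500 ms would have a snapshot captured after the 1,000 ms mark, but that snapshot would be discarded because the transaction never became slow.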

To expose this information in a useful way, Plumbr aggregates those call stacks into a tree-like structure. Call stacks occurring most frequently are ranked higher in a tree. To reduce noise, non-repetitive occurrences are hidden, enabling you to focus on the most frequently captured snapshots first.

Slow Lucene Operations

Plumbr monitors the use of Lucene indexes by instrumenting all implementations of the org.apache.lucene.search.IndexSearcher and org.apache.lucene.index.IndexWriter interfaces. Doing so allows Plumbr to track all the operations modifying the index or reading from the index. This support is implemented and tested on Lucene 4 and 5 releases.

By monitoring the behavior of said interfaces, Plumbr is capable of exposing:

  • The impact poorly performing Lucene indexes have on your end users
  • Actual root cause, down to a single line in source code accessing the index
  • Information about the index accessed, including the index size, accessed fields, accessor methods and more.

JDBC Multi-Queries

The Agent instruments all the JDBC calls which connect to databases via the Statement, PreparedStatement and CallableStatement APIs. When a single JDBC statement executed through such APIs impacts user experience, a Slow JDBC Call is detected as the root cause. In situations where many database calls take place during a single transaction and the accumulated duration of such calls is the reason why the transaction is flagged as slow, the multi-query root cause is exposed instead.

In the details of this root cause you will find the offending queries along with the call stacks from the threads executing such queries. To minimize overhead, smart sampling is applied when exposing this data.

The Plumbr Agent monitors every JDBC Type 3 and Type 4 driver detected in the application. This means that Plumbr is able to monitor communication with almost every database vendor exposing the data storage via JDBC, including but not limited to the most widely used MySQL, Oracle, Postgres and IBM DB2 databases.

Excessive Number of …

“Excessive number of …” root causes are exposed in situations where many similar operations take place during a single transaction and the accumulated duration of such operations is the reason why the transaction ends up being slow.

For example, when just a single HTTP call is impacting user experience, it will be exposed as a Slow HTTP Call. In situations where many HTTP calls take place during a single transaction and the accumulated duration of such calls is the reason why the transaction ends up being slow, the Excessive Number of HTTP Calls root cause is exposed instead.

In most cases, the solution for such problems requires a change in application code. The performance gains can often be achieved by applying either of the following guidelines:

  • Reducing the number of operations invoked by changing the amount of data requested
  • Batching the operations together instead of launching each of them via a separate call.
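
The difference between the two access patterns can be sketched as follows. PriceService is a hypothetical remote service invented for this example, where each method call stands for one network round trip; nothing here is a Plumbr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class BatchingSketch {

    /** Hypothetical remote service; each method call stands for one round trip. */
    interface PriceService {
        long priceOf(String productId);
        Map<String, Long> pricesOf(List<String> productIds);
    }

    /** One round trip per product: the pattern behind "Excessive number of ..." root causes. */
    static List<Long> loadOneByOne(PriceService service, List<String> productIds) {
        List<Long> prices = new ArrayList<>();
        for (String id : productIds) {
            prices.add(service.priceOf(id));
        }
        return prices;
    }

    /** A single round trip for the whole list. */
    static List<Long> loadBatched(PriceService service, List<String> productIds) {
        Map<String, Long> byId = service.pricesOf(productIds);
        List<Long> prices = new ArrayList<>();
        for (String id : productIds) {
            prices.add(byId.get(id));
        }
        return prices;
    }
}
```

Both methods return the same data, but for a list of N products the first issues N remote calls while the second issues one.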

File Stream Operations

The Plumbr Agent monitors file reading and writing operations performed by using FileInputStream and FileOutputStream classes. When the wait time for the read and write operations starts impacting end user experience, Plumbr recognizes this and links a File Stream Operation root cause with the slow transaction. The root cause exposed will contain the following information:

  • File(s) being read/written, along with their path in file system, size and other relevant attributes.
  • Call stack from the thread executing the operation, zooming you right to the line in source code accessing the file system.

Several common problems with reading from or writing to a file stream lead to slow transactions:

  • Lack of buffering: each read or write operation incurs overhead, depending on the operating system, file system and hardware. Instead of reading or writing one byte at a time, a much more performant approach is to do it in bulk. A simple approach is to make use of a BufferedInputStream or BufferedOutputStream.
  • System issues: as noted above, the performance of file operations depends on the operating system, the file system and the hardware. It is sometimes the case that one of these becomes the bottleneck, and even a single file stream operation can take tens of seconds.
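
As a minimal sketch of the buffering advice, the example below wraps a FileInputStream in a BufferedInputStream and reads in chunks rather than byte by byte. The class name BufferedRead and the 8 KB chunk size are arbitrary choices for illustration.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class BufferedRead {

    /** Counts the bytes in a file, reading through a buffer in 8 KB chunks. */
    static long countBytes(String path) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            byte[] chunk = new byte[8192];
            int read;
            // read(byte[]) returns the number of bytes read, or -1 at end of file
            while ((read = in.read(chunk)) != -1) {
                total += read;
            }
        }
        return total;
    }
}
```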

File Attribute Operations

The Plumbr Agent monitors file attribute querying performed by using methods such as File.exists(), File.isDirectory(), File.canWrite() and so on. While individual operations like that are usually handled very quickly, typically under a few microseconds, having a large number of them may result in a slow transaction. One of the most common cases is recursively walking a large directory that contains millions of files.

When the wait time for the attribute checking starts impacting end user experience, Plumbr recognizes this and links a File Attribute Operations root cause with the slow transaction. The root cause exposed will contain the following information:

  • File(s) being accessed, along with the operation performed (exists(), isDirectory(), etc)
  • Call stack from the thread executing the operation, zooming you right to the line in source code accessing the file system.
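
When walking a large directory tree, one way to reduce the number of explicit attribute queries is java.nio.file.Files.walkFileTree, which passes the BasicFileAttributes of each entry to the visitor, so the application code does not need its own File.isDirectory() or File.length() call per entry. The sketch below is a hypothetical example (the class name TreeSize is an assumption); whether it helps in a given application depends on the file system and the access pattern.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

final class TreeSize {

    /** Sums the sizes of all regular files below root using the attributes handed to the visitor. */
    static long totalSize(Path root) throws IOException {
        final long[] total = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    total[0] += attrs.size();
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return total[0];
    }
}
```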

Health

What is Health?

Health is a single number representing the performance of a service or JVM. Health is expressed as a percentage, ranging from 0 to 100%. A 100% healthy service or JVM indicates a situation where all transactions in this JVM/service are completing successfully and completing fast. 0%, on the other hand, would indicate a JVM/service where none of the transactions managed to complete successfully. In addition to communicating the way the JVM/service is performing, health is also used to power the alert system in Plumbr.

Health is calculated according to the ratio of slow and failed transactions using the following formula:

Health = (# of Successful Transactions + # of Slow Transactions / 2) / # of Total Transactions

For example: if Plumbr has

  • monitored 10,000 transactions in a JVM during the selected time period
  • detected that 500 of those transactions have been slow
  • detected that 500 of those transactions have failed

then the health of this JVM is 92.5%, calculated using the formula above as:

92.5% = (9,000 + 500 / 2) / 10,000

The health for services/JVMs is always calculated for a specific time window (12th – 14th July 2016, last 2 hours, etc), so it is a measure changing over time.
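
The health formula can be written as a small function. This is a sketch for illustration, not Plumbr code; the class and parameter names are assumptions. As in the worked example above, the successful count excludes both slow and failed transactions (10,000 − 500 − 500 = 9,000).

```java
final class HealthFormula {

    /**
     * Health as a percentage in the range 0..100.
     * successful: transactions that were neither slow nor failed
     * slow:       slow transactions, each counted as half
     * total:      all monitored transactions in the time window
     */
    static double percent(long successful, long slow, long total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0 * (successful + slow / 2.0) / total;
    }
}
```

For the example above, HealthFormula.percent(9000, 500, 10000) evaluates to 92.5.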

Health Thresholds

Low health for a service or JVM is a symptom requiring attention. Plumbr allows you to define the threshold below which attention is required. If the health of a JVM or a service falls below such a threshold, alerts are generated to notify you via external alert channels. The same concept is used to communicate health violations in the Plumbr UI by color-coding services and JVMs.

Health thresholds are built using the following simple expression:

A service or a JVM is unhealthy if its health falls below [certain percentage].

Such health thresholds can be configured either at JVM or service level. By default, Plumbr ships with the following health thresholds:

  • A JVM is unhealthy when the health of the JVM drops below 95%
  • A service is unhealthy when the health of the service drops below 95%.

These policies are used to color-code JVMs and services. Any JVM or service can be color-coded using two different colors:

  • Green – the JVM/service is not violating any health policies in the time range selected.
  • Red – the JVM/service is violating a health policy in the time range selected.

When the JVM is not exposing any transactions for Plumbr to monitor or when a service has been consumed by under 5 transactions during the selected period, health cannot be calculated. In such cases, health is expressed as “N/A” and color-coded with grey.
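
The color-coding rules for a service can be summarized in a short sketch. The class name HealthColor and its method are hypothetical; the 5-transaction minimum and the meaning of the colors come from the text above, and the threshold is passed in as a parameter because it is configurable.

```java
final class HealthColor {

    enum Color { GREEN, RED, GREY }

    /** Classifies a service for the selected time range. */
    static Color forService(long transactions, double healthPercent, double thresholdPercent) {
        if (transactions < 5) {
            return Color.GREY; // too few transactions: health is "N/A"
        }
        // "drops below" is modelled as a strict comparison
        return healthPercent < thresholdPercent ? Color.RED : Color.GREEN;
    }
}
```

With the default threshold of 95%, a service with 10,000 transactions and a health of 92.5% would be classified as RED.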

Uptime

What is Uptime?

In addition to monitoring transactions to measure health, Plumbr also monitors JVM uptime to detect situations where the JVM itself has crashed and is unavailable to service any incoming transactions.

Uptime monitoring is built upon active checks from the Plumbr Server to the Agent that occur once every five seconds. Such checks are called heartbeats. When eight or more sequential heartbeats go unanswered by a JVM, a downtime event is registered.

Such downtime can be either planned downtime (caused by a graceful JVM shutdown) or unplanned downtime (e.g. caused by an OutOfMemoryError or a segmentation fault). For downtime events where Plumbr is able to capture the reason why downtime was triggered, Plumbr also exposes it as the root cause.
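
The heartbeat rule can be modelled as a counter of consecutive unanswered checks. This is an illustrative sketch, not the Server's implementation; the class name HeartbeatMonitor and its method are assumptions.

```java
/** Illustrative model of downtime detection (hypothetical, not Plumbr's code). */
final class HeartbeatMonitor {
    static final int MISSED_HEARTBEATS_FOR_DOWNTIME = 8;

    private int consecutiveMissed;

    /** Records the result of one heartbeat; returns true if the JVM now counts as down. */
    boolean record(boolean answered) {
        consecutiveMissed = answered ? 0 : consecutiveMissed + 1;
        return consecutiveMissed >= MISSED_HEARTBEATS_FOR_DOWNTIME;
    }
}
```

With one heartbeat every five seconds, eight sequential misses correspond to roughly 40 seconds of silence before a downtime event is registered.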

Alerts

What are Alerts?

Alerts are used to communicate health and uptime degradations. Alerts are designed to be actionable, triggering only when the performance impact is high enough to require action.

Each alert contains the following information:

  • The alert rule that triggered the alert.
  • The time when the alert was triggered.
  • The JVM or service that triggered the violation.
  • The severity/impact of the alert.

Alerts are triggered when the conditions in alert rules are matched. Alerts are sent to alert channels, which you can configure separately.

Alert Rules

Alert rules are used to create actionable alerts and send them to the correct channel when a particular JVM or service has been unhealthy or down for a certain period. Some examples of such rules are:

  • Send a message to [HipChat room MyCompany Sysadmins] when [any service] has been unhealthy for more than [1 hour].
  • Send a message to [HipChat room MyCompany Sysadmins] when [any JVM] has been down for more than [5 minutes].
  • Send [an email to admins@mycompany.com] when the [e-shop@example.com JVM] has been unhealthy for more than [5 minutes].
  • Send [an email to bigboss@mycompany.com] when the [e-shop@example.com JVM] has been unhealthy for more than [30 minutes].

Each such alert rule will trigger the creation of an alert. Alerts are sent out to alert channels, which you can configure separately.

To avoid alert fatigue and flapping, Plumbr sends out an alert triggered from the same JVM or service only once during a 24-hour period.
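
The once-per-24-hours behaviour can be sketched as a simple throttle keyed by the JVM or service. How Plumbr actually keys and anchors the 24-hour window is not specified here, so the class name AlertThrottle and the per-source bookkeeping below are assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Illustrative alert throttle (hypothetical, not Plumbr's code). */
final class AlertThrottle {
    private static final Duration QUIET_PERIOD = Duration.ofHours(24);

    // Time of the last alert sent for each JVM or service
    private final Map<String, Instant> lastSent = new HashMap<>();

    /** Returns true if an alert for this source should be sent at the given time. */
    boolean shouldSend(String source, Instant now) {
        Instant previous = lastSent.get(source);
        if (previous != null && now.isBefore(previous.plus(QUIET_PERIOD))) {
            return false; // an alert for this source went out less than 24 hours ago
        }
        lastSent.put(source, now);
        return true;
    }
}
```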

By default, Plumbr alert rules are set to trigger an alert when:

  • Any JVM has been unhealthy for more than 10 minutes.
  • Any service has been unhealthy for more than 15 minutes.
  • Any JVM has been down for more than 10 minutes.

You can either modify existing rules or add new rules based on your specific needs.

Alert Channels

Alert channels are used to communicate alert rule violations to external channels. Examples of such channels include email, SMS, issue trackers, chatrooms and other monitoring solutions. Adding, deleting and changing channels is possible via the Settings – Alert Channels menu. Plumbr currently supports the following channels:

  • Email
  • HipChat
  • JIRA
  • PagerDuty
  • Slack

When you have configured a new alert channel, the newly created channel can be used to create new alert rules sending alerts to the channel.

Email

The Email channel sends alerts to an email address whenever any of the alert rules using the channel is triggered. To configure an email channel, all you need to do is provide the email address to which the alerts are sent. After this, all the alerts triggered by Plumbr will be sent to the email address provided, according to the alert rules present in the settings.

Email is also the alert channel configured by default during the sign-up. The e-mail address provided during the sign-up process is used as the endpoint to send alerts to.

HipChat

The HipChat channel sends alerts to a HipChat room whenever any of the alert rules using the channel is triggered. To set up the channel, you need to provide an existing HipChat room name and a notification token. To acquire the token, follow the steps below:

  1. Go to https://hipchat.com/ and sign in
  2. Navigate to “Rooms” section
  3. Select the room where you would like to receive Alerts from Plumbr
  4. Navigate to “Tokens” menu
  5. Fill the form under “Create new token” section
  6. Choose type “Send Notification” from the drop down menu
  7. Enter “Plumbr” to the label input
  8. Click “Create”
  9. Copy the created token to the “Notification token” input field

When the channel is created all the alerts triggered by Plumbr will be sent to the HipChat room specified according to the alert rules present in the settings.

JIRA

The JIRA channel creates an issue in JIRA whenever any of the alert rules using the channel is triggered. To set up the channel, you need to provide the following information about the issues created:

  • JIRA Base URL. The URL where your JIRA is located. You can acquire this URL from https://your-company-name.atlassian.net -> Settings -> System
  • Username. User present in your JIRA who will create the ticket. We recommend
  • Password. Password for the user specified.
  • Project Key. The key of the project the issue will be created in.
  • Issue Type. The type of the issue created in JIRA (Bug, Task, …)

When the channel is created all the alerts triggered by Plumbr will trigger creation of new issues in the JIRA instance specified according to the alert rules present in the settings.

PagerDuty

The PagerDuty channel sends alerts to PagerDuty whenever any of the alert rules using the channel is triggered. To set up the channel, you need to provide the PagerDuty integration key via the following steps:

  1. Go to https://your-company-name.pagerduty.com/services/new
  2. Fill “General Settings” form
  3. In “Integration Settings” section choose “Use our API directly”
  4. Choose desired “Incident Settings”
  5. Click on “Add Service” button
  6. Copy generated “Integration Key” to corresponding input field

When the channel is created all the alerts triggered by Plumbr will be sent to the PagerDuty channel specified according to the alert rules present in the settings.

Slack

The Slack channel sends alerts to the specified Slack channel via a webhook. To set up the channel, you need to provide the name of an existing Slack channel and a webhook URL. To acquire the webhook URL, follow the steps below:

  1. Go to integration settings in your Slack account
  2. Choose a desired channel from “Post to Channel” dropdown
  3. Click on “Add Incoming WebHooks Integration” button
  4. Copy generated URL to the “Webhook URL” input field

When the channel is created all the alerts triggered by Plumbr will be sent to the Slack channel specified according to the alert rules present in the settings.

Configuration

Agent Configuration

Agent configuration is stored in the file plumbr.properties located next to the Plumbr Agent .jar file. This configuration is used to monitor a single JVM on one machine. When monitoring multiple JVMs on the same machine, make sure that every JVM uses a different Plumbr installation to avoid clashes in the configuration.

If you cannot use property files, an alternative is to specify the parameters in the JVM startup script as -D parameters, prefixing each parameter with the “plumbr.” namespace. For example, you could also specify the accountId, jvmId and serverUrl parameters for your JVM via:

java -Dplumbr.accountId=a8nd2bar -Dplumbr.jvmId=PtJ -Dplumbr.serverUrl=https://app.plumbr.io

Please note that if you do not use a property file, you will need to pass the following properties via parameters: accountId, jvmId, serverUrl, excludePackages, logFile, logLevel. Copy their values from the property file that you have.
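
When parameters are passed via -D, they are visible inside the JVM as ordinary system properties. The sketch below is a hypothetical diagnostic helper, not part of the Plumbr Agent, that lists the properties in the “plumbr.” namespace so you can check what the startup script actually passed; the class name PlumbrSystemProperties is an assumption.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical diagnostic helper (not part of the Plumbr Agent). */
final class PlumbrSystemProperties {
    private static final String PREFIX = "plumbr.";

    /** Returns all properties in the "plumbr." namespace, keyed without the prefix. */
    static Map<String, String> collect(Properties properties) {
        Map<String, String> found = new TreeMap<>();
        for (String name : properties.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                found.put(name.substring(PREFIX.length()), properties.getProperty(name));
            }
        }
        return found;
    }
}
```

Calling PlumbrSystemProperties.collect(System.getProperties()) in a JVM started with the command line above would return entries for accountId, jvmId and serverUrl.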

Basic configuration

Configuration parameters in this section are required for the Plumbr Agent to connect to the Plumbr Server, link the JVM to your account, and identify the JVM so that you could distinguish between the different JVMs monitored by Plumbr.

  • accountId – your account identifier, which binds this Agent to your account in the Plumbr Server. This identity is generated and embedded into the downloaded Agent configuration for you. During normal use, you should not change the value of the parameter.
  • jvmId – JVM identifier, binding data from this particular JVM to a correct JVM in the Plumbr Server. When this identity is not provided, the connected JVM gets assigned a temporary identifier, which will not survive over JVM restarts. In order to have the data connected to the same account, provide the identifier either as a value of this property or via the server-side UI.
  • serverUrl – the server to which the Agent connects. If you are using On Demand Plumbr, make sure the value refers to https://app.plumbr.io. If the Agent is connecting to a Plumbr Server installed on your premises, make sure you have specified the correct server URL.

Proxy configuration

When your network configuration requires outgoing communication to pass a proxy server you can set up the communication between the Plumbr Agent and the Server via a proxy. Specifying the values for these parameters redirects the traffic from the Agent to the Server via the proxy server specified in proxyUrl.

  • proxyUrl – the proxy URL that you can use to connect to the Plumbr Server if a direct connection from the Agent is not possible. If a proxy is used, this setting is mandatory; other proxy settings are optional. An example of the parameter: proxyUrl=http://squid.mycompany.com:3128.
  • proxyAuthUser – the username for proxy authentication. Note that the Plumbr Agent only supports Basic authentication.
  • proxyAuthPassword – the password for proxy authentication. Note that the Plumbr Agent only supports Basic authentication.

Logging configuration

Parameters in logging configuration are used to tune the logging of the Plumbr Agent.

  • logConf – the location of the Logback configuration file Plumbr Agent uses for logging purposes. Detailed logging configuration is embedded in the referred XML file which you can tune to suit your needs.
  • tmpDir – the location of the temporary files generated by Plumbr during runtime. Such temporary data includes the buffered data the Agent has not yet sent to the Server and temporary data structures used during Agent-side analysis. The location is relative to the location of the Plumbr Agent in the file system.
  • doCleanup – whether or not Plumbr deletes the temporary files after they are no longer needed. Switch this to false only when told to do so by Plumbr support.

Excluding certain data structures from being monitored

On rare occasions Plumbr can detect false positives – the behavior of your application looks like a leak or a lock, but in reality these are harmless pieces of code. The best example is a well-configured cache, which continuously grows over time although you know that it is configured properly and will not blow up your application. If you want to exclude some root cause from further detection, please contact us at support@plumbr.eu and we will guide you through the steps required.

Network Configuration

When your network configuration is blocking Plumbr Agents from connecting to the Plumbr Server, you will see a message similar to the following in your JVM standard output logs:

****************************************************************
* Plumbr Server not responding -                               *
* cannot connect to https://portal.plumbr.eu.                  *
* Retrying in 60 seconds.                                      *
****************************************************************

Note that although the Server cannot be reached, the JVM will still start. It will just not be monitored by Plumbr.

To verify that the problem is related to network configuration only, try connecting to portal.plumbr.eu port 443 from the machine on which you are installing Plumbr to see whether the connection is allowed. This can be done, for example, via telnet, similar to the following:

$ telnet portal.plumbr.eu 443
Trying 54.171.1.110...
Connected to portal.plumbr.eu.
Escape character is '^]'.

When the connection is successful, you should see a message Connected to portal.plumbr.eu, similar to the example above. When the connection fails, the network configuration is blocking connections to portal.plumbr.eu port 443.
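
If telnet is not available on the machine, the same TCP-level check can be made from Java. The sketch below is a hypothetical helper, not part of Plumbr, and the class name ConnectivityCheck is an assumption. It only verifies that a TCP connection can be opened; it does not validate TLS or proxy settings.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of Plumbr): checks whether a TCP connection can be opened. */
final class ConnectivityCheck {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names
            return false;
        }
    }
}
```

For the check described above, the call would be ConnectivityCheck.canConnect("portal.plumbr.eu", 443, 5000), where the five-second timeout is an arbitrary choice.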

To overcome the situation, connecting via a proxy server or relaxing the firewall configuration are the first two options recommended. If you cannot change the network configuration, you should turn to our On Premise offering, where you can install the Server component in your network.

Proxy configuration

When your network configuration requires outgoing communication to pass a proxy server, you can set up the communication between the Plumbr Agent and Server via proxy. Specifying the values for the following parameters in the plumbr.properties file located next to the Plumbr Agent redirects traffic from the Agent to the Server via the proxy server specified in the proxyUrl.

  • proxyUrl – the proxy URL that you can use to connect to the Plumbr Server if a direct connection is not possible. If a proxy is used, this setting is mandatory; other proxy settings are optional. An example of the parameter: proxyUrl=http://squid.mycompany.com:3128.
  • proxyAuthUser – the username for proxy authentication. Note that the Plumbr Agent only supports Basic authentication.
  • proxyAuthPassword – the password for proxy authentication. Note that the Plumbr Agent only supports Basic authentication.

Firewall configuration

Another option for bypassing connectivity issues is to check your firewall configuration. If the outgoing connections from the Plumbr Agent are blocked by a firewall make sure the connection to portal.plumbr.eu:443 is allowed in your firewall configuration.

The exact configuration steps for this are firewall-specific. See your vendor manuals for further information.

Upgrading Plumbr Agent

To upgrade Plumbr Agent you need to go through the following steps:

  1. Download the new Plumbr Agent from here.
  2. Back up the current Agent installation.
  3. Unzip the newly downloaded Agent .zip file to the folder you wish to install the Plumbr Agent to.
  4. Copy plumbr.properties from the backup to the new Agent installation.
  5. Update the startup parameters of the JVM you are monitoring to point to the new Agent JAR file:
    -javaagent:path-to-new-agent/plumbr.jar
  6. Restart the JVM you want to monitor.

Upgrading Plumbr Server

A Plumbr Server update is based on building new Docker images and mounting the data to the newly built images. No data stored in the Docker images is thus preserved, so you cannot expect any manual configuration changes made to existing Docker machines to be preserved.

To upgrade Plumbr Server you need to go through the following steps.

  1. Download a new version of Plumbr Server distribution from Download Center.
  2. Extract the downloaded archive on top of the existing plumbr-server folder, replacing all existing files.
  3. Restart the Docker Compose project by running “docker-compose up -d” from the plumbr-server folder. This will download all updated Docker images and then recreate all affected containers.
  4. After the process completes, the new version of Plumbr Server is available at the same URL as before.

As a next step, the Plumbr Agents connecting to the Server need to be upgraded. You can do this independently of the Server update, but for consistency you need to eventually also upgrade the Agents.

On-Premise OOM analysis

Running analysis when an OutOfMemoryError occurs in an application is computationally intensive and requires an amount of RAM proportional to the number of objects in the JVM. Therefore, when running your own Plumbr Server on-premise, additional steps must be taken to find root causes for an OutOfMemoryError.

Semi-automatic analysis

The OutOfMemoryError meta-information snapshot will always be automatically sent by the Plumbr Agent to the Plumbr Server after such an error occurs. A corresponding Root Cause screen will appear in Plumbr Server, prompting you to do the following:

  1. Select or create a machine with an amount of RAM as specified on the page
  2. Download a .jar file that would perform the analysis to that machine
  3. Run it, supplying the required amount of heap to the JVM by specifying the -Xmx argument

The program will automatically download the meta-information from Plumbr Server, run the analysis and upload a complete report back to Plumbr Server. This assumes two things:

  • You have specified the plumbr.server.url property in the server properties or set it in the web interface
  • The machine that runs the analysis has access to the machine where Plumbr Server is running

If the first condition is not met, you can still run the analysis by supplying a property when running the jar file: -Dportal.url=https://address-of-your-portal-installation.

Running analysis from behind a firewall

In case access to the Plumbr Server is restricted by a firewall, some additional manual actions are required:

  1. Click on “Detailed information” on the Root Cause page to follow the instructions
  2. Copy the meta-information files named oom_dump_v4.tbz2 and oom_dump_info.txt to the target machine
  3. Supply the path to the copied .tbz2 file to the jar running command
  4. Supply the path to where the report should be saved, e.g. report.bin
  5. Run the analysis, e.g. java -Xmx1g -jar analyze-oom ~/oom-analysis/1/oom_dump.tar.bz2 report.bin
  6. Upload the report.bin file to the corresponding form on the Root Cause page

Data retention

By default, meta-information files will be immediately deleted upon successful analysis. Data files that date back more than 30 days will also be deleted, even if no analysis was performed on them. At any time you may manually delete the dumps from the ${plumbr.server.home}/data/dumps folder.

Identifying the JVM

When you configure your application to run with Plumbr, you have an option to identify this JVM with a name suitable for your deployment. “Payment Live” or “Reporting QA” as examples can give you an idea what this ID can look like. Assigning the ID can be done in three ways:

  • By setting the “jvmId” property in the plumbr.properties file. This file resides in the same directory as the “plumbr.jar” file. Find the line starting with “jvmId=” in that file and append the selected name, similar to the example: “jvmId=Payment”.
  • By providing your JVM with the “plumbr.jvmId” system property. Add “-Dplumbr.jvmId=Payment” to your application command line, e.g. “java -Dplumbr.jvmId=Payment -javaagent:/path/to/plumbr/plumbr.jar …”. This option is useful in dynamic environments, where JVMs are created and destroyed dynamically via scripts.
  • When your application is already running and is connected to the Plumbr Server, by opening the detail view of this JVM, clicking on the JVM name, providing a new name and clicking “Save”.

If you don’t manually identify your JVM as described above, it gets assigned an auto-generated ID. The generated ID will be ephemeral in the sense that it will not persist between JVM restarts. When you restart your application and have not provided the ID yourself, then a new JVM with a new identifier will be created in Plumbr Server.

Agent startup checks

While starting the Agent, Plumbr goes through the following checks to verify the integrity of the installation:

  1. Verifying file system permissions: whether or not the folder in which the Plumbr Agent resides and its subdirectories are readable and writable by the user launching the JVM Plumbr is attached to.
  2. Verifying the configuration: whether or not all the required configuration parameters are present and valid.
  3. Verifying the support for the environment: whether the OS, JVM and application server used are supported by Plumbr.
  4. Verifying the connectivity: whether the Plumbr Agent can connect to the Server.
  5. Verifying the Agent version: whether the Agent is still supported by the Server the Agent is connecting to.
  6. Verifying the subscription: whether the account has an active subscription or the subscription period has expired.
  7. Other miscellaneous checks, including but not limited to:
    1. If a proxy server is used to connect to Plumbr Server, whether the proxy can be connected to with the credentials provided.
    2. Whether or not the jvmId parameter used to identify the JVM connecting to the Server is unique.
    3. Whether or not multiple JVMs are using the same Plumbr installation.
    4. Whether or not the jvmId belongs to the account it tries to connect to.

Some of the steps can fail for various reasons. In such a case, the Agent will not be attached and the JVM starts without the Plumbr Agent monitoring the end user experience. The reason for the failure will be exposed in the server’s standard output.

To find out whether the Agent failed to initialize, search the log files for the phrase “Plumbr”. If you discover one of the error messages listed below, follow the instructions given for that error message.

Verifying file system permissions

********************************************************************
* Plumbr encountered a filesystem permissions error.               *
* The user that runs your Java process has no write access to the  *
* /users/me/plumbr                                                 *
* Plumbr needs write access to the whole directory.                *
*                                                                  *
* Please ensure that the user that runs your Java process has      *
* read and write permissions for that directory,                   *
* its sub-directories and files inside it.                         *
*                                                                  *
* Check out https://plumbr.eu/support/agent-configuration          *
* for more information or contact support@plumbr.eu                *
********************************************************************

When you encounter this error message in log files, the user launching the JVM that the Plumbr Agent is attached to does not have sufficient permissions to access the Plumbr Agent installation directory.

Plumbr Agent needs to be able to read and write the folder the Agent is installed in (and its subdirectories). In order to proceed, you need to ensure the user running the Java process the Agent is attached to has read and write permissions for both the Agent’s installation folder and its subdirectories.
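To find out which entries are the problem, a check along the following lines may help. It is a minimal sketch with hypothetical names, not the check the Agent itself performs, and it must be run as the same user that launches the monitored JVM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Plumbr.
final class PlumbrHomePermissions {
    /**
     * Lists every file and folder under plumbrHome (including plumbrHome itself)
     * that the current user cannot read or cannot write.
     */
    static List<Path> inaccessibleEntries(Path plumbrHome) throws IOException {
        try (Stream<Path> entries = Files.walk(plumbrHome)) {
            return entries.filter(p -> !Files.isReadable(p) || !Files.isWritable(p))
                          .collect(Collectors.toList());
        }
    }
}
```

An empty list means the whole directory tree is readable and writable. If a subdirectory cannot even be listed, Files.walk fails with an UncheckedIOException, which points at the same permission problem.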

Verifying the configuration

******************************************************************************
* Plumbr is missing the following required properties: serverUrl.            *
* Either make sure the plumbr.properties file is present next to plumbr.jar  *
* or specify individual properties via -D parameters in your startup script. *
*                                                                            *
* Check out https://plumbr.eu/support/agent-configuration                    *
* for more information or contact support@plumbr.eu                          *
******************************************************************************

When you face such a banner in your JVM startup output, either the entire configuration stored in the plumbr.properties file or some of the mandatory properties are missing. The plumbr.properties file is located next to the Plumbr Agent .jar file.

The first step to overcome the problem is to make sure the Plumbr installation directory is intact and you have not extracted only the Agent’s plumbr.jar file. If the plumbr.properties file is present, check the error message for the missing mandatory parameter(s) and add them to the file. When in doubt, check the Agent Configuration page in our support materials.

As an alternative to keeping the configuration in the plumbr.properties file, you can specify the parameters in the JVM startup script using -D parameters, prefixing each parameter with the “plumbr.” namespace. So, for example, you could specify the accountId, jvmId and serverUrl parameters for your JVM via:

java -Dplumbr.accountId=a8nd2bar -Dplumbr.jvmId=BillingProduction -Dplumbr.serverUrl=https://app.plumbr.io
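The lookup of a parameter in both places can be sketched as follows. The class name is hypothetical, the list of required parameters is supplied by the caller, and letting a -D parameter take precedence over the file is an assumption of this sketch rather than documented Agent behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Plumbr.
final class PlumbrConfigCheck {
    /**
     * Returns the names from 'required' that are set neither in plumbr.properties
     * nor as a -Dplumbr.<name> system property.
     */
    static List<String> missing(Properties fromFile, Properties systemProps, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            // Assumption: the -D value, if present, overrides the value from the file
            String value = systemProps.getProperty("plumbr." + name, fromFile.getProperty(name));
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

In a real check, fromFile would be loaded from the plumbr.properties file with Properties.load and systemProps would be System.getProperties().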

Verifying the support for the environment

**********************************************************************************************
* Environment you are trying to run Plumbr in is not supported.                              *
* Windows XP operation system is unsupported. Minimum supported Windows version is Windows 7.*
*                                                                                            *
* Check out the support page https://plumbr.eu/support/is-my-environment-supported-by-plumbr *
* for the list of supported environments.                                                    *
**********************************************************************************************

When facing a banner similar to the one above in your log files, the environment the Plumbr Agent is running in is not supported by the Agent. The exact message will be different, depending on which unsupported operating system, JVM vendor/version or application server was detected in the environment.

To overcome the issue, consult the list of supported environments in our support documentation to find out whether you can run Plumbr in an officially supported environment.

Verifying the connectivity

***********************************************
* Plumbr Server not responding -              *
* cannot connect to https://app.plumbr.io.    *
* Retrying in 60 seconds.                     *
***********************************************

***************************************************************
* Plumbr Server not responding -                              *
* cannot connect to https://app.plumbr.io.                    *
* Retrying in 60 seconds.                                     *
*                                                             *
* In case your network configuration is blocking connections  *
* to Plumbr servers, see how to configure proxy server and/or *
* firewall https://plumbr.eu/support/network-configuration.   *
*                                                             *
* In case your company policy does not allow using externally *
* hosted services, try out Hosted Plumbr which does not       *
* require external network connections                        *
* https://plumbr.eu/support/hosted-plumbr                     *
***************************************************************

When you face a banner similar to either of the above in your JVM log files, the Agent deployed cannot connect to Plumbr Server over the network. Plumbr Agents will start without the presence of the Server, but as there is no endpoint to send the harvested data to, the Server cannot analyze the gathered data and thus you receive no value from Plumbr.

Note that when the Server endpoint is just temporarily unavailable, the Agent will buffer the data locally. The Agent also periodically retries connecting to the Server, and when the Server (re)appears, the buffered data will be sent to it.

To verify that the problem is related to network configuration only, try connecting to app.plumbr.io port 443 from the machine you are installing Plumbr on, to see whether the connection is allowed. This can for example be achieved via telnet, similar to the following:

$ telnet app.plumbr.io 443
Trying 54.171.1.110...
Connected to app.plumbr.io
Escape character is '^]'.

When the connection is successful, you should see the message Connected to app.plumbr.io, as in the example above. When the connection fails, the network configuration is blocking connections to app.plumbr.io port 443.

To overcome the situation, connecting via a proxy server or relaxing the firewall configuration are the first two options recommended. If you cannot change the network configuration, you should turn to our On Premise offering where you can install the Server component in your network.
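On machines where telnet is not installed, the same connectivity test can be approximated from Java. The sketch below uses a hypothetical class name and, like telnet, attempts a direct TCP connection, so it does not tell you whether an HTTP proxy would let the traffic through.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Plumbr.
final class PlumbrConnectivity {
    /** Returns true when a TCP connection to host:port can be opened within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the host name could not be resolved
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port taken from the telnet example above
        System.out.println(canConnect("app.plumbr.io", 443, 5000));
    }
}
```

A result of false from the machine running the monitored JVM suggests that the firewall or proxy configuration needs attention.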

Verifying Agent version

During startup, the Agent version is compared to the Server version to verify whether or not the Agent is still supported by the Server the Agent is connecting to. In general we use the following policy for Agent version support:

  • Servers accept connections from Agents up to one year older than Servers. Agents older than one year will be rejected by Server.
  • Servers will not be compatible with Agents released later than the Server.

Recommending to upgrade

************************************************************************************************
* You are using version 16.08.02 of Plumbr. We recommend upgrading to the latest version 12356.   *
* Download the latest version of the agent here: https://app.plumbr.io/download/agent/16.09.20 *
************************************************************************************************

When facing a banner like the one above, your currently used Agent is between 1 and 3 months behind the latest Agent available. As we add new features almost every month, you should consider upgrading, but you still have nothing to worry about.

Strongly recommending to upgrade

********************************************************************************************************
* You are using version 16.08.02 of Plumbr, which will be supported only until 2017-01-01                 *
* Download the latest version (16.12.12) of the agent here: https://app.plumbr.io/download/agent/16.12.12 *
********************************************************************************************************

When facing the banner above, your current Agent is between 3 and 6 months older than the Server the Agent connects to. The Server will still support the connecting Agent, but you should start planning for the Agent version upgrade.

Deprecated Agent version

********************************************************************************************************
* You are using deprecated version 16.08.02 of the Plumbr agent.                                         *
* Download the latest version (17.04.02) of the agent here: https://app.plumbr.io/download/agent/17.04.02 *
********************************************************************************************************

When facing the message above, the Agent connecting to the Server is between 6 and 12 months older than the Server the Agent connects to. The version is already deprecated and will become unsupported once the 12-month limit is reached. You should plan for the Agent version upgrade as soon as possible.

Unsupported Agent version

**************************************************************************************************************
* You are using unsupported version 16.08.02 of the Plumbr Agent, which can no longer connect to Plumbr Server. *
* Please upgrade to the latest version of the agent from: https://app.plumbr.io/settings/download-center *
**************************************************************************************************************

When seeing the error banner above, the Agent can no longer connect to the Server as the Agent version is older than the oldest Agent version supported by this particular Server. You need to upgrade to a new Agent version in order to continue benefiting from Plumbr.

Agent version newer than Server version

*************************************************************************************
* You are using unsupported version 16.08.02 of the Plumbr Agent                       *
* Plumbr Server accepts only agents released before the server                     *
* Please refer to Download Center https://app.plumbr.io/settings/download-center *
* to get supported version of Plumbr Agent                                         *
*************************************************************************************

When facing the error message above in your JVM logs, the Agent connecting to the Server is newer than the Server. As the Server accepts connections only from Agents it knew existed when the Server was released, this “agent from the future” is not allowed to connect.

This message appears only when you are using our On Premise offering, where you have installed the Server yourself. This means you have two options:

  • Preferably you should upgrade the Server, so that new Agents can apply all their new features in your deployment.
  • If this is not possible, you need to use an older Agent version, so that the Agent is not released later than the Server it connects to.
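The age bands described in this section can be summarised in a short sketch. It rests on two assumptions that are not confirmed by this manual: that a version number such as 16.08.02 encodes the release date as yy.MM.dd, and that each band includes its lower boundary. The class and enum names are hypothetical, and the real comparison is performed by the Server.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration of the version support policy, not Plumbr code.
final class AgentVersionPolicy {
    enum Status {
        UP_TO_DATE, UPGRADE_RECOMMENDED, UPGRADE_STRONGLY_RECOMMENDED,
        DEPRECATED, UNSUPPORTED, NEWER_THAN_SERVER
    }

    // Assumption: versions are release dates written as yy.MM.dd
    private static final DateTimeFormatter VERSION = DateTimeFormatter.ofPattern("yy.MM.dd");

    static Status classify(String agentVersion, String serverVersion) {
        LocalDate agent = LocalDate.parse(agentVersion, VERSION);
        LocalDate server = LocalDate.parse(serverVersion, VERSION);
        if (agent.isAfter(server)) {
            return Status.NEWER_THAN_SERVER;
        }
        // Whole months between the two release dates
        long months = ChronoUnit.MONTHS.between(agent, server);
        if (months < 1) return Status.UP_TO_DATE;
        if (months < 3) return Status.UPGRADE_RECOMMENDED;
        if (months < 6) return Status.UPGRADE_STRONGLY_RECOMMENDED;
        if (months < 12) return Status.DEPRECATED;
        return Status.UNSUPPORTED;
    }
}
```

Under these assumptions, the version pairs shown in the banners above (16.08.02 against 16.09.20, 16.12.12 and 17.04.02) fall into the recommended, strongly recommended and deprecated bands respectively.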

Verifying the subscription

Plumbr is subscription-based software with a 14-day free trial available. When the subscription has expired, the data on your account is kept for 10 more days, but you can no longer monitor applications with Plumbr. Once 10 days have passed since the subscription expired, the data on your account will be permanently deleted.

Expiration warning

***************************************************************************************
* Your subscription will expire on 2016-01-01.                                        *
* From 2016-01-01 Plumbr will not monitor your JVM(s) any more.                       *
* To renew your subscription get in contact with Plumbr Sales at sales@plumbr.eu.     *
***************************************************************************************

When encountering such a warning in your log files, your subscription will expire soon. If you wish to benefit from Plumbr after the subscription period, you should start planning for the subscription extension.

Paid account expired

**********************************************************************************************
* Your subscription expired on 2016-01-01 and Plumbr is not monitoring your JVM(s) any more. *
* Your data will be available for 10 days, after which your account will be deleted.         *
* To renew your subscription go to https://app.plumbr.io/payment                          *
**********************************************************************************************

If you encounter the message above, it means your subscription has expired. The data is still present on your account, but you can no longer monitor any JVMs. Once 10 days have passed since the subscription expired, the data will be permanently deleted.

To keep benefitting from Plumbr, extend your subscription.

Paid account deleted

******************************************************************************
* As you subscription was not renewed your Plumbr account has been deleted   *
* and you cannot monitor your JVM(s) with Plumbr any longer.                 *
* To start monitoring your JVM(s), sign up and purchase Plumbr               *
* one year subscription: https://app.plumbr.io/payment                    *
******************************************************************************

Encountering the message above indicates your subscription has expired and more than 10 days have passed from the expiration date. The data on your account has been deleted.

To start using Plumbr again, purchase a new subscription.

Trial account expired

***************************************************************************************
* Your free trial is now expired and Plumbr is not monitoring your JVM(s) any more.   *
* Your data will be available for 10 days, after which your account will be deleted.  *
* To activate your account go to https://app.plumbr.io/payment                        *
***************************************************************************************

The message above indicates that the free trial you used has expired. You can no longer monitor the JVMs with Plumbr. The data gathered during the trial is still available for you until 10 days have passed from the trial expiration. After this, the data on your account will be permanently deleted.

Trial account deleted

*************************************************************************************
* As your free trial expired your Plumbr account has been deleted and you cannot    *
* monitor your JVM(s) with Plumbr any longer. To start monitoring your JVM(s),      *
* sign up and purchase Plumbr subscription: https://app.plumbr.io/payment        *
*************************************************************************************

The banner indicates that your free trial has expired and the data on your account has been deleted.

If the trial demonstrated the value of Plumbr to you, then the way to keep using Plumbr is to switch to a paid subscription.
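The timeline shared by paid and trial accounts (monitoring stops on the expiry date, data is kept for 10 more days, then the account is deleted) can be written down as a small sketch. The names are hypothetical and the exact day on which deletion happens is an assumption; the manual only states that the data is available for 10 days.

```java
import java.time.LocalDate;

// Hypothetical illustration of the subscription timeline, not Plumbr code.
final class SubscriptionState {
    enum State { ACTIVE, EXPIRED_DATA_RETAINED, ACCOUNT_DELETED }

    // Retention period stated in the expiry banners
    static final int RETENTION_DAYS = 10;

    static State stateOn(LocalDate today, LocalDate expiresOn) {
        // Monitoring stops on the expiry date itself
        if (today.isBefore(expiresOn)) {
            return State.ACTIVE;
        }
        // Assumption: the account is deleted once 10 full days have passed
        if (today.isBefore(expiresOn.plusDays(RETENTION_DAYS))) {
            return State.EXPIRED_DATA_RETAINED;
        }
        return State.ACCOUNT_DELETED;
    }
}
```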

Miscellaneous checks

Besides the categories above, Plumbr Agent performs a number of other checks, which can also result in errors or warnings being printed to the JVM standard output.

Connecting to wrong Server

***************************************************************************************************
* The account does not exist at https://my-plumbr-server-installation:8080/.                      *
* Check plumbr.properties to make sure you are connecting to the correct Plumbr Server instance.  *
* If indeed so, contact support@plumbr.eu                                                         *
***************************************************************************************************

When facing the error above, the Agent is connecting to a Server using an accountId the Server is not aware of. This usually means you are connecting to an incorrect Plumbr Server. If this is the case, make sure the serverUrl in plumbr.properties points to the correct Server.

If that is not the case, contact support@plumbr.eu and let us figure out the source of the problem.

Lambda support in early Java 8 releases

*******************************************************************************************************
* There is a known issue with Java versions 1.8.0 - 1.8.0_31 where using Java agents                  *
* together with code that uses dynamic invocation (such as lambdas or dynamic languages)              *
* may cause segmentation faults. If these are not used in your application, your JVM may be safe,     *
* but for production sites we do not recommend using Plumbr with Java 8 versions older than 1.8.0_40. *
* To make sure this problem will not occur, either:                                                   *
*   a) Upgrade your Java version to 1.8.0_40 or newer                                                 *
*   b) If upgrading Java version is not possible, turn off JIT compilation for java.lang.invoke       *
* package by specifying -XX:CompileCommand=exclude,java/lang/invoke/ in your JVM startup script.      *
*******************************************************************************************************

When facing the warning above, you are running on an early Java 8 build. These builds are known to contain bugs that can affect your JVM when you use lambdas or dynamic languages together with any Java Agent attached to the JVM.

The application might work fine, but in order to make sure you will not run into any issues, please consider either

  • upgrading the Java version to 1.8.0_40 or newer
  • turning off JIT compilation for the java.lang.invoke package, as specified in the warning message.
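To find out in advance whether a JVM falls below the recommended build, the java.version system property can be inspected. The following is a minimal sketch with hypothetical names; it assumes the classic Java 8 version format such as 1.8.0_25 and treats any other format as not affected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Plumbr.
final class Java8UpdateCheck {
    // Matches java.version values such as "1.8.0", "1.8.0_25" or "1.8.0_31-b13"
    private static final Pattern JAVA_8 = Pattern.compile("1\\.8\\.0(?:_(\\d+))?(?:-.*)?");

    /** Returns true for Java 8 builds older than 1.8.0_40. */
    static boolean isOlderThanRecommended(String javaVersion) {
        Matcher m = JAVA_8.matcher(javaVersion);
        if (!m.matches()) {
            return false; // not a Java 8 version string
        }
        // A version without an update suffix is the initial 1.8.0 release
        int update = m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
        return update < 40;
    }
}
```

Calling Java8UpdateCheck.isOlderThanRecommended(System.getProperty("java.version")) inside the target JVM tells you whether one of the two options above applies.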

Native agent loading failure

*****************************************************************
* Native agent could not be loaded from                         *
* /users/me/plumbr                                              *
* This may be caused by missing read or execute permissions for *
* plumbr home directory or one of its subdirectories.           *
*                                                               *
* Check out https://plumbr.eu/support/agent-configuration       *
* for more information or contact support@plumbr.eu             *
*****************************************************************

When facing the error above, the native agents located in the lib/ folder next to the Agent’s plumbr.jar file are not readable or executable by the user launching the JVM Plumbr is attached to.

To solve the problem, make sure the user launching the JVM that the Plumbr Agent is attached to has read and execute permissions for the lib/ folder and its subdirectories.

In all honesty, this is one of the cases where we do not fully understand how the situation can arise. So if you are facing it, we would really appreciate it if you could contact support@plumbr.eu so we can understand how this permission issue comes about.

Proxy credentials missing

**********************************************************************************
* The Proxy server at your-proxy-server.ip:3039 is requesting a username and a password.           *
* Please add them to plumbr.properties file. You can find the instructions here: *
* https://plumbr.eu/support/manual#network-configuration                         *
**********************************************************************************

When seeing the banner above in your JVM standard output, you are trying to connect from the Agent to the Server using a proxy server. The proxy server requires authentication, but the configuration you provided in plumbr.properties does not contain a username and password.

To solve the issue, provide the correct username and password in the Plumbr configuration to access the proxy.

Multiple JVMs using the same jvmId

*************************************************************************************************************************
* The JVM ID “myjvmid" is already in use. This happens when multiple JVMs are connecting                                *
* to Plumbr Server using the same jvmId configuration parameter specified in plumbr.properties file.                    *
* In order to solve the problem, download new Plumbr agent from here: https://app.plumbr.io/settings/download-center *
* and make sure the agent location in the new JVM refers to a different Plumbr installation in file system.             *
*************************************************************************************************************************

When facing the error message above, the Plumbr Agent was not started because another JVM is already connected to the Server using the same jvmId as specified for the rejected JVM.

This can happen when you have copied the Plumbr installation used by one JVM and are using it for the second JVM. The jvmId specified in the plumbr.properties file must be unique, so to solve the issue you would need to make sure all the JVMs you want to monitor with Plumbr have a unique jvmId specified in the configuration (either in plumbr.properties or passed as -D parameter).

Multiple JVMs accessing the same Plumbr installation

**************************************************************************
* Working directory is locked. This might happen when you launch several *
* applications from the same plumbrHome at the same time.                *
* In order to solve the problem, download new Plumbr agent from here:    *
* https://app.plumbr.io/settings/download-center                      *
* and make sure the agent location in the new JVM refers to a different  *
* Plumbr installation in file system                                     *
**************************************************************************

When you encounter the error above, you are trying to launch two JVMs that both load the Plumbr Agent from the same location in the filesystem. Note that each JVM monitored by Plumbr must use its own Plumbr installation.

To overcome the issue, create a separate Plumbr Agent installation for each JVM monitored and load the javaagent from different locations for each JVM monitored.

Data backup & restoration: On Premises

Plumbr Server stores all persistent data in PLUMBR_SERVER_HOME/data folder on the host server running docker containers. We have provided a sample backup script called “backup.sh” that can be used to preserve the most important data. Just run it periodically (e.g. via cron job) as follows:

cd $PLUMBR_SERVER_HOME

./backup.sh /my/backup/destination

Please note that only the final aggregated data presented in the Plumbr Server UI is preserved. Raw probe data sent by Plumbr Agents, as well as all intermediate partially processed data, is not backed up.

In order to restore a Plumbr Server installation after a disaster or relocation to another server, do the following:

  • Install Plumbr Server on the new server as described in the Plumbr Server Installation Manual
  • Run Plumbr Server and wait for 5 minutes until it creates the required structures, both internally and on the file system in the data subfolder
  • Run the provided script “restore.sh”

Agent API

Introduction

Plumbr Agent API enables programmatic control over:

  1. Service naming (see description of a service for more details)
  2. Transaction boundary definition (see definition of a transaction for more details)

The following guide describes how to install the API dependency and how to use it.

You can look at the fully functional demo at https://bitbucket.org/plumbr/agent-api-demo.

Installation

To start using the Plumbr Agent API, agent-api.jar must be added as a dependency to your project. When running the application without the Plumbr Agent attached, all calls to the library will be silently ignored without any performance impact. When the Agent API library sees the attached agent, it will perform the requested integration calls.

The Agent API is published on Bintray and Maven Central.

To add the dependency, copy and paste the suitable snippet for your build system from the respective Bintray or Maven Central page.

To use the library in the code, the following import must be added to your source file:

import eu.plumbr.api.Plumbr;

Service Naming

The Plumbr Agent supports a number of web frameworks for service name identification.
However, there are cases where the default service name extraction is not enough and
Plumbr detects only one or two services in your application, typically the default
root application URLs. In such a case, the Agent API can be used to better distinguish
between services.

In a regular servlet-based web application, transaction boundaries are
correctly identified by Plumbr and only the service name needs to be set at the right
place in the code (usually a central routing class that extracts the service
name from request parameters). To set the service name for the current transaction,
call:

Plumbr.setServiceName("service name");

After the service name is set, it cannot be redefined.
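Because the name cannot be changed afterwards, it helps to compute the final value first and call the API exactly once. The sketch below shows one way a routing class might derive the name; the request parameter called action and the fallback value are purely hypothetical, and the Plumbr call is shown as a comment so that the example compiles without agent-api.jar.

```java
import java.util.Map;

// Hypothetical helper for a central routing class.
final class ServiceNames {
    /** Derives the service name from the (hypothetical) "action" request parameter. */
    static String resolve(Map<String, String> requestParameters) {
        String action = requestParameters.get("action");
        // Fall back to a fixed name when the parameter is absent or blank
        return (action == null || action.trim().isEmpty()) ? "unknown" : action.trim();
    }
}

// In the routing class, with agent-api.jar on the classpath:
//   Plumbr.setServiceName(ServiceNames.resolve(parameters));
```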

Transaction boundary definition

If you monitor an application where Plumbr does not detect transactions at all, or where transactions do not start and end at the default integration points (i.e. the start and end of HTTP request processing), you can define transaction boundaries manually. This is done as follows:

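try {
  // Open the transaction and name the service it belongs to
  Plumbr.startTransaction("Service name");
  // ... application code executed within the transaction ...
  Plumbr.endTransaction();
} catch (Exception e) {
  // Mark the transaction as failed so that it does not stay open
  Plumbr.failTransaction();
}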

NB! Pay attention to possible exceptional cases, so that the transaction is
always ended. Forgetting to end the transaction will introduce a memory leak.
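One way to make sure every started transaction is closed, even when the code throws something other than an Exception, is to put the bookkeeping into a small wrapper with a finally block. The sketch below is an illustration only: the Api interface is a hypothetical stand-in for the static Plumbr.startTransaction, Plumbr.endTransaction and Plumbr.failTransaction calls, introduced so that the example compiles without agent-api.jar.

```java
import java.util.concurrent.Callable;

// Hypothetical wrapper, not part of the Plumbr Agent API.
final class Transactions {
    /** Stand-in for the three static Plumbr transaction calls. */
    interface Api {
        void startTransaction(String serviceName);
        void endTransaction();
        void failTransaction();
    }

    /** Runs 'work' inside a transaction that is always either ended or failed. */
    static <T> T run(Api api, String serviceName, Callable<T> work) throws Exception {
        api.startTransaction(serviceName);
        boolean ended = false;
        try {
            T result = work.call();
            api.endTransaction();
            ended = true;
            return result;
        } finally {
            // Reached on any exception or error: close the transaction as failed
            if (!ended) {
                api.failTransaction();
            }
        }
    }
}
```

Unlike the catch block above, this wrapper lets the original exception propagate to the caller after the transaction has been marked as failed.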