

How to Visualize Oracle Performance Data with the Elastic Stack

Jan 19 2017 — edited Feb 21 2017

Visualizing Oracle Performance Data with the Elastic (ELK) Stack

     By Robin Moffatt, Head of R&D, Rittman Mead. @rmoff

One of the great things about the Oracle database is the level of diagnostics and performance data that is available from it. Used in conjunction with good instrumentation from the applications generating the workload on the database, it’s a brilliant way - arguably, the only way - to accurately and efficiently get to the bottom of any performance problems that arise. By analysing what the database is doing at a point in time we can understand more about the load that it is under, and by looking at how a given session executes we can identify optimisation opportunities for it.

The data is available natively through some of the V$ system views, and exposed through Enterprise Manager and tools such as SQL Developer. What I’m going to show in this article is how you can make use of the open-source Elastic stack (also called ELK) - comprising Elasticsearch, Logstash, and Kibana - to extract this data and visualize it as an alternative to the aforementioned options.

Why the Elastic (ELK) Stack?

Because Kibana is a great data visualization and exploration tool. Because Elasticsearch is a great data store that’s dead easy to work with, and because Logstash makes getting data from a ridiculously large and varied number of sources an absolute doddle. It’s a stack I frequently use for monitoring and diagnosing applications including OBIEE, and including Oracle performance data in that analysis was an obvious thing to do.


Elasticsearch, Logstash, and Kibana are all open-source tools that are free to use. You can pay for a subscription that gives you direct access to product support, as well as additional tools for security and alerting.

The Oracle Diagnostics Pack is required if you’re using functionality in Oracle including (but not limited to) Active Session History (ASH) – if in doubt, speak to your Oracle Account Manager.


This article was written using Elasticsearch 2.2, Logstash 2.2, and Kibana 4.4, pulling data from Oracle 12c ( I used the excellent Swingbench to generate load against the database.


The Elastic stack gives you great flexibility to analyse exactly the data that you need, in exactly the way that you want to. In this article I’m going to show how to get Active Session History (ASH) data streamed into Elasticsearch. You might want to then enrich it further with log data from your applications, or OS metrics from the servers – or anything else that might be useful for a full-stack view of your application’s performance.

Using Logstash’s JDBC input, we pull data from the Oracle V$ACTIVE_SESSION_HISTORY view, loading the data directly into Elasticsearch. From there we can analyse it with Kibana. It really is as simple as that.


1. Choose your monitoring server. Elasticsearch and Logstash both run under the Java VM so are eminently portable across OSs.
    Kibana ships with binaries for Windows, Mac, and Linux.
    In this example, I’m running them on a Linux server.

2. Download the binaries:

     You’ll also need the Oracle JDBC driver which you should be able to get from here (or via a download link at the parent page here).

     Don’t forget a current JDK too, if you don’t already have one.

3.  Unpack the binaries and JDBC driver somewhere appropriate. In this example I use /opt.

4.  Optionally, install the (free) kopf monitoring plugin for Elasticsearch by running the following:

/opt/elasticsearch-2.2.0/bin/plugin install lmenezes/elasticsearch-kopf

5. Start Elasticsearch:

/opt/elasticsearch-2.2.0/bin/elasticsearch

You should see some output showing that Elasticsearch has started up successfully:

ELK Image1.jpg

If you Ctrl-C or close the terminal, you’ll kill Elasticsearch. For this - and a multitude of other purposes - I’d always use screen in order to run multiple sessions in a single window/SSH connection.

If you installed Kopf in step 4 above you can now go to http://localhost:9200/_plugin/kopf/#!/cluster to see the status of Elasticsearch:

ELK Image2.jpg

6. Start Kibana:

/opt/kibana-4.4.0-linux-x64/bin/kibana

After a moment go to http://localhost:5601/ and you should see a config screen like this:

ELK Image3.jpg

Getting the Data into Elasticsearch

So far, so easy. Download and run a handful of binaries. But, we’ve not got any data yet. Enter Logstash. Logstash enables you to pull in data from one or many types of source, filter and enrich it, and then output it to Elasticsearch and many other destinations.

Here we’re using the Logstash JDBC input plugin. Note that this is relatively new (in the open-source sense) and previously this was often done with a “River” - something now deprecated in Elasticsearch, but I mention it here as you may still find articles through Google that reference it.

To use Logstash we need to build a configuration file. We’ll build this up step by step to make sure it’s all working along the way.

First Steps with Logstash JDBC Input

All Logstash configuration files have the same simple structure: input, filter, output. And of those, filter is optional. Here’s our starter for ten, that’ll act as a smoke-test for the Logstash-JDBC-Oracle connectivity:

ELK Image4.jpg

The parameters in the input JDBC stanza are all pretty obvious - connection string, credentials, path to JDBC driver that you’ve installed, and the SQL to run.

For the moment we’re setting the output as simply back to the console, using some formatting provided by a codec (you don’t need to worry about this detail here - it just makes things easier to read).
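For reference, the configuration in the screenshot is along these lines - the connection string, credentials, and driver path here are illustrative placeholders that you’ll need to substitute with your own:

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:oracle:thin:@localhost:1521/orcl"
        jdbc_user => "system"
        jdbc_password => "welcome1"
        jdbc_driver_library => "/opt/ojdbc7.jar"
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        statement => "SELECT SYSDATE FROM DUAL"
    }
}
output {
    stdout { codec => rubydebug }
}
```

Note the Java:: prefix on the driver class - the Oracle driver needs it under Logstash’s JRuby runtime.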

Save this local to the server where you’re running ELK (here I’ve saved it to ~/logstash-ora-01.conf), and then execute it:

/opt/logstash-2.2.0/bin/logstash -f ~/logstash-ora-01.conf

After a few seconds of Logstash firing up, you should see the output including the SYSDATE that it’s pulled through from Oracle:

ELK Image5.jpg

Note that you’ve also got two internal Logstash fields in there – @version and @timestamp.

Polling Oracle

As you saw above, when we ran Logstash it ran the query once and then finished. We’re going to want to be polling the data out of Oracle on a continual basis in order to stream it into Elasticsearch. Let’s see how we can do that here.

As the documentation for the Logstash JDBC input plugin shows, a schedule can be specified in the parameters. The syntax matches that of cron. Our Logstash configuration file now looks like this, and will poll Oracle every two minutes:

ELK Image6.jpg
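The only change from the first configuration is a schedule parameter in the jdbc stanza - as a sketch, with the cron pattern for every two minutes:

```
input {
    jdbc {
        # ...connection and driver parameters as before...
        schedule => "*/2 * * * *"
        statement => "SELECT SYSDATE FROM DUAL"
    }
}
```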

When we run it (and wait patiently!) we’ll see that every two minutes it’s pulling new data from Oracle:

ELK Image7.jpg

Now that Logstash is running on a stream of input data we need to explicitly kill it to get it to stop – Ctrl-C will do the job here.

 ^CSIGINT received. Shutting down the pipeline. {:level=>:warn}
Logstash shutdown completed

Oracle to Logstash to Elasticsearch

We’re getting closer now to the final configuration that we need. Let’s see how Logstash can be configured to send the data we’ve been pulling from Oracle into Elasticsearch, instead of just the console on stdout. It’s really difficult, I’m warning you now. (For my friends across the pond - that was sarcasm. Sending data to Elasticsearch is ridiculously easy!)

We’ll use the same input stanza, but with one little extra line in the output:

ELK Image8.jpg

Blink and you’d miss it, right? That one little elasticsearch {} is all that it takes to send data to Elasticsearch, as well as still sending it to stdout. The Elasticsearch output plugin supports lots of parameters, but if you’re running a single Logstash instance locally to a single Elasticsearch node, that’s really all you need.
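As a sketch, the output stanza now looks like this:

```
output {
    # index into Elasticsearch on the local node, with all the defaults
    elasticsearch {}
    stdout { codec => rubydebug }
}
```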

If you now look in Kopf you’ll see that an “index” called logstash- followed by the current date has appeared. The number of “documents” should match however many entries you’ve got on the console – each “document” is one record sent from logstash to Elasticsearch. In RDBMS terms a document is roughly a row, and an index roughly a table.

You’ll also see in Kopf that the top bar has turned from Green to Yellow, indicating the health of Elasticsearch. It’s gone Yellow as a warning, because you’re now storing data (which you weren’t before, when it was green) and you’ve only got a single node so no redundant copy of it for resilience. Elasticsearch, as the Elastic part of the name implies, scales horizontally by design and doing so is a doddle. Provisioning extra nodes in the cluster is pretty much a case of just installing the binaries and updating the configuration to use the same cluster name - as soon as you fire them up Elasticsearch will automagically rebalance the data so that there are redundant copies – at which point the cluster health goes back to green.

Did Someone Say Oracle Performance Data?

So far we’ve just been proving out the connectivity. Now let’s actually, finally, start pulling in some of the good stuff. Oracle’s V$ACTIVE_SESSION_HISTORY view holds a sample every second of what has been running on the database at that point in time. ASH, as it’s known, is one of the best ways to analyse the queries that have been running and what the database was doing whilst running them.

We’ll use the SAMPLE_TIME column on V$ACTIVE_SESSION_HISTORY as our predicate, otherwise we’ll pull back the whole contents of the table each time. The JDBC plugin can store on disk when it last ran, and it makes this available as a run-time variable to pass into the SQL. The SQL statement looks like this:
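Reconstructed from the description above (the exact text may have differed in the original screenshot), the statement is along these lines:

```
SELECT * FROM V$ACTIVE_SESSION_HISTORY WHERE SAMPLE_TIME > :sql_last_value
```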


Note that between Logstash 2.1 and 2.2 the name of this variable changed (along with some of the associated plugin parameters), from sql_last_start to sql_last_value.

Now we’re going to introduce a new stanza to the configuration – a filter. Logstash filters are basically pipeline processing functions - they might enrich the data, conditionally drop it, manipulate it, and so on. Here we’re just doing a straightforward field assignment, setting the date of the event (the @timestamp field that you can see in the sample output above) to that of the SAMPLE_TIME in the ASH data, rather than the current time of processing. By default Logstash will record each event that it processes (so in our case, a row of ASH data) as having occurred at the current time. If we’re streaming data from a log file, that’s probably true, but when we’re polling data (or back-populating historical data) it is important to assign the actual time to it. You’ll know if you’ve not done it when all your historical data shows up in Elasticsearch and Kibana with today’s date…

Note that whilst SAMPLE_TIME is a timestamp field, it can’t be directly assigned - we have to cast it to string first and then pattern match it:

mutate { convert => [ "sample_time" , "string" ] }
date { match => [ "sample_time", "ISO8601" ] }

If you don’t cast it first, you’ll get the error cannot convert instance of class org.jruby.RubyObject to class java.lang.String. Thanks to javadifateme on GitHub for this workaround. Watch out also that field names are case sensitive.

The completed configuration looks like this - note that the stdout is commented out, because there’s going to be a lot of data flowing and dumping it to the console slows things down somewhat. For debugging, though, it’s useful to leave it in.

ELK Image9.jpg
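A reconstruction of that completed configuration, with placeholder connection details as before:

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:oracle:thin:@localhost:1521/orcl"
        jdbc_user => "system"
        jdbc_password => "welcome1"
        jdbc_driver_library => "/opt/ojdbc7.jar"
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        schedule => "*/2 * * * *"
        statement => "SELECT * FROM V$ACTIVE_SESSION_HISTORY WHERE SAMPLE_TIME > :sql_last_value"
    }
}
filter {
    # set the event timestamp to the ASH sample time, not the time of processing
    mutate { convert => [ "sample_time" , "string" ] }
    date { match => [ "sample_time", "ISO8601" ] }
}
output {
    elasticsearch {}
    # stdout is useful for debugging, but with this much data it slows things down
    # stdout { codec => rubydebug }
}
```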

The first time that you run this, it’s going to take a while, because the “last_run” value doesn’t yet exist (since we’ve never run it) and defaults to 1/1/1970. If you don’t want a full dump of ASH loaded, you could probably write a crafty CASE clause into the SQL to cater for this, or pre-create the last-run metadata file on disk - it looks like this:

--- 2016-02-02 23:02:06.470000000 Z 

Data Cleansing

When you run logstash you may well get a screenful of errors like this:

Failed action.  {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2016.02.02", :_type=>"logs", :_routing=>nil}, #<LogStash::Event:0x566d7547 @metadata_accessors=#<LogStash::Util::Accessors:0x95620e @store={}, @lut={}>, @cancelled=false, @data={"sample_id"=>#<BigDecimal:3c4cfaaa,'0.17901126E8',8(12)>, "sample_time"=>"2016-02-02T18:37:00.680Z", [...]

and on Elasticsearch’s console too:

[2016-02-02 23:38:15,716][DEBUG][action.bulk              ] [Ariel] [logstash-2016.01.30][3] failed to execute bulk item (index) index
MapperParsingException[failed to parse [force_matching_signature]]; nested: JsonParseException[Numeric value (12852529615260526791) out of range of long (-9223372036854775808 - 9223372036854775807)
   at [Source:; line: 1, column: 334]];        
          at org.elasticsearch.index.mapper.FieldMapper.parse(
          at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(

What’s happening here is that Logstash is sending data over to Elasticsearch in a data type that Elasticsearch isn’t happy with, hence Numeric value out of range. Looking closely at the Elasticsearch error we can see the field name that it’s not happy with:

failed to parse [force_matching_signature] [...] 

and since we’re not interested in storing it, we’ll add another filter clause to just chuck it away:

# Drop "force_matching_signature" because its value sometimes blows the limits of the long data type 
mutate { remove_field => [ "force_matching_signature" ] }

I guess if it was a field that we wanted to retain it could be cast to a string for storage (or on the fetch from Oracle). I found that p3 also trips up on the same, so dropped it too. This does give me the opportunity to mention a great feature of Elasticsearch - schemas (“mappings”) can evolve as needed, and if we want to start sending more or fewer columns from Logstash to Elasticsearch we can do just that, and Elasticsearch won’t bat an eyelid. In general, the SELECT * [...] is a bit of a lazy way to pull data out; much neater would be to build up a properly specified list of columns that we want, and only the ones that we want.

So What Have We Got Now?

Having set Logstash running loading ASH data into Elasticsearch, you should see the volume of data increasing, as shown by kopf:

ELK Image10.jpg

Each row from ASH is one “document” in Elasticsearch, and Logstash will by default partition data in Elasticsearch by date. In the above screenshot there are two days’ worth of data, and so two indices (“logstash-2016.01.30” and “logstash-2016.02.02”). The date partitioning is another reason the timestamp field that we set in the filter clause above is important. Even though, as we’ll see shortly, Kibana can work with any timestamp field, we want to make sure that Elasticsearch is using the appropriate partitioning key - event timestamp.

Analytics on ASH - Enter Kibana

So far we’ve collected the ASH data from Oracle with Logstash, and stored it in Elasticsearch. Let’s see now how we can do some cool visualisations and analytics on it, using the third part of the ELK stack – Kibana.

Setting up Kibana

The first time you use Kibana (or add a new set of data to it in different indices) you need to tell it what the index, or index pattern, is. Here we benefit from the tight integration within the ELK stack, because Logstash by default writes to a partitioned index called logstash-[...], and Kibana by default reads data from an index called the same.

Launch Kibana in your favourite web browser by going to http://localhost:5601 (change localhost for the server on which you’re running Kibana), and you should see the Configure an index pattern screen as seen above. Leave the Index name as logstash-*, and in the Time-field name list, pick @timestamp (remember that in the Logstash configuration we’ve already mapped this directly to sample_time):

ELK Image11.jpg

Click Create, and Kibana will then list out all of the fields that it can find in the Elasticsearch index:

ELK Image12.png

From here we’re going to explore the functionality that Kibana offers. It’s split up into three main areas – Discover, Visualize, and Dashboard. Dashboards are a compendium of one or more visualisations and searches (built with discover), so really it’s only two that we need to learn!


Discover

ELK Image13.jpg

This gives you access to the raw ‘documents’, viewing all the field values, and searching and filtering. The bar chart at the top of the page shows a count by time period of documents. The longer the time window shown, the greater the time period per bar.

ELK Image13a.png

By default just the timestamp of an event and the whole document (_source) are shown, which is of limited use if we’ve got lots of fields of useful data. By hovering over a field on the left you can opt to add or remove it from the columns in the table shown:

ELK Image14.jpg

You can do neat things like see a summary of the proportion of occurrences of each value within a field (based on a sample):

ELK Image15.jpg

and from there opt to show only those values, or to exclude them:

ELK Image16.jpg

If you click on the toggle arrow next to a particular document the full contents is displayed, which can be a useful way to examine a given record. From here you might identify fields that may be useful for data discovery and that you therefore want to display as a column in the table, as well as data values that you want to specifically show-only, or exclude, from the table by adding a filter.

ELK Image17.jpg

To save the view of the records (in terms of fields/columns shown, and filters applied), click the Save Search icon in the top right of the screen.

ELK Image18.jpg


Visualize

Let’s head on over to Visualize now, where we can build useful aggregations on top of the data to help with our analysis of it.

ELK Image19.jpg

We’ll start with an Area Chart:

ELK Image20.jpg

ELK Image21.jpg

Under buckets click X-Axis and set the Aggregation to Date Histogram. You can specify a time interval if you want, or leave it as Auto. Click the Apply button (the big green triangle).

ELK Image22.jpg

You should see something similar to the bar chart in the Discover page that we saw previously. It’s a count of the number of documents, per time period.

ELK Image23.png

So here we’re seeing the number of active sessions on the database, summarised up per time period.

Let’s add a dimension to the data, and look at the SESSION_STATE within these sessions. To do this click on Add sub-buckets, Split Area, and set the Sub-Aggregation to Terms. Click the Field dropdown and set it to session_state.raw (note the raw suffix) and hit the apply button again.

ELK Image24.png

OK, this is starting to look interesting. How about a third dimension - MODULE. This is one of those fields that any good application developer should be setting in order to improve the instrumentation, and therefore the diagnostics, for their application. Read more about why this is so important in Cary Millsap’s slidedeck here. Anyway, here it is thrown into our Kibana visualization of ASH data:

ELK Image25.png

To build this, I added a further sub-bucket, using the Split Chart option, and a Term aggregation on the module.raw field.

Now save the visualization (top right of the screen), because we’re going to use it shortly in a Dashboard.

Wondering why I’m using these .raw fields? These are to do with the powerful full-text search that Elasticsearch (it’s even in the name, right?) is so good at. By default Elasticsearch will “analyse” string fields, and when you opt to aggregate against them it will do so against the terms it finds within them. So “Update Customer” might become two separate aggregations, “Update” and “Customer”. This is useful in lots of cases, but not here. On their own, “Update” and “Customer” are complete nonsense. Hence, use the .raw version of the fields, which under the covers are stored as “not analysed”. This is all done automagically for you in the default Logstash/Elasticsearch configuration, but you’ll want to look into index templates if you start to go off-road with this.

Once you’ve got the hang of the Kibana interface you’ll find it a doddle to start properly slicing & dicing data to suit what you’re looking for. Here’s another example of a visualization I put together, and it’s so easy to do it only took a couple of minutes:

ELK Image26.png

I’ve always scorned pie charts, but I think this one of ASH data is genuinely an appropriate use of the visualization. It shows what the user’s doing (module/action), and what the database is doing (session_state/wait_class/event). Perhaps as a standalone analytical visualization it still falls foul of the problem of pie charts (the human eye has difficulty perceiving the relative proportions of an area in a circle), but as we’ll see in a moment, visualizations can be interactive filters, and combined with the tooltips, I think this pie gets to stay…


ELK Image27.jpg

Dashboards

Standalone ‘Discover’ and ‘Visualize’ explorations can sometimes be enough in themselves to answer the questions you have for the data, but typically you’ll want to pull several of them together. You might want a “dashboard” in the monitoring sense, or maybe just a way to show the detail rows of data for a given visualisation.

From the Dashboard tab click the + icon in the top right of the page. From here you’ve got two tabs enabling you to select which visualisations and searches (from the Discover page) you want to include:

ELK Image28.jpg

When you add them in they’ll first show up as small boxes which may well be too small to show the intended contents. Simply hover over the bottom-right corner of any of the boxes to resize (notice how the rest of the boxes rearrange themselves to make space - neat!). In no time at all, you can put together something like this:

ELK Image29.png

And now the really neat bit - as I mentioned briefly before, the dashboard is interactive. You can hover over elements to see details about that data:

ELK Image30.png

What’s more, you can click on any element such as a bar chart or pie chart segment to apply a filter for that segment of data. Here we can show all ASH records related to the identified module, action, and wait state:

ELK Image31.png

Note the filters at the top of the screen - you can toggle these to change the view on the data. For example, to show all ASH records for the given module and action but regardless of state etc:

ELK Image32.png

All the visualizations and data tables on the page show data as specified by the filters. This is a very intuitive, powerful, and flexible way of analysing your data!


Summary

The ELK stack is a fantastic way to quickly ingest data, store it, and analyze it. It’s so simple to install and get started with, and all of this can be done from a simple install on a laptop, or scaled up onto proper servers for production use with clustering for both resilience and performance.


Further Reading

Examples of ELK in action for both systems monitoring and analytics, including against data in Hadoop:

Slides: “Data Discovery and Systems Diagnostics with the ELK stack” here.

Video: Yorkshire Database (YoDB) user group in 2015.

About the Author:

Robin Moffatt is Head of R&D at Rittman Mead, and an Oracle ACE. He specialises in OBIEE and Linux as well as ODI, more recently delving into the worlds of Kafka, Hadoop, and Elasticsearch. His particular interests are data and analytics, systems architecture, performance testing and optimisation. He blogs and can be found tweeting grumpy geek thoughts as @rmoff.
