The onshore operating room and computers that learn

This presentation is geared towards those seeking to implement new data analysis methods in an onshore operating room. Analysis of real-time drilling data has traditionally been carried out by visual inspection and by comparison with calculated predictions and trend lines. In recent years, the industry has sought to complement this with novel analysis methods. Referred to variously as data mining, intelligent agents, machine learning, soft computing, learning algorithms or artificial intelligence (AI), these methods seek to extract more value from the real-time data stream, "learn" the well's behaviour by themselves, or relieve the human operator of some of the more mundane monitoring tasks. While potentially of high value, these methods may be perceived as unproven or as having unpredictable performance. Here we present some simple implementation templates that improve the robustness of such methods in a real-world drilling operation, along with ways of combining them with existing methods of analysis.

The data analysis workflow

We'll assume that you're monitoring the drilling operation to get an early warning about upcoming problems. A full monitoring system will include several steps of data analysis. Granted, it is sometimes possible to cut to the chase and detect drilling problems directly from the raw data stream, but noise and the challenge of telling different effects apart usually mandate several processing steps. These include:

  • Filtering, to separate out the important aspects of the data.

  • Feature extraction, for aggregation of data.

  • Anomaly detection, typically performed by detecting a deviation between the measurements and the values expected from a model.

  • Fault identification, which identifies the cause of the anomaly, often involving pattern recognition methods.


Every operating room will perform some anomaly detection, if only by visually checking that a curve on a plot does not deviate beyond some given threshold value. Commonly, that curve will be expected to match some prediction, like how the bottom hole pressure is expected to rise as we drill further down. An "anomaly" is then not that the pressure changes, but that it starts to deviate from a curve calculated in the well's planning phase, or by a real-time drilling model. The difference between the measured and the predicted pressure is referred to as the "residual". As the real pressure, flow rate and temperature depend on many factors, it can be challenging to produce a good prediction to compare the measurements against. Some therefore turn to the novel data analysis methods, which can "learn" to produce better real-time predictions. This is the use case we'll be covering here. In control engineering parlance, we are treating the intelligent agent or AI as just another "state observer".
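The residual check described above can be sketched in a few lines. This is a minimal illustration only: the function name residual_alarm, the depth range, the planned pressure curve and the 2 bar threshold are all hypothetical values chosen for the example, not taken from any operational system.

```python
import numpy as np

def residual_alarm(measured, predicted, threshold):
    """Return the residual and a boolean alarm flag for each sample."""
    residual = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    return residual, np.abs(residual) > threshold

# Synthetic example: pressure rises with depth as planned, then drifts away.
depth = np.linspace(1000.0, 1100.0, 11)       # m, hypothetical
predicted = 100.0 + 0.1 * (depth - 1000.0)    # bar, hypothetical planned curve
measured = predicted.copy()
measured[8:] += 3.0                           # deviation in the last three samples

residual, alarm = residual_alarm(measured, predicted, threshold=2.0)
print(alarm)
```

Note that the rising pressure itself raises no alarm; only the samples where the measurement departs from the prediction are flagged.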


Simplifying the problem using a greybox 

Physical models built from first principles are often referred to as "white boxes" in contrast to the perceived "black box" of machine learning algorithms such as neural networks. The term "greybox" is meant to signify a combination of these two approaches. 

Imagine for instance that you are monitoring the pit volume. The simplest approach is to assume that the pit volume will stay constant unless we lose mud to the formation or take a kick. But this will frequently produce a large residual, as the net flow of mud is affected by elongation of the hole, displacement of mud by the drill string, mud compressibility and even thermal expansion when the mud is heated. Calculating these effects requires modelling of heat transport and fluid dynamics, so you may find yourself comparing the measurements against the output from an advanced real-time drilling simulator. This will reduce the residual, making any real kick or loss stand out more clearly, but there will still be effects that are unaccounted for. At this point, perhaps you consider the novel approaches mentioned earlier. Could these systems simply learn to predict the effects by being trained on old time series? In practice, the amount of training data needed to accomplish that scales steeply with the number of effects considered. It is possible to exhaust your database of past wells while training a neural network and still not perform better than traditional models. If the effects are particular to the well or the machinery you are using right now, data from past wells may not even be relevant.

A simple trick to overcome this problem is to train your learning algorithm of choice not on the raw data, but on the residual. As the traditional models have here already removed a lot of effects, the residual is a lot "simpler", requiring less training data to learn. That is:

  1. Compute prediction from traditional model

  2. Define Residual1 = Measurement - Prediction from traditional model 

  3. Train learning algorithm to predict residual

  4. Define Combined prediction = Prediction from traditional model + prediction of residual from learning algorithm

  5. Define Residual2 = Measurement - Combined prediction

  6. If Residual2 is smaller in magnitude than Residual1, the combined prediction is more accurate than the original.

In short, the traditional model predicts what it can and the learning model is left to focus only on the effects that remain unexplained. Many variations on this greybox workflow exist; the common denominator is splitting the raw data into an explained and an unexplained part.
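The six steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic, the input names pump_rate and mud_temp are hypothetical, the function traditional_model is a stand-in for a real first-principles model, and ordinary least squares is used as the learning algorithm purely to keep the example short. Any other learner could take its place.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
pump_rate = rng.uniform(1000.0, 2000.0, n)    # hypothetical input
mud_temp = rng.uniform(20.0, 60.0, n)         # hypothetical input

def traditional_model(pump_rate):
    # Stand-in for a first-principles model: captures the pump-rate effect only.
    return 0.9 * pump_rate

# Synthetic "truth": the traditional model misses a temperature effect.
measurement = 0.9 * pump_rate + 2.0 * mud_temp + rng.normal(0.0, 1.0, n)

train, test = slice(0, 300), slice(300, n)

# Steps 1-2: prediction and residual of the traditional model.
prediction = traditional_model(pump_rate)
residual1 = measurement - prediction

# Step 3: train a learner on the residual (ordinary least squares here).
X = np.column_stack([np.ones(n), pump_rate, mud_temp])
coef, *_ = np.linalg.lstsq(X[train], residual1[train], rcond=None)

# Steps 4-5: combined prediction and its residual.
combined = prediction + X @ coef
residual2 = measurement - combined

# Step 6: compare residual magnitudes on data not used for training.
mae1 = np.mean(np.abs(residual1[test]))
mae2 = np.mean(np.abs(residual2[test]))
print(f"traditional: {mae1:.2f}, combined: {mae2:.2f}")
```

The comparison in step 6 is made on held-out samples, since a learner will nearly always reduce the residual on the data it was trained on.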


In a test we studied the residual in mud flow rate predictions. We used a real-time flow model as the traditional model and an "echo state network" as the learning algorithm. Using the greybox setup presented here, the algorithm was able to correct both for a delay in flow as the pipes on the rig drained into the mud pit tank and for a bug in the flow model that affected the calculated flow rate. With a pit gain alarm system triggered by a 3 bbl deviation, the greybox approach was able to reduce the number of false alarms by 25% (SPE 112212).

This shows an aspect of the greybox approach which is very appealing from an operational perspective. You may have an existing complex model in use, but without interfering in that model, you can put a learning algorithm on top and get an immediate return on the time you've invested in learning algorithms.


Ensemble prediction, two heads are better than one

Learning algorithms have a reputation for behaving in an unpredictable fashion. Their performance depends on the data they are trained with, and the examples in this data set also define the area of applicability of the trained algorithm. And like their counterparts in first-principle models, they too are sensitive to noise and errors in the input. A way to handle this is to set up several different algorithms and draw a consensus from these, like a committee of experts. We'll here outline the success criteria for forming a good committee of learning algorithms.


If we want to measure something, we know that averaging several independent measurements will give us a more precise estimate than one measurement alone. But there may be some "systematic bias" to our measurements, so we may use a set of different measurement devices or even have several different people do the measurements, to get an estimate that's even closer to the true value. The same thinking can be applied to computer models. Both meteorology and reservoir modeling employ "ensemble" methods, where several parallel models each represent different but plausible representations of reality. 


"Ensemble learning" extends this thinking to learning algorithms. Assume that we want a good estimate of bottom hole pressure. We can initialize several models or learning algorithms and take the average of their output as the "ensemble estimate". Ten identical copies of a model would provide ten identical estimates, which would obviously not be an improvement. To make ensembles of learning algorithms work, we must induce "diversity" into the ensemble set. This can be achieved in the following ways: 

  • Training set diversity: Feed different subsets of the available training data to each ensemble member. Also known as "bootstrap aggregating" or "bagging". Like a group of experts with a diverse set of backgrounds and experiences, the ensemble here represents a multitude of viewpoints. It may seem counterintuitive that we get an improvement by feeding less data to each ensemble member, but another way of looking at it is that one ensemble member alone cannot internally represent all the information in the training set. By dividing it up between the ensemble members, possibly with some overlap, the ensemble as a whole contains a fuller representation of the training set. Training set diversity is the most common way to induce diversity.


  • Algorithm diversity: Configure the learning algorithms in different ways, like varying the number of neurons in the hidden layer of a neural network, or use several entirely different algorithms.

  • Input diversity: Predicting the bottom hole pressure (BHP) involves several variables, but the full set is not always needed. You can for instance make a good guess of the BHP from the standpipe pressure alone, but given a favorable set of other variables, you can also dispense with it altogether. Input diversity, also known as the "random subspace method", simply means giving each learning algorithm a different subset of the multivariate data stream.


A combination of training set and input set diversity is a good choice for ensembles used in drilling. Data quality is a big issue in drilling and we may be able to trust neither real-time nor recorded data. With training set diversity you ensure that a section of bad historical data does not "taint" every ensemble member, while input diversity guards against malfunction when real-time data breaks down. If for instance the standpipe pressure (SPP) signal drops out or becomes noisy, there will be some ensemble members that remain unaffected because they do not take SPP as input. Control engineers may recognize a similar setup from alarm systems. State observers can be set up to take input from only a subset of the available sensors. By looking at which state observers succeed or fail to model the system, sensor faults can be singled out.
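A combination of training set and input diversity might be set up along the following lines. This is a sketch under stated assumptions: the three input signals (spp, flow, choke), the synthetic target and the helper names fit_member and predict_member are hypothetical, and each member is a simple least-squares fit rather than a neural network. The last part simulates an SPP dropout and checks which members are unaffected.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Hypothetical input columns: 0 = spp, 1 = flow, 2 = choke
X = rng.normal(size=(n, 3))
bhp = X @ np.array([1.0, 0.5, 0.8]) + rng.normal(0.0, 0.1, n)  # synthetic target

def fit_member(X, y, rows, cols):
    """Least-squares member trained on a row sample and a column subset."""
    A = np.column_stack([np.ones(len(rows)), X[np.ix_(rows, cols)]])
    coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
    return cols, coef

def predict_member(member, X):
    cols, coef = member
    return coef[0] + X[:, cols] @ coef[1:]

subsets = [[0, 1], [0, 2], [1, 2], [0, 1, 2]]    # input diversity
members = []
for cols in subsets:
    rows = rng.choice(n, size=n, replace=True)   # training set diversity (bootstrap sample)
    members.append(fit_member(X, bhp, rows, cols))

# Simulate a standpipe pressure dropout in new real-time data.
X_live = rng.normal(size=(5, 3))
X_fault = X_live.copy()
X_fault[:, 0] = 0.0

unaffected = [
    bool(np.allclose(predict_member(m, X_live), predict_member(m, X_fault)))
    for m in members
]
print(unaffected)
```

Only the member trained without the SPP column produces the same output before and after the dropout, which is the property that lets the ensemble keep working and also hints at which sensor has failed.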


At this point, it is worthwhile to clarify a distinction between "ensembles" of traditional models and of learning algorithms. An ensemble of models, such as in an ensemble Kalman filter or a Monte Carlo simulation, is a sample from a probability distribution that represents our uncertainty about the system state. It is also an exploration of the parameter space, ensuring that we're not stuck in a local optimum when tuning our model. With the ensemble Kalman filter, the models are updated and replaced as new real-time information becomes available. In contrast, the learning algorithm ensemble we have described above has a fixed set of members which are not tuned after the initial training. They are not dynamically converging on a value like in the ensemble Kalman filter, but instead serve as checks and balances against each other's weak points.


With this difference in mind, one can see that different instances of the same first-principle physical model would not make a good ensemble. If e.g. mud density and formation temperature are uncertain, one may intuitively initialize several models with different likely densities and temperatures, centered on a best guess. The ensemble now samples a probability distribution, and the ensemble average turns out to be what you'd get from a Monte Carlo simulation and probably close to what you'd get by applying your best guess. You get none of the robustness or diversity and none of the real-time tuning of Kalman filters. (An exception is ensemble boosting methods like AdaBoost.R, which we are not covering.)
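A small numerical illustration of this point, under simplifying assumptions: the "physical model" here is just the hydrostatic pressure of a static mud column, and the depth, best-guess density and its spread are hypothetical numbers. Because every ensemble member is the same model with a slightly different density, the ensemble average lands very close to the single best-guess prediction.

```python
import numpy as np

rng = np.random.default_rng(2)
g = 9.81                        # m/s^2
tvd = 3000.0                    # m, hypothetical true vertical depth
best_guess_density = 1500.0     # kg/m^3, hypothetical

def hydrostatic_pressure_bar(density):
    # Pressure of a static mud column, converted from Pa to bar.
    return density * g * tvd / 1e5

# Same model, many plausible densities centered on the best guess.
densities = rng.normal(best_guess_density, 30.0, size=1000)
ensemble_mean = hydrostatic_pressure_bar(densities).mean()
best_guess = hydrostatic_pressure_bar(best_guess_density)

relative_gap = abs(ensemble_mean - best_guess) / best_guess
print(f"ensemble mean {ensemble_mean:.1f} bar, best guess {best_guess:.1f} bar")
```

The spread of the ensemble is still useful as an uncertainty estimate, but the average adds little beyond the best guess, since all members share the same structural weaknesses.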


There is a loophole here, however. If you include your physical model together with several learning algorithms, the physical model adds to the algorithm diversity of the ensemble set. This creates a nice synergy, as the physical model is probably the most accurate of them all, while the ensemble set as a whole provides more robustness. This setup was explored by the IO center in SPE paper 150201. Using prediction of BHP during a managed pressure drilling (MPD) operation as a case study, we combined two physical models with widely different underlying algorithms together with two learning algorithms. As the ensemble output we chose the median value of the four predictions. The figure below shows the results.


Top: Four independent models (red curves) try to predict the true pressure (blue) in a well during drilling.
Bottom: Taking a consensus among the four models results in a new prediction that is better than any of the original ones.

In the figure we see how each single model (red) fails some of the time, but as the models are distributed around the true value (blue), spikes and outliers are easily discounted, leading to a more accurate and less noisy prediction. Using the median value removes the spikes and large deviations that are typical in this case. In other cases the average value may be the preferred choice.
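The effect of choosing the median over the average can be illustrated with a few made-up numbers. The four prediction series below are synthetic and are not the data from the figure; two of them each contain a single large outlier.

```python
import numpy as np

true_pressure = np.full(6, 250.0)                 # bar, synthetic
predictions = np.array([
    [249.0, 251.0, 250.5, 249.5, 250.0, 251.0],
    [251.0, 249.0, 290.0, 250.5, 249.0, 250.0],   # one spike
    [250.5, 250.0, 249.5, 251.0, 210.0, 249.0],   # one dropout
    [249.5, 250.5, 250.0, 249.0, 251.0, 250.5],
])

# Consensus across the four models at each time step.
median_estimate = np.median(predictions, axis=0)
mean_estimate = np.mean(predictions, axis=0)

# Worst-case deviation from the true value for each consensus rule.
median_error = np.max(np.abs(median_estimate - true_pressure))
mean_error = np.max(np.abs(mean_estimate - true_pressure))
print(median_error, mean_error)
```

With only four members, a single outlier shifts the average by a quarter of its size, while the median ignores it as long as the remaining members agree.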



Learning algorithms offer an opportunity to improve real-time monitoring in the support centre, but many will be reluctant to implement algorithms that are not part of the tried and tested methodologies in drilling, or may find that the new methods appear no better than the most sophisticated existing models. We have outlined how you can set up your analysis in a way that makes the new methods improve on existing systems rather than compete against them. We have also shown how a diversity of new and old approaches can be combined to produce a system that is more robust to bugs and bad data quality.


Further reading

Ensemble learning - Scholarpedia

Improving management and control of drilling operations with artificial intelligence
Conference paper, SPE 150201, March 2012
Authors: Giulio Gola (Institute for Energy Technology), Roar Nybø and Dan Sui (SINTEF Petroleum Research AS), Davide Roverso (FirstSensing AS)

A grey-box approach for predicting the well bottom-hole pressure in pressure-controlled drilling operation
International Conference Paper COMADEM 2011.05.30
Authors: G. Gola, R. Nybø, D. Sui

Ensemble Kalman Filter Used During Drilling
SINTEF Report 11.10.2010
Authors: Dan Sui

Ensemble methods for process monitoring in oil and gas industry operations
Journal paper, Journal of Natural Gas Science and Engineering, Sept. 2010
Authors: D. Sui, R. Nybø, G. Gola, D. Roverso, M. Hoffmann

Anomaly detection and sensemaking in time series interpretation
Report, 01.12.2012
Author: Roar Nybø
