Probabilistic approach to benchmarking three numerical simulators using the Egg Model


A large number of the decisions taken by oil & gas companies rely on reservoir simulators to project resources and hence to plan infrastructure, production, wells and operations. These simulators integrate rock physics and fluid dynamics to give a more realistic representation of the reservoir and its response during field development. The inputs vary but mostly rely on geological, seismic, exploration and production data so that the models are as accurate as possible.

Today more than ever, we can tap into better and cheaper computers that satisfy the computational requirements of these simulations. Alongside commercial simulators such as Eclipse, CMG and RFD, there are now open source simulators such as Open Porous Media (OPM) and the MATLAB Reservoir Simulation Toolbox (MRST).

This article describes the process of benchmarking these open source simulators against the industry-standard ECLIPSE simulator from Schlumberger, the approach taken for my MSc Petroleum Engineering project, and the results so far.


The reservoir model used in this project is the Egg model, a synthetic reservoir model consisting of nearly 100 fairly small 3D realizations of a channelized oil reservoir that produces oil under water flooding conditions.

Figure 1 Egg Model with 8 Injectors and 4 Producers (Jansen et al., 2013)

It consists of eight water injectors and four oil producers (Figure 1). Water flooding is a major development method for oil reservoirs whereby water is injected from some wells to maintain reservoir pressure as oil is produced from others. The aim is to position injection and production wells such that we 'sweep' the oil towards the producing wells.

With water flooding, recovery factors can be as high as 60% (Brown, 2015) but depend on sweep efficiency, both areal and local, with respect to rock and fluid properties. As the reservoir model has no aquifer or gas cap, water flooding as a secondary recovery method can supply additional reservoir energy for producing substantial quantities of oil that would otherwise remain trapped because of the limited natural drive and poor sweep efficiency.

Figure 2 Egg Model showing Oil Saturation (Brown, 2013)


- The objective of this study was to benchmark the three numerical simulators and compare the estimated ultimate recovery (cumulative oil production) between them.
- A further objective was to understand the application of proxy models in reservoir simulation.

The three numerical simulators used are Schlumberger’s ECLIPSE Black Oil, the MATLAB Reservoir Simulation Toolbox (MRST) and Open Porous Media (OPM), to which SINTEF is a contributor.


ECLIPSE

ECLIPSE has been tested and proven to be robust and reliable since its launch in 1982. It can transfer models between its simulators, such as Black Oil, Compositional and FrontSim.

MATLAB MRST

The MATLAB Reservoir Simulation Toolbox (MRST) is an open-source oil & gas reservoir simulator. MRST supports a wide range of solvers and workflow tools, which can be combined to perform various tasks.

Open Porous Media (OPM)

The Open Porous Media Simulator Flow is a fully implicit black-oil simulator which is capable of running industry-standard simulation models. This simulator is implemented using automatic differentiation which enables rapid development of new fluid models.
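To give a feel for what automatic differentiation means, here is a minimal Python sketch of the forward-mode idea using "dual numbers". It is only an illustration of the concept and not how OPM Flow is implemented; the `Dual` class and the toy `residual` function are hypothetical names made up for this example.

```python
class Dual:
    """A value together with its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _wrap(other):
        # Plain numbers are constants, so their derivative is zero.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._wrap(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._wrap(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def residual(p):
    # Hypothetical toy expression, not a real flow equation.
    return 3.0 * p * p + 2.0 * p + 1.0


# Seed the derivative with 1.0 to differentiate with respect to p.
out = residual(Dual(2.0, 1.0))
print(out.value, out.deriv)  # 17.0 14.0
```

The point is that the derivative comes out alongside the value without anyone writing it by hand, which is why adding a new fluid model mostly means writing the new equations.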

Why the need to benchmark?

Benchmarking allows us to compare and contrast the output of the simulators under multiple conditions at once, namely changing values of porosity, permeability, bottom-hole pressure, injection rate and so on. It gave us a deeper understanding of whether the responses from the simulators were accurate, and it allowed us to identify the variables that most influence the project's key performance indicators (KPIs). The Egg model has been benchmarked in the past, but only on a deterministic level.

Input Parameters
So far, the input variables were a mix of static and dynamic variables obtained from geological, PVT and production data, to each of which an uncertainty was applied. The variables and their respective ranges are outlined below:

Table 1 Variables and their respective ranges


A probabilistic approach was used for this project’s workflow. This approach quantifies variation and uncertainty by using distributions instead of fixed values. The distribution describes the range of possible values as well as the most likely value for that variable. For example, suppose we roll a die until a ‘5’ comes up. In each roll a ‘5’ comes up with a probability of 1/6. We cannot say exactly which roll will produce it, but we can describe the number of rolls needed very well with a probability distribution. The probabilistic approach is necessary in order to use the full range of values that could occur for each of the unknown parameters, which in turn generates a full range of possible outcomes.
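As a small illustration of describing a variable by a distribution rather than a fixed value, the Python sketch below draws porosity values from a triangular distribution defined by a minimum, a most likely value and a maximum. The numbers are hypothetical and chosen only for the example; they are not the ranges used in this project, and the triangular shape is an assumption.

```python
import random
import statistics


def sample_triangular(low, mode, high, n, seed=0):
    """Draw n values from a triangular (min, most likely, max) distribution."""
    rng = random.Random(seed)
    # Note the argument order of random.triangular: (low, high, mode).
    return [rng.triangular(low, high, mode) for _ in range(n)]


# Hypothetical porosity range, for illustration only.
porosity = sample_triangular(low=0.15, mode=0.20, high=0.25, n=10_000)
mean_porosity = statistics.mean(porosity)

print(min(porosity), max(porosity), mean_porosity)
```

Every sample stays inside the stated range, and the values cluster around the most likely one, which is exactly the information a fixed value throws away.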

There are many sampling techniques that can be used to randomly draw values from an input probability distribution, including simple random, convenience, systematic, cluster and stratified sampling. These methods have drawbacks and do not guarantee that the uncertain domain is captured efficiently. In addition, the number of simulations required can be extremely high, as with the Monte Carlo sampling technique. The solution proposed was to use Design of Experiments (DOE), due to its ability to cover the uncertain domain efficiently while keeping a good balance between the statistical interpretation and the physical behaviour of the reservoir model.
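To put the run counts in perspective, the short Python sketch below compares a two-level full factorial design with a two-level screening design whose run count is the smallest multiple of four above the number of variables (the rule that Plackett-Burman designs follow). The function names are made up for this example.

```python
import math


def full_factorial_runs(n_factors, levels=2):
    """Runs needed to try every combination of levels."""
    return levels ** n_factors


def plackett_burman_runs(n_factors):
    """Smallest multiple of 4 that is strictly greater than n_factors."""
    return 4 * math.ceil((n_factors + 1) / 4)


n_factors = 10
print(full_factorial_runs(n_factors))   # 1024
print(plackett_burman_runs(n_factors))  # 12
```

For ten variables that is 12 simulator runs instead of 1024, which is what makes screening practical when each run is a full reservoir simulation.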

A particular design called the Plackett-Burman design was utilized. It is a two-level experimental design that allows us to screen a large number of variables in relatively few runs. The design, which was created in R (a statistical analysis software), had 12 runs and 10 variables with their corresponding ranges as listed in Table 1.
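The design for this project was generated in R, but the construction itself is simple enough to sketch. The Python example below builds a 12-run Plackett-Burman design from the standard generator row by cyclic shifting, adds a final row of all low levels, and keeps the first 10 columns. The row and column order may differ from what R produces, and the function name is hypothetical.

```python
# Standard generator row for the 12-run Plackett-Burman design
# (+1 = high level, -1 = low level).
PB12_GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def plackett_burman_12(n_factors=11):
    """Return a 12-run, two-level design for up to 11 factors."""
    if not 1 <= n_factors <= 11:
        raise ValueError("a 12-run design supports 1 to 11 factors")
    size = len(PB12_GENERATOR)
    # Rows 1 to 11: cyclic shifts of the generator row.
    rows = [
        [PB12_GENERATOR[(col - shift) % size] for col in range(size)]
        for shift in range(size)
    ]
    # Row 12: every factor at its low level.
    rows.append([-1] * size)
    # Unused columns are simply dropped.
    return [row[:n_factors] for row in rows]


design = plackett_burman_12(n_factors=10)
for run in design:
    print(run)
```

Each column contains six high and six low levels, and any two columns are orthogonal, which is what lets the effect of each variable be estimated independently of the others.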

Figure 3: Plackett-Burman design generated in R

Once the experiment was designed, it was exported as an Excel file. The experimental design was then used to create the data files that serve as inputs for the three simulators, and the output values from the three simulators were subsequently imported back into the Excel file. Each case shows the cumulative oil production over a period of ten years. The MRST, OPM and Eclipse responses in terms of cumulative oil production (Qo) are given in Tables 2, 3 and 4 below.
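Turning a row of the design into simulator inputs amounts to replacing each coded level with the low or high end of that variable's range. A minimal Python sketch of that step follows; the variable names and ranges are hypothetical placeholders (the real ones are those of Table 1), and writing the actual simulator data files is left out.

```python
# Hypothetical (low, high) ranges, for illustration only.
RANGES = {
    "porosity": (0.15, 0.25),
    "permeability_multiplier": (0.5, 1.5),
    "injection_rate_m3_per_day": (60.0, 100.0),
}


def decode_run(coded_row, ranges):
    """Map coded levels (-1 / +1) to the low / high value of each variable."""
    if len(coded_row) != len(ranges):
        raise ValueError("one coded level is needed per variable")
    return {
        name: (high if level > 0 else low)
        for level, (name, (low, high)) in zip(coded_row, ranges.items())
    }


case = decode_run([+1, -1, +1], RANGES)
print(case)
```

Each of the 12 design rows decoded this way gives one simulation case to run in all three simulators.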

MRST responses (Qo):

Table 2: MRST oil production response for all 12 cases

OPM responses (Qo):

Table 3: OPM oil production response for all 12 cases

Eclipse responses (Qo):

Table 4: Eclipse oil production response for all 12 cases

Sensitivity analysis is a technique used to determine how different values of an independent variable impact a particular dependent variable under a given set of assumptions. Once the responses were obtained as shown above, we could deduce which variables are the least and most influential on our dependent variable. Below is the sensitivity analysis for the responses from the three different simulators.

MATLAB MRST

Figure 4: Linear regression results on the top and Main effects results on the bottom for MRST responses

The linear regression results for the MRST simulator responses show that the most influential variable is porosity, whose p-value is below the threshold of 0.01. The p-value is a statistical parameter used to evaluate whether a variable has a significant effect on the dependent variable. The main effect plots likewise show that porosity is the most influential variable on the response, with a positive linear relationship.
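For a two-level design, the main effect of a variable is the average response at its high level minus the average response at its low level. The Python sketch below shows that calculation on a tiny two-variable design with made-up responses; it illustrates the idea only and does not reproduce the R output in the figures.

```python
from statistics import mean


def main_effects(design, response):
    """Mean response at the high level minus mean response at the low level."""
    effects = []
    for j in range(len(design[0])):
        high = [y for row, y in zip(design, response) if row[j] > 0]
        low = [y for row, y in zip(design, response) if row[j] < 0]
        effects.append(mean(high) - mean(low))
    return effects


# Toy 2-variable design with made-up responses, for illustration only.
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
response = [10.0, 14.0, 11.0, 15.0]

effects = main_effects(design, response)
print(effects)  # [4.0, 1.0]
```

A larger absolute effect means a more influential variable, and a positive sign means the response rises as the variable moves from its low to its high level.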


Figure 5: Linear regression results on the top and Main effects results on the bottom for OPM responses


Figure 6: Linear regression results on the top and Main effects results on the bottom for Eclipse responses

The regression for each response showed a Multiple R-squared of 0.9993, meaning that 99.93% of the variation in the response Qo is explained by the model.
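As a reminder of where R-squared comes from, the Python sketch below fits a linear model to a toy two-level design and computes R-squared as one minus the ratio of the residual to the total sum of squares. The data are made up. The fitting shortcut used here (each coefficient is the average of level times response) is valid only for balanced, orthogonal designs coded as -1 and +1, such as a Plackett-Burman design.

```python
from statistics import mean


def fit_orthogonal(design, response):
    """Least-squares fit for a balanced, orthogonal -1/+1 design."""
    n = len(response)
    intercept = mean(response)
    coefs = [
        sum(row[j] * y for row, y in zip(design, response)) / n
        for j in range(len(design[0]))
    ]
    return intercept, coefs


def predict(intercept, coefs, row):
    return intercept + sum(c * x for c, x in zip(coefs, row))


def r_squared(response, fitted):
    """Share of the variation in the response explained by the model."""
    y_bar = mean(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in response)
    return 1.0 - ss_res / ss_tot


# Toy design with made-up responses, for illustration only.
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
response = [10.0, 14.0, 11.0, 16.0]

intercept, coefs = fit_orthogonal(design, response)
fitted = [predict(intercept, coefs, row) for row in design]
r2 = r_squared(response, fitted)
print(intercept, coefs, r2)
```

An R-squared close to 1 says the linear model reproduces the simulator responses at the design points; it does not by itself guarantee accuracy between those points.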

Proxy Models

Once the most influential variables were identified, the simulator can be replaced by the linear regression model that was generated. Using the regression model for each response from the three simulators, a Monte Carlo simulation was then run.

Monte Carlo Simulation

Monte Carlo is a method that randomly samples the stochastic input values to produce a distribution of output values and their probabilities. Each variable was defined by a probability distribution, and 1000 samples were taken.

Figure 7: Defining variable probability distributions

Figure 8: Histogram illustrating the probability distributions of the variables

Using these independent variables, a Monte Carlo simulation was run with the linear regression model fitted to each of the three simulators' responses. This generates a cumulative distribution function of the dependent variable (cumulative oil production, Qo), thus defining the P10, P50 and P90 for each of the three simulators. As such, a probabilistic answer rather than a deterministic one was achieved, which captures the full uncertainty of the model.
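The Monte Carlo step on a proxy model can be sketched in a few lines of Python. Everything numeric below is a hypothetical placeholder: the intercept, the coefficients, the choice of triangular input distributions on coded variables, and the convention that P10 is the 10th percentile of the cumulative distribution. The real coefficients come from the regression on each simulator's responses.

```python
import random
import statistics

# Hypothetical proxy: Qo (MMSTB) as a linear function of coded inputs.
INTERCEPT = 4.0
COEFS = {"porosity": 0.30, "permeability": 0.05, "injection_rate": 0.10}


def proxy(coded_inputs):
    """Evaluate the linear regression model instead of the simulator."""
    return INTERCEPT + sum(COEFS[name] * x for name, x in coded_inputs.items())


def run_monte_carlo(n_samples=1000, seed=42):
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        # Assumed input distribution: triangular on [-1, 1] with mode 0.
        sample = {name: rng.triangular(-1.0, 1.0, 0.0) for name in COEFS}
        results.append(proxy(sample))
    return results


qo = run_monte_carlo()
# statistics.quantiles with n=10 returns the 9 decile cut points.
deciles = statistics.quantiles(qo, n=10)
p10, p50, p90 = deciles[0], deciles[4], deciles[8]
print(p10, p50, p90)
```

Because the proxy is just a formula, 1000 evaluations take a fraction of a second, whereas 1000 full simulator runs would not be practical.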

The cumulative distribution functions of cumulative oil production for the three simulators are shown below.

Figure 9: Cumulative distribution function of the response illustrating the P10, P50 & P90 for Eclipse

Figure 10: Cumulative distribution function of the response illustrating the P10, P50 & P90 for MRST

Figure 11: Cumulative distribution function of the response illustrating the P10, P50 & P90 for OPM

From the three CDFs that were produced, we can see that all three simulators give very similar outputs in terms of the P10, P50 and P90 values. Eclipse gives a P10 of 4.22 MMSTB, OPM a P10 of 4.16 MMSTB and MRST a P10 of 4.19 MMSTB.

The next step within this project will be to add more variables and evaluate their effect on the response variable. I will also be looking to benchmark the water production and hopefully the pressure response.

Author : Mohamed Abdinasir 
MSc Petroleum Engineer
Primera Reservoir


Brown, Steve. "Steam Flooding & Recovery Factors". The Steam Oil Production Company Ltd. N.p., 2017. Web. 21 Mar. 2017.

"The Egg Model". TUDelft. N.p., 2017. Web. 21 Mar. 2017.