2.3 Variograms, at the heart of geostatistics
In the next two sections, we’ll go through the modeling of a sand/shale facies distribution, first using a dense dataset (Figure 1A) and then a limited dataset extracted from the dense dataset (Figure 1B). In this section, we are focusing on the variograms that will be used with this dataset.
Figure 1 shows the different variograms that will be used in the next sections. Variograms are represented on a map either as circles (black circles, Figure 1) or as ellipses (red and green ellipses, Figure 1). By extension, 3D variograms are represented as spheres or ellipsoids. A circular variogram means that there is no preferred orientation in the data. Conversely, the more anisotropic the ellipse, the more elongated and narrow the facies bodies will be along the direction of its major axis. When the property is evaluated at a given empty location, the variogram being used (circle or ellipse) is centered on this location. The data points found inside the circle or ellipse will influence the value computed at the new location. The data points outside it will have no impact.
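The idea of centering the circle or ellipse on the location to estimate can be sketched in a few lines of Python. This is only an illustration, not the implementation of any particular package: the function names are hypothetical, coordinates are assumed to be (easting, northing) in meters, and the azimuth is assumed to be measured clockwise from North.

```python
import math

def inside_search_ellipse(point, center, major_range, minor_range, azimuth_deg):
    """Return True if `point` lies inside the ellipse centered on `center`.

    Assumed conventions: coordinates are (easting, northing) and the azimuth
    of the major axis is measured clockwise from North, in degrees.
    """
    dx = point[0] - center[0]  # offset toward East
    dy = point[1] - center[1]  # offset toward North
    az = math.radians(azimuth_deg)
    # Component of the offset along the major axis (unit vector: sin az, cos az)
    along = dx * math.sin(az) + dy * math.cos(az)
    # Component along the minor axis, perpendicular to the major axis
    across = dx * math.cos(az) - dy * math.sin(az)
    return (along / major_range) ** 2 + (across / minor_range) ** 2 <= 1.0

def neighbors_in_ellipse(data, center, major_range, minor_range, azimuth_deg):
    """Keep only the data points (x, y, value) that fall inside the ellipse."""
    return [
        p for p in data
        if inside_search_ellipse(p[:2], center, major_range, minor_range, azimuth_deg)
    ]
```

Setting `major_range` equal to `minor_range` gives the circular (isotropic) case, in which the azimuth no longer matters.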
Kriging algorithms use only the input data points, which is why kriging is a deterministic technique. Simulation algorithms, by contrast, use both the input data points and the values that were computed before moving to this location. Simulations are probabilistic in nature: the value at each node is drawn at random from a local distribution, and the empty nodes are not populated in the same order from one realization to the next. As a result, when the time comes to populate a given location, the surrounding available data will differ from realization to realization. For more details on how the surrounding data are used, see for example Pyrcz and Deutsch (2014).
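The difference in conditioning data between kriging and simulation can be illustrated with a toy sketch. It only tracks which nodes are informed when a target node is visited. It does not compute any value, and it ignores the search neighborhood. The function names and the node numbering are hypothetical.

```python
import random

def visiting_order(empty_nodes, seed):
    """Random path through the empty nodes for one realization."""
    order = list(empty_nodes)
    random.Random(seed).shuffle(order)
    return order

def conditioning_nodes(target, order, data_nodes, simulation=True):
    """Nodes whose values are available when `target` is estimated.

    Kriging: input data only. Simulation: input data plus the nodes
    visited earlier on the random path.
    """
    available = set(data_nodes)
    if simulation:
        available.update(order[:order.index(target)])
    return available
```

Calling `visiting_order` with a different seed for each realization changes the path, and therefore the set returned by `conditioning_nodes` for a given target node.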
2D isotropic variograms are defined by their range and their sill. The range represents the radius of the circle. The sill will be defined in the next paragraph. 2D anisotropic variograms are defined by their maximum and minimum ranges, represented respectively by the semi-major and semi-minor axes of the ellipse. They are also defined by a sill, as for isotropic variograms, and by the azimuth of the semi-major axis (referenced to the North; 150 degrees in Figure 1, for example). A 3D variogram is usually defined as the combination of a 2D horizontal isotropic or anisotropic variogram and a vertical range. The vertical range is much smaller than the horizontal ranges. This reflects the fact that geological properties are continuous over large areas but change rapidly in the direction of deposition (here referred to as vertical). A true 3D variogram implies that the ellipsoid can have a dip and a plunge. True 3D variograms are used when we assume that the plane of deposition is not horizontal but inclined. They are not commonly used, but they are gaining some traction, for example in oil sands projects to model dipping inclined heterolithic stratification (IHS).
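As a rough sketch of this parameterization, the "2D horizontal variogram plus vertical range" case can be stored as a small Python data structure. The class name, field names and the `reduced_distance` helper are assumptions made for illustration. Dip and plunge (the true 3D case) are deliberately left out, and the azimuth is again assumed to be clockwise from North.

```python
import math
from dataclasses import dataclass

@dataclass
class VariogramModel3D:
    major_range: float     # maximum horizontal range (semi-major axis)
    minor_range: float     # minimum horizontal range (semi-minor axis)
    vertical_range: float  # range along the vertical direction
    azimuth_deg: float     # azimuth of the major axis, clockwise from North
    sill: float

    def reduced_distance(self, dx, dy, dz):
        """Separation scaled by the ranges: 1.0 means 'exactly at the range'."""
        az = math.radians(self.azimuth_deg)
        along = dx * math.sin(az) + dy * math.cos(az)   # along the major axis
        across = dx * math.cos(az) - dy * math.sin(az)  # along the minor axis
        return math.sqrt(
            (along / self.major_range) ** 2
            + (across / self.minor_range) ** 2
            + (dz / self.vertical_range) ** 2
        )
```

With purely illustrative values such as `VariogramModel3D(3000.0, 1000.0, 10.0, 150.0, 0.25)`, a vertical separation of 10 m gives the same reduced distance (1.0) as a horizontal separation of 3000 m along azimuth 150, which is how the much smaller vertical range is expressed.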
Variograms are defined using variogram analyzers (Figure 2 and Figure 3). The correlation found in different orientations (azimuths) is analyzed to identify the directions of the maximum and minimum horizontal ranges (azimuths 150 and 60 degrees respectively in our dataset). For a given azimuth, the analyzer superimposes two objects: the experimental variogram and the variogram model. The experimental variogram is a succession of points computed from the input data. The variogram model is a mathematical equation that we have to adjust to the points of the experimental variogram. The circles and ellipses (Figure 1) are the spatial representations of the corresponding variogram models.
The modeling expert feeds two main inputs to the variogram analyzer: a set of azimuths and a set of distances between data points. The horizontal axis of the variogram analyzer represents these distances. For our dataset, we decided to compare each data point with any nearby point located 400 meters away. We then do the same for a distance of 800 meters, and so on up to a distance of 8000 meters. As a result, our experimental variograms have one point every 400 m. We did so in 10 different azimuths, of which we show only azimuths 60 and 150, the azimuths of the axes of the variogram model. For a given azimuth and a given distance, the goal is to check how similar two data points (a pair) are. If the two points of every pair have the same value, the correlation is perfect and the corresponding point of the experimental variogram will be at Y=0 on the variogram analyzer. This only happens at the origin of the graph, where the distance is zero and each node is compared to itself. The larger the distance, the lower the correlation, until a distance (the range) is reached beyond which there is no more correlation. At this stage, the points of the experimental variogram plateau. This plateau is the sill. For stationary and ergodic properties, the sill is the variance of the data.
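The pairing procedure described above can be sketched as follows. This is a minimal illustration, not the algorithm of any specific variogram analyzer. It assumes the usual semivariogram estimator (half the mean squared difference of the pairs in each distance class). The distance and azimuth tolerances are assumptions, since the text does not state which ones were used, and the facies are assumed to be coded as numeric values (for example 0 for shale and 1 for sand).

```python
import math

def experimental_variogram(points, azimuth_deg, lag, n_lags,
                           lag_tol=None, az_tol_deg=22.5):
    """Experimental variogram in one azimuth.

    points: list of (x, y, value), with azimuths clockwise from North.
    Returns a list of (mean_distance, gamma, n_pairs), one entry per lag
    (lag, 2*lag, ...) that contains at least one pair.
    """
    if lag_tol is None:
        lag_tol = lag / 2.0  # assumed tolerance on the distance
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, vi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, vj = points[j]
            dx, dy = xj - xi, yj - yi
            h = math.hypot(dx, dy)
            if h == 0.0:
                continue
            # Azimuth of the pair, folded to [0, 180): a pair has no polarity
            pair_az = math.degrees(math.atan2(dx, dy)) % 180.0
            diff = abs(pair_az - azimuth_deg % 180.0)
            diff = min(diff, 180.0 - diff)
            if diff > az_tol_deg:
                continue
            k = int(h / lag + 0.5) - 1  # index of the nearest multiple of lag
            if 0 <= k < n_lags and abs(h - (k + 1) * lag) <= lag_tol:
                sums[k] += (vi - vj) ** 2
                dists[k] += h
                counts[k] += 1
    return [
        (dists[k] / counts[k], sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_lags) if counts[k] > 0
    ]
```

For the setup described in the text, the call would look like `experimental_variogram(points, 150, 400, 20)`: one point every 400 m up to 8000 m in azimuth 150. The resulting plateau can then be compared with the variance of the values, for instance with `statistics.pvariance`.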
A good variogram model starts at the origin and climbs progressively until it reaches a plateau equal to the sill of the experimental variogram. It is essential to properly fit the experimental variogram between the origin and the range, as this is the part of the variogram model that has the greatest influence on the results of kriging and simulation. It is also essential to capture the anisotropy of the experimental variogram: keeping an isotropic (circular) variogram while the data show anisotropy will lead to missing some important information about the property we want to predict. The variograms of Figure 2A, B and C were used for kriging, and the results are shown in Figure 4, Figure 5 and Figure 6 respectively. As can be seen with this dataset, different variogram shapes do indeed give drastically different models.
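To make the fitting step concrete, here is a hedged sketch that adjusts the range of a variogram model to experimental points by a simple least-squares search. The spherical model is used only as an example of a model that starts at the origin and reaches a plateau; the text does not say which model type was used for Figure 2. The function names and the list of candidate ranges are hypothetical, and in practice the fit is often adjusted by eye in the analyzer.

```python
def spherical(h, sill, rng):
    """Spherical model without nugget: 0 at h = 0, equal to the sill for h >= rng."""
    if h >= rng:
        return sill
    r = h / rng
    return sill * (1.5 * r - 0.5 * r ** 3)

def fit_range(experimental, sill, candidate_ranges):
    """Pick the candidate range that minimizes the squared misfit.

    experimental: list of (distance, gamma) points for one azimuth.
    All points are weighted equally here; giving more weight to the points
    between the origin and the range would be a natural refinement.
    """
    def misfit(rng):
        return sum((spherical(h, sill, rng) - g) ** 2 for h, g in experimental)
    return min(candidate_ranges, key=misfit)
```

Running `fit_range` once per principal azimuth (150 and 60 degrees here) with the same sill would give the maximum and minimum ranges of an anisotropic model.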
Adjusting a variogram model is often challenging. It is rare to have a dataset as dense as the one used here (Figure 1A). As a result, it is rare to have horizontal experimental variograms as clean as in Figure 2. Often, the data are limited and the experimental variogram is difficult to interpret (Figure 3). In this example, the experimental variogram in azimuth 60 degrees even looks like a perfect sill: there are no points dipping down progressively to the origin. If this were true, it would mean that even for very short distances, there is no correlation between the values. While true for some ore deposits, this is rarely, if ever, the case in sedimentary rocks. The issue is not the geology but the dataset: the facies distribution is under-sampled and, as such, the first few points are not representative. In petroleum studies, variogram models should always start at the origin unless a different choice can be backed up with solid geological evidence. In technical terms, we should never have any nugget effect (that is, a variogram model that does not start at zero).