The methodology will take a bit to explain, so bear with me (or just scroll down to the bottom for the 'meat'). Also, I'm just going to post data, so be sure you have your spreadsheet program open if you want little pictures to go with it.
Have you ever plotted the log of production history? I think you should. What does it look like? Interestingly, a rather simple yet striking behaviour is immediately apparent. This forecast proceeds by interpreting the Ln(Production) chart as a series of straight lines separated by short periods of discontinuity. I fitted a straight line to each subset of the data that appeared to be approximately straight. The data, and the straight-line fits, follow.
- Code: Select all
1901 -1.79 -1.76
1902 -1.70 -1.70
1903 -1.64 -1.63
1904 -1.52 -1.57
1905 -1.54 -1.50
1906 -1.55 -1.43
1907 -1.33 -1.37
1908 -1.25 -1.30
1909 -1.21 -1.23
1910 -1.12 -1.17
1911 -1.07 -1.10
1912 -1.04 -1.03
1913 -0.95 -0.97
1914 -0.90 -0.90
1915 -0.84 -0.84
1916 -0.78 -0.77
1917 -0.69 -0.70
1918 -0.69 -0.64
1919 -0.59 -0.57
1920 -0.37 ---- -0.32
1921 -0.27 ---- -0.24
1922 -0.15 ---- -0.17
1923 0.02 ---- -0.09
1924 0.01 ---- -0.01
1925 0.07 ---- 0.07
1926 0.09 ---- 0.15
1927 0.23 ---- 0.23
1928 0.28 ---- 0.31
1929 0.40 ---- 0.38
1930 0.34 ---- ----
1931 0.32 ---- ----
1932 0.27 ---- ---- 0.27
1933 0.37 ---- ---- 0.35
1934 0.42 ---- ---- 0.43
1935 0.50 ---- ---- 0.52
1936 0.58 ---- ---- 0.60
1937 0.71 ---- ---- 0.69
1938 0.69 ---- ---- ----
1939 0.74 ---- ---- ----
1940 0.77 ---- ---- ----
1941 0.80 ---- ---- ----
1942 0.74 ---- ---- ---- 0.74
1943 0.81 ---- ---- ---- 0.82
1944 0.95 ---- ---- ---- 0.89
1945 0.95 ---- ---- ---- 0.96
1946 1.01 ---- ---- ---- 1.04
1947 1.11 ---- ---- ---- 1.11
1948 1.23 ---- ---- ---- 1.19
1949 1.22 ---- ---- ---- 1.26
1950 1.34 ---- ---- ---- 1.34
1951 1.45 ---- ---- ---- 1.41
1952 1.51 ---- ---- ---- 1.48
1953 1.57 ---- ---- ---- 1.56
1954 1.61 ---- ---- ---- 1.63
1955 1.73 ---- ---- ---- 1.71
1956 1.81 ---- ---- ---- 1.78
1957 1.86 ---- ---- ---- 1.86
1958 1.89 ---- ---- ---- 1.93
1959 1.96 ---- ---- ---- 2.00
1960 2.04 ---- ---- ---- 2.08
1961 2.10 ---- ---- ---- 2.15
1962 2.18 ---- ---- ---- 2.23
1963 2.26 ---- ---- ---- 2.30
1964 2.33 ---- ---- ---- 2.37
1965 2.45 ---- ---- ---- 2.45
1966 2.54 ---- ---- ---- 2.52
1967 2.61 ---- ---- ---- 2.60
1968 2.69 ---- ---- ---- 2.67
1969 2.77 ---- ---- ---- 2.75
1970 2.86 ---- ---- ---- 2.82
1971 2.92 ---- ---- ---- 2.89
1972 2.97 ---- ---- ---- 2.97
1973 3.06 ---- ---- ---- 3.04
1974 3.06 ---- ---- ---- ----
1975 3.01 ---- ---- ---- ---- 3.04
1976 3.09 ---- ---- ---- ---- 3.07
1977 3.13 ---- ---- ---- ---- 3.11
1978 3.14 ---- ---- ---- ---- 3.15
1979 3.18 ---- ---- ---- ---- 3.19
1980 3.13 ---- ---- ---- ---- ----
1981 3.08 ---- ---- ---- ---- ----
1982 3.04 ---- ---- ---- ---- ----
1983 3.03 ---- ---- ---- ---- ---- 3.04
1984 3.05 ---- ---- ---- ---- ---- 3.06
1985 3.04 ---- ---- ---- ---- ---- 3.07
1986 3.09 ---- ---- ---- ---- ---- 3.09
1987 3.10 ---- ---- ---- ---- ---- 3.10
1988 3.14 ---- ---- ---- ---- ---- 3.12
1989 3.15 ---- ---- ---- ---- ---- 3.13
1990 3.17 ---- ---- ---- ---- ---- 3.15
1991 3.17 ---- ---- ---- ---- ---- 3.16
1992 3.18 ---- ---- ---- ---- ---- 3.18
1993 3.18 ---- ---- ---- ---- ---- 3.19
1994 3.20 ---- ---- ---- ---- ---- 3.21
1995 3.21 ---- ---- ---- ---- ---- 3.22
1996 3.24 ---- ---- ---- ---- ---- 3.24
1997 3.27 ---- ---- ---- ---- ---- 3.25
1998 3.29 ---- ---- ---- ---- ---- 3.27
1999 3.27 ---- ---- ---- ---- ---- 3.29
2000 3.31 ---- ---- ---- ---- ---- 3.30
2001 3.31 ---- ---- ---- ---- ---- 3.32
2002 3.30 ---- ---- ---- ---- ---- 3.33
2003 3.34 ---- ---- ---- ---- ---- 3.35
2004 3.38 ---- ---- ---- ---- ---- 3.36
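If you want to reproduce the fitting step outside a spreadsheet, here is a minimal sketch in Python (not what I actually used - I did this in Excel) of fitting the first visually straight stretch, 1901-1919, with x = year - 1900:

```python
import numpy as np

# ln(production) for 1901-1919, the first column of the table above
years = np.arange(1901, 1920)
ln_p = np.array([-1.79, -1.70, -1.64, -1.52, -1.54, -1.55, -1.33, -1.25,
                 -1.21, -1.12, -1.07, -1.04, -0.95, -0.90, -0.84, -0.78,
                 -0.69, -0.69, -0.59])

# Ordinary least squares on the segment, with x = year - 1900
m, b = np.polyfit(years - 1900, ln_p, 1)
print(round(m, 3), round(b, 3))  # 0.066 -1.831, matching the first fitted column
```

Repeat per segment and you recover the fitted columns shown alongside the data.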
Ok, this is probably quite obvious to some, but let's take a brief moment to consider what this means. A straight line on a log plot is exponential growth (or decay). Thus, interpreting the historical production data this way is equivalent to saying that oil production has tended to grow according to a stable exponential trend until something disrupts that trend, at which point, after a short period of adjustment, production resumes growth along some other exponential trend. The first thing that may jump to mind is that short-term forecasts by the IEA, EIA etc. basically follow this principle: exponential growth in consumption, with production fixed to match. Well, this methodology suggests they will be correct, at least until a significant discontinuity arrives. For those of us who have spent inordinate amounts of time searching for a stable methodology for Verhulst fits, this suggests why we have had so much trouble. If the local behaviour of the data is exponential, while the local behaviour of the fitted Verhulst is relatively flat (concave), it is unsurprising that large variations in parameters yield such small variations in goodness of fit. Anyway, let's continue.
We now have six straight lines and five periods of discontinuity. Each straight line is defined by m and b such that y = mx + b, with x = year - 1900. The lines are:
- Code: Select all
m 0.066 0.079 0.084 0.074 0.038 0.015
b -1.831 -1.894 -2.429 -2.376 0.159 1.774
If you want, you can plot the exponential of these lines and see how they map onto ordinary production.
The five periods of discontinuity are:
circa 1919-192? (beginning of "Roaring twenties")
circa 1929-1932 (Great Depression)
circa 1937-1943 (WWII related)
circa 1973-1975 (First "Oil Crisis")
circa 1979-1983 (Second "Oil Crisis")
I think these periods of discontinuity are pretty apparent, and I find it remarkable that each maps well to a significant geopolitical event. The exception is perhaps the first, which also happens to be the only positive shock. In separate analyses I took the first shock to be 1919-1920 or 1919-1923; for the forecast below I actually used 1919-1923. Either way, it seems strange to me that WWI produces no obvious negative shock, and that its aftermath should produce such a positive one, but I don't know my history that well. Oil production certainly seems to behave differently between the two world wars. Maybe someone knows why.
Anyway, we now have a framework from which we would like to generate a forecast. We suspect that oil production will continue along an exponential trend until, at some point in the future, there is some discontinuity (probably big enough to be called a geopolitical shock), after which we may forecast that production will continue along some new exponential trend (until some further discontinuity). Several questions arise. How long until the next discontinuity? How large will it be? What new exponential trend is likely to follow the shock? If we can answer these questions, we have a forecast that proceeds indefinitely into the future.
The method I have used to resolve these questions is rather shaky - believe me, I know how many long bows I'm balancing atop one another. If anyone has better ideas for resolving these questions, I'd love to hear them. The method I used follows.
Whatever method is used to generate future exponential trends, I suggest that it intrinsically make use of the fact that oil is a non-replenishable resource. This is the core logic behind conventional use of the Verhulst curve, and it seems we may be able to make use of that now. I generated a Verhulst curve to serve as a comparison function, whereby the difference between the comparison function and actual production is taken to be related to the probability of a shock. Follow? Production grows exponentially until it travels too far from a baseline Verhulst curve. As it gets further away, the likelihood of a shock increases, and the shock brings current production back into line with the comparison curve. Ok, continuing.
I fitted a Verhulst to the historic data. Actually, I prefer to fit the log of a Verhulst to the log of the historic data, because the latter exhibits relative homoskedasticity - the variance is pretty similar as we travel along the curve. It's not perfect in this case, but it's better than fitting to the raw curve. I mostly use least-squares error. The fit resolved the following parameters:
U=1583
(1/k)=14.68
n=0.96
T1/2=1995
Looking at this fit, I believe we can easily see that it is inappropriate. The difference between the data and the fit is at its greatest in 2004. Logically, this is an artifact of fitting locally exponential data with a locally concave curve - the same artifact discussed earlier. Thus, I rejected this fit as heavily biased.
Unsure of how to proceed, I chose to take U as an exogenous variable. Following Campbell and others, I used the ballpark figure of 2000 Gb for U. The choice of U will significantly alter the outcome of the model. I then fitted the remaining variables to log production - log Verhulst, as before. The parameters were:
U=2000
(1/k)=14.53
n=1.84
T1/2=2003
Before continuing, note that U will no longer actually equal ultimate production as in the standard Verhulst model. Rather, it is merely a parameter of the comparison curve. The actual ultimate will almost always be a fair amount greater than U, because the exponential growth trend mostly triggers negative shocks by being greater than the comparison function (at least in the model that follows). Actual ultimate seems to come out about 5-10% greater than U. I usually take the liberty of assuming that Campbell et al. are 10-20% too pessimistic, so a U of 2000 is fine with me here (it implies an actual ultimate of 2100-2200). I can also run this model with any other numbers, so make suggestions. What would be better is a non-biased method for estimating U. Any ideas? ("I know what we need, a magic bullet!") Continuing.
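For reference, here is the comparison curve in code. I'm writing the Verhulst in Richards (generalised logistic) form, since that is one common parameterisation that takes exactly a U, 1/k, n and T1/2 - if your preferred form differs, substitute accordingly:

```python
import math

def verhulst_rate(t, U=2000.0, inv_k=14.53, n=1.84, t_half=2003.0):
    """Annual production implied by a Richards-form Verhulst curve.
    Defaults are the U = 2000 comparison fit; the exact functional
    form here is an assumption, not gospel."""
    k = 1.0 / inv_k
    x = math.exp(-k * (t - t_half))
    Q = U / (1.0 + n * x) ** (1.0 / n)          # cumulative production
    return (k / n) * Q * (1.0 - (Q / U) ** n)   # rate dQ/dt

# The curve at 2003 comes out near 27.5 (same units as the production
# series), in the same ballpark as the actual figure exp(3.34) ~ 28
print(round(verhulst_rate(2003), 1))
```

A sanity check on the form: with n = 1 it collapses to the ordinary logistic, whose peak rate is kU/4 at T1/2.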
Ok. Now let's take a look at the Z score for the difference between the log of production and the log of the fitted Verhulst. Data are:
- Code: Select all
1901 -0.16
1902 -0.06
1903 -0.05
1904 0.28
1905 -0.33
1906 -0.91
1907 0.18
1908 0.25
1909 0.07
1910 0.26
1911 0.11
1912 -0.22
1913 -0.07
1914 -0.16
1915 -0.23
1916 -0.31
1917 -0.12
1918 -0.61
1919 -0.38
1920 0.71
1921 1.00
1922 1.34
1923 2.09
1924 1.58
1925 1.45
1926 1.15
1927 1.70
1928 1.56
1929 1.91
1930 1.02
1931 0.32
1932 -0.53
1933 -0.31
1934 -0.41
1935 -0.28
1936 -0.18
1937 0.29
1938 -0.39
1939 -0.52
1940 -0.78
1941 -1.03
1942 -1.96
1943 -1.88
1944 -1.32
1945 -1.80
1946 -1.85
1947 -1.61
1948 -1.13
1949 -1.66
1950 -1.30
1951 -0.87
1952 -0.93
1953 -0.94
1954 -1.06
1955 -0.65
1956 -0.46
1957 -0.53
1958 -0.77
1959 -0.62
1960 -0.51
1961 -0.43
1962 -0.24
1963 -0.11
1964 0.05
1965 0.56
1966 0.80
1967 0.95
1968 1.22
1969 1.43
1970 1.80
1971 1.88
1972 1.95
1973 2.26
1974 1.97
1975 1.30
1976 1.60
1977 1.59
1978 1.40
1979 1.45
1980 0.85
1981 0.20
1982 -0.31
1983 -0.61
1984 -0.66
1985 -0.87
1986 -0.66
1987 -0.78
1988 -0.63
1989 -0.66
1990 -0.61
1991 -0.73
1992 -0.76
1993 -0.81
1994 -0.75
1995 -0.68
1996 -0.52
1997 -0.31
1998 -0.17
1999 -0.29
2000 -0.01
2001 0.01
2002 0.01
2003 0.32
2004 0.68
I thought this was worth a special look. If you plot this up, you can see the discontinuities very clearly; the five shocks at the times mentioned earlier stand out. We can take the Z value at the start of a shock, subtract the Z value at the end of the shock, and take the absolute value to get a shock magnitude. Following this procedure, we get shock magnitudes of delta Z =:
2.51
2.45
2.26
0.99
2.10
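In code, with my reading of the start/end years (the exact endpoints you pick shift the numbers by a few hundredths, which is why these come out slightly different from the values above):

```python
# Z values at the shock endpoints, read off the Z-score series above
z = {1919: -0.38, 1923: 2.09, 1929: 1.91, 1932: -0.53, 1937: 0.29,
     1942: -1.96, 1973: 2.26, 1975: 1.30, 1979: 1.45, 1983: -0.61}

# (start, end) year of each of the five shocks - endpoint choice is mine
shock_windows = [(1919, 1923), (1929, 1932), (1937, 1942),
                 (1973, 1975), (1979, 1983)]

magnitudes = [abs(z[a] - z[b]) for a, b in shock_windows]
print([round(d, 2) for d in magnitudes])  # [2.47, 2.44, 2.25, 0.96, 2.06]
```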
Do you remember that one of the things we needed to proceed with the forecast was the size of the shock, when it arrived? Well, I used these numbers to approximate the size of future shocks. I know this is a very dodgy procedure with only five data points, but hey, it's the best I could think of. Better suggestions welcome. In practice, what I actually did was a maximum-likelihood estimation to generate parameters for a two-parameter Weibull distribution. I chose Weibull because I needed a one-directional distribution, and Weibull hit the highest likelihood amongst the bunch I tried. With five data points it's going to be impossible to distinguish between distributions anyway. The parameters were resolved as:
alpha 5.21
beta 2.26
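For what it's worth, here is a sketch of that MLE step (via SciPy's weibull_min with the location fixed at zero - not the routine I actually used, and with only five points the fit is loose):

```python
import numpy as np
from scipy.stats import weibull_min

shocks = np.array([2.51, 2.45, 2.26, 0.99, 2.10])  # the five delta-Z values

# Two-parameter Weibull MLE: pin location at 0, fit shape and scale
shape, loc, scale = weibull_min.fit(shocks, floc=0)
print(round(shape, 2), round(scale, 2))  # lands close to the 5.21 / 2.26 above
```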
So later on, I'm going to use that distribution to generate magnitudes of shocks. Check that one off.
Ok. Let's work out when shocks should occur. If you're on the ball, you probably suspected from the fact that I generated a distribution for shock magnitudes that I was going to use a Monte Carlo procedure later on. Well, you're right. Consistent with this, I wanted a probabilistic method of determining whether or not a shock occurs. I wanted to use historic Z values as the baseline assumption for what Z values trigger shocks, but have been unable to make this work yet. I am still working on variations of the model which may make this possible in the future. In any case, I simply did what MC modellers sometimes have to do, and rigged up a loose approximation of the kind of Z value that will trigger a shock: I compared the current Z value with a normally distributed random variable with mean 2 and SD 1/2. A shock is triggered when the current Z value surpasses the random variable.
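That trigger rule is essentially a one-liner; at Z = 2 a shock fires about half the time, tapering off quickly for lower Z:

```python
import random

def shock_triggered(z, rng=random):
    # Shock fires when the current Z value exceeds a draw from N(2, 0.5)
    return z > rng.gauss(2.0, 0.5)

random.seed(42)
trials = 100_000
hits = sum(shock_triggered(2.0) for _ in range(trials))
print(hits / trials)  # about 0.5 by symmetry
```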
Let's look at what we have so far. We have the exponential trend, which we expect to continue until a shock arrives. We have a measure of the likelihood that a shock will arrive in any given time. Also, we have a tentative measure of the magnitude of that shock, were it to arrive. After the shock, we expect oil production to resume an exponential trend. However, we currently have no method of deciding what that trend should be. If you have read this far, then you may suspect that I'll use some dodgy method of approximating this, and you'd be right.
Remember the m and b parameters from the set of straight lines mapped to the historical log of production? Well, if you plot m vs b, you'll see they vaguely allude to a straight line. I obtained the regression equation:
b= -61.93 * m + 2.58
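The regression itself, run on the six (m, b) pairs from earlier (I'm using the rounded values as printed above, so the coefficients come out a touch different from my spreadsheet's):

```python
import numpy as np

# Slopes and intercepts of the six straight-line segments
m = np.array([0.066, 0.079, 0.084, 0.074, 0.038, 0.015])
b = np.array([-1.831, -1.894, -2.429, -2.376, 0.159, 1.774])

slope, intercept = np.polyfit(m, b, 1)
print(round(slope, 1), round(intercept, 2))  # close to -61.93 and 2.58
```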
Ok then. Now we have the timing of the shock and the magnitude of the shock, which yields the first data point after the shock (all shocks were assumed to last one year, for simplicity). And because we know the relationship between m and b, we can also find the new exponential trend.
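Putting the pieces together, one simulated path looks roughly like this. Be warned that sd_resid (the residual standard deviation behind the Z scores) and the Richards-form comparison curve are my stand-ins for spreadsheet internals, and the new-trend step is my reading of "use the m-b relationship": solve y = mx + b and b = -61.93m + 2.58 simultaneously through the post-shock point.

```python
import math
import random

ALPHA, BETA = 5.21, 2.26       # Weibull shape / scale for shock magnitude
B_SLOPE, B_INT = -61.93, 2.58  # regression linking b to m

def comparison_log(t, U=2000.0, inv_k=14.53, n=1.84, t_half=2003.0):
    # Log of the comparison Verhulst (Richards form assumed)
    k = 1.0 / inv_k
    x = math.exp(-k * (t - t_half))
    Q = U / (1.0 + n * x) ** (1.0 / n)
    return math.log((k / n) * Q * (1.0 - (Q / U) ** n))

def simulate(m, log_p, year, end_year, sd_resid=0.15, rng=random):
    """One Monte Carlo path of (year, log production)."""
    path = [(year, log_p)]
    while year < end_year:
        year += 1
        z = (log_p - comparison_log(year)) / sd_resid
        if z > rng.gauss(2.0, 0.5):
            # Shock: knock production back toward the comparison curve...
            dz = rng.weibullvariate(BETA, ALPHA)  # note: scale first in Python
            log_p -= dz * sd_resid
            # ...then pick the new trend line through the post-shock point,
            # honouring b = B_SLOPE * m + B_INT (with x = year - 1900)
            m = (log_p - B_INT) / ((year - 1900) + B_SLOPE)
        else:
            log_p += m  # stay on the current exponential trend
        path.append((year, log_p))
    return path

random.seed(7)
path = simulate(m=0.015, log_p=3.38, year=2004, end_year=2040)
```

Run this 10,000 times and take percentiles of exp(log production) per year to get a table like the one below.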
I'm sure I've forgotten something, but this has gone on long enough, so now we'll just take 10,000 sets of the above-mentioned random numbers and obtain the forecast below. From left to right, the columns are:
YEAR Historic P5 P25 P50 P75 P95 Median
- Code: Select all
1950 3.80
1951 4.28
1952 4.52
1953 4.80
1954 5.02
1955 5.63
1956 6.12
1957 6.44
1958 6.61
1959 7.13
1960 7.66
1961 8.19
1962 8.89
1963 9.54
1964 10.29
1965 11.61
1966 12.62
1967 13.55
1968 14.76
1969 15.93
1970 17.54
1971 18.56
1972 19.59
1973 21.34
1974 21.40
1975 20.38
1976 22.05
1977 22.89
1978 23.12
1979 24.11
1980 22.98
1981 21.73
1982 20.91
1983 20.66
1984 21.05
1985 20.98
1986 22.07
1987 22.19
1988 23.05
1989 23.38
1990 23.90
1991 23.83
1992 24.01
1993 24.11
1994 24.50
1995 24.86
1996 25.51
1997 26.34
1998 26.86
1999 26.40
2000 27.36
2001 27.31
2002 27.17
2003 28.12
2004 29.29
2005 ---- 30.06 29.58 29.24 28.91 28.42 29.28
2006 ---- 30.98 30.19 29.64 29.09 28.30 29.73
2007 ---- 32.11 30.85 29.97 29.09 27.82 30.18
2008 ---- 33.52 31.49 30.08 28.67 26.64 30.65
2009 ---- 34.89 31.85 29.73 27.61 24.57 31.12
2010 ---- 35.44 31.48 28.73 25.97 22.01 31.60
2011 ---- 34.17 29.93 26.98 24.03 19.79 25.00
2012 ---- 31.29 27.79 25.35 22.91 19.41 24.09
2013 ---- 28.39 26.09 24.49 22.89 20.59 24.05
2014 ---- 27.25 25.57 24.41 23.24 21.56 24.24
2015 ---- 27.35 25.70 24.55 23.40 21.75 24.45
2016 ---- 27.65 25.86 24.62 23.37 21.58 24.61
2017 ---- 27.96 25.91 24.50 23.08 21.03 24.66
2018 ---- 28.13 25.77 24.12 22.48 20.12 24.56
2019 ---- 28.01 25.32 23.46 21.60 18.91 24.21
2020 ---- 27.40 24.54 22.56 20.58 17.72 23.17
2021 ---- 26.32 23.53 21.60 19.66 16.87 21.01
2022 ---- 24.92 22.42 20.69 18.95 16.45 20.08
2023 ---- 23.68 21.53 20.04 18.55 16.41 19.69
2024 ---- 22.83 20.94 19.63 18.33 16.44 19.46
2025 ---- 22.28 20.52 19.29 18.07 16.31 19.26
2026 ---- 21.97 20.20 18.97 17.74 15.97 19.05
2027 ---- 21.73 19.86 18.56 17.27 15.40 18.76
2028 ---- 21.31 19.35 17.98 16.62 14.66 18.23
2029 ---- 20.73 18.74 17.35 15.96 13.97 17.52
2030 ---- 20.02 18.06 16.70 15.33 13.37 16.60
2031 ---- 19.18 17.33 16.05 14.76 12.91 15.84
2032 ---- 18.36 16.65 15.46 14.27 12.57 15.24
2033 ---- 17.64 16.08 14.99 13.90 12.34 14.85
2034 ---- 17.08 15.60 14.57 13.54 12.06 14.51
2035 ---- 16.59 15.16 14.17 13.17 11.74 14.19
2036 ---- 16.17 14.75 13.77 12.78 11.37 13.83
2037 ---- 15.72 14.31 13.32 12.34 10.92 13.39
2038 ---- 15.27 13.85 12.87 11.89 10.48 12.91
2039 ---- 14.74 13.35 12.39 11.43 10.04 12.38
2040 ---- 14.19 12.86 11.94 11.01 9.68 11.89
If you're still with me, thanks for journeying. I look forward to hearing all the easy ways I can eliminate the incessant and inherent dodginess from the model. If someone wants me to run other numbers, I can do that too. Alternatively, PM me for the spreadsheet (Excel). Be warned, however, that the spreadsheet is one of the most superior examples of "spaghetti spreadsheeting" that one will ever witness - a dying art in modern times. I will not be held liable for psychiatric bills incurred in trying to decipher it.