Menu
 News
 Search
 Topics
 Stories Archive
 Submit News
 Discussions
 Code of Conduct
 Forums
 Forum Search
 Last 24 Hours
 PO 24hrs
 Peak Blog
 Ask Jane
 Resources
 About Us
 Downloads
 Web Links
 PeakWiki
 PeakPortal
 Focus Search
 Peak TV
 Peak Oil Boston
 Houston Peak Oil
 Follow on Twitter
 Members
 User Panel
 Members List
 PO Team
 JOIN!
 Private Messages
 
Support PeakOil.com
Visit Our Advertisers
 
Light Sweet Crude Oil
 




Post new topic Reply to topic  [ 2 posts ] 
Author Message
 Post subject: Bootstrapping Technique Applied to the Hubbert Linearization
New postPosted: Tue Jan 10, 2006 2:09 pm 
Offline
Moderator
Moderator

Joined: Mon Sep 27, 2004 12:00 am
Posts: 934
Location: Canada
Motivation:

This thread is a continuation of How Reliable is the Hubbert Linearization Method?. The main motivation here is again to gain some insight in the robusteness, the limitations of the curve fitting aproach. One main issue is the lack of confidence intervals on crucial parameters (Ultimate Recoverable Ressource, growth rate and peak date).

What is bootstrapping? it's a way to repeat an experiment. With the Bootstrap, a new set of experiment is not needed, the original data is reused. Specifically, the original observations are randomly reassigned and the estimate is recomputed. These assignments and recomputations are done a large number of times and considered as repeated assignements.

Why is the bootstrap attractive? It can answer many questions with very little in the way of modelling, assumptions and can be applied easily.

In general, the bootstrap is a methodology for answering the following question: How accurate is a parameter estimator?. Bootstrapping is a fairly new method in statistics and has been proposed by Efron in 1979. The R software (open source :)) has a bootstrap library called 'boot' that implements almost all the standard techniques. I won't go into the details of the Bootstrap theory, a lot of details about the techniques and the R language implementation used in this post can be found in the following document:

John Fox. Bootstrapping Regression Models, Appendix to an R and S-plus Companion to Applied Regression

Application:

I restate the Hubbert linearization equation:
Code:
P(t)/Q(t)= K(1 - Q(t)/URR)

where Q(t) is the cumulative production at time t, K is the growth rate and URR=Q(t=+inf) (for a good discussion on this approach: Another Way of Looking at CERA).

The BP production data are used for the world production.

For each starting year Y we randomly generate 2,000 new samples (called bootstrap replicates) taken in the time range [Y:2004] on which a robust least-square fitting is perfomed (function rlm).

[align=center]Image
large version
(a) URR
Image
large version
(b) K
Histograms and normal quantile-comparison plots for the bootstrap replications of the URR (a) and K (b). The broken vertical line in each histogram shows the original value for the model fit to the original sample.[/align]

The confidence intervals are derived from the bootstrap replicates using the so-called bias-corrected accelerated method (BCa). The R script is given in the post below and has produced the following numerical results.

Code:
   Y    URR(50%)U URR(50%)L  K(50%)L  K(50%)U    URR(90%)L URR(90%)U  K(90%)L   K(90%)U   URR(95%)L URR(95%)U  K(95%)L     K(95%)U
1  1940 1589.852 1666.108 0.06553184 0.06803704 -1047.8955 1722.340 0.06229963 0.07005338 -2115.754 1738.640 0.05931496 0.07057659
2  1945 1534.121 1604.606 0.06716867 0.06970711  -909.1412 1655.086 0.06448430 0.07143187 -1694.621 1674.993 0.06099418 0.07215952
3  1950 1474.262 1535.714 0.06936133 0.07203309  1415.9148 1580.065 0.06720602 0.07404873  1371.762 1593.248 0.06624411 0.07463725
4  1955 1411.245 1479.824 0.07170182 0.07469338  1358.8073 1529.928 0.06909329 0.07690102  1332.695 1570.994 0.06755400 0.07777587
5  1960 1344.488 1429.857 0.07396238 0.07755941  1288.4431 1999.462 0.05258170 0.08004962  1269.352 2069.939 0.05197241 0.08069580
6  1965 1251.694 1362.724 0.07767415 0.08219249  1127.0189 2032.670 0.05220087 0.08584747  1048.126 2077.450 0.05171007 0.08775371
7  1970 1867.918 2100.396 0.05173704 0.05643942  1248.4862 2169.640 0.05066231 0.08498707  1150.549 2186.294 0.05038610 0.08823982
8  1975 2016.554 2132.814 0.05118828 0.05302772  1827.8085 2199.954 0.05023261 0.05699570  1443.484 2219.633 0.04985585 0.07501213
9  1980 2048.701 2138.568 0.05113436 0.05246769  1920.1593 2201.945 0.05022459 0.05480756  1837.808 2223.453 0.04995022 0.05584261
10 1985 2124.600 2202.525 0.05005941 0.05111410  2015.7552 2273.314 0.04894792 0.05221503  1840.982 2353.670 0.04681022 0.05449696

from which we derive the following figures:
[align=center]Image
Image
[/align]
We can also look at the correlation between the bootstrap replicates of the K and URR coeffcients:
[align=center]Image
Scatterplot of the bootstrap replications of the URR and K coefficients for the BP data. The concentrations ellipse are drawn at 50, 90, and 99% level using a robust estimate of the covariance matrix of the coefficients[/align]
Discussion:

    1- The 95% confidence intervals for the URR and K using data from 1980 to 2004 is [1838.0 2223.0] Gb and [5.0 5.6]% respectively.
    2- Not surprisingly, the confidence intervals are strongly affected by the 70's hump in production. The 1965-1970 period constitutes a transition or shock period between two production regimes.
    3- Estimates from data starting around 1970 are fairly stable
    4- the parameter K is negatively correlated with the URR (higher K values will give lower URR)
    5- Many variations can be made, for instance in the way the robust least-square is applied

_________________
______________________________________
http://GraphOilogy.blogspot.com


Last edited by khebab on Tue Jan 10, 2006 7:07 pm, edited 3 times in total.

Top
 Profile  
 
 Post subject: Re: Bootstrapping Technique Applied to the Hubbert Lineariza
New postPosted: Tue Jan 10, 2006 2:10 pm 
Offline
Moderator
Moderator

Joined: Mon Sep 27, 2004 12:00 am
Posts: 934
Location: Canada
Enjoy! :)
Code:
####################################################################
##
##  Script that illustrate bootstrapping techniques applied
##  to the Hubbert linearization of the total world production.
##
##  see the thread http://www.peakoil.com/fortopic16349.html for more details
## 
##  Author: Khebab (January, 2006)
##  version 1.0
##
####################################################################

rm(list=ls(all=TRUE))   # clean up
##
##  Load matlab emulator package and gremisc (to install)
##  in the menu bar: packages/install pacakage(s)
##
library(matlab)
library(gregmisc)
library(boot)
library(MASS)
library(car)

####################################################################
##
##              Main
##
####################################################################

#
#   World oil production since 1901 up to 2005 (from BP) in Gb (all liquids)
#
temp <- scan()
2.0000000e-001  1.6740000e-001  1.8180000e-001  1.9490000e-001  2.1800000e-001  2.1510000e-001 
2.1330000e-001  2.6420000e-001  2.8560000e-001  2.9870000e-001  3.2780000e-001  3.4440000e-001 
3.5240000e-001  3.8530000e-001  4.0750000e-001  4.3200000e-001  4.5750000e-001  5.0290000e-001 
5.0350000e-001  5.5590000e-001  6.8890000e-001  7.6600000e-001  8.5890000e-001  1.0157000e+000 
1.0143000e+000  1.0679000e+000  1.0968000e+000  1.2630000e+000  1.3250000e+000  1.4860000e+000 
1.4100000e+000  1.3730000e+000  1.3100000e+000  1.4420000e+000  1.5220000e+000  1.6540000e+000 
1.7920000e+000  2.0390000e+000  1.9880000e+000  2.0860000e+000  2.1500000e+000  2.2210000e+000 
2.0930000e+000  2.2570000e+000  2.5930000e+000  2.5950000e+000  2.7450000e+000  3.0220000e+000 
3.4330000e+000  3.4040000e+000  3.8030000e+000  4.2830000e+000  4.5190000e+000  4.7980000e+000 
5.0180000e+000  5.6260000e+000  6.1250000e+000  6.4390000e+000  6.6080000e+000  7.1340000e+000 
7.6613500e+000  8.1942500e+000  8.8877500e+000  9.5374500e+000  1.0285700e+001  1.1070450e+001 
1.2620000e+001  1.3550000e+001  1.4760000e+001  1.5930000e+001  1.7540000e+001  1.8560000e+001 
1.9590000e+001  2.1340000e+001  2.1400000e+001  2.0380000e+001  2.2050000e+001  2.2890000e+001 
2.3120000e+001  2.4110000e+001  2.2980000e+001  2.1730000e+001  2.0910000e+001  2.0660000e+001 
2.1050000e+001  2.0980000e+001  2.2070000e+001  2.2190000e+001  2.3050000e+001  2.3380000e+001 
2.3900000e+001  2.3830000e+001  2.4010000e+001  2.4110000e+001  2.4500000e+001  2.4860000e+001 
2.5510000e+001  2.6340000e+001  2.6860000e+001  2.6400000e+001  2.7360000e+001  2.7310000e+001 
2.7170000e+001  2.8120000e+001 

vWorldProduction <- matrix(temp, ncol=1 , byrow= F)


mProductionData <- cbind(cbind(c(1901:2004), vWorldProduction), cumsum(vWorldProduction));
mProductionData <- cbind(mProductionData, mProductionData[,2]/mProductionData[,3])

colnames(mProductionData , do.NULL = FALSE)
colnames(mProductionData )<- c("Year","Prod","CumProd","PQ")
frProductionData <- data.frame(year= mProductionData[, 1],  prod = mProductionData[,2], cumprod= mProductionData[,3], pq= mProductionData[,4])

head(frProductionData)
#
#   Function that apply the Hubbert linearization technique
#
boot.RLS.twoparam <- function(data, indices, maxit=20) {
   data <- data[indices,]
   mod <- rlm(pq ~ cumprod, data=data, maxit=maxit, method= "MM")
   v <- coefficients(mod)
   c(-v[1]/v[2], v[1])
}

matplot(x= frProductionData$cumprod, y= frProductionData$pq, pch = 1:4, type = "l", add= F, ylab= "P/q", xlab= "Q")
years <- c(1940,1945,1950,1955,1960,1965,1970,1975,1980,1985)
#years <- c(1940:1985)
Results.confidenceInterval <- data.frame(year= NA,
             URRMin1= NA, URRMax1= NA, KMin1= NA, KMax1= NA,
      URRMin2= NA, URRMax2= NA, KMin2= NA, KMax2= NA,
      URRMin3= NA, URRMax3= NA, KMin3= NA, KMax3= NA)
m <- 1
#
#   The bootstrapping technique is applied for different year intervals
#   from startingYear to 2004.
#
for(startingYear in years ) {
   
   idx <- find(frProductionData$year == startingYear)
   idx2 <- c(idx[1]:104)
   frSubProductionData <- data.frame(year= mProductionData[idx2 , 1],  prod = mProductionData[idx2 ,2],
             cumprod= mProductionData[idx2 ,3], pq= mProductionData[idx2 ,4]);
   # bootstrapping application
   Results.boot <- boot(frSubProductionData , boot.RLS.twoparam , 2000, maxit= 200)
   # confidence intervals calculation
   tempURR <- boot.ci(Results.boot, conf= 0.5, index= 1, type= c("perc","bca"))
   tempK <- boot.ci(Results.boot, conf= 0.5, index= 2, type= c("perc","bca"))
   tempURR2 <- boot.ci(Results.boot, conf= 0.9, index= 1, type= c("perc","bca"))
   tempK2 <- boot.ci(Results.boot, conf= 0.9, index= 2, type= c("perc","bca"))
   tempURR3 <- boot.ci(Results.boot, conf= 0.95, index= 1, type= c("perc","bca"))
   tempK3 <- boot.ci(Results.boot, conf= 0.95, index= 2, type= c("perc","bca"))
   Results.confidenceInterval[m,]= c(startingYear, tempURR$bca[4:5], tempK$bca[4:5]
      , tempURR2$bca[4:5], tempK2$bca[4:5], tempURR3$bca[4:5], tempK3$bca[4:5])

   plot(Results.boot, index=1)
   #jack.after.boot(Results.boot, index=1, main='URR')
   
   m <- m + 1
}
Results.confidenceInterval
write(t(Results.confidenceInterval), file="ConfidenceIntervals.txt")

_________________
______________________________________
http://GraphOilogy.blogspot.com


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 2 posts ] 


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Jump to:  
Atom News Feed   Forums RSS Feed