Hello guys,
Comments on a couple of points:
At the time, we just attributed it to user bias etc., but your work above confirms it may be due to the modelling method, e.g. the Verhulst function. However, I figure any of these modelling functions may do the same thing, if left to their own devices.
Actually it is even more complicated than that. The two eminent theoreticians of nonlinear regression were Bates and Watts, who published extensively in the late 70s and 80s, before the era of Monte Carlo based techniques. BW derived elegant formulas by approximating the Hessian of the chi-square function (I do not want to sound pedantic, but the underlying theory is not well known) and gave measures to assist the practitioner in understanding the results: the curvature measures of nonlinearity. There are two such measures, the parameter-effects nonlinearity and the intrinsic nonlinearity. The first measures (roughly) how much of the nonlinear behaviour is attributable to the chosen parametrization, and the second the intrinsic nonlinearity of the model (i.e. how well the model is approximated by the linear/quadratic approximation the algorithm uses). With different data (try re-running your simulations after eliminating one or two observations) the predictions will be drastically different. Even letting the algorithm run an extra iteration or two throws us off the mark. Unfortunately this is a
feature of the time series we are working with, and much less a feature of the Verhulst model per se. Without smoothing, the results in terms of reproducibility etc. are even worse.
To complicate matters, estimation using Verhulst (at least in its current parametric form) will probably never fly: the estimated parameters are multicollinear. To illustrate, generate an artificial dataset based on the Verhulst, add random noise and pass it through your solver. You will find that even in the case of known parameters you end up with estimates of the various parameters that are off the mark (usually lower for the URR, lower half-lives, etc.). In the artificial-dataset case, the curvature measures of nonlinearity are better behaved than with the real-world dataset, but there is room for improvement.
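The exercise above can be sketched in a few lines of Python (a stand-in for the Mathematica/solver workflow; the three-parameter logistic form, seed, and all numbers are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

# Fit a logistic (Verhulst) cumulative curve Q(t) = Qinf / (1 + exp(-(t-t0)/tau))
# to synthetic data, then inspect the parameter correlation matrix.
def verhulst(t, Qinf, t0, tau):
    return Qinf / (1.0 + np.exp(-(t - t0) / tau))

rng = np.random.default_rng(0)
t = np.arange(0, 40, 1.0)                           # e.g. years of data
true = (2000.0, 30.0, 5.0)                          # Qinf, t0, tau (arbitrary units)
q = verhulst(t, *true) + rng.normal(0, 10, t.size)  # add observation noise

popt, pcov = curve_fit(verhulst, t, q, p0=(1500.0, 25.0, 4.0))

# Correlation matrix of the estimates: large off-diagonal entries are the
# multicollinearity described above.
sd = np.sqrt(np.diag(pcov))
corr = pcov / np.outer(sd, sd)
print(popt)
print(corr)
```

When the series stops before saturation, the Qinf-t0 correlation tends to be large in magnitude, which is exactly why the URR estimate wanders.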
BTW DO NOT USE EXCEL - it has been proven again and again that its statistical routines are unreliable (there was a big fuss a few years ago in the statistical literature about the pitfalls of the various statistical packages). Mathematica is immune, but the inferences are more unstable if one:
- uses machine precision arithmetic
- uses high precision and accuracy in the data (in the artificial-dataset case, using higher accuracy in the artificial dataset resulted in catastrophic performance of the estimation procedure). What I do is fix the accuracy of the input data to 2 or 3 digits and then use a much higher tolerance in the numerical calculations AND arbitrary-precision arithmetic AND symbolic derivatives
- lets the system run forever ... one needs to restrain the fitting process to run for only a few iterations, otherwise it tends to locate curves with unrealistic parameters (the best curve in terms of least squares gives a URR close to fifty thousand GBa!)
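A rough illustration of these restraints (rounded input data, bounded parameters, a cap on function evaluations), sketched with scipy rather than Mathematica; the bounds, starting values and data are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

def verhulst(t, Qinf, t0, tau):
    return Qinf / (1.0 + np.exp(-(t - t0) / tau))

rng = np.random.default_rng(1)
t = np.arange(0, 38, 1.0)
q = verhulst(t, 2000.0, 30.0, 6.0) + rng.normal(0, 5, t.size)
q = np.round(q, 2)                       # fix the accuracy of the input data

# Bounds act as a crude restraint: the solver cannot wander off to a URR of
# tens of thousands of units, and maxfev caps the work it is allowed to do.
popt, _ = curve_fit(
    verhulst, t, q,
    p0=(1000.0, 28.0, 5.0),
    bounds=([500.0, 10.0, 1.0], [5000.0, 60.0, 20.0]),
    maxfev=1000,
)
print(popt)
```

This is not a substitute for arbitrary-precision arithmetic or symbolic derivatives, but it captures the "restrain the solver" idea.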
So to summarize:
1) The dataset we are dealing with is crappy. It is not just the quality of the data, but also the fact that the raw data is not of the form Signal + N(0, s). The error term each year is the sum of the errors of the previous years, hence one of the important assumptions of NLS (uncorrelated errors) is violated.
2) The Verhulst does not lend itself to a straightforward NLS exercise. Even when you generate data from known versions of the process, the estimation procedure is characterised by extremely high CMNL (curvature measures of nonlinearity) values, and one should try to rectify that.
3) Possible rectifications:
a) reparametrization of the Verhulst (I think this is where the money is). This is addressed later.
b) variance-stabilizing transformations of the data (although my smoothing did stabilize variance, and in the artificial datasets I used, the variance was pretty much stable)
c) doing away with the Verhulst altogether, even though:
- it has a sound theoretical basis in ecosystem biology, dealing with the utilization of finite resources
- the URR is a parameter of the equation; in other modelling approaches one would probably have to integrate over the realizations of the process to get an idea of the URR
d) Kalman filtering approaches or Markov Chain approaches using the current form of the Verhulst equation
e) working with the stochastic differential version of the Verhulst and using Brownian-motion tools to estimate the parameters. A related thread would be to use the equations in Roper's paper that deal with variable extraction rates, and then apply the Expectation-Maximization algorithm to calculate the integrals of the depletion rate at each point.
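Point 1) about correlated errors is easy to demonstrate numerically: if each yearly figure carries an independent error, the cumulative series carries the running sum of those errors, which behaves like a random walk. A small sketch with synthetic errors only:

```python
import numpy as np

rng = np.random.default_rng(2)
years = 200
yearly_err = rng.normal(0.0, 1.0, years)   # iid N(0, s) errors per year
cumulative_err = np.cumsum(yearly_err)     # error carried by the cumulative data

def lag1_autocorr(x):
    # lag-1 sample autocorrelation of a series
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

print(lag1_autocorr(yearly_err))      # small: yearly errors are uncorrelated
print(lag1_autocorr(cumulative_err))  # large: cumulated errors are a random walk
```

So an NLS fit to cumulative production quietly violates the uncorrelated-errors assumption even when the yearly reporting errors are perfectly well behaved.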
If you think these points are worthwhile pursuing, PM me to see if we can work something out (for the next couple of months I am horribly busy with my certification exams in Internal Medicine). It is a very interesting applied-statistics problem, on top of its obvious importance for all of us.
A good solution could be to perform a Bayesian fit, as you suggested, by using a prior probability distribution for the URR that will constrain the URR to remain within a valid range. Another solution could also be to make assumptions about the expected depletion rate.
Both valid points ... The theory of nonlinear regression and Bayesian assessment of the results has been worked out by Bates and Watts. Using locally non-informative priors they derived formulas for the posterior distributions of the various parameters; these formulas are listed in the publications I reference in this email. When I tried to visualize them mentally (4D space!) the results were extremely complicated. Given the intrinsic multicollinearity of Verhulst-based parameter estimation, I do not think further progress can be made using the current parametrization. One way out would be the following exercises:
1) Expand the VF as a finite power series in the vicinity of n = 1, 2, 3 ... j (using Maple or Mathematica) to get n out of the exponents. A low-order series approximation can give results within 10% of the theoretical values for n up to 3-4 times the point of expansion. Playing with artificial datasets, the inference on the parameters seems improved, but the garden-variety solver would be inadequate. I suspect a two-step procedure combining gradient descent and symbolic expansion at the estimated value of n would be necessary.
2) Reparametrize the Verhulst so that instead of Qinf, n, τ, t1/2 we estimate the terms Qinf/(n·τ), Exp[τ/t1/2], 1/τ, n. Then, using the asymptotic formulas for the HPDs, Bayesian integration, and the PDF transformation theorem, extract the probability density values of the parameters we have to use in our response to PO (i.e. Qinf, n, τ, t1/2).
3) Do a full Bayesian analysis with proper (not necessarily conjugate) priors inside BUGS or your favourite Monte Carlo software.
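As a rough sketch of option 3), here is a hand-rolled random-walk Metropolis sampler with a flat prior box on the parameters, so the URR cannot escape to absurd values. The data, prior ranges and step sizes are illustrative assumptions, not recommendations; in practice one would use BUGS or similar:

```python
import numpy as np

def verhulst(t, Qinf, t0, tau):
    return Qinf / (1.0 + np.exp(-(t - t0) / tau))

rng = np.random.default_rng(3)
t = np.arange(0, 40, 1.0)
q = verhulst(t, 2000.0, 30.0, 5.0) + rng.normal(0, 10, t.size)

def log_post(theta):
    Qinf, t0, tau = theta
    if not (500 < Qinf < 5000 and 0 < t0 < 80 and 0.5 < tau < 30):
        return -np.inf                        # flat prior on a bounded box
    resid = q - verhulst(t, Qinf, t0, tau)
    return -0.5 * np.sum(resid**2) / 10.0**2  # Gaussian likelihood, sigma = 10

theta = np.array([1500.0, 25.0, 4.0])
lp = log_post(theta)
samples = []
for _ in range(20000):
    prop = theta + rng.normal(0, [20.0, 0.3, 0.1])  # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:        # Metropolis accept/reject
        theta, lp = prop, lp_prop
    samples.append(theta[0])                        # track the URR (Qinf) samples

urr = np.array(samples[5000:])                      # drop burn-in
print(urr.mean(), np.percentile(urr, [2.5, 97.5]))
```

The prior box here is the crudest possible "valid range" constraint on the URR; a proper informative prior would go in `log_post` in its place.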
I do not know which of (1)-(3) is best ... but even with an approach as simplistic as the Verhulst equation it is not going to be easy. Not to mention the next step, which would be a Lotka-Volterra simulation of the effect of PO on global population and infrastructure (to see how many people have to be executed in the next ASPO newsletter
:wink:
)
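For anyone wanting to play with the series-expansion idea in (1), here is a sketch in sympy; the particular generalized Verhulst form with exponent n is an assumption on my part:

```python
import sympy as sp

# Generalized Verhulst with an exponent n; Taylor-expand in n around an
# integer n0 so that n leaves the exponent and enters the model polynomially.
t, tau, t0, Qinf, n = sp.symbols('t tau t0 Qinf n', positive=True)
base = 1 + sp.exp(-(t - t0) / tau)
Q = Qinf / base**n

n0 = 1                                      # point of expansion
series = sp.series(Q, n, n0, 3).removeO()   # expansion to second order in (n - n0)
print(sp.simplify(series))

# Numerical check of the approximation quality at a sample point:
vals = {t: 10.0, t0: 1.0, tau: 5.0, Qinf: 1.0, n: 1.5}
exact = float(Q.subs(vals))
approx = float(series.subs(vals))
print(exact, approx)
```

Even the second-order expansion stays close to the exact curve near the expansion point, which is what makes the two-step (gradient descent + re-expansion) procedure plausible.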
My 2 c on the whole business of depletion modeling:
1) We as a human species control the dynamics of depletion from now on. What is not mentioned (or not understood) in any of the sites/papers/posts I have read is that all the parameters are conditional on the concept of business as usual (i.e. continuation of extraction etc.). If we stop pumping oil tomorrow, then irrespective of geology, depletion will not continue and the peak (as well as oil itself) becomes irrelevant. Even the 1-2% depletion rate is not particularly relevant, because the figure is calculated on the basis of a "business as usual" modus operandi; the more relevant figure is the input to the transformation of the energy infrastructure ACROSS the globe (and that figure can increase every year even in the face of decreasing global consumption). Leaving the oil industry to market forces would be folly at this point. I'm particularly worried that the big players will try to do something foolish, i.e. embark on a resource war, and then we are hosed. This is why the collapse of the Vikings in Greenland is a relevant historical precedent (i.e. failure to change one's ways in the face of changing external constraints).
2) The whole Olduvai business. I like Duncan's work ... simple but not simplistic as a guiding principle. There is an underlying assumption, though, that it is business as usual and that no one can affect things. In mathematical terms, Duncan's energy-per-capita ratio is basically a composite measure calculated in terms of a Lotka-Volterra evolution law (predator = human, prey = energy). The Lotka-Volterra approximation is a highly non-linear equation, with many interesting regimes of stability between predator and prey, and as with the PO/Verhulst approach it assumes that predation continues even in the face of imminent disaster (i.e. a peak in the energy extraction rate). However, if the predation ratio is decreased and alternative energy forms are found, then a die-off is not unavoidable. Even if one has to use the "prime food" to create a secondary food source (i.e. oil to fuel the transition to renewables), disaster is not guaranteed (in fact I don't know whether this has been studied; it would be a highly non-linear coupled system of differential equations). There are outcomes that are not dismal, but hard political and personal choices have to be made (i.e. move the parameters of the equations into favourable subspaces). However, this will never occur if fatalism prevails; when you are depressed, there can be no way out. And the bad thing is that people who are aware tend to see whatever they want to see and disregard the theory if it does not agree with their gloomy view. I believe that this is the worst hypocrisy (or maybe a human response?): decide that nothing can be done NOW and create a fantasy world with concentration camps, semi-automatic rifles and nuclear depopulation. These might be plausible scenarios if nothing is done ... but basic infrastructure guaranteeing the survival of 6 billion people while descending is not energetically very expensive. A limit on energetically expensive personal choices needs to take place (i.e. carbon quotas etc.), but outside the market system.
This is a war-based mentality, but it is war and extinction we are facing. It is the fat that has to go, particularly here in NA and then in the EU.
3) The other danger is the use of mathematics to refute PO (i.e. home in on the difficult mathematics to deny the existence of the phenomenon, or push it into the future). The burden falls on the technically/mathematically literate to defuse this bomb. It will not be long before an economist uses an NLS fit to create an unrealistic (but still mathematically plausible) fantasy world of 50,000 GBa to avoid the unavoidable deconstruction of the market system. It is either the market system or us, unfortunately (and I was a big believer until I delved into the physically implausible nature of the Cobb-Douglas production functions).
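On the Lotka-Volterra reading in point 2), a toy predator-prey sketch (a population feeding on a finite, non-regenerating energy stock) reproduces the overshoot-and-decline behaviour. All coefficients here are illustrative assumptions, nothing is calibrated:

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y, a, b, c):
    E, P = y                      # energy stock, population
    dE = -a * E * P               # extraction proportional to stock and population
    dP = b * E * P - c * P        # growth from energy use minus baseline decline
    return [dE, dP]

sol = solve_ivp(rhs, (0.0, 200.0), [1.0, 0.1],
                args=(0.02, 0.05, 0.01), max_step=0.5)
E, P = sol.y
print(P.max(), P[-1])             # the population overshoots, then declines
```

Adding a second, renewable "food source" term to dP is exactly the coupled-equations question raised above; the point of the sketch is only that with a non-regenerating prey, the peak-then-decline trajectory is built in.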
Anyway, the technical references (anyone can PM me if they want them in digital format, i.e. PDF/DJVU):
- R. Dennis Cook, Miriam L. Goldberg, "Curvature for Parameter Subsets in Nonlinear Regression", Annals of Statistics 14(4): 1399-1418 (1986)
- Douglas M. Bates, Donald G. Watts, "Relative Curvature Measures of Nonlinearity", Journal of the Royal Statistical Society B 42(1): 1-25 (1980)
- Douglas M. Bates, Donald G. Watts, Nonlinear Regression Analysis and Its Applications, John Wiley and Sons, ISBN 0-471-81643-4 (1988)