Mccoy
Geotechnical
- Nov 9, 2000
- 907
BigH wrote in one of the latest threads:
For sands, a paper years ago in Ground Engineering suggested picking three or four of the available analyses (the author reviewed something like 15), then taking the average of them. Arithmetic average? Nominal average? Geometric average? I'll let that be as per another thread in the main forums!
As a statistics buff, I'll bite Howard's bait and say whatever comes to my mind on this point.
First and foremost, we must assume that the sample is homogeneous, which is to say that all methods are equal, i.e. they have no bias, they have been shown to be reliable, they are suited to all conditions...
We know well this is not true; so we should analyze the model and pick 3-4 methods which have been shown to be reliable in the specific case under examination.
We further assume that the results are random, i.e. there is no bias due to the source of measurements (lab tests, SPT, CPT...), to stratigraphy, to the presence of water...
We end up with a small sample in a non-parametric context (we don't know whether a specific distribution function suits the model).
I've no real-case data at hand (I usually average values of bearing capacity, not settlements), but as a rule the arithmetic mean is never lower than the geometric mean for positive values, since the latter damps the high extremes while the former lets them in at full weight. Of course, outliers are suspicious values per se, so sometimes they are excluded a priori.
In the end, I believe it's a matter of personal judgment (the subjectivity which is such an important part of the geotechnical engineer's job) to decide whether to rule out the outlier(s), to rule 'em in because we want to include that possibility, or to rule 'em out because we want an average which is not very sensitive to extremes...
If the data are tightly clustered, the arithmetic and geometric means differ very little.
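To put rough numbers on the arithmetic versus geometric point, here is a minimal Python sketch. The settlement values are invented purely for illustration (they are not real-case data), and the helper names are my own.

```python
import math

def arithmetic_mean(values):
    """Plain average: every value, extremes included, gets full weight."""
    return sum(values) / len(values)

def geometric_mean(values):
    """Exponential of the mean log; only valid for strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical settlement estimates (mm) from four methods; numbers are made up.
clustered = [22.0, 24.0, 25.0, 27.0]
with_outlier = [22.0, 24.0, 25.0, 60.0]

for label, data in (("clustered", clustered), ("with outlier", with_outlier)):
    am = arithmetic_mean(data)
    gm = geometric_mean(data)
    print(f"{label}: arithmetic = {am:.2f} mm, geometric = {gm:.2f} mm, "
          f"gap = {am - gm:.2f} mm")
```

With the clustered set the two means should be nearly identical, while the single high value in the second set pulls the arithmetic mean up noticeably more than the geometric one.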
If we want to go deeper into the subject, I'd say that, in this specific case with very small samples, Bayesian analysis might do a very reasonable job, where the average value is "corrected" by our prior knowledge of the model. One example might be as follows: let's figure out all the possible values of settlements by all the methods we know of; the values will make up the range of all possible settlements in our model (the Bayesian prior); let's then calculate the settlements by our one or two chosen methods; this will be what we think is the "actual value", or most likely value, or Bayesian likelihood. Combining the two by Bayes' theorem, we end up with a posterior value (or distribution), which is our final value to be used in calcs.
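One possible way to sketch this in Python is a conjugate normal-normal update. This is only an illustration under strong assumptions that go beyond what is said above: the prior and the likelihood are both taken as normal, the scatter of the chosen methods (`obs_sd`) is assumed known, and every number is invented. The function name `normal_posterior` is hypothetical.

```python
import statistics

def normal_posterior(prior_mean, prior_sd, observations, obs_sd):
    """Normal prior + normal likelihood with known obs_sd -> normal posterior.

    Returns (posterior mean, posterior standard deviation).
    """
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2   # weight of the prior knowledge
    data_precision = n / obs_sd ** 2        # weight of the chosen methods
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * statistics.fmean(observations))
    return post_mean, post_var ** 0.5

# Hypothetical settlements (mm) from all known methods: these define the prior.
all_methods = [15.0, 18.0, 22.0, 24.0, 25.0, 27.0, 33.0, 40.0]
prior_mean = statistics.fmean(all_methods)
prior_sd = statistics.stdev(all_methods)

# Hypothetical results from the two methods judged reliable for this case.
chosen = [22.0, 24.0]
obs_sd = 5.0  # assumed scatter of the chosen methods, not a measured value

post_mean, post_sd = normal_posterior(prior_mean, prior_sd, chosen, obs_sd)
print(f"prior:     {prior_mean:.1f} mm (sd {prior_sd:.1f})")
print(f"posterior: {post_mean:.1f} mm (sd {post_sd:.1f})")
```

The posterior mean lands between the prior mean and the average of the chosen methods, weighted by how much we trust each. Note that in this toy setup the chosen methods also contribute to the prior, so a more careful treatment would probably keep the two sets separate.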
If the above isn't clear, please don't worry; it's rather specialized stuff, which I adapted to this specific situation, and it might well be the subject of a technical paper (I don't know if something similar has been done) or, conversely, of sensible criticism.
Please note that the issue of means would apply more practically and simply to lab or field measurements of specific parameters, but here we should include a few more aspects like spatial correlation.
By the way, I think in our group there is one of the world's leading researchers in the statistical study of random fields (Gordon Fenton, who has also published detailed papers on elastic settlements, available on the web).
If he'd like to chime in, his opinions would be most valuable (and please, feel free to criticize whatever I said above, I'll take no offence, guaranteed!)