
Recreating a histogram from statistical moments

Status
Not open for further replies.

rih5342 (Marine/Ocean)
May 8, 2007

There's a lot of information available about generating moments from a histogram, but I haven't found a clear example of going the other way: using the statistical moments to approximate the original histogram.

Is there a function that approximates a histogram from its statistical moments?
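For concreteness, the forward direction (histogram to moments) that the question contrasts with is a short computation. A minimal Python sketch, with made-up bin counts and centers:

```python
import numpy as np

# Hypothetical histogram: bin counts and bin centers (made-up numbers).
counts = np.array([2.0, 5.0, 9.0, 5.0, 2.0])
centers = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

p = counts / counts.sum()  # empirical probabilities per bin
mean = np.sum(centers * p)

# Central moments of order 2..4, then the usual normalized shape statistics.
var, mu3, mu4 = [np.sum((centers - mean) ** n * p) for n in range(2, 5)]
skew = mu3 / var ** 1.5
kurt = mu4 / var ** 2
```

The question is how to invert this: given mean, var, skew, kurt (and maybe more), recover something like `counts`.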



 

I don't think the transformation is reversible.

Mike Halloran
Pembroke Pines, FL, USA
 
There's certainly one way.

Monte Carlo a distribution of, say, 100 random numbers (the population size will depend on what bin size you want). Do a sensitivity analysis of the moments to each member, kill the ones that are giving you a low score, replace them with new random numbers, and rinse and repeat.

Having said that, since you can do it analytically for just the first two moments (mean, SD), I can't see why it would be impossible for higher-order moments (kurtosis et al.).
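A simplified sketch of that recipe in Python: instead of scoring every member, this accept-if-better loop replaces one random member at a time and keeps the change only when the moment error drops. The target moments are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_moments(x):
    # mean, variance, skewness, kurtosis of a sample
    m = x.mean()
    s = x.std()
    return np.array([m, s ** 2,
                     ((x - m) ** 3).mean() / s ** 3,
                     ((x - m) ** 4).mean() / s ** 4])

# Made-up target moments: mean 0, variance 1, skewness 0.5, kurtosis 3.5.
target = np.array([0.0, 1.0, 0.5, 3.5])

pop = rng.normal(0.0, 1.0, 100)  # initial population of 100 random numbers
init_err = np.sum((sample_moments(pop) - target) ** 2)

err = init_err
for _ in range(5000):
    trial = pop.copy()
    i = rng.integers(len(pop))       # pick a member to replace
    trial[i] = rng.normal(0.0, 1.5)  # new random number
    trial_err = np.sum((sample_moments(trial) - target) ** 2)
    if trial_err < err:              # keep only improvements; rinse and repeat
        pop, err = trial, trial_err

final_err = err
```

Bin `pop` at whatever bin size you want and you have an approximate histogram with roughly the right moments.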





Cheers

Greg Locock


New here? Try reading these, they might help FAQ731-376
 
Only to the extent that a standard deviation characterizes a normal distribution, i.e., a normal distribution will have specific moments, which means you can deduce that a set of moments corresponds to that distribution. But given a single standard deviation, there are lots of distributions that could produce it.

Moreover, histograms are not necessarily the same as the distributions from whence they came, so a histogram's moments will tend to be ambiguous with respect to the underlying distribution, since the histogram is not a perfect replica of it.

TTFN
faq731-376
7ofakss

Need help writing a question or understanding a reply? forum1529
 
Going out on a limb here, but there is such a thing as a moment-generating function.


This function gives the raw moments of a distribution from its derivatives evaluated at t = 0 (the central moments then follow by shifting by the mean).

Now you have the moments but not the function if I understand correctly.

In the wiki link there are several examples of distributions with their moment-generating functions.

If you differentiate them with respect to t and evaluate at t=0 they give the moments.

You might find some kind of match to the moments you have considering a scaling factor.

Sounds painful though.
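To illustrate the differentiate-at-t=0 step numerically (not from the thread; the mu and sigma values are made up), here the normal MGF M(t) = exp(mu t + sigma^2 t^2 / 2) is differentiated by finite differences to recover its raw moments:

```python
import math

mu, sigma = 1.5, 2.0
M = lambda t: math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)  # normal MGF

h = 1e-4  # finite-difference step
m1 = (M(h) - M(-h)) / (2 * h)            # ~ M'(0)  = E[X]   = mu
m2 = (M(h) - 2 * M(0) + M(-h)) / h ** 2  # ~ M''(0) = E[X^2] = mu^2 + sigma^2
```

Matching a measured set of moments against a family of MGFs would mean running this in reverse, which is the painful part.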
 
Interesting question. It appears that the more general question (can you invert the function that calculates moments from a measure?) is called the "Moment Problem". There are all sorts of studies into whether this inversion exists, and if it does, whether it is unique. I think in your case, you can afford to simplify the problem with the goal of actually calculating it at the end of the day.

I'm going to make a number of simplifications that may or may not apply in your scenario, but should help you make your own simplifications:

1. The random variable is discrete. The fact that you're talking about a histogram might satisfy this one nicely.
2. The range is finite. In fact, let's assume there are just 5 samples from 0 to 4. You can always scale appropriately.
3. The moments have all been calculated according to: u_n = integral( (x-c)^n * f(x), dx) which in the discrete case is re-written as a summation: u_n = sum(from 0, to 4, (x-c)^n * f(x)), where u_n is the nth moment, c is some central value, f(x) is your original (unknown) function and x is the sample. Often different order moments have slightly different calculating functions to normalise them or otherwise make them more useful. It shouldn't be a problem to revert them all to the same generating function.
4. c in the previous function is always 2. If it's not, that's no problem, it just has to be substituted appropriately.

With these assumptions, the derivation follows quite simply as a set of simultaneous equations:

u_0 = f0 + f1 + f2 + f3 + f4 (where the notation fn is used instead of f(n) for brevity)
u_1 = -2f0 - f1 + f3 + 2f4
u_2 = 4f0 + f1 + f3 + 4f4
u_3 = -8f0 - f1 + f3 + 8f4

Which are all linear, so you get all the normal results: there is a unique solution if you have as many moments as samples, and there are infinitely many solutions if you have fewer moments than samples.
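Under assumptions 1-4, the unique-solution case is a direct linear solve. A sketch with NumPy, using a made-up f(x) to demonstrate the round trip (five moments u_0..u_4 for the five unknowns f0..f4):

```python
import numpy as np

# Coefficient matrix for u_n = sum over x of (x - 2)^n * f(x), x = 0..4.
x = np.arange(5)
A = np.array([(x - 2) ** n for n in range(5)], dtype=float)

# Made-up "true" distribution, just to demonstrate recovery.
f_true = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
u = A @ f_true                 # the five moments u_0 .. u_4

f_rec = np.linalg.solve(A, u)  # recover f0..f4 from the moments
```

The rows of `A` for n = 0..3 are exactly the four equations above; adding the n = 4 row makes the system square and uniquely solvable.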

The latter case is probably the more likely, so maybe it's worth pointing out that you can constrain your solution space quite dramatically if you know anything about the underlying distribution. If it is normal, for example, just two moments (the mean and the standard deviation) are enough to uniquely determine the distribution.

Finally, I'd concur with GregLocock's excellent suggestion - since you're just after an approximation, a Monte Carlo type approach is well worth considering, especially if you only have a few moments and several samples to recover. As well as just minimising the error in the moments, you could help steer the Monte Carlo process with other soft information you have about the distribution. For example, you could start the Monte Carlo from a few distributions which are "close" to what you expect the final result to be and see if you can get them to converge.
 
Everyone, thank you, good input.
It turns out this is called an Edgeworth series.
The problem is full of challenges: a function built from moments may not be unique and may not yield a positive probability density.
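For reference, the first-order Edgeworth correction builds an approximate density from the first four moments by perturbing a normal with Hermite polynomials. A sketch (and, per the caveat, the result is not guaranteed to stay positive):

```python
import math

def edgeworth_pdf(x, mu, sigma, skew, exkurt):
    """First-order Edgeworth approximation to a pdf from four moments.
    skew = gamma1, exkurt = gamma2 (excess kurtosis). With both zero,
    this reduces to the normal density."""
    z = (x - mu) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Probabilists' Hermite polynomials He3, He4, He6.
    he3 = z ** 3 - 3 * z
    he4 = z ** 4 - 6 * z ** 2 + 3
    he6 = z ** 6 - 15 * z ** 4 + 45 * z ** 2 - 15
    corr = 1 + skew / 6 * he3 + exkurt / 24 * he4 + skew ** 2 / 72 * he6
    return phi * corr / sigma
```

Evaluating this at the bin centers and multiplying by bin width and sample count gives an approximate histogram, with the stated caveats about uniqueness and negativity.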
 
