What is the minimum sample size to generate a Weibull distribution?

(OP)
One of the advantages of the Weibull is that you can form a distribution with a much smaller sample size than, say, a histogram.
I experimented with using only the first 5 data points from a sample set of 26 to do a Weibull analysis. The first 5 points fit a 2-parameter Weibull and gave an R^2 value of 0.99. The second data set (the remaining 21 points) changes to a 3-parameter Weibull with an R^2 value somewhere in the neighborhood of 0.976. So my questions are:

1) What is the minimum sample size you need to accurately represent the population? For example, from sample size X the PDF is within some metric (standard deviations, %, etc.) of the true PDF. Clearly the R^2 value alone isn't a good metric.

2) I would like to plot the remaining 21 data points over the CDF calculated from my first 5 points. I have tried 2 methods. The first simply calculates the median rank (MR) and plots test cycle vs MR. The second uses Beta and Eta to calculate the CDF percentile and plots test cycle vs CDF percentile. Is there a good way to represent predicted CDF vs test data?

Here is my data set (test cycles; the first 5 points, then the remaining 21).
154173
171158
83431
201778
117578

192083
136262
149487
148009
98317
69798
94195
62548
103574
108364
132377
143047
85272
95760
214166
289237
161265
172490
99972
117440
89717
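
For readers who want to reproduce the 5-point fit, here is a minimal sketch of a 2-parameter median rank regression. The OP does not say which ranking formula or regression direction was used, so Bernard's approximation and a regression of y on x are assumptions, and the function names are made up for illustration. If the OP used the same choices, the R^2 should land close to the 0.99 quoted above.

```python
import math

# First five cycles-to-failure values from the post above.
first5 = [154173, 171158, 83431, 201778, 117578]

def median_ranks(n):
    """Bernard's approximation to the median rank of the i-th of n ordered failures."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def fit_weibull_mrr(data):
    """Fit a 2-parameter Weibull by least squares on the linearized CDF.

    Regresses y = ln(-ln(1 - F)) on x = ln(t). The slope is beta and
    eta = exp(-intercept / beta). Returns (beta, eta, r_squared).
    """
    t = sorted(data)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in median_ranks(n)]
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = ybar - beta * xbar
    eta = math.exp(-intercept / beta)
    r2 = sxy ** 2 / (sxx * syy)
    return beta, eta, r2

beta, eta, r2 = fit_weibull_mrr(first5)
print(f"beta={beta:.2f}  eta={eta:.0f}  R^2={r2:.3f}")
```

Note that a high R^2 here only says the five ranked points sit close to a straight line on Weibull paper. It says nothing about how far beta and eta are from the population values.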

RE: What is the minimum sample size to generate a Weibull distribution?

1) 3 unknowns require a minimum of 3 data points, but if the data points are widely dispersed, then your mean squared error should at least be a guide to how many data points are required.

2) Typically, you plot the data over the predicted equation curve.

TTFN
FAQ731-376: Eng-Tips.com Forum Policies

Need help writing a question or understanding a reply? forum1529: Translation Assistance for Engineers


Of course I can. I can do anything. I can do absolutely anything. I'm an expert!
There is a homework forum hosted by engineering.com: http://www.engineering.com/AskForum/aff/32.aspx

RE: What is the minimum sample size to generate a Weibull distribution?

There is no specific number of data points that I have ever heard of. The number of data points is determined by repeatability, spread and ultimately the design limits. Statistically speaking all of your data points could be the ideal value but the next 100 data points could be scattered. Or your first values may be the 2 sigma values. Do your analysis, calculate the error and review the data. Then decide on the sample size. I have been stuck with sample sizes from 1 (due to cost) to 1,000.

RE: What is the minimum sample size to generate a Weibull distribution?

"One of the advantages of the Weibull is you can form a distribution with a much smaller sample size than say a histogram."

This practice could be seen as putting the cart before the horse. Be careful not to force inadequate data into reinforcing unsubstantiated foregone conclusions that your model can in fact be described by a Weibull distribution.


RE: What is the minimum sample size to generate a Weibull distribution?

(OP)
@IRstuff or anyone who would like to answer. I'm not following how/why you would use your MSE. Isn't that just the same as the r^2? The larger r^2 gets, the smaller your SSE and MSE get; the only difference is that r^2 gives me a relative 0-1 scale.

Using MSE as you suggested, I get 0.009, and I don't think the 5 data points accurately predict the remaining 21.

How do you plot the remaining 21 points over the predictive CDF? What method do you use to calculate the Median ranking?
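
One possible way to do the overlay, sketched under assumptions: rank the 21 held-out points on their own with Bernard's approximation, (i - 0.3) / (n + 0.4), and compare each rank with the CDF predicted by the 5-point fit at the same cycle count. The ranking formula, the helper names, and the "largest gap" summary are illustrative choices, not something anyone in the thread specified.

```python
import math

first5 = [154173, 171158, 83431, 201778, 117578]
rest21 = [192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
          103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
          161265, 172490, 99972, 117440, 89717]

def bernard(n):
    """Approximate median ranks for n ordered failures."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def fit(data):
    """2-parameter median rank regression; returns (beta, eta)."""
    t = sorted(data)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in bernard(n)]
    xb, yb = sum(x) / n, sum(y) / n
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    beta = sxy / sxx
    eta = math.exp(xb - yb / beta)  # same as exp(-intercept / beta)
    return beta, eta

def weibull_cdf(t, beta, eta):
    return 1.0 - math.exp(-((t / eta) ** beta))

beta, eta = fit(first5)              # model built from the first 5 points only
t21 = sorted(rest21)
observed = bernard(len(t21))         # empirical CDF estimate of the held-out points
predicted = [weibull_cdf(t, beta, eta) for t in t21]

for t, o, p in zip(t21, observed, predicted):
    print(f"{t:>7d}  median rank={o:.3f}  predicted CDF={p:.3f}")

max_gap = max(abs(o - p) for o, p in zip(observed, predicted))
print(f"largest gap between rank and prediction = {max_gap:.3f}")
```

Plotting `t21` against both `observed` (as markers) and `predicted` (as a line) gives the overlay. The gap between the two columns is a more direct check of predictive quality than the R^2 of either fit.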

RE: What is the minimum sample size to generate a Weibull distribution?

(OP)
@BigInch How is one to prevent oneself from forcing "inadequate data into reinforcing unsubstantiated foregone conclusions"? I have a good fit with my Weibull, so if you find yourself with only 5 data points, how can you gauge how many more samples you will need?

RE: What is the minimum sample size to generate a Weibull distribution?

Well, how else would you define a "fit?" If the mean squared error went down substantially, then the initial fit was poor, but that's a trap. The problem with a generalized "fit," particularly with 3 parameters is akin to polynomial fits, i.e., you keep upping the order, and the fit keeps getting better, but all it's doing is fitting the noise. The Weibull function is problematic in that there is no physical basis for the parameters, since they essentially exist simply for the sake of fitting data to a smooth curve.

Offhand, I would argue that 5 data points would be woefully inadequate for fitting 3 parameters; a minimum of 3 times the number of parameters would seem like a good place to start, but even then, their spacing and relative similarity need to be considered. Since "noise" is inherent in any sort of reliability data, more data points

If you look at your data more closely, you can see that there is one specific datapoint that is problematic, because it seems to be substantially different than the rest.

RE: What is the minimum sample size to generate a Weibull distribution?

(OP)
@IRstuff Good point about the polynomial. Since MSE is SSE/n and SSE is used to calculate your correlation coefficient r^2, is r^2 not a suitable proxy for MSE? I can potentially see that value going up or down as n increases. The MSE for the 21 points (0.03) is higher than for the 5 points (0.009). What is a suitable MSE? How do you judge sample size with MSE? It seems that you can have a good fit and still not be representative of the true population.

For the first 5 data points I used a 2-parameter Weibull, and for the remaining 21 I used a 3-parameter Weibull, based on my regression values. A lot of this seems to depend on the median ranks.
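
On the r^2 vs MSE question, the two are tied together by R^2 = 1 - SSE/SST and MSE = SSE/n, so R^2 = 1 - n*MSE/SST. The sketch below checks that identity on the linearized (Weibull paper) scale. It assumes Bernard's median ranks and a 2-parameter fit for both subsets, so the 21-point numbers will not match the OP's 3-parameter result. For the first 5 points the MSE should come out near the 0.009 quoted earlier, which suggests, but does not confirm, that the OP computed it on this scale.

```python
import math

first5 = [154173, 171158, 83431, 201778, 117578]
rest21 = [192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
          103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
          161265, 172490, 99972, 117440, 89717]

def linearize(data):
    """Return (x, y) = (ln t, ln(-ln(1 - median rank))) for sorted data."""
    t = sorted(data)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    return x, y

def fit_stats(data):
    """Least-squares line on the linearized scale plus SSE, SST, MSE and R^2."""
    x, y = linearize(data)
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    slope = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum((a - xb) ** 2 for a in x)
    icpt = yb - slope * xb
    sse = sum((b - (icpt + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - yb) ** 2 for b in y)
    return {"n": n, "SSE": sse, "SST": sst, "MSE": sse / n, "R2": 1.0 - sse / sst}

s5 = fit_stats(first5)
s21 = fit_stats(rest21)
for label, s in (("first 5", s5), ("remaining 21", s21)):
    print(f"{label:>12}: n={s['n']:2d}  MSE={s['MSE']:.4f}  SST={s['SST']:.3f}  R^2={s['R2']:.4f}")
```

Because SST depends on n and on the spread of the ranks, the same MSE can correspond to different R^2 values across sample sizes. Either way, both numbers measure how well the line fits the sample in hand, not how well the sample represents the population.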

RE: What is the minimum sample size to generate a Weibull distribution?

Again, given a certain level of variability in the points, "representative" is a very relative term. A better R does mean a better fit, but because the points are never going to lie perfectly on a line, that's as good as it gets. That's why looking at the mean squared error is somewhat more physically meaningful, because you are minimizing the distance of each point to the predicted line. However, since the points are never perfect, getting additional points could still mean that there's a better line to be found. The best you can do is whatever you can do with your given data, but, obviously, if you must predict an equation with only 5 points, you must expect there to be a potentially higher error against the other points.

RE: What is the minimum sample size to generate a Weibull distribution?

If you know your Weibull curve is good, then why worry about sample size? Simply pick a value and see if it falls on the curve.

If you know your data fits some Weibull curve, plot it. Then see what data values don't land on it.

If you don't know which Weibull curve is the correct one, changing sample size will just keep on giving you different Weibull curves, each with slightly different parameters than the other.

To arrive at a Weibull curve that best describes a set of n values, consider all n values and calculate Weibull parameters.

If adding new values to the sample doesn't introduce disproportionate noise, i.e. the curve's parameters do not change significantly from the previous curve, then you've got enough samples to predict the curve described by all (and presumably future) samples.



RE: What is the minimum sample size to generate a Weibull distribution?

(OP)
@BigInch As noted in my first post, the fit from the first 5 points does not fit the remaining 21 points. Hence the r^2 values, although good for both, may not indicate how well the fit represents the true population. The 5 vs 21 split is arbitrary. If we imagine you do a test and have X samples, how do you know that quantity is enough? That is the question.

RE: What is the minimum sample size to generate a Weibull distribution?

Exactly. You can also ask, HOW DO YOU KNOW IT'S NOT?

If you have a sample of data and you calculate the best fit curve, it will be the best fit curve for the sample of data you have. If you change the data set and do it again, it will be the best fit curve for that data set.

If you are convinced that the Weibull curve that fits the whole sample population is the "ONE" true curve, take 5 random values from your population, get the parameters for the best Weibull fit. Now take six different, random values and calculate the Weibull parameters for that set. Keep on doing that with ever more and more random values. When you see that Weibull's parameters aren't changing much (within your margin of error), you've discovered how many values you need.
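
The procedure described above can be sketched as follows, treating the 26 posted values as the whole "population". The fitting method (2-parameter median rank regression with Bernard's ranks), the fixed random seed, and the `history` list are assumptions made for illustration.

```python
import math
import random

data = [154173, 171158, 83431, 201778, 117578,
        192083, 136262, 149487, 148009, 98317, 69798, 94195, 62548,
        103574, 108364, 132377, 143047, 85272, 95760, 214166, 289237,
        161265, 172490, 99972, 117440, 89717]

def fit(sample):
    """2-parameter median rank regression; returns (beta, eta)."""
    t = sorted(sample)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    xb, yb = sum(x) / n, sum(y) / n
    beta = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum((a - xb) ** 2 for a in x)
    eta = math.exp(xb - yb / beta)
    return beta, eta

random.seed(0)  # fixed seed so the run is repeatable
history = []
for n in range(5, len(data) + 1):
    # A fresh random draw of n values at each size, as suggested in the post.
    beta, eta = fit(random.sample(data, n))
    history.append((n, beta, eta))
    print(f"n={n:2d}  beta={beta:5.2f}  eta={eta:9.0f}")
```

A single pass like this is noisy, because each row uses a different random subset. Repeating the draw many times per n and looking at the range of beta and eta gives a steadier picture of where the parameters stop moving by more than your margin of error.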


RE: What is the minimum sample size to generate a Weibull distribution?

Bear in mind, again, there is inherent variability in the data. If you collect another 26 data points, the resultant Weibull curve may also be different. Perfect repeatability only exists in school.
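
A small simulation can show how much the fitted curve moves between repeat tests. The sketch draws many samples from a Weibull with known parameters and records the spread of the fitted shape parameter at n = 5 and n = 26. The "true" beta and eta are hypothetical round numbers chosen only for illustration, and the fit is the same assumed median rank regression with Bernard's ranks.

```python
import math
import random

def fit_beta(sample):
    """Shape parameter from a 2-parameter median rank regression."""
    t = sorted(sample)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    xb, yb = sum(x) / n, sum(y) / n
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    return sxy / sxx

def beta_interval(n, trials, true_beta, true_eta, rng):
    """5th and 95th percentile of fitted beta over repeated samples of size n."""
    fits = sorted(
        # random.weibullvariate takes (scale, shape), i.e. (eta, beta).
        fit_beta([rng.weibullvariate(true_eta, true_beta) for _ in range(n)])
        for _ in range(trials)
    )
    return fits[int(0.05 * trials)], fits[int(0.95 * trials)]

TRUE_BETA, TRUE_ETA = 3.0, 150000.0  # hypothetical population, not from the thread
rng = random.Random(1)               # fixed seed for repeatability
spreads = {}
for n in (5, 26):
    low, high = beta_interval(n, 2000, TRUE_BETA, TRUE_ETA, rng)
    spreads[n] = high - low
    print(f"n={n:2d}: 90% of fitted beta values fall between {low:.2f} and {high:.2f}")
```

The interval at n = 5 is much wider than at n = 26, even though every sample comes from exactly the same distribution. Running the same loop over several values of n is one way to choose a sample size that meets a stated tolerance on beta or eta.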

RE: What is the minimum sample size to generate a Weibull distribution?

Exactly. The number of Weibull curves possible is limited only by the amount of data you can collect and the number of combinations you can make with it. If they all happen to exactly fit one curve, I'd be suspicious. Very, very suspicious.

