
Statistics: probability distribution 2

Status
Not open for further replies.

kuki

Mechanical
Feb 26, 2002
10
I have data on how many hours my machines operate per day over one year: 30 machines, 350 days. I want to know the probability distribution of operating hours per day. I am not sure how to treat the data in order to get the right distribution (the one that will tell me the most probable length of time the machines need to work).
Can someone help?

I would be very thankful.
 

Here is what I think you are asking:

You have 30 machines. You have 350 days of data for each machine. You want to know the most likely period of time each machine will work.

My initial thoughts:

The most likely period of time will simply be the mean number of hours of operation for each machine (for a symmetric distribution such as the Normal, the mean and the most likely value coincide). This is basically a point estimate. While there may be dependencies between machines, for your purposes treating each one as independent will give reasonable answers. You can also assume that the distribution is Normal and calculate a limiting estimate of the hours. For example, you can say with some probability that the machine will not operate for more than X hours. You would need to calculate both the mean and the standard deviation. The probability is then the area under the normal distribution to the left of the value X. To find that area, calculate how many standard deviations X lies to the right of the mean, Z = (X - mean)/sd, and look up the area for Z in standard normal distribution tables. Since many tables only give the area between the mean and Z (at most 0.5), you will have to add 0.5 for the half of the distribution to the left of the mean.
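The table lookup described above can also be done numerically. The following is only a minimal sketch under the same Normal assumption; the function name `prob_at_most` and the sample hours are made up for illustration and are not the poster's data.

```python
import math
from statistics import mean, stdev

def prob_at_most(hours, x):
    """Estimate P(daily hours <= x), assuming the hours are Normally distributed."""
    mu = mean(hours)           # point estimate of the typical daily hours
    sd = stdev(hours)          # sample standard deviation
    z = (x - mu) / sd          # Z = (X - mean)/sd
    # Area to the left of z under the standard Normal curve;
    # this replaces the table lookup (including the "add 0.5" step).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical daily operating hours for one machine (illustrative only)
sample_hours = [6.5, 7.0, 8.0, 7.5, 9.0, 6.0, 8.5, 7.0, 7.5, 8.0]
p = prob_at_most(sample_hours, 9.0)
print(f"P(hours <= 9.0) is roughly {p:.3f}")
```

Whether the Normal assumption is reasonable for real operating hours should be checked against the data before relying on such a figure.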

 
Thank you very much for your answer.
I am not sure if I understood it... since my statistics knowledge is quite low... unfortunately.
I want to have the most likely operating hours per day... but not for each machine (independent from each other)... I need the general trend. So if I need to buy a new machine I know how many operating hours per day it will (most probably) need to work.
Does the information you sent take that into consideration?
 
Are all 30 machines the same and do they do the same thing?

If so, then average is

(total hours for all 30 machines for 350 days)/(30 x 350).

If the machines don't all do the same thing, then you have to classify your data by purpose and treat each classification separately.
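As a rough sketch of the pooled average above: the function name `pooled_mean` and the small data set are hypothetical, and the real input would be 30 lists of about 350 daily values each.

```python
def pooled_mean(hours_by_machine):
    """Total hours over all machines and days, divided by the number of machine-days."""
    total = sum(sum(days) for days in hours_by_machine)
    count = sum(len(days) for days in hours_by_machine)
    return total / count

# Hypothetical data: 3 machines, 4 days each (illustrative only)
hours_by_machine = [
    [8.0, 7.5, 9.0, 8.5],
    [6.0, 6.5, 7.0, 6.5],
    [8.0, 8.0, 7.5, 8.5],
]
print(pooled_mean(hours_by_machine))
```

If the machines fall into different classes, the same function could be applied to each class separately.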
 
One statistical curve I would plot is hours on the vertical axis against the 12 months of the year. But instead of plotting the total number of hours on the vertical axis, I would plot an index representing the hours per month. That would tell me when usage of these machines is greatest during the year. If data is available for the last five to ten years, so much the better for determining trends such as slowdowns around holidays and summer, production loss on certain days of the week, and breakdown analysis.
You can refine this statistical curve by using weeks or days instead of months.
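The post does not say how the index is defined. One possible reading, shown here purely as an assumption, is each month's total hours relative to the average month, scaled so that 100 means an average month. The name `monthly_index` and the records are illustrative.

```python
from collections import defaultdict

def monthly_index(records):
    """records: iterable of (month, hours) pairs.

    Returns {month: index}, where 100 means an average month.
    This definition of the index is an assumption, not taken from the thread.
    """
    totals = defaultdict(float)
    for month, hours in records:
        totals[month] += hours
    average = sum(totals.values()) / len(totals)
    return {m: 100.0 * t / average for m, t in sorted(totals.items())}

# Hypothetical records: (month number, hours logged that day)
records = [(1, 8.0), (1, 7.0), (2, 9.0), (2, 10.0), (3, 5.0), (3, 6.0)]
for month, index in monthly_index(records).items():
    print(month, round(index, 1))
```

The same grouping works for weeks or days by changing what is used as the key.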
 
As a professional stat guy, I am confused about your post. Are you looking for:
1. The mean, which is the average, so the mean of 20,20,20,24 is 21.
2. The mode, which is the most likely (most frequent) value, so the mode of 20,20,20,24 is 20.

Assuming that the mean is not as important as the distribution, I would group the data into ranges of hours. Looking at this, you might gain more insight, much more than from a simple mean that might be skewed by some extreme numbers. So create several ranges and count the number of days on which the machines' hour total falls into each range.
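The grouping into ranges can be sketched as follows. The range width of 2 hours, the helper names `bin_counts` and `modal_range`, and the observed values are all assumptions chosen for illustration; only the 20,20,20,24 example comes from the post.

```python
from collections import Counter
from statistics import mean, mode

def bin_counts(hours, width=2.0):
    """Count observations per range [k*width, (k+1)*width)."""
    counts = Counter(int(h // width) for h in hours)
    return {(k * width, (k + 1) * width): n for k, n in sorted(counts.items())}

def modal_range(hours, width=2.0):
    """Return the range that holds the most observations."""
    counts = bin_counts(hours, width)
    return max(counts, key=counts.get)

# The example from the post: mean versus mode
example = [20, 20, 20, 24]
print(mean(example), mode(example))

# Hypothetical daily hours (illustrative only)
observed = [5.5, 6.0, 7.5, 8.0, 8.5, 9.0, 9.5, 12.0]
for (low, high), n in bin_counts(observed).items():
    print(f"{low:4.1f} to {high:4.1f} h: {n} day(s)")
print("most frequent range:", modal_range(observed))
```

The choice of range width affects which range comes out as most frequent, so it is worth trying a few widths before settling on a single value.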
 
That's exactly what I realised meanwhile, CubScout. Thanks a lot for the confirmation.
By creating these ranges, or classes, I built a histogram, which allows me to make a good analysis, doesn't it?
 
Kuki, I am puzzled by your remark to CubScout because, given the way you presented your problem, what value would you get from an average and a mode? Your data would appear to be more useful for determining trends and for Q.C.
 
Thanks for the remark chicopee.
The thing is, I need this analysis to decide which data to use in another study. So for each characteristic I need a single value to use: the one with the highest probability. The trend is an interesting general view, but I need one value for further consideration. A histogram will give that information, I suppose!
Further comments are very welcome.
Regards.

 