Correlation Coefficient
Correlation Coefficient
(OP)
Hello all,
I am working for a plant and we have executed a green initiative that starts with the monitoring of our total usage in water, electricity, fuel etc. We have a table to do this, but it requires way too much data entry and time to complete so we are trying to use excel to automate it as much as possible.
If I have three column sets of data that are ordered like this (rough example of the table that has more data)
P= Pressure vessel cooling water, S= Sanitary water, and T = total water
P + S =T
20 10 30
30 15 45
40 20 60
And they are tabulated over the span of a few years, is it possible to create a formula or use a constant number that I can multiply or divide the total water consumption by, that can give me the pressure vessel cooling water with a slight but negligible margin of error?
What I did was with two of the columns (pressure vessel cooling water and total water consumption) calculate the correlation coefficient. It was about .9 which is pretty good. with that I set the
[Pressure Vessel cooling water amount] = [Total Water Consumption]*x
Solved for x and got say 0.7, can I use this number within the last few years of data sets in order to predict the amount of cooling water? The goal is to be able to do it for the remaining months of this year with again as I mentioned, a reasonable margin of error but to run the experiment on the last few years in order to get a better coefficient.
I experimented with the value and I recieved variations between the calculated values vs the original values of +/- no more than 10%. I don't know if there is a more formal formula for doing this but any help would be much appreciated!
I am working for a plant and we have executed a green initiative that starts with the monitoring of our total usage in water, electricity, fuel etc. We have a table to do this, but it requires way too much data entry and time to complete so we are trying to use excel to automate it as much as possible.
If I have three column sets of data that are ordered like this (rough example of the table that has more data)
P= Pressure vessel cooling water, S= Sanitary water, and T = total water
P + S =T
20 10 30
30 15 45
40 20 60
And they are tabulated over the span of a few years, is it possible to create a formula or use a constant number that I can multiply or divide the total water consumption by, that can give me the pressure vessel cooling water with a slight but negligible margin of error?
What I did was with two of the columns (pressure vessel cooling water and total water consumption) calculate the correlation coefficient. It was about .9 which is pretty good. with that I set the
[Pressure Vessel cooling water amount] = [Total Water Consumption]*x
Solved for x and got say 0.7, can I use this number within the last few years of data sets in order to predict the amount of cooling water? The goal is to be able to do it for the remaining months of this year with again as I mentioned, a reasonable margin of error but to run the experiment on the last few years in order to get a better coefficient.
I experimented with the value and I recieved variations between the calculated values vs the original values of +/- no more than 10%. I don't know if there is a more formal formula for doing this but any help would be much appreciated!
RE: Correlation Coefficient
RE: Correlation Coefficient
RE: Correlation Coefficient
TTFN
FAQ731-376: Eng-Tips.com Forum Policies
RE: Correlation Coefficient
I understand what you mean and that is how I approached it from the beginning but the main question I would say is how to ensure the repeatability of the number I solve for. That is why I tried the correlation coefficient first between the two sets of data. This showed that they typically rely on one another. Then I solved for the constant variable that I would use in the future to get a good prediction on what either the amount of cooling water will be or sanitary water etc etc. Again these are just a few values from several columns that include pH adjusted water and so on.
In another sense, if I have a constant value that relates two of the variables, I can predict what one will be within a reasonable margin of error depending on their correlation.
I just want to be able to create a type of standardized formula. I know there will be error but I am sure we can neglect it if it is reasonable say 5%.
RE: Correlation Coefficient
RE: Correlation Coefficient
TTFN
FAQ731-376: Eng-Tips.com Forum Policies
RE: Correlation Coefficient
Warning about implementing this. If you think you may need to change something, the first step is to measure where you are so that the effect of changes may also be measured. Estimating S maybe time saving for now, but you can probably expect that once you have collected enough data to show the present state of the system, someone is going to want to make system changes to "improve" the water usage and that may well be measured by changes in the ration between P and S. If that is the case, you need actual data, not estimated data and you will need to measure each water usage separately and do the additional data entry.
RE: Correlation Coefficient