## Data selection and modelling.

(OP)

Hi,

I have a large quantity of data from a test site where 500 parameters are measured every minute, and I have access to 2 years of this data. I want to model 5 of these parameters as a function of the most important other parameters (the model has 5 output parameters, and I want to select the best input parameters for modelling them). The data set is noisy and clouded, and some parameters are redundant. I am looking for a method to determine which parameters are highly correlated with the 5 parameters I have to model. I know I can just calculate the correlation coefficients, but I don't think this is sufficient.
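For reference, the plain correlation screen mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the function name `rank_by_correlation`, the array layout (rows are samples, columns are parameters) and the synthetic demo data are all made up here, and Pearson correlation only captures linear relationships, which is one reason it may not be sufficient on its own.

```python
import numpy as np

def rank_by_correlation(X, Y, top_k=10):
    """For each output column of Y, list the top_k columns of X ranked by
    absolute Pearson correlation, as (input_index, correlation) pairs."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    x_norm = np.sqrt((Xc ** 2).sum(axis=0))
    y_norm = np.sqrt((Yc ** 2).sum(axis=0))
    # Constant columns have zero variance; mark them so they score as NaN.
    x_norm[x_norm == 0] = np.nan
    y_norm[y_norm == 0] = np.nan
    # corr has shape (n_inputs, n_outputs).
    corr = (Xc.T @ Yc) / np.outer(x_norm, y_norm)
    ranking = {}
    for j in range(Y.shape[1]):
        score = np.nan_to_num(np.abs(corr[:, j]))  # NaN -> 0, ranked last
        order = np.argsort(score)[::-1][:top_k]
        ranking[j] = [(int(i), float(corr[i, j])) for i in order]
    return ranking

# Synthetic stand-in for the site data (hypothetical): 20 inputs, 2 outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
Y = np.column_stack([
    3.0 * X[:, 4] + rng.normal(size=2000),    # output 0 driven by input 4
    -2.0 * X[:, 11] + rng.normal(size=2000),  # output 1 driven by input 11
])
ranking = rank_by_correlation(X, Y, top_k=3)
for out, pairs in ranking.items():
    print(out, pairs)
```

With the real data, `X` would hold the candidate parameters and `Y` the 5 outputs. Redundant inputs will show up as several columns with nearly identical scores, so this ranking alone does not remove redundancy.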

I have examined a method called Principal Component Analysis, but I don't think there is a way to calculate principal components as a function of the 5 parameters I wish to model. Here is my first question: is there a method which calculates principal components as a function of other parameters?

I am currently examining feature selection, but I can't find good references on this method. Can someone give advice on it?

If you know of any other method for choosing the best variables to model with as a function of the output parameters, please let me know.

Thanks in advance, Regards

## RE: Data selection and modelling.

Regards,

## RE: Data selection and modelling.

But I am not allowed to perform experiments on the site; I have to work with the data I can access. I also don't think the site would allow such controlled experiments, since it is constantly in industrial use.

Kind regards,

## RE: Data selection and modelling.

The problem is trying to pare it down. The best way is to plot several variables and see visually which ones appear to match first. Some things can be deduced from first principles: for example, reactions speed up with temperature as a power function, and heat transfer is proportional to delta T.
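When a power-function relationship like that is expected, one common trick is to fit a straight line to the logarithms of both variables. Below is a minimal sketch, assuming a model of the form y = a * x^b with positive data; the function name `fit_power_law`, the constants and the synthetic data are hypothetical and only there for illustration.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y ~ a * x**b by least squares on log(y) versus log(x).
    Only strictly positive (x, y) pairs are used."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (x > 0) & (y > 0)
    # log(y) = log(a) + b * log(x), so a degree-1 fit gives (b, log(a)).
    b, log_a = np.polyfit(np.log(x[mask]), np.log(y[mask]), 1)
    return float(np.exp(log_a)), float(b)

# Synthetic example (hypothetical numbers): y = 2 * x**1.5 with 2% noise.
rng = np.random.default_rng(1)
x = np.linspace(10.0, 100.0, 200)
y = 2.0 * x ** 1.5 * np.exp(rng.normal(0.0, 0.02, size=x.size))
a, b = fit_power_law(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A proportional relationship such as heat transfer versus delta T is the special case b = 1, so a fitted exponent close to 1 in the log-log plot is a quick check on that kind of assumption.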

Generally it's called "design of experiments", under advanced Statistical Process Control tools; google that to find articles.

## RE: Data selection and modelling.

Regards,

## RE: Data selection and modelling.

Thank you very much! Regards

## RE: Data selection and modelling.