01-14-2009 07:46 AM

Hi,

My problem is as follows:

data new;

input x y z u v;

cards;

1 0 3 4 7

2 3 2 5 9

3 . 0 3 7

4 3 5 1 7

5 6 8 3 5

;

proc reg;

model x = y z u v / vif out=vifmatrix collin out=colinmatrix;

run;

I get two output files, vifmatrix and colinmatrix. I need to find the variable with the maximum VIF (say u), then find the variable that is most collinear with it (say v), delete that variable (v) from the dataset, and run proc reg again on the same dataset without variable 'v'.

I want this because I have a thousand variables and I don't want to repeat the same SAS code by hand.

Can anyone give me an idea (a macro, a loop, or something similar)?

Regards,

Pabitra

01-14-2009 08:47 AM

You should use ODS tables (http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/reg_sect49.htm) to get the output into datasets. For example,

/*

data new;

input x y z u v;

cards;

1 0 3 4 7

2 3 2 5 9

3 . 0 3 7

4 3 5 1 7

5 6 8 3 5

6 3 6 9 10

7 2 5 10 20

;

proc reg;

ods output CollinDiag=colinmatrix ParameterEstimates=vifmatrix(keep= variable varianceinflation);

model x = y z u v / vif collin;

run;

*/

I added two more lines of data because, with fewer observations, the model was over-fitted and no collinearity diagnostics were produced.
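To check which ODS table names a PROC REG step produces (such as CollinDiag and ParameterEstimates used above), ODS TRACE can list them in the log. A minimal sketch, assuming the dataset new from the example:

```sas
/* Write the name of every ODS table produced by the step to the log */
ods trace on;

proc reg data=new;
  model x = y z u v / vif collin;
run;
quit;

ods trace off;
```

The names reported in the log are the ones to use on the left-hand side of an ODS OUTPUT statement.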

01-14-2009 08:58 AM

That is OK, but my problem is different.

I get two output files, vifmatrix and colinmatrix. I need to find the variable with the maximum VIF (say u), then find the variable that is most collinear with it (say v), delete that variable (v) from the dataset, and run proc reg again on the same dataset without variable 'v'.

I want this because I have a thousand variables and I don't want to repeat the same SAS code by hand.

01-14-2009 10:13 AM

Now that you have colinmatrix and vifmatrix, just merge them side by side and run two queries. For example,

/*

data together;

merge vifmatrix colinmatrix;

proc sql;

select variable into :u from together

having varianceinflation=max(varianceinflation);

select variable into :v from together

where variable ne "&u"

having &u = max(&u);

quit;

*/

After that, &v is the variable with the largest "var prop" for &u, and &u is the variable with the largest variance inflation.

However, the "var prop" values in colinmatrix mean nothing to me.
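If what is wanted is simply the variable most strongly correlated with u, one alternative to reading the variance proportions is a plain correlation matrix from PROC CORR. This is a minimal sketch, not the method used above: it assumes the dataset new, assumes u has already been identified as the max-VIF variable, and uses corrmat as a made-up dataset name.

```sas
/* Pearson correlations among the predictors, written to a dataset */
proc corr data=new outp=corrmat noprint;
  var y z u v;
run;

proc sql noprint;
  /* Among the correlation rows other than u itself,
     pick the variable with the largest absolute correlation with u */
  select _name_ into :v
    from corrmat
    where _type_ = 'CORR' and upcase(_name_) ne 'U'
    having abs(u) = max(abs(u));
quit;

%let v = &v;   /* strip any padding from the macro variable */
%put NOTE: variable most correlated with u is &v;
```

If two variables tie, only the first one returned by the query ends up in &v.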

01-14-2009 10:28 AM

Thanks,

but in this example the maximum VIF is for 'v', and the 2nd maximum correlation is with u. I want to remove u from my dataset and then run proc reg again on the modified dataset. Next time I want to check the same things again, modify the dataset in the same way, and run proc reg again. I need some kind of loop or macro for this, because I can't repeat the same code manually.

Please advise.

01-14-2009 11:27 AM

For example, if you want to end up with 2 variables:

/*

data new;

input x y z u v;

cards;

1 0 3 4 7

2 3 2 5 9

3 . 0 3 7

4 3 5 1 7

5 6 8 3 5

6 3 6 9 10

7 2 5 10 20

;

%macro var_select(dsn=new, depvar=x, varlist=y z u v, n=4);

ods _all_ close;

%do %until(&n = 2);

proc reg data=&dsn;

ods output CollinDiag=colinmatrix ParameterEstimates=vifmatrix(keep= variable varianceinflation);

model &depvar = &varlist / vif collin;

run;

data together;

merge vifmatrix colinmatrix;

proc sql noprint;

%* Find the variable with the largest variance inflation and store its name in macro variable u;

select variable into :u from together

having varianceinflation=max(varianceinflation);

%* Find the variable with the largest var prop associated with the variable in u and store its name in macro variable v (this may not be the statistically right thing to do);

select variable into :v from together

where variable ne "&u" and lowcase(variable) ne 'intercept'

having &u = max(&u);

%* Then run a query to rebuild the list of independent variables without the variable in v;

select count(*), variable into :n, :varlist separated by ' '

from vifmatrix where variable ne "&v" and lowcase(variable) ne 'intercept';

quit;

%end;

ods listing;

%* Dump the selected variables into a dataset;

proc sql;

create table var_selection as

select variable

from vifmatrix where variable ne "&v" and lowcase(variable) ne 'intercept';

quit;

%mend var_select;

%var_select;

*/

The selected variables go to the dataset var_selection, and you may change the stop condition "&n=2" to anything you like.

01-15-2009 04:45 AM

Thanks,

Now I want to stop this loop when VIF < 2 (say) and index < 20 (say).

I also want the output of the proc reg regression.

Please advise.

01-16-2009 12:35 PM

It's way beyond my capacity, unless you can offer me a job

01-19-2009 02:04 AM

I am a student and I want to learn more. How can I offer you a job?
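For the stopping rule asked about earlier (stop when VIF < 2 and index < 20, then show the regression output), here is a minimal sketch under stated assumptions. It assumes "index" means the condition index in the CollinDiag table. The macro name vif_prune and the parameters maxvif= and maxcond= are made up for illustration. As a simplification, it drops the max-VIF variable itself on each pass rather than its most collinear partner.

```sas
/* Hypothetical macro: vif_prune, maxvif= and maxcond= are illustrative names */
%macro vif_prune(dsn=new, depvar=x, varlist=y z u v, maxvif=2, maxcond=20);
  %local topvif topcond drop nvars stop;
  %let stop = 0;

  %do %until(&stop = 1);
    /* Refit without printing and capture the diagnostics as datasets */
    ods exclude all;
    ods output ParameterEstimates=vifmatrix CollinDiag=colinmatrix;
    proc reg data=&dsn;
      model &depvar = &varlist / vif collin;
    run;
    quit;
    ods exclude none;

    proc sql noprint;
      /* Largest VIF among the predictors */
      select max(VarianceInflation) into :topvif
        from vifmatrix where upcase(Variable) ne 'INTERCEPT';
      /* Name of the predictor that has it (first one if tied) */
      select Variable into :drop
        from vifmatrix where upcase(Variable) ne 'INTERCEPT'
        having VarianceInflation = max(VarianceInflation);
      /* Largest condition index and number of predictors left */
      select max(ConditionIndex) into :topcond from colinmatrix;
      select count(*) into :nvars
        from vifmatrix where upcase(Variable) ne 'INTERCEPT';
    quit;
    %let drop = &drop;   /* strip padding */

    /* Stop when both thresholds are met, or when one predictor is left */
    %if %sysevalf(&topvif < &maxvif) and %sysevalf(&topcond < &maxcond)
      %then %let stop = 1;
    %else %if &nvars <= 1 %then %let stop = 1;
    %else %do;
      /* Rebuild the predictor list without the dropped variable */
      proc sql noprint;
        select Variable into :varlist separated by ' '
          from vifmatrix
          where upcase(Variable) not in ('INTERCEPT', "%upcase(&drop)");
      quit;
    %end;
  %end;

  /* Final fit on the remaining predictors, with printed output */
  proc reg data=&dsn;
    model &depvar = &varlist / vif collin;
  run;
  quit;
%mend vif_prune;

%vif_prune()
```

Because the condition index is computed with the intercept included, it can stay above the threshold for a long time; the guard on the number of remaining predictors keeps the loop from running forever in that case.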