esita
Calcite | Level 5

Hi All,

I need to do a correlation analysis on 10,000 variables named cerp, cerl, cfty, and so on, and 8 variables named red, blue, green, and so on. Instead of listing all 10,000 variables, is there a way to run the correlations without naming each one? Thanks in advance for the help.

Esita

%macro corr_data(var_1, var_2);
   proc corr data=cell;
      var &var_1 &var_2;
   run;
%mend corr_data;

%corr_data(cerp, red);
%corr_data(cerl, blue);
%corr_data(cfty, green);

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

Paige is extremely skeptical that performing correlations on 10,000 variables is a good idea from a statistical point of view. With that many tests, you will likely get many "significant" correlations just by random chance rather than because there is a real association between the variables; at a 0.05 significance level, about 5% of tests on truly unrelated variables come out "significant" anyway. You will also be misled by the multicollinearity between variables, and I can't see how this will lead to any relevant conclusions.
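To make the random-chance point concrete, here is a small simulation sketch. Everything in it is made up for illustration (the data set NOISE, the 200 variables x1-x200, the outcome y, the seed, and the sample size of 100); none of it comes from the original poster's data. It correlates pure noise with pure noise and counts how many correlations pass the usual 0.05 test.

```sas
/* Hypothetical data: 200 noise variables and 1 noise outcome, 100 rows */
data noise;
   call streaminit(12345);
   array x[200] x1-x200;
   do i = 1 to 100;
      y = rand('normal');
      do j = 1 to 200;
         x[j] = rand('normal');
      end;
      output;
   end;
   drop i j;
run;

/* Pearson correlations of y with every x, written to a data set */
proc corr data=noise noprint outp=pc;
   var x1-x200;
   with y;
run;

/* Convert each r to a two-sided p-value (t test, n-2 df) and count p < 0.05 */
data _null_;
   set pc;
   where _type_ = 'CORR';
   array r[200] x1-x200;
   n = 100;
   hits = 0;
   do j = 1 to 200;
      t = r[j] * sqrt((n - 2) / (1 - r[j]**2));
      p = 2 * (1 - probt(abs(t), n - 2));
      if p < 0.05 then hits = hits + 1;
   end;
   put 'Correlations flagged at 0.05 with no real association: ' hits 'of 200';
run;
```

The exact count depends on the seed, but on average about 5% of the 200 tests should be flagged even though no variable is related to y. Scaled up to 10,000 variables against 8, that is a lot of false leads.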

Nevertheless, if you want to do this in SAS, you can create macro variables containing the names of SAS variables of interest from the contents of your data set.

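Something like (untested code):

proc contents data=your_sas_data_set noprint out=_cont_;
run;

proc sql noprint;
     /* TYPE=1 keeps numeric variables only; LOWCASE makes the name match case-insensitive */
     select distinct name into :names separated by ' ' from _cont_
          where type=1 and lowcase(name) not in ('red','green','blue');
     select distinct name into :names2 separated by ' ' from _cont_
          where type=1 and lowcase(name) in ('red','green','blue');
quit;

/* NOPRINT suppresses the ODS tables, so save the Pearson correlations with OUTP= rather than ODS OUTPUT */
proc corr data=your_sas_data_set noprint outp=corrs;
     var &names;
     with &names2;
run;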
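
One practical caveat with 10,000 names in a single macro variable: as far as I know, a macro variable value is limited to 65,534 characters, so long variable names could overflow it. A hedged sketch of a quick check follows. It assumes the data set is WORK.CELL (taken from the original question) and reads the names from DICTIONARY.COLUMNS instead of PROC CONTENTS, which is a swapped-in alternative and not the method above.

```sas
/* Assumes WORK.CELL exists; LIBNAME and MEMNAME are stored in upper case */
proc sql noprint;
   select name into :names separated by ' '
      from dictionary.columns
      where libname = 'WORK' and memname = 'CELL' and type = 'num'
            and lowcase(name) not in ('red','green','blue');
quit;

/* SQLOBS is the number of rows the last SELECT returned */
%put NOTE: &sqlobs variable names, %length(&names) characters in NAMES;
```

If the reported length gets close to the limit, splitting the list into several macro variables, or omitting the VAR statement altogether, would be safer.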

--
Paige Miller

6 REPLIES
stat_sas
Ammonite | Level 13

If your variables all start with 'c', you can try something like this:

proc corr data=cell;
   var red green blue;
   with c:;
run;

esita
Calcite | Level 5

No, they start with different letters.

stat_sas
Ammonite | Level 13

Then you can use macro variables as recommended by PaigeMiller.

Reeza
Super User

If they are all numeric variables, you can use the _NUMERIC_ variable list shortcut.

Though you'd better save that output to a data set that you can go through programmatically; otherwise you'll simply miss correlations.
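
A rough sketch of what going through the output programmatically could look like. The data set CELL and the variables red, blue, and green come from the question; the names PCORR, LONG, STRONG, VARIABLE, and MAX_ABS_R and the 0.5 cutoff are arbitrary choices for illustration only.

```sas
/* Save the Pearson correlations instead of printing them */
proc corr data=cell noprint outp=pcorr;
   var _numeric_;
   with red blue green;
run;

/* Flip the CORR rows: one row per variable, one column per WITH variable */
proc transpose data=pcorr(where=(_type_='CORR'))
               out=long(rename=(_name_=variable));
   id _name_;
run;

/* Keep variables whose strongest correlation reaches the cutoff */
data strong;
   set long;
   /* _NUMERIC_ also includes the WITH variables, so drop their self-correlations */
   if upcase(variable) in ('RED','BLUE','GREEN') then delete;
   max_abs_r = max(abs(red), abs(blue), abs(green));
   if max_abs_r >= 0.5;
run;

proc sort data=strong;
   by descending max_abs_r;
run;
```

With only three of the eight WITH variables shown here, the MAX() call would need the remaining five added before this is usable on the real data.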

ballardw
Super User

It sounds like you don't want printed output, i.e. "without listing".

proc corr data=cell outp=pcorr noprint;
   with red blue green; /* your list of 8 goes here */
run;

will send the Pearson correlations to the data set pcorr and generate no listing output. You can specify other output data sets for the Spearman, Hoeffding, and Kendall statistics (OUTS=, OUTH=, and OUTK=). Because there is no VAR statement, the output will have correlations for all of the other numeric variables compared with the ones on the WITH statement.
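
For example, a minimal sketch that writes Pearson and Spearman results in one pass and then looks at a few columns. The output names PCORR and SCORR are arbitrary, and cerp, cerl, and cfty are simply the three variable names mentioned in the question.

```sas
/* OUTP= and OUTS= each get their own output data set */
proc corr data=cell noprint outp=pcorr outs=scorr;
   with red blue green;
run;

/* The output also holds MEAN, STD, and N rows; the correlations are the CORR rows */
proc print data=scorr;
   where _type_ = 'CORR';
   var _type_ _name_ cerp cerl cfty;
run;
```

In the CORR rows, _NAME_ identifies the WITH variable and each remaining column is one of the other numeric variables.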
