BookmarkSubscribeRSS Feed
carbs
Calcite | Level 5

Hi,

I have a data including 20 countries' observations for four variables, x1,x2,x3 and x4. I would like to get mean values for the correlations between countries for pairs x1&x2, x1&x3,1&4,2&3,2&4,3&4. I have all the data in one file, and I know proc corr can be used in calculating the correlations, but didn't figure out how to be able to modify the code so that I'd get it to calculate the mean correlations like suggested?

10 REPLIES 10
Doc_Duke
Rhodochrosite | Level 12

Actually, I am not sure that your question is well specified.  You have twenty data points with 4 values for each (20 rows and 4 columns).  If you do the initial correlations that you specify, then that is ACROSS  all countries, not BETWEEN any pair of them.

carbs
Calcite | Level 5

Sorry for the inaccuracy in my question.. Yes I meant across all countries..

carbs
Calcite | Level 5

Any ideas still?

Tom
Super User Tom
Super User

Post some actual example data so that someone can try to understand the question.

carbs
Calcite | Level 5

I attached a data sample.. There's four variables, and each country has their own values for those variables.. Now I'd like to calculate mean correlations for those variables.. By mean correlations I mean what is the average correlation between two of the variables across 20 countries.. Hope this helps

Reeza
Super User

Can you post what you'd expect the output to look like? I'm still a bit confused on the 'across 20 countries' part.

You'll definitely need to change your data structure as well, do you already have that done?

carbs
Calcite | Level 5

Attached a table that represents the results I want to have as well (the average part).. Haven't yet modified my data, so it's just in the form than in the excel file attached..

PGStats
Opal | Level 21

I didn't open your Excel file for security reasons and the usefulness of mean correlation is foreign to me but the following example might still show you how to calculate just that : the mean of all possible correlations in a set of variables :

data test;
array x{4};
do country = 1 to 20;
do i = 1 to 4;
X{i} = rannor(-1); /* Random numbers for the test */
end;
output;
end;
run;

proc corr data=test outp=testc;
var X:;
run;

proc transpose data=testc(where=(_TYPE_="CORR"))
name=VAR
out=testt(where=(VAR>_NAME_))
prefix=CORR;
var X:;
by _NAME_ notsorted;
run;

title "Mean correlation";
proc sql;
select mean(CORR1) as meanCorr from testt;
drop table testc, testt;
quit;

PG

PG
Ksharp
Super User

How about:

proc import datafile='c:\sasforums.xls' out=have dbms=excel replace ;getnames=yes ;run;
data _null_;
dsid=open("have","i");
num=attrn(dsid,"nvars");
do i=2 to num by 5;
 n+1;
 name=catx(',',varname(dsid,1),varname(dsid,i),varname(dsid,i+1),varname(dsid,i+2),varname(dsid,i+3),varname(dsid,i+4) );
 call symputx(cats('name',n),name);
end;
  call symputx('n',n);
rc=close(dsid);
run;
%macro stack;
proc sql;
create table want as 
%do i=1 %to &n;
 select &&name&i from have
 %if &i ne &n %then %do; union all  %end;
%end;
 ;
quit;
%mend stack;

 %stack
proc sort data=want;by country;run;
proc corr data=want outp=want1(where=(_TYPE_="CORR"))  noprint;
  by country;
  var _numeric_;
run;
data want2(keep=country _name_ name corr);
 set want1;
 array _a{*} _numeric_;
 do i=1 to dim(_a);
  if _a{i}=1 then leave;
  name=vname(_a{i});corr=_a{i};output;
 end;
run;
proc sql noprint;
 select distinct catt('want2(where=(country="',country,'") rename=(corr=corr_',scan(country,-1),'))') into : list separated by ' '
  from want2;
quit;

proc sort data=want2;by country _name_ name;run;
data want3(drop=country);
 merge &list ;
 by _name_ name;
 mean_corr=mean(of corr_:);
run;



Ksharp

carbs
Calcite | Level 5

PGstats and Ksharp, thanks!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 2115 views
  • 6 likes
  • 6 in conversation