Hi,
I have a data including 20 countries' observations for four variables, x1,x2,x3 and x4. I would like to get mean values for the correlations between countries for pairs x1&x2, x1&x3,1&4,2&3,2&4,3&4. I have all the data in one file, and I know proc corr can be used in calculating the correlations, but didn't figure out how to be able to modify the code so that I'd get it to calculate the mean correlations like suggested?
Actually, I am not sure that your question is well specified. You have twenty data points with 4 values for each (20 rows and 4 columns). If you do the initial correlations that you specify, then that is ACROSS all countries, not BETWEEN any pair of them.
Sorry for the inaccuracy in my question.. Yes I meant across all countries..
Any ideas still?
Post some actual example data so that someone can try to understand the question.
I attached a data sample.. There's four variables, and each country has their own values for those variables.. Now I'd like to calculate mean correlations for those variables.. By mean correlations I mean what is the average correlation between two of the variables across 20 countries.. Hope this helps
Can you post what you'd expect the output to look like? I'm still a bit confused on the 'across 20 countries' part.
You'll definitely need to change your data structure as well, do you already have that done?
Attached a table that represents the results I want to have as well (the average part).. Haven't yet modified my data, so it's just in the form than in the excel file attached..
I didn't open your Excel file for security reasons and the usefulness of mean correlation is foreign to me but the following example might still show you how to calculate just that : the mean of all possible correlations in a set of variables :
data test;
array x{4};
do country = 1 to 20;
do i = 1 to 4;
X{i} = rannor(-1); /* Random numbers for the test */
end;
output;
end;
run;
proc corr data=test outp=testc;
var X:;
run;
proc transpose data=testc(where=(_TYPE_="CORR"))
name=VAR
out=testt(where=(VAR>_NAME_))
prefix=CORR;
var X:;
by _NAME_ notsorted;
run;
title "Mean correlation";
proc sql;
select mean(CORR1) as meanCorr from testt;
drop table testc, testt;
quit;
PG
How about:
proc import datafile='c:\sasforums.xls' out=have dbms=excel replace ;getnames=yes ;run; data _null_; dsid=open("have","i"); num=attrn(dsid,"nvars"); do i=2 to num by 5; n+1; name=catx(',',varname(dsid,1),varname(dsid,i),varname(dsid,i+1),varname(dsid,i+2),varname(dsid,i+3),varname(dsid,i+4) ); call symputx(cats('name',n),name); end; call symputx('n',n); rc=close(dsid); run; %macro stack; proc sql; create table want as %do i=1 %to &n; select &&name&i from have %if &i ne &n %then %do; union all %end; %end; ; quit; %mend stack; %stack proc sort data=want;by country;run; proc corr data=want outp=want1(where=(_TYPE_="CORR")) noprint; by country; var _numeric_; run; data want2(keep=country _name_ name corr); set want1; array _a{*} _numeric_; do i=1 to dim(_a); if _a{i}=1 then leave; name=vname(_a{i});corr=_a{i};output; end; run; proc sql noprint; select distinct catt('want2(where=(country="',country,'") rename=(corr=corr_',scan(country,-1),'))') into : list separated by ' ' from want2; quit; proc sort data=want2;by country _name_ name;run; data want3(drop=country); merge &list ; by _name_ name; mean_corr=mean(of corr_:); run;
Ksharp
PGstats and Ksharp, thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.