turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Calculating means based on percentiles

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-17-2012 09:31 AM

Dear SAS Users,

I have data as follows:

date corr1_2 corr1_3..... corr19_20

for each date, the variable corr1_2 shows the correlation between item1 and item2, ..., corr19_20 shows the correlation between item19 and item20. I have all combinations of pairwise correlations for different dates.

I have macro variables that holds 18,19 and 20 (end numbers). I calculated mean correlation for each day by using a line as follows:

meancorr=mean( %do i=1 %to &lastbmax2; %do j=&i+1 %to &max; corr&i._&j, %end; %end; corr&lastbmax1._&max);

where lastbmax2, lastbmax1 and max are 18, 19, and 20, respectively.

My question is that how can I calculate the mean of the bivariate correlations which are lower than the 5th and higher than the 95th percentiles for each day? I can identify the percentiles by using the pctl function. But how can I calculate the means as I just described?

Thanks for any help in advance.

J.

Accepted Solutions

Solution

10-18-2012
02:39 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JohnK

10-18-2012 02:39 AM

This solution uses the array suggested by ArtT, but loops through the array rather than sorting the array, which would be more elegant.

data have;

input date date9. corr1_2 corr1_3 corr2_3;

datalines;

16oct2012 .1 .9 .2

17oct2012 .5 .7 .3

run;

data want;*(keep=date meancorr umean lmean);

set have;

array allcorr {*} corr:;

meancorr = mean(of corr;

lower=pctl(5,of corr;

upper=pctl(95,of corr;

lmean=0; lcnt=0; umean=0; ucnt=0;

do i = 1 to dim(allcorr);

if allcorr{i} le lower then do;

lmean + allcorr{i};

lcnt + 1;

end;

if allcorr{i} ge upper then do;

umean + allcorr{i};

ucnt + 1;

end;

end;

lmean=lmean/lcnt;

umean=umean/ucnt;

output want;

run;

proc print data=want;

run;

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JohnK

10-17-2012 09:58 AM

I think you are complicating the task by including all of the correlations in macro variables. If you simply loaded then into an array, in a datastep, you might be able to easily accomplish both tasks by using call sortn to sort the array. Then, you would only have to use the of operator to include all of the correlations in the overall mean calculation and, for the top and bottom percentiles (unless I misunderstand your data), simply include the x percent at the top and bottom of the sorted array.

Solution

10-18-2012
02:39 AM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JohnK

10-18-2012 02:39 AM

This solution uses the array suggested by ArtT, but loops through the array rather than sorting the array, which would be more elegant.

data have;

input date date9. corr1_2 corr1_3 corr2_3;

datalines;

16oct2012 .1 .9 .2

17oct2012 .5 .7 .3

run;

data want;*(keep=date meancorr umean lmean);

set have;

array allcorr {*} corr:;

meancorr = mean(of corr;

lower=pctl(5,of corr;

upper=pctl(95,of corr;

lmean=0; lcnt=0; umean=0; ucnt=0;

do i = 1 to dim(allcorr);

if allcorr{i} le lower then do;

lmean + allcorr{i};

lcnt + 1;

end;

if allcorr{i} ge upper then do;

umean + allcorr{i};

ucnt + 1;

end;

end;

lmean=lmean/lcnt;

umean=umean/ucnt;

output want;

run;

proc print data=want;

run;