PSP_1
Fluorite | Level 6

Hi,

I am new to SAS and am struggling to find a solution to my problem.

 

I have a data set with these variables:

var1 var2 age

 

I calculated the percentiles using the code below and this worked well:

 

proc univariate data=mydataset;
    by age;
    weight var1;
    var var2;
    output out=mynew_data
           PctlPre=PERC_
           PctlPts=0 to 100 by 1;
run;

 

I would now like to count how many observations fall within each percentile, and I am really struggling. I need the output to look like this:

           p1       p2       etc.
age1       n obs    n obs
age2
etc.

 

Your help would be much appreciated.

Accepted Solution
Reeza
Super User

Here's one way. Please read the comments: this will need some tweaking to ensure you get the correct values.

You didn't provide sample data, so I worked with the Stocks data set (sashelp.stocks). Replace it with your own data set and variable names and you should be fine.

 

*add weights to fake data;

data stocks;
    call streaminit(20);
    set sashelp.stocks;
    myWeight=rand('integer', 1, 100);
run;

*get percentiles;

proc univariate data=stocks noprint;
    by stock;
    weight myWeight;
    var open;
    output out=myNewData PctlPre=PERC_ PctlPts=0 to 100 by 1;
run;

*merge into main data and calculate the percentile bin using an array;
*you may want to check the <= comparisons, they are likely not right;
*and check the index to see how you want to deal with records above/below;

data calcs;
    merge stocks myNewData;
    by stock;
    array pct(*) perc:;

    do i=2 to dim(pct);

        if pct(i-1) <=open <=pct(i) then
            index=i-1;
    end;
    drop perc:;
run;

*get frequencies;

proc freq data=calcs noprint;
    table stock*index / out=long missing;
run;

*transpose to desired format - if just printing you do not need this step;

proc transpose data=long out=wide prefix=PRCT_;
    by stock;
    id index;
    var count;
run;

*print for display;

proc print data=wide;
run;
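
On the <= caveat in the comments: as written, a value that sits exactly on a cut point matches two neighbouring bins and ends up in the higher one, because the loop keeps going. One possible way to tighten this, offered only as a sketch, is to use half-open bins and stop at the first match. Which side of each bin is closed, and the special case for the group maximum, are assumptions you should adapt to your own definition of "within a percentile". The numbering (i-1 here) can also be shifted to suit.

```sas
*sketch only: half-open bins so a boundary value lands in exactly one bin;
data calcs;
    merge stocks myNewData;
    by stock;
    array pct(*) perc_:;
    index=.;

    do i=2 to dim(pct);
        *bin i-1 runs from pct(i-1) up to but not including pct(i);
        *the top bin also accepts the group maximum;
        if pct(i-1) <= open < pct(i)
            or (i=dim(pct) and open=pct(i)) then do;
            index=i-1;
            leave;
        end;
    end;
    drop i perc_:;
run;
```

The rest of the steps (PROC FREQ, PROC TRANSPOSE, PROC PRINT) can stay as they are.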


6 REPLIES
ballardw
Super User

I think you may want PROC RANK for this specific task:

proc rank data=mydataset groups=100
    out=rankedset;
    by age;
    var var2;
    ranks varrank;
run;

This adds a variable VARRANK to the data, indicating which percentile group (numbered 0 to 99) each record belongs to.

 

Then use proc freq/report/tabulate to count age values by the varrank values.
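
A minimal sketch of that counting step with PROC FREQ, assuming the rankedset data set and the varrank variable from the PROC RANK call above. The names counts and wide and the prefix P are placeholders, not anything from the original data.

```sas
*count records for each combination of age and rank group;
proc freq data=rankedset noprint;
    tables age*varrank / out=counts;
run;

*one row per age and one column per rank group;
proc transpose data=counts out=wide prefix=P;
    by age;
    id varrank;
    var count;
run;

proc print data=wide;
run;
```

The PROC FREQ output is already ordered by age, so the BY statement in PROC TRANSPOSE should not need an extra sort.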

Reeza
Super User
PROC RANK doesn't have a WEIGHT statement so the percentiles wouldn't be calculated correctly 😞
PSP_1
Fluorite | Level 6
Thank you so, so much! The code worked perfectly well and it solved my problem!

Thanks again for your time!
Reeza
Super User
It may be helpful to post your solution as well, since you had to tweak it to get the right values.
PSP_1
Fluorite | Level 6
Yes, sure.

The only change I made to the code was in the section below:

data calcs;
    merge stocks myNewData;
    by stock;
    array pct(*) perc:;

    do i=2 to dim(pct);
        *changed index=i-1 to index=i-2, otherwise the percentiles were shifted by 1;
        if pct(i-1) <=open <=pct(i) then
            index=i-2;
    end;
    drop perc:;
run;

Apart from that one line, I used everything else as is.
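
To illustrate why the offset matters, here is a small self-contained sketch. The five cut points are made-up stand-ins for PERC_0 to PERC_4, not real output. Because pct(1) holds PERC_0, the lower bound pct(i-1) corresponds to PERC_(i-2), so index=i-2 labels each bin by its lower percentile.

```sas
data demo;
    *hypothetical cut points standing in for PERC_0 to PERC_4;
    array pct(5) _temporary_ (10 20 30 40 50);

    do open=12, 25, 47;
        index=.;
        do i=2 to dim(pct);
            if pct(i-1) <= open <= pct(i) then
                index=i-2;
        end;
        output;
    end;
    drop i;
run;

proc print data=demo;
run;
```

With these values, 12 falls between the stand-ins for PERC_0 and PERC_1 and gets index 0, 25 gets index 1, and 47 gets index 3.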
