SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
littlewho
Fluorite | Level 6

Hello,

 

I'm pretty new to SAS programming language and I would need advice about the best practice in the following problem. I have this table containing cut-off values of a percentile distribution:

proc univariate data=temp1 noprint;
   var ptb;
   output out=percentiles1 pctlpts=10 20 30 40 50 60 70 80 90 100 pctlpre=P;
run;
proc print data=percentiles1;
run;

Now, I would like to iterate over "temp1" table and decide in which bucket fits. I tried something like this:

data temp2 (drop = p10 p20 p30 p40 p50 p60 p70  p80 p90 p100);
merge temp1 percentiles1;
if ptb <= p10 then bucket = '1st percentile';
else if ptb <= p20 then bucket = '2nd percentile';
... and so on ...
run;

But this would work only for the first row. I was thinking to do a cross join in SQL, but I think it's not the best approach to copy the row of percentiles everywhere.

So, what would be the best approach?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Other than creating text values (likely not a best idea ) Proc Ranks will do this.

proc ranks data=test1 groups=10
   out=want;
   var ptb  ;
   ranks ptbtile;
run;

The variable ptbtile will have values of 0 to 9 (10 groups).

Assign a format to display the value of '1st decile' if needed (decile would be more descriptive then "1st percentile" as that would indicate the P1 not P10 cutoff.

View solution in original post

4 REPLIES 4
ballardw
Super User

Other than creating text values (likely not a best idea ) Proc Ranks will do this.

proc ranks data=test1 groups=10
   out=want;
   var ptb  ;
   ranks ptbtile;
run;

The variable ptbtile will have values of 0 to 9 (10 groups).

Assign a format to display the value of '1st decile' if needed (decile would be more descriptive then "1st percentile" as that would indicate the P1 not P10 cutoff.

littlewho
Fluorite | Level 6

EDIT: solved, I had to use "by" groups

 

Do you know if there is any way to use 'class' keyword with proc ranks?
I would like the ranking to be done separately for each different date.
I obtained the cut-off values using proc univariate with class and it looks good, but I would also need something similar for ranking.

Reeza
Super User
You can have multiple variables in your BY statement so just add the date to the BY statement.
Reeza
Super User

@littlewho wrote:

Hello,

 

I'm pretty new to SAS programming language and I would need advice about the best practice in the following problem. I have this table containing cut-off values of a percentile distribution:

proc univariate data=temp1 noprint;
   var ptb;
   output out=percentiles1 pctlpts=10 20 30 40 50 60 70 80 90 100 pctlpre=P;
run;
proc print data=percentiles1;
run;

Now, I would like to iterate over "temp1" table and decide in which bucket fits. I tried something like this:

data temp2 (drop = p10 p20 p30 p40 p50 p60 p70  p80 p90 p100);
merge temp1 percentiles1;
if ptb <= p10 then bucket = '1st percentile';
else if ptb <= p20 then bucket = '2nd percentile';
... and so on ...
run;

But this would work only for the first row. I was thinking to do a cross join in SQL, but I think it's not the best approach to copy the row of percentiles everywhere.

So, what would be the best approach?

 

Thanks!


Neither of these. Use PROC RANK instead. 0 will correspond to the first decile, 1 to the second decile and so forth.

https://documentation.sas.com/?docsetId=proc&docsetVersion=9.4&docsetTarget=p1xzpoijq32wbsn1gr6g5cx3...

proc rank data=temp1 out=temp2 groups=10;
var ptb
ranks ptb_rank;
run;

 

To fix your code, you need to merge it to all rows which is accomplished with a little code trick.

 

data temp2 (drop = p10 p20 p30 p40 p50 p60 p70  p80 p90 p100);
set temp1;
if _n_ =1 then set percentiles1;
if ptb <= p10 then bucket = '1st percentile';
else if ptb <= p20 then bucket = '2nd percentile';
... and so on ...
run;

 

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 3475 views
  • 3 likes
  • 3 in conversation