Hi,
I am new to SAS and am struggling to find a solution to my problem.
I have a data set with the variables:
var1 var2 age
I calculated the percentiles using the code below and this worked well:
proc univariate data=mydataset;
  by age;
  weight var1;
  var var2;
  output out=mynew_data
    PctlPre=PERC_
    PctlPts=0 to 100 by 1;
run;
I would now like to determine how many observations fall within each percentile, and I am really struggling. I need the output to look like this:
        p1      p2      etc...
age1    n obs   n obs
age2
etc..
Your help would be much appreciated.
Here's one way. Please read the comments: this will need some tweaking to ensure you get the correct values.
You didn't provide sample data, so I worked with the SASHELP.STOCKS data set; replace it with your own data set and variable names and you should be fine.
*add weights to fake data;
data stocks;
  call streaminit(20);
  set sashelp.stocks;
  myWeight=rand('integer', 1, 100);
run;

*get percentiles;
proc univariate data=stocks noprint;
  by stock;
  weight myWeight;
  var open;
  output out=myNewData PctlPre=PERC_ PctlPts=0 to 100 by 1;
run;

*merge into main data and calculate percentile using an array;
*you may want to check the <= its likely not right;
*and the index to see how you want to deal with records above/below;
data calcs;
  merge stocks myNewData;
  by stock;
  array pct(*) perc:;
  do i=2 to dim(pct);
    if pct(i-1) <=open <=pct(i) then index=i-1;
  end;
  drop perc:;
run;

*get frequencies;
proc freq data=calcs noprint;
  table stock*index / out=long missing;
run;

*transpose to desired format - if just printing you dont need this step;
proc transpose data=long out=wide prefix=PRCT_;
  by stock;
  id index;
  var count;
run;

*print for display;
proc print data=wide;
run;
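On the `<=` comment above: with both ends closed, a value that sits exactly on a percentile boundary matches two adjacent bins, and because the loop keeps running, the later bin wins. One possible convention, offered only as a sketch and not necessarily the one you want, is to treat each bin as half-open and stop at the first match. It reuses the `stocks` and `myNewData` data sets from the code above; the output name `calcs_halfopen` is just an illustrative name.

```sas
*sketch only - half-open bins, first match wins;
data calcs_halfopen;
  merge stocks myNewData;
  by stock;
  array pct(*) perc:;
  index=.;
  *stop looping as soon as index has been assigned;
  do i=2 to dim(pct) while (missing(index));
    *lower bound included and upper bound excluded, except in the last bin;
    if pct(i-1) <= open < pct(i)
       or (i=dim(pct) and open=pct(i)) then index=i-1;
  end;
  drop i perc:;
run;
```

Records with a missing value of `open` keep a missing `index`, and tied percentile values will leave some bins empty, so check the counts against what you expect before relying on them.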
I think that you may want Proc Rank for this specific task:
proc rank data=mydataset groups=100 out=rankedset;
  by age;
  var var2;
  ranks varrank;
run;
This adds a variable VARRANK to the data that indicates which percentile group each record belongs to (with GROUPS=100 the values run from 0 to 99).
Then use Proc Freq, Report, or Tabulate to count the records by age and VARRANK.
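A minimal sketch of the whole sequence, assuming the data set and variable names from the question (`mydataset`, `age`, `var2`); the names `sorted`, `counts`, and `wide` and the `P` prefix are illustrative. Keep in mind that Proc Rank has no WEIGHT statement, so these groups are unweighted and will not match the weighted percentiles from Proc Univariate.

```sas
*BY-group processing needs the data sorted by age;
proc sort data=mydataset out=sorted;
  by age;
run;

*assign each record to one of 100 groups within each age;
proc rank data=sorted groups=100 out=rankedset;
  by age;
  var var2;
  ranks varrank;
run;

*count the records in each age and percentile group;
proc freq data=rankedset noprint;
  tables age*varrank / out=counts;
run;

*one row per age and one column per percentile group;
proc transpose data=counts out=wide prefix=P;
  by age;
  id varrank;
  var count;
run;

proc print data=wide;
run;
```

The columns come out as P0, P1, and so on, in the order the groups are first encountered. An age with fewer than 100 distinct values of var2 will not populate every group, so expect some missing cells.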