Hi,
I am new to SAS and am struggling to find a solution to my problem.
I have a data set with these variables:
var1 var2 age
I calculated the percentiles using the code below and this worked well:
proc univariate data=mydataset;
by age;
weight var1;
var var2;
output out = mynew_data
PctlPre=PERC_
PctlPts=0 to 100 by 1;
run;
I would now like to determine how many observations fall within each percentile, and I am really struggling. I need the output to look like this:
       p1      p2      etc...
age1   n obs   n obs
age2
etc..
Your help would be much appreciated.
Here's one way. Please read the comments: this will need some tweaking to ensure you get the correct values.
You didn't provide sample data, so I worked with the SASHELP.STOCKS data set. Replace it with your own data set and variable names and you should be fine.
*add weights to fake data;
data stocks;
call streaminit(20);
set sashelp.stocks;
myWeight=rand('integer', 1, 100);
run;
*get percentiles;
proc univariate data=stocks noprint;
by stock;
weight myWeight;
var open;
output out=myNewData PctlPre=PERC_ PctlPts=0 to 100 by 1;
run;
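*merge into main data and calculate the percentile index using an array;
*check the <= comparisons, they are likely not right: a value equal to a boundary satisfies two adjacent intervals and the last match wins;
*also check the index to decide how you want to handle records above/below the range;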
data calcs;
merge stocks myNewData;
by stock;
array pct(*) perc:;
do i=2 to dim(pct);
if pct(i-1) <=open <=pct(i) then
index=i-1;
end;
drop perc: i;
run;
*get frequencies;
proc freq data=calcs noprint;
table stock*index / out=long missing;
run;
*transpose to desired format - if just printing you do not need this step;
proc transpose data=long out=wide prefix=PRCT_;
by stock;
id index;
var count;
run;
*print for display;
proc print data=wide;
run;
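The comments above flag the `<=` comparison as the part most likely to need tweaking. One possible adjustment, offered only as a sketch: stop at the first interval that contains the value, so a record sitting exactly on a boundary is counted in the lower percentile instead of the higher one. This assumes the `stocks` and `myNewData` data sets created above; the output name `calcs_first` is made up for illustration, and whether "first match" is the right rule depends on how you define membership in a percentile.

```sas
*assign each record to the first interval that contains it;
data calcs_first;
    merge stocks myNewData;
    by stock;
    array pct(*) perc_:;
    do i = 2 to dim(pct);
        if pct(i-1) <= open <= pct(i) then do;
            index = i - 1;
            *stop here so a later interval cannot overwrite the index;
            leave;
        end;
    end;
    drop perc_: i;
run;
```

If you use this variant, point the PROC FREQ step at `calcs_first` instead of `calcs`.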
I think you may want PROC RANK for this specific task:
proc rank data=mydataset groups=100
out=rankedset;
by age;
var var2;
ranks varrank;
run;
This adds a variable VARRANK to the data that indicates which percentile group each record belongs to (numbered 0 to 99 when groups=100).
Then use PROC FREQ, PROC REPORT, or PROC TABULATE to count the records in each VARRANK group within each age value.
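The counting step could look something like the sketch below. It assumes the `rankedset` data set and `varrank` variable from the PROC RANK step; the names `rank_counts` and `rank_wide` and the `P` prefix are made up for illustration. Note that, as far as I know, PROC RANK has no WEIGHT statement, so these groups are unweighted, unlike the weighted percentiles from PROC UNIVARIATE in the question.

```sas
*count records per age and percentile group;
proc freq data=rankedset noprint;
    tables age*varrank / out=rank_counts;
run;

*one row per age, one column per percentile group;
proc transpose data=rank_counts out=rank_wide prefix=P;
    by age;
    id varrank;
    var count;
run;

proc print data=rank_wide;
run;
```

If you only need to look at the counts, a plain `tables age*varrank;` without `noprint` and without the transpose may be enough.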