Hi,
How do I write code so that SAS assigns an equal number of observations to each of 4 categories? I am working with an AGE variable (continuous, ratio scale), and I want to split it into 4 categories that give me the same sample size in each group. Sometimes I'd also like an equal number of observations in 5, 6, or 7 categories. People have told me that I could use PROC RANK, but I don't know how to use it even after looking it up. What code do I need to achieve this? I'd like output like this:
Age group             # of people in each group
--------------------  -------------------------
0 to ?? years old     2000 people
?? to ?? years old    2000 people
?? to ?? years old    2000 people
?? to 100 years old   2000 people
Thank you!
GF
GROUPS = 5 or 6 or 7
VAR AGE;
RANKS AGEGroup;
Does this code go inside a data step? Do I add a RUN statement after RANKS AGEGroup? Does it work for 4 groups too? I just ran this:
GROUPS = 4
VAR AGE;
RANKS AGEGroup;
run;
My log gave me the following error:
1415 GROUPS = 4
------
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
1416
1417 VAR AGE;
1418 RANKS AGEGroup;
-----
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
1419 run;
I'd appreciate any additional comments regarding the error. Thank you.
GF
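ERROR 180-322 in the log above is what SAS typically reports when statements such as GROUPS=, VAR, and RANKS are submitted without a PROC RANK statement in front of them. A minimal sketch of a complete step follows, assuming a hypothetical input data set named have that contains AGE and a hypothetical output data set named want. Note that GROUPS= is an option on the PROC RANK statement itself, not a separate statement.

```sas
/* Hypothetical data set names: have (input) and want (output). */
proc rank data=have out=want groups=4;
   var age;          /* variable to rank */
   ranks AGEGroup;   /* new variable holding the group number, 0 to 3 */
run;
```

The step stands on its own; it does not go inside a data step. Changing groups=4 to 5, 6, or 7 changes the number of categories.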
Yeah, I already checked it before posting my question, but it was of no help. I need to see how the code is written.
To see how the code is written, you need to look in the Examples section of the documentation. The link looks the same as Data _null_'s, but it's not.
proc rank data=have out=want ties=low groups=4;
rank age;
ranks age_rank;
run;
I tried
proc rank data=xxx.xxxxx out=WANT ties=low groups=4;
rank AGE;
ranks AGE_rank;
Run;
I am getting log errors for the following:
1. SAS doesn't like out=WANT. Instead of WANT, what do I need to insert here?
2. Also, AGE_rank is not defined, because this is the first time I have used it in SAS. How am I supposed to define AGE_rank?
3. The RANK in my second line is in red. I don't know why.
I have been trying to figure out how to do this since Saturday. Please help.....
Thanks
GF
You should post your log when you have an error.
1) out=want isn't the issue, 2) I don't know what you mean, 3) that means you did something wrong.
The following works for me.
proc rank data=sashelp.class out=want groups=3 ties=low;
var age;
ranks age_rank;
run;
proc print; run;
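Since the number of groups in the question varies between 4 and 7, one option is to hold it in a macro variable. This is only a sketch on the same sashelp.class example; ngroups is a name chosen here for illustration. PROC RANK numbers the groups from 0 to GROUPS-1, and tied ages always land in the same group, so the group sizes are only approximately equal when AGE has many ties.

```sas
%let ngroups = 4;   /* illustrative macro variable; set to 5, 6, or 7 as needed */

proc rank data=sashelp.class out=want groups=&ngroups ties=low;
   var age;
   ranks age_rank;  /* takes values from 0 up to &ngroups - 1 */
run;
```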
I ran the following:
proc rank data=xxx.xxx out=want groups=4 ties=low;
var AGE;
ranks AGE_rank;
run;
proc freq data=xxx.xxx;
table AGE_rank;
run;
Then I see the following error in my log:
527
528 proc freq data=bmi.new;
529 table AGE_rank;
ERROR: Variable AGE_RANK not found.
530 run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
So, SAS is asking me to define AGE_RANK. What do I need to code to resolve this error?
GF
That's because variable AGE_RANK is created by Proc Rank. So you need to use the output data set in Proc Freq (so "want" instead of "xxx.xxx").
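A sketch of that fix, assuming the output data set is named want as in the PROC RANK step above. The PROC MEANS step is an addition for illustration: it lists the lowest and highest age in each group, which gives the "?? to ?? years old" boundaries asked for in the original question.

```sas
/* Count of observations per group, read from the PROC RANK output data set */
proc freq data=want;
   tables age_rank;
run;

/* Number of observations plus lowest and highest age in each group */
proc means data=want n min max;
   class age_rank;
   var age;
run;
```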
If you've got Proc Surveyselect licensed then you could also use Proc Rank to create a strata variable which you then use as part of Proc Surveyselect.
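One way this could look, as a sketch only: it assumes the data set want with the variable age_rank from the PROC RANK step above, and the names want_sorted and sample, the sample size of 2 per stratum, and the seed are all illustrative choices. PROC SURVEYSELECT expects the input to be sorted by the STRATA variable, hence the PROC SORT step.

```sas
/* Sort by the strata variable created by PROC RANK */
proc sort data=want out=want_sorted;
   by age_rank;
run;

/* Draw a simple random sample of the same size from every age group.    */
/* sampsize=2 and seed=12345 are illustrative; each stratum must contain */
/* at least as many observations as the requested sample size.           */
proc surveyselect data=want_sorted out=sample method=srs sampsize=2 seed=12345;
   strata age_rank;
run;
```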
There is also a data step approach under this link: http://support.sas.com/resources/papers/proceedings09/058-2009.pdf