Categorizing so that SAS assigns an equal number o...

01-27-2013 10:28 AM

Hi,

How do you code so that SAS assigns an equal number of observations to each of 4 categories? I am working with an AGE variable (a continuous ratio variable), and I just want to split it into 4 categories that give me the same sample size in each group. Sometimes I'd like an equal number of observations in 5, 6, or 7 categories as well. People have told me that I could possibly use PROC RANK, but I really don't know how to use it even after looking it up. How do I need to code this? I'd like to have output like this:

Age group              # of people in each group
---------------------  -------------------------
0 to ?? years old      2000 people
?? to ?? years old     2000 people
?? to ?? years old     2000 people
?? to 100 years old    2000 people

Thank you!

GF

01-27-2013 10:33 AM

GROUPS = 5 or 6 or 7

VAR AGE;

RANKS AGEGroup;

01-27-2013 10:46 AM

Does this code go inside a DATA step? Do I add a RUN statement after RANKS AGEGroup? Does it work for 4 groups too? I just ran this:

GROUPS = 4

VAR AGE;

RANKS AGEGroup;

run;

My log gave me the following error

1415 GROUPS = 4

------

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

1416

1417 VAR AGE;

1418 RANKS AGEGroup;

-----

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

1419 run;

I'd appreciate any additional comments regarding the error. Thank you.

GF

01-27-2013 11:12 AM

01-27-2013 11:23 AM

Yeah, I already checked it before posting my question, but it was of no help. I need to see how the code is written.

01-27-2013 02:26 PM

To see how the code is written, you need to look in the Examples section of the documentation. The link looks the same as Data _null_'s, but it's not.

proc rank data=have out=want ties=low groups=4;

var age;

ranks age_rank;

run;

01-29-2013 08:38 AM

I tried

proc rank data=xxx.xxxxx out=WANT ties=low groups=4;

rank AGE;

ranks AGE_rank;

Run;

I am getting the log errors for the following:

1. SAS doesn't like out=WANT. Instead of WANT, what do I need to insert here?

2. Also, AGE_rank isn't defined, because this is the first time I have mentioned it in SAS. How am I supposed to define AGE_rank?

3. The RANK in my second line is in red. I don't know why.

I have been trying to figure out how to do this since Saturday. Please help.....

Thanks

GF

01-29-2013 10:48 AM

You should post your log when you have an error.

1) out=want isn't the issue, 2) I don't know what you mean, 3) that means you did something wrong.

The following works for me.

proc rank data=sashelp.class out=want groups=3 ties=low;

var age;

ranks age_rank;

run;

proc print; run;

01-29-2013 09:03 PM

I ran the following:

proc rank data=xxx.xxx out=want groups=4 ties=low;

var AGE;

ranks AGE_rank;

run;

proc freq data=xxx.xxx;

table AGE_rank;

run;

Then, I see the following error on my log

527

528 proc freq data=bmi.new;

529 table AGE_rank;

ERROR: Variable AGE_RANK not found.

530 run;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE FREQ used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

So, SAS is asking me to define AGE_RANK. What do I need to code to resolve this error?

GF

01-29-2013 11:35 PM

That's because variable AGE_RANK is created by Proc Rank. So you need to use the output data set in Proc Freq (so "want" instead of "xxx.xxx").
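Putting that together, here is a minimal sketch of the whole sequence. It assumes sashelp.class as stand-in data and keeps "want" as the output data set name; substitute your own input data set. The PROC MEANS step is an addition not discussed above, included only because it reports the lowest and highest age in each group, which is the "?? to ??" part of the table in the original question. With GROUPS=4 the rank variable takes the values 0 through 3, and tied ages always land in the same group, so the counts may not be exactly equal.

```sas
/* Rank AGE into 4 groups of roughly equal size.           */
/* AGE_rank is created here and exists only in WORK.WANT.  */
proc rank data=sashelp.class out=want groups=4 ties=low;
  var age;
  ranks age_rank;
run;

/* Count observations per group: read the PROC RANK output, */
/* not the original input data set.                         */
proc freq data=want;
  tables age_rank;
run;

/* Lowest and highest age in each group (the group boundaries). */
proc means data=want n min max;
  class age_rank;
  var age;
run;
```

To get 5, 6, or 7 groups, change only the GROUPS= value.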

If you've got Proc Surveyselect licensed then you could also use Proc Rank to create a strata variable which you then use as part of Proc Surveyselect.

There is also a data step approach under this link: http://support.sas.com/resources/papers/proceedings09/058-2009.pdf
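For comparison, a data step approach can be sketched as follows. This is not necessarily the method used in the linked paper; it is only one simple way to do it, again assuming sashelp.class as stand-in data, with "sorted", "want_equal", "n_obs", and "age_group" as made-up names. It sorts by age and then cuts the sorted rows into 4 blocks by position, so the group sizes differ by at most one, but people with the same age can end up in different groups.

```sas
/* Sort so that row position reflects the age ordering. */
proc sort data=sashelp.class out=sorted;
  by age;
run;

/* NOBS= puts the number of observations in n_obs.            */
/* _N_ is the current row number, so CEIL(_N_ * 4 / n_obs)    */
/* assigns groups 1 to 4 by position in the sorted data.      */
data want_equal;
  set sorted nobs=n_obs;
  age_group = ceil(_n_ * 4 / n_obs);
run;

/* Check the group sizes. */
proc freq data=want_equal;
  tables age_group;
run;
```

Whether to split tied ages (this sketch) or keep them together (PROC RANK) depends on whether exactly equal counts or clean age boundaries matter more.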