BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ak2011
Fluorite | Level 6

Hi,

I would appreciate if someone could help me with the SAS code to create tertiles (low income, medium income, high income) for dataset subjinc below and count the number of ca case pop cont and can cont. The SAS code and log are shown below: Table 1 output is attached.

 

My ultimate aim is to count the number of ca case, pop cont and can cont for low income (showing income range), medium income (showing income range) and  and high income (showing income range).

Any help, please? 

Thanks.

ak.

 


/* Suject income*/
data subjinc;
input id$ 1-5 job 7 income lung$ 15-22;
datalines;
OSa06 3 32356 Pop cont
OSa10 1 25689 Ca case
OSa11 1 18546 Ca case
OSa12 3 15756 Ca case
OSa13 3 14527 Ca case
OSa14 3 20458 Ca case
OSa29 2 56974 Ca cont
OSa30 4 34568 Ca cont
OSa31 4 69713 Ca cont
OSa32 1 12896 Ca case
OSa33 3 25684 Ca case
OSa34 3 78362 Ca case
OSa35 3 18998 Ca case
OSa54 3 26934 Pop cont
OSa73 3 65823 Pop cont
OSa86 3 23542 Ca cont
OSa93 3 14589 Pop cont
OSa94 3 14325 Pop cont
OSa95 4 45802 Pop cont
OSb02 1 27419 Ca cont
OSb16 1 15843 Ca case
OSb20 4 16560 Ca case
OSb38 3 21123 Ca cont
OSb39 3 19743 Ca cont
OSb58 1 36589 Pop cont
OSb70 1 32145 Ca cont
OSb71 2 19400 Ca cont
OSb72 3 21356 Ca cont
OSb73 4 70195 Ca cont
;

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73 /* Suject income*/
74 data subjinc;
75 input id$ 1-5 job 7 income lung$ 15-22;
76 datalines;
 
NOTE: The data set WORK.SUBJINC has 29 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
 
 
106 ;
107
108 /* TASK 1: Creating cutpoints in entire dataset*/
109
110 data cutp; set subjinc;
111
 
NOTE: There were 29 observations read from the data set WORK.SUBJINC.
NOTE: The data set WORK.CUTP has 29 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
112 proc rank data=cutp out=cutp2 groups=3;
113 var income;
114 ranks rank2;
115 run;
 
NOTE: The data set WORK.CUTP2 has 29 observations and 5 variables.
NOTE: PROCEDURE RANK used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
 
 
116
117 proc print data= cutp2;
118 title ' Table 1: Subject income ranks';
119 run;
 
NOTE: There were 29 observations read from the data set WORK.CUTP2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.34 seconds
cpu time 0.34 seconds
 
 
120
121
122 data cutp3; set cutp2;
123 if rank2 = 0 then incomerank ='low';
124 if rank2 = 1 then incomerank ='medium';
125 if rank2 = 2 then incomerank ='high';
126 run;
 
NOTE: There were 29 observations read from the data set WORK.CUTP2.
NOTE: The data set WORK.CUTP3 has 29 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
 
 
127
128 /*TASK 2: COUNTING ca case, pop cont and can cont with dollar amount range:
129
130 low income range(.......), medium income range (.....), high income range (....)*/
131
132
133
134 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
146

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Something like this?

 

proc rank data=subjinc out=subjrnk groups=3;
var income;
ranks subjgrp;
run;

proc format;
value incgrp
0 = "low income"
1 = "medium income"
2 = "high income";
run;

proc sql;
alter table subjrnk
modify subjgrp format=incgrp.;
quit;

proc tabulate data=subjrnk;
var income;
class subjgrp;
table (min*format=dollar6.0 max*format=dollar6.0 n*format=4.0),subjgrp=""*income="";
run;
  	low income 	medium income 	high income
Min 	$12896 	$19400 	$32145
Max 	$18998 	$27419 	$78362
N 	9 	10 	10
PG

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

 

You don't seem to actually need cutpoints, since PROC RANK has put your data into 3 roughly equal-size groups.  Can't you just take the resulting ranked data set, and run a proc freq, cross-tabulating the variables of interest against rank2?

 

Now, if you actually want cutpoints, then PROC UNIVARIATE is the way to go, as in:

 

proc univariate data=subjinc noprint ;
  var income;
  output out=need pctlpts=33.3 66.7  pctlpre=pctl_ ;
run;
proc print;run;

which produces dataset NEED with one observations and two variables:  PCTL_33_3=19400   and PCTL_66_7=32145.

 

You could then

data want;
 set subjinc;
 if _n_=1 then set need;
 if .<income < pctl_33_3 then incgroup='Low'; else 
 if   income < pctl)66_7 then incgroup='Mid'; else
 incgroup='Hi';
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ak2011
Fluorite | Level 6
Hi, you helped me sometime ago with the Sas code for cut points and thank you for that. However, the "data want' portion of the SAS code to create percentiile ranking(33.3 and 66.7) does not work. I would appreciate if you could help me with corrections. I tried several things to correct the error but it still did not work. Thank you in advance.
PGStats
Opal | Level 21

Something like this?

 

proc rank data=subjinc out=subjrnk groups=3;
var income;
ranks subjgrp;
run;

proc format;
value incgrp
0 = "low income"
1 = "medium income"
2 = "high income";
run;

proc sql;
alter table subjrnk
modify subjgrp format=incgrp.;
quit;

proc tabulate data=subjrnk;
var income;
class subjgrp;
table (min*format=dollar6.0 max*format=dollar6.0 n*format=4.0),subjgrp=""*income="";
run;
  	low income 	medium income 	high income
Min 	$12896 	$19400 	$32145
Max 	$18998 	$27419 	$78362
N 	9 	10 	10
PG

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 2115 views
  • 2 likes
  • 3 in conversation