Hi,
I am trying to band the variable 'GROUP' in my want dataset, so that it:
1. Creates a new variable 'BAND' in the 'test' dataset
2. Create a new BAND_1_to_5 variable, which will use the newly created 'BAND' variable just created.
BAND_1_to_5 will be equal to 'Yes' when the new 'BAND' variable created has:
BAND='GROUP = 1'
BAND='GROUP = 2'
BAND='GROUP= 3'
BAND='GROUP = 4'
BAND='GROUP = 5'
else BAND_1_to_5 will be equal to 'No' for any other result e.g. Band = 'GROUP = -99999', Band = 'GROUP = -99998' etc.
Please can the code be reviewed below to check if the new BAND_1_to_5 variable I want to create will produce 'Yes' and 'No' results required, and whether BAND and BAND_1_to_5 can both be created in one DATA step (see question 2 below), or whether BAND_1_to_5 needs to be created in a separate DATA step (see the two questions below)? Thanks!
Current code
data test;
set want;
length Band $25;
if GROUP = -99999 then Band = 'GROUP = -99999';
else if GROUP = -99998 then Band = 'GROUP = -99998';
else if -99997 <= GROUP <=-1 then Band = '-99997 <= GROUP <=-1';
else if GROUP = 0 then Band = 'GROUP = 0';
else if GROUP = 1 then Band ='GROUP = 1';
else if GROUP = 2 then Band = 'GROUP = 2';
else if GROUP = 3 then Band = 'GROUP = 3';
else if GROUP = 4 then Band = 'GROUP = 4';
else if GROUP = 5 then Band = 'GROUP = 5';
else if 6 <= GROUP <= 10 then Band = 'GROUP = 6-10';
else Band = 'GROUP = 11+';
run;
1. Would the code below be correct (using in ([all of the GROUP results from 1-5]) so that BAND_1_to_5 has a result of 'Yes' when GROUP=1, GROUP=2, GROUP=3, GROUP=4 and GROUP=5 only (see code below)?
data test2;
set test;
length Band_1_to_5 $3;
if BAND in ('GROUP = 1', 'GROUP = 2', 'GROUP= 3', 'GROUP = 4', 'GROUP = 5') then BAND_1_to_5 = 'Yes';
else BAND_1_to_5 = 'No';
run;
2. OR can I put the code
(if BAND in ('GROUP = 1', 'GROUP = 2', 'GROUP= 3', 'GROUP = 4', 'GROUP = 5') then BAND_1_to_5 = 'Yes';
else BAND_1_to_5 = 'No'; )
in the code written above, so that both that both new variables, BAND and BAND_1_to_5 are BOTH included in the just the one 'test' dataset (i.e. everything is created from just one DATA step).
For example:
data test;
set want;
length Band $25;
if GROUP = -99999 then Band = 'GROUP = -99999';
else if GROUP = -99998 then Band = 'GROUP = -99998';
else if -99997 <= GROUP <=-1 then Band = '-99997 <= GROUP <=-1';
else if GROUP = 0 then Band = 'GROUP = 0';
else if GROUP = 1 then Band ='GROUP = 1';
else if GROUP = 2 then Band = 'GROUP = 2';
else if GROUP = 3 then Band = 'GROUP = 3';
else if GROUP = 4 then Band = 'GROUP = 4';
else if GROUP = 5 then Band = 'GROUP = 5';
else if 6 <= GROUP <= 10 then Band = 'GROUP = 6-10';
else Band = 'GROUP = 11+';
if BAND in ('GROUP = 1', 'GROUP = 2', 'GROUP= 3', 'GROUP = 4', 'GROUP = 5') then BAND_1_to_5 = 'Yes'; /*Is it possible to create BAND_1_to_5 in the same DATA step, when GROUP is a new variable being created? */
else BAND_1_to_5 = 'No';
run;
Hello,
Why don't you just try it and see if it works ?
You can use the newly created variable BAND in order to create the new variable BAND_1_to_5 in the same
data step but it is simpler to use the variable GROUP as it is numeric so you don't have potential problems with case
or blanks.
data have;
input GROUP;
cards;
-99999
-99998
-99997
-1
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
;
run;
data test;
set have;
length Band $25;
if GROUP in (-99999,-99998) or (0<=GROUP<=5) then Band = catx(' ','GROUP =', GROUP);
else if -99997 <= GROUP <=-1 then Band = '-99997 <= GROUP <=-1';
else if 6 <= GROUP <= 10 then Band = 'GROUP = 6-10';
else Band = 'GROUP = 11+';
if (1<=GROUP<=5) then BAND_1_to_5 = 'Yes';
else BAND_1_to_5 = 'No';
run;
1. Would the code below be correct (using in ([all of the GROUP results from 1-5]) so that BAND_1_to_5 has a result of 'Yes' when GROUP=1, GROUP=2, GROUP=3, GROUP=4 and GROUP=5 only (see code below)? data test2; set test; length Band_1_to_5 $3; if BAND in ('GROUP = 1', 'GROUP = 2', 'GROUP= 3', 'GROUP = 4', 'GROUP = 5') then BAND_1_to_5 = 'Yes'; else BAND_1_to_5 = 'No'; run; 2. OR can I put the code (if BAND in ('GROUP = 1', 'GROUP = 2', 'GROUP= 3', 'GROUP = 4', 'GROUP = 5') then BAND_1_to_5 = 'Yes'; else BAND_1_to_5 = 'No'; ) in the code written above, so that both that both new variables, BAND and BAND_1_to_5 are BOTH included in the just the one 'test' dataset (i.e. everything is created from just one DATA step).
[1] Do you want the combination in 1. to be all 5 to simultaneously true to take value 'YES'? Or if one of them (eg. 'GROUP = 4' ) is true then do you want 'YES'.
A better way will be to add few records in DATALINES and show an output you require out of it.
Hello,
Why don't you just try it and see if it works ?
You can use the newly created variable BAND in order to create the new variable BAND_1_to_5 in the same
data step but it is simpler to use the variable GROUP as it is numeric so you don't have potential problems with case
or blanks.
data have;
input GROUP;
cards;
-99999
-99998
-99997
-1
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
;
run;
data test;
set have;
length Band $25;
if GROUP in (-99999,-99998) or (0<=GROUP<=5) then Band = catx(' ','GROUP =', GROUP);
else if -99997 <= GROUP <=-1 then Band = '-99997 <= GROUP <=-1';
else if 6 <= GROUP <= 10 then Band = 'GROUP = 6-10';
else Band = 'GROUP = 11+';
if (1<=GROUP<=5) then BAND_1_to_5 = 'Yes';
else BAND_1_to_5 = 'No';
run;
Creating custom formats for variables removes the need to keep adding variables.
Groups created by formats are honored by most analysis, reporting and graphic procedures. So you can use a different format with the same variable to create different groups for an analysis or report.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.