Contributor
Posts: 45

# if then statement to create new variables of ranges

[ Edited ]

Hi I tried to do this in Excel, but was unsuccessful. Now I need help trying to do this in SAS, please :

Data looks like this (in months---not important)....and Age1 is the beginning of an age range and Age2 is the end.

Age1              Age2

13                   60             <---(data collected like this, but basically means that age range is 13-60)

50                   70

168                780

0                     0

0                    240

all                   all

192                older

I want to create new variables that are interval groups to see where each age range falls:

Interval groups are:

<12  is group1

12-60 is group2

61-132 is group3

133 - 204 is group4

205 < is group5

.....and if "all" in both Age1 and Age2 then they go in all groups

.....and if "0" in both Age1 and Age2 then they go in group1 only

.....and if "older" in Age2 ("older won't be found in Age1), then it goes into the group of where Age1 would fall and all groups after that

.....see that depending on the range (between Age1 and Age2), counts can be in multiple groups (see below)

Data should look like:

Age1              Age2         group1              group2          group3           group4           group5

13                   60               0                         1                       0                   0                    0  <-(only falls in group2)

50                   70               0                         1                       1                   0                    0  <-(both group2&group3)

168                780              0                         0                       0                   1                   1

0                     0                 1                         0                       0                  0                    0

0                    240              1                         1                       1                  1                    1

all                   all                1                        1                        1                  1                    1  <-(all groups)

62                 older             0                        0                        1                  1                    1  <-(group3&each one after)

Any help would be appreciated! Thanks in advance!

Super User
Posts: 9,599

## Re: if then statement to create new variables of ranges

Well, as you haven't provided test data in the form of a datastep, this is just untested theory:

```data want;
set have;
array group{5} 8;
select;
when (age1 < 12) group1=1;
when (12 <= age1 and age2 <=60) group2=1;
...
otherwise;
end;
run;
```
Contributor
Posts: 45

## Re: if then statement to create new variables of ranges

I apologize (here is the test data)

data work;

input age1 age2;

datalines;

13     60

50     70

168   780

0       0

0      240

all     all

192   older

run;

the code you suggested is something I tried as a function in basic Excel before and did not work. The statement......

when (12 <= age1 and age2 <=60) group2=1;

would not work as the second row would not get a "1" in group2 AND group3 even though it should because 50-70 should fall into both groups....

Super User
Posts: 9,599

## Re: if then statement to create new variables of ranges

If you run that datastep you will see there are errors as you have number and character (in the input you are reading number as no \$).  I have removed the invalid data and present:

```data work;
input age1 age2;
datalines;
13     60
50     70
168   780
0       0
0      240
;
run;

data want;
set work;
array group{5} 8;
group1=ifn(12 <= age1 and age2 <= 60,1,0);
group2=ifn(61 <= age1 and age2 <= 132,1,0);
...
run;```
Super User
Posts: 8,090

## Re: if then statement to create new variables of ranges

[ Edited ]

So it looks like your source data has the age vaues as text instead of numbers. That will make it harder to compare. How about if we convert your 'all' and 'older' (and also 'younger') into extremely low and high values so that we can make a numeric variable out of the ages.  Then the logic for telling if the two ranges overlap is pretty simple.

``````data work;
length text_age1 text_age2 \$10 age1 age2 group1-group5 8;
input text_age1 text_age2;
if text_age1 in ('younger','all') then age1=-1;
else age1=input(text_age1,10.);
if text_age2 in ('older','all') then age2=1000;
else age2=input(text_age2,10.);
group1 = (              age1 <= 12) ;
group2 = (age2 >  12 and age1 <= 60);
group3 = (age2 >  60 and age1 <=132);
group4 = (age2 > 132 and age1 <=204);
group5 = (age2 > 204);
datalines;
13     60
50     70
168   780
0       0
0      240
all     all
192   older
;;;;

proc print; run;``````
Discussion stats
• 4 replies
• 1267 views
• 0 likes
• 3 in conversation