Contributor
Posts: 47

# Cutting and keeping characters from variables/ transforming data to months

Hi I am trying to separate the following variable:

agegroup

up to 59 days

2 months to 17 years

4 years to 10 years

up to 12 hours

7 years to 8 years

10 years and older

all ages

TO

agegroup1       agegroup2

up                     59 days

2 months          17 years

4 years             10 years

up                     12 hours

7 years              8 years

10 years            older

all                     ages

THEN, i would like all values to be converted into months (and if in the variables "up", then = 0, and if "older", then = 1200, and if "all", then=0, and if "ages", then = 1200):

agegroup1_a     agegroup2_a

0                         1.94

2                         204

48                       120

0                         0.02

84                       96

120                     1200

Posts: 1,837

## Re: Cutting and keeping characters from variables/ transforming data to months

Spliting agegroup:

``````data temp;
set have;
ix = index(agegroup,'to');
agegroup1 = substr(agegroup,1,ix);
agegroup2 = substr(agegroup,ix+3);     drop ix;
run;``````

Coversion:

``````data want;
set temp;
length from to 8;
keep from upto;

if  strip(agegroup1) in ('up' 'all)  then from=0; else
if index(agegroup1,'month') > 0 then from = input(compress(agegroup1,,'A'),best.); else
if index(agegroup1,'year') > 0 then from =  input(compress(agegroup1,,'A'), best.)/12;

if index(agegroup2,'days') > 0 then
upto = input(compress(agegroup2,,'A'), best.)/30;
else  if index(agegroup2,'year') > 0 then upto= input(compress(agegroup2,,'A'), best.)/12;
else  if index(agegroup2,'hour') > 0 then upto = input(compress(agegroup2,,'A'), best.)/(30*24);
else if strip(agegroup2) in ('older' 'ages') then upto=1200;
run;``````

I haven't tested it but it should work.

Contributor
Posts: 47

## Re: Cutting and keeping characters from variables/ transforming data to months

Hi,

Thank you for the response....I was finally able to try your suggeste code today...so for the splitting of the age group, the "t" from the word "to" still remains in agegroup1

Also, I get the following error in the conversion code:

NOTE 49-169: The meaning of an identifier after a quoted string might change in a future SAS
release. Inserting white space between a quoted string and the succeeding
identifier is recommended.

ERROR 22-322: Syntax error, expecting one of the following: a quoted string,
a numeric constant, a datetime constant, a missing value, iterator, (, ), ','.

ERROR 76-322: Syntax error, statement will be ignored.

at the end of these lines:

if index(agegroup1,'year') > 0 then from = input(compress(agegroup1,,'A'),best.)/12;

else if index(agegroup2,'year') > 0 then upto= input(compress(agegroup2,,'A'), best.)/12;
else if index(agegroup2,'hour') > 0 then upto = input(compress(agegroup2,,'A'), best.)/(30*24);

Contributor
Posts: 47

## Re: Cutting and keeping characters from variables/ transforming data to months

and the "years and older" shows up in both agegroup1 and agegroup2

agegroup1                          agegroup 2

3 years t                             12 years

13 years and older             Years and older

all ages                               l ages

Contributor
Posts: 47

## Re: Cutting and keeping characters from variables/ transforming data to months

so I figured out how to split the age groups correctly and found that the error statements i was getting was because i was missing a ' at the end of the first "all"

however,

my new dataset shows as:

from           upto

0                  .

0                  .

.                   .

0               1200

it did not create agegroup1 and agegroup2 in months.....

Posts: 1,837

## Re: Cutting and keeping characters from variables/ transforming data to months

[ Edited ]

Here is a fixed tested code.

``````data have;
infile datalines truncover;
input agegroup \$30.;
datalines;
up to 59 days
2 months to 17 years
4 years to 10 years
up to 12 hours
7 years to 8 years
10 years and older
all ages
; run;

data temp;
set have;
agegroup = compbl(agegroup);

ix = index(agegroup,'to');
if ix > 0 then do;
agegroup1 = substr(agegroup,1,ix-1);
agegroup2 = substr(agegroup,ix+3);
end; else do;
ix = index(agegroup,'and');
if ix > 0 then do;
agegroup1 = substr(agegroup,1,ix-1);
agegroup2 = substr(agegroup,ix+4);
end; else do;
agegroup1 = scan(agegroup,1);
agegroup2 = scan(agegroup,2);
end;
end;
*drop ix;
run;

data want;
set temp;
length from to 8;
keep from upto;
format from upto best.2;

if  strip(agegroup1) in ('up', 'all')  then from=0; else
if index(agegroup1,'month') > 0 then from = input(compress(agegroup1,,'A'),best.); else
if index(agegroup1,'year') > 0 then from =  input(compress(agegroup1,,'A'), best.)*12;

if index(agegroup2,'days') > 0 then
upto = input(compress(agegroup2,,'A'), best.)/30;
else if index(agegroup2,'year') > 0 then upto= input(compress(agegroup2,,'A'), best.)*12;
else if index(agegroup2,'hour') > 0 then upto = input(compress(agegroup2,,'A'), best.)/(30*24);
else if strip(agegroup2) in ('older', 'ages') then upto=1200;      from = round(from, 0.01);      upto = round(upto, 0.01);
run;

``````

Discussion stats
• 5 replies
• 276 views
• 1 like
• 2 in conversation