Hello Everyone,
I want to use an INFORMAT in order to convert an alphanumeric variable that holds the age of a person. However, this turns out to be quite cumbersome. Reason is, that all "ages" greater than 99 years are expressed as "A0", "A1", etc. So for example, the age of one-hundred-and-forty-four (144) would be expressed as "A44".
How can I built a valid informat which takes the above circumstances into account? I tried using the tricks outlined here:
https://support.sas.com/resources/papers/proceedings12/245-2012.pdf
proc fcmp outlib=work.functions.smd;
function charagetonum(charage) ;
return( 100 + input(scan(charage,1,"A"),5.));
endsub;
run;
options cmplib=(work.functions);
proc format;
invalue bla (default=5) other=[charagetonum()];
invalue isnum (default=5)
'/[0-9]/' (regexpe) = _same_
'/A+[0-9]/' (regexpe) = [bla. ]
other=_error_
;
run;
data _null_;
input alter:isnum. @@;
put alter= ;
datalines;
00 25 26 A6 89 A9 8163 23 83 34 A3 78 35 62
25 26 A6 A6 A79 103 99 A0 A1 A3 A2 A9 A10 " "
;
run;
So far, I have not been able to get it going!
Does anybody have any ideas?
As I mentioned above: I want to "solve" it using a informat.
Hi,
it was definitely a good idea to treat the range as a regex and you do not need a fcmp
Since it's for age, I suspect you would like to produce an error if it's >120 or >A20
Then this code is for you:
proc format;
invalue isage (default=5) '/^(\d\d?)|(11\d)|(120)$/' (regexp) = _same_ other=[A10x.]; *Match ages btw. 0-120;
invalue A10x(default=5) 's/^A(\d)$/10$1/' (regexpe) = _same_ other=[A1x.]; *Match A1-9 => ages btw. 101-109;
invalue A1x(default=5) 's/^(A([0-1]\d))$/1$2/' (regexpe) = _same_ other=[A2x.]; *Match A0[0-9],Match A1[0-9] => ages btw. 100-119;
invalue A2x(default=5) 's/^(A(20))$/1$2/' (regexpe) = _same_ other=_error_; *Match A20 => ages btw. 100-120;
run;
data test;
input x:best. y:isage.;
put x= y=;
datalines;
0 00
1 100
2 145
3 132
4 102
5 A5
6 A00
7 A19
8 A20
run;
y=121 would produce an error, same as y=A21
- Cheers -
Hello @FK1,
Here's a numeric informat which leaves numeric values unchanged (by reading them with the 32. informat) and which reads A0, ..., A99 as 100, ..., 199. If needed, A00, ..., A09 (as alternative notations for 100, ..., 109) could be added to the definition.
data ageinf;
retain fmtname 'ageinf' type 'I';
length start $3;
do _n_=0 to 99;
start=cat('A',_n_);
label=put(100+_n_,3.);
output;
end;
hlo='OF';
label='32.';
output;
run;
proc format cntlin=ageinf;
run;
Edit: If you want to see the START and END values sorted numerically in FMTLIB output, add
hlo 'S '
(note the trailing blank) to the RETAIN statement.
Edit 2: Another useful addition to the HLO variable might be 'J' so that Axx values with leading blanks would be interpreted as if they were left-justified.
hlo 'SJ'
I'm not near a SAS machine, so I can not provide a testet code answer. But, unless this is more of a Proc Format Regex training thing, I think the logic of testing whether an input value has 'A' in it is much easier to create directly in the Proc FCMP.
This would make your Proc Format Statement simpler and the code easier to test and debug 🙂
Hi,
it was definitely a good idea to treat the range as a regex and you do not need a fcmp
Since it's for age, I suspect you would like to produce an error if it's >120 or >A20
Then this code is for you:
proc format;
invalue isage (default=5) '/^(\d\d?)|(11\d)|(120)$/' (regexp) = _same_ other=[A10x.]; *Match ages btw. 0-120;
invalue A10x(default=5) 's/^A(\d)$/10$1/' (regexpe) = _same_ other=[A1x.]; *Match A1-9 => ages btw. 101-109;
invalue A1x(default=5) 's/^(A([0-1]\d))$/1$2/' (regexpe) = _same_ other=[A2x.]; *Match A0[0-9],Match A1[0-9] => ages btw. 100-119;
invalue A2x(default=5) 's/^(A(20))$/1$2/' (regexpe) = _same_ other=_error_; *Match A20 => ages btw. 100-120;
run;
data test;
input x:best. y:isage.;
put x= y=;
datalines;
0 00
1 100
2 145
3 132
4 102
5 A5
6 A00
7 A19
8 A20
run;
y=121 would produce an error, same as y=A21
- Cheers -
You seem to be working way too hard.
proc fcmp outlib=work.functions.smd;
function charagetonum(charage $) ;
if charage=:'A' then value=100+input(substr(charage,2),32.);
else value=input(charage,32.);
return(value);
endsub;
run;
options cmplib=(work.functions);
proc format;
invalue charagetonum(default=5) other=[charagetonum()];
run;
data test;
input age :charagetonum. @@;
datalines;
00 25 26 A6 89 A9 8163 23 83 34 A3 78 35 62
25 26 A6 A6 A79 103 99 A0 A1 A3 A2 A9 A10 .
;
proc print;
run;
Thank you @Tom @Oligolas @PeterClemmensen @FreelanceReinh for your valuable comments!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.