Hi.
One of my SAS users has asked me to explain why the code below does not do what they expect - ie create a numeric variable (newvar) which is equal to 1 when the character variable myvar has a value of D01a_tot and is missing otherwise.
Now I realise there are many workarounds to this, which I have already fedback, and also that using the informat to code just one value is probably not necessary.
But I'd still be keen to know why ALL records get assigned a value of 1 when the numeric variable is created. Clearly all the records start with "D01a_Tot" and I guess that is causing the issue.
Can anyone explain why its happening and whether it's actually not "unexpected".
Thanks!
Code below (running on SAS 9.4M3)
data temp;
length myvar $20;
myvar="D01a_Tot_D02"; output;
myvar="D01a_TotP09"; output;
myvar="D01a_TotP12"; output;
myvar="D01a_TotP15"; output;
myvar="D01a_TotP16"; output;
myvar="D01a_Tot"; output;
run;
proc format;
INVALUE my_order
"D01a_Tot" = 1;
run;
data temp2;
set temp;
newvar=input(myvar,my_order.);
run;
The default length of the invalue is the length of D01a_Tot - 8, which is the length of the comparison being made on your myvar variable; everything after that is ignored.
If you change the format to
proc format;
INVALUE my_order(default=20)
"D01a_Tot" = 1
other = .; /* Stops error messages */
run;
or the data step to:
data temp2;
set temp;
newvar=input(myvar, my_order20.);
run;
then you'll get the result you require. Note that the second suggestion doesn't stop error messages.
I would recommend the former, because you can reuse the format in all your other data steps without thinking about it.
The default length of the invalue is the length of D01a_Tot - 8, which is the length of the comparison being made on your myvar variable; everything after that is ignored.
If you change the format to
proc format;
INVALUE my_order(default=20)
"D01a_Tot" = 1
other = .; /* Stops error messages */
run;
or the data step to:
data temp2;
set temp;
newvar=input(myvar, my_order20.);
run;
then you'll get the result you require. Note that the second suggestion doesn't stop error messages.
I would recommend the former, because you can reuse the format in all your other data steps without thinking about it.
Thanks.
That all makes sense.
The INPUT function sometimes behaves a bit differently than an input statement. Please see this:
proc format library=work; INVALUE my_order "D01a_Tot" = 1 other = . ; run; data temp; informat x my_order.; input x; datalines; D01a_Tot_D02 D01a_TotP09 D01a_TotP12 D01a_TotP15 D01a_TotP16 D01a_Tot run;
Note that to assign other values such as missing it is a good idea to explicitly set that in the INVALUE so there isn't any doubt of intent.
It helps to post code into a code box opened using either the {i} or "running man" icon in this forum.
Thanks for the prompt reply - the time zone here is handy for overnight responses
Will use that for reading in data using informats.
And thanks for the tip re inserting the code: will note for future.
I'm only two hours (probably) ahead of you - Wellington!
Glad to have helped.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.