Dear SAS-users,
I am very new to SAS and apologise for rather a basic question.
I wonder what is the difference between applying a new format using FORMAT and creating a new variable with the new format using PUT?
Below is the example of custom-formatting ParcCode variable. I tried both - using the FORMAT and create new variable Region using PUT. In the output table the both ParkCode and Region columns look identical, but the format of the Region is blank.
Is there any crucial difference between the two approaches? I understand that I'll need PUT to transform numeric to character, but in this case ParkCode is already a character, so I apparently can just use FORMAT then?
data np_lookup;
retain FmtName "$RegLbl";
set pg2.np_codeLookup (rename=(ParkCode = Start));
if Region ne " " then Label = Region;
else Label = "Unknown";
keep start label fmtname;
run;
proc format cntlin=np_lookup;
run;
data np_endanger;
set pg2.np_species;
where Conservation_Status='Endangered';
Region = put(parkcode, $reglbl.);
format parkcode $reglbl. ;
run;
I appreciate some tips and links to resources where I can find an answer!
Marina
Hi @PaigeMiller and @Jagadishkatam,
Thank you for your answers! You explained perfectly and now I understand the critical difference between the two approached.
Again, many thanks!
Marina
If we apply the format with put function (1) then the region will have the values as per the format. However if you apply the format via format statement (2) then values will be assign superficially only for the display but internally it will still have the original values in region as like parkcode.
if you use put function with format then there is no need for format statement.
1) Region = put(parkcode, $reglbl.);
2) format parkcode $reglbl. ;
In addition to what was said by @Jagadishkatam
If the variable PARKCODE is NUMERIC, then the PUT statement changes it to character, and it contains the text as defined by the format. This could make a difference in some analyses.
Basically, using a FORMAT statement changes the appearance of the variable as us humans see it; it does not change the underlying value of the variable as SAS sees it. Using the PUT statement actually changes the values of a variable.
Hi @PaigeMiller and @Jagadishkatam,
Thank you for your answers! You explained perfectly and now I understand the critical difference between the two approached.
Again, many thanks!
Marina
Hello @marina_esp and welcome to the SAS Support Communities!
Formats are a powerful and very useful feature of SAS. As Jag and Paige have pointed out, a format doesn't change the internal value of a variable even if it is associated with the variable in a dataset. This has many advantages, e.g.:
/* Create a format */
proc format;
value $code
'N1'='United States'
'N2'='Canada'
'S1'='Brazil'
'S2'='Argentina'
/* ... */
;
run;
/* Create test data */
data have;
input id c :$2. x;
format c $code.;
cards;
1 N2 50
2 S1 20
3 N1 100
4 S2 70
5 N1 60
;
/* Select data based on the internal values */
data south;
set have;
if c=:'S';
run;
/* Select data based on the formatted values */
data cdn;
set have(where=(put(c,$code.)='Canada'));
run;
/* Summarize data based on formatted values */
proc means data=have sum;
class c;
var x;
run;
/* Summarize data based on formatted values
using a different format "on the fly" */
proc means data=have sum;
class c;
format c $1.;
var x;
run;
/* Frequency table sorted by internal values */
proc freq data=have;
tables c;
run;
/* Frequency table sorted alphabetically by formatted values */
proc freq data=have order=formatted;
tables c;
run;
Hello Reinhard,
Thank you for explanations and examples - I tried them all. I see how FORMAT can be more useful than PUT.
Can I ask you a question about one of the examples you gave?
data cdn;
set have(where=(put(c,$code.)='Canada'));
run;
Here, I am not sure why do you have to use PUT when we already specified the FORMAT of column "c" when we were creating dataset "have"? Is that because FORMAT did not change the native format of values in "c" , but only their apprearace in the reports, right?
Thank you for answering!
M
@marina_esp wrote:
data cdn; set have(where=(put(c,$code.)='Canada')); run;
Here, I am not sure why do you have to use PUT when we already specified the FORMAT of column "c" when we were creating dataset "have"? Is that because FORMAT did not change the native format of values in "c" , but only their apprearace in the reports, right?
You raise a very good point. It is indeed somewhat counterintuitive that the format name must be specified again in this WHERE condition although it is stored in dataset HAVE and SAS should be able to retrieve it from there. Of course, we need to specify somehow that we want to refer to the formatted value of variable C because, as you've noticed correctly, the format did not change the "internal" value of the variable, so that the expression C='Canada' would be false. (Note, however, how tempting it is to use this type of criterion if you saw the formatted values of C, e.g., in PROC PRINT output, without being aware that format $code. is associated with that variable!)
There are ways to refer to the formatted value without specifying the format: by using SAS functions of category "Variable Information," e.g., the VVALUE function:
data cdn;
set have;
if vvalue(c)='Canada';
run;
This is possibly a better example than the previous one. Unfortunately, these "V..." functions cannot be used in WHERE conditions.* That's why I switched to IF.
*The error message in this case is really funny:
76 data cdn; 77 set have; 78 where vvalue(c)='Canada'; ERROR: Function VVALUE is supported by the DATA STEP only. 79 run; NOTE: The SAS System stopped processing this step because of errors.
As if this weren't a DATA STEP! 🙂
Hey @FreelanceReinh,
Ok, I see. I have even tried to use c ="Canada " and got an empty data frame as an output.
Thank you for mentioning the VVALUE function, I've never heard about that, but will check it out.
Today is the first time I posted a question on the SAS community, and I learned so much.
Thanks to you and the other guys who replied me earlier today.
SAS online communities are great!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.