Hello team,
I inherited some code, which I have reduced to this example:
data mydata;
length HealthPlanID $14;
format HealthPlanID $char14.;
informat HealthPlanID $char14.;
input HealthPlanID;
datalines;
1234
456
;
run;
What is the role of the informat here? Why is it placed here? I know a format tells SAS how to display the values and an informat tells SAS how to read the data.
Couldn't the length and format be included in the INPUT statement?
blueblue
There is no obvious reason for the $CHAR14. informat here. It preserves leading blanks when reading character values, and your data doesn't have any. Yes, informats can be placed in the INPUT statement.
Nor is there any obvious reason for using the $CHAR14. format here.
Now it could be that with different data, these are necessary.
A length can only be set through the INPUT statement when it equals the width of the informat used there (as it does here); if the two differ you need a separate LENGTH statement. That isn't needed with this data either.
The LENGTH statement defines how much storage is required for the variable in the dataset. SAS has two types of variables: fixed-length character strings and 64-bit floating-point numbers. For character variables the length is the number of bytes used to represent the characters. In single-byte encodings that matches the number of characters the variable can hold. In encodings like UTF-8 that use more than one byte to represent a single character, the number of characters that will fit depends on the characters being saved and so might be less than the length.
The INFORMAT defines the default way to convert text into values to store in the variable (for example when reading from the input). The width of an informat specification defines the default number of bytes to read.
The FORMAT defines the default way to convert the values stored in the variable into text (for example for printing on the output). The width on the format specification defines the default number of bytes to write.
If you do not explicitly set the length for your variables with a LENGTH statement (or LENGTH= option on ATTRIB statement) then SAS will GUESS how to define them based on where you first use them. So if the first place is in an INFORMAT statement then SAS will guess to define the length based on the width used on the informat. Or if the first place you reference the variable is in an INPUT statement it will use any informat specification used in the INPUT statement to guess how to define the variable. Similarly for a FORMAT or PUT statement.
Otherwise it will try to guess from the context of how you use the variable. If the first place is on the left of an assignment statement it will match the definition to the expression on the right side.
If SAS has no other information about the length at the point where it has to define the variable, it defaults the length to 8 bytes.
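A minimal sketch of the guessing rules described above. The dataset name and the variables a through g are made up for illustration; PROC CONTENTS lets you check which lengths SAS actually assigned.

```sas
data guess;
  informat a $char14.;   /* first reference is an INFORMAT: length taken from the width, 14 */
  format   b $5.;        /* first reference is a FORMAT: length taken from the width, 5     */
  c = 'abc';             /* first reference is an assignment: matches the expression, 3     */
  d = 42;                /* numeric, stored in 8 bytes                                      */
  length e $20;          /* explicit LENGTH, no guessing needed                             */
  input f $ g;           /* no other information: character f defaults to 8, g is numeric   */
  call missing(a, b, e); /* avoid "uninitialized" notes for the variables never assigned    */
datalines;
xyz 1
;
run;

/* List the variables in creation order with their types and lengths */
proc contents data=guess varnum;
run;
```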
Do you really need to preserve leading spaces in the values stored? And show the leading spaces when the values are printed? If not, then just let SAS use its default behavior for reading and writing character variables.
data mydata;
length HealthPlanID $14;
input healthplanID;
datalines;
1234
456
;
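If you want to see what $CHAR changes compared with the default, here is a small sketch that reads the same six columns twice. The variable names kept and stripped are hypothetical, and the data line deliberately starts with two blanks.

```sas
data blanks;
  /* Read columns 1-6 twice: once with $CHAR6. and once with the plain $6. informat */
  input @1 kept $char6. @1 stripped $6.;
  kept_len     = length(kept);      /* LENGTH ignores trailing blanks only, so leading blanks count */
  stripped_len = length(stripped);
  /* Write kept with $CHAR6. so any leading blanks stay visible in the log */
  put kept= $char6. stripped= kept_len= stripped_len=;
datalines;
  1234
;
run;
```

The $CHAR6. version should keep the two leading blanks, while the $6. version should drop them. With list input, as in the program above, leading blanks are skipped before the value is read anyway.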
Hello,
Thanks for the response. The data I posted here is not my real data.
Still one thing is not clear for me.
What is the point of using informat in the code that I posted? I always use the above code without informat statement.
You explained that:
The INFORMAT defines the default way to convert text into values to store in the variable (for example when reading from the input).
What does informat do if our variables are numeric? Doesn't informat read both characters and numbers?
Thanks for the response.
Blue
In the example you posted the INFORMAT statement is redundant as the INPUT statement will read your character data anyway. However if you are inputting non-standard data like dates or character data longer than 8 characters then the INFORMAT statement can be used to define how you want to read it instead of adding INFORMATs to the INPUT statement.
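As an illustration of the non-standard data case, here is a sketch with a date. PatientID and VisitDate are made-up names, not from the original program.

```sas
data visits;
  informat VisitDate date9.;     /* tells INPUT how to read text such as 15JAN2024 */
  format   VisitDate yymmdd10.;  /* tells SAS how to display the stored number     */
  input PatientID $ VisitDate;   /* no informat needed in the INPUT statement      */
datalines;
A001 15JAN2024
A002 03MAR2024
;
run;

proc print data=visits;
run;
```

Without the INFORMAT statement, INPUT would try to read 15JAN2024 as an ordinary number and set VisitDate to missing.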
INFORMATs convert text to values. FORMATs convert values to text. A character informat creates character values. A numeric informat creates numeric values. A character format converts character values. A numeric format converts numeric values. Character informats and formats start with a $.
In addition to their effect on the current step, using an INFORMAT or FORMAT statement in a data step will cause those informats/formats to be attached to the variable when the dataset is created. (NOTE: To remove an existing informat or format from a variable, list the variable in the statement without any specification afterwards.)
* Remove format attached to age and sex;
format age sex;
Why people use them varies. Some like to have them stored in the metadata of the dataset to help make the dataset more self-documenting. Using the INFORMAT statement in a data step does allow you to use an informat that reads non-standard text (such as the DATE, TIME or DATETIME informats) without having to list the informat specification in the INPUT statement. This can make your INPUT statement much easier to write. For example, if you have defined the variables in the same order in the dataset as they appear in the lines of text being read, you can just use a positional variable list in the INPUT statement and not have to re-type all of the variable names.
input name -- weight ;
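A sketch of how that can look in a complete step. The variable names, the dob column and the data line are invented for illustration; the point is that everything is defined up front, in the order the fields appear in the text.

```sas
data class;
  /* Define the variables first, in the same order as the fields in the data lines */
  length name $8 sex $1 dob age weight 8;
  informat dob date9.;   /* the non-standard field gets its informat here */
  format   dob date9.;
  /* The positional list covers every variable from name through weight */
  input name -- weight;
datalines;
Alfred M 03FEB2010 14 112.5
Alice F 21JUL2011 13 84
;
run;
```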
Some people might do it because they are confused and think the word "format" has something to do with defining how the variable is stored.
Some people might just like the way it makes the code look.
Note there is not really much need to have an informat attached to a variable in a dataset now. Perhaps back in the early days, when SAS/FSP was new and Excel and other GUI tools for entering data did not exist yet, it made some sense.
Hi Tom,
Thanks for your response.
I am still not clear on one thing:
A character format converts character values. A numeric format converts numeric values. Character informats and formats start with a $.
Isn't it better to say that a character format formats character values and a numeric format formats numeric values? The concept of converting is different from formatting: a format doesn't change anything in the stored value, but a conversion does. That might be wrong though. Please correct me if I am wrong.
Regards,
BlueBlue
If you are looking for other verbs try READ and WRITE.
Character informats read text as character strings. Numeric informats read text as numbers.
Character formats write character strings as text. Numeric formats write numbers as text.
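The INPUT and PUT functions show the same read/write split in a single step. This is only a sketch; the variable names are arbitrary.

```sas
data _null_;
  text = '01JAN2025';
  /* A numeric informat READS text and produces a number (a SAS date value here) */
  d = input(text, date9.);
  /* A numeric format WRITES that number back out as text */
  back = put(d, date9.);
  /* The stored value d is not changed by the format; only the text produced differs */
  other = put(d, yymmdd10.);
  put d= back= other=;
run;
```

In both PUT calls the number in d stays the same, which is why "write" may be a more comfortable verb than "convert".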
Hello team,
Regarding this post, my boss told me that when you import a file into SAS, code like this is generated automatically (like recorded macros in Excel), and the author then reused that generated code in her own code.
Regards,
blueblue
It sounds like your boss is describing PROC IMPORT which can read delimited data files into SAS. Behind the scenes PROC IMPORT creates a SAS DATA step which you can see in the SAS log and you can copy that code back into your SAS editor and then use that code instead of PROC IMPORT.
Your example program does not read a delimited file, however, so PROC IMPORT is of no use in this case.
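For a delimited file, you can try this yourself with a sketch like the one below. The fileref, the column names and the values are all made up; the first step only writes a small temporary CSV file so the example is self-contained.

```sas
/* Write a tiny CSV file to a temporary location (hypothetical content) */
filename csv temp;
data _null_;
  file csv;
  put 'HealthPlanID,Members';
  put '1234,10';
  put '456,25';
run;

/* PROC IMPORT guesses the variable types and writes a DATA step to the log */
proc import datafile=csv dbms=csv out=work.plans replace;
  getnames=yes;
run;
```

After running it, look in the log for the generated DATA step with its INFORMAT, FORMAT and INPUT statements. That generated code is the kind of thing people copy into their own programs.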
PROC EXPORT is the only other one I can think of right now.
All tasks in SAS frontends like Enterprise Guide or DI Studio create and send code, which you can see in Code and Log tabs.
Hello,
I don't know how to create tasks at the frontend except that I can drag and drop fields.
Regards,
blueblue
You don't create those tasks, you just use them.