hi,
I get a new CSV file from client to upload into SAS. The file contains numeric and character data. With numeric data calculating the format specifier is easy . But with character data what options exist (general rules followed by practitioners) to specify the length of the character field in the format statement.
The CSV file contains product name, description and remarks. How do I specify the length for the two columns description and remarks ? what is the generally followed steps and options given by sas to calculate the suitable lengths for these fields ?
In sql, there is a text data type that can hold very large text. But in sas we don't have something like that. I am looking for an option where sas reads all alpha numeric data from one delimiter to the next and stores it into a field.
The alternate ways are specifying the max size 32767 or providing a lrecl= value or open the data in excel and find the longest character data and use it in the format statement.
I am using sas university edition.
Is there an option in sas to read character data without specifying a length ?
Some details please:
Are these one-time files or repeated files that should use the same layout?
If repeated files using the same layout was there any document or data transfer agreement that described the files?
If you do not specify an length SAS will default to 8 characters for character values.
If you are reading repeated files then the proper approach would be to write a data step based on the data transfer agreement and specify the lengths that should have been set out in that document.
If there is no such document you should work towards getting one. That is the proper way, you specify the lengths and variable types in a data step based on knowledge of the data to be transferred.
DO NOT OPEN CSV FILES IN EXCEL AS YOU DESCRIBE. People have been know to accidently save the files while in Excel and the can change the actual contents of some fields.
Proc Import can read a one time file, or one of a repeated file to build the main part of the data step code you want. Be sure to specify the GUESSINGROWS=MAX option so that the procedure will examine the whole file before guessing the variable type and length.
The log will contain code generated by Proc Import to read the file. Copy that from the Log into a program file.
Set the lengths, modifying the INFORMAT statements generate will do, to the lengths in that file content document. Make sure that the INFORMAT and FORMATS for dates, times, datetimes and currency fields are reading the data correctly.
You would then only have to change the data step program to point to a new file to read on the infile statement and possibly change the name of the output data set.
I simply take the lengths from the documentation I receive along with the file. If you did not get such documentation, do not request it. Require it.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.