BookmarkSubscribeRSS Feed
ariyurjana
Obsidian | Level 7

hi, 

I get a new CSV file from client to upload into SAS. The file contains numeric and character data. With numeric data calculating the format specifier is easy . But with character data what options exist (general rules followed by practitioners) to specify the length of the character field in the format statement.

The CSV file contains product name, description and remarks. How do I specify the length for the two columns description and remarks ? what is the generally followed steps and options given by sas to calculate the suitable lengths for these fields ?

 

In sql, there is a text data type that can hold very large text. But in sas we don't have something like that. I am looking for an option where sas reads all alpha numeric data  from one delimiter to the next and stores it into a field. 

The alternate ways are specifying the max size  32767 or providing a lrecl= value or open the data in excel and find the longest character data and use it in the format statement.

 

I am using sas university edition.

 

Is there an option in sas to read character data without specifying a length ?

 

3 REPLIES 3
ballardw
Super User

Some details please:

Are these one-time files or repeated files that should use the same layout?

If repeated files using the same layout was there any document or data transfer agreement that described the files?

 

If you do not specify an length SAS will default to 8 characters for character values.

 

If you are reading repeated files then the proper approach would be to write a data step based on the data transfer agreement and specify the lengths that should have been set out in that document.

If there is no such document you should work towards getting one. That is the proper way, you specify the lengths and variable types in a data step based on knowledge of the data to be transferred.

 

DO NOT OPEN CSV FILES IN EXCEL AS YOU DESCRIBE. People have been know to accidently save the files while in Excel and the can change the actual contents of some fields.

 

Proc Import can read a one time file, or one of a repeated file to build the main part of the data step code you want. Be sure to specify the GUESSINGROWS=MAX option so that the procedure will examine the whole file before guessing the variable type and length.

The log will contain code generated by Proc Import to read the file. Copy that from the Log into a program file.

Set the lengths, modifying the INFORMAT statements generate will do, to the lengths in that file content document. Make sure that the INFORMAT and FORMATS for dates, times, datetimes and currency fields are reading the data correctly.

You would then only have to change the data step program to point to a new file to read on the infile statement and possibly change the name of the output data set.

Peter_C
Rhodochrosite | Level 12
When investigating data with unknown lengths, specify all as ,say, $30 and check how well filled they are. (Get frequency report on the column lengths and widen any above 90% filled and truncate any than 50% filled.) You should be able to recognise dates. If you go to extremes with very wide columns, add the dataset option Compress=char

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 825 views
  • 0 likes
  • 4 in conversation