truncated data values

Frequent Contributor
Posts: 122

I imported some text files into SAS and noticed that some values were automatically truncated. How can I avoid this? I want to keep the full text. Also, when I tried to combine two datasets using

data a; set b c; run;

there is a warning that a may have truncated values. Why would the new dataset a have truncated values when b and c do not?

Thanks.

Super User
Posts: 6,502

Re: truncated data values

Imported from what?  If you used PROC IMPORT to read from an Excel file or CSV file then SAS had to guess how to define your variables.

You will have much better success if you save your data as a text file (CSV files are good) and write your own data step to read it, so that you can control the type and length of the variables you create.
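As a rough sketch of what such a data step can look like: the file path, the variable names (id, name, address), and the lengths below are all made up for illustration, so replace them with your own. It assumes a comma-delimited file with one header row.

```sas
/* Hypothetical path, variable names and lengths: adjust to your file. */
data b;
  /* DSD handles quoted values and consecutive commas; FIRSTOBS=2 skips
     the header row; TRUNCOVER avoids reading past a short line. */
  infile 'C:\data\b.csv' dsd dlm=',' firstobs=2 truncover lrecl=32767;

  /* Declare type and length before INPUT so SAS does not have to guess. */
  length id 8 name $50 address $100;

  input id name address;
run;
```

Because the LENGTH statement comes before INPUT, each character variable is created with the length you chose rather than a guessed or default one.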

The message about truncation appears because, when you concatenate datasets using that SET B C syntax, the length of each variable is defined by its first occurrence.  So if variable X is defined as character with length 10 in dataset B, but as character with length 20 in dataset C, the values from C could be truncated.  That is, any value longer than 10 characters will be cut off at 10.  Again, if you define the datasets yourself instead of letting PROC IMPORT guess for you, then you will not have this problem.
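If you cannot re-read the source files, one common workaround is to declare the longer length yourself before the SET statement. A minimal sketch, using the X example above (the length of 20 is just the value from that example):

```sas
data a;
  /* LENGTH must come before SET so that this definition is the
     first occurrence of X and therefore decides its length. */
  length x $20;
  set b c;
run;
```

Note that X will move to the first position in the variable order of A. Listing the dataset with the longer definition first (set c b;) would also avoid the truncation, but only if the same dataset has the longest definition for every shared variable.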

Frequent Contributor
Posts: 122

Re: truncated data values

Thanks a lot! I can write my own data step, since I get rough data step code when I run PROC IMPORT. It's just a pain because I have a lot of variables (about twenty), and I don't know the exact length of the longest value for each one. For example, the address might be forty characters for some records. Can I just pick an arbitrary large number, such as 200, as the length in the data step?

Respected Advisor
Posts: 3,899

Re: truncated data values

Yes, you could use "random" lengths, but I would recommend a length that you're sure is sufficient but no longer than necessary. The reason is I/O: performance decreases as the variable gets longer. In addition, depending on how you write your reports, a variable sometimes simply takes up space "on paper" according to its defined length.
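One way to pick a sensible length is to read the data once with a generous length and then check how long the values actually are. A small sketch, assuming a dataset named b with a character variable named address (both names are placeholders here):

```sas
/* Report the longest value actually stored in ADDRESS. */
proc sql;
  select max(length(address)) as max_address_len
  from b;
quit;
```

You can then re-read the file with a length a little above the reported maximum, to leave room for future data.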
