BookmarkSubscribeRSS Feed
CathyVI
Pyrite | Level 9

 So I have a data with over a million records say;

 

name  salary    wages 

Dog      100         20

Cat       120        40

Cattle    391         35

.            .             .

 

How do I check if  "Cattle " is not truncated in my data and is it also possible to check  the variable name for any truncated record/observation ?

 

Thanks.

 

3 REPLIES 3
Reeza
Super User
Eyeball test. Run a PROC FREQ and check if they're truncated. There's no way the computer can know if you meant CATTLE or CATT for the text value, and this usually means the data was not read in properly for some reason.
ballardw
Super User

@CathyVI wrote:

 So I have a data with over a million records say;

 

name  salary    wages 

Dog      100         20

Cat       120        40

Cattle    391         35

.            .             .

 

How do I check if  "Cattle " is not truncated in my data and is it also possible to check  the variable name for any truncated record/observation ?

 

Thanks.

 


I would start with Proc Contents. If the length of any of my defined lengths looks off the there is a problem.

 

One question I would also ask is why do you suspect you may have truncated data? The main reason I can think of that possibility is using Proc Import to "read" data. Since Proc Import makes guesses each time a file is read then similar files can end up with different variable properties such as length or variable type. If the data source is a text file of some sort one likely source of truncation can be removed by using a GUESSINGROWS=MAX; statement.

If your data is coming from some other file type, spreadsheet perhaps, then GUESSINGROWS is not available and only a few rows are examined to set the variable types and lengths. Make a text file such as CSV. Import that.

 

But you still have potential issues with different files. If you were given any documentation as to the expected lengths of the variables then you should use a data step to read the data and use the documentation to set variable types and lengths.

Note that Proc Import will create data step code when reading a text file. Copy from the editor, set the lengths to the documented or expected lengths. Reread.

ScottBass
Rhodochrosite | Level 12

@CathyVI wrote:

 So I have a data with over a million records say;

 

name  salary    wages 

Dog      100         20

Cat       120        40

Cattle    391         35

.            .             .

 

How do I check if  "Cattle " is not truncated in my data and is it also possible to check  the variable name for any truncated record/observation ?

 

Thanks.

 


Truncation implies the transformation of data from source to target.  So, by definition, you have untruncated data in source, and potentially truncated data in target.

 

So, get the max length of your data from source, then compare to the allocated length of your target.

 

I'm not saying this is the prettiest code but perhaps you can run with it:

 

data source;
   set sashelp.class;
run;

proc sql noprint;
   create table lengths as
   select "NAME" as name, max(length(name)) as src_length
   from sashelp.class;
quit;

data target;
   length name $3;  * causes truncation ;
   set source;
run;

proc contents data=target out=contents noprint;
run;

proc sql;
   create table baddata as
   select
      *
   from
      contents a
   inner join
      lengths b
   on
      lowcase(a.name)=lowcase(b.name)
   having
      a.length < b.src_length
   ;
quit;

   

 

 


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 2440 views
  • 1 like
  • 4 in conversation