How to remove inconsistent data rows?

Hi SAS users,


I am trying to combine dozens of CSV files using wildcard code (INFILE, INFORMAT, INPUT). Although my code runs without errors, I get a warning that concerns me. Looking at the log and my CSV files, it seems that at least part of the issue is due to inconsistent data types in some rows (see the screenshot below).


Now, my question is: how can I ask SAS to delete those rows (the ones with inconsistent data types) while the program runs?


Thank you for your help,




Re: How to remove inconsistent data rows?

I am sure there is a cleaner way to do this, but I would import all of the fields as character, remove the rows you don't want, and then convert the variables to the correct types.


It might take a few steps, but this way you have more control over what you are getting.
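
A minimal sketch of that approach, assuming three hypothetical columns (a date, a number, and a text field) and a made-up path; the names a_txt, b_txt, a, b, and c are illustrative and do not come from the original files:

```sas
/* Step 1: read every field as character so no row is rejected on input. */
/* The path and all variable names are hypothetical.                      */
data raw;
   infile "/path/to/csv/*.csv" dsd truncover;
   length a_txt b_txt $32 c $20;
   input a_txt b_txt c;
run;

/* Step 2: convert, then drop the rows whose text did not convert. */
data want;
   set raw;
   /* ?? suppresses the invalid-data notes; a failed conversion gives missing */
   a = input(a_txt, ?? ddmmyy10.);
   b = input(b_txt, ?? best32.);
   /* delete rows where the text was non-blank but the converted value is missing */
   if (missing(a) and not missing(a_txt)) or
      (missing(b) and not missing(b_txt)) then delete;
   format a ddmmyy10.;
   drop a_txt b_txt;
run;
```

Because header lines from each file do not convert to dates or numbers either, step 2 would also remove any repeated header rows that the wildcard read picks up.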

Re: How to remove inconsistent data rows?

Is this an ad hoc operation or a regular, production-like operation?
If the latter, establish a file specification with your data supplier that states data types, lengths, names, valid values, etc. When the data fail to follow this spec, report back and demand a corrected file.
Re: How to remove inconsistent data rows?

If this truly affects many variables, the easiest approach might be the CMISS function. CMISS tells you how many of the listed variables have missing values, so if that number is "large" for a row, you can delete the row from the data set.


data want;
   set have;
   /* keep only rows where fewer than the critical number of variables are missing */
   if cmiss(var1, var2, var3, ..., varn) < (some critical number);
run;


If you have lots of rows like this, it may be worth going back to your raw CSV files in a text editor to see whether there are rows at the ends of files that are all commas. This is a common occurrence when converting spreadsheets to CSV: some kinds of use "touch" rows or columns so that they get included in the CSV file even though they contain no data.


Something else to look for is inserted "page breaks", such as a row between blocks of data.
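
If such all-comma rows do turn up, one possible way to skip them while reading is to inspect the raw line before parsing it. This is only a sketch: the path and the variables a, b, and c (a date, a number, and a text field) are hypothetical.

```sas
data want;
   infile "/path/to/csv/*.csv" dsd truncover;
   informat a ddmmyy10. c $20.;
   /* load the current line into _INFILE_ and hold it for the next INPUT */
   input @;
   /* remove commas and whitespace; if nothing is left, the row is empty */
   if compress(_infile_, ',', 's') = '' then delete;
   input a b c;
   format a ddmmyy10.;
run;
```

The trailing @ holds the line so the same record can be parsed by the second INPUT statement once it has passed the check.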

Re: How to remove inconsistent data rows?

Use the ?? modifier so that invalid values are set to missing without the invalid-data notes in the log.


input a : ?? ddmmyy10. b : ?? best32. c : ?? $20. ............
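
Put into a full step, this might look like the sketch below. The path is made up, the three variables simply follow the INPUT line above, and deleting on missing(a) or missing(b) is an assumption about which rows count as inconsistent.

```sas
data want;
   infile "/path/to/csv/*.csv" dsd truncover;
   /* ?? sets invalid values to missing without writing notes to the log */
   input a : ?? ddmmyy10. b : ?? best32. c : $20.;
   /* drop rows where the date or the number is missing after the read */
   if missing(a) or missing(b) then delete;
   format a ddmmyy10.;
run;
```

Note that this also drops rows where a or b was genuinely blank in the file, not only rows with the wrong data type, so the condition may need adjusting.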

