DATA Step, Macro, Functions and more

Catching wrong type of data while reading .csv file

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

Catching wrong type of data while reading .csv file

Hello,

 

The following code 

 

	%let csvName = &path.\&site.&dateStr.appxReviewTimes.csv ;

	data &site&dateStr.AppxReviewTimes;
    	infile "&csvName" obs=max dsd; 
		length 
		DeviceID $8
		dateStr $10
		startTimeStr endTimeStr $8
		NWS_code $3;
    retain site;
    if _n_ = 1 then site = "&site";
	input dateStr $ NWS_code $ startTimestr $ endTimeStr $ DeviceID $;

works well to read correct lines such as the first two shown below. Unfortunately, I sometimes get lines such as the third one thrown in. I'd like to ignore the third one. Also, regarding the fourth line, when I see the DeviceID = 'PWD53_02', I'd like to ignore that, too. Any suggestions for catching this error? Thanks very much, Bruce

12/01/2015, R-,23:26:20,23:27:51,PWD53_01
12/01/2015,  C,23:28:05,23:59:50,PWD53_01
WMO FOR IP- AT TIME 2015-12-01 23:25:51 IS 71
12/01/2015, C,00:00:05,00:17:50,PWD53_02

Accepted Solutions
Solution
‎01-06-2016 01:56 PM
Respected Advisor
Posts: 4,654

Re: Catching wrong type of data while reading .csv file

I would suggest analysing the input line before it is parsed as a CSV string:

 

data &site&dateStr.AppxReviewTimes;
infile "&csvName" dsd; 
length 
	DeviceID $8
	dateStr $10
	startTimeStr endTimeStr $8
	NWS_code $3;
retain site;
if _n_ = 1 then site = "&site";
input @;
if notdigit(substr(_infile_, 1, 2)) ne 0 then delete;
input dateStr NWS_code startTimestr endTimeStr DeviceID;
if deviceID="PWD53_02" then delete;
PG

View solution in original post


All Replies
Super User
Posts: 10,521

Re: Catching wrong type of data while reading .csv file

If possible I recommend going to the source and seeing if the layout can be corrected. Knowing that isn't possible then you end up with any of a number of approaches.

One depends on what you want. If you do not want the error lines in the output data set something like:

 

if input(datestr,mmddyy10.) = . then delete; would remove the lines where the first variable isn't valid date.

For any specific values you don't want in your data set then similar:

 

if device= "PWD53_02" then delete;

If there are multiple values for a variable you don't want use IN operator to use a list:

If device in ("PWD53_02" "PWD53_04" "PWD54_02") then delete;

Contributor
Posts: 37

Re: Catching wrong type of data while reading .csv file

Thank you! I'll give this a try and "accept as solution" if/when it works. Bruce
Solution
‎01-06-2016 01:56 PM
Respected Advisor
Posts: 4,654

Re: Catching wrong type of data while reading .csv file

I would suggest analysing the input line before it is parsed as a CSV string:

 

data &site&dateStr.AppxReviewTimes;
infile "&csvName" dsd; 
length 
	DeviceID $8
	dateStr $10
	startTimeStr endTimeStr $8
	NWS_code $3;
retain site;
if _n_ = 1 then site = "&site";
input @;
if notdigit(substr(_infile_, 1, 2)) ne 0 then delete;
input dateStr NWS_code startTimestr endTimeStr DeviceID;
if deviceID="PWD53_02" then delete;
PG
Contributor
Posts: 37

Re: Catching wrong type of data while reading .csv file

Thanks, PG. I will try this. Could you please help me out with a couple questions? In 

input @;
if notdigit(substr(_infile_, 1, 2)) ne 0 then delete;
input dateStr NWS_code startTimestr endTimeStr DeviceID;
  1. does the first statement read the entire line of the .csv file (I assume it does)
  2. _infile_ must be what was just read by the previous statement, correct? Where is this documented? I have not seen this before.

thanks very much, Bruce

Respected Advisor
Posts: 4,654

Re: Catching wrong type of data while reading .csv file

1) yes.

2) Google site:support.sas.com "_INFILE_"

PG
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 260 views
  • 1 like
  • 3 in conversation