That will not work with values that have used single quotes to quote commas.
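Here is another trick to count how many fields are on the line, using the LENGTH= and COLUMN= options of the INFILE statement. The loop reads one field at a time, holding the line with a trailing @, until the column pointer moves past the end of the line.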
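data test;
  infile tmpfile1 dsd length=len column=col;
  length dummy $1 ;
  * n1: read one field at a time until the pointer passes the end of the line ;
  do n1=1 by 1 until(col>len);
    input dummy @;
  end;
  * n2: count comma-delimited words after stripping single quotes ;
  n2=countw(compress(_infile_,"'"),',','mq');
  put _infile_ / (n:) (=) / ;
run;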
So your test (n2 here) will see the comma in the company name as starting another field, as the second record in the log shows.
NOTE: The infile TMPFILE1 is:
Filename=.....
1,HANNAH,QU,HANNAH'S BAKERY INC.,46,F
n1=6 n2=6
2,HANNAH,QU,'Acme, Inc.',46,F
n1=6 n2=7
NOTE: 2 records were read from the infile TMPFILE1.
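To reproduce the log above, you could build a two-record test file first. This is only a minimal sketch: it assumes TMPFILE1 is a temporary fileref (the actual Filename= is elided in the log), and the two lines are copied from the log output.

```sas
* Assumption: TMPFILE1 points at a temporary file ;
filename tmpfile1 temp;

data _null_;
  file tmpfile1;
  * The two sample records shown in the log ;
  put "1,HANNAH,QU,HANNAH'S BAKERY INC.,46,F";
  put "2,HANNAH,QU,'Acme, Inc.',46,F";
run;
```

Run this before the DATA TEST step so the INFILE statement has something to read.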
If you add that loop to your program, remember to add an @1 to move the pointer back to the beginning of the line before reading the actual fields. So perhaps you want something like this, which stores the records with the proper number of fields in one dataset and those with too few or too many in another. You could also write the total counts to another table.
* Use macro variable to change the expected number of fields (the sample lines above have 6) ;
%let nfields=7 ;
data good (drop=ngood nbad nfields line)
     bad (keep=nrec nfields line)
     counts (keep=nrec ngood nbad)
;
  * Once the last record has been processed, write the totals ;
  if eof then output counts;
  infile 'myfile.csv' dsd length=len column=col end=eof;
  length ngood nbad nfields 8 line $200 ;
  * Count fields, stopping early once there is one too many ;
  do nfields=1 to &nfields+1 until(col>len);
    input line @;
  end;
  * Count every record so NREC is the record number in both GOOD and BAD ;
  nrec+1;
  if nfields ne &nfields then do;
    nbad+1;
    line=_infile_;
    output bad;
  end;
  else do ;
    ngood+1;
    length nrec nmiss nfields REC_NUM 8 FIRST_NM $20 LAST_NM $20 COMPANY_NM $20 AGE 8 GENDER $1 ;
    * @1 moves the pointer back to column 1 before reading the real variables ;
    input @1 REC_NUM -- GENDER ;
    output good;
  end;
run;