Hi Team,
I am looking for a SAS code for validating a csv file.
I have variables like
FORMAT STARTED_TS MDYAMPM.;
format yearmonth 8.;
format APP_ID $255.;
format SN_NUMBER 22.;
format SN_AMOUNT 20.2;
format P_ID 19.;
Want to keep all exceptional records in a separate dataset.
Thanks in advance
Kajal
I tried code but didn't work
data work.cleandata(keep= APP_ID SN_NUMBER SN_AMOUNT P_ID)
work.errors(keep=Err_Message Faulty_Record)
;
length
APP_ID $255 SN_NUMBER 8 SN_AMOUNT 8 P_ID 8 Err_Message $ 100 Faulty_Record $ 250 ;
infile "/home/sas/score.csv " filename = _filename
delimiter = ","
dsd
truncover
missover dsd
lrecl=32767
firstobs=2;
input;
APP_ID = scan(_filename,2, ',');
SN_NUMBER = input(scan(_filename,3, ','), ?? best.);
SN_AMOUNT = input(scan(_filename,5, ','), ?? best.);
P_ID = input(scan(_filename,6, ','), ?? best.);
if missing(APP_ID) then do;
Faulty_Record = _filename;
Err_Message = 'APP_ID variable cannot be missing';
output work.errors;
end;
if missing(SN_NUMBER) then do;
Faulty_Record = _filename;
Err_Message = 'SN_NUMBER variable cannot be missing';
output work.errors;
end;
if missing(P_ID) then do;
Faulty_Record = _filename;
Err_Message = 'P_ID variable cannot be missing';
output work.errors;
end;
if missing(Faulty_Record) then do;
output work.cleandata;
end;
run;
None of those SCAN() function calls are going to return anything. Your filename does not have any commas in it.
Did you want to look at the content of the line being read instead of the name of the file being read? If so then use the automatic _INFILE_ variable instead of the variable you created with the FILENAME= option of the INFILE statement.
Your code is working WAAAAY too hard. Let the INPUT statement parse the line. You seem to want to ignore the first and fourth values on the line. So just read them into a dummy character variable.
data
cleandata(keep= APP_ID SN_NUMBER SN_AMOUNT P_ID)
errors(keep=Err_Message Faulty_Record)
;
length APP_ID $255 SN_NUMBER SN_AMOUNT P_ID 8 Err_Message $100 Faulty_Record $250 ;
infile "/home/sas/score.csv"
filename = _filename
dsd
truncover
lrecl=32767
firstobs=2
;
length dummy $1 ;
input dummy APP_ID SN_NUMBER ??
dummy SN_AMOUNT ?? P_ID ??
;
Faulty_Record = _INFILE_ ;
if missing(APP_ID) then do;
Err_Message = 'APP_ID variable cannot be missing';
output work.errors;
end;
if missing(SN_NUMBER) then do;
Err_Message = 'SN_NUMBER variable cannot be missing';
output work.errors;
end;
if missing(P_ID) then do;
Err_Message = 'P_ID variable cannot be missing';
output work.errors;
end;
if missing(Err_Message) then do;
output work.cleandata;
end;
run;
Replace _filename in
APP_ID = scan(_filename,2, ',');
SN_NUMBER = input(scan(_filename,3, ','), ?? best.);
SN_AMOUNT = input(scan(_filename,5, ','), ?? best.);
P_ID = input(scan(_filename,6, ','), ?? best.);
by _infile_.
Never (as in NEVER) just say "it does not work" without giving further information.
Post the complete (only remove multiple repeats of the same message) log and (if the step ran at all) show where the result does not meet your expectation.
Does your file really have that extra blank in the name?
None of those SCAN() function calls are going to return anything. Your filename does not have any commas in it.
Did you want to look at the content of the line being read instead of the name of the file being read? If so then use the automatic _INFILE_ variable instead of the variable you created with the FILENAME= option of the INFILE statement.
Your code is working WAAAAY too hard. Let the INPUT statement parse the line. You seem to want to ignore the first and fourth values on the line. So just read them into a dummy character variable.
data
cleandata(keep= APP_ID SN_NUMBER SN_AMOUNT P_ID)
errors(keep=Err_Message Faulty_Record)
;
length APP_ID $255 SN_NUMBER SN_AMOUNT P_ID 8 Err_Message $100 Faulty_Record $250 ;
infile "/home/sas/score.csv"
filename = _filename
dsd
truncover
lrecl=32767
firstobs=2
;
length dummy $1 ;
input dummy APP_ID SN_NUMBER ??
dummy SN_AMOUNT ?? P_ID ??
;
Faulty_Record = _INFILE_ ;
if missing(APP_ID) then do;
Err_Message = 'APP_ID variable cannot be missing';
output work.errors;
end;
if missing(SN_NUMBER) then do;
Err_Message = 'SN_NUMBER variable cannot be missing';
output work.errors;
end;
if missing(P_ID) then do;
Err_Message = 'P_ID variable cannot be missing';
output work.errors;
end;
if missing(Err_Message) then do;
output work.cleandata;
end;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.