Hi , I'm new to SAS DI.
I've historical CSV files which I need to load into dataset. I currently load 1 by 1 which is painful and not pratical.
I would like to use loop to load one time.
sample file name is optionlog_20180101.csv. I've to load until Jun2018. When come to Feb , march, april etc I need to append data feb to optionlog_201802, march data to optionlog_201803. How I can create the table automatically base on the file csv. Really appreciate if someone can help me on this.
Below is my script.
libname IVRSTG '/u11/SAS/prod/Modelling/14_IVR';
data _null_;
/*date_t = today();*/
date_t = '07Jan2018'd;
daydt = intnx('day',date_t,-1);
mth = intnx('month',date_t,-1);
yyyymm=put(intnx("month",date_t,0,"end"),YYMMN6.);
call symput ("yyyymm",yyyymm);
call symput('filedt', put(daydt, yymmddn8.));
call symput('nrptdt', put(daydt, date9.));
call symput('tbldt', put(mth, yymmn6.));
call symput('dddt', daydt);
run;
%put &yyyymm;
%put &filedt;
%put &dddt;
%put &tbldt;
%put &nrptdt;
%macro ivr;
%if %sysfunc(fileexist("/u05/SAS/prod/bigdata/avaya/optionlog_&filedt..csv"))=0 %then %do;
data _null_;
file myfile;
put "The optionlog_&filedt files was not received file from IVR.";
put " ";
put "Thanks!";
put " ";
run;
data work.optionlog_&filedt;
LENGTH
callid $50
optiondate $50
optionid $50
optioncode $50
accno $50
acctype $50
toaccno $50
toacctype $50
amount $50
chequeno $50
dbquery $50
hosttraceid $50
hostreqstatus $50
hoststatuscode $50
hoststatusdesc $250
/* indicator $50;*/
FORMAT
callid $50.
optiondate $50.
optionid $50.
optioncode $50.
accno $50.
acctype $50.
toaccno $50.
toacctype $50.
amount $50.
chequeno $50.
dbquery $50.
hosttraceid $50.
hostreqstatus $50.
hoststatuscode $50.
hoststatusdesc $250.
/* indicator $50. ;*/
format RPT_DATE date9.;
RPT_DATE = "&nrptdt"d;
stop;run;
%end;
%else %if %sysfunc(fileexist("/u05/SAS/prod/bigdata/avaya/optionlog_&filedt..csv")) %then %do;
data work.optionlog_&filedt;
infile "/u05/SAS/prod/bigdata/avaya/optionlog_&filedt..csv"
lrecl = 256
delimiter = '|'
dsd
missover
firstobs = 3;
;
attrib callid length = $50;
attrib optiondate length = $50;
attrib optionid length = $50;
attrib optioncode length = $50;
attrib accno length = $50;
attrib acctype length = $50;
attrib toaccno length = $50;
attrib toacctype length = $50;
attrib amount length = $50;
attrib chequeno length = $50;
attrib dbquery length = $50;
attrib hosttraceid length = $50;
attrib hostreqstatus length = $50;
attrib hoststatuscode length = $50;
attrib hoststatusdesc length = $250;
/* attrib indicator length = $50;*/
format RPT_DATE date9.;
input callid optiondate optionid optioncode accno acctype toaccno toacctype amount chequeno dbquery hosttraceid hostreqstatus hoststatuscode hoststatusdesc ;
RPT_DATE = "&nrptdt"d;
run;
%end;
%mend ;
%ivr;
data IVRSTG.optionlog_&yyyymm;
SET IVRSTG.optionlog_&yyyymm optionlog_&filedt;
RUN;
A few comments on your existing code.
1) Don't comment out the semi-colons at the end of your multiple line statements. This is one of the many reasons why I always place the semi-colon in a multi-line statement on a new line instead on at the end of the last line. Like it was an END statement in a DO/END block.
length
a $50
b $50
;
2) There is no need to attach $xx formats to character variables. SAS already knows how to display character variables without having any special formatting instructions attached to the variable definition.
To your problem.
1) You can use a wildcard in the INFILE statement.
infile "/u05/SAS/prod/bigdata/avaya/optionlog_201801*.csv"
2) You can use the FILENAME option on the infile statement to see what file you are currently reading. Make sure to set its length BEFORE the INFILE statement. You can then parse out the RPT_DATE value from the filename.
length filename $200;
infile .... filename=filename ... ;
input .... ;
rpt_date=input(substr(filename,length(filename)-12,8),yymmdd8.);
How to import multiple files has been discussed so many times, i stopped counting weeks ago.
You can use wildcards in the infile-statement:
infile "/u05/SAS/prod/bigdata/avaya/optionlog_*.csv"
would read all files in the given directory, starting with "optionlog_" and ending with ".csv".
Only thing to be sure is, that all files have the same structure.
When using wildcards one output-dataset is created, if you don't use multiple output-statements. Why do you need a dataset for each month? It seems to be easier to work with one big dataset and use a where-statement to get the data of a month.
A few comments on your existing code.
1) Don't comment out the semi-colons at the end of your multiple line statements. This is one of the many reasons why I always place the semi-colon in a multi-line statement on a new line instead on at the end of the last line. Like it was an END statement in a DO/END block.
length
a $50
b $50
;
2) There is no need to attach $xx formats to character variables. SAS already knows how to display character variables without having any special formatting instructions attached to the variable definition.
To your problem.
1) You can use a wildcard in the INFILE statement.
infile "/u05/SAS/prod/bigdata/avaya/optionlog_201801*.csv"
2) You can use the FILENAME option on the infile statement to see what file you are currently reading. Make sure to set its length BEFORE the INFILE statement. You can then parse out the RPT_DATE value from the filename.
length filename $200;
infile .... filename=filename ... ;
input .... ;
rpt_date=input(substr(filename,length(filename)-12,8),yymmdd8.);
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.