My requirement is to read the file names from dataset and create a master dataset by appending all the files. I 've tried this via follow code, but could not succeed.
I would request someone to guide me to complete this task.
Reading the filenames as below.
data dirlist;
infile cards truncover;
input file_names $300.;
cards;
'/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-30_tmptcmlva2_19142.log'
'/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-29_tmptcmlva2_19142.log'
'/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-28_tmptcmlva2_19142.log'
;
run;
Now I need to read and append all the files to create one master dataset. So I wrote a code as below. However, I could not succeeed.
It seems there was some error in my either infile statement or do loop which I'm unable to figure out.
DATA output_data_set;
set dirlist;
/* read the file references in variable called file_names */
INFILE IN dsd truncover FILEVAR = file_names END = end_of_file LRECL=32000;
DO WHILE (end_of_file = 0);
input var : $ 3000.;
/*filename=file_names;*/
var1 = _infile_;
if var1 = :'201';
Date_TimeStamp= scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
drop var var1;
OUTPUT;
END;
RUN;
There are records in my input files,but my code could not read it for some reasons. Please guide me.
Log:
4 data dirlist;
5 infile cards truncover;
6 input file_names $300.;
7 cards;
NOTE: The data set WORK.DIRLIST has 3 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
7 !
11 ;
12 run;
2 The SAS System 08:36 Friday, September 11, 2015
13
14 DATA output_data_set;
15 set dirlist;
16 /* read the file references in variable called file_names */
17 INFILE IN dsd truncover FILEVAR = file_names END = end_of_file LRECL=32000;
18 DO WHILE (end_of_file = 0);
19 input var : $ 3000.;
20 /*filename=file_names;*/
21 var1 = _infile_;
22 if var1 = :'201';
23 Date_TimeStamp= scan(var1,1," ");
24 Status = scan(var1,2," ");
25 Processid = scan(var1,3," ");
26 userid = scan(var1,4," ");
27 Details = scan(var1,-1,'-');
28 drop var var1;
29 OUTPUT;
30 END;
31 RUN;
NOTE: The variable file_names exists on an input data set, but was also specified in an I/O statement option. The variable will
not be included on any output data set.
NOTE: The infile IN is:
Filename=/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-30_tmptcmlva2_19142.log,
File List=('/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-30_tmptcmlva2_19142.log'
'/apps/tir/test/loganalysis/
'),
Owner Name=sassrv,Group Name=sas,
Access Permission=rwxrwxr-x,
Last Modified=Mon Aug 31 00:01:56 2015,
File Size (bytes)=720559
NOTE: The infile IN is:
Filename=/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-29_tmptcmlva2_19142.log,
File List=('/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-29_tmptcmlva2_19142.log'
'/apps/tir/test/loganalysis/
'),
Owner Name=sassrv,Group Name=sas,
Access Permission=rwxrwxr-x,
Last Modified=Sun Aug 30 00:01:46 2015,
File Size (bytes)=720559
NOTE: The infile IN is:
Filename=/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-28_tmptcmlva2_19142.log,
File List=('/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-28_tmptcmlva2_19142.log'
'/apps/tir/test/loganalysis/
'),
Owner Name=sassrv,Group Name=sas,
Access Permission=rwxrwxr-x,
Last Modified=Sat Aug 29 00:01:36 2015,
File Size (bytes)=2815408
NOTE: 1 record was read from the infile IN.
The minimum record length was 572.
The maximum record length was 572.
NOTE: 1 record was read from the infile IN.
The minimum record length was 572.
The maximum record length was 572.
3 The SAS System 08:36 Friday, September 11, 2015
NOTE: 1 record was read from the infile IN.
The minimum record length was 572.
The maximum record length was 572.
NOTE: There were 3 observations read from the data set WORK.DIRLIST.
NOTE: The data set WORK.OUTPUT_DATA_SET has 0 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.00 seconds
Could you clarify your request. Do you want to create a dataset with the contents of the three files, if so why not use a wildcard approach:
data want;
infile "/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/*.log";
...
run;
If you have to be specific then maybe:
data _null_;
set dirlist;
call execute('data inter; infile "'||strip(file_names)||'"; ....otherbits of code...; run;');
if _n_=1 then call execute('data base; set inter; run;');
else call execute('proc datasets library=work nolist; base=your_base_file data=inter force; run;');
run;
Hi,
Having some difficulties using this new forum. For your question, is there a reason why you do not use the wildcard:
data xyz;
infile "/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/*.log";
...
run;
Or maybe if you need to be specific:
data _null_;
set dir_list;
call execute('data inter; infile "'||strip(file_names)||'"; ...other code...; run;');
if _n_=1 then call execute('data base; set inter; run;');
else call execute('proc datasets library=work nolist; base=base data=inter force; run;');
run;
Does your code work for one file? What is your program for a single file that works?
Change
DO WHILE (end_of_file = 0);
to
DO WHILE(NOT end_of_file);
This might be the culprit causing no output:
if var1 = :'201';
Yes that is it. A subsetting IF is breaking out of the DO WHILE(NOT EOF) loop and sending the program back to read another file and on and on.
After applying your advise I slightly tweaked my code with macro, but again I end up with other error.
In this case, I've the file names in output_file.txt and I reading it to create a dataset. Please be informed I need to create a dataset only if file name (part of file name) matches. My management want me to develop with this code via macro and hence I used macro in my code.
output_file.txt has
fname /*header*/
SASApp_STPServer_2015-09-15_tmptcmsaslva2_18208.log
SASApp_STPServer_2015-09-15_tmptcmsaslva2_19142.log
libname log '/usr/sas/tir/test/loganalysis/';
proc import datafile='/usr/sas/tir/test/loganalysis/output_file.txt'
out=log.output_file
dbms=dlm
replace;
delimiter='09'x;
run;
Options symbolgen mlogic mprint mfile;
%macro log_analysis;
%let filedate=%sysfunc(putn("&sysdate9"d-2,yymmdd10.));
%put &filedate;
data log.log_analysis;
length fname filename $200;
infile "/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_&filedate._tmptcmsaslva2_*.log" dsd truncover FILEVAR = file_names END = end_of_file LRECL=32000;
DO WHILE (not end_of_file);
input var : $ 3000.;
/*filename=fname;*/
var1 = _infile_;
if var1 = :'201';
Date_TimeStamp= scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
drop var var1;
output;
end;
run;
mend log_analysis;
data _null_;
set log.output_file;
if fname =: 'SASApp_STPServer' then call execute ('%log_analysis;');
else put 'no log files';
run;
Piece of log
1 + data log.log_analysis; length fname filename $200; infile
"/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-09-15_tmptcmsaslva2_*.log" dsd truncover FILEVAR =
file_names END = end_of_file LRECL=32000; DO WHILE (not
ERROR: A Physical file reference (i.e. "PHYSICAL FILE REFERENCE" ) or an aggregate file storage reference (i.e. AGGREGATE(MEMBER) )
reference cannot be used with the FILEVAR= option.
I request someone to help me to come past this error.
In your data step within the macro, you specify
filevar = file_names
This means that file_names must contain the physical name of the infile to be read. Since you do not assign a value to file_names anywhere in the data step, this fails.
I replaced my infile statement as follows.
infile dummy dsd truncover FILEVAR = file_names END = end_of_file LRECL=32000;
Got error as follows.
NOTE: Variable fname is uninitialized.
NOTE: Variable filename is uninitialized.
ERROR: Invalid physical name.
My question, how SAS will recongnize that the file reference 'dummy' is the file names to be read from the dataset log.output_file. Please be informed that there is only the filename in the dataset log.output_file. There is no path in the dataset or anywhere in the program.
Kindly guide me where should I mention the path in the program which makes life easier. Just I need to know whether the file output_file.txt should have the path along with file names or I should mention in infile statement which is under the macro.
I would request you to see my previous post to understand the files and other stuffs.
Thanks for the inputs so far.
When you use the filevar=any_variable_name construct in a INFILE statement, you MUST provide a valid physical filename in the variable any_variable_name before you can read from the infile.
If you want to read data from a certain set of similarly structured external files in one data step, you best create a SAS dataset with the filenames and do something like that
data have;
* code that reads filenames into the variable file_names;
run;
data want;
infile dummy filevar=file_names end=end_of_file /* other options */;
set have;
do while (not end_of_file);
input /* list of variables */;
* further processing;
output;
end;
run;
Still I had trouble reading the files. I did tried as you said, but I got error like 'ERROR: Invalid physical name'
Output_file.txt has records like, (fname is a header)
fname
/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-09-15_tmptcmsaslva2_18208.log
/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-09-15_tmptcmsaslva2_19142.log
'
My code is,
libname log '/usr/sas/tir/test/loganalysis/';
proc import datafile='/usr/sas/tir/test/loganalysis/output_file.txt'
out=log.output_file
dbms=dlm
replace;
delimiter='09'x;
run;
Options symbolgen mlogic mprint mfile;
%macro log_analysis;
%let filedate=%sysfunc(putn("&sysdate9"d-3,yymmdd10.));
%put &filedate;
data log.log_analysis;
length fname $200;
infile dummy dsd truncover FILEVAR = fname END = end_of_file LRECL=32000;
DO WHILE (not end_of_file);
input var : $ 3000.;
/*filename=fname;*/
var1 = _infile_;
if var1 = :'201';
Date_TimeStamp= scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
drop var var1;
output;
end;
run;
%mend log_analysis;
data _null_;
set log.output_file;
if fname =: '/usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer' then call execute ('%log_analysis;');
else put 'no log files';
run;
Babloo, take a good look at this part of you code:
data log.log_analysis;
length fname $200;
infile dummy dsd truncover FILEVAR = fname END = end_of_file LRECL=32000;
DO WHILE (not end_of_file);
input var : $ 3000.;
/*filename=fname;*/
var1 = _infile_;
if var1 = :'201';
Date_TimeStamp= scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
drop var var1;
output;
end;
run;
and then tell me: WHERE DO YOU SET fname???
Answer: nowhere. That's why it does not work.
The fact that fname is present in the data step that creates code through call execute does in no way imply that the variable somehow magically appears in the generated code (which is executed AFTER the data step doing the call execute has ended). You have to provide for some mechanism to get the filenames into your main step.
And I feel you are thinking way too complicated here.
My immediate solution would look like this:
- create a data step to read one single log file into an intermediate file
- wrap that in a macro definition that accepts the input filename as a parameter, and has a second step appending the data to a base file
- call that macro in a data step that reads the input filenames; every iteration will add another log output to your base file-
You are making it much too complicated. As someone said above the main error is that you cannot use a subsetting IF statement because that will break out of the DO WHILE loop and terminate the processing of the current file. You need to use IF/THEN/DO/END instead to conditionally execute the OUTPUT statement.
You probably should add some LENGTH or ATTRIB statements to define the variables that you are scanning from the lines in the log file. As the code is currently written they will all be character length 3000 since they are derived from string operations on VAR1 which is defaulting to the same because of the format used on the INPUT statement.
%let path=/usr/sas/tir/test/loganalysis ;
libname log "&path/";
data log.output_file ;
infile "&path/output_file.txt" firstobs=2 truncover ;
input fname $200. ;
run;
data log.log_analysis;
set log.output_file ;
filename=fname;
infile dummy dsd truncover filevar = fname end = end_of_file lrecl=32000;
do while (not end_of_file);
input var1 $3000.;
if var1 = :'201' then do;
Date_TimeStamp= scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
output;
end;
end;
drop var1;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.