I have a .log file like the example below.
Host: 'tmptcmsaslva2', OS: 'LIN X64', Release: '2.6.32-431.3.1.el6.x86_64', SAS Version: '9.03.01M2P08152012', Command: '/usr/sas/sas9.3/SASFoundation/9.3/sasexe/sas -noterminal -netencryptalgorithm SASProprietary -metaserver tmptcmsaslva2.timeinc.com -metaport 8561 -metarepository Foundation -objectserver -objectserverparms "protocol=bridge spawned spp=45106 cid=0 classfactory=15931E31-667F-11D5-8804-00C04F35AC8C server=OMSOBJ:SERVERCOMPONENT/A52GREI3.AV000006 cel=credentials dnsMatch=tmptcmsaslva2.timeinc.com multiuser port=8611 lb saslangrunas=client applevel=3"'
Log continued from /usr/sas/sas_config/Lev1/SASApp/StoredProcessServer/Logs/SASApp_STPServer_2015-08-19_tmptcmsaslva2_19142.log
2015-08-20T00:00:05,082 INFO [04086681] :sassrv - New out call client connection (74065) for user
2015-08-20T00:00:05,092 INFO [04086681] :datapo@saspw - New client connection (74064) accepted from server port 8611 for SAS token user Encryption level is Credentials using encryption algorithm SASPROPRIETARY. Peer IP address and port are [10.176.232.41]:39494.
2015-08-20T00:00:05,094 INFO [04086689] 74064:datapo@saspw - STP: 36518: Creating New Context
In this log file I need to read the records from line number 6 onward and produce the output below. Please note that the 'userid' field is not fixed width. Its value should be read up to the hyphen (-).
Date_With_TimeStamp | Status | Processid | userid | Details |
2015-08-20T00:00:05,082 | INFO | [04086681] | :sassrv | New out call client connection (74065) for user |
2015-08-20T00:00:05,092 | INFO | [04086681] | :datapo@saspw | New client connection (74064) accepted from server port 8611 for SAS token user Encryption level is Credentials using encryption algorithm SASPROPRIETARY. Peer IP address and port are [10.176.232.41]:39494 |
2015-08-20T00:00:05,094 | INFO | [04086689] | 74064:datapo@saspw | STP: 36518: Creating New Context |
I tried the code below (which reads one file) to accomplish this task. However, I got stuck when I tried to read multiple similar log files (from UNIX) into one dataset (a consolidation of all files). The file names look like SASApp_STP_2015-08-19_tmp1_19142.log, SASApp_STP_2015-08-19_tmp2_19142.log, SASApp_STP_2015-08-19_tmp3_19142.log
data log_analysis;
infile '/usr/sas/sas_config/Lev1/SASApp/Logs/SASApp_STP_2015-08-19_tmp1_19142.log' truncover;
input var : $ 3000.;
var1 = _infile_;
if var1 = :'2015';
Date_With_TimeStamp = scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
drop var var1;
run;
Could someone help me adapt this code so that it reads any log file with a similar format?
Sorry, what is the question?
I have the code below, which reads one file. How can I make it read multiple files with a similar layout and produce one dataset?
data log_analysis;
infile '/usr/sas/sas_config/Lev1/SASApp/Logs/SASApp_STP_2015-08-19_tmp1_19142.log' truncover;
input var : $ 3000.;
var1 = _infile_;
if var1 = :'2015';
Date_With_TimeStamp = scan(var1,1," ");
Status = scan(var1,2," ");
Processid = scan(var1,3," ");
userid = scan(var1,4," ");
Details = scan(var1,-1,'-');
drop var var1;
run;
Thanks for the code. Will the filename statement in your second set of code read the files from UNIX?
What does this statement (input / / / @) do?
Thanks again.
Babloo wrote:
Thanks for the code. Will the filename statement in your second set of code read the files from UNIX?
My OS is UNIX, but you will need to supply the proper path, and depending on the actual names of the files you may need something a bit different for the filename part. My example is self-contained: you can copy and paste it into your SAS session and it should run. I assume you have a /home (~) directory.
What does this statement (input / / / @) do?
It skips the records at the beginning of the file that you don't want to read. Each / moves the pointer to the next input line, and the trailing @ holds that line for the next INPUT statement.
Thanks again.
You may need to spend a few minutes reading the manual. You can't expect me to write your program if you are not going to do any work to learn how it works.
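To see what the line-pointer controls do, here is a minimal self-contained sketch that uses in-stream data instead of a real log file. The dataset name demo, the variable name line, and the sample lines are made up for illustration, and the count of three header lines is an assumption that matches the three slashes.

```sas
data demo;
   infile datalines truncover;
   /* First iteration only: load line 1, then each / advances one line.  */
   /* The trailing @ holds line 4 so the next INPUT statement reads it.  */
   if _n_ = 1 then input / / / @;
   input line $char80.;
   datalines;
header line 1
header line 2
header line 3
first data record
second data record
;
run;
```

Under those assumptions the step should keep only the two data records. When several files are read through one wildcard, the skip would have to be repeated at the start of each file, which this sketch does not attempt.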
Thanks for the code. However, please guide me on how to handle cases where any of the fields status, userid, or processid is missing.
I think you may want to read the reply from @Jaap Karman
Babloo wrote:
Thanks for the code. However, please guide me on how to handle cases where any of the fields status, userid, or processid is missing.
You can use FORMATTED INPUT for the first three fields.
You mentioned that USERID could be missing but did not show example data. I will assume that a - (dash) still follows the USERID field even when it is missing.
If we add DLM='-' to the INFILE statement we can read the first three fields with formatted input, read USERID using LIST/delimited input then switch back to formatted for DETAILS.
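A rough sketch of that mixed input style follows. It is only an illustration under assumptions that the thread does not confirm: the level is taken to be padded to five characters and followed by one blank, and the process id is taken to be ten characters wide. The sample as displayed above shows a single blank after INFO, which may just be collapsed whitespace, so adjust the widths to your real files. The dataset name log_fields and the lengths are hypothetical, and the timestamp is kept as character text.

```sas
data log_fields;
   /* The dash is the only delimiter, so blanks inside fields are data. */
   infile datalines dlm='-' truncover;
   length Date_With_TimeStamp $23 Status $5 Processid $10
          userid $64 Details $3000;
   input Date_With_TimeStamp $23.   /* formatted: fixed 23 columns        */
         +1 Status $5.              /* formatted: assumed padded to 5     */
         +1 Processid $10.          /* formatted: [nnnnnnnn]              */
         userid                     /* list input: reads up to next dash  */
         Details $3000.;            /* formatted: the rest of the line    */
   datalines;
2015-08-20T00:00:05,082 INFO  [04086681] :sassrv - New out call client connection (74065) for user
2015-08-20T00:00:05,094 INFO  [04086689] 74064:datapo@saspw - STP: 36518: Creating New Context
;
run;
```

Because DETAILS is read with formatted input after USERID, any further dashes inside the message text stay in DETAILS instead of splitting it.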
Well, there are a few ways.
The easiest is to change the infile statement:
infile "c:\temp\*.log" truncover;
That would read in all .log files from that directory.
The other way I tend to do it is to pipe in a directory listing of where the logs are stored (I use Windows, so you will need to change this for your operating system):
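filename dirlist pipe 'dir "c:\temp\*.log" /b';
data list;
length fname $200;
infile dirlist truncover;
input fname $200.; /* one bare file name per line, extension included */
run;
This gives a list of all files in that directory with the .log extension. We can then generate the necessary code from there:
data _null_;
set list;
/* Build one data step per file, then append its rows to log_analysis. */
call execute(
'data one_log;' ||
' infile "c:\temp\' || strip(fname) || '" truncover;' ||
' input var : $ 3000.;' ||
' var1 = _infile_;' ||
' if var1 =: "2015";' ||
' Date_With_TimeStamp = scan(var1,1," ");' ||
' Status = scan(var1,2," ");' ||
' Processid = scan(var1,3," ");' ||
' userid = scan(var1,4," ");' ||
' Details = scan(var1,-1,"-");' ||
' drop var var1;' ||
' run;' ||
' proc append base=log_analysis data=one_log; run;');
run;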
My files are on UNIX. How do I write the filename statement for files on UNIX?
As I mentioned, I work on Windows, but something like this should work:
filename dirlist pipe 'ls .../temp/*.log';
If you can use a simple wildcard pattern to find the files then just modify the name used in the INFILE statement to include the wildcard pattern. You might also want to add the FILENAME option so that SAS will return the actual name of the file that the current line comes from. You will need two variables since the one referenced on the INFILE statement will be dropped.
data log_analysis;
length fname filename $200;
infile '/usr/sas/sas_config/Lev1/SASApp/Logs/SASApp_STP_2015-08-19_*.log' truncover filename=fname;
input var : $ 3000.;
filename=fname;
var1 = _infile_;
...
If you just have a list of individual files whose names you already know you can just use a fileref that points to multiple files.
filename in
('/usr/sas/sas_config/Lev1/SASApp/Logs/SASApp_STP_2015-08-19_tmp1_19142.log'
,'/usr/sas/sas_config/Lev1/SASApp/Logs/SASApp_STP_2015-08-19_tmp2_19142.log'
,'/usr/sas/sas_config/Lev1/SASApp/Logs/SASApp_STP_2015-08-19_tmp3_19142.log'
);
data log_analysis;
length fname filename $200;
infile IN truncover filename=fname;
input var : $ 3000.;
filename=fname;
var1 = _infile_;
...
If you have a dataset with the list of filenames, then use that dataset and a DO loop to read the individual files.
data log_analysis;
set filelist ;
fname=filename ;
infile logfile filevar=fname truncover end=eof;
do while (not eof);
input var : $ 3000.;
var1 = _infile_;
...
end;
run;
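For anyone who wants to try the FILEVAR= pattern without real logs, here is a self-contained sketch. Everything in it is hypothetical: it writes two small test files named demo_1.log and demo_2.log into the WORK library directory, lists them in a dataset called filelist with a variable called filename, and then reads them back. The fileref dummy is only a placeholder, since FILEVAR= supplies the real path.

```sas
/* Directory of the WORK library, used only to hold the test files. */
%let dir = %sysfunc(pathname(work));

/* Build the list of file names. */
data filelist;
   length filename $256;
   do i = 1 to 2;
      filename = cats("&dir/demo_", i, ".log");
      output;
   end;
   drop i;
run;

/* Write one header line and one record to each test file. */
data _null_;
   set filelist;
   fname = filename;
   file dummy filevar=fname;
   put 'header line';
   put '2015-08-20T00:00:05,082 INFO [04086681] :sassrv - test record';
run;

/* Read every listed file; keep only lines that start with the year. */
data log_analysis;
   set filelist;
   fname = filename;              /* FILEVAR= variables are not kept  */
   infile dummy filevar=fname truncover end=eof;
   do while (not eof);
      input line $char300.;
      if line =: '2015' then output;
   end;
run;
```

The explicit OUTPUT inside the loop matters: without it only one row per file would be written. The copy in filename is kept so that each row records which file it came from.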
You didn't look at my example; it uses a wildcard to READ. I use FILEVAR to create the test files.
I updated my answer to include an example using a list of filenames from a dataset.