- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I need to convert the attached log file into dataset. The log file contains multiple delimiters such as "|" , tab, and a special character (please refer to the attached log file).
Thanks for your help in advance.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your infile actually has three delimiters:
- the pipe character
- hex 08 to separate items in the long string
- hex 09 to separate events and values within an item
So you need a do loop for the items and two different splits along these separators:
data jei;
infile '$HOME/sascommunity/jei.txt' dlmstr=CRLF dlm='|' truncover;
input
status :$5.
TIMESTAMP :$20.
variablestr :$500.
n :$1.
;
length
event $50
val 8
;
do i = 1 to countw(variablestr,'08'x);
event = scan(scan(variablestr,i,'08'x),1,'09'x);
val = input(scan(scan(variablestr,i,'08'x),2,'09'x),best.);
output;
end;
drop i variablestr;
run;
proc print data=jei noobs;
run;
Partial result:
status TIMESTAMP n event val SV 20170629002459 N NumberOfUnitsBonded 6729 SV 20170629002459 N Previous Number of Unit Bonded 6732 SV 20170629002459 N Number of Unit Bonded 6733 SV 20170629002459 N UPS 3433 SV 20170629002459 N UOL 6733 SV 20170629002501 N NumberOfUnitsBonded 6730 SV 20170629002501 N Previous Number of Unit Bonded 6733 SV 20170629002501 N Number of Unit Bonded 6734 SV 20170629002501 N UPS 3434 SV 20170629002501 N UOL 6734 STATE 20170629002501 N Alarm count per shift 0 STATE 20170629002501 N Runtime per shift 8668 STATE 20170629002501 N MTBA per shift 8668 SV 20170629002502 N NumberOfUnitsBonded 6731 SV 20170629002503 N Previous Number of Unit Bonded 6734 SV 20170629002503 N Number of Unit Bonded 6735 SV 20170629002503 N UPS 3435 SV 20170629002503 N UOL 6735 SV 20170629002504 N NumberOfUnitsBonded 6732 SV 20170629002504 N Previous Number of Unit Bonded 6735 SV 20170629002504 N Number of Unit Bonded 6736 SV 20170629002504 N UPS 3436 SV 20170629002504 N UOL 6736 STATE 20170629002505 N Alarm count per shift 0 STATE 20170629002505 N Runtime per shift 8672 STATE 20170629002505 N MTBA per shift 8672 SV 20170629002506 N NumberOfUnitsBonded 6733 SV 20170629002506 N Previous Number of Unit Bonded 6736 SV 20170629002506 N Number of Unit Bonded 6737 SV 20170629002506 N UPS 3437 SV 20170629002506 N UOL 6737 SV 20170629002508 N NumberOfUnitsBonded 6734 SV 20170629002508 N Previous Number of Unit Bonded 6737 SV 20170629002508 N Number of Unit Bonded 6738 SV 20170629002508 N UPS 3438 SV 20170629002508 N UOL 6738
To make the data more analyzable for SAS, consider to convert the timestamp value to a SAS datetime value.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Would you be able to outline your target position for the dataset, as that would help someone being able to give you some assistance.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I wish to have the same format of the output on the attached file.
Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Log files are text. Text manipulation is done with proper tools; SAS and Excel are not proper tools for text.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So what did you try?
Looking at the file you posted in the viewer on this site it looks like text file that delimited with a |.
It is trivial to read a file that is delimited with | in SAS.
data want ;
infile 'myfile.txt' dsd dlm='|' truncover ;
length var1-var5 $200 ;
input var1-var5;
run;
It does appear that one of column appears to contain a list of values delimited by something that looks like a square in the view, but even that should not be hard to deal with once you have the data in a dataset instead of a text file.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Tom
I tried the code below
DATA TEST_LOG;
LENGTH
STATUS $ 13
TIMESTAMP 8
EVENT $ 2577
VAL 8
;
FORMAT
STATUS $CHAR13.
TIMESTAMP BEST14.
EVENT $CHAR2577.
;
INFORMAT
STATUS $CHAR13.
TIMESTAMP BEST14.
EVENT $CHAR2577.
;
INFILE 'E:\TEST.log'
LRECL=32767
FIRSTOBS=2
DLMSTR='7C'x
TRUNCOVER
;
INPUT
STATUS : $CHAR13.
TIMESTAMP : ?? BEST14.
EVENT : $CHAR2577.
;
ARRAY AR[1] EVENT;
DO i=1 ;
AR(i)=scan(EVENT, i, '09'X);
END;
RUN;
and gives the output below.
If you can notice, the value whose delimiter is a tab was gone.
I tried your given code and it gives me the output below.
I want to have another column for the numeric values in var3 column - which are in the most right. In my code, these values did not appear.
So basically, I'm wishing to have like the output below.
Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Just use DLM instead of DLMSTR and add all THREE delimiters as value (a pipe, backspace, tab).
DATA TEST_LOG;
LENGTH
STATUS $ 13
TIMESTAMP 8
EVENT $ 100
VAL 8
;
FORMAT
STATUS $CHAR13.
TIMESTAMP BEST14.
EVENT $100.
;
INFORMAT
STATUS $CHAR13.
TIMESTAMP BEST14.
EVENT $100.
;
INFILE 'c:\temp\TEST.txt'
LRECL=32767
FIRSTOBS=1
DLM='7c0809'x
dsd
TRUNCOVER
;
INPUT
STATUS : $CHAR13.
TIMESTAMP : ?? BEST14.
EVENT : $100.
val : best15.
;
RUN;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your infile actually has three delimiters:
- the pipe character
- hex 08 to separate items in the long string
- hex 09 to separate events and values within an item
So you need a do loop for the items and two different splits along these separators:
data jei;
infile '$HOME/sascommunity/jei.txt' dlmstr=CRLF dlm='|' truncover;
input
status :$5.
TIMESTAMP :$20.
variablestr :$500.
n :$1.
;
length
event $50
val 8
;
do i = 1 to countw(variablestr,'08'x);
event = scan(scan(variablestr,i,'08'x),1,'09'x);
val = input(scan(scan(variablestr,i,'08'x),2,'09'x),best.);
output;
end;
drop i variablestr;
run;
proc print data=jei noobs;
run;
Partial result:
status TIMESTAMP n event val SV 20170629002459 N NumberOfUnitsBonded 6729 SV 20170629002459 N Previous Number of Unit Bonded 6732 SV 20170629002459 N Number of Unit Bonded 6733 SV 20170629002459 N UPS 3433 SV 20170629002459 N UOL 6733 SV 20170629002501 N NumberOfUnitsBonded 6730 SV 20170629002501 N Previous Number of Unit Bonded 6733 SV 20170629002501 N Number of Unit Bonded 6734 SV 20170629002501 N UPS 3434 SV 20170629002501 N UOL 6734 STATE 20170629002501 N Alarm count per shift 0 STATE 20170629002501 N Runtime per shift 8668 STATE 20170629002501 N MTBA per shift 8668 SV 20170629002502 N NumberOfUnitsBonded 6731 SV 20170629002503 N Previous Number of Unit Bonded 6734 SV 20170629002503 N Number of Unit Bonded 6735 SV 20170629002503 N UPS 3435 SV 20170629002503 N UOL 6735 SV 20170629002504 N NumberOfUnitsBonded 6732 SV 20170629002504 N Previous Number of Unit Bonded 6735 SV 20170629002504 N Number of Unit Bonded 6736 SV 20170629002504 N UPS 3436 SV 20170629002504 N UOL 6736 STATE 20170629002505 N Alarm count per shift 0 STATE 20170629002505 N Runtime per shift 8672 STATE 20170629002505 N MTBA per shift 8672 SV 20170629002506 N NumberOfUnitsBonded 6733 SV 20170629002506 N Previous Number of Unit Bonded 6736 SV 20170629002506 N Number of Unit Bonded 6737 SV 20170629002506 N UPS 3437 SV 20170629002506 N UOL 6737 SV 20170629002508 N NumberOfUnitsBonded 6734 SV 20170629002508 N Previous Number of Unit Bonded 6737 SV 20170629002508 N Number of Unit Bonded 6738 SV 20170629002508 N UPS 3438 SV 20170629002508 N UOL 6738
To make the data more analyzable for SAS, consider to convert the timestamp value to a SAS datetime value.