Hi All,
I have these many of these text files with this type of data :
---------------------------
|Data statistics|Number of|
|-------------------------|
|Records passed | 1.206|
---------------------------
21.12.2011 Document Extract for Auditors - all journals 2
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Document Extract for Auditors - all journals
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|CoCd|Year|Period|Doc.no. |Doc. Type|Pstg date |Entry dte |Time |Document header text |Reference |User name |Itm|D/C|G/L acct|FA | Amount|Curr.|Loc.curr.amount|Curr.| LC2 amount|LCur2|Text |S|Parked by|Tcod|
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|6200|2012| 02|900000019 |AB |13.10.2011|13.10.2011|20:02:02|Autom.clearing SAPF124 | |S00000002 |001|H |102419 | | 0,00 |EUR | 0,00 |EUR | 21,15 |USD | | | |FB1S|
I need to delete everything in bold and these occur at regual intevals in the file...
I need to delete these and append moref files like these.
Please help.
Thanks.
Max,
Unfortunately the OP hasn't yet let us know if this is even on the right track. Regardless, I expanded the code to include a pipe that identifies all of the files and imports them all in one datastep.
%let path=c:\art\;
filename filnames pipe "dir &path.extract*.txt /b";
data want;
/* declare formats and informats*/
informat file2read $100.;
informat junk $1.;
informat CoCd $4.;
informat Year 4.;
informat Period $4.;
informat DocNo $10.;
informat DocType $10.;
informat PstgDate anydtdte10.;
format PstgDate date9.;
informat EntryDate anydtdte10.;
format EntryDate date9.;
informat Time time8.;
informat Document_header_text $25.;
informat Reference $15.;
informat User_Name $10.;
informat Itm $3.;
informat D_C $1.;
informat G_L_acct $6.;
informat FA $3.;
informat Amount comma10.2;
informat Curr $5.;
informat Loc_Curr_Amount comma10.2;
informat Loc_Curr $5.;
informat Loc_Curr2_Amount comma10.2;
informat Loc_Curr2 $5.;
informat Text $50.;
informat S $1.;
informat ParkedBy $9.;
informat Tcod $4.;
/*read each filename*/
infile filnames truncover;
input file2read &;
file2read=catt("&path.",file2read);
/* for each file, read all data and write to file 'want'*/
do until (eof);
infile dummy filevar=file2read dlm="|" dsd truncover lrecl=300 end=eof;
input @;
if count(_infile_,'|') gt 5 and substr(_infile_,1,6) ne "|CoCd|" then do;
input junk CoCd Year Period DocNo DocType PstgDate EntryDate Time Document_header_text
Reference User_Name Itm D_C G_L_acct FA Amount Curr Loc_Curr_Amount
Loc_Curr Loc_Curr2_Amount Loc_Curr2 Text S ParkedBy Tcod;
output;
end;
else input;
end;
run;
Not sure I correctly understand what you want to extract but, from the look of your example, something like the following might suffice:
data want;
infile "c:\art\extract1.txt" truncover;
input string $256.;
if count(string,'|') gt 5;
run;
Hello,
Does it hapen only in the head of the file or many times? If yes, i think this code can help:
data test;
infile datalines dlm="|" firstobs=21;
input
CoCd $
Year $
Period $
Doc_no_ $
Doc_Type $
Pstgdate $
Entrydte $
Time $
Documentheadertext $
Reference $
Username $
Itm $
D_C $
G_Lacct $
FA $
Amount $
Curr_ $
Loc_curr_amount $
Curr_ $
LC2amount $
LCur2 $
Text $
S $
Parkedby $
Tcod $;
datalines;
|Data statistics|Number of|
|-------------------------|
|Records passed | 1.206|
---------------------------
21.12.2011 Document Extract for Auditors - all journals 2
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Document Extract for Auditors - all journals
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|CoCd|Year|Period|Doc.no. |Doc. Type|Pstg date |Entry dte |Time |Document header text |Reference |User name |Itm|D/C|G/L acct|FA | Amount|Curr.|Loc.curr.amount|Curr.| LC2 amount|LCur2|Text |S|Parked by|Tcod|
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|6200|2012| 02|900000019 |AB |13.10.2011|13.10.2011|20:02:02|Autom.clearing SAPF124 | |S00000002 |001|H |102419 | | 0,00 |EUR | 0,00 |EUR | 21,15 |USD | | | |FB1S|
;
run;
When you say "these occur at regual intevals in the file". How many times does it hapen in the file?
and, to fully extract the data elements you could use:
data want;
infile "c:\art\extract1.txt" dlm="|" dsd truncover lrecl=300;
informat junk $1.;
informat CoCd $4.;
informat Year 4.;
informat Period $4.;
informat DocNo $10.;
informat DocType $10.;
informat PstgDate anydtdte10.;
format PstgDate date9.;
informat EntryDate anydtdte10.;
format EntryDate date9.;
informat Time time8.;
informat Document_header_text $25.;
informat Reference $15.;
informat User_Name $10.;
informat Itm $3.;
informat D_C $1.;
informat G_L_acct $6.;
informat FA $3.;
informat Amount comma10.2;
informat Curr $5.;
informat Loc_Curr_Amount comma10.2;
informat Loc_Curr $5.;
informat Loc_Curr2_Amount comma10.2;
informat Loc_Curr2 $5.;
informat Text $50.;
informat S $1.;
informat ParkedBy $9.;
informat Tcod $4.;
input @;
if count(_infile_,'|') gt 5 and substr(_infile_,1,6) ne "|CoCd|" then do;
input junk CoCd Year Period DocNo DocType PstgDate EntryDate Time Document_header_text
Reference User_Name Itm D_C G_L_acct FA Amount Curr Loc_Curr_Amount
Loc_Curr Loc_Curr2_Amount Loc_Curr2 Text S ParkedBy Tcod;
output;
end;
else input;
run;
That's Perfect Arthur!
Max,
Unfortunately the OP hasn't yet let us know if this is even on the right track. Regardless, I expanded the code to include a pipe that identifies all of the files and imports them all in one datastep.
%let path=c:\art\;
filename filnames pipe "dir &path.extract*.txt /b";
data want;
/* declare formats and informats*/
informat file2read $100.;
informat junk $1.;
informat CoCd $4.;
informat Year 4.;
informat Period $4.;
informat DocNo $10.;
informat DocType $10.;
informat PstgDate anydtdte10.;
format PstgDate date9.;
informat EntryDate anydtdte10.;
format EntryDate date9.;
informat Time time8.;
informat Document_header_text $25.;
informat Reference $15.;
informat User_Name $10.;
informat Itm $3.;
informat D_C $1.;
informat G_L_acct $6.;
informat FA $3.;
informat Amount comma10.2;
informat Curr $5.;
informat Loc_Curr_Amount comma10.2;
informat Loc_Curr $5.;
informat Loc_Curr2_Amount comma10.2;
informat Loc_Curr2 $5.;
informat Text $50.;
informat S $1.;
informat ParkedBy $9.;
informat Tcod $4.;
/*read each filename*/
infile filnames truncover;
input file2read &;
file2read=catt("&path.",file2read);
/* for each file, read all data and write to file 'want'*/
do until (eof);
infile dummy filevar=file2read dlm="|" dsd truncover lrecl=300 end=eof;
input @;
if count(_infile_,'|') gt 5 and substr(_infile_,1,6) ne "|CoCd|" then do;
input junk CoCd Year Period DocNo DocType PstgDate EntryDate Time Document_header_text
Reference User_Name Itm D_C G_L_acct FA Amount Curr Loc_Curr_Amount
Loc_Curr Loc_Curr2_Amount Loc_Curr2 Text S ParkedBy Tcod;
output;
end;
else input;
end;
run;
Thanks a lot Arthur. I could process the files for analysis with your help. Really appreciate it.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.