Solved: Re: Need help to remove the non data rows and append these text files ...

kabir_roy8 · Posted 11-22-2012 11:01 AM

Hi All,

I have these many of these text files with this type of data :

---------------------------

|Data statistics|Number of|

|-------------------------|

|Records passed | 1.206|

---------------------------

21.12.2011 Document Extract for Auditors - all journals 2

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Document Extract for Auditors - all journals

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

|6200|2012| 02|900000019 |AB |13.10.2011|13.10.2011|20:02:02|Autom.clearing SAPF124 | |S00000002 |001|H |102419 | | 0,00 |EUR | 0,00 |EUR | 21,15 |USD | | | |FB1S|

I need to delete everything in bold and these occur at regual intevals in the file...

I need to delete these and append moref files like these.

Please help.

Thanks.

art297 · Posted 11-22-2012 05:50 PM

Max,

Unfortunately the OP hasn't yet let us know if this is even on the right track. Regardless, I expanded the code to include a pipe that identifies all of the files and imports them all in one datastep.

%let path=c:\art\;

filename filnames pipe "dir &path.extract*.txt /b";

data want;

/* declare formats and informats*/

informat file2read $100.;

informat junk $1.;

informat CoCd $4.;

informat Year 4.;

informat Period $4.;

informat DocNo $10.;

informat DocType $10.;

informat PstgDate anydtdte10.;

format PstgDate date9.;

informat EntryDate anydtdte10.;

format EntryDate date9.;

informat Time time8.;

informat Document_header_text $25.;

informat Reference $15.;

informat User_Name $10.;

informat Itm $3.;

informat D_C $1.;

informat G_L_acct $6.;

informat FA $3.;

informat Amount comma10.2;

informat Curr $5.;

informat Loc_Curr_Amount comma10.2;

informat Loc_Curr $5.;

informat Loc_Curr2_Amount comma10.2;

informat Loc_Curr2 $5.;

informat Text $50.;

informat S $1.;

informat ParkedBy $9.;

informat Tcod $4.;

/*read each filename*/

infile filnames truncover;

input file2read &;

file2read=catt("&path.",file2read);

/* for each file, read all data and write to file 'want'*/

do until (eof);

infile dummy filevar=file2read dlm="|" dsd truncover lrecl=300 end=eof;

input @;

if count(_infile_,'|') gt 5 and substr(_infile_,1,6) ne "|CoCd|" then do;

input junk CoCd Year Period DocNo DocType PstgDate EntryDate Time Document_header_text

Reference User_Name Itm D_C G_L_acct FA Amount Curr Loc_Curr_Amount

Loc_Curr Loc_Curr2_Amount Loc_Curr2 Text S ParkedBy Tcod;

output;

end;

else input;

end;

run;

View solution in original post

art297 · Posted 11-22-2012 11:51 AM

Not sure I correctly understand what you want to extract but, from the look of your example, something like the following might suffice:

data want;

infile "c:\art\extract1.txt" truncover;

input string $256.;

if count(string,'|') gt 5;

run;

MaxReis86 · Posted 11-22-2012 12:17 PM

Hello,

Does it hapen only in the head of the file or many times? If yes, i think this code can help:

data test;

infile datalines dlm="|" firstobs=21;

input

CoCd $

Year $

Period $

Doc_no_ $

Doc_Type $

Pstgdate $

Entrydte $

Time $

Documentheadertext $

Reference $

Username $

Itm $

D_C $

G_Lacct $

FA $

Amount $

Curr_ $

Loc_curr_amount $

Curr_ $

LC2amount $

LCur2 $

Text $

S $

Parkedby $

Tcod $;

datalines;

|Data statistics|Number of|

|-------------------------|

|Records passed | 1.206|

---------------------------

21.12.2011 Document Extract for Auditors - all journals 2

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Document Extract for Auditors - all journals

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

|6200|2012| 02|900000019 |AB |13.10.2011|13.10.2011|20:02:02|Autom.clearing SAPF124 | |S00000002 |001|H |102419 | | 0,00 |EUR | 0,00 |EUR | 21,15 |USD | | | |FB1S|

;

run;

When you say "these occur at regual intevals in the file". How many times does it hapen in the file?

art297 · Posted 11-22-2012 01:19 PM

and, to fully extract the data elements you could use:

data want;

infile "c:\art\extract1.txt" dlm="|" dsd truncover lrecl=300;

informat junk $1.;

informat CoCd $4.;

informat Year 4.;

informat Period $4.;

informat DocNo $10.;

informat DocType $10.;

informat PstgDate anydtdte10.;

format PstgDate date9.;

informat EntryDate anydtdte10.;

format EntryDate date9.;

informat Time time8.;

informat Document_header_text $25.;

informat Reference $15.;

informat User_Name $10.;

informat Itm $3.;

informat D_C $1.;

informat G_L_acct $6.;

informat FA $3.;

informat Amount comma10.2;

informat Curr $5.;

informat Loc_Curr_Amount comma10.2;

informat Loc_Curr $5.;

informat Loc_Curr2_Amount comma10.2;

informat Loc_Curr2 $5.;

informat Text $50.;

informat S $1.;

informat ParkedBy $9.;

informat Tcod $4.;

input @;

if count(_infile_,'|') gt 5 and substr(_infile_,1,6) ne "|CoCd|" then do;

input junk CoCd Year Period DocNo DocType PstgDate EntryDate Time Document_header_text

Reference User_Name Itm D_C G_L_acct FA Amount Curr Loc_Curr_Amount

Loc_Curr Loc_Curr2_Amount Loc_Curr2 Text S ParkedBy Tcod;

output;

end;

else input;

run;

MaxReis86 · Posted 11-22-2012 01:45 PM

That's Perfect Arthur!

art297 · Posted 11-22-2012 05:50 PM

Max,

Unfortunately the OP hasn't yet let us know if this is even on the right track. Regardless, I expanded the code to include a pipe that identifies all of the files and imports them all in one datastep.

%let path=c:\art\;

filename filnames pipe "dir &path.extract*.txt /b";

data want;

/* declare formats and informats*/

informat file2read $100.;

informat junk $1.;

informat CoCd $4.;

informat Year 4.;

informat Period $4.;

informat DocNo $10.;

informat DocType $10.;

informat PstgDate anydtdte10.;

format PstgDate date9.;

informat EntryDate anydtdte10.;

format EntryDate date9.;

informat Time time8.;

informat Document_header_text $25.;

informat Reference $15.;

informat User_Name $10.;

informat Itm $3.;

informat D_C $1.;

informat G_L_acct $6.;

informat FA $3.;

informat Amount comma10.2;

informat Curr $5.;

informat Loc_Curr_Amount comma10.2;

informat Loc_Curr $5.;

informat Loc_Curr2_Amount comma10.2;

informat Loc_Curr2 $5.;

informat Text $50.;

informat S $1.;

informat ParkedBy $9.;

informat Tcod $4.;

/*read each filename*/

infile filnames truncover;

input file2read &;

file2read=catt("&path.",file2read);

/* for each file, read all data and write to file 'want'*/

do until (eof);

infile dummy filevar=file2read dlm="|" dsd truncover lrecl=300 end=eof;

input @;

if count(_infile_,'|') gt 5 and substr(_infile_,1,6) ne "|CoCd|" then do;

input junk CoCd Year Period DocNo DocType PstgDate EntryDate Time Document_header_text

Reference User_Name Itm D_C G_L_acct FA Amount Curr Loc_Curr_Amount

Loc_Curr Loc_Curr2_Amount Loc_Curr2 Text S ParkedBy Tcod;

output;

end;

else input;

end;

run;

kabir_roy8 · Posted 12-07-2012 02:32 AM

Thanks a lot Arthur. I could process the files for analysis with your help. Really appreciate it.

Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Re: Need help to remove the non data rows and append these text files in SAS

Register Today!

SAS Training: Just a Click Away