Adding modification date to data set from imported text files

Occasional Contributor
Posts: 16

I have hundreds of text files that I am importing, and the date recorded inside them is sometimes wrong. I want to take each file's modification date (or creation date) and place it in the date variable when appropriate. I have started to figure out how to get the modification/creation date from the text files, but my problem lies in processing the raw text files while also capturing that date. Essentially, I have one data step that gets the modification date and one that processes the external text files; I just cannot create a data step that does both simultaneously.


Here is what I have to get the modification date:

%let file = X:\LI\*.txt;
filename ft pipe "dir &file /t:c /a:-d";
data test;

infile ft truncover;
input createdate ?? :mmddyy8.;

put createdate = worddate.;
format createdate mmddyy8.;

run;

And here is the data step that processes the text files:
filename t 'X:\LI\*.txt';
informat fa_id $10 st_id $10 pt_name $10 date mmddyy10. time hhmmss10.;
infile t delimiter = ',' firstobs = 1 missover dsd pad;
input fa_id st_id pt_name date time goal min ht bv sat er event;
format date mmddyy10. time time.;
run;
* This data step creates multiple rows for each minute, so one text file produces approximately 200 rows of data (which differ only in the time, ht, bv and sat variables; everything else remains constant).

How would I combine these two separate data steps into one that actually works? Every time I try, I only get the create date, and all the other variables are missing. I know this is probably a really simple question, but I am new to these types of problems and am not sure how to get this to work. I am not sure whether I should provide other information; I am definitely willing to if necessary. Thanks for any help anyone can provide.


Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

This is untested, so it may need some tweaking or typo corrections:

%let file = X:\LI\;
filename ft pipe "dir &file.\*.txt /t:c /a:-d";

data want;
  informat fa_id $10. st_id $10. pt_name $10. date mmddyy10. time hhmmss10.;
  format createdate date mmddyy10. time time.;
  infile ft truncover;
  /* in the dir listing, each file line starts with the date; the name starts in column 40 */
  input createdate :mmddyy10. @40 fname :$20.;
  if not missing(createdate) then do;
    fil2read=cats("&file",fname);
    /* FILEVAR= points the second INFILE at the file named in fil2read */
    infile dummy filevar=fil2read end=done delimiter = ',' missover dsd pad;
    do while(not done);
      input fa_id st_id pt_name date time goal min ht bv sat er event;
      output;
    end;
  end;
run;

Haikuo

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

When I run the above code, I get the following messages in the log:

NOTE: The infile FT is:

Unnamed Pipe Access Device,

PROCESS=dir X:\LI\*.txt /t:c /a:-d,

RECFM=V,LRECL=256

NOTE: Invalid data for createdate in line 1 2-7.

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--

1 Volume in drive X is Users

fa_id= st_id= pt_name= date=. time=. createdate=. fname= fil2read= done=0 goal=. min=. ht=.

bv=. sat=. er=. event=. _ERROR_=1 _N_=1

NOTE: Invalid data for createdate in line 2 2-7.

2 Volume Serial Number is

fa_id= st_id= pt_name= date=. time=. createdate=. fname= fil2read= done=0 goal=. min=. ht=.

bv=. sat=. er=. event=. _ERROR_=1 _N_=2

NOTE: Invalid data for createdate in line 4 2-10.

4 Directory of X:\LI 24

fa_id= st_id= pt_name= date=. time=. createdate=. fname= fil2read= done=0 goal=. min=. ht=.

bv=. sat=. er=. event=. _ERROR_=1 _N_=4

ERROR: Physical file does not exist, X:\LI\260_H_206_88.

09/03/2012 08:22 PM 13,490 260_H_206_88.txt

fa_id= st_id= pt_name= date=. time=. createdate=09/01/2012 fname=260_H_206_88

fil2read=X:\LI\260_H_206_88 done=0 goal=. min=. ht=. bv=. sat=. er=. event=.

_ERROR_=1 _N_=6

NOTE: 6 records were read from the infile FT.

The minimum record length was 0.

The maximum record length was 66.

NOTE: The SAS System stopped processing this step because of errors.

WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0

observations and 14 variables.

WARNING: Data set WORK.WANT was not replaced because this step was stopped.

NOTE: DATA statement used (Total process time):

real time 2:00.33

cpu time 0.15 second

I am going to keep looking at it and hopefully be able to determine what is causing the error. I just wanted to share the error in case people were interested in seeing it.

Thank you for your help!

Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

I have made some test files using sashelp.class, and the following code works fine for me:

132 %let file=h:\temp\test\;

133 filename ft pipe "dir h:\temp\test\*.txt /t:c /a:-d";

134

135 data want;

136 format createdate mmddyy10. ;

137 infile ft truncover;

138 input createdate :mmddyy10. @40 fname :$20.;

139 if not missing(createdate) then do;

140 fil2read=cats("&file",fname);

141 infile dummy filevar=fil2read end=done delimiter = ',' missover dsd pad;;

142 do while(not done );

143 input name $20.;

144 output;

145 end;

146 end;

147 run;

NOTE: The infile FT is:

  Unnamed Pipe Access Device,

  PROCESS=dir h:\temp\test\*.txt /t:c /a:-d,

  RECFM=V,LRECL=256

NOTE: Invalid data for createdate in line 1 2-7.

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--

1 Volume in drive H is 190 25

createdate=. fname= fil2read= done=0 name= _ERROR_=1 _N_=1

NOTE: Invalid data for createdate in line 2 2-7.

2 Volume Serial Number is 0000-001F 34

createdate=. fname= fil2read= done=0 name= _ERROR_=1 _N_=2

NOTE: Invalid data for createdate in line 4 2-10.

4 Directory of h:\temp\test 26

createdate=. fname= fil2read= done=0 name= _ERROR_=1 _N_=4

NOTE: The infile DUMMY is:

  Filename=h:\temp\test\CLASS.TXT,

  RECFM=V,LRECL=256,File Size (bytes)=141,

  Last Modified=05Mar2013:09:53:50,

  Create Time=05Mar2013:09:53:08

NOTE: The infile DUMMY is:

  Filename=h:\temp\test\tt.TXT,

  RECFM=V,LRECL=256,File Size (bytes)=141,

  Last Modified=05Mar2013:09:54:48,

  Create Time=05Mar2013:09:54:48

NOTE: The infile DUMMY is:

  Filename=h:\temp\test\ws.TXT,

  RECFM=V,LRECL=256,File Size (bytes)=141,

  Last Modified=05Mar2013:09:54:54,

  Create Time=05Mar2013:09:54:54

NOTE: Invalid data for createdate in line 9 16-16.

9 3 File(s) 423 bytes 45

createdate=. fname=bytes fil2read=h:\temp\test\ws.TXT done=1 name= _ERROR_=1 _N_=9

NOTE: Invalid data for createdate in line 10 16-16.

10 0 Dir(s) 2,460,905,734,144 bytes free 53

createdate=. fname=144 fil2read=h:\temp\test\ws.TXT done=1 name= _ERROR_=1 _N_=10

NOTE: 10 records were read from the infile FT.

  The minimum record length was 0.

  The maximum record length was 53.

NOTE: 19 records were read from the infile DUMMY.

  The minimum record length was 4.

  The maximum record length was 7.

NOTE: 19 records were read from the infile DUMMY.

  The minimum record length was 4.

  The maximum record length was 7.

NOTE: 19 records were read from the infile DUMMY.

  The minimum record length was 4.

  The maximum record length was 7.

NOTE: The data set WORK.WANT has 57 observations and 3 variables.

NOTE: DATA statement used (Total process time):

  real time 0.26 seconds

  cpu time 0.06 seconds

Yes, you will see some "Invalid data" NOTEs. That is the expected result when the step reads the header and summary lines of the dir output, which do not start with a date. One way to avoid them is to know how many files you are reading in. Please share the actual code you ran, if you don't mind.
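
The "Invalid data" notes can also be silenced without knowing the file count. What follows is a minimal sketch, not tested against these files, that assumes the ?? modifier already used in the first post's data step and the same h:\temp\test folder shown in the log above. The data set name listing is only illustrative.

```sas
%let file = h:\temp\test\;
filename ft pipe "dir &file.*.txt /t:c /a:-d";

data listing;
  format createdate mmddyy10.;
  infile ft truncover;
  /* ?? suppresses the invalid-data note and leaves _ERROR_ at 0 */
  /* when a header or summary line does not start with a date    */
  input createdate ?? :mmddyy10. @40 fname :$20.;
  /* keep only the rows that describe a file */
  if not missing(createdate);
run;
```

Lines without a leading date still yield a missing createdate, so the subsetting IF drops them quietly.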

Haikuo

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

This is the code I used:

%let file = X:\LI\;

filename ft pipe "dir &file.\*.txt/t:c /a:-d";

data test;

informat fa_id $10. st_id $10. pt_name $10. date mmddyy10. time hhmmss10.;

format createdate date mmddyy10. time time.;

infile ft truncover;

input createdate :mmddyy10. @40 fname :$20.;

if not missing(createdate) then do;

fil2read=cats("&file",fname);

infile dummy filevar=fil2read end=done delimiter = ',' missover dsd pad;;

do while(not done);

input fa_id st_id pt_name date time goal min ht bv sat er event;

output;

end;

end;

run;


I think the problem has something to do with how it is naming the files. I get an error that reads:

'ERROR: Physical file does not exist, X:\LI\09\06\2012'

None of the .txt files in this folder have a name like this ('X:\LI\09\06\2012'). It seems the data step is somehow looking for files named after their creation date. But the file names are just arbitrary letters and numbers, like xyz_19802137_38974.txt, so it makes sense that it can't find a file with the name mentioned above.

Thanks for your help! I am going to keep looking into why this is happening

Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

It sounds like this has something to do with the output of the "dir" command, which may vary between operating systems. Would you please post the first couple of rows of the dir command's output?

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

NOTE: The infile FT is:
Unnamed Pipe Access Device,
Process = dir X\LI\\*.txt/t:c/a:-d,
REFCM = V,LRECL=256

NOTE: invalid data for create in line 1 2-7

I noticed the double \\ before the * and changed the code so that it no longer happens, but I still get the same error. Is there something else I should use to get the creation/modification date? I looked around and found this: http://support.sas.com/kb/40/934.html . But when I tried that, it ran VERY slowly and I had to abandon the data step.
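
As an alternative to parsing dir output, file dates can be read one file at a time with functions such as FOPEN and FINFO. The sketch below is a minimal illustration, assuming a Windows SAS session in which the information items are named 'Last Modified' and 'Create Time' (item names vary by operating system and SAS release, and FOPTNAME can list the ones actually available). The path reuses the example file name from the earlier post and is otherwise hypothetical.

```sas
data _null_;
  length fref $8 modified created $40;
  /* a blank fileref variable asks SAS to generate a fileref name */
  rc  = filename(fref, 'X:\LI\xyz_19802137_38974.txt');
  fid = fopen(fref);
  if fid > 0 then do;
    /* FINFO returns the values as character strings */
    modified = finfo(fid, 'Last Modified');
    created  = finfo(fid, 'Create Time');
    put modified= created=;
    rc = fclose(fid);
  end;
  else put 'Could not open the file.';
  /* release the generated fileref */
  rc = filename(fref);
run;
```

Opening and closing every file this way may be slow across hundreds of files on a network drive, which could be what made the earlier attempt drag.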

Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

OK, please run this from within SAS (if you have write access to X:\):

x "dir X:\LI\*.txt /t:c /a:-d >X:\test.txt";

Then post the first couple of rows of "x:\test.txt"; from there we will work out a fix.

Haikuo

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

When I paste this into SAS and run it, the command prompt (cmd.exe) pops up and says "The system cannot find the file specified." followed by the prompt C:\users\z>. Also, the "SAS System X Command Window is Active" dialog pops up and says "The X command is active. Enter EXIT at the prompt in the X command window to reactivate this SAS session."
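
The "X command is active" dialog appears because a Windows SAS session by default waits for EXIT to be typed in the command window. A possible workaround, sketched here on the assumption that the NOXWAIT and XSYNC system options are available (they are Windows-specific) and that X:\ is writable, is:

```sas
/* NOXWAIT: close the command window automatically when the command ends */
/* XSYNC:   make SAS wait until the command has finished                 */
options noxwait xsync;

x "dir X:\LI\*.txt /t:c /a:-d > X:\test.txt";

/* echo the captured listing to the SAS log */
data _null_;
  infile "X:\test.txt" truncover;
  input;
  put _infile_;
run;
```

This only removes the manual EXIT step; it does not explain the "cannot find the file specified" message itself.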

Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

Then my guess is that there are no *.txt files in the specified folder, the one your code pointed to in your first post. Try this:

x "dir X:\LI\*.* /t:c /a:-d >X:\test.txt";

Then post the content of X:\test.txt.

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

The command prompt opens again, but it shows nothing, just the empty window. The "SAS System X Command Window is Active" dialog also pops up again. There is no file X:\test.txt.

Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

Try this and post your log:

%let file = X:\LI\;

filename ft pipe "dir &file.\*.* /t:c /a:-d";

data _null_;

  infile ft truncover;

  input;

  put _infile_;

run;

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

OK, I got it to work. The output looks like this:

Volume in drive X is Users$
Volume Serial Number is 042D-39CD

Directory of X:\LI

09/06/2012  08:58 AM            13,490 H14_20120906.txt
09/07/2012  09:36 AM               156   H09_20120906.txt
09/10/2012  09:46 AM            13,806 H22_20120906.txt
09/12/2012  10:02 AM            19,800 H10_20120906.txt
09/12/2012  10:03 AM            17,111 H00_20120906.txt
09/14/2012  12:58 PM            13,260 H44_20120906.txt
09/15/2012  01:20 PM            19,032 H01_20120906.txt
09/15/2012  01:41 PM            14,697 H09_20120906.txt
09/16/2012  01:56 PM            13,064 H11_20120906.txt
09/17/2012  02:41 PM            20,590 H26_20120906.txt

Solution
‎03-05-2013 02:54 PM
Respected Advisor
Posts: 3,124

Re: Adding modification date to data set from imported text files

That looks normal. I have added a LENGTH statement to make it more robust; try this:

%let file = X:\LI\;
filename ft pipe "dir &file.*.* /t:c /a:-d";

data want;
  informat fa_id $10. st_id $10. pt_name $10. date mmddyy10. time hhmmss10.;
  format createdate date mmddyy10. time time.;
  /* declare fil2read before CATS assigns it, so the path has a known length */
  length fil2read $100;
  infile ft truncover;
  /* in the dir listing, each file line starts with the date; the name starts in column 40 */
  input createdate :mmddyy10. @40 fname :$40.;
  if not missing(createdate) then do;
    fil2read=cats("&file",fname);
    /* FILEVAR= points the second INFILE at the file named in fil2read */
    infile dummy filevar=fil2read end=done delimiter = ',' missover dsd pad;
    do while(not done);
      input fa_id st_id pt_name date time goal min ht bv sat er event;
      output;
    end;
  end;
run;

Haikuo

Occasional Contributor
Posts: 16

Re: Adding modification date to data set from imported text files

That seems to be working until it reaches a text file that has a space in its name (instead of an underscore); then it just quits and nothing is output to the data set. I am not sure how to fix this without manually fixing the file names, but I hope this is on the way to a solution! Thank you so much for your help! I'll update again (to say whether this was the solution) once the data step has finished running. It could be a while, since there are so many files and I am unsure how many errors in the file names may exist.
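
A likely cause of the failure on names with spaces is that list input (the : modifier) stops reading fname at the first blank, so the path built from it points to a file that does not exist. The variant below is a minimal, untested sketch that reads the name with formatted input instead; the widths $100 and $200 are assumptions, and ?? is borrowed from the first post's data step to quiet the notes for header and summary lines.

```sas
%let file = X:\LI\;
filename ft pipe "dir &file.*.txt /t:c /a:-d";

data want;
  informat fa_id $10. st_id $10. pt_name $10. date mmddyy10. time hhmmss10.;
  format createdate date mmddyy10. time time.;
  length fname $100 fil2read $200;
  infile ft truncover;
  /* formatted input ($100.) takes everything from column 40 to the   */
  /* end of the line, embedded blanks included; TRUNCOVER accepts     */
  /* lines that are shorter than the requested width                  */
  input createdate ?? :mmddyy10. @40 fname $100.;
  if not missing(createdate) then do;
    /* CATS strips only leading and trailing blanks from each part */
    fil2read = cats("&file", fname);
    infile dummy filevar=fil2read end=done delimiter=',' missover dsd pad;
    do while (not done);
      input fa_id st_id pt_name date time goal min ht bv sat er event;
      output;
    end;
  end;
run;
```

With this change the file names would not need to be edited by hand, provided the names in the dir listing still begin in column 40.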
