Hi Shell,
You have added the concatenation of ".txt" to the code unnecessarily.
My Code:
DATA _NULL_;
INFILE FILES TRUNCOVER END=LAST;
INPUT FNAME $5.;
I+1;
CALL SYMPUT("FNAME"||TRIM(LEFT(PUT(I,8.))),TRIM(FNAME));
IF LAST THEN CALL SYMPUT("TOTAL",TRIM(LEFT(PUT(I,8.))));
RUN;
Your Code:
data _null_;
infile files truncover end=last;
input fname $5.;
i+1;
call symput("fname"||trim(left(put(i,8.))),trim(fname)||".txt");
if last then call symput("total",trim(left(put(i,8.))));
call symput("fnamedate"||trim(left(put(i,8.))),substr(trim(fname),15,8));
run;
If you want to understand what the values look like that you are reading in with the INFILE statement, output the results to a dataset:
data testing;
infile files truncover end=last;
input fname $5.;
i+1;
call symput("fname"||trim(left(put(i,8.))),trim(fname)||".txt");
if last then call symput("total",trim(left(put(i,8.))));
call symput("fnamedate"||trim(left(put(i,8.))),substr(trim(fname),15,8));
run;
If you have a problem with the code moving forward, please state what the problem is, not just the fact that it won't work. Error messages and the like will help identify what the problem is and how to resolve it. It will also help us teach you to debug your code better.
Regards,
Scott
Hi Scott, thanks so much for replying. The reason I added the extension was that the error message indicated "ERROR: Physical file does not exist, F:\TEST\Data\Corrections\Corr2." Adding the .txt allowed the code to run.
What I mean by it not working is that the results are not as expected. There are 4 correction files: the first 3 all edit the same abstract but different fields, and the 4th is a new abstract altogether. However, nxg1 contains only the last abstract, and I would have expected 2 abstracts, the second being the final version of the other abstract with all 3 field changes. What I would like to happen is that the master dataset is updated by each correction file in sequence, because if the same field is edited twice from the original, the second correction is the value I want. Not updating in sequence might mean only the 1st correction is applied, and that's not what I want. Does that make sense? Thanks.
I'll echo my suggestion from earlier.
Read all the files into one appended dataset and then do a mass update. You can order the files in the appended dataset based on the file name or other relevant logic before the update, to control the process. This way no macro is needed.
data try01;
length filename txt_file_name $256;
retain txt_file_name;
infile "Path\*.txt" eov=eov filename=filename truncover;
input @;                                     /* hold the record so the source file name can be captured */
if _n_ eq 1 or eov then do;                  /* first record overall, or first record of a new file */
txt_file_name = scan(filename, -2, ".\");    /* keep the file name without the path and extension */
eov = 0;                                     /* reset the flag so the next file change is detected */
end;
input
/* place your input statement code here */
;
run;
Then do the update here.
data have;
update try01;
run;
If he were merely appending datasets together, this approach would work; however, I don't believe you can update a dataset once the data is contained within it, hence the macro.
You also require 2 datasets in order to do an update.
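As a minimal sketch (the dataset and variable names here are placeholders, not from the posted data), the two-dataset form looks something like this:
proc sort data = master; by id; run;
proc sort data = transactions; by id; run;
data master_updated;
update master transactions;   /* master first, then the transaction dataset */
by id;                        /* both datasets must be sorted by the BY variables */
run;
By default, missing values in the transaction dataset do not overwrite existing values in the master, and when a BY group has several transactions they are applied in order, so the last non-missing value wins.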
I'm fairly sure 'he' is a 'she'.
My understanding is there was an original submission file, and these are the corrections. Read all the corrections into a single file and then update the original submission file would be what I meant.
Why do you assume that he is a she?
Very cool method though, I didn't know you could do it all in one mass update. It is great to learn something new.
Given that is the case, I am a fan of the FILEVAR method.
%LET IMPORT_FILE_SUB = 'E:\TEST\DATA\SUBMISSION\*.TXT';
%LET IMPORT_FILE_CORR = 'E:\TEST\DATA\CORRECTIONS\*.TXT';
DATA IMPORTA; INFILE &IMPORT_FILE_SUB. TRUNCOVER LRECL = 5000 FIRSTOBS=1;
INPUT @1 CHART $6. @7 ACCT $6. @13 SEX $1. @14 AGE $2. @16 DATAA $2. @18 DATAB $2. @20 DATAC $2.;
RUN;
FILENAME FILES PIPE "DIR E:\TEST\DATA\CORRECTIONS\*.TXT /B /O:N";
DATA HAVE;
INFILE FILES TRUNCOVER END=LAST;
INPUT FNAME $200.;
FNAME = "E:\TEST\DATA\CORRECTIONS\"||TRIM(LEFT(FNAME));
RUN;
DATA CORRECTIONS;
SET HAVE;
INFILE IN FILEVAR = FNAME END = EOF
LRECL=21
TRUNCOVER
;
DO UNTIL (EOF);
INPUT @1 CHART $6. @7 ACCT $6. @13 SEX $1. @14 AGE $2. @16 DATAA $2. @18 DATAB $2. @20 DATAC $2.;
OUTPUT;
END;
RUN;
PROC SORT DATA = IMPORTA;
BY CHART ACCT;
RUN;
PROC SORT DATA = CORRECTIONS;
BY CHART ACCT;
RUN;
DATA WANT;
UPDATE IMPORTA CORRECTIONS;
BY CHART ACCT;
RUN;
To address the problem you were having with your macro code: you were updating the importa file on every iteration and overwriting the test dataset, when you should have been updating the test dataset.
proc sort data=importa;
by chart acct;
run;
proc sort data=nxg1;
by chart acct;
run;
data test;
update importa nxg1;
by chart acct;
run;
Just saw that SHELLP55 resolves to Shelley in the forum, so I apologise for calling you a 'he', shellp.
I wonder if the second PROC SORT for the corrections also needs another variable to order by. If a record has multiple corrections, the order they are applied in matters. I *think* I'd also sort by the file name; see the sketch below.
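Roughly like this, assuming FNAME (the full path built in the HAVE step) is still carried on the CORRECTIONS dataset, which it should be since that step SETs HAVE:
proc sort data = corrections;
by chart acct fname;   /* fname assumed carried over from HAVE; later file names sort last within chart/acct */
run;
data want;
update importa corrections;
by chart acct;         /* with several corrections per chart/acct, the last one applied wins */
run;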
I prefer the filevar option with the wildcard on the read statement to create a single file; then you don't have to worry about pipes and different issues on different operating systems.
I agree. I did a trial earlier and it didn't seem to matter on my test dataset, but it's better to be safe than sorry.
Hi, wow, a lot happened while I was on holidays; thanks to all contributors! Firstly, yes, I am a "she", but no apologies necessary, Scott, because you can call me whatever you like as long as you're willing to help! Reeza, I understand your concept, but I think it may be two updates, i.e. each correction file has to be imported in sequence, with anything already existing overwritten by the most current correction. Then the correction file is applied to the original to update it. I'm going to try Reeza's code from Jun 27, but I'm not totally understanding what it's doing. Is var1 (through whatever) the filename? What if you don't know how many files there are, i.e. this volume increases monthly? Thanks again to everyone!
It doesn't matter how many files there are; this will read all of them. There's probably a sort order based on the file names, though, or some other method. The key is reading all the updates into one file and then updating once. The code from Scott is better, I believe... mine's just skeleton code, while his is set up for your data explicitly.
Hi, thanks very much, Reeza... okay, I'll use Scott's and see how I do! Thanks again to you and everyone!
EDIT: Just letting everyone know that Scott's method worked exactly as I wanted it to. Thanks to all of you so much for sticking with me on this!