Hello,
I am writing an import code to turn some files into SAS data sets. The data set will be updated weekly and the new files are appended to the existing data set.
I am using a wildcard on the FILENAME statement but one of my fields needs to be the original file where the observation came from.
I shortened the code for the sake of brevity but I can provide more details if necessary. It looks something like this:
Files to be loaded: file1, file2, file3, file4 etc
FILENAME files "/my/path/*file*.csv";
DATA append;
format file_name $100. update_date $8.f;
INFILE files lrecl=512 pad;
(import code for each field goes here...)
file_name = ???
This is where I need to create a field specifying the file name that each obs came from
for example if the row came from file1 I would need this field to say field_name = 'file1'
Is there any way for sas to do this?
Also, is there a way to list everything that is in 'files' from FILENAME files "/my/path/*file*.csv"; ?
thanks for your help 🙂
I tried your suggestion and it did populate my file_name field, but it filled them all with the first part of the path and not the file name it came from
Your initial code has this:
format file_name $100.
Not sure what you had in the code you actually ran, but if you had that, you would only get the first 100 characters.
Here is the exact code I am running:
FILENAME RAPS "/sasdata/mra/users/gmu/projects/IMPORT_RRR_AUTO/IMPORT_FILES/*RAPS_RETURN_FLAT_SH9698*.DAT";
DATA append;
format file_name $100. update_date $8.f;
INFILE RAPS lrecl=512 pad filename=source/*here*/;
INPUT
@1 RecordID $char3. @;
IF RecordID = "AAA" THEN DO; /*File Header*/
RETAIN SubmitterID FileID TransactionDate ProductionTestIndicator ICDver;
(A lot of irrelevant import/formatting code here similar to above INPUT part...)
file_name = source;
RUN;
field_name = '/sasdata' for every row. It seems to be truncating at 8 characters but I was explicit with 100.
Any ideas why its not running correctly?
Use the FILENAME= option of the INFILE statement.
Please note the documentation says: " Like automatic variables, the FILENAME= variable is not written to the data set. " So you have to do something silly to save this information for later use, like:
infile files lrecl=512 pad filename=tempvar;
savedfilename = tempvar;
I haven't done this in a while, so I can't remember, but you may need a LENGTH statement between INFILE and SAVEDFILENAME=
length savedfilename $ 256;
Try the LENGTH statement. If that doesn't work, show us a screen capture.
Here is the exact code I am running:
FILENAME RAPS "/sasdata/mra/users/gmu/projects/IMPORT_RRR_AUTO/IMPORT_FILES/*RAPS_RETURN_FLAT_SH9698*.DAT";
DATA append;
format file_name $100. update_date $8.f;
INFILE RAPS lrecl=512 pad filename=source;
INPUT
@1 RecordID $char3. @;
IF RecordID = "AAA" THEN DO; /*File Header*/
RETAIN SubmitterID FileID TransactionDate ProductionTestIndicator ICDver;
(A lot of irrelevant import/formatting code here similar to above INPUT part...)
file_name = source;
RUN;
field_name = '/sasdata' for every row. It seems to be truncating at 8 characters but I was explicit with 100.
Any ideas why its not running correctly?
Instead of format, use a length statement:
length file_name source $100;
Keep in mind that SAS uses a default length of 8 for all variables unless no other is specified.
Thank you it is working now, I didnt realize I needed to set the length for source as well. Im still learning sas and programming in general, really appreciate all the help in this thread
@gmuersch wrote:
Thank you it is working now, I didnt realize I needed to set the length for source as well. Im still learning sas and programming in general, really appreciate all the help in this thread
It was actually the SOURCE variable that you needed to make sure was long enough since the simple assignment statement like:
filename = source ;
is the first place the FILENAME is referenced will cause SAS to create FILENAME as the same length as SOURCE.
Note that a FORMAT statement is for attaching formats to your variables to control how they are displayed as text. Most variables do not need formats attached as SAS knows how to display character strings and numbers. Trying to use it to DEFINE the variable only works as a side effect when the FORMAT statement is the first place where the variable is seen. When that happens SAS will guess that the variables length should match the width of the format that is being attached to it.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.