So the dataset you got from the directory should look like
data fnames;
infile datalines dlm=',';
input fname :$200. product :$12.;
datalines;
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521,PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024,PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743,PRODUCT3
;
You can read everything into one dataset with this:
data all_products;
set fnames;
fname = "&stout.\" !! trim(fname);
infile dummy filevar=fname dlm="|" end=done; /* add additional options as necessary */
do until (done);
input
acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
output;
end;
run;
The product will be automatically set from the dataset FNAMES.
After this, you can perform all calculations from this comprehensive dataset.
No. The dataset has no information how it was created.
Unless someone did the extra work to store this (manually) in the dataset label.
I am not sure i understand what you mean. If you have a dataset and want know which file was read to create the dataset, then @Kurt_Bremser already gave the answer. If have list of files in a dataset and want to read them: yes that is possible, have a look at the documentation of the infile-statement. especially the filevar-option.
@andreas_lds has a point here; this would look like
data filenames;
input fname $50.;
datalines;
filename1
filename2
filename3
;
data all;
set filenames;
infile dummy filevar=fname end=done;
do until (done);
input var1 var2 var3;
output;
end;
run;
see the INFILE Statement
Hi, thank you for your input!
Here is what I am trying to do (sorry did not provide more details in my original post last night):
- In a folder (Windows), I have a list a of csv files; their filenames contain specific keywords (representing what product they are), the month end date (for example: 20200630) and date/time stamp when they were created (for example: 20200710.180656).
The structure of these filenames within this folder is as follow:
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.174612
PROD.CNTR_PRODUCT1.PARAMETER.20200630.csv.20200717.180610
PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT1.20200630.csv.20200717.180615
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024
PROD.CNTR_PRODUCT2.PARAMETER.20200630.csv.20200717.184315
PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT2.20200630.csv.20200717.184320
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.184942
PROD.CNTR_PRODUCT3.PARAMETER.20200630.csv.20200717.190211
PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT3.20200630.csv.20200717.190216
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521
PROD.CNTR_PRODUCT1.PARAMETER.20200630.csv.20200717.193221
PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT1.20200630.csv.20200717.193226
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743
PROD.CNTR_PRODUCT3.PARAMETER.20200630.csv.20200717.194201
PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT3.20200630.csv.20200717.194206
Task 1:
Out of hundreds of these files in this folder, I have to select very few (about 19); these files should be:
- only DATA files,
- only for specific products as per specified list (Product1, 2...)
- only the latest created files.
I am able to do this and I have created a dataset with two column: one containing the full filename and the other containing only the respective product name (if needed I can include other information like month-end-date, datestamp, timestamp...):
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521
PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024
PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743
PRODUCT3
Task 2: I have to:
a - import them as sas datasets (they are csv files)
b - perform some calculations with some of the data contained in these datasets.
The generic code for Task 2/b is as follow:
%LET STOUT=\\NAS_SHARED_LOCATION\Folder1\Folder2;
%macro calc(lbl=);
/*Reading csv files into dataset*/
data work.incsv;
infile "&STOUT\&flname" dlm='|,' dsd firstobs=2;
input acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
run;
/*Creating the results*/
proc sql;
create table outreslt as
select "&lbl" as Product length=12 label='Product',
count(*) as Total_Records format=comma10. label='Total Count',
sum(bal) as Total_BAL format=dollar16. label='Total Amount'
from work.incsv;
quit;
/*Updating Results*/
data Output_results;
set Output_results outreslt;
run;
%mend calc;
%calc(lbl="Product1");
%calc(lbl="Product2");
%calc(lbl="Product3");
...
%calc(lbl="Product19");
/*Printing Results*/
proc print data=Output_results;
run;
My question is about Task 2/a: is it possible to pass the values from the dataset created in Task 1 to the infile statement used in Task 2/b, specicilally to &flname variable?
Thank you!
So the dataset you got from the directory should look like
data fnames;
infile datalines dlm=',';
input fname :$200. product :$12.;
datalines;
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521,PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024,PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743,PRODUCT3
;
You can read everything into one dataset with this:
data all_products;
set fnames;
fname = "&stout.\" !! trim(fname);
infile dummy filevar=fname dlm="|" end=done; /* add additional options as necessary */
do until (done);
input
acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
output;
end;
run;
The product will be automatically set from the dataset FNAMES.
After this, you can perform all calculations from this comprehensive dataset.
Thank you very much @Kurt_Bremser!
I was also able to modify my code, by entering a proc sql on top of the macro calc (before reading the datasets) and introducing the variable (which I used in INFILE statement) inside this proc sql:
%macro calc(lbl=);
proc sql noprint;
select file_name
into :fname
from latest_filenames
where segment = &lbl.;
quit;
/*Reading csv files into dataset*/
data work.incsv;
infile "&STOUT.\&fname." dlm='|,' dsd firstobs=2;
input acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
run;
/*Perform calculations*/
/*Update results dataset*/
%mend calc;
%calc(lbl='PRODUCT1');
%calc(lbl='PRODUCT2');
%calc(lbl='PRODUCT3');
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.