BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

So the dataset you got from the directory should look like

data fnames;
infile datalines dlm=',';
input fname :$200. product :$12.;
datalines;
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521,PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024,PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743,PRODUCT3
;

You can read everything into one dataset with this:

data all_products;
set fnames;
fname = "&stout.\" !! trim(fname);
infile dummy filevar=fname dlm="|" end=done; /* add additional options as necessary */
do until (done);
  input
    acct_id :$30.
    column2 :4.
    column3 :$6.
    bal :20.
  ;
  output;
end;
run;

The product will be automatically set from the dataset FNAMES.

After this, you can perform all calculations from this comprehensive dataset.

View solution in original post

6 REPLIES 6
andreas_lds
Jade | Level 19

I am not sure i understand what you mean. If you have a dataset and want know which file was read to create the dataset, then @Kurt_Bremser already gave the answer. If have list of files in a dataset and want to read them: yes that is possible, have a look at the documentation of the infile-statement. especially the filevar-option.

Kurt_Bremser
Super User

@andreas_lds has a point here; this would look like

data filenames;
input fname $50.;
datalines;
filename1
filename2
filename3
;

data all;
set filenames;
infile dummy filevar=fname end=done;
do until (done);
  input var1 var2 var3;
  output;
end;
run;

see the INFILE Statement

DritanB
Fluorite | Level 6

Hi, thank you for your input!

 

Here is what I am trying to do (sorry did not provide more details in my original post last night):


- In a folder (Windows), I have a list a of csv files; their filenames contain specific keywords (representing what product they are), the month end date (for example: 20200630) and date/time stamp when they were created (for example: 20200710.180656).

 

The structure of these filenames within this folder is as follow:

 

PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.174612

PROD.CNTR_PRODUCT1.PARAMETER.20200630.csv.20200717.180610

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT1.20200630.csv.20200717.180615

 

PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024

PROD.CNTR_PRODUCT2.PARAMETER.20200630.csv.20200717.184315

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT2.20200630.csv.20200717.184320

 

PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.184942

PROD.CNTR_PRODUCT3.PARAMETER.20200630.csv.20200717.190211

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT3.20200630.csv.20200717.190216

 

PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521

PROD.CNTR_PRODUCT1.PARAMETER.20200630.csv.20200717.193221

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT1.20200630.csv.20200717.193226

 

PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743

PROD.CNTR_PRODUCT3.PARAMETER.20200630.csv.20200717.194201

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT3.20200630.csv.20200717.194206

 

Task 1:
Out of hundreds of these files in this folder, I have to select very few (about 19); these files should be:

- only DATA files,

- only for specific products as per specified list (Product1, 2...)

- only the latest created files.

 

I am able to do this and I have created a dataset with two column: one containing the full filename and the other containing only the respective product name (if needed I can include other information like month-end-date, datestamp, timestamp...):

 

PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521

PRODUCT1

 

PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024

PRODUCT2

 

PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743

PRODUCT3

 

Task 2: I have to:
a - import them as sas datasets (they are csv files)
b - perform some calculations with some of the data contained in these datasets.

 

The generic code for Task 2/b is as follow:

 

%LET STOUT=\\NAS_SHARED_LOCATION\Folder1\Folder2;

 

%macro calc(lbl=);

/*Reading csv files into dataset*/

data work.incsv;
infile "&STOUT\&flname" dlm='|,' dsd firstobs=2;
input acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
run;

 

/*Creating the results*/
proc sql;
create table outreslt as
select "&lbl" as Product length=12 label='Product',
count(*) as Total_Records format=comma10. label='Total Count',
sum(bal) as Total_BAL format=dollar16. label='Total Amount'
from work.incsv;

quit;

 

/*Updating Results*/

data Output_results;
set Output_results outreslt;
run;


%mend calc;

%calc(lbl="Product1");
%calc(lbl="Product2");
%calc(lbl="Product3");
...

%calc(lbl="Product19");

 

/*Printing Results*/

proc print data=Output_results;
run;

 

My question is about Task 2/a: is it possible to pass the values from the dataset created in Task 1 to the infile statement used in Task 2/b, specicilally to &flname variable?

 

Thank you!

Kurt_Bremser
Super User

So the dataset you got from the directory should look like

data fnames;
infile datalines dlm=',';
input fname :$200. product :$12.;
datalines;
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521,PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024,PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743,PRODUCT3
;

You can read everything into one dataset with this:

data all_products;
set fnames;
fname = "&stout.\" !! trim(fname);
infile dummy filevar=fname dlm="|" end=done; /* add additional options as necessary */
do until (done);
  input
    acct_id :$30.
    column2 :4.
    column3 :$6.
    bal :20.
  ;
  output;
end;
run;

The product will be automatically set from the dataset FNAMES.

After this, you can perform all calculations from this comprehensive dataset.

DritanB
Fluorite | Level 6

Thank you very much @Kurt_Bremser!

 

I was also able to modify my code, by entering a proc sql on top of the macro calc (before reading the datasets) and introducing the variable (which I used in INFILE statement) inside this proc sql:

 

 

%macro calc(lbl=);

proc sql noprint;
select file_name
into :fname
from latest_filenames
where segment = &lbl.;
quit;

 

/*Reading csv files into dataset*/

data work.incsv;
infile "&STOUT.\&fname." dlm='|,' dsd firstobs=2;

input acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
run;

 

/*Perform calculations*/

 

/*Update results dataset*/

 

%mend calc;
%calc(lbl='PRODUCT1');
%calc(lbl='PRODUCT2');
%calc(lbl='PRODUCT3');

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1196 views
  • 1 like
  • 3 in conversation