BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

So the dataset you got from the directory should look like

data fnames;
infile datalines dlm=',';
input fname :$200. product :$12.;
datalines;
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521,PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024,PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743,PRODUCT3
;

You can read everything into one dataset with this:

data all_products;
set fnames;
fname = "&stout.\" !! trim(fname);
infile dummy filevar=fname dlm="|" end=done; /* add additional options as necessary */
do until (done);
  input
    acct_id :$30.
    column2 :4.
    column3 :$6.
    bal :20.
  ;
  output;
end;
run;

The product will be automatically set from the dataset FNAMES.

After this, you can perform all calculations from this comprehensive dataset.

View solution in original post

6 REPLIES 6
andreas_lds
Jade | Level 19

I am not sure i understand what you mean. If you have a dataset and want know which file was read to create the dataset, then @Kurt_Bremser already gave the answer. If have list of files in a dataset and want to read them: yes that is possible, have a look at the documentation of the infile-statement. especially the filevar-option.

Kurt_Bremser
Super User

@andreas_lds has a point here; this would look like

data filenames;
input fname $50.;
datalines;
filename1
filename2
filename3
;

data all;
set filenames;
infile dummy filevar=fname end=done;
do until (done);
  input var1 var2 var3;
  output;
end;
run;

see the INFILE Statement

DritanB
Fluorite | Level 6

Hi, thank you for your input!

 

Here is what I am trying to do (sorry did not provide more details in my original post last night):


- In a folder (Windows), I have a list a of csv files; their filenames contain specific keywords (representing what product they are), the month end date (for example: 20200630) and date/time stamp when they were created (for example: 20200710.180656).

 

The structure of these filenames within this folder is as follow:

 

PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.174612

PROD.CNTR_PRODUCT1.PARAMETER.20200630.csv.20200717.180610

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT1.20200630.csv.20200717.180615

 

PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024

PROD.CNTR_PRODUCT2.PARAMETER.20200630.csv.20200717.184315

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT2.20200630.csv.20200717.184320

 

PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.184942

PROD.CNTR_PRODUCT3.PARAMETER.20200630.csv.20200717.190211

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT3.20200630.csv.20200717.190216

 

PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521

PROD.CNTR_PRODUCT1.PARAMETER.20200630.csv.20200717.193221

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT1.20200630.csv.20200717.193226

 

PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743

PROD.CNTR_PRODUCT3.PARAMETER.20200630.csv.20200717.194201

PROD.APPtriggerfile.PROD - Dept.CNTR_PRODUCT3.20200630.csv.20200717.194206

 

Task 1:
Out of hundreds of these files in this folder, I have to select very few (about 19); these files should be:

- only DATA files,

- only for specific products as per specified list (Product1, 2...)

- only the latest created files.

 

I am able to do this and I have created a dataset with two column: one containing the full filename and the other containing only the respective product name (if needed I can include other information like month-end-date, datestamp, timestamp...):

 

PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521

PRODUCT1

 

PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024

PRODUCT2

 

PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743

PRODUCT3

 

Task 2: I have to:
a - import them as sas datasets (they are csv files)
b - perform some calculations with some of the data contained in these datasets.

 

The generic code for Task 2/b is as follow:

 

%LET STOUT=\\NAS_SHARED_LOCATION\Folder1\Folder2;

 

%macro calc(lbl=);

/*Reading csv files into dataset*/

data work.incsv;
infile "&STOUT\&flname" dlm='|,' dsd firstobs=2;
input acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
run;

 

/*Creating the results*/
proc sql;
create table outreslt as
select "&lbl" as Product length=12 label='Product',
count(*) as Total_Records format=comma10. label='Total Count',
sum(bal) as Total_BAL format=dollar16. label='Total Amount'
from work.incsv;

quit;

 

/*Updating Results*/

data Output_results;
set Output_results outreslt;
run;


%mend calc;

%calc(lbl="Product1");
%calc(lbl="Product2");
%calc(lbl="Product3");
...

%calc(lbl="Product19");

 

/*Printing Results*/

proc print data=Output_results;
run;

 

My question is about Task 2/a: is it possible to pass the values from the dataset created in Task 1 to the infile statement used in Task 2/b, specicilally to &flname variable?

 

Thank you!

Kurt_Bremser
Super User

So the dataset you got from the directory should look like

data fnames;
infile datalines dlm=',';
input fname :$200. product :$12.;
datalines;
PROD.CNTR_PRODUCT1.DATA.20200630.csv.20200717.191521,PRODUCT1
PROD.CNTR_PRODUCT2.DATA.20200630.csv.20200717.183024,PRODUCT2
PROD.CNTR_PRODUCT3.DATA.20200630.csv.20200717.193743,PRODUCT3
;

You can read everything into one dataset with this:

data all_products;
set fnames;
fname = "&stout.\" !! trim(fname);
infile dummy filevar=fname dlm="|" end=done; /* add additional options as necessary */
do until (done);
  input
    acct_id :$30.
    column2 :4.
    column3 :$6.
    bal :20.
  ;
  output;
end;
run;

The product will be automatically set from the dataset FNAMES.

After this, you can perform all calculations from this comprehensive dataset.

DritanB
Fluorite | Level 6

Thank you very much @Kurt_Bremser!

 

I was also able to modify my code, by entering a proc sql on top of the macro calc (before reading the datasets) and introducing the variable (which I used in INFILE statement) inside this proc sql:

 

 

%macro calc(lbl=);

proc sql noprint;
select file_name
into :fname
from latest_filenames
where segment = &lbl.;
quit;

 

/*Reading csv files into dataset*/

data work.incsv;
infile "&STOUT.\&fname." dlm='|,' dsd firstobs=2;

input acct_id :$30.
column2 :4.
column3 :$6.
bal :20.
;
run;

 

/*Perform calculations*/

 

/*Update results dataset*/

 

%mend calc;
%calc(lbl='PRODUCT1');
%calc(lbl='PRODUCT2');
%calc(lbl='PRODUCT3');

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 968 views
  • 1 like
  • 3 in conversation