SAS Juletip #9 this year is an introduction on ways to easily read and write multiple sources of data with filename wildcards, dynamic filenames, and keeping track of the source of the data.
Hopefully a few tips or recaps of forgotten features 😉
The examples are very simple, but hopefully, you get the general idea and can adjust them to your own needs.
You should be able to copy, paste, and run the code in your favorite SAS Editor and SAS Version ....
A1) generate two flat files, naming the files based on the data
SAS PGM |
SAS log (partly) |
* Create flatfile by groups - Method 1 *;
%macro makefile(gender);
data _null_;
* Only select some of the data *;
set sashelp.class(where=(sex="&gender"));
* create variable containing dynamic filename *;
filename=cats('C:\temp\gender_',sex,'.csv');
* use the custom filename when creating the files *;
file dynfile filevar=filename dlm=";";
put name age sex height weight;
run;
%mend;
%makefile(M);
%makefile(F);
|
NOTE: The file DYNFILE is: Filename=C:\temp\gender_M.csv,
NOTE: 10 records were written to the file DYNFILE.
NOTE: There were 10 observations read from the data set SASHELP.CLASS. WHERE sex='M';
|
A2) Creating 2 flat files named by content, with no macro
SAS PGM |
SAS Log (partly) |
* Method 2 *; * require that the data is sorted by groups *;
proc sql;
create view _temp_ as
select *
from sashelp.class
order by sex;
quit;
data _null_;
set work._temp_;
* variable containing dynamic filename based on content *;
filename=cats('C:\temp\gender_',sex,'.csv');
file dynfile filevar=filename dlm=";";
put name age sex height weight;
run;
|
NOTE: The file DYNFILE is: Filename=C:\temp\gender_F.csv,
NOTE: The file DYNFILE is: Filename=C:\temp\gender_M.csv,
NOTE: 9 records were written to the file DYNFILE.
NOTE: 10 records were written to the file DYNFILE.
|
B) Reading files with wildcard and tracking origin
SAS PGM |
SAS log (partly) |
data all_genders;
length _filename_ $40;
infile "c:\temp\gender_*.csv" dlm=';' filename=_filename_;
input name $ age sex $ height weight;
filename=scan(_filename_,-2,'\.');
run;
|
NOTE: The infile "c:\temp\gender_*.csv" is: Filename=c:\temp\gender_F.csv, File List=c:\temp\gender_*.csv
NOTE: The infile "c:\temp\gender_*.csv" is: Filename=c:\temp\gender_M.csv, File List=c:\temp\gender_*.csv
NOTE: 9 records were read from the infile "c:\temp\gender_*.csv".
NOTE: 10 records were read from the infile "c:\temp\gender_*.csv".
NOTE: The data set WORK.ALL_GENDERS has 19 observations and 6 variables.
|
C) Which SAS Dataset did the data originate from, reading SAS data with wildcard
SAS PGM |
SAS output (partly) |
* create simple data for example *;
data work.gender_male
work.gender_female;
set sashelp.class;
if sex='F' then output gender_female;
else if sex='M' then output gender_male;
run;
* Read data and mark the origin *;
data class;
* read all data sets beginning with "gender_"*;
set gender_: indsname=_source_;
source_data=_source_;
run;
|
Alice |
F |
13 |
56.5 |
84.0 |
WORK.GENDER_FEMALE |
Barbara |
F |
13 |
65.3 |
98.0 |
WORK.GENDER_FEMALE |
.. |
.... |
. |
.. |
... |
... |
...... |
Thomas |
M |
11 |
57.5 |
85.0 |
WORK.GENDER_MALE |
William |
M |
15 |
66.5 |
112.0 |
WORK.GENDER_MALE |
|
From the SAS Documentation
"Very well done if you got all of those at home." - Richard Osman / Pointless