Hi,
I have multiple sas transport files (.stc) that I am trying to import first into the sas data files (.sas7bdat). I am trying to use the macro to import all at once. However, my datafiles are named similar to sn55632-00001_blah-qwerty.stc, sn55632-00002_blah-yuiop.stc, sn55632-00003_blah-asdfg.stc, and so on. So, if you see the pattern here, the numbers after the first hypen are increasing by a unit. That's why I found somewhere that I could do 'mypath/sn55632-0000:.stc' (notice the colon there before .stc). However, it's not working for me for some reasons. Could anyone please guide me how I can go about it?
Thank you!
Use CALL EXECUTE with @Tom 's program:
%macro import (filename = );
proc cimport lib=w infile = "&filename";
run;
%mend import;
data files;
rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
did=dopen('dir');
do index=1 to dnum(did);
length filename $200;
filename=dread(did,index);
if 'stc'=scan(filename,-1,'.') then
call execute('%import(filename =' !! strip(filename) !! ');');
end;
rc=dclose(did);
run;
In SAS the colon is not a true wild card. It allows you to select variables that start with the text to the left of the colon. You can't use it in file names where you need to follow the OS file naming rules. There are other ways of doing what you want but you first need to post a working hard-coded version of your program.
I am working on this private data on a standalone device so I cannot really share exactly what I am doing. However, I can probably write the same code with different filenames to show here. What would be the relevant wildcard operator instead of colon in SAS as you mentioned?
option nofmterr;
libname w 'C:\Users\Public\ABCDE\alldata';
%macro import (n);
filename transfile 'C:\Users\Public\ABCDE\alldata\sn55632-0000&n.\:.stc';
proc cimport lib=w infile =tranfile;
run;
%mend import;
%import (1);
This is what I have right now.
The filenames are similar to:
sn55632-00001_blah-qwerty.stc
sn55632-00002_blah-yuiop.stc
sn55632-00003_blah-asdfg.stc
.
.
.
sn55632-00049_blah-asdfg.stc
Looks like you want to build the list of filename first.
If you can run operating system commands then use that. For example on unix you could use the ls command.
data files;
infile "ls /users/public/abdcde/alldata/*.stc" pipe truncover;
input filename $200.;
run;
Or for Window you might use the DIR command.
data files;
infile "c:; cd C:\Users\Public\ABCDE\alldata; dir /b *.stc" pipe truncover;
input filename $200.;
run;
Or you might need to use DOPEN() and DREAD() commands instead to get the list of files.
data files;
rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
did=dopen('dir');
do index=1 to dnum(did);
length filename $200;
filename=dread(did,index);
if 'stc'=scan(filename,-1,'.') then output;
end;
rc=dclose(did);
run;
Once you have the list of filenames use that to generate the code to process each STC file.
What type of files are these STC files? The STC extension has no fixed meaning. If you don't know that they are take a look at the first few records in the file.
data _null_;
infile 'C:\Users\Public\ABCDE\alldata\*.stc' lrecl=80 recfm=f obs=5;
input;
list;
run;
Awesome. Thank you!
The 3rd block of code you provided worked great. I might still need some help looping through those names to import them as .sas7bdat files.
Use CALL EXECUTE with @Tom 's program:
%macro import (filename = );
proc cimport lib=w infile = "&filename";
run;
%mend import;
data files;
rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
did=dopen('dir');
do index=1 to dnum(did);
length filename $200;
filename=dread(did,index);
if 'stc'=scan(filename,-1,'.') then
call execute('%import(filename =' !! strip(filename) !! ');');
end;
rc=dclose(did);
run;
Further Question: I am using a similar method to sort the data.
%macro sort (filename = );
proc sort data=w.&filename;
by ID;
run;
%mend sort;
data files;
rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
did=dopen('dir');
do index=1 to dnum(did);
length filename $200;
filename=dread(did,index);
if 'sas7bdat'=scan(filename,-1,'.') then
call execute('%sort(filename =' !! strip(filename) !! ');');
end;
rc=dclose(did);
run;
The problem with this is that the filename includes '.sas7bdat', and when it goes to proc sort, it'll throw an error because proc sort data=filename needs the filenames without extensions. How do I do it? I tried doing strip(filename-'.sas7bdat') to see if that's a thing but it didn't work either.
Thank you!
You can either use SCAN() to extract just the member name from the filename.
call execute(cats('%sort(filename=',scan(filename,1,'.'),')'));
Or change the macro to accept a full dataset name instead of just a member name in the W libref.
%macro sort (dataset);
proc sort data=&dataset ;
by ID;
run;
%mend sort;
And then you could pass in either W.MEMBER or just the quoted physical name.
call execute(cats('%sort(',quote(catx('\','C:\Users\Public\ABCDE\alldata',filename)),')'));
But if you already have a libref pointing to the directory there is no need to look for files. Just look for datasets. In fact you should only look for datasets that have the ID variable.
libname w 'C:\Users\Public\ABCDE\alldata';
proc sql;
create table members as
select libname,memname
from dictionary.columns
where libname='W' and upcase(name)='ID'
order by libname,memname
;
quit;
data _null_;
set members;
call execute(catx(' ','proc sort data=',catx('.',libname,memname),';by id;run;'));
run;
Is it okay if I ask more questions @Tom ? I feel like I have been learning a lot this way.
So, I have 134 datasets (now SAS datasets, thanks to you), all sorted within the same library. Now I need to merge them in two ways.
First, I need to merge, them into four phases:
1st Phase: s12345_0001_ads_qwerty
....
s12345_0049_ads_cvbnm
2nd Phase: s6789_0001_ads2_dfghk
......
s6789_0028_ads2_zxcvb
and so on.
If you see the pattern here, the first name before the underscore stays the same, while the number after the underscore increases by 1 throughout the phase.
Secondly, I need to merge all 134 of them into one dataset.
The problem with using similar macros, in this case, is I guess you have one filename passed to it at once, so SAS doesn't recognize what to merge. While with the SQL, I have hard time assigning dataname into the table, if that's the way.
Also, I have not used loops to run individual files through the merge statement yet.
Thank you!
So each dataset has ID and some other set of distinct variables that have different names than the variables in the other similarly named datasets?
data s12345;
merge w.s12345_: ;
by id;
run;
data s6789;
merge w:s6789_:;
by id;
run;
Are you sure don't want to SET them together instead of MERGEing them?
data s12345;
set w.s12345_: indsname=indsname;
by id;
source=indsname;
run;
If you have 4 years of data for 1,400 subjects then you SHOULD have 4*1,400 = 6,400 observations.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.