BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sudeep_Neupane
Obsidian | Level 7

Hi, 

I have multiple sas transport files (.stc) that I am trying to import first into the sas data files (.sas7bdat). I am trying to use the macro to import all at once. However, my datafiles are named similar to sn55632-00001_blah-qwerty.stc, sn55632-00002_blah-yuiop.stc, sn55632-00003_blah-asdfg.stc, and so on. So, if you see the pattern here, the numbers after the first hypen are increasing by a unit. That's why I found somewhere that I could do 'mypath/sn55632-0000:.stc' (notice the colon there before .stc). However, it's not working for me for some reasons. Could anyone please guide me how I can go about it? 

Thank you!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

Use CALL EXECUTE with @Tom 's program:

%macro import (filename = );

proc cimport lib=w infile = "&filename";
run;

%mend import;

data files;
  rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
  did=dopen('dir');
  do index=1 to dnum(did);
    length filename $200;
   filename=dread(did,index);
   if 'stc'=scan(filename,-1,'.') then 
   call execute('%import(filename =' !! strip(filename) !! ');');
  end;
  rc=dclose(did);
run;

View solution in original post

18 REPLIES 18
SASKiwi
PROC Star

In SAS the colon is not a true wild card. It allows you to select variables that start with the text to the left of the colon. You can't use it in file names where you need to follow the OS file naming rules. There are other ways of doing what you want but you first need to post a working hard-coded version of your program. 

Sudeep_Neupane
Obsidian | Level 7

 I am working on this private data on a standalone device so I cannot really share exactly what I am doing. However, I can probably write the same code with different filenames to show here. What would be the relevant wildcard operator instead of colon in SAS as you mentioned? 

Sudeep_Neupane
Obsidian | Level 7

option nofmterr;

libname w 'C:\Users\Public\ABCDE\alldata';

 

%macro import (n);

filename transfile 'C:\Users\Public\ABCDE\alldata\sn55632-0000&n.\:.stc';

proc cimport lib=w infile =tranfile;

run;

%mend import;

 

%import (1);

 

This is what I have right now. 

 

The filenames are similar to: 

sn55632-00001_blah-qwerty.stc

 sn55632-00002_blah-yuiop.stc

sn55632-00003_blah-asdfg.stc

.

.

.

sn55632-00049_blah-asdfg.stc

 

 

Tom
Super User Tom
Super User

Looks like you want to build the list of filename first.

If you can run operating system commands then use that.  For example on unix you could use the ls command.

data files;
  infile "ls /users/public/abdcde/alldata/*.stc" pipe truncover;
  input filename $200.;
run;

Or for Window you might use the DIR command.

data files;
  infile "c:; cd C:\Users\Public\ABCDE\alldata; dir /b *.stc" pipe truncover;
  input filename $200.;
run;

Or you might need to use DOPEN() and DREAD() commands instead to get the list of files.

data files;
  rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
  did=dopen('dir');
  do index=1 to dnum(did);
    length filename $200;
   filename=dread(did,index);
   if 'stc'=scan(filename,-1,'.') then output;
  end;
  rc=dclose(did);
run;

Once you have the list of filenames use that to generate the code to process each STC file.

 

What type of files are these STC files?  The STC extension has no fixed meaning.  If you don't know that they are take a look at the first few records in the file.

data _null_;
  infile 'C:\Users\Public\ABCDE\alldata\*.stc' lrecl=80 recfm=f obs=5;
  input;
  list;
run;

 

Sudeep_Neupane
Obsidian | Level 7

Awesome. Thank you!

The 3rd block of code you provided worked great. I might still need some help looping through those names to import them as .sas7bdat files.

 

SASKiwi
PROC Star

Use CALL EXECUTE with @Tom 's program:

%macro import (filename = );

proc cimport lib=w infile = "&filename";
run;

%mend import;

data files;
  rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
  did=dopen('dir');
  do index=1 to dnum(did);
    length filename $200;
   filename=dread(did,index);
   if 'stc'=scan(filename,-1,'.') then 
   call execute('%import(filename =' !! strip(filename) !! ');');
  end;
  rc=dclose(did);
run;
Sudeep_Neupane
Obsidian | Level 7

Thank you @Tom and @SASKiwi. No doubt you all are Super User. I had never utilized the forum before. Well, I guess I am under the shadow of the legends.

Muchas Gracias!

Sudeep_Neupane
Obsidian | Level 7

Further Question: I am using a similar method to sort the data.

 

%macro sort (filename = );

proc sort data=w.&filename;
by ID;
run;

%mend sort;

data files;
  rc=filename('dir', 'C:\Users\Public\ABCDE\alldata');
  did=dopen('dir');
  do index=1 to dnum(did);
    length filename $200;
   filename=dread(did,index);
   if 'sas7bdat'=scan(filename,-1,'.') then 
   call execute('%sort(filename =' !! strip(filename) !! ');');
  end;
  rc=dclose(did);
run;

 

The problem with this is that the filename includes '.sas7bdat', and when it goes to proc sort, it'll throw an error because proc sort data=filename needs the filenames without extensions. How do I do it? I tried doing strip(filename-'.sas7bdat') to see if that's a thing but it didn't work either.

 

Thank you!

 

Tom
Super User Tom
Super User

You can either use SCAN() to extract just the member name from the filename.

call execute(cats('%sort(filename=',scan(filename,1,'.'),')'));

Or change the macro to accept a full dataset name instead of just a member name in the W libref.

%macro sort (dataset);

proc sort data=&dataset ;
by ID;
run;

%mend sort;

And then you could pass in either W.MEMBER or just the quoted physical name.

call execute(cats('%sort(',quote(catx('\','C:\Users\Public\ABCDE\alldata',filename)),')'));

But if you already have a libref pointing to the directory there is no need to look for files.  Just look for datasets.  In fact you should only look for datasets that have the ID variable.

libname w 'C:\Users\Public\ABCDE\alldata';
proc sql;
create table members as
  select libname,memname
  from dictionary.columns
  where libname='W' and upcase(name)='ID'
  order by libname,memname
;
quit;
data _null_;
  set members;
  call execute(catx(' ','proc sort data=',catx('.',libname,memname),';by id;run;'));
run;
Sudeep_Neupane
Obsidian | Level 7
Wonderful. It worked like a charm! And thank you for pointing out the good programming practice out there.
Sudeep_Neupane
Obsidian | Level 7

Is it okay if I ask more questions @Tom ? I feel like I have been learning a lot this way.

 

So, I have 134 datasets (now SAS datasets, thanks to you), all sorted within the same library. Now I need to merge them in two ways.

 

First, I need to merge, them into four phases:

           1st Phase: s12345_0001_ads_qwerty

                             ....

                             s12345_0049_ads_cvbnm

            2nd Phase: s6789_0001_ads2_dfghk

                               ......

                                s6789_0028_ads2_zxcvb

and so on.

If you see the pattern here, the first name before the underscore stays the same, while the number after the underscore increases by 1 throughout the phase.

 

Secondly, I need to merge all 134 of them into one dataset.

 

The problem with using similar macros, in this case, is I guess you have one filename passed to it at once, so SAS doesn't recognize what to merge. While with the SQL, I have hard time assigning dataname into the table, if that's the way.

Also, I have not used loops to run individual files through the merge statement yet.

Thank you!

 

Tom
Super User Tom
Super User

So each dataset has ID and some other set of distinct variables that have different names than the variables in the other similarly named datasets?

data s12345;
  merge w.s12345_: ;
  by id;
run;
data s6789;
  merge w:s6789_:;
  by id;
run;

Are you sure don't want to SET them together instead of MERGEing them?

data s12345;
  set w.s12345_: indsname=indsname;
  by id;
  source=indsname;
run;
Sudeep_Neupane
Obsidian | Level 7
Yes. Each dataset has ID as a unique identifier. There are ~1400 observations altogether i.e. number of participants. 1st phase would have different datasets for different observations for 3 years on those participants.
2nd phase has those observations for another 3years for the same participants. Likewise for the 3rd and 4th phases.
So, if I set them, there will be 4(number of phases)*1400 observations = 6400 observations, while I only want 1400 observations.
Tom
Super User Tom
Super User

If you have 4 years of data for 1,400 subjects then you SHOULD have 4*1,400 = 6,400 observations.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 3095 views
  • 7 likes
  • 3 in conversation