smunigala
Obsidian | Level 7

Dear all,

I need help setting up macros for my analyses. I have 8 data sets with the same variables in each. The only difference is that they come from different locations. I need to clean them up and create subsets from each of those 8 data sets. The cleaning and subsetting steps are exactly the same for each one. I am not familiar with macros, and any help is greatly appreciated.

 

Please see the code that I am looking to run. I need a simpler way to run it than repeating it 8 times. All my data set names end with a location suffix (like _BJH, _Boone).

For example, in the SAS code the location is _BJH; it should be replaced by _WC, _STPETERS, _BOONE, _ALTON, _PWEST, _CHRIST, _MBAP for the remaining 7 locations.

 

Thank you!

 

Proc sort data = surg_BJH; by visit_no; run;

Data uniq_surg_BJH;
Set surg_BJH;
procedure_date_dp=datepart(PROCEDURE_DATE);
procedure_date_tp=timepart(PROCEDURE_DATE);
format PROCEDURE_DATE datetime20. procedure_date_dp date9. procedure_date_tp time8.;
run;

proc sort data=uniq_surg_BJH NODUPKEY out = uniq_surg_BJH1;
 by visit_no procedure_date_dp procedure_date_tp;
run;

Proc sort data = uniq_surg_BJH1; by visit_no procedure_date_dp procedure_date_tp; run;
data uniq_surg_BJH2;
 set uniq_surg_BJH1;
 by visit_no procedure_date_dp procedure_date_tp;
 if first.procedure_date_dp then do;
  if     first.procedure_date_dp NE last.procedure_date_dp
     AND procedure_date_tp = '00:00:00't
   then delete;
  end;
run;
6 REPLIES
Reeza
Super User

Do them all at once, don't use a macro. Note I changed the SET statement and added a variable, source_file, to store the name of the contributing file. In the rest of your code you may want to add the variable SOURCE_FILE to your BY statements but otherwise it doesn't seem like anything in your code would have issues with this method.

 

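/* Output holds all locations, despite the _BJH in its name */
Data uniq_surg_BJH;
/* surg_: reads every data set whose name starts with surg_ ; */
/* INDSNAME= stores the contributing data set name in SOURCE  */
Set surg_: indsname = source;
/* e.g. WORK.SURG_BJH -> SURG_BJH */
source_file = scan(source, 2);
procedure_date_dp=datepart(PROCEDURE_DATE);
procedure_date_tp=timepart(PROCEDURE_DATE);
format PROCEDURE_DATE datetime20. procedure_date_dp date9. procedure_date_tp time8.;
run;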
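
For readers following along, here is a minimal sketch of how the remaining steps from the question might look once SOURCE_FILE is added to the BY statements. It assumes the combined data set created above (still named uniq_surg_BJH, although it holds every location) and uses the hypothetical output names uniq_surg_all1 and uniq_surg_all2, which do not appear in the original code.

```sas
/* Remove exact duplicates within each location.                     */
/* source_file comes first, so identical keys from two different     */
/* locations are never treated as duplicates of each other.          */
proc sort data=uniq_surg_BJH nodupkey out=uniq_surg_all1;  /* hypothetical output name */
  by source_file visit_no procedure_date_dp procedure_date_tp;
run;

data uniq_surg_all2;                                       /* hypothetical output name */
  set uniq_surg_all1;
  by source_file visit_no procedure_date_dp procedure_date_tp;
  /* Same rule as the original code: when a visit has more than one  */
  /* record on a date, drop the first one if its time is midnight.   */
  if first.procedure_date_dp
     and first.procedure_date_dp ne last.procedure_date_dp
     and procedure_date_tp = '00:00:00't
  then delete;
run;
```

Because FIRST. and LAST. flags are evaluated within the preceding BY variables, putting source_file first keeps each location's visits in separate BY groups.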
smunigala
Obsidian | Level 7

Reeza,

Thanks for the quick reply. I need to create different data sets for different locations.

 

Data Uniq_surg_BJH;

Data Uniq_Surg_Alton;

etc. for 8 locations

 

Similarly, I need to create the data sets ending in 1 and 2 (after the proc sort and the other statements) for each location.

Reeza
Super User

Split it at the end of the processing then.

 

Combine > process > split

Split > process 1 > process 2 > process 3 > process 4 > process 5 > process 6 > ... > process 8

 

Which looks more efficient to you? 

 

Here's an example that shows how to split the file using a data step:

https://gist.github.com/statgeek/4bfb7574713bedf4e011
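
The linked gist is not reproduced here. As an independent sketch of the "split" step, one data step can write to several output data sets based on SOURCE_FILE. The input name uniq_surg_all2 is hypothetical (a fully processed, combined data set), and only three of the eight locations are shown; the others would follow the same pattern.

```sas
/* One output data set per location; extend the list as needed. */
data uniq_surg_BJH2 uniq_surg_WC2 uniq_surg_ALTON2;
  set uniq_surg_all2;                 /* hypothetical combined, processed data set */
  /* INDSNAME= values look like WORK.SURG_BJH, so source_file is SURG_BJH */
  select (upcase(source_file));
    when ('SURG_BJH')   output uniq_surg_BJH2;
    when ('SURG_WC')    output uniq_surg_WC2;
    when ('SURG_ALTON') output uniq_surg_ALTON2;
    otherwise;                        /* rows from unlisted locations are not written */
  end;
run;
```

Note that the empty OTHERWISE silently skips any location without a WHEN clause, so every location needs its own entry.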

smunigala
Obsidian | Level 7

Reeza,

I would like to process each data set individually, as there may be similar values for some variables across these data sets, which I am afraid will be deleted when I sort them.

 

Thanks!

mkeintz
PROC Star

Then include the variable source_file as the primary sort key in your proc sort.

Reeza
Super User

@smunigala wrote:

Reeza,

I would like to process each data set individually, as there may be similar values for some variables across these data sets, which I am afraid will be deleted when I sort them.

 

Thanks!


 

That's not possible if the first BY variable is SOURCE_FILE, as indicated in my post.

Try it. If it doesn't work post why and your code and log. 

 

Otherwise, if you want a macro, start one and we can help modify it, but to be honest, I don't want to be the one to convert your code to a macro. In this case it would be a very quick one...in fact, trying to show you the most efficient method is actually more work for me. It would be faster and easier (for me) to type out the answer directly. I'm happy to help, but I don't want to do your work...unless you're paying me (rates start at $100/hr USD).

 

This is the standard macro reference I point users to. It has fully worked examples, and the last one is closest to what you want, although your situation is a plain text replacement. If you want other resources, search lexjansen.com or the documentation.

 

http://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/

 

This should provide enough information to answer your question:

https://support.sas.com/documentation/cdl/en/mcrolref/69726/HTML/default/viewer.htm#n0pfmkjlc3e719n1...

 

Or someone else will probably answer eventually. It's a forum that's open to anyone. 
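
For readers who do want the macro route, a plain text replacement of the location suffix could be sketched as follows. The macro name clean_surg and its parameter loc are hypothetical, and the location list assumes the suffix for PWEST follows the same underscore pattern as the others. The standalone sorts from the original code are omitted because the NODUPKEY sort already orders the data by the BY variables used afterwards.

```sas
%macro clean_surg(loc);   /* hypothetical macro; loc = location suffix without the underscore */

  /* Split the datetime into date and time parts */
  data uniq_surg_&loc;
    set surg_&loc;
    procedure_date_dp = datepart(PROCEDURE_DATE);
    procedure_date_tp = timepart(PROCEDURE_DATE);
    format PROCEDURE_DATE datetime20. procedure_date_dp date9. procedure_date_tp time8.;
  run;

  /* &loc.1 resolves to e.g. BJH1: the period ends the macro variable name */
  proc sort data=uniq_surg_&loc nodupkey out=uniq_surg_&loc.1;
    by visit_no procedure_date_dp procedure_date_tp;
  run;

  /* Drop a midnight record when the same visit has other records that day */
  data uniq_surg_&loc.2;
    set uniq_surg_&loc.1;
    by visit_no procedure_date_dp procedure_date_tp;
    if first.procedure_date_dp
       and first.procedure_date_dp ne last.procedure_date_dp
       and procedure_date_tp = '00:00:00't
    then delete;
  run;

%mend clean_surg;

/* One call per location */
%clean_surg(BJH)
%clean_surg(WC)
%clean_surg(STPETERS)
%clean_surg(BOONE)
%clean_surg(ALTON)
%clean_surg(PWEST)
%clean_surg(CHRIST)
%clean_surg(MBAP)
```

Each call produces uniq_surg_<location>, uniq_surg_<location>1 and uniq_surg_<location>2, which matches the naming in the original question.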
