I have written the macro below, but am new at this. I have run this for one runner but need to do this for 12 000 runners. So I probably need to do this in a do loop.
The 1st data set a contains the times each runner passes over 6 time mats during a race (time_lynn--time_fini. I selected one runner's times (runner 54915)
The 2nd data set ww contain the weather data measured on the day and this data is merged with the runner data to provide the weather values (WBGT) at each of the 6 mats. The 6 WBGT values form a curve for each runner and then the Area under the curve (AUC) is calculated for each runner into his dataset auc54915. My program produce the correct AUC value for an individual runner.
The AUC datasets needs to be combined for all 12 000 runners.
***********************************MACRO FROM HERE*********************************;
%macro wbgt(runner=54915);
****SELECT ONE FINISHER*********;
data one;set a(keep=race_number time_lynn--time_fini);
if race_number=&runner;
drop race_number;
run;
*****TRANSPOSE DATA FOR SELECTED RUNNER***********;
proc transpose data=one prefix=time out=onet;
run;
****RUNNER TIMES AT EACH OF THE 6 WEATHER STATIONS*********;
data onett;set onet(rename=time1=time);
if _NAME_='time_wins' then delete;
if _NAME_='time_lynn' then namen=1;
if _NAME_='time_cato' then namen=2;
if _NAME_='time_drum' then namen=3;
if _NAME_='time_pine' then namen=4;
if _NAME_='time_sher' then namen=5;
if _NAME_='time_fini' then namen=6;
drop _NAME_;
run;
proc sort data=onett;
by namen time;
run;
*************COMBINE THE INDIV RUNNER MAT TIMES WITH THE WEATHER@EACH MAT TIME;
proc sql;
create table wbgt_6indiv&runner as
select a.time, a.namen, b.namen, b.wbgt, b.time as wbgt_t format=time8.2
from onett a
left join
ww b
on a.namen=b.namen
where b.time>=a.time
group by a.time,a.namen
having abs(a.time-b.time)=min(abs(a.time-b.time))
order by a.namen, b.namen descending;
quit;
***CALCULATE THE AUC FOR THE SELECTED RUNNER;
data auc&runner;set wbgt_6indiv&runner;
runner=&runner;
retain sum 0;
sum_wbgt=((lag(wbgt)+wbgt)/2)*(wbgt_t-lag(wbgt_t)) ;
if _N_>1 then sum=sum_wbgt+sum;
if _N_=6 then output;
run;
%mend;
%wbgt;
I don't see a need for a macro. Why can't you just run the code (with some small modifications) of the data set with 12,000 runners? Then, all these macro issues go away.
For example, the PROC TRANSPOSE can be performed with a BY statement, all 12,000 runners' data will get transposed at once. Then data set ONETT can be run on all 12,000 runners' data at once and SQL can be run on all 12,000 runners' data at once.
Proc Transpose would result in a data set with 12,000 records if you use BY RUNNER;
However, if it is not helping, then probably transpose is not needed here, you can skip it and then somehow drop down to the later steps, but I continue to believe (in fact, I am convinced) that macros are not needed.
In fact, this is one of the major advantages of the SAS system, you can process data like this without writing loops. There are many features in SAS that enable processing of individual records across many steps without explicitly writing loops.
If you provide some sample data for like 3 runners, people here can try to develop code without macros to do what you want. Please provide the data as working SAS data step code, which you can type in yourself, or you can follow these instructions.
You should provide a small example of the data sets involved.
I suspect that no macro is needed. BY group processing would likely get what you need but can't tell for sure without example data.
Proc sort data=a; by race_number; run; proc transpose data=a (keep=race_number tim_lynn-tim_fini) out=onet; by race_number; run;
Then calculate your variables in Onett the same.
Change the sort to:
proc sort data=onett; by race_number namen time; run;
And finish with something like
proc sql; create table wbgt_6indiv as select a.race_number, a.time, a.namen, b.namen, b.wbgt, b.time as wbgt_t format=time8.2 from onett a left join ww b on a.namen=b.namen where b.time>=a.time group by a.race_number, a.time,a.namen having abs(a.time-b.time)=min(abs(a.time-b.time)) order by a.race_number, a.namen, b.namen descending; quit; data auc; set wbgt_6indiv; by race_number; runner=race_number; retain sum 0; lt = lag(wbgt); ltt = lag(wbgt_t); if first.race_number then do; sum=0; sum_wbgt=(wbgt)/2)*(wbgt_t)) ; end; else do; sum_wbgt=((lt)+wbgt)/2)*(wbgt_t-ltt)) ; end sum=sum_wbgt+sum; if last.race_number then output; run;
I'm not going to try to rewrite all of your code, but I still don't see anything that stops working if you do it all at once. You might need to add in something to your SQL so it operates by race_number rather than expecting a single race number, but this output data set ONET from my brief code should be usable in its entirety in each of your subsequent steps.
data have;
input Race_number (time_lynn time_cato time_drum time_pine time_sher time_fini) (time8.);
cards;
28951 6:28:53 7:22:32 8:13:22 9:40:28 10:24:47 11:00:37
17472 6:28:52 7:22:29 8:13:22 9:40:28 10:25:52 11:03:45
54169 6:29:31 7:22:32 8:13:23 9:42:11 10:30:13 11:06:24
31020 6:28:52 7:22:30 8:13:23 9:42:51 10:31:07 11:08:06
19952 6:28:53 7:22:33 8:13:32 9:47:14 10:35:03 11:11:35
30807 6:28:54 7:22:33 8:13:25 9:41:33 10:31:41 11:13:27
28773 6:28:52 7:22:32 8:13:24 9:41:07 10:29:52 11:14:56
44452 6:29:10 7:22:39 8:13:31 9:49:06 10:38:40 11:16:09
56834 6:28:53 7:22:28 8:15:04 9:51:15 10:40:29 11:17:03
29774 6:28:54 7:22:33 8:13:27 9:42:18 10:36:43 11:17:19
;
proc transpose data=have prefix=time out=onet;
by race_number notsorted;
run;
I agree with others, using BY group processing is likely more efficient in the long run.
Here's a method that answers the question asked:
https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md
Basically for you - use CALL EXECUTE to call for all runners.
Then you can combine your results into one dataset using:
data auc_combined;
set auc:;
run;
This will combine all data sets that start with AUC.
If you do go down the macro approach, I would highly recommend adding a step to your code to not create intermediary data sets for each runner and to only keep the final results. That's a ton of wasted space on your machine. Basically use proc data sets to delete intermediary data sets.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.