DATA Step, Macro, Functions and more

I cannot figure out how to automate a process

Reply
Occasional Contributor
Posts: 13

I cannot figure out how to automate a process

Hi everyone!

I am trying to automate a process in SAS 9.3, and I can’t figure out how to do it. Basically, I have a huge file with a large amount of IDs, and for each id there is a number of observations that varies from one ID to another. Therefore, I pick a sample with 1000 IDs, and I want to try different functions (the difference between the different functions is the β coefficient) that apply to every observation among these IDs. Then I sum the results for each observation, within a specific ID, across 101 values (that depend on the ID). These values are incremented based on two values (t_min and t_max) that depend on the ID. The next step is finding the maximum value across the sums (for each ID) and find the corresponding T. I would like to be able to increment β in an automatic way instead of replacing it manually and running the program manually. I hope I am clear enough. I thought I could use a loop but I did not figure out one that worked. Below is the code that I have now, it functions correctly:

*Select sample with 1000 IDs;

proc means data=mywork.data min max median;

   class id;

   var T;

   output out=mywork.datasorted min=T_min max=T_max median=T_median;

run;

data mywork.sample;

set mywork.datasorted;

random=ranuni(860244);

run;

proc sort data=mywork.sample;

by random;

run;

data mywork.sample;

set mywork.sample;

if _N_ le 1000;

run;

*Create table with the 1000 observations + the missing data;

proc sql;

create table mywork.datawithoutST as

select *

from mywork.data1000 a, mywork.sample b

where a.id=b.id;

quit;

*Add ST data (a specific value, there is one per ID);

proc sql;

create table mywork.data as

select *

from mywork.datawithoutST a, mywork.ST b

where a.id=b.id;

quit;

*Calculate function;

data mywork.Function3;

set mywork.Data;

array T_array {99} T1-T99;

array Function3_array {99} Function3_1-Function3_99;

Function3_min=-max(0,T_min - T) – β*sqrt(max(0,T - T_min));

do i=1 to 99;

T_array=T_min+i*(T_max - T_min)/100;

Function3_array=-max(T_array - T,0) - β*sqrt(max(T - T_array,0));

end;

Function3_max=-max(T_max - T,0) - β*sqrt(max(T - T_max,0));

max_Function3 = max(Function3_min,max(of Function3_1-Function3_99),Function3_max);

do i=1 to 99;

if Function3_array=max_Function3 then do;

BestT_Function3p=vname(T_array);

BestT_Function3=T_array;

output; leave; end; end;

if Function3_min=max_Function3 then BestT_Function3=T_min;

else if Function3_max=max_Function3 then BestT_Function3=T_max;

drop i random _TYPE_;

run;

*Sum utilities for each flight;

proc sort data=mywork.Function3;

by id;

run;

proc means data=mywork.Function3 noprint;

var Function3_min Function3_1-Function3_99 Function3_max;

output out=mywork.Function3_1 (drop=_type_ _freq_)

sum(Function3_min Function3_1-Function3_99 Function3_max)=SumFunction3_min SumFunction3_1-SumFunction3_99 SumFunction3_max;

by id;

run;

*Find the maximum of the sums:

data mywork.Function3_2;

array SumFunction3_array {99} SumFunction3_1-SumFunction3_99;

set mywork.Function3_1;

Max=max(SumFunction3_min,max(of SumFunction3_1-SumFunction3_99),SumFunction3_max);

do i=1 to 99;

if SumFunction3_array=Max then do;

BestT_Function3p=vname(SumFunction3_array);

BestT_Function3=SumFunction3_array;

output; leave; end; end;

if SumFunction3_min=Max then BestT_Function3=SumFunction3_min;

else if Function3_max=Max then BestT_Function3=SumFunction3_max;

drop BestT_Function3p BestT_Function3 Function3_max;

run;

*datafilter is a file containing the times for each ID (I just want one row per ID);

proc sql;

create table mywork.finalFunction3 as

select *

from mywork.datafilter a, mywork.Function3_2 b

where a.id=b.id;

quit;

data mywork.finalFunction3;

set mywork.finalFunction3;

drop SumFunction3_1-SumFunction3_99;

run;

Thank you so much !

Max

Respected Advisor
Posts: 4,173

Re: I cannot figure out how to automate a process

I would wrap a macro around the code which you need to execute once per value of Beta and then call this macro in a data _null_ step. Below code sample illustrates the approach.

data work.class;
set sashelp.class;
run;

%macro iter(beta);
data Result;
  retain beta β
  set work.class;
  height2=height*β
run;

proc append base=work.Results_Combined data=work.Result;
run;
%mend;


proc datasets lib=work memtype=(data view) nolist nowarn;
delete Results_Combined;
run;
quit;

data _null_;
input beta;
call execute ( cats('%iter(',beta,')') );
datalines;
2
4
6
;
run;

Trusted Advisor
Posts: 3,215

Re: I cannot figure out how to automate a process

This looks like a "monte carlo" approach on some kind of analyses. Is it a new type of analyses or is it an existing one.

If it is an existing it could be that is solved by one of the many many statistical procs already available. 

Searching/googling wirh some more dedicated names coud help.

---->-- ja karman --<-----
Occasional Contributor
Posts: 13

Re: I cannot figure out how to automate a process

Hello, thank you very much for your help!

Patrick, I tried using a macro, so I added "%macro iter(beta);" before the code and "%mend iter(beta);" right after it and replaced "beta" with "&beta". However, I don't fully understand the last part of your code:

"

proc datasets lib=work memtype=(data view) nolist nowarn;

delete Results_Combined;

run;

quit;

data _null_;
input beta;
call execute ( cats('%iter(',beta,')') );
datalines;
2
4
6
;
run;"

After my macro, I added "

data _null_;

input beta;

call execute ( cats('%iter(',beta,')') );

datalines;

2

4

6

;

run;

"

What datasets am I supposed to use instead of "work" and "Results_Combined"? I tried using the data null step only (without the proc datasets before) but then I only get the results for one beta, not all of them. How can I get the results corresponding to all the betas? (in several files or, even better, a single file)

Jaap, I don't think a statistical procedure would do the job here unfortunately.

Thank you for your help!

Respected Advisor
Posts: 3,799

Re: I cannot figure out how to automate a process

Let's concentrate on the sample then we can move on the the rest.  If you can supply some example data that would be helpful.  The following selects 4 samples of id size(region)=3 from sashelp.shoes.  Do these statements model the sampling part of you problem?

proc surveyselect rep=4 data=sashelp.shoes seed=987811 n=3 /*1000*/ out=sample;
   sampleingunit region;
   run;
Occasional Contributor
Posts: 13

Re: I cannot figure out how to automate a process

Hi! Thank you all for your help. I finally gave up on automating this part as I had a deadline to meet, but I added SAS code in the end to replace what I was doing on Excel so that it doesn't take too long for each Beta. It is not a very elegant way of solving my problem but it at least it works!

Ask a Question
Discussion stats
  • 5 replies
  • 251 views
  • 3 likes
  • 4 in conversation