BookmarkSubscribeRSS Feed
maxime
Calcite | Level 5

Hi everyone!

I am trying to automate a process in SAS 9.3, and I can’t figure out how to do it. Basically, I have a huge file with a large amount of IDs, and for each id there is a number of observations that varies from one ID to another. Therefore, I pick a sample with 1000 IDs, and I want to try different functions (the difference between the different functions is the β coefficient) that apply to every observation among these IDs. Then I sum the results for each observation, within a specific ID, across 101 values (that depend on the ID). These values are incremented based on two values (t_min and t_max) that depend on the ID. The next step is finding the maximum value across the sums (for each ID) and find the corresponding T. I would like to be able to increment β in an automatic way instead of replacing it manually and running the program manually. I hope I am clear enough. I thought I could use a loop but I did not figure out one that worked. Below is the code that I have now, it functions correctly:

*Select sample with 1000 IDs;

proc means data=mywork.data min max median;

   class id;

   var T;

   output out=mywork.datasorted min=T_min max=T_max median=T_median;

run;

data mywork.sample;

set mywork.datasorted;

random=ranuni(860244);

run;

proc sort data=mywork.sample;

by random;

run;

data mywork.sample;

set mywork.sample;

if _N_ le 1000;

run;

*Create table with the 1000 observations + the missing data;

proc sql;

create table mywork.datawithoutST as

select *

from mywork.data1000 a, mywork.sample b

where a.id=b.id;

quit;

*Add ST data (a specific value, there is one per ID);

proc sql;

create table mywork.data as

select *

from mywork.datawithoutST a, mywork.ST b

where a.id=b.id;

quit;

*Calculate function;

data mywork.Function3;

set mywork.Data;

array T_array {99} T1-T99;

array Function3_array {99} Function3_1-Function3_99;

Function3_min=-max(0,T_min - T) – β*sqrt(max(0,T - T_min));

do i=1 to 99;

T_array=T_min+i*(T_max - T_min)/100;

Function3_array=-max(T_array - T,0) - β*sqrt(max(T - T_array,0));

end;

Function3_max=-max(T_max - T,0) - β*sqrt(max(T - T_max,0));

max_Function3 = max(Function3_min,max(of Function3_1-Function3_99),Function3_max);

do i=1 to 99;

if Function3_array=max_Function3 then do;

BestT_Function3p=vname(T_array);

BestT_Function3=T_array;

output; leave; end; end;

if Function3_min=max_Function3 then BestT_Function3=T_min;

else if Function3_max=max_Function3 then BestT_Function3=T_max;

drop i random _TYPE_;

run;

*Sum utilities for each flight;

proc sort data=mywork.Function3;

by id;

run;

proc means data=mywork.Function3 noprint;

var Function3_min Function3_1-Function3_99 Function3_max;

output out=mywork.Function3_1 (drop=_type_ _freq_)

sum(Function3_min Function3_1-Function3_99 Function3_max)=SumFunction3_min SumFunction3_1-SumFunction3_99 SumFunction3_max;

by id;

run;

*Find the maximum of the sums:

data mywork.Function3_2;

array SumFunction3_array {99} SumFunction3_1-SumFunction3_99;

set mywork.Function3_1;

Max=max(SumFunction3_min,max(of SumFunction3_1-SumFunction3_99),SumFunction3_max);

do i=1 to 99;

if SumFunction3_array=Max then do;

BestT_Function3p=vname(SumFunction3_array);

BestT_Function3=SumFunction3_array;

output; leave; end; end;

if SumFunction3_min=Max then BestT_Function3=SumFunction3_min;

else if Function3_max=Max then BestT_Function3=SumFunction3_max;

drop BestT_Function3p BestT_Function3 Function3_max;

run;

*datafilter is a file containing the times for each ID (I just want one row per ID);

proc sql;

create table mywork.finalFunction3 as

select *

from mywork.datafilter a, mywork.Function3_2 b

where a.id=b.id;

quit;

data mywork.finalFunction3;

set mywork.finalFunction3;

drop SumFunction3_1-SumFunction3_99;

run;

Thank you so much !

Max

5 REPLIES 5
Patrick
Opal | Level 21

I would wrap a macro around the code which you need to execute once per value of Beta and then call this macro in a data _null_ step. Below code sample illustrates the approach.

data work.class;
set sashelp.class;
run;

%macro iter(beta);
data Result;
  retain beta β
  set work.class;
  height2=height*β
run;

proc append base=work.Results_Combined data=work.Result;
run;
%mend;


proc datasets lib=work memtype=(data view) nolist nowarn;
delete Results_Combined;
run;
quit;

data _null_;
input beta;
call execute ( cats('%iter(',beta,')') );
datalines;
2
4
6
;
run;

jakarman
Barite | Level 11

This looks like a "monte carlo" approach on some kind of analyses. Is it a new type of analyses or is it an existing one.

If it is an existing it could be that is solved by one of the many many statistical procs already available. 

Searching/googling wirh some more dedicated names coud help.

---->-- ja karman --<-----
maxime
Calcite | Level 5

Hello, thank you very much for your help!

Patrick, I tried using a macro, so I added "%macro iter(beta);" before the code and "%mend iter(beta);" right after it and replaced "beta" with "&beta". However, I don't fully understand the last part of your code:

"

proc datasets lib=work memtype=(data view) nolist nowarn;

delete Results_Combined;

run;

quit;

data _null_;
input beta;
call execute ( cats('%iter(',beta,')') );
datalines;
2
4
6
;
run;"

After my macro, I added "

data _null_;

input beta;

call execute ( cats('%iter(',beta,')') );

datalines;

2

4

6

;

run;

"

What datasets am I supposed to use instead of "work" and "Results_Combined"? I tried using the data null step only (without the proc datasets before) but then I only get the results for one beta, not all of them. How can I get the results corresponding to all the betas? (in several files or, even better, a single file)

Jaap, I don't think a statistical procedure would do the job here unfortunately.

Thank you for your help!

data_null__
Jade | Level 19

Let's concentrate on the sample then we can move on the the rest.  If you can supply some example data that would be helpful.  The following selects 4 samples of id size(region)=3 from sashelp.shoes.  Do these statements model the sampling part of you problem?

proc surveyselect rep=4 data=sashelp.shoes seed=987811 n=3 /*1000*/ out=sample;
   sampleingunit region;
   run;
maxime
Calcite | Level 5

Hi! Thank you all for your help. I finally gave up on automating this part as I had a deadline to meet, but I added SAS code in the end to replace what I was doing on Excel so that it doesn't take too long for each Beta. It is not a very elegant way of solving my problem but it at least it works!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 940 views
  • 3 likes
  • 4 in conversation