turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- I cannot figure out how to automate a process

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

10-26-2013 07:39 PM

Hi everyone!

I am trying to automate a process in SAS 9.3, and I can’t figure out how to do it. Basically, I have a huge file with a large amount of IDs, and for each id there is a number of observations that varies from one ID to another. Therefore, I pick a sample with 1000 IDs, and I want to try different functions (the difference between the different functions is the β coefficient) that apply to every observation among these IDs. Then I sum the results for each observation, within a specific ID, across 101 values (that depend on the ID). These values are incremented based on two values (t_min and t_max) that depend on the ID. The next step is finding the maximum value across the sums (for each ID) and find the corresponding T. I would like to be able to increment β in an automatic way instead of replacing it manually and running the program manually. I hope I am clear enough. I thought I could use a loop but I did not figure out one that worked. Below is the code that I have now, it functions correctly:

*Select sample with 1000 IDs;

**proc** **means** data=mywork.data min max median;

class id;

var T;

output out=mywork.datasorted min=T_min max=T_max median=T_median;

**run**;

**data** mywork.sample;

set mywork.datasorted;

random=ranuni(**860244**);

**run**;

**proc** **sort** data=mywork.sample;

by random;

**run**;

**data** mywork.sample;

set mywork.sample;

if _N_ le **1000**;

**run**;

*Create table with the 1000 observations + the missing data;

**proc** **sql**;

create table mywork.datawithoutST as

select *

from mywork.data1000 a, mywork.sample b

where a.id=b.id;

**quit**;

*Add ST data (a specific value, there is one per ID);

**proc** **sql**;

create table mywork.data as

select *

from mywork.datawithoutST a, mywork.ST b

where a.id=b.id;

**quit**;

*Calculate function;

**data** mywork.Function3;

set mywork.Data;

array T_array {**99**} T1-T99;

array Function3_array {**99**} Function3_1-Function3_99;

Function3_min=-max(**0**,T_min - T) – β*sqrt(max(**0**,T - T_min));

do i=**1** to **99**;

T_array*=T_min+i*(T_max - T_min)/***100**;

Function3_array*=-max(T_array - T,*

end;

Function3_max=-max(T_max - T,**0**) - β*sqrt(max(T - T_max,**0**));

max_Function3 = max(Function3_min,max(of Function3_1-Function3_99),Function3_max);

do i=**1** to **99**;

if Function3_array*=max_Function3 *then do;

BestT_Function3p=vname(T_array*);*

BestT_Function3=T_array*;*

output; leave; end; end;

if Function3_min=max_Function3 then BestT_Function3=T_min;

else if Function3_max=max_Function3 then BestT_Function3=T_max;

drop i random _TYPE_;

**run**;

*Sum utilities for each flight;

**proc** **sort** data=mywork.Function3;

by id;

**run**;

**proc** **means** data=mywork.Function3 noprint;

var Function3_min Function3_1-Function3_99 Function3_max;

output out=mywork.Function3_1 (drop=_type_ _freq_)

sum(Function3_min Function3_1-Function3_99 Function3_max)=SumFunction3_min SumFunction3_1-SumFunction3_99 SumFunction3_max;

by id;

**run**;

*Find the maximum of the sums:

** **

**data** mywork.Function3_2;

array SumFunction3_array {**99**} SumFunction3_1-SumFunction3_99;

set mywork.Function3_1;

Max=max(SumFunction3_min,max(of SumFunction3_1-SumFunction3_99),SumFunction3_max);

do i=**1** to **99**;

if SumFunction3_array*=Max *then do;

BestT_Function3p=vname(SumFunction3_array*);*

BestT_Function3=SumFunction3_array*;*

output; leave; end; end;

if SumFunction3_min=Max then BestT_Function3=SumFunction3_min;

else if Function3_max=Max then BestT_Function3=SumFunction3_max;

drop BestT_Function3p BestT_Function3 Function3_max;

**run**;

*datafilter is a file containing the times for each ID (I just want one row per ID);

**proc** **sql**;

create table mywork.finalFunction3 as

select *

from mywork.datafilter a, mywork.Function3_2 b

where a.id=b.id;

**quit**;

**data** mywork.finalFunction3;

set mywork.finalFunction3;

drop SumFunction3_1-SumFunction3_99;

**run**;

Thank you so much !

Max

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to maxime

10-26-2013 09:06 PM

I would wrap a macro around the code which you need to execute once per value of Beta and then call this macro in a data _null_ step. Below code sample illustrates the approach.

data work.class;

set sashelp.class;

run;

%macro iter(beta);

data Result;

retain beta β

set work.class;

height2=height*β

run;

proc append base=work.Results_Combined data=work.Result;

run;

%mend;

proc datasets lib=work memtype=(data view) nolist nowarn;

delete Results_Combined;

run;

quit;

data _null_;

input beta;

call execute ( cats('%iter(',beta,')') );

datalines;

2

4

6

;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Patrick

10-27-2013 02:38 AM

This looks like a "monte carlo" approach on some kind of analyses. Is it a new type of analyses or is it an existing one.

If it is an existing it could be that is solved by one of the many many statistical procs already available.

Searching/googling wirh some more dedicated names coud help.

---->-- ja karman --<-----

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to maxime

10-27-2013 02:51 PM

Hello, thank you very much for your help!

Patrick, I tried using a macro, so I added "%macro iter(beta);" before the code and "%mend iter(beta);" right after it and replaced "beta" with "&beta". However, I don't fully understand the last part of your code:

"

proc datasets lib=work memtype=(data view) nolist nowarn;

delete Results_Combined;

run;

quit;

data _null_;

input beta;

call execute ( cats('%iter(',beta,')') );

datalines;

2

4

6

;

run;"

After my macro, I added "

data _null_;

input beta;

call execute ( cats('%iter(',beta,')') );

datalines;

2

4

6

;

run;

"

What datasets am I supposed to use instead of "work" and "Results_Combined"? I tried using the data null step only (without the proc datasets before) but then I only get the results for one beta, not all of them. How can I get the results corresponding to all the betas? (in several files or, even better, a single file)

Jaap, I don't think a statistical procedure would do the job here unfortunately.

Thank you for your help!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to maxime

10-28-2013 09:11 AM

Let's concentrate on the sample then we can move on the the rest. If you can supply some example data that would be helpful. The following selects 4 samples of id size(region)=3 from sashelp.shoes. Do these statements model the sampling part of you problem?

sampleingunit region;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to maxime

10-31-2013 07:42 PM

Hi! Thank you all for your help. I finally gave up on automating this part as I had a deadline to meet, but I added SAS code in the end to replace what I was doing on Excel so that it doesn't take too long for each Beta. It is not a very elegant way of solving my problem but it at least it works!