macro proc genmod

Contributor
Posts: 53

Hi,

I have one exposure variable (exp), 4 dependent variables (dep) and 9 independent variables (indp), and I want to run PROC GENMOD. Each dependent variable should be modelled with the exposure plus one independent variable at a time.

For example, modelling the first dependent variable:

 

proc genmod data=have;

model dep1 = exp indp1 /d=nb;run;

 

proc genmod data=have;

model dep1 = exp indp2 /d=nb;run;

.

.

.

proc genmod data=have;

model dep1 = exp indp9 /d=nb;run;

 

Then the same for the other dependent variables.

I believe using a macro would save time.

Any help would be appreciated.

 


Accepted Solutions
Solution
‎04-14-2016 12:26 PM
Trusted Advisor
Posts: 1,115

Re: macro proc genmod

Hi @samnan,

 

Personally, I prefer analysis datasets in wide format (like your HAVE dataset) for statistical modelling procedures. All variables occurring in one MODEL statement must be available anyway in the input dataset.


With your existing dataset HAVE, your "macro" can be implemented using CALL EXECUTE:

data _null_;
array dep dep1 dep2 ...; /* list your dependent variables here */
array indp indp1 indp2 ...; /* list your independent variables here */
do i=1 to dim(dep);
  do j=1 to dim(indp);
    call execute( 'proc genmod data=have; '
               || 'model ' || vname(dep[i]) || ' = exp ' || vname(indp[j]) || ' /d=nb; run;');
  end;
end;
run;

This assumes that all your independent variables are numeric (which is plausible as you didn't mention a CLASS statement). Please note that the arrays above do not contain variables from dataset HAVE, but just "dummy variables" with the same names.
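
Since the question asked for a macro, the same 4*9 loop can also be sketched with %DO loops. This is only an illustrative alternative to the CALL EXECUTE step above, not code from the thread: the macro name RUN_GENMOD and its parameters DATA=, DEPS=, INDPS= and EXPOSURE= are made-up names, and the variable names dep1-dep4 and indp1-indp9 are assumed from the example in the question.

```sas
/* Hypothetical macro: RUN_GENMOD and its parameter names are illustrative only. */
%macro run_genmod(data=have, deps=, indps=, exposure=exp);
  %local i j dep indp;
  /* Outer loop: one pass per dependent variable */
  %do i = 1 %to %sysfunc(countw(&deps, %str( )));
    %let dep = %scan(&deps, &i, %str( ));
    /* Inner loop: one pass per independent variable */
    %do j = 1 %to %sysfunc(countw(&indps, %str( )));
      %let indp = %scan(&indps, &j, %str( ));
      title "Negative binomial model: &dep = &exposure &indp";
      proc genmod data=&data;
        model &dep = &exposure &indp / d=nb;
      run;
    %end;
  %end;
  title;
%mend run_genmod;

/* Variable names are spelled out because %SCAN splits the lists on blanks only. */
%run_genmod(deps=dep1 dep2 dep3 dep4,
            indps=indp1 indp2 indp3 indp4 indp5 indp6 indp7 indp8 indp9)
```

Both approaches generate the same PROC GENMOD steps; the macro version just makes the variable lists parameters instead of array definitions.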

All Replies
Super User
Posts: 7,430

Re: macro proc genmod

[ Edited ]

I'm not too familiar with that procedure, but most SAS procedures, PROC GENMOD included, support BY-group processing, which would be both easier to maintain and less resource hungry. Change your input data structure slightly (you can use PROC TRANSPOSE); normalised data is easier to work with.

You have:

... INDP1  INDP2  INDP3 ...
    xyz    xyz    xyz

Change this to:

... INDP  RES
...  1    xyz
...  2    xyz

Then you can run a single PROC GENMOD:

proc genmod data=have;
  by indp;
  model dep1 = exp res /d=nb;
run;

Sorry, not sure how this affects the model, can only advise on structure and syntax.
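
The restructuring described in this post could look roughly like the sketch below. It rests on several assumptions that are not from the thread: HAVE has no unique row identifier (so a hypothetical ROW_ID is created), the variables are named dep1-dep4 and indp1-indp9, all independent variables are numeric, and HAVE_ID and LONG are made-up dataset names. Here INDP holds the former variable name (indp1, indp2, ...) rather than a number.

```sas
/* Assumption: HAVE has no unique key, so a hypothetical ROW_ID is added. */
data have_id;
  set have;
  row_id = _n_;
run;

/* Wide to long: one output row per input row and independent variable.     */
/* NAME=INDP stores the former variable name; COL1 is renamed to RES.       */
proc transpose data=have_id out=long(rename=(col1=res)) name=indp;
  by row_id dep1-dep4 exp;
  var indp1-indp9;
run;

/* BY-group processing requires the data to be sorted by the BY variable. */
proc sort data=long;
  by indp;
run;

/* One PROC GENMOD step fits a separate model for each value of INDP. */
proc genmod data=long;
  by indp;
  model dep1 = exp res / d=nb;
run;
```

The other dependent variables would still need their own MODEL statement, so this layout removes the loop over independent variables but not the loop over dependent variables.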

Trusted Advisor
Posts: 1,115

Re: macro proc genmod

Hi @RW9,

 

In your MODEL statement, indp should read res to make sense.

Super User
Posts: 7,430

Re: macro proc genmod

Thanks, updated.

Contributor
Posts: 53

Re: macro proc genmod

Dear @RW9, thanks for your valuable help. I do not have a (res) variable, but the code provided by @FreelanceReinhard works well.

 

Super User
Posts: 7,430

Re: macro proc genmod

RES is just a variable name I made up. At the moment the values run across the table, i.e. each independent variable is a separate column, so your data looks like this:

... INDP1  INDP2  INDP3 ...
... xyz    def    abc   ...
...

When the data is normalised it would look like this:

... INDP  RES ... (call these columns what you want)
...  1    xyz ...
...  2    def ...
...  3    abc ...

The second structure is exactly the same data, just in a different layout. It is easier to program with and uses BY-group processing, a core Base SAS feature, which is usually quicker. It is useful to note that a slight restructure of your data can make your programming more efficient and easier to read and maintain.

 

Contributor
Posts: 53

Re: macro proc genmod

Dear @RW9, thanks for your explanation, I will try it.

Contributor
Posts: 53

Re: macro proc genmod

Dear @FreelanceReinhard, thanks for your help, it is much appreciated.

Contributor
Posts: 53

Re: macro proc genmod

Can you take it one step further, so that the macro keeps only the significant (indp) variables? Something like the SELECTION= option in logistic regression modelling.

Trusted Advisor
Posts: 1,115

Re: macro proc genmod

So, you'd run the 4*9 PROC GENMOD steps (via CALL EXECUTE) and for each of the four dependent variables you would like to have a list of those independent variables which had p-values <0.05 in table "Analysis Of Maximum Likelihood Parameter Estimates," excluding variable EXP, which is always in the model?

 

Yes, this is possible. You could write the parameter estimates (incl. p-values) to datasets EST_DEP1, EST_DEP2, ... (where "DEPi" would be replaced by the name of the i-th dependent variable) and then, for example, select the names of the independent variables of interest via PROC SQL into macro variables INDPLIST_DEP1, INDPLIST_DEP2, ...

 

Here is draft code for this:

data _null_;
array dep dep1 dep2 ...; /* list your dependent variables here */
array indp indp1 indp2 ...; /* list your independent variables here */
do i=1 to dim(dep);
  call execute('ods output ParameterEstimates(persist=proc)=est_' || vname(dep[i]) ||';');
  do j=1 to dim(indp);
    call execute( 'proc genmod data=have;'
               || 'model ' || vname(dep[i]) || ' = exp ' || vname(indp[j]) || ' /d=nb; run;');
  end;
  call execute('ods output close;');
  call execute('proc sql noprint; select parameter into :indplist_' || vname(dep[i])
            || ' separated by " " from est_' || vname(dep[i])
            || ' where upcase(parameter) not in ("INTERCEPT", "EXP", "DISPERSION") & .<ProbChiSq<0.05; quit;');
end;
run;

%put &=indplist_dep1; /* replace dep1 by the name of the first dep. variable */
%put &=indplist_dep2; /* replace dep2 by the name of the second dep. variable */
...

You could use the variable lists &indplist_depi in MODEL statements of subsequent PROC GENMOD calls.

 

As I said, this is draft code. If for a particular dependent variable none of the 9 independent variables (excl. EXP) turned out to be significant, the corresponding macro variable would not be created (hence, the corresponding %PUT statement would cause a WARNING in the log).
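
One possible way to guard against the missing macro variable, and to reuse the list in a follow-up model, is sketched here. The macro name FIT_SELECTED is hypothetical and not part of the draft code above; the sketch assumes the CALL EXECUTE step has already run and that the dataset is still named HAVE.

```sas
/* Hypothetical helper: FIT_SELECTED is an illustrative name, not from the thread. */
%macro fit_selected(dep);
  /* %SYMEXIST returns 1 only if the macro variable has been created. */
  %if %symexist(indplist_&dep) %then %do;
    %put NOTE: Selected for &dep: &&indplist_&dep;
    /* Refit with the exposure plus all individually significant variables. */
    proc genmod data=have;
      model &dep = exp &&indplist_&dep / d=nb;
    run;
  %end;
  %else %put NOTE: No independent variable was significant for &dep..;
%mend fit_selected;

%fit_selected(dep1) /* replace dep1 by the name of a dependent variable */
```

Note that a macro variable left over from an earlier run in the same SAS session would still exist, so deleting old lists first (for example with %SYMDEL) may be advisable before rerunning the selection step.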

Contributor
Posts: 53

Re: macro proc genmod

I am not sure about the last part (%put &=indplist_dep1; /* replace dep1 by the name of the first dep. variable */).

I renamed my dependent variables to (dep, dep1 ... dep7) and the independent variables to (indp, indp1 .... indp16) just to apply the code. When I ran it this time, it showed the same results as the old one.

Trusted Advisor
Posts: 1,115

Re: macro proc genmod

The %PUT statements are just optional to demonstrate that the variable lists have been created.

 

Please note that my comments "list your (in)dependent variables here" referred to the lists dep1 dep2 ... and indp1 indp2 ..., respectively. The names dep and indp are the array names and must not be replaced. So, for instance, if your independent variables were AGE, HEIGHT, WEIGHT, the second array statement would read:

array indp age height weight;

and similarly for the first array.

 

If the first dependent variable was XYZ and only AGE and WEIGHT were significant for that, the suggested code would create a macro variable INDPLIST_XYZ containing age weight (possibly in upper or mixed case), selected from a WORK dataset named EST_XYZ.

Contributor
Posts: 53

Re: macro proc genmod

Dear @FreelanceReinhard, your code is very good. Now I get it, and I like it, although the list contains all variables and their values, whether significant or insignificant.

Trusted Advisor
Posts: 1,115

Re: macro proc genmod

What do you mean by "the list contains all variables and their values, whether significant or insignificant"? Are you saying that some or all of the macro variables INDPLIST_depi do not contain the correct variable lists? If anything does not work or is unclear, we can continue the discussion tomorrow (Central European Time).
