Programming the statistical procedures from SAS

Carryover effect problem

Occasional Contributor
Posts: 16


Hello all,

Thank you, as always, for your help with my research.

I'm having trouble with my SAS code today, and I would be grateful for any help or advice.

Briefly, my dataset looks like this:


 

Cus_ID   Spd_w1   Spd_w2   Spd_w3   Death
10293    132      50       12       0
10234    0        12       45       1
22321    30       32       28       1
11224    19       80       67       0


 

Cus_ID is each customer's individual ID, and the Spd_w# variables are each customer's weekly spending amounts.

Death is the customer's status after 3 weeks (Death = 1 means the customer is considered to have defected).

I want to build a logistic regression model in which the IVs are Spd_w1, Spd_w2, and Spd_w3 and the DV is Death. However, the weekly figures should differ in importance: Spd_w3 is the most recent week, Spd_w2 is the figure from 1 week earlier, and Spd_w1 is the figure from 2 weeks earlier. I therefore want to add a 'carryover effect' to the model.

So I created a new variable, SPD, as follows:

SPD = (spd_w1)*(X**2) + (spd_w2)*(X) + (spd_w3)

where X is the decay parameter.
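As a worked illustration of the formula above, here is a minimal sketch that computes SPD for the four sample rows from the question. The data set name spd_example and the decay value 0.5 are arbitrary choices for illustration, not values from the original post.

```sas
/* Sample rows from the question; decay = 0.5 is an arbitrary illustrative value */
data spd_example;
   input Cus_ID Spd_w1 Spd_w2 Spd_w3 Death;
   decay = 0.5;
   /* Older weeks are down-weighted by higher powers of the decay parameter */
   SPD = Spd_w1*decay**2 + Spd_w2*decay + Spd_w3;
   datalines;
10293 132 50 12 0
10234 0 12 45 1
22321 30 32 28 1
11224 19 80 67 0
;
run;

proc print data=spd_example noobs;
run;
```

With decay = 0.5, customer 10293 gets SPD = 132*0.25 + 50*0.5 + 12 = 70.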

After that, I will use PROC LOGISTIC to test whether this variable is significant.

What really matters, though, is finding the optimal value of the decay parameter.

I wrote the code below and want to optimize the decay parameter using a macro.

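%macro optimization(N);   /* the parameter is declared without & and the statement ends with ; */

data customer; set original;

decay=&N;

Spd=Spd_w1*(decay**2) + spd_w2*(decay) + spd_w3;

run;

/* DATA= names the input data set; DESCENDING models the probability that Death=1 */
proc logistic data=customer descending;

model Death=spd; run;

%mend optimization;

%optimization(0.01);

%optimization(0.02);

%optimization(0.03);

%optimization(0.04);

%optimization(0.05);

%optimization(0.06);

%optimization(0.07);

... (and so on)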

As far as I know, choosing the value that gives the minimum -2LL (-2 log likelihood) is the way to optimize it. But I cannot check every candidate value manually, one by one; it takes too long.

So how can I optimize the decay parameter using a macro (or without one)?

Any advice would be a great help.

Thanks

Respected Advisor
Posts: 4,606

Re: Carryover effect problem

Hello, what seems to be important (given the sample data) is the change in spd values. Instead of fitting a nonlinear function, I suggest you try something simple first, such as:

data spdChange;
set original;
spdChange12 = spd_w2 - spd_w1;
spdChange23 = spd_w3 - spd_w2;
run;

proc logistic data=spdChange;
model death = spdChange12 spdChange23;
run;

PG

Regular Contributor
Posts: 152

Re: Carryover effect problem

The ODS OUTPUT statement lets you write specific statistics from PROC LOGISTIC to a SAS data set, including the model fit statistics, which contain -2*log_likelihood. You can concatenate these values across different values of the decay factor and then select the decay factor with the smallest -2*log_likelihood.

Another, analogous approach you might consider is to restructure your data from short and wide to long and narrow by transposing your SPD values and indexing them by week. Since SPD_W3 follows SPD_W2, which follows SPD_W1, for the application I'm considering you would have to index the most recent value of SPD [=SPD_W3] as week 1 and the earliest value of SPD [=SPD_W1] as week 3. Then you could try PROC GLIMMIX for a repeated-measures logistic regression, using week and subject ID in its RANDOM statement with the RESIDUAL option and a first-order autoregressive variance-covariance structure, AR(1). The value of the autoregressive parameter, rho, would be equivalent to your decay factor.
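The first suggestion in this reply (ODS OUTPUT plus concatenation) could be sketched roughly as follows, assuming the input data set is named original as in the question. The macro name decay_grid, its from/to/by parameters, and the work data sets fit and all_fit are hypothetical names chosen for illustration. FitStatistics is the ODS table PROC LOGISTIC uses for its model fit statistics; the fitted model's value is in the InterceptAndCovariates column of the '-2 Log L' row.

```sas
/* Sketch: grid search over the decay parameter, collecting -2 Log L per fit. */
%macro decay_grid(from=0.01, to=1, by=0.01);
   %local i n decay;
   /* Number of steps; an integer loop avoids fractional %DO increments */
   %let n = %sysfunc(round(%sysevalf((&to - &from) / &by)));

   /* Start from an empty result set on every call */
   proc datasets library=work nolist nowarn;
      delete all_fit;
   quit;

   %do i = 0 %to &n;
      %let decay = %sysevalf(&from + &i * &by);

      data customer;
         set original;
         decay = &decay;
         spd = spd_w1*decay**2 + spd_w2*decay + spd_w3;
      run;

      ods exclude all;                 /* suppress printed output */
      ods output FitStatistics=fit;    /* capture AIC, SC, and -2 Log L */
      proc logistic data=customer descending;
         model death = spd;
      run;
      ods exclude none;

      /* Keep only the -2 Log L row and tag it with the decay value used */
      data fit;
         set fit;
         where Criterion = '-2 Log L';
         decay = &decay;
      run;

      proc append base=all_fit data=fit force;
      run;
   %end;

   /* Smallest -2 Log L first */
   proc sort data=all_fit;
      by InterceptAndCovariates;
   run;

   proc print data=all_fit(obs=5) noobs;
      var decay InterceptAndCovariates;
   run;
%mend decay_grid;

%decay_grid(from=0.01, to=1, by=0.01);
```

After the run, the first row of all_fit holds the grid value with the smallest -2 Log L. The grid limits and step size are arbitrary and can be narrowed around the best value for a finer search.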

Respected Advisor
Posts: 2,655

Re: Carryover effect problem

I'm almost the same, except that I would delete the RESIDUAL option, and fit the repeated factor as a G-side effect, choosing to view the mean proportions as conditional rather than marginal, especially as there are no other fixed effects to be considered.  I will allow that the marginal (RESIDUAL option) will probably have fewer convergence problems.

Basically, I just want to stay away from pseudo-likelihoods these days.  Quasi-likelihoods on the other hand are all right by me.
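For readers who want to see what the two RANDOM statement variants discussed in these replies look like, here is a rough syntax sketch. The thread does not give a MODEL statement, so the one below (death regressed on spd) is an assumption, as is the data set name long. Note also that death takes the same value on all three rows of a customer in this layout, so treat this as an illustration of the syntax rather than a recommended model.

```sas
/* Restructure from one row per customer to one row per customer-week. */
data long;
   set original;
   array s{3} spd_w1-spd_w3;
   do k = 1 to 3;
      week = 4 - k;   /* spd_w3 (most recent) -> week 1, spd_w1 (earliest) -> week 3 */
      spd  = s{k};
      output;
   end;
   keep cus_id week spd death;
run;

/* Marginal (R-side) version: RESIDUAL option with an AR(1) structure. */
proc glimmix data=long;
   class cus_id week;
   /* ASSUMPTION: this MODEL statement is not given in the thread. */
   model death(event='1') = spd / dist=binary link=logit solution;
   random week / subject=cus_id type=ar(1) residual;
run;

/* Conditional (G-side) version: the same RANDOM statement without RESIDUAL. */
proc glimmix data=long;
   class cus_id week;
   model death(event='1') = spd / dist=binary link=logit solution;
   random week / subject=cus_id type=ar(1);
run;
```

In both versions the AR(1) correlation parameter appears in the covariance parameter estimates table of the PROC GLIMMIX output.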

:smileylaugh:

Steve Denham
