
regression with a by condition

Contributor
Posts: 40

Hi all,

 

I am having trouble running a regression over a fixed number of observations around a condition in a large dataset. I want to run a simple regression with one independent variable over 200 observations around the point where a condition is true (100 before it and 100 from that point on), and extract the coefficient of the independent variable.

Here is a sample of my data

Id  date      Y          X          Condition
1   20120103   0.001421   0.017012
1   20120104   0.004966  -0.00138
1   20120105   0.011295   0.004055
1   20120106   0.003839  -0.000598
1   20120109  -0.01217    0.004418
1   20120110   0.001056   0.01111
1   20120111   0.005274   0.004571
1   20120112  -0.013291   0.005077  1
1   20120113  -0.024814  -0.005005
1   20120117   0.02908    0.003767
1   20120118   0.041328   0.013701
1   20120119  -0.002374   0.005448
1   20120120   0.015981   0.003698
1   20120123   0.007363   0.001857

 

For id=1, once condition=1, I need the slope b from the regression Y = a + bX over the 100 observations before the condition, and the slope b from the same regression over the 100 observations starting at (and including) the observation where condition=1. I also want to exclude any id with fewer than 100 observations before or after condition=1.

The desired output would be something like this.

Id  date      Y          X          Condition  slope
1   20120103   0.001421   0.017012             -0.1777
1   20120104   0.004966  -0.00138              -0.1777
1   20120105   0.011295   0.004055             -0.1777
1   20120106   0.003839  -0.000598             -0.1777
1   20120109  -0.01217    0.004418             -0.1777
1   20120110   0.001056   0.01111              -0.1777
1   20120111   0.005274   0.004571             -0.1777
1   20120112  -0.013291   0.005077  1           3.1400
1   20120113  -0.024814  -0.005005              3.1400
1   20120117   0.02908    0.003767              3.1400
1   20120118   0.041328   0.013701              3.1400
1   20120119  -0.002374   0.005448              3.1400
1   20120120   0.015981   0.003698              3.1400
1   20120123   0.007363   0.001857              3.1400

 

In the above example the slopes are calculated for the 7 observations before condition=1 and the 7 observations from condition=1 onward (including the row where condition=1).

 

Any help would be greatly appreciated.

Respected Advisor
Posts: 3,055

Re: regression with a by condition

Modify your data set so you have some newly constructed variable which is present for every observation, and is sequential. So where you have ID 1 and condition is missing, your newly constructed variable has value 1. When condition has 1, the newly constructed variable has value 2. And so on. Then you can do the regression by ID and by the newly constructed variable.
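
As a rough illustration of that last step, here is a minimal sketch. It assumes the modified data set is called want, the constructed variable is called group, and the data are sorted by id and group; these names, along with est and want_slopes, are placeholders rather than anything fixed by the question. PROC REG's OUTEST= option writes one row of estimates per BY group, and the column named after the regressor (X here) holds its coefficient.

```sas
/* Sketch only: WANT, GROUP, EST and WANT_SLOPES are placeholder names. */
/* WANT must already be sorted by id and group.                         */
proc reg data=want noprint
         outest=est(keep=id group X rename=(X=slope));
    by id group;        /* one regression per id and segment */
    model Y = X;        /* Y = a + b*X; the slope b is stored in column X */
run;
quit;

/* Attach each segment's slope to every observation in that segment. */
data want_slopes;
    merge want est;
    by id group;
run;
```

The KEEP= option is applied before RENAME=, so it refers to the original column name X.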

--
Paige Miller
Contributor
Posts: 40

Re: regression with a by condition

Posted in reply to PaigeMiller
Any idea how to construct that variable?
Thanks!
Respected Advisor
Posts: 3,055

Re: regression with a by condition

UNTESTED CODE

Assumes the data is already sorted by id and date.

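/* Start a new group at the first row of each id and at each row where condition is set */
data want;
    set have;
    by id date;
    /* group+1 is a sum statement, so group is retained from row to row */
    if first.id or not missing(condition) then group+1;
run;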
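
The step above splits each id at the condition row, but it does not limit each segment to 100 observations or drop ids with too few rows, both of which the original question asked for. One possible way to add that filtering is sketched below. It assumes have is sorted by id and date and uses the first condition=1 row of each id; the names numbered, windowed, n, cpos, nobs and period are made up for illustration.

```sas
/* Sketch only: all new data set and variable names are hypothetical. */

/* Number the rows within each id: 1, 2, 3, ... */
data numbered;
    set have;
    by id date;
    if first.id then n = 0;
    n + 1;
run;

proc sql;
    create table windowed as
    select a.*,
           /* 1 = before the condition row, 2 = from the condition row on */
           case when a.n < b.cpos then 1 else 2 end as period
    from numbered as a
         inner join
         (select id,
                 min(case when condition = 1 then n else . end) as cpos,
                 max(n) as nobs
          from numbered
          group by id) as b
         on a.id = b.id
    where b.cpos > 100                   /* at least 100 rows before the condition */
      and b.nobs - b.cpos + 1 >= 100     /* at least 100 rows from the condition on */
      and a.n between b.cpos - 100 and b.cpos + 99
    order by a.id, a.date;
quit;
```

Ids with no condition=1 row drop out as well, because a missing cpos compares as smaller than 100. The resulting period variable could then serve as the BY variable for the regression in place of group.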
--
Paige Miller
Contributor
Posts: 40

Re: regression with a by condition

Posted in reply to PaigeMiller
Thanks!