whs278
Quartz | Level 8

Hello,

I am trying to apply a triangular kernel weight for a regression discontinuity design with a given bandwidth.

To illustrate what I want to accomplish, I generated a fake data set and wrote a macro.

However, I was wondering whether there is a canned procedure in SAS that accomplishes the same thing.

Thanks for the help.

Sincerely,

Bill

DATA TEST;
DO I = 1 TO 1000;

X = RAND("NORMAL", 100, 15);

IF X GT 115 THEN DO;
Y = X + 15 + RAND("NORMAL", 0 , 3);
Z = 1;
END;
ELSE DO;
Y = X + RAND("NORMAL", 0 , 3);
Z = 0;
END;

OUTPUT;
END;


RUN;

PROC SGPLOT DATA = TEST;
SCATTER X=X Y=Y / GROUP = Z;
LINEPARM X=0 Y=0 SLOPE=1 / LINEATTRS=(COLOR = BLUE);
LINEPARM X=0 Y=15 SLOPE=1 / LINEATTRS=(COLOR = RED);
REFLINE 115 / AXIS=X LINEATTRS=(PATTERN = DASH);
XAXIS MIN = 50;
YAXIS MIN = 40;
RUN;

 

%MACRO KERNEL_WT(DATA, X, CENTER, BW);

 

DATA WTD_DATA;
SET &DATA;


IF (&X GT &CENTER - &BW) AND (&X LT &CENTER + &BW) THEN DO;
KERNEL = (1 - ABS( (&X - &CENTER) / &BW)) ;
END;
ELSE KERNEL = 0;

RUN;

PROC SQL NOPRINT;
SELECT SUM(KERNEL) INTO :SUM_KERNEL
FROM WTD_DATA;
QUIT;

DATA WTD_DATA;
SET WTD_DATA;
WEIGHT = KERNEL / &SUM_KERNEL;
RUN;


%MEND KERNEL_WT;

%KERNEL_WT(TEST, X, 115, 10);


PROC SURVEYREG DATA = WTD_DATA;
MODEL Y = X Z / SOLUTION;
WEIGHT WEIGHT;
RUN;

5 REPLIES
mkeintz
PROC Star

Two observations:

 

  1. Please don't post all your SAS code in uppercase letters. It's harder to read. Use lowercase, with occasional selective use of uppercase.

  2. You can do the job of the macro KERNEL_WT in a single DATA step, as in (untested):
%MACRO KERNEL_WT(data,x,center,bw);
  data wtd_data;
    set &data  (in=firstpass)
        &data  (in=secondpass);
    if -&bw < (&x-&center) < &bw then  kernel = (1 - abs( (&x - &center) / &bw)) ;
    else kernel=0;

    if firstpass=1 then sum_kernel+kernel;
    if secondpass=1;
    weight=kernel/ sum_kernel;
run;
%MEND KERNEL_WT;

Note the SET statement reads the dataset twice.  The first time (firstpass=1) to generate SUM_KERNEL, and the second time to generate the weights and output the results.
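
A quick way to check the single-step version is to run it on the TEST data from the original post and confirm that the normalized weights sum to 1. This is only a sketch; the PROC MEANS check is an addition for illustration and is not part of the macro above.

```sas
/* Build WTD_DATA from the simulated TEST data: cutoff 115, bandwidth 10 */
%KERNEL_WT(test, x, 115, 10)

/* SUM of WEIGHT should be 1; N counts every row, including zero-weight rows */
proc means data=wtd_data n sum min max;
  var kernel weight;
run;
```

If no observation falls inside the bandwidth, SUM_KERNEL is 0 and the division produces missing weights, so it is worth checking the log for a division-by-zero note.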

 

whs278
Quartz | Level 8
Thank you for the tip. This is a definite improvement over my original macro.

However, I was still wondering whether a triangular kernel procedure is already available in SAS. I am currently working on a regression discontinuity project and I want to give more weight to observations close to the threshold.
Rick_SAS
SAS Super FREQ

It looks like you want to specify the location of the discontinuity (CENTER) and the scale (BW) and manufacture a triangular weight function. If so, the answer is no, there is no built-in function. As shown, it takes two lines in the DATA step to implement the transformation.

 

SAS does support using triangular kernels to estimate the density of a distribution. See the K=TRIANGULAR suboption of the KERNEL option on the HISTOGRAM statement of PROC UNIVARIATE.
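
For reference, a minimal sketch of that density estimate on the TEST data from the original post. Note that this overlays a kernel density estimate of X on a histogram; it does not create regression weights.

```sas
/* Histogram of X with a triangular-kernel density estimate overlaid */
proc univariate data=test noprint;
  histogram x / kernel(k=triangular);
run;
```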

 

What are you trying to accomplish with this kernel weight? For example, if you are trying to model data that have a jump discontinuity, we might be able to suggest better ways.

whs278
Quartz | Level 8
Hi Rick,

Thanks for the response. I am trying to estimate a treatment effect at the discontinuity using local linear regression. I wanted to use a triangular kernel function to weight observations closer to the discontinuity higher than those farther away. My concern is that observations farther away from the threshold may be biasing the estimate.

What I have been doing so far is using the code above to create a variable containing the weights in the data set. Then I run PROC SURVEYREG with the WEIGHT statement, using only observations within the bandwidth.
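
The approach described here could be sketched as follows, assuming WTD_DATA was created by the KERNEL_WT macro with a cutoff of 115 and a bandwidth of 10. The data set RD_DATA and the variables XC and Z_XC are hypothetical names introduced only for this illustration. Centering X at the cutoff and adding an interaction term is one common way to let the slope differ on each side, so that the coefficient on Z is the estimated jump at the threshold.

```sas
/* Keep only rows inside the bandwidth and center the running variable */
data rd_data;
  set wtd_data;
  where kernel > 0;
  xc   = x - 115;   /* XC: hypothetical name for X centered at the cutoff */
  z_xc = z * xc;    /* Z_XC: allows a different slope above the cutoff    */
run;

/* Weighted fit; the coefficient on Z estimates the jump at X = 115 */
proc surveyreg data=rd_data;
  model y = z xc z_xc / solution;
  weight weight;
run;
```

In the simulated data the true jump is 15, so the estimate for Z should land near that value. Rescaling the weights by a constant, as the normalization does, leaves the coefficient estimates unchanged.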
Rick_SAS
SAS Super FREQ

I see. You are trying to perform "kernel regression", which is a form of local regression.

 

Your current implementation of the technique will only give you the local regression at a single location.

 

See the article "Kernel regression in SAS" for an implementation. However, a better method is to use loess regression, which is available by using PROC LOESS. Here is a loess fit for your data:

 

proc sgplot data=TEST;
loess X=X Y=Y;
run;
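
A single loess curve smooths across the jump at X = 115. One possible variation, offered only as a sketch, is to fit a separate curve on each side of the threshold by using the Z indicator from the TEST data as a grouping variable:

```sas
/* One loess curve per value of Z, so the fit is not smoothed across the cutoff */
proc sgplot data=test;
  scatter x=x y=y / group=z;
  loess x=x y=y / group=z nomarkers;
  refline 115 / axis=x lineattrs=(pattern=dash);
run;
```

The vertical gap between the two curves at the reference line gives a visual impression of the discontinuity, not a formal estimate.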
