BookmarkSubscribeRSS Feed
Xiyuan
Fluorite | Level 6

Hi all, I have a problem in make it faster when regressing using rolling window 

first note:

1. The date has no definite cycle, but commonly two or four records a year, but allow a large gap

2. Industry can be not just a and b

3. for each date and each industry, y is product's price, there are N products, so y has three levels: y_(date,industry,product)

4. what we need: we need regress y on x of each each and industry using window rolling

5. for example, if we use three years as window, we can calculate beta=-6.3352 (with stderr=0.181942) of industry A in 2009/12/31 using the observation 3-22 (in 2007/1/1 to 2009/12/31), we require the unique date records of industry A within 3 years should at least 3*2=6, otherwise we will not estimate it, e.g., we don’t have enough observation of industry A in 2008/12/31, because there are only 5 unique date records only in 2006/1/1 to 2008/12/31

6. the rolling window can also be 1 year (require at least 2 unique date records) and 5 years (at least 5*2=10)

7 REPLIES 7
PGStats
Opal | Level 21

Here is one way to do this, using a SQL view to generate the regression windows:

 

/* Window size in years */
%let window=2;

proc sql;
create view temp as
select 
    a.industry,
    a.product,
    a.date as refDate,
    b.date,
    b.x, b.y
from 
    have as a inner join
    have as b on a.industry=b.industry and a.product=b.product and
        b.date between intnx("year", a.date, -&window., "same") and a.date
group by a.industry, a.product, a.date
having count(distinct b.date) >= 2*&window.
order by industry, product, refDate, date;
quit;

/* For testing only: */
proc sql outobs=50;
select * from temp;
quit;

proc reg data=temp outest=want noprint;
by industry product refDate;
model y = x;
run;

proc print data=want(obs=5) noobs; 
var industry product refDate Intercept x; run;

PGStats_0-1648499412123.png

 

PG
Xiyuan
Fluorite | Level 6
thanks this works well, but one question is I have so many observations, if I merge and regress for each block, the sample can be huger than the sky ,so it's better to use a loop to handle this problem
PGStats
Opal | Level 21

This version doesn't (in theory) create any huge dataset (at least it doesn't need to). Only testing will tell :

 

proc sort data=have out=haveSort(keep=industry product date x y); 
by industry product date; 
run;

data haveObs;
set haveSort;
obs = _n_;
run;

/* Window size in years */
%let window=2;

proc sql;
create table temp as
select 
    a.industry,
    a.product,
    a.date as refDate,
    min(b.obs) as startObs,
    max(b.obs) as endObs
from 
    haveObs as a inner join
    haveObs as b on a.industry=b.industry and a.product=b.product and
        b.date between intnx("year", a.date, -&window., "same") and a.date
group by a.industry, a.product, a.date
having count(distinct b.date) >= 2*&window.;
quit;

data inter / view=inter;
set temp;
do point = startObs to endObs;
    set haveObs point=point;
    output;
    end;
keep industry product refDate date x y;
run;

proc reg data=inter outest=want noprint;
by industry product refDate;
model y = x;
run;

PG
PaigeMiller
Diamond | Level 26

PROC EXPAND can create a data set with regular date intervals from your data set that doesn't have regular date intervals. It's one way to proceed.

 

Many of us will not download your Excel file, as it is a security risk. Please provide a portion of your data as text via SAS data step code (instructions) and not as Excel file, not as screen captures.

--
Paige Miller
Xiyuan
Fluorite | Level 6

sorry, here is the txt.file, please check whether this works

PaigeMiller
Diamond | Level 26

I think this example using PROC EXPAND is where you should start. Once you handle the irregular dates, then you can try the rolling regressions.

 

https://documentation.sas.com/doc/en/pgmmvacdc/9.4/etsug/etsug_expand_examples03.htm

--
Paige Miller
Xiyuan
Fluorite | Level 6
Sorry the beta should be -6.349044 (with stderr= 0.179827 ) of industry A in 2009/12/31 using the observation 3-22 (in 2007/1/1 to 2009/12/31); but anyhow, i want to execute a macro, say %regfun(dependent= independent= ) for each industry and date, how to insert it?

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1341 views
  • 0 likes
  • 3 in conversation