BookmarkSubscribeRSS Feed
Xiyuan
Fluorite | Level 6

Hi all, I have a problem in make it faster when regressing using rolling window 

first note:

1. The date has no definite cycle, but commonly two or four records a year, but allow a large gap

2. Industry can be not just a and b

3. for each date and each industry, y is product's price, there are N products, so y has three levels: y_(date,industry,product)

4. what we need: we need regress y on x of each each and industry using window rolling

5. for example, if we use three years as window, we can calculate beta=-6.3352 (with stderr=0.181942) of industry A in 2009/12/31 using the observation 3-22 (in 2007/1/1 to 2009/12/31), we require the unique date records of industry A within 3 years should at least 3*2=6, otherwise we will not estimate it, e.g., we don’t have enough observation of industry A in 2008/12/31, because there are only 5 unique date records only in 2006/1/1 to 2008/12/31

6. the rolling window can also be 1 year (require at least 2 unique date records) and 5 years (at least 5*2=10)

7 REPLIES 7
PGStats
Opal | Level 21

Here is one way to do this, using a SQL view to generate the regression windows:

 

/* Window size in years */
%let window=2;

proc sql;
create view temp as
select 
    a.industry,
    a.product,
    a.date as refDate,
    b.date,
    b.x, b.y
from 
    have as a inner join
    have as b on a.industry=b.industry and a.product=b.product and
        b.date between intnx("year", a.date, -&window., "same") and a.date
group by a.industry, a.product, a.date
having count(distinct b.date) >= 2*&window.
order by industry, product, refDate, date;
quit;

/* For testing only: */
proc sql outobs=50;
select * from temp;
quit;

proc reg data=temp outest=want noprint;
by industry product refDate;
model y = x;
run;

proc print data=want(obs=5) noobs; 
var industry product refDate Intercept x; run;

PGStats_0-1648499412123.png

 

PG
Xiyuan
Fluorite | Level 6
thanks this works well, but one question is I have so many observations, if I merge and regress for each block, the sample can be huger than the sky ,so it's better to use a loop to handle this problem
PGStats
Opal | Level 21

This version doesn't (in theory) create any huge dataset (at least it doesn't need to). Only testing will tell :

 

proc sort data=have out=haveSort(keep=industry product date x y); 
by industry product date; 
run;

data haveObs;
set haveSort;
obs = _n_;
run;

/* Window size in years */
%let window=2;

proc sql;
create table temp as
select 
    a.industry,
    a.product,
    a.date as refDate,
    min(b.obs) as startObs,
    max(b.obs) as endObs
from 
    haveObs as a inner join
    haveObs as b on a.industry=b.industry and a.product=b.product and
        b.date between intnx("year", a.date, -&window., "same") and a.date
group by a.industry, a.product, a.date
having count(distinct b.date) >= 2*&window.;
quit;

data inter / view=inter;
set temp;
do point = startObs to endObs;
    set haveObs point=point;
    output;
    end;
keep industry product refDate date x y;
run;

proc reg data=inter outest=want noprint;
by industry product refDate;
model y = x;
run;

PG
PaigeMiller
Diamond | Level 26

PROC EXPAND can create a data set with regular date intervals from your data set that doesn't have regular date intervals. It's one way to proceed.

 

Many of us will not download your Excel file, as it is a security risk. Please provide a portion of your data as text via SAS data step code (instructions) and not as Excel file, not as screen captures.

--
Paige Miller
Xiyuan
Fluorite | Level 6

sorry, here is the txt.file, please check whether this works

PaigeMiller
Diamond | Level 26

I think this example using PROC EXPAND is where you should start. Once you handle the irregular dates, then you can try the rolling regressions.

 

https://documentation.sas.com/doc/en/pgmmvacdc/9.4/etsug/etsug_expand_examples03.htm

--
Paige Miller
Xiyuan
Fluorite | Level 6
Sorry the beta should be -6.349044 (with stderr= 0.179827 ) of industry A in 2009/12/31 using the observation 3-22 (in 2007/1/1 to 2009/12/31); but anyhow, i want to execute a macro, say %regfun(dependent= independent= ) for each industry and date, how to insert it?

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2675 views
  • 0 likes
  • 3 in conversation