Hi,
I am trying to set a 3-year moving window starting at a given date by the following code,
data have; infile cards dsd dlm=","; input ID $ publn_date :YYMMDD10. earliest_publn_date :YYMMDD10. ; format publn_date YYMMDDD10. earliest_publn_date YYMMDDD10. ; cards; 1,1993-09-05,1990-09-04
2,1993-09-04,1990-09-04
3,1993-09-03,1990-09-04
4,1992-09-05,1990-09-04
5,1992-09-04,1990-09-04
6,1992-09-03,1990-09-04
7,1990-09-04,1990-09-04
8,1990-09-03,1990-09-04 ;;;; run; proc sql; create table want as select distinct * from have WHERE publn_date <= intnx('year', earliest_publn_date,4) /*a 3-year moving window starting at the earliest publication date */ AND publn_date >= earliest_publn_date ; quit;
it meanings, for each observation,
if 1) 'publn_date' >= the 'earliest_publn_date' and
2) 'publn_date' <= 'earliest_publn_date' + 3-year,
then it should be remain.
By using the earliest_publn_date '1990-09-04' as an example,
1) the earliest 'publn_date' should be as same as earliest_publn_date '1990-09-04',
2) the latest 'publn_date' should be '1993-09-04', it is the 3 years after '1990-09-04',
Namely, I expect to get table want
2,1993-09-04,1990-09-04 3,1993-09-03,1990-09-04 4,1992-09-05,1990-09-04 5,1992-09-04,1990-09-04 6,1992-09-03,1990-09-04 7,1990-09-04,1990-09-04
I expect to exclude obs 8 because publn_date 1990-09-03 is earlier than earliest_publn_date 1990-09-04.
I expect to exclude obs 1 because publn_date 1993-09-04 is the 3 years after '1990-09-04', and 1993-09-05 is later than it.
I failed by using my code, could you please give me some suggestions about this?
thanks in advance.
A query with GROUP BY and HAVING clauses will keep the earliest dates together and restrict by publication date.
proc sql; create table want as select * from have group by earliest_publn_date having publn_date between earliest_publn_date and intnx('year', min(earliest_publn_date), 3, 'S') order by id ; quit;
If the data has other ids with an earliest date whose 3-year window overlaps a different ids publication date (in other words an id can be in more than one earliest publication date 3-year window) you may end up needing a self join.
I have to say that I don't understand the description of the problem and how the desired output is generated. Can you go through this in a little more detail?
@Alexxxxxxx wrote:
I have rewritten my question, please have a look
I am still lost. I don't understand. Your output data set is the same as the input data set except that rows 1 and 8 are removed. It also seems as if the term "3-year moving window" means something different to you than it does to me, but I don't know what you mean.
A query with GROUP BY and HAVING clauses will keep the earliest dates together and restrict by publication date.
proc sql; create table want as select * from have group by earliest_publn_date having publn_date between earliest_publn_date and intnx('year', min(earliest_publn_date), 3, 'S') order by id ; quit;
If the data has other ids with an earliest date whose 3-year window overlaps a different ids publication date (in other words an id can be in more than one earliest publication date 3-year window) you may end up needing a self join.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.