Find most recent 5 years' worth of data

Reply
New Contributor
Posts: 3

Find most recent 5 years' worth of data

I have found some close examples of what I am trying to do most not exactly what I need.

https://communities.sas.com/message/105322#105322

https://communities.sas.com/message/111663#111663

I have a dataset with 10 years' worth of data with certain variables I want to keep.  I only want to keep the last 5 years' worth of data.  The date variable is "year" and it is numeric (2012, 2011, 2010, etc.). I want to design it so that when a new years' worth of data is added, it shifts accordingly.  For example, the most recent 5 years' right now is 2008-2012; however, next year it will be 2009-2013.

Here is the code I have to date...without adding in the filter to get the most recent data;

PROC SQL;

     CREATE TABLE Query_enrollment AS

       SELECT t1.year, 

          t1.inst_name, 

          t1.enrollment, 

          t1.Level

      FROM enrollment t1;

QUIT;

Greatly appreciate the assistance!

Super User
Posts: 19,822

Re: Find most recent 5 years' worth of data

Does the following work? If you have a big dataset this probably isn't very good though, and you might want to type in &max_year perhaps.

proc sql noprint ;

     select max(year)  into :max_year from enrollment;

   CREATE TABLE Query_enrollment AS

       SELECT t1.year, 

          t1.inst_name, 

          t1.enrollment, 

          t1.Level

      FROM enrollment t1

     where t1.year >= &max_year-4;

quit;

OR

%let max_year=2013;

proc sql;

   CREATE TABLE Query_enrollment AS

       SELECT t1.year, 

          t1.inst_name, 

          t1.enrollment, 

          t1.Level

      FROM enrollment t1

     where t1.year >= &max_year-4;

quit;

Frequent Contributor
Posts: 87

Re: Find most recent 5 years' worth of data

Something different...

proc rank     data=have
   out= query_enrollment (where = (rank le 5))
   ties=dense
   descending ;
     var year ;
   ranks rank ;

    run;

Probably not the best method ;-)

Super Contributor
Posts: 297

Re: Find most recent 5 years' worth of data

Hi Alicia,

If the dataset you have is already sorted by YEAR you could:

%LET DSID = %SYSFUNC(OPEN(HAVE));

%LET NOBS =%SYSFUNC(ATTRN(&DSID,NOBS));

%LET RC = %SYSFUNC(FETCHOBS(&DSID,&NOBS));

%LET VAR_NUM = %SYSFUNC(VARNUM(&DSID.,YEAR));

%LET MAX_YEAR = %SYSFUNC(GETVARN(&DSID,&VAR_NUM));

%PUT &NOBS. &MAX_YEAR. &RC. &VAR_NUM.;

%LET CLOSE = %SYSFUNC(CLOSE(&DSID));

If you know that the last 5 years worth of data will appear in the dataset then it may not be necessary to evaluate the dataset to obtain the highest year value.

%LET START = %SYSFUNC(PUTN(%SYSFUNC(INTNX(YEAR,%SYSFUNC(TODAY()),-5)),YEAR4.));

%LET END   = %SYSFUNC(PUTN(%SYSFUNC(INTNX(YEAR,%SYSFUNC(TODAY()),-1)),YEAR4.));

DATA HAVE;

INPUT YEAR INST_NAME $ ENROLLMENT LEVEL;

DATALINES;

2001 BRISBANE 15 12

2002 BRISBANE 15 1

2003 BRISBANE 15 9

2004 BRISBANE 15 14

2005 BRISBANE 15 2

2006 BRISBANE 15 8

2007 BRISBANE 15 6

2008 BRISBANE 15 4

2009 BRISBANE 15 9

2010 BRISBANE 15 17

2011 BRISBANE 15 2

2012 BRISBANE 15 11

2013 BRISBANE 15 5

2001 SYDNEY 15 1

2002 SYDNEY 15 19

2003 SYDNEY 15 13

2004 SYDNEY 15 4

;

RUN;

DATA WANT ;

SET HAVE (WHERE = (&START. <= YEAR <= &END.));

RUN;

New Contributor
Posts: 3

Re: Find most recent 5 years' worth of data

Posted in reply to Scott_Mitchell

Scott,

Thank you very much!  Used these pieces below and worked exactly as needed:

%LET START = %SYSFUNC(PUTN(%SYSFUNC(INTNX(YEAR,%SYSFUNC(TODAY()),-5)),YEAR4.));

%LET END   = %SYSFUNC(PUTN(%SYSFUNC(INTNX(YEAR,%SYSFUNC(TODAY()),-1)),YEAR4.));

DATA WANT ;

SET HAVE (WHERE = (&START. <= YEAR <= &END.));

RUN;

Appreciate all the responses!

Alicia

Ask a Question
Discussion stats
  • 4 replies
  • 257 views
  • 1 like
  • 4 in conversation