DATA Step, Macro, Functions and more

how to define the lastest time of publication for a time serie

Reply
Frequent Contributor
Posts: 75

how to define the lastest time of publication for a time serie

I have a hundred time series.

These time series are among a hundred of thousand of other time series published by a major european institution, either at the end of each quarter or at the end of each month.

The institution does not publish the release date for the time series because it is too cumbersome. They publish, however, the timeliness for each series.

To have the estimate release date for the time series, you need to have the end of the quarters or the end of the months and you add a certain numbers of days to it.

An example would be

Serie A: Q1 2014 = end of January(31/01/2014) plus 50 days (x represents the number of days) = 22/03/2014

Serie B: M4 2014 = end of April (30/04/2014) plus 40 days = 09/06/2014

Serie C: M2 2014 = end of February (28/02/2014) plus 57 days = 26/04/2014

Serie D: Q4 2013 = end of December (31/12/2013) plus 25 days = 25/01/2014

I've created a dictionary like that is called dico

serie_name latest_release_time available

serie A   22/03/2014    .

serie B   09/06/2014    .

serie C   26/04/2014    .

serie D   25/01/2014    .

Let's say that the today's date is 08/07/2014.

My aim would be to select latest series. In this example, it would be Serie B as it is the closest to the 08/07/2014

I don't think there is a function in SAS or SQL that allows you to check the nearest date except if you say do something like that

proc sql; update dico set available="yes" where intnx('month',release_date,1) > date(); quit;

Am I correct?

Cheers for the help.

Contributor
Posts: 45

Re: how to define the lastest time of publication for a time serie

data want;                              

set have;                              

ndays = datdif(date1,today(),'ACT/365');

                                        

proc sort data=want; by ndays;run;      

data want;

set want;

if first.ndays;

Gives the closest date to today's date;

Frequent Contributor
Posts: 75

Re: how to define the lastest time of publication for a time serie

Posted in reply to RaviKommuri

Hi Ravi,

I've tried to use your code but it does not make any sense to me.

Can you be more explicit by putting a step by step , please?

I've provided some example you can use , if you want to...

Cheers.

Contributor
Posts: 45

Re: how to define the lastest time of publication for a time serie

I am sorry, I assume FIRST.NDAYS will work....

Now I have tried with your example and tested my code too...

data have;                                   

Input serie_name $ rel_date ddmmyy10.;       

Datalines ;                                  

A 22/03/2014                                 

B 09/06/2014                                 

C 26/04/2014                                 

D 25/01/2014                                 

;                                            

                                             

data want;                                   

set have;                                   

ndays = datdif(rel_date,today(),'ACT/ACT'); 

                                             

proc sort data=want; by ndays;run;           

                                             

proc sql;                                   

select * from want having ndays = min(ndays);

quit;     

Result:

serie_name  rel_date     ndays

------------------------------

B              19883        64

Frequent Contributor
Posts: 75

Re: how to define the lastest time of publication for a time serie

Posted in reply to RaviKommuri

Cool Ravi.

No worries.

I see. You need a proc sql to get the closest value.

Super Contributor
Posts: 308

Re: how to define the lastest time of publication for a time serie

You can use the following macro which may be launched with any date parameter:

data have;
infile datalines dsd missover;
input serie_name $ @10 latest_release_time ddmmyy10. available $ ;
format latest_release_time date9.;
datalines;
serieA   22/03/2014    .
serieB   09/06/2014    .
serieC   26/04/2014    .
serieD   25/01/2014    .
;
run;

%macro latest(dt);
data int1;
set have;
latest=&dt - latest_release_time;
run;

proc sql;
create table want as
select serie_name, latest_release_time,
case (select min(latest) from int1 where latest > 0)
when latest then 'yes'
end as available from int1;
quit;
%mend latest;

%latest("23JUN2014"d)

Respected Advisor
Posts: 3,156

Re: how to define the lastest time of publication for a time serie

Here is a one step SQL, maynot be the most efficient though:

data dict;

     input serie_name: $&10. re_dt :ddmmyy10.;

     cards;

serie A 22/03/2014 

serie B 09/06/2014 

serie C 26/04/2014 

serie D 25/01/2014 

;

data have;

     do dt='01jan2014'd to '31dec2014'd;

           if ranuni(123)<0.1 then

                output;

     end;

     format dt ddmmyy10.;

run;

proc sql;

     create table want as

           select a.*, (select serie_name from dict group by a.dt having abs(datdif(a.dt,re_dt, 'act/act'))=min(abs(datdif(a.dt,re_dt,'act/act')))) as serie_name

                from have a

     ;

quit;

Haikuo

Super User
Posts: 19,770

Re: how to define the lastest time of publication for a time serie

When you say "select" series what do you mean?

Is the date variable, ie always the closest to a particular date, or always the latest date?

How big is the data, is a regular sort feasible?

Ask a Question
Discussion stats
  • 7 replies
  • 263 views
  • 0 likes
  • 5 in conversation