BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
rakeon3
Fluorite | Level 6

I have a dataset with nine different dates.

Every time the job is launched, if the dataset already contains nine different dates, it deletes the least recent date, ensuring that there are always nine dates:

%macro versioning(FLOW=,dt_rif=);
    proc sql noprint;
        select count(distinct data_rif) into : num_ver
            from &FLOW
        ;
    quit;
proc sql noprint; select count(*) into : num_obs from &FLOW where data_rif=&dt_rif; quit;
%if &num_ver = 9 && &num_obs=0 %then %do;
proc sql noprint; select min(data_rif) into : data_rif_to_del from &FLOW; quit; data &FLOW; set &FLOW; where data_rif ne &data_rif_to_del; run; %end; %mend;

The issue is that the number of historical dates is no longer fixed at 9; it can vary. For example, it could be 5 or 12. How can I modify the code to determine which dates (&data_rif_to_del) need to be deleted?

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Because your requirements are only partially clear to me, let me propose a much simpler way to go about this.  Start with sorting your incoming data:

 

proc sort data=&FLOW;
   by descending dt_rif;
run;

Then select how many distinct dates you would like:

%let n_select = 10;

Then because the data is sorted from highest to lowest, select the first 10 dates you encounter:

data want;
   set &FLOW;
   by descending dt_rif;
   if first.dt_rif then date_count + 1;
   if date_count > &n_select then stop;
drop date_count; run;

This may do exactly what you want.  But if not, it should give us a way to discuss what you would like to achieve here.

View solution in original post

4 REPLIES 4
Patrick
Opal | Level 21

Something along the line of below should work for you.

data have;
  format dt_var date9.;
  do dt_var=today()-11 to today();
    output;
    output;
  end;
run;

%macro versioning(inds=,outds=, dt_rif=,keep_n_dist_dates=999999);

    %if %nrbquote(&outds) = %nrbquote() %then %let outds=&inds;

    data _null_;
      if 0 then set &inds(keep=&dt_rif);
      dcl hash h1(dataset:"&inds", ordered:'d');
      h1.defineKey("&dt_rif");
      h1.defineData("&dt_rif");
      h1.defineDone();
      call symputx('n_dist_dates',h1.num_items,'l');
      h1.output(dataset:'work.__dist_dates');
    run;
  
    %if &n_dist_dates>&keep_n_dist_dates %then
      %do;
        data _null_;
          set work.__dist_dates(firstobs=&keep_n_dist_dates obs=&keep_n_dist_dates);
          call symputx("min_dt_keep",&dt_rif,'l');
        run;

        data &outds;
          set &inds;
          where &dt_rif >= &min_dt_keep;
        run;
      %end;
    %else %if %nrbquote(&inds) ne %nrbquote(&outds) %then
      %do;
        data &outds;
          set &inds;
        run;
      %end;

    proc datasets lib=work nolist nowarn;
      delete __dist_dates;
    quit;

%mend;

%versioning(inds=have, dt_rif=dt_var, keep_n_dist_dates=5);

proc print data=have;
run;

 

rakeon3
Fluorite | Level 6

Thanks Patrick,

but the input dataset can have multiple records with the same date.

For example

dt_rifValue_1Value_2Desc_1
31mar2023134343a
31mar20231334324b
31mar202323334324c
31mar202399546Desc_1
30apr202354565645e
30apr20233234354f
31may2023343445g
31may20232343454h
31may2023433455y
........


The current value of the different dt_rif is 9. However, they now want to modify this value.
If they change from 9 to 10, the job must remove the oldest dates so that the output dataset contains the last ten distinct dates.

Astounding
PROC Star

Because your requirements are only partially clear to me, let me propose a much simpler way to go about this.  Start with sorting your incoming data:

 

proc sort data=&FLOW;
   by descending dt_rif;
run;

Then select how many distinct dates you would like:

%let n_select = 10;

Then because the data is sorted from highest to lowest, select the first 10 dates you encounter:

data want;
   set &FLOW;
   by descending dt_rif;
   if first.dt_rif then date_count + 1;
   if date_count > &n_select then stop;
drop date_count; run;

This may do exactly what you want.  But if not, it should give us a way to discuss what you would like to achieve here.

Patrick
Opal | Level 21

@rakeon3 wrote:

Thanks Patrick,

but the input dataset can have multiple records with the same date.

....
The current value of the different dt_rif is 9. However, they now want to modify this value.
If they change from 9 to 10, the job must remove the oldest dates so that the output dataset contains the last ten distinct dates.


@rakeon3 That's what the macro I've shared is doing. The HAVE dataset created for testing contains already two rows per date.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 355 views
  • 2 likes
  • 3 in conversation