BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
t1ilie
Fluorite | Level 6

Hi, all

I have 2 datasets, sample datasets below:

                          data1 :

      id        date                 sequence

234123  2012-02-04         1

234123  2012-01-26         2

234124  2012-02-24         1

234124  2015-05-05         2

                             data2:

      id      startdate      enddate             sequence

234123   2010-01-01    2012-01-25      1

234123   2012-01-28    2012-03-05      2

234123   2012-02-14    2012-05-07      3

234124   2012-01-25    2012-02-25      1

234124   2012-02-23    2012-03-24      2

234124   2015-06-24    2015-07-25      3

For the same id, I want to calculate if its date is within any corresponding start date and end date, if the date is within, then give the flag=yes, else flag=no. Each date should have one flag. The final results are below:

      id        date                 sequence    flag

234123  2012-02-04         1                  yes

234123  2012-01-26         2                  no

234124  2012-02-24         1                  yes

234124  2015-05-05         2                  no

 Thank you for any help!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data data1 ;
input      id        date  :yymmdd10.          sequence;
format date  yymmdd10.  ;
cards;
234123  2012-02-04         1
234123  2012-01-26         2
234124  2012-02-24         1
234124  2015-05-05         2
;
data  data2;
input  id      startdate    :yymmdd10.    enddate      :yymmdd10.         sequence ;
format  startdate     enddate    yymmdd10. ;
cards;
234123   2010-01-01    2012-01-25      1
234123   2012-01-28    2012-03-05      2
234123   2012-02-14    2012-05-07      3
234124   2012-01-25    2012-02-25      1
234124   2012-02-23    2012-03-24      2
234124   2015-06-24    2015-07-25      3
;

data temp;
 set data2;
 do date=startdate  to   enddate;
  output;
 end;
keep id date;
run;
data want;
if _n_=1 then do;
 declare hash h(dataset:'temp',hashexp:20);
 h.definekey('id','date');
 h.definedone();
end;
set data1;
flag=ifc( h.check()=0 ,'yes','no ');
run;

View solution in original post

9 REPLIES 9
japelin
Rhodochrosite | Level 12
How does sequencing relate to this?
If it is the same ID, same sequence, then I think the flag value for the first obs of the result (id=234123, sequence=1) should be "no".
t1ilie
Fluorite | Level 6
For the same ID, each unique date should be calculated, so I give the sequence.
t1ilie
Fluorite | Level 6
The first obs should be yes because "2012-02-04" is within " 2012-01-28 2012-03-05".
t1ilie
Fluorite | Level 6
Sorry for the confusion, the two 'sequence' variables in the two datasets are not related. You can ignore them.
andreas_lds
Jade | Level 19

@t1ilie wrote:
For the same ID, each unique date should be calculated, so I give the sequence.

So, how should "sequence" be used in this task?

Patrick
Opal | Level 21

Does below work for you?

data have_date;
  input id date:yymmdd10.;
  format date date9.;
datalines;
234123 2012-02-04
234123 2012-01-26
234124 2012-02-24
234124 2015-05-05
;

data have_daterange;
  input id startdate:yymmdd10. enddate:yymmdd10.;
  format startdate enddate date9.;
datalines;
234123 2010-01-01 2012-01-25
234123 2012-01-28 2012-03-05
234123 2012-02-14 2012-05-07
234124 2012-01-25 2012-02-25
234124 2012-02-23 2012-03-24
234124 2015-06-24 2015-07-25
;

data want;
  if _n_=1 then
    do;
      if 0 then set have_daterange;
      dcl hash h1(dataset:'have_daterange', multidata:'y');
      h1.defineKey('id');
      h1.defineData('startdate','enddate');
      h1.defineDone();
      drop startdate enddate;
    end;
  call missing(of _all_);

  set have_date;

  inrange_flg=0;
  do while(h1.do_over() eq 0);
    if startdate<=date<=enddate then
      do;
        inrange_flg=1;
        leave;
      end;
  end;
run;

...and it would help us help you if you could provide sample data via a working data step as done in above code as this then gives us the time to deal with your question instead of preparing the sample data.

andreas_lds
Jade | Level 19

Assuming that id and sequence in data1 should be used to identify the ranges to check in data2:

data want;
   set have;
   
   if _n_ = 1 then do;
      if 0 then set work.sequences;
      
      declare hash s(dataset: 'work.sequences', multidata: 'yes');
      s.defineKey('id', 'sequence');
      s.defineData('startdate', 'enddate');
      s.defineDone();
   end;
  
   rc = s.find();
   flag = 0;
   
   do while (rc = 0);
      flag = (startdate <= date <= enddate) or flag;
      rc = not (s.find_next() = 0 and not flag);
   end;
   
   drop rc;
run;

have = data1

sequences = data2

Ksharp
Super User
data data1 ;
input      id        date  :yymmdd10.          sequence;
format date  yymmdd10.  ;
cards;
234123  2012-02-04         1
234123  2012-01-26         2
234124  2012-02-24         1
234124  2015-05-05         2
;
data  data2;
input  id      startdate    :yymmdd10.    enddate      :yymmdd10.         sequence ;
format  startdate     enddate    yymmdd10. ;
cards;
234123   2010-01-01    2012-01-25      1
234123   2012-01-28    2012-03-05      2
234123   2012-02-14    2012-05-07      3
234124   2012-01-25    2012-02-25      1
234124   2012-02-23    2012-03-24      2
234124   2015-06-24    2015-07-25      3
;

data temp;
 set data2;
 do date=startdate  to   enddate;
  output;
 end;
keep id date;
run;
data want;
if _n_=1 then do;
 declare hash h(dataset:'temp',hashexp:20);
 h.definekey('id','date');
 h.definedone();
end;
set data1;
flag=ifc( h.check()=0 ,'yes','no ');
run;
t1ilie
Fluorite | Level 6
Thank you so much! It works. This is my first time posting a question. Thank you for your suggestion, I will provide better sample dataset next time if I have more questions later!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2252 views
  • 1 like
  • 5 in conversation