BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Linlin
Lapis Lazuli | Level 10

Thank you!

data small;

format date1 mmddyy10.;

input id date1 mmddyy10.;

cards;

1 08/08/2000

1 08/08/2007

1 08/08/2012

;

data big;

format date2 mmddyy10.;

input id date2 mmddyy10.;

cards;

1 07/08/2000

1 08/08/2002

1 09/08/2004

1 10/08/2007

1 10/08/2009

1 10/08/2011

1 08/08/2012

;

data temp;

  set small;

  by id;

  if first.id then count=0;

  count+1;

  run;

  data after3(drop=count date1) select3;

    merge big temp(where=(count=3));

  by id ;

  if date2-date1>-90 then output select3;

  else output after3;

   data after2(drop=count date1) select2;

    merge after3  temp(where=(count=2) );

  by id ;

  if date2-date1>-90 then output select2;

  else output after2;

   data after1(drop=count date1) select1;

    merge after2 temp(where=(count=1));

  by id ;

  if date2-date1>-90  then output select1;

  else output after1;

  run;

  data final(drop=count);

  retain id date1 date2;

  set select1 select2 select3;

  by id;

  run;

  proc print;run;

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Hi Linlin, This gives you the answer you wanted:

proc sort data=small; by id descending date1; run;

data smallNext;
format lastDate1 firstDate1 yymmdd10.;
lastDate1 = '31DEC2200'd; /* The end of time... */
do until(last.id);
                set small; by id;
                firstDate1 = intnx("DAY", date1, -90);
                output;
                lastDate1 = intnx("DAY", date1, -1);
                end;
run;

proc sql;
create table want as
select
                S.id,
                S.date1,
                B.date2
from
                smallNext as S inner join
                big as B on
                               S.id = B.id and
                               B.date2 between S.firstDate1 and S.lastDate1
order by id, date1, date2;

drop table smallNext;
quit;

PG

PG

View solution in original post

10 REPLIES 10
Patrick
Opal | Level 21

What about:

data Hash_Source id_grp_cnt(keep=id count);
  set small;
  by id date1;

  if first.id then
    count=1;
  else count+1;
  output Hash_Source;

  if last.id then
    output id_grp_cnt;
run;

data want;
  if _n_=1 then
    do;
      if 0 then
        set temp;
      dcl hash h1 (dataset:'Hash_Source',ordered:'y');
      _rc=h1.defineKey('id','count');
      _rc=h1.defineData('date1');
      _rc=h1.defineDone();
    end;

  merge big id_grp_cnt;
  by id;

  do _i=count to 1 by -1;
    if h1.find(key:id,key:_i) =0 then
      do;
        if date2-date1>-90 then
          do;
            output;
            leave;
          end;
      end;
  end;
run;

...and this would be the "short code" version - but I assume that's not really what you had in mind.

proc sql;

  create table want2 as

    select s.id, s.date1, b.date2

    from small s, big b

    where b.date2-s.date1>-90

    group by s.id, b.date2

    having b.date2-s.date1 = min(b.date2-s.date1)

  ;

quit;

jakarman
Barite | Level 11

Linlin,

I am trying to reverse engineer your example. Ia ma seeing:

- I small dataset with dates.

  Could be optimized by using formats or hashing when needing a table lookup

- A big dataset that is summarized by id's

  Proc summary/means could be used to do that job.

  With in-database processing there is no need for sorting I assume. When it is ordered but not sorted it will also work.    

- The big dataset is merged with it is summarized by id's

  And then I get lost as subsetting the dataset in several steps on count....

   ... I got lost there on the intention... Please explain.

- The resulted selections are concatenated again

  Why losing the information of how it was constructed?
   

---->-- ja karman --<-----
Astounding
PROC Star

Linlin,

I think you'll find this much easier to work with if you transpose the second data set.  So start with DATE in both data sets (not DATE1 and DATE2).  Then transpose the dates in the larger data BY ID.  With one observation per ID being merged in from the larger data set, the rest should become much easier.

Good luck.

PGStats
Opal | Level 21

Hi Linlin, This gives you the answer you wanted:

proc sort data=small; by id descending date1; run;

data smallNext;
format lastDate1 firstDate1 yymmdd10.;
lastDate1 = '31DEC2200'd; /* The end of time... */
do until(last.id);
                set small; by id;
                firstDate1 = intnx("DAY", date1, -90);
                output;
                lastDate1 = intnx("DAY", date1, -1);
                end;
run;

proc sql;
create table want as
select
                S.id,
                S.date1,
                B.date2
from
                smallNext as S inner join
                big as B on
                               S.id = B.id and
                               B.date2 between S.firstDate1 and S.lastDate1
order by id, date1, date2;

drop table smallNext;
quit;

PG

PG
Linlin
Lapis Lazuli | Level 10

Hi Patrick,

Your hash solution is too diffcult for me:smileysilly:. I like your Sql solution (I changed

having b.date2-s.date1 = min(b.date2-s.date1) to

having b.date2-s.date1 = min(b.date2-s.date1) and b.id=s.id

. Thank youSmiley Happy !

Astounding, I will try to come up a solution as you suggestedSmiley Happy.

PG, I tried to rewrite your code to make sure I understand completely. I got the same result even without the ORDER statement. I wonder why :smileyconfused:.

data small;

format date1 mmddyy10.;

input id date1 mmddyy10.;

cards;

1 08/08/2000

1 08/08/2007

1 08/08/2012

2 07/05/2002

2 09/09/2004

;

data big;

format date2 mmddyy10.;

input id date2 mmddyy10.;

cards;

1 07/08/2000

1 08/08/2002

1 09/08/2004

1 10/08/2007

1 10/08/2009

1 10/08/2011

1 08/08/2012

2 06/05/2002

2 08/07/2003

2 09/09/2004

2 08/20/2005

;

run;

proc sort data=small; by id descending date1;

data snext;

format lastdate1 firstdate1 mmddyy10.;

lastdate1='01jan2020'd;

  do until (last.id);

   set small;

    by id;

   firstdate1=intnx('month',date1,-3,'s');

   output;

   lastdate1=intnx('day',date1,-1);

  end;

  run;

proc sql;

   create table want as select s.id,s.date1,b.date2

      from snext as s

    inner join

    big as b

    on s.id=b.id and (b.date2 between firstdate1 and lastdate1);

  quit;

Message was edited by: Linlin

PGStats
Opal | Level 21

At the base, SQL queries are set theory operations, they don't expect your data to be sorted, nor do they guarantee the order of the result, unless you request it explicitly with an ORDER BY clause. So, removing the ORDER BY clause gave the same result, but that order is not guaranteed. If you are looking for other removable statements, you could get rid of the format statement in snext, it was there only for debugging purposes.

PG

PG
iren_118
Calcite | Level 5

Hi, Ma'am Linlin,

You are Master and I am newbie.

Need to learn from you.

I recently post a request about creating an application of clicks, manual, dialoges....

Maybe Ma'am could help greatly.

I am Chinese proper. An old guy.

How about Ma'am?

Ksharp
Super User

My code is shorter than PG.

data small;
format date1 mmddyy10.;
input id date1 mmddyy10.;
cards;
1 08/08/2000
1 08/08/2007
1 08/08/2012
;
data big;
format date2 mmddyy10.;
input id date2 mmddyy10.;
cards;
1 07/08/2000
1 08/08/2002
1 09/08/2004
1 10/08/2007
1 10/08/2009
1 10/08/2011
1 08/08/2012
;
run;
data temp(rename=(date=date2));
 set small(rename=(date1=date) in=ina) big(rename=(date2=date) in=inb);
 by id date;
 retain _date;
 if ina then _date=date;
 if inb;
 format _date mmddyy10.;
run;
proc sort data=temp;by id descending _date;run;
data want(drop=_date);
 retain date1;
 set temp;
 if not missing(_date) then date1=_date;
  format date1 mmddyy10.;
run;
proc sort data=want;by id date1 date2;run;

Xia Keshan

iren_118
Calcite | Level 5

hi, xia,

this is bai. missed Beijing very much.

Ksharp
Super User

?? Why ? Are you Chinese guy ? I missed U.S. as well.

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1600 views
  • 10 likes
  • 7 in conversation