DATA Step, Macro, Functions and more

Please make my code short

Accepted Solution Solved
Reply
Super Contributor
Posts: 1,636
Accepted Solution

Please make my code short

Thank you!

data small;

format date1 mmddyy10.;

input id date1 mmddyy10.;

cards;

1 08/08/2000

1 08/08/2007

1 08/08/2012

;

data big;

format date2 mmddyy10.;

input id date2 mmddyy10.;

cards;

1 07/08/2000

1 08/08/2002

1 09/08/2004

1 10/08/2007

1 10/08/2009

1 10/08/2011

1 08/08/2012

;

data temp;

  set small;

  by id;

  if first.id then count=0;

  count+1;

  run;

  data after3(drop=count date1) select3;

    merge big temp(where=(count=3));

  by id ;

  if date2-date1>-90 then output select3;

  else output after3;

   data after2(drop=count date1) select2;

    merge after3  temp(where=(count=2) );

  by id ;

  if date2-date1>-90 then output select2;

  else output after2;

   data after1(drop=count date1) select1;

    merge after2 temp(where=(count=1));

  by id ;

  if date2-date1>-90  then output select1;

  else output after1;

  run;

  data final(drop=count);

  retain id date1 date2;

  set select1 select2 select3;

  by id;

  run;

  proc print;run;


Accepted Solutions
Solution
‎09-16-2013 10:48 AM
Respected Advisor
Posts: 4,644

Re: Please make my code short

Hi Linlin, This gives you the answer you wanted:

proc sort data=small; by id descending date1; run;

data smallNext;
format lastDate1 firstDate1 yymmdd10.;
lastDate1 = '31DEC2200'd; /* The end of time... */
do until(last.id);
                set small; by id;
                firstDate1 = intnx("DAY", date1, -90);
                output;
                lastDate1 = intnx("DAY", date1, -1);
                end;
run;

proc sql;
create table want as
select
                S.id,
                S.date1,
                B.date2
from
                smallNext as S inner join
                big as B on
                               S.id = B.id and
                               B.date2 between S.firstDate1 and S.lastDate1
order by id, date1, date2;

drop table smallNext;
quit;

PG

PG

View solution in original post


All Replies
Respected Advisor
Posts: 3,887

Re: Please make my code short

What about:

data Hash_Source id_grp_cnt(keep=id count);
  set small;
  by id date1;

  if first.id then
    count=1;
  else count+1;
  output Hash_Source;

  if last.id then
    output id_grp_cnt;
run;

data want;
  if _n_=1 then
    do;
      if 0 then
        set temp;
      dcl hash h1 (dataset:'Hash_Source',ordered:'y');
      _rc=h1.defineKey('id','count');
      _rc=h1.defineData('date1');
      _rc=h1.defineDone();
    end;

  merge big id_grp_cnt;
  by id;

  do _i=count to 1 by -1;
    if h1.find(key:id,key:_i) =0 then
      do;
        if date2-date1>-90 then
          do;
            output;
            leave;
          end;
      end;
  end;
run;

...and this would be the "short code" version - but I assume that's not really what you had in mind.

proc sql;

  create table want2 as

    select s.id, s.date1, b.date2

    from small s, big b

    where b.date2-s.date1>-90

    group by s.id, b.date2

    having b.date2-s.date1 = min(b.date2-s.date1)

  ;

quit;

Valued Guide
Posts: 3,208

Re: Please make my code short

Linlin,

I am trying to reverse engineer your example. Ia ma seeing:

- I small dataset with dates.

  Could be optimized by using formats or hashing when needing a table lookup

- A big dataset that is summarized by id's

  Proc summary/means could be used to do that job.

  With in-database processing there is no need for sorting I assume. When it is ordered but not sorted it will also work.    

- The big dataset is merged with it is summarized by id's

  And then I get lost as subsetting the dataset in several steps on count....

   ... I got lost there on the intention... Please explain.

- The resulted selections are concatenated again

  Why losing the information of how it was constructed?
   

---->-- ja karman --<-----
Super User
Posts: 5,080

Re: Please make my code short

Linlin,

I think you'll find this much easier to work with if you transpose the second data set.  So start with DATE in both data sets (not DATE1 and DATE2).  Then transpose the dates in the larger data BY ID.  With one observation per ID being merged in from the larger data set, the rest should become much easier.

Good luck.

Solution
‎09-16-2013 10:48 AM
Respected Advisor
Posts: 4,644

Re: Please make my code short

Hi Linlin, This gives you the answer you wanted:

proc sort data=small; by id descending date1; run;

data smallNext;
format lastDate1 firstDate1 yymmdd10.;
lastDate1 = '31DEC2200'd; /* The end of time... */
do until(last.id);
                set small; by id;
                firstDate1 = intnx("DAY", date1, -90);
                output;
                lastDate1 = intnx("DAY", date1, -1);
                end;
run;

proc sql;
create table want as
select
                S.id,
                S.date1,
                B.date2
from
                smallNext as S inner join
                big as B on
                               S.id = B.id and
                               B.date2 between S.firstDate1 and S.lastDate1
order by id, date1, date2;

drop table smallNext;
quit;

PG

PG
Super Contributor
Posts: 1,636

Re: Please make my code short

Hi Patrick,

Your hash solution is too diffcult for me:smileysilly:. I like your Sql solution (I changed

having b.date2-s.date1 = min(b.date2-s.date1) to

having b.date2-s.date1 = min(b.date2-s.date1) and b.id=s.id

. Thank youSmiley Happy !

Astounding, I will try to come up a solution as you suggestedSmiley Happy.

PG, I tried to rewrite your code to make sure I understand completely. I got the same result even without the ORDER statement. I wonder why :smileyconfused:.

data small;

format date1 mmddyy10.;

input id date1 mmddyy10.;

cards;

1 08/08/2000

1 08/08/2007

1 08/08/2012

2 07/05/2002

2 09/09/2004

;

data big;

format date2 mmddyy10.;

input id date2 mmddyy10.;

cards;

1 07/08/2000

1 08/08/2002

1 09/08/2004

1 10/08/2007

1 10/08/2009

1 10/08/2011

1 08/08/2012

2 06/05/2002

2 08/07/2003

2 09/09/2004

2 08/20/2005

;

run;

proc sort data=small; by id descending date1;

data snext;

format lastdate1 firstdate1 mmddyy10.;

lastdate1='01jan2020'd;

  do until (last.id);

   set small;

    by id;

   firstdate1=intnx('month',date1,-3,'s');

   output;

   lastdate1=intnx('day',date1,-1);

  end;

  run;

proc sql;

   create table want as select s.id,s.date1,b.date2

      from snext as s

    inner join

    big as b

    on s.id=b.id and (b.date2 between firstdate1 and lastdate1);

  quit;

Message was edited by: Linlin

Respected Advisor
Posts: 4,644

Re: Please make my code short

At the base, SQL queries are set theory operations, they don't expect your data to be sorted, nor do they guarantee the order of the result, unless you request it explicitly with an ORDER BY clause. So, removing the ORDER BY clause gave the same result, but that order is not guaranteed. If you are looking for other removable statements, you could get rid of the format statement in snext, it was there only for debugging purposes.

PG

PG
Occasional Contributor
Posts: 13

Re: Please make my code short

Hi, Ma'am Linlin,

You are Master and I am newbie.

Need to learn from you.

I recently post a request about creating an application of clicks, manual, dialoges....

Maybe Ma'am could help greatly.

I am Chinese proper. An old guy.

How about Ma'am?

Super User
Posts: 9,671

Re: Please make my code short

My code is shorter than PG.

data small;
format date1 mmddyy10.;
input id date1 mmddyy10.;
cards;
1 08/08/2000
1 08/08/2007
1 08/08/2012
;
data big;
format date2 mmddyy10.;
input id date2 mmddyy10.;
cards;
1 07/08/2000
1 08/08/2002
1 09/08/2004
1 10/08/2007
1 10/08/2009
1 10/08/2011
1 08/08/2012
;
run;
data temp(rename=(date=date2));
 set small(rename=(date1=date) in=ina) big(rename=(date2=date) in=inb);
 by id date;
 retain _date;
 if ina then _date=date;
 if inb;
 format _date mmddyy10.;
run;
proc sort data=temp;by id descending _date;run;
data want(drop=_date);
 retain date1;
 set temp;
 if not missing(_date) then date1=_date;
  format date1 mmddyy10.;
run;
proc sort data=want;by id date1 date2;run;

Xia Keshan

Occasional Contributor
Posts: 13

Re: Please make my code short

hi, xia,

this is bai. missed Beijing very much.

Super User
Posts: 9,671

Re: Please make my code short

?? Why ? Are you Chinese guy ? I missed U.S. as well.

Xia Keshan

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 561 views
  • 10 likes
  • 7 in conversation