Help using Base SAS procedures

date prblm

Reply
Occasional Contributor
Posts: 14

date prblm

hai all

my input dataset-a

cid cdate1
1 1jan2010
1 6jan2010
2 11jan2010
2 18jan2010
2 25jan2010

my input dataset-b

cid cdate2
1 9jan2010
1 4jan2010
2 14jan2010
2 20jan2010
2 31jan2010

i want output dataset-c

cid cdate1 cdate2
1 1jan2010 4jan2010(cdate1 1jan close to cdate2 4jan by cidwise)
1 6jan2010 9jan2010(cdate1 6jan close to cdate2  9jan by cidwise)
2 11jan2010 14jan2010(cdate1 11jan close to cdate2  14jan by cidwise)
2 18jan2010 20jan2010(cdate1 18jan close to cdate2  20jan by cidwise)
2 25jan2010 31jan2010(cdate1 25jan close to  cdate2 31jan by cidwise)

Occasional Contributor
Posts: 8

Re: date prblm

Hi,

If it is assumed that there should one-to-one correspondence between A and B, then a simple sort procedure for dataset A and B should be enough before doing a match-merge procedure.

data A;

informat CID best8.

  DATE date9.;

format CID best8.

  DATE date9.;

input CID DATE;

datalines;

1 1jan2010

1 6jan2010

2 11jan2010

2 18jan2010

2 25jan2010

run;

data B;

informat CID best8.

  DATE date9.;

format CID best8.

  DATE date9.;

input CID DATE;

datalines;

1 9jan2010

1 4jan2010

2 14jan2010

2 20jan2010

2 31jan2010

run;

proc sort data=A;

by CID DATE;

run;

proc sort data=B;

by CID DATE;

run;

data C;

merge A(in=a rename=(DATE=DATE_A)) B(in=b rename=(DATE=DATE_B));

by CID;

run;

Hope this helps.

Super User
Posts: 9,662

Re: date prblm

1 1jan2010 4jan2010(cdate1 1jan close to cdate2 4jan by cidwise)

1 6jan2010 9jan2010(cdate1 6jan close to cdate2  9jan by cidwise)

But the closest date for 6jan2010  is 4jan2010 ,not 9jan2010  . Are you sure it is ?

data A;
informat CID best8.
  DATE date9.;
format CID best8.
  DATE date9.;
input CID DATE;
datalines;
1 1jan2010
1 6jan2010
2 11jan2010
2 18jan2010
2 25jan2010
;
run;

 

data B;
informat CID best8.
  DATE date9.;
format CID best8.
  DATE date9.;
input CID DATE;
datalines;
1 9jan2010
1 4jan2010
2 14jan2010
2 20jan2010
2 31jan2010
;
run;
proc sql;
create table want as
 select a.*,b.date as _date
  from a,b
   where a.cid=b.cid 
    group by a.cid,a.date
     having abs(a.date-b.date)=min(abs(a.date-b.date));
quit;






Ksharp

Respected Advisor
Posts: 3,124

Re: date prblm

Another approach:

data a;

input cid$ cdate1 date9.;

format cdate1 date9.;

cards;

1 1jan2010

1 6jan2010

2 11jan2010

2 18jan2010

2 25jan2010

;

data b;

input cid$ cdate2 date9.;

format cdate2 date9.;

cards;

1 9jan2010

1 4jan2010

2 14jan2010

2 20jan2010

2 31jan2010

;

data want (drop=_Smiley Happy;

  if _n_=1 then do;

  if 0 then set b;

dcl hash h(dataset:'b', multidata: 'y');

h.definekey('cid');

h.definedata('cdate2');

h.definedone();

  end;

retain _min _mdt 0;

  set a;

  by cid;

  _rc=h.find();

  if _rc=0 then do;

  _min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

  do while (_rc=0);

if _min> abs(cdate1-cdate2) then do;

_min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

_rc=h.find_next();

end;

cdate2=_mdt;

run;

proc print;run;

Haikuo

Super User
Posts: 9,662

Re: date prblm

OK. If I understand what you mean.

data A;
informat CID best8.
  DATE date9.;
format CID best8.
  DATE date9.;
input CID DATE;
datalines;
1 1jan2010
1 6jan2010
2 11jan2010
2 18jan2010
2 25jan2010
;
run;

data B;
informat _CID best8.
  _DATE date9.;
format _CID best8.
  _DATE date9.;
input _CID _DATE;
_n+1;
datalines;
1 9jan2010
1 4jan2010
2 14jan2010
2 20jan2010
2 31jan2010
;
run;
data want(keep=cid date _date);
 if _n_ eq 1 then do;
  if 0 then set b;
  declare hash ha(dataset:'b');
  declare hiter hi('ha');
   ha.definekey('_n');
   ha.definedata('_n','_cid','_date');
   ha.definedone();
 end;
set a;
min=99999;
do while(hi.next()=0);
 if cid=_cid and abs(date-_date) lt min then do;flag=1; n=_n; min=abs(date-_date);end;
end;
if flag then do;ha.find(key:n);output;
ha.remove(key:n); end;
run;
 

Ksharp

Respected Advisor
Posts: 3,124

Re: date prblm

Aha, if so, my previous code can be added ONE statement to do the same thing:

data a;

input cid$ cdate1 date9.;

format cdate1 date9.;

cards;

1 1jan2010

1 6jan2010

2 11jan2010

2 18jan2010

2 25jan2010

;

data b;

input cid$ cdate2 date9.;

format cdate2 date9.;

cards;

1 9jan2010

1 4jan2010

2 14jan2010

2 20jan2010

2 31jan2010

;

data want (drop=_Smiley Happy;

  if _n_=1 then do;

  if 0 then set b;

dcl hash h(dataset:'b', multidata: 'y');

h.definekey('cid');

h.definedata('cdate2');

h.definedone();

  end;

retain _min _mdt 0;

  set a;

  by cid;

  _rc=h.find();

  if _rc=0 then do;

  _min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

  do while (_rc=0);

if _min> abs(cdate1-cdate2) and cdate2>cdate1 then do;

_min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

_rc=h.find_next();

end;

cdate2=_mdt;

run;

proc print;run;

Haikuo

Ask a Question
Discussion stats
  • 5 replies
  • 297 views
  • 1 like
  • 4 in conversation