BookmarkSubscribeRSS Feed
sasemp999
Calcite | Level 5

hai all

my input dataset-a

cid cdate1
1 1jan2010
1 6jan2010
2 11jan2010
2 18jan2010
2 25jan2010

my input dataset-b

cid cdate2
1 9jan2010
1 4jan2010
2 14jan2010
2 20jan2010
2 31jan2010

i want output dataset-c

cid cdate1 cdate2
1 1jan2010 4jan2010(cdate1 1jan close to cdate2 4jan by cidwise)
1 6jan2010 9jan2010(cdate1 6jan close to cdate2  9jan by cidwise)
2 11jan2010 14jan2010(cdate1 11jan close to cdate2  14jan by cidwise)
2 18jan2010 20jan2010(cdate1 18jan close to cdate2  20jan by cidwise)
2 25jan2010 31jan2010(cdate1 25jan close to  cdate2 31jan by cidwise)

5 REPLIES 5
k_laiz
Calcite | Level 5

Hi,

If it is assumed that there should one-to-one correspondence between A and B, then a simple sort procedure for dataset A and B should be enough before doing a match-merge procedure.

data A;

informat CID best8.

  DATE date9.;

format CID best8.

  DATE date9.;

input CID DATE;

datalines;

1 1jan2010

1 6jan2010

2 11jan2010

2 18jan2010

2 25jan2010

run;

data B;

informat CID best8.

  DATE date9.;

format CID best8.

  DATE date9.;

input CID DATE;

datalines;

1 9jan2010

1 4jan2010

2 14jan2010

2 20jan2010

2 31jan2010

run;

proc sort data=A;

by CID DATE;

run;

proc sort data=B;

by CID DATE;

run;

data C;

merge A(in=a rename=(DATE=DATE_A)) B(in=b rename=(DATE=DATE_B));

by CID;

run;

Hope this helps.

Ksharp
Super User

1 1jan2010 4jan2010(cdate1 1jan close to cdate2 4jan by cidwise)

1 6jan2010 9jan2010(cdate1 6jan close to cdate2  9jan by cidwise)

But the closest date for 6jan2010  is 4jan2010 ,not 9jan2010  . Are you sure it is ?

data A;
informat CID best8.
  DATE date9.;
format CID best8.
  DATE date9.;
input CID DATE;
datalines;
1 1jan2010
1 6jan2010
2 11jan2010
2 18jan2010
2 25jan2010
;
run;

 

data B;
informat CID best8.
  DATE date9.;
format CID best8.
  DATE date9.;
input CID DATE;
datalines;
1 9jan2010
1 4jan2010
2 14jan2010
2 20jan2010
2 31jan2010
;
run;
proc sql;
create table want as
 select a.*,b.date as _date
  from a,b
   where a.cid=b.cid 
    group by a.cid,a.date
     having abs(a.date-b.date)=min(abs(a.date-b.date));
quit;






Ksharp

Haikuo
Onyx | Level 15

Another approach:

data a;

input cid$ cdate1 date9.;

format cdate1 date9.;

cards;

1 1jan2010

1 6jan2010

2 11jan2010

2 18jan2010

2 25jan2010

;

data b;

input cid$ cdate2 date9.;

format cdate2 date9.;

cards;

1 9jan2010

1 4jan2010

2 14jan2010

2 20jan2010

2 31jan2010

;

data want (drop=_:);

  if _n_=1 then do;

  if 0 then set b;

dcl hash h(dataset:'b', multidata: 'y');

h.definekey('cid');

h.definedata('cdate2');

h.definedone();

  end;

retain _min _mdt 0;

  set a;

  by cid;

  _rc=h.find();

  if _rc=0 then do;

  _min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

  do while (_rc=0);

if _min> abs(cdate1-cdate2) then do;

_min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

_rc=h.find_next();

end;

cdate2=_mdt;

run;

proc print;run;

Haikuo

Ksharp
Super User

OK. If I understand what you mean.

data A;
informat CID best8.
  DATE date9.;
format CID best8.
  DATE date9.;
input CID DATE;
datalines;
1 1jan2010
1 6jan2010
2 11jan2010
2 18jan2010
2 25jan2010
;
run;

data B;
informat _CID best8.
  _DATE date9.;
format _CID best8.
  _DATE date9.;
input _CID _DATE;
_n+1;
datalines;
1 9jan2010
1 4jan2010
2 14jan2010
2 20jan2010
2 31jan2010
;
run;
data want(keep=cid date _date);
 if _n_ eq 1 then do;
  if 0 then set b;
  declare hash ha(dataset:'b');
  declare hiter hi('ha');
   ha.definekey('_n');
   ha.definedata('_n','_cid','_date');
   ha.definedone();
 end;
set a;
min=99999;
do while(hi.next()=0);
 if cid=_cid and abs(date-_date) lt min then do;flag=1; n=_n; min=abs(date-_date);end;
end;
if flag then do;ha.find(key:n);output;
ha.remove(key:n); end;
run;
 

Ksharp

Haikuo
Onyx | Level 15

Aha, if so, my previous code can be added ONE statement to do the same thing:

data a;

input cid$ cdate1 date9.;

format cdate1 date9.;

cards;

1 1jan2010

1 6jan2010

2 11jan2010

2 18jan2010

2 25jan2010

;

data b;

input cid$ cdate2 date9.;

format cdate2 date9.;

cards;

1 9jan2010

1 4jan2010

2 14jan2010

2 20jan2010

2 31jan2010

;

data want (drop=_:);

  if _n_=1 then do;

  if 0 then set b;

dcl hash h(dataset:'b', multidata: 'y');

h.definekey('cid');

h.definedata('cdate2');

h.definedone();

  end;

retain _min _mdt 0;

  set a;

  by cid;

  _rc=h.find();

  if _rc=0 then do;

  _min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

  do while (_rc=0);

if _min> abs(cdate1-cdate2) and cdate2>cdate1 then do;

_min=abs(cdate1-cdate2);

_mdt=cdate2;

end;

_rc=h.find_next();

end;

cdate2=_mdt;

run;

proc print;run;

Haikuo

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 866 views
  • 1 like
  • 4 in conversation