BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fengyuwuzu
Pyrite | Level 9

I have two datasets, in dataset A I have 3 columns: ID, event ("Yes") and event date:

id event event_date

1 Yes 01SEP2020

1 Yes 03SEP2020

1 Yes 08SEP2020

1 Yes 10SEP2020

1 Yes 17SEP2020

1 Yes 20SEP2020

1 Yes 23SEP2020

1 Yes 26SEP2020

1 Yes 29SEP2020

1 Yes 02OCT2020

...

and in dataset B I have 3 columns: ID, visit number and visit date:

id visit visit_date

1 1 03SEP2020

1 2 17SEP2020

1 3  01OCT2020

...

I want to count how many events ("Yes") in dataset A between two visit dates in dataset B, i.e. for visit 1, number of events on or before visit 1 visit_date, for visit 2, number of event after visit 1 date and on/before visit 2 date; in the above example, I want:

id visit event_count

1 1 2

1 2 3

1 3 4

 

How should I count it? Merge by ID and retain visit_date? 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Another way:

data WANT;
  set EVENTS(in=E rename=(EVENT_DATE=DATE))
      VISITS(in=V rename=(VISIT_DATE=DATE));
  by DATE;
  retain _V;
  if VISIT then do;
    output;
    _V=VISIT;
    N=0;
  end;
  else N+1;
  keep ID VISIT N;
run;
ID VISIT N
1 1 2
1 2 3
1 3 4

 

View solution in original post

7 REPLIES 7
ChrisNZ
Tourmaline | Level 20

Can you explicit what the 4 in the result is counting?

fengyuwuzu
Pyrite | Level 9

the 4 for visit #3 is counting events after visit 2 and before or on visit 3, i.e. in (17SEP2020, 01OCT2020]. In my example, there were 4 events which were after 17SEP2020 and before or on 01OCT2020:

1 Yes 20SEP2020

1 Yes 23SEP2020

1 Yes 26SEP2020

1 Yes 29SEP2020

sbxkoenk
SAS Super FREQ

Hello,

 

I had a couple of minutes spare time but could not solve quickly and need to start doing something else now.

@fengyuwuzu : people are more inclined to answer if you include your data with datalines.

Like I do below for other members that want to give it a try.

data work.A;
input id $ event $ event_date date9.;
format event_date date9.;
cards;
1 Yes 01SEP2020
1 Yes 03SEP2020
1 Yes 08SEP2020
1 Yes 10SEP2020
1 Yes 17SEP2020
1 Yes 20SEP2020
1 Yes 23SEP2020
1 Yes 26SEP2020
1 Yes 29SEP2020
1 Yes 02OCT2020
;
run;

data work.B;
input id $ visit visit_date date9.;
format visit_date date9.;
cards;
1 1 03SEP2020
1 2 17SEP2020
1 3 01OCT2020
;
run;
/* more to come */
/* end of program */

Koen

tarheel13
Rhodochrosite | Level 12

I believe this will do what you're looking for but someone else may have a more elegant solution. 

data work.A;
input id $ event $ event_date date9.;
format event_date date9.;
cards;
1 Yes 01SEP2020
1 Yes 03SEP2020
1 Yes 08SEP2020
1 Yes 10SEP2020
1 Yes 17SEP2020
1 Yes 20SEP2020
1 Yes 23SEP2020
1 Yes 26SEP2020
1 Yes 29SEP2020
1 Yes 02OCT2020
;
run;

data work.B;
input id $ visit visit_date date9.;
format visit_date date9.;
cards;
1 1 03SEP2020
1 2 17SEP2020
1 3 01OCT2020
;
run;

proc transpose data=B out=B_t prefix=visit;
	by ID;
	var visit_date;
	id visit;
run;

proc sql;	
	create table joined as select a.*, b.visit1, b.visit2, b.visit3
		from a as a 
		left join b_t as b
		on b.id=a.id;
quit;

data joined2;
	set joined;
	if event_date <= visit1 then flag1=1;
	if visit1 < event_date <= visit2 then flag2=1;
	if visit2 < event_date <= visit3 then flag3=1;
run;

proc sql; 
	create table summed as 
	select id, sum(flag1) as visit1, sum(flag2) as visit2, sum(flag3) as visit3
		from joined2
		group by id;
		
quit;

proc transpose data=summed out=summed_t(rename=(col1=event_count));
	var visit1 visit2 visit3;
	by id;
run;

data want;
	set summed_t;
	visit=input(substr(_name_,6),4.);
	drop _name_;
run;
fengyuwuzu
Pyrite | Level 9

I figured out one way to do this:

 

data work.A;
input id $ event $ event_date date9.;
format event_date date9.;
cards;
1 Yes 01SEP2020
1 Yes 03SEP2020
1 Yes 08SEP2020
1 Yes 10SEP2020
1 Yes 17SEP2020
1 Yes 20SEP2020
1 Yes 23SEP2020
1 Yes 26SEP2020
1 Yes 29SEP2020
1 Yes 02OCT2020
;
run;

data work.B;
input id $ visit visit_date date9.;
format visit_date date9.;
cards;
1 1 03SEP2020
1 2 17SEP2020
1 3 01OCT2020
;
run;


proc sql;
create table test1 as
select b.id, b.visit, count(event) as count
from a as a, b as b
where a.id=b.id and event_date <=visit_date
group by b.id, visit
order by b.id, visit;
quit;

data want;
set test1;
by id visit;
count2=count-lag(count);
if first.id then count2=count;
run;
ChrisNZ
Tourmaline | Level 20

Another way:

data WANT;
  set EVENTS(in=E rename=(EVENT_DATE=DATE))
      VISITS(in=V rename=(VISIT_DATE=DATE));
  by DATE;
  retain _V;
  if VISIT then do;
    output;
    _V=VISIT;
    N=0;
  end;
  else N+1;
  keep ID VISIT N;
run;
ID VISIT N
1 1 2
1 2 3
1 3 4

 

ChrisNZ
Tourmaline | Level 20

Oh I forgot to deal with IDs. Use  by ID DATE;  and reset all with each new ID by testing first.ID 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2638 views
  • 6 likes
  • 4 in conversation