BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fengyuwuzu
Pyrite | Level 9

I have two datasets, in dataset A I have 3 columns: ID, event ("Yes") and event date:

id event event_date

1 Yes 01SEP2020

1 Yes 03SEP2020

1 Yes 08SEP2020

1 Yes 10SEP2020

1 Yes 17SEP2020

1 Yes 20SEP2020

1 Yes 23SEP2020

1 Yes 26SEP2020

1 Yes 29SEP2020

1 Yes 02OCT2020

...

and in dataset B I have 3 columns: ID, visit number and visit date:

id visit visit_date

1 1 03SEP2020

1 2 17SEP2020

1 3  01OCT2020

...

I want to count how many events ("Yes") in dataset A between two visit dates in dataset B, i.e. for visit 1, number of events on or before visit 1 visit_date, for visit 2, number of event after visit 1 date and on/before visit 2 date; in the above example, I want:

id visit event_count

1 1 2

1 2 3

1 3 4

 

How should I count it? Merge by ID and retain visit_date? 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Another way:

data WANT;
  set EVENTS(in=E rename=(EVENT_DATE=DATE))
      VISITS(in=V rename=(VISIT_DATE=DATE));
  by DATE;
  retain _V;
  if VISIT then do;
    output;
    _V=VISIT;
    N=0;
  end;
  else N+1;
  keep ID VISIT N;
run;
ID VISIT N
1 1 2
1 2 3
1 3 4

 

View solution in original post

7 REPLIES 7
ChrisNZ
Tourmaline | Level 20

Can you explicit what the 4 in the result is counting?

fengyuwuzu
Pyrite | Level 9

the 4 for visit #3 is counting events after visit 2 and before or on visit 3, i.e. in (17SEP2020, 01OCT2020]. In my example, there were 4 events which were after 17SEP2020 and before or on 01OCT2020:

1 Yes 20SEP2020

1 Yes 23SEP2020

1 Yes 26SEP2020

1 Yes 29SEP2020

sbxkoenk
SAS Super FREQ

Hello,

 

I had a couple of minutes spare time but could not solve quickly and need to start doing something else now.

@fengyuwuzu : people are more inclined to answer if you include your data with datalines.

Like I do below for other members that want to give it a try.

data work.A;
input id $ event $ event_date date9.;
format event_date date9.;
cards;
1 Yes 01SEP2020
1 Yes 03SEP2020
1 Yes 08SEP2020
1 Yes 10SEP2020
1 Yes 17SEP2020
1 Yes 20SEP2020
1 Yes 23SEP2020
1 Yes 26SEP2020
1 Yes 29SEP2020
1 Yes 02OCT2020
;
run;

data work.B;
input id $ visit visit_date date9.;
format visit_date date9.;
cards;
1 1 03SEP2020
1 2 17SEP2020
1 3 01OCT2020
;
run;
/* more to come */
/* end of program */

Koen

tarheel13
Rhodochrosite | Level 12

I believe this will do what you're looking for but someone else may have a more elegant solution. 

data work.A;
input id $ event $ event_date date9.;
format event_date date9.;
cards;
1 Yes 01SEP2020
1 Yes 03SEP2020
1 Yes 08SEP2020
1 Yes 10SEP2020
1 Yes 17SEP2020
1 Yes 20SEP2020
1 Yes 23SEP2020
1 Yes 26SEP2020
1 Yes 29SEP2020
1 Yes 02OCT2020
;
run;

data work.B;
input id $ visit visit_date date9.;
format visit_date date9.;
cards;
1 1 03SEP2020
1 2 17SEP2020
1 3 01OCT2020
;
run;

proc transpose data=B out=B_t prefix=visit;
	by ID;
	var visit_date;
	id visit;
run;

proc sql;	
	create table joined as select a.*, b.visit1, b.visit2, b.visit3
		from a as a 
		left join b_t as b
		on b.id=a.id;
quit;

data joined2;
	set joined;
	if event_date <= visit1 then flag1=1;
	if visit1 < event_date <= visit2 then flag2=1;
	if visit2 < event_date <= visit3 then flag3=1;
run;

proc sql; 
	create table summed as 
	select id, sum(flag1) as visit1, sum(flag2) as visit2, sum(flag3) as visit3
		from joined2
		group by id;
		
quit;

proc transpose data=summed out=summed_t(rename=(col1=event_count));
	var visit1 visit2 visit3;
	by id;
run;

data want;
	set summed_t;
	visit=input(substr(_name_,6),4.);
	drop _name_;
run;
fengyuwuzu
Pyrite | Level 9

I figured out one way to do this:

 

data work.A;
input id $ event $ event_date date9.;
format event_date date9.;
cards;
1 Yes 01SEP2020
1 Yes 03SEP2020
1 Yes 08SEP2020
1 Yes 10SEP2020
1 Yes 17SEP2020
1 Yes 20SEP2020
1 Yes 23SEP2020
1 Yes 26SEP2020
1 Yes 29SEP2020
1 Yes 02OCT2020
;
run;

data work.B;
input id $ visit visit_date date9.;
format visit_date date9.;
cards;
1 1 03SEP2020
1 2 17SEP2020
1 3 01OCT2020
;
run;


proc sql;
create table test1 as
select b.id, b.visit, count(event) as count
from a as a, b as b
where a.id=b.id and event_date <=visit_date
group by b.id, visit
order by b.id, visit;
quit;

data want;
set test1;
by id visit;
count2=count-lag(count);
if first.id then count2=count;
run;
ChrisNZ
Tourmaline | Level 20

Another way:

data WANT;
  set EVENTS(in=E rename=(EVENT_DATE=DATE))
      VISITS(in=V rename=(VISIT_DATE=DATE));
  by DATE;
  retain _V;
  if VISIT then do;
    output;
    _V=VISIT;
    N=0;
  end;
  else N+1;
  keep ID VISIT N;
run;
ID VISIT N
1 1 2
1 2 3
1 3 4

 

ChrisNZ
Tourmaline | Level 20

Oh I forgot to deal with IDs. Use  by ID DATE;  and reset all with each new ID by testing first.ID 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2779 views
  • 6 likes
  • 4 in conversation