BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
tan-wongv
Obsidian | Level 7

I have a data set as follow. Dx_Date refers to the date of each visit. I would like to keep only sample who have >= 3 visits.
In this case, only sample with DMRN 10 and 13 will be kept. What code should I use? Thank you

DMRN DX_DATE Age
2 09MAY2022 73
4 09APR2021 58
10 01JAN2017 60
10 03FEB2018 61
10 06JUN2018 61
13 08FEB2017 70
13 09JUL2017 70
13 12MAR2019 72
13 25DEC2019 72
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Below two options how to achieve this. 

For option 1: If your source data isn't already sorted by DMRN and DX_DATE then you need to add a Proc Sort.

data have;
  input DMRN DX_DATE:date9. Age;
  format DX_DATE date9.;
datalines;
2 09MAY2022 73
4 09APR2021 58
10 01JAN2017 60
10 03FEB2018 61
10 06JUN2018 61
13 08FEB2017 70
13 09JUL2017 70
13 12MAR2019 72
13 25DEC2019 72
;

/* option 1 */
data visit_cnt(keep=DMRN visits);
  set have;
  by DMRN DX_DATE;
  if first.dmrn then visits=1;
  else visits+1;
  if last.dmrn then output;
run;

data want1;
  merge have visit_cnt;
  by dmrn;
  if visits>=3;
run;

/* option 2 */
proc sql;
  create table want2 as
  select *, count(*) as visits
  from have
  group by dmrn
  having visits>=3
  order by dmrn, dx_date
  ;
quit;

 

View solution in original post

2 REPLIES 2
Patrick
Opal | Level 21

Below two options how to achieve this. 

For option 1: If your source data isn't already sorted by DMRN and DX_DATE then you need to add a Proc Sort.

data have;
  input DMRN DX_DATE:date9. Age;
  format DX_DATE date9.;
datalines;
2 09MAY2022 73
4 09APR2021 58
10 01JAN2017 60
10 03FEB2018 61
10 06JUN2018 61
13 08FEB2017 70
13 09JUL2017 70
13 12MAR2019 72
13 25DEC2019 72
;

/* option 1 */
data visit_cnt(keep=DMRN visits);
  set have;
  by DMRN DX_DATE;
  if first.dmrn then visits=1;
  else visits+1;
  if last.dmrn then output;
run;

data want1;
  merge have visit_cnt;
  by dmrn;
  if visits>=3;
run;

/* option 2 */
proc sql;
  create table want2 as
  select *, count(*) as visits
  from have
  group by dmrn
  having visits>=3
  order by dmrn, dx_date
  ;
quit;

 

tan-wongv
Obsidian | Level 7
Thank you so much!! I’m always grateful for your help 🙂