BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lulu3
Obsidian | Level 7

Hi experts,

 

I have a dataset below and I would like to select subject who had ≥2 claims on separate dates for an asthma diagnosis, and for the final dataset, keep the earliest data as the indexdate. I had run my codes below, and it seems my codes were redundant. I was wondering if someone could help me to improve my codes?

 

Thank you in advance!

 

data have;
input id $ indexdate : mmddyy10. asthma $;
format indexdate mmddyy10.;
datalines;
201 04/10/2013 1
201 04/11/2013 1
201 04/12/2013 1
202 02/21/2013 1
202 03/07/2013 1
202 03/07/2013 1
202 03/08/2013 1
202 03/09/2013 1
202 03/20/2013 1
203 02/14/2013 1
203 02/15/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
205 03/18/2013 1
206 05/01/2013 1
206 05/01/2013 1
;
run;

 

 

 

proc sort data=have out=have_nodup nodupkey;
by id indexdate;
run;

proc sort data=have_nodup nouniquekeys out=have_a;
by id;
run;

proc sort data=have_a nodupkey out=want;
by id;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

SQL is slightly better at this type of analysis.

 

proc sql;
create table want as
select ID, count(distinct indexdate) as num_dates
from have 
where asthma=1
group by id
having num_dates >=2;
quit;

 


@lulu3 wrote:

Hi experts,

 

I have a dataset below and I would like to select subject who had ≥2 claims on separate dates for an asthma diagnosis, and for the final dataset, keep the earliest data as the indexdate. I had run my codes below, and it seems my codes were redundant. I was wondering if someone could help me to improve my codes?

 

Thank you in advance!

 

data have;
input id $ indexdate : mmddyy10. asthma $;
format indexdate mmddyy10.;
datalines;
201 04/10/2013 1
201 04/11/2013 1
201 04/12/2013 1
202 02/21/2013 1
202 03/07/2013 1
202 03/07/2013 1
202 03/08/2013 1
202 03/09/2013 1
202 03/20/2013 1
203 02/14/2013 1
203 02/15/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
205 03/18/2013 1
206 05/01/2013 1
206 05/01/2013 1
;
run;

 

 

 

proc sort data=have out=have_nodup nodupkey;
by id indexdate;
run;

proc sort data=have_nodup nouniquekeys out=have_a;
by id;
run;

proc sort data=have_a nodupkey out=want;
by id;
run;


 

 

View solution in original post

2 REPLIES 2
Shmuel
Garnet | Level 18

Is that what you are looking for:

proc sort data=have out=temp nodupkey;
  by id indexdate;
run;
data want;
 set temp;
   by id indexdate;
      if not (first.indexdate and last.indexdate); /* more then 1 claim */
      if first.indexdate; /* the earliest claim */
run;

It would be easy to check if you post the wanted results.

Reeza
Super User

SQL is slightly better at this type of analysis.

 

proc sql;
create table want as
select ID, count(distinct indexdate) as num_dates
from have 
where asthma=1
group by id
having num_dates >=2;
quit;

 


@lulu3 wrote:

Hi experts,

 

I have a dataset below and I would like to select subject who had ≥2 claims on separate dates for an asthma diagnosis, and for the final dataset, keep the earliest data as the indexdate. I had run my codes below, and it seems my codes were redundant. I was wondering if someone could help me to improve my codes?

 

Thank you in advance!

 

data have;
input id $ indexdate : mmddyy10. asthma $;
format indexdate mmddyy10.;
datalines;
201 04/10/2013 1
201 04/11/2013 1
201 04/12/2013 1
202 02/21/2013 1
202 03/07/2013 1
202 03/07/2013 1
202 03/08/2013 1
202 03/09/2013 1
202 03/20/2013 1
203 02/14/2013 1
203 02/15/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
205 03/18/2013 1
206 05/01/2013 1
206 05/01/2013 1
;
run;

 

 

 

proc sort data=have out=have_nodup nodupkey;
by id indexdate;
run;

proc sort data=have_nodup nouniquekeys out=have_a;
by id;
run;

proc sort data=have_a nodupkey out=want;
by id;
run;


 

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 722 views
  • 0 likes
  • 3 in conversation