BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
lulu3
Obsidian | Level 7

Hi experts,

 

I have a dataset below and I would like to select subject who had ≥2 claims on separate dates for an asthma diagnosis, and for the final dataset, keep the earliest data as the indexdate. I had run my codes below, and it seems my codes were redundant. I was wondering if someone could help me to improve my codes?

 

Thank you in advance!

 

data have;
input id $ indexdate : mmddyy10. asthma $;
format indexdate mmddyy10.;
datalines;
201 04/10/2013 1
201 04/11/2013 1
201 04/12/2013 1
202 02/21/2013 1
202 03/07/2013 1
202 03/07/2013 1
202 03/08/2013 1
202 03/09/2013 1
202 03/20/2013 1
203 02/14/2013 1
203 02/15/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
205 03/18/2013 1
206 05/01/2013 1
206 05/01/2013 1
;
run;

 

 

 

proc sort data=have out=have_nodup nodupkey;
by id indexdate;
run;

proc sort data=have_nodup nouniquekeys out=have_a;
by id;
run;

proc sort data=have_a nodupkey out=want;
by id;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

SQL is slightly better at this type of analysis.

 

proc sql;
create table want as
select ID, count(distinct indexdate) as num_dates
from have 
where asthma=1
group by id
having num_dates >=2;
quit;

 


@lulu3 wrote:

Hi experts,

 

I have a dataset below and I would like to select subject who had ≥2 claims on separate dates for an asthma diagnosis, and for the final dataset, keep the earliest data as the indexdate. I had run my codes below, and it seems my codes were redundant. I was wondering if someone could help me to improve my codes?

 

Thank you in advance!

 

data have;
input id $ indexdate : mmddyy10. asthma $;
format indexdate mmddyy10.;
datalines;
201 04/10/2013 1
201 04/11/2013 1
201 04/12/2013 1
202 02/21/2013 1
202 03/07/2013 1
202 03/07/2013 1
202 03/08/2013 1
202 03/09/2013 1
202 03/20/2013 1
203 02/14/2013 1
203 02/15/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
205 03/18/2013 1
206 05/01/2013 1
206 05/01/2013 1
;
run;

 

 

 

proc sort data=have out=have_nodup nodupkey;
by id indexdate;
run;

proc sort data=have_nodup nouniquekeys out=have_a;
by id;
run;

proc sort data=have_a nodupkey out=want;
by id;
run;


 

 

View solution in original post

2 REPLIES 2
Shmuel
Garnet | Level 18

Is that what you are looking for:

proc sort data=have out=temp nodupkey;
  by id indexdate;
run;
data want;
 set temp;
   by id indexdate;
      if not (first.indexdate and last.indexdate); /* more then 1 claim */
      if first.indexdate; /* the earliest claim */
run;

It would be easy to check if you post the wanted results.

Reeza
Super User

SQL is slightly better at this type of analysis.

 

proc sql;
create table want as
select ID, count(distinct indexdate) as num_dates
from have 
where asthma=1
group by id
having num_dates >=2;
quit;

 


@lulu3 wrote:

Hi experts,

 

I have a dataset below and I would like to select subject who had ≥2 claims on separate dates for an asthma diagnosis, and for the final dataset, keep the earliest data as the indexdate. I had run my codes below, and it seems my codes were redundant. I was wondering if someone could help me to improve my codes?

 

Thank you in advance!

 

data have;
input id $ indexdate : mmddyy10. asthma $;
format indexdate mmddyy10.;
datalines;
201 04/10/2013 1
201 04/11/2013 1
201 04/12/2013 1
202 02/21/2013 1
202 03/07/2013 1
202 03/07/2013 1
202 03/08/2013 1
202 03/09/2013 1
202 03/20/2013 1
203 02/14/2013 1
203 02/15/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
204 01/23/2013 1
205 03/18/2013 1
206 05/01/2013 1
206 05/01/2013 1
;
run;

 

 

 

proc sort data=have out=have_nodup nodupkey;
by id indexdate;
run;

proc sort data=have_nodup nouniquekeys out=have_a;
by id;
run;

proc sort data=have_a nodupkey out=want;
by id;
run;


 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 326 views
  • 0 likes
  • 3 in conversation