BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JME1
Obsidian | Level 7

Hi all,

 

I'm working with claims data and there are repeated observations. I was interested in counting the frequency of arthritis in each individual. But I have 2 questions:

1. Since it's claims data, diagnosis that happened on the same day will be repeated in the data set, is there a way to ask SAS to keep only the first service date/observation if the diagnosis happened on the same day (svcdate) for each enrolid.

 

2. If I do solve question 1 above, and have distinct service dates with the 'diff' variable which is the difference of the svcdate and first_Svcdate. If the 'diff' is consecutive as it currently is for enrolid 3 - I want to count it as 1, whereas with enrolid 1 the first 3 observations I want to count as 1 and since there is a gap, I want to count the one with a diff of 5 as a second diagnosis.

 

I've shown what the data currently looks like and how I want it to look below.

 

Please I'd appreciate any help or guidance!

 

 

This is what the data looks like
data Trial;

input enrolid arthritis  svcdate      first_svcdate  diff;
cards;
        1            1          1/10/2018    1/10/2018       0
        1            1          1/10/2018    1/10/2018       0
        1            1          1/11/2018    1/10/2018       1  
        1            1          1/15/2018    1/10/2018       5
        2            1          2/15/2018    2/15/2018       0
        2            1          2/15/2018    2/15/2018       0
        2            1          2/18/2019    2/15/2018       3
        2            1          3/14/2019    2/15/2018       27
        3            1          3/15/2012    3/15/2012       0 
        3            1          3/16/2012    3/15/2012       1
        3            1          3/17/2012    3/15/2012       2
;

 

This is how I want it to look like. 

 

data Trial;

input enrolid count ;
cards;
        1          2
        2          3 

        3          1

;

 

1 ACCEPTED SOLUTION

Accepted Solutions
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @JME1 

 

Here is a solution that works without the diff variable. The counter is set to 1 In the first observation for each enrolid, and in the following observations svcdate is compared to the previous svcdate, and 1 is added to the counter only if there is a gap. The result is written after the last record for each enrollid.

 

data Trial;
input enrolid arthritis svcdate mmddyy10. first_svcdate mmddyy10. diff;
format svcdate first_svcdate date9.;
cards;
1 1 1/10/2018 1/10/2018 0
1 1 1/10/2018 1/10/2018 0
1 1 1/11/2018 1/10/2018 1  
1 1 1/15/2018 1/10/2018 5
2 1 2/15/2018 2/15/2018 0
2 1 2/15/2018 2/15/2018 0
2 1 2/18/2019 2/15/2018 3
2 1 3/14/2019 2/15/2018 27
3 1 3/15/2012 3/15/2012 0 
3 1 3/16/2012 3/15/2012 1
3 1 3/17/2012 3/15/2012 2
;
run;

data t2 (keep=enrolid count); set trial (drop=diff);
 	by enrolid;
	retain count;
	ldate = lag(svcdate);
	if first.enrolid then count = 1;
	else if svcdate-ldate > 1 then count + 1;
	if last.enrolid then output;
run;

View solution in original post

2 REPLIES 2
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @JME1 

 

Here is a solution that works without the diff variable. The counter is set to 1 In the first observation for each enrolid, and in the following observations svcdate is compared to the previous svcdate, and 1 is added to the counter only if there is a gap. The result is written after the last record for each enrollid.

 

data Trial;
input enrolid arthritis svcdate mmddyy10. first_svcdate mmddyy10. diff;
format svcdate first_svcdate date9.;
cards;
1 1 1/10/2018 1/10/2018 0
1 1 1/10/2018 1/10/2018 0
1 1 1/11/2018 1/10/2018 1  
1 1 1/15/2018 1/10/2018 5
2 1 2/15/2018 2/15/2018 0
2 1 2/15/2018 2/15/2018 0
2 1 2/18/2019 2/15/2018 3
2 1 3/14/2019 2/15/2018 27
3 1 3/15/2012 3/15/2012 0 
3 1 3/16/2012 3/15/2012 1
3 1 3/17/2012 3/15/2012 2
;
run;

data t2 (keep=enrolid count); set trial (drop=diff);
 	by enrolid;
	retain count;
	ldate = lag(svcdate);
	if first.enrolid then count = 1;
	else if svcdate-ldate > 1 then count + 1;
	if last.enrolid then output;
run;
JME1
Obsidian | Level 7

Thank you so  much @ErikLund_Jensen ! It worked perfectly. 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1485 views
  • 0 likes
  • 2 in conversation