BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fouke_o
Fluorite | Level 6

Hi all, 

 

I have data that looks like this:

 

        Date          Date               
 ID   registration  admission    Echo     

 A    01JAN2001    01JAN2001      1         
 A    07JAN2001        .          0         
 A    15JAN2001        .          1         
 A    15MAR2001        .          1         
 B    07MAR2005        .          1       
 B    10MAR2005    10MAR2005      0         
 B    21MAR2005    21MAR2005      1          

 

 I need to flag the sum of the echo's within 42 days from admission. There can be more than one admission in my data and it can only count the echo's of this last admission. Any ideas? 

 

What I want:

 

 

 

        Date          Date                Number 
 ID   registration  admission    Echo     of Echo

 A    01JAN2001    01JAN2001      1          2
 A    07JAN2001        .          0          2
 A    15JAN2001        .          1          2
 A    15MAR2001        .          1          .
 B    07MAR2005        .          1          .
 B    10MAR2005    10MAR2005      0          1
 B    21MAR2005    21MAR2005      1          1

 

All help would be greatly appreciated!

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Assumes data is sorted by ID and admission date;

 

 data have;
 input  ID $  registration :date9. admission :date9. Echo;
 retain admission1;
 prev_id=lag(id);
 if not missing(admission) or prev_id^=id then admission1=admission;
 if registration-admission1<=42 then within_time_period=1; else within_time_period=0;
 format registration admission date9.;
 drop prev_id;
 cards;
 A    01JAN2001    01JAN2001      1         
 A    07JAN2001        .          0         
 A    15JAN2001        .          1         
 A    15MAR2001        .          1         
 B    07MAR2005        .          1       
 B    10MAR2005    10MAR2005      0         
 B    21MAR2005    21MAR2005      1   
 ;
 proc summary data=have nway;
 	class id admission1;
 	var echo/weight=within_time_period;
 	output out=sums(drop=_:) sum=number_of_echo;
 run;
 data want;
     merge have sums;
     by id admission1;
     if within_time_period=0 then number_of_echo=.;
     drop admission1 within_time_period;
 run;
--
Paige Miller

View solution in original post

12 REPLIES 12
PaigeMiller
Diamond | Level 26

Please correct your input data set. There are rows in the output that are not present in the input, and so we cannot possibly match the output unless you correct the input.

--
Paige Miller
Ksharp
Super User
'last admission' mean the last admission date within each id group ?

" within 42 days from admission" means [ admission date-42 , admission date+42 ] ?
fouke_o
Fluorite | Level 6

Hello, 

 

yes, sorry, it mean the last occurrence of admission date for this ID group. 

PaigeMiller
Diamond | Level 26

Assumes data is sorted by ID and admission date;

 

 data have;
 input  ID $  registration :date9. admission :date9. Echo;
 retain admission1;
 prev_id=lag(id);
 if not missing(admission) or prev_id^=id then admission1=admission;
 if registration-admission1<=42 then within_time_period=1; else within_time_period=0;
 format registration admission date9.;
 drop prev_id;
 cards;
 A    01JAN2001    01JAN2001      1         
 A    07JAN2001        .          0         
 A    15JAN2001        .          1         
 A    15MAR2001        .          1         
 B    07MAR2005        .          1       
 B    10MAR2005    10MAR2005      0         
 B    21MAR2005    21MAR2005      1   
 ;
 proc summary data=have nway;
 	class id admission1;
 	var echo/weight=within_time_period;
 	output out=sums(drop=_:) sum=number_of_echo;
 run;
 data want;
     merge have sums;
     by id admission1;
     if within_time_period=0 then number_of_echo=.;
     drop admission1 within_time_period;
 run;
--
Paige Miller
Ksharp
Super User
Why not include B 07MAR2005 ? since 07MAR2005 - 21MAR2005 is within 42 days .
fouke_o
Fluorite | Level 6

Because this occurrs before the admission date. 

Ksharp
Super User

OK. Assuming I understand what you mean and don't have a big table.

 

data have;
 input  ID $  registration :date9. admission :date9.  Echo;
 format registration admission  date9.;
 cards;
 A    01JAN2001    01JAN2001      1         
 A    07JAN2001        .          0         
 A    15JAN2001        .          1         
 A    15MAR2001        .          1         
 B    07MAR2005        .          1       
 B    10MAR2005    10MAR2005      0         
 B    21MAR2005    21MAR2005      1   
 ;
data have;
 set have;
 by id;
 retain _admin;
 if first.id then call missing(_admin);
 if not missing(admission) then _admin=admission;
 format _admin date9.;
run;
proc sql;
create table want as
 select *,case when .<registration-_admin<=42 then
 (select sum(echo) from have where id=a.id and 
  registration between a._admin and a._admin+42 )
  else 0 end as number_of_echo
  from have as a;
quit;
fouke_o
Fluorite | Level 6

I think I need to be more specific, apologies. 

 

I have a dataset of 10million records, where the structure looks more like this:

        Date          Date               
 ID   registration  admission    Echo     

 A    01JAN2001    01JAN2001      1         
 A    07JAN2001        .          0   
 A    15JAN2001    01JAN2001      0
 A    15JAN2001    01JAN2001      0
 A    15JAN2001        .          1         
 A    15MAR2001        .          1   
 A    18MAR2001    18MAR2001      0  
 A    18MAR2001    18MAR2001      0 
 A    19MAR2001    18MAR2001      1
 A    20MAR2001        .          0
 A    22MAR2001    18MAR2001      1       
 B    07MAR2005        .          1       
 B    10MAR2005    10MAR2005      0         
 B    21MAR2005    21MAR2005      1 
 B    23MAR2005        .          0

 

This is offcourse just an example, because 10million would be a bit much. 

Roughly each ID has over 1000 records, of which there can be several admissions. I only want to count the occurence of echo when it is within the 42 days of the last date of admission. Both codes provided here do not seem to work. 

PaigeMiller
Diamond | Level 26

And what should the output be for this revised data set?

 

ID A has three admissions dated 01JAN2001, are these counted as three different admissions, or one admission?

--
Paige Miller
fouke_o
Fluorite | Level 6

Hi, 

 

this is what I would like to have, the sum of the echo's within 42 days from the admission

        Date          Date                 Echo_
 ID   registration  admission    Echo     Count

 A    01JAN2001    01JAN2001      1         2  
 A    07JAN2001        .          0         2
 A    15JAN2001    01JAN2001      0         2
 A    15JAN2001    01JAN2001      0         2
 A    15JAN2001        .          1         2 
 A    15MAR2001        .          1         .
 A    18MAR2001    18MAR2001      0         2
 A    18MAR2001    18MAR2001      0         2
 A    19MAR2001    18MAR2001      1         2
 A    20MAR2001        .          0         2
 A    22MAR2001    18MAR2001      1         2      
 B    07MAR2005        .          1         .
 B    10MAR2005    10MAR2005      0         1
 B    21MAR2005    21MAR2005      1         1
 B    23MAR2005        .          0         1
PaigeMiller
Diamond | Level 26

As far as I can see, the solution I provided earlier ought to work on the revised data.

--
Paige Miller
fouke_o
Fluorite | Level 6

Thank you, indeed it worked!

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2291 views
  • 2 likes
  • 3 in conversation