Hi all,
I have data that looks like this:
Date Date
ID registration admission Echo
A 01JAN2001 01JAN2001 1
A 07JAN2001 . 0
A 15JAN2001 . 1
A 15MAR2001 . 1
B 07MAR2005 . 1
B 10MAR2005 10MAR2005 0
B 21MAR2005 21MAR2005 1
I need to flag the sum of the echo's within 42 days from admission. There can be more than one admission in my data and it can only count the echo's of this last admission. Any ideas?
What I want:
Date Date Number
ID registration admission Echo of Echo
A 01JAN2001 01JAN2001 1 2
A 07JAN2001 . 0 2
A 15JAN2001 . 1 2
A 15MAR2001 . 1 .
B 07MAR2005 . 1 .
B 10MAR2005 10MAR2005 0 1
B 21MAR2005 21MAR2005 1 1
All help would be greatly appreciated!
Assumes data is sorted by ID and admission date;
data have;
input ID $ registration :date9. admission :date9. Echo;
retain admission1;
prev_id=lag(id);
if not missing(admission) or prev_id^=id then admission1=admission;
if registration-admission1<=42 then within_time_period=1; else within_time_period=0;
format registration admission date9.;
drop prev_id;
cards;
A 01JAN2001 01JAN2001 1
A 07JAN2001 . 0
A 15JAN2001 . 1
A 15MAR2001 . 1
B 07MAR2005 . 1
B 10MAR2005 10MAR2005 0
B 21MAR2005 21MAR2005 1
;
proc summary data=have nway;
class id admission1;
var echo/weight=within_time_period;
output out=sums(drop=_:) sum=number_of_echo;
run;
data want;
merge have sums;
by id admission1;
if within_time_period=0 then number_of_echo=.;
drop admission1 within_time_period;
run;
Please correct your input data set. There are rows in the output that are not present in the input, and so we cannot possibly match the output unless you correct the input.
Hello,
yes, sorry, it mean the last occurrence of admission date for this ID group.
Assumes data is sorted by ID and admission date;
data have;
input ID $ registration :date9. admission :date9. Echo;
retain admission1;
prev_id=lag(id);
if not missing(admission) or prev_id^=id then admission1=admission;
if registration-admission1<=42 then within_time_period=1; else within_time_period=0;
format registration admission date9.;
drop prev_id;
cards;
A 01JAN2001 01JAN2001 1
A 07JAN2001 . 0
A 15JAN2001 . 1
A 15MAR2001 . 1
B 07MAR2005 . 1
B 10MAR2005 10MAR2005 0
B 21MAR2005 21MAR2005 1
;
proc summary data=have nway;
class id admission1;
var echo/weight=within_time_period;
output out=sums(drop=_:) sum=number_of_echo;
run;
data want;
merge have sums;
by id admission1;
if within_time_period=0 then number_of_echo=.;
drop admission1 within_time_period;
run;
Because this occurrs before the admission date.
OK. Assuming I understand what you mean and don't have a big table.
data have;
input ID $ registration :date9. admission :date9. Echo;
format registration admission date9.;
cards;
A 01JAN2001 01JAN2001 1
A 07JAN2001 . 0
A 15JAN2001 . 1
A 15MAR2001 . 1
B 07MAR2005 . 1
B 10MAR2005 10MAR2005 0
B 21MAR2005 21MAR2005 1
;
data have;
set have;
by id;
retain _admin;
if first.id then call missing(_admin);
if not missing(admission) then _admin=admission;
format _admin date9.;
run;
proc sql;
create table want as
select *,case when .<registration-_admin<=42 then
(select sum(echo) from have where id=a.id and
registration between a._admin and a._admin+42 )
else 0 end as number_of_echo
from have as a;
quit;
I think I need to be more specific, apologies.
I have a dataset of 10million records, where the structure looks more like this:
Date Date
ID registration admission Echo
A 01JAN2001 01JAN2001 1
A 07JAN2001 . 0
A 15JAN2001 01JAN2001 0
A 15JAN2001 01JAN2001 0
A 15JAN2001 . 1
A 15MAR2001 . 1
A 18MAR2001 18MAR2001 0
A 18MAR2001 18MAR2001 0
A 19MAR2001 18MAR2001 1
A 20MAR2001 . 0
A 22MAR2001 18MAR2001 1
B 07MAR2005 . 1
B 10MAR2005 10MAR2005 0
B 21MAR2005 21MAR2005 1
B 23MAR2005 . 0
This is offcourse just an example, because 10million would be a bit much.
Roughly each ID has over 1000 records, of which there can be several admissions. I only want to count the occurence of echo when it is within the 42 days of the last date of admission. Both codes provided here do not seem to work.
And what should the output be for this revised data set?
ID A has three admissions dated 01JAN2001, are these counted as three different admissions, or one admission?
Hi,
this is what I would like to have, the sum of the echo's within 42 days from the admission
Date Date Echo_
ID registration admission Echo Count
A 01JAN2001 01JAN2001 1 2
A 07JAN2001 . 0 2
A 15JAN2001 01JAN2001 0 2
A 15JAN2001 01JAN2001 0 2
A 15JAN2001 . 1 2
A 15MAR2001 . 1 .
A 18MAR2001 18MAR2001 0 2
A 18MAR2001 18MAR2001 0 2
A 19MAR2001 18MAR2001 1 2
A 20MAR2001 . 0 2
A 22MAR2001 18MAR2001 1 2
B 07MAR2005 . 1 .
B 10MAR2005 10MAR2005 0 1
B 21MAR2005 21MAR2005 1 1
B 23MAR2005 . 0 1
As far as I can see, the solution I provided earlier ought to work on the revised data.
Thank you, indeed it worked!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.