SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Summation with constraints

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Summation with constraints

Hi,

 
I have a dataset that looks like this:
 
Patient_IDAdmission_dateComorbidity
A14/05/20111
B21/04/20110
C30/06/20112
C23/09/20113
C22/06/20121
D25/04/20112
D26/04/20121
E01/12/20111
E04/12/20112
E29/04/20123
E30/04/20132
 
It's a large dataset of patients with different admission dates and different recorded
comorbidity scores. I am interested in the sum (patient level) of their scores for the period of 365 days before admission. 
If the patient was not admitted before, then the score is 0. If the previous admission is more than 365 days ago, the score is also 0. If there is more than 1 admission in the previous year, it's the sum of the scores.
 
In the table above, the results would look like this:
 
Patient_IDAdmission_dateComorbidityPast_Score
A14/05/201110
B21/04/201100
C30/06/201120
C23/09/201132
C22/06/201215
D25/04/201120
D26/04/201112
E01/12/201110
E04/12/201121
E29/04/201233
E30/04/201320
I tried many different approaches (proc means, changing the dataset to wide,...) but 
I am struggling to find a working solution.
 

I am using SAS Enterprise Guide, version 7.1

 
Any help will be greatly appreciated!
 
Many thanks.
 

Accepted Solutions
Solution
‎11-26-2016 08:42 PM
Super User
Posts: 17,912

Re: Summation with constraints

SQL is a good choice for this type of question. 

 

You join the table to itself, within period of a year. 

 

Untested:

 

Proc SQL;

create table want as

select a.*, sum(b.comorbidity) as scores

from have as a

left join have as b

on b.date between a.date-1 and a.date-365

group by a.id, a.date, a.comorbidity; 

quit; 

View solution in original post


All Replies
Solution
‎11-26-2016 08:42 PM
Super User
Posts: 17,912

Re: Summation with constraints

SQL is a good choice for this type of question. 

 

You join the table to itself, within period of a year. 

 

Untested:

 

Proc SQL;

create table want as

select a.*, sum(b.comorbidity) as scores

from have as a

left join have as b

on b.date between a.date-1 and a.date-365

group by a.id, a.date, a.comorbidity; 

quit; 

Respected Advisor
Posts: 3,901

Re: Summation with constraints

Using @Reeza's code may be also wrap a coalesce() function around the sum() so that you get 0 and not missing for cases where there is no previous admission.

 

Proc SQL;
create table want as
select a.*, coalesce(sum(b.comorbidity),0) as scores
from have as a
left join have as b
on b.date between a.date-1 and a.date-365
group by a.id, a.date, a.comorbidity;
quit;

New Contributor
Posts: 4

Re: Summation with constraints

Thank you very much @Reeza and @Patrick. I really appreciate your help.

 

The code works great, the only problem arises when I keep some other variables in the table - in that case proc sql duplicates some observations.

 

For example:

 

data have;
input id date ddmmyy8. comorbidity dummy ;
format date date8.;
cards;
1 11072014 1 1
1 09022016 1 1
1 17032016 1 1
1 11012016 1 1
2 11012016 1 1
2 08022016 1 1
3 19012016 1 1
4 10012016 1 1

;
run;

 

proc sql;
create table want as
select a.*, coalesce(sum(b.comorbidity),0) as scores
from have as a
left join have as b
on a.id=b.id

and b.date between a.date-1 and a.date-365
group by a.id, a.date, a.comorbidity;
quit;

 

 

 

 

New Contributor
Posts: 4

Re: Summation with constraints

Sorry, I mean I end up with 9 instead of original 8 observations.

 

Thank you!

 

Best,

K.

Respected Advisor
Posts: 3,901

Re: Summation with constraints

use either a DISTINCT in the select clause or even better add the variable to the Group By clause.

New Contributor
Posts: 4

Re: Summation with constraints

Thanks, it's works perfectly now!

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 434 views
  • 3 likes
  • 3 in conversation