Hello!
I am working on a project, and to my understanding, I need a way to sum values down a column, by an id group: however my variable needs to sum only values from visits that were achieved from dates above it. Additionally, I will need to account for another variable and subtract it when applicable.
For example, say I am a doctor, with data on all of the visits from patients. If a patient needs stitches, I total the number of stitches needed for that visit. The total number of stitches should total the number of stitches received in the previous visit. For these data, I will need to account for stitches removed. Thus, if stitches were removed, they should be subtracted from the total number of stitches. From the given variables ID, visit_order, num_stitches, and num_removed, I need to create total_stitches, num_never_removed, and patientlevel_num_never_removed. Where the num_never_removed = total_stitches - num_removed.
ID
visit_order
num_stitches
num_removed
total_stitches (to this point)
num_never_removed (0s could replace missings here)
patientlevel_num_never_removed
X
1
4
.
4
.
2
X
2
.
2
4
2 (= 4-2)
2
X
3
3
.
5 (= 4+3 - 2))
.
2
X
4
.
3
5
2 (= 5-3)
2
X
5
.
.
2 (= 5+. -2)
.
2
A
1
5
.
5
.
0
A
2
.
.
5
.
0
A
3
3
.
8
.
0
B
1
2
.
2
.
3
B
2
4
.
6
.
3
B
3
.
3
6
3 (=6-3)
3
B
4
12
.
15 (6+12 - 3)
.
3
and so on
data have;
input ID $ visit_order num_stitches num_removed;
datalines;
X 1 4 .
X 2 . 2
X 3 3 .
X 4 . 3
X 5 . .
A 1 5 .
A 2 . .
A 3 3 .
B 1 2 .
B 2 4 .
B 3 . 3
B 4 . 12
;
run;
I attempted dabbling with the lag function, but I could not get a solution that worked properly for each scenario. Thank you in advance for any help you can provide. I have been working on this problem for hours and have not been able to find a solution.
... View more