I have a data series as follows. I'd like to come up with the sum for column A for each continuous series of C for each ID. For instance, the first continuous series in C for ID 101, the sum in A is 5; in the next continuous series in C for ID 101, the sum in A is 4. I cannot use proc sql to get the sum because I cannot group by ID and C as the number in C is not unique.My intuition is that I probably need some retain statement along with a loop to accomplish this. But I'm not sure how to make it work. I'd appreciate any help in coding it. Thanks!
ID | A | B | C |
101 | 1 | 0 | 6 |
101 | 1 | 1 | 6 |
101 | 1 | 1 | 6 |
101 | 0 | 1 | 6 |
101 | 1 | 0 | 6 |
101 | 1 | 1 | 6 |
101 | 0 | 0 | 0 |
101 | 1 | 1 | 6 |
101 | 1 | 0 | 6 |
101 | 0 | 1 | 6 |
101 | 0 | 1 | 6 |
101 | 1 | 0 | 6 |
101 | 1 | 0 | 6 |
102 | 0 | 1 | 5 |
102 | 1 | 1 | 5 |
102 | 1 | 1 | 5 |
102 | 0 | 1 | 5 |
102 | 1 | 0 | 5 |
All you might need is to replace
output;
with
if C ne 0 then output;
in @ChrisNZ's code above.
Otherwise, it does exactly what you want.
Like this?
data WANT;
set HAVE;
by ID C notsorted;
SUMA+A;
if last.C then do;
output;
SUMA=0;
end
run;
Thank you for your quick reply, Chris. But I have a question. In your code, the last.c refers to the last observation in column C for an ID (obs #13), or the last observation in Column C in a continuous series where C is not zero (obs #6). I'd like to accumulate A by the latter. Thanks!
obs | ID | A | B | C |
1 | 101 | 1 | 0 | 6 |
2 | 101 | 1 | 1 | 6 |
3 | 101 | 1 | 1 | 6 |
4 | 101 | 0 | 1 | 6 |
5 | 101 | 1 | 0 | 6 |
6 | 101 | 1 | 1 | 6 |
7 | 101 | 0 | 0 | 0 |
8 | 101 | 1 | 1 | 6 |
9 | 101 | 1 | 0 | 6 |
10 | 101 | 0 | 1 | 6 |
11 | 101 | 0 | 1 | 6 |
12 | 101 | 1 | 0 | 6 |
13 | 101 | 1 | 0 | 6 |
14 | 102 | 0 | 1 | 5 |
15 | 102 | 1 | 1 | 5 |
16 | 102 | 1 | 1 | 5 |
17 | 102 | 0 | 1 | 5 |
18 | 102 | 1 | 0 | 5 |
All you might need is to replace
output;
with
if C ne 0 then output;
in @ChrisNZ's code above.
Otherwise, it does exactly what you want.
Unsure I understand. Please provide the desired utput.
Maybe don't read the observations where C equals 0?
where C;
The change that PGStats suggested works! Thank you both!!!
I have a follow up programming question. With the codes above, I did get the output for each cumulative total of column A in a series. However, is there any way to populate this total to each continuous series. The desired output looks like the following:
obs | ID | A | B | C | SUMA |
1 | 101 | 1 | 0 | 6 | 5 |
2 | 101 | 1 | 1 | 6 | 5 |
3 | 101 | 1 | 1 | 6 | 5 |
4 | 101 | 0 | 1 | 6 | 5 |
5 | 101 | 1 | 0 | 6 | 5 |
6 | 101 | 1 | 1 | 6 | 5 |
7 | 101 | 0 | 0 | 0 | 0 |
8 | 101 | 1 | 1 | 6 | 4 |
9 | 101 | 1 | 0 | 6 | 4 |
10 | 101 | 0 | 1 | 6 | 4 |
11 | 101 | 0 | 1 | 6 | 4 |
12 | 101 | 1 | 0 | 6 | 4 |
13 | 101 | 1 | 0 | 6 | 4 |
14 | 102 | 0 | 1 | 5 | 3 |
15 | 102 | 1 | 1 | 5 | 3 |
16 | 102 | 1 | 1 | 5 | 3 |
17 | 102 | 0 | 1 | 5 | 3 |
18 | 102 | 1 | 0 | 5 | 3 |
Sure.
data WANT;
suma = 0;
do until(last.c);
set have; by id c notsorted;
suma = suma + a;
end;
do until(last.c);
set have; by id c notsorted;
output;
end;
run;
(untested)
Thank you so much for the quick reply. I tested the codes and got the following results.
obs | ID | A | B | C | SUMA |
1 | 101 | 1 | 0 | 6 | . |
2 | 101 | 1 | 1 | 6 | . |
3 | 101 | 1 | 1 | 6 | . |
4 | 101 | 0 | 1 | 6 | . |
5 | 101 | 1 | 0 | 6 | . |
6 | 101 | 1 | 1 | 6 | 5 |
7 | 101 | 0 | 0 | 0 | 0 |
8 | 101 | 1 | 1 | 6 | . |
9 | 101 | 1 | 0 | 6 | . |
10 | 101 | 0 | 1 | 6 | . |
11 | 101 | 0 | 1 | 6 | . |
12 | 101 | 1 | 0 | 6 | . |
13 | 101 | 1 | 0 | 6 | 4 |
14 | 102 | 0 | 1 | 5 | . |
15 | 102 | 1 | 1 | 5 | . |
16 | 102 | 1 | 1 | 5 | . |
17 | 102 | 0 | 1 | 5 | . |
18 | 102 | 1 | 0 | 5 | 3 |
Is there a way to fill those missing values with the accumulated total as below? Thank you!!!
obs | ID | A | B | C | SUMA |
1 | 101 | 1 | 0 | 6 | 5 |
2 | 101 | 1 | 1 | 6 | 5 |
3 | 101 | 1 | 1 | 6 | 5 |
4 | 101 | 0 | 1 | 6 | 5 |
5 | 101 | 1 | 0 | 6 | 5 |
6 | 101 | 1 | 1 | 6 | 5 |
7 | 101 | 0 | 0 | 0 | 0 |
8 | 101 | 1 | 1 | 6 | 4 |
9 | 101 | 1 | 0 | 6 | 4 |
10 | 101 | 0 | 1 | 6 | 4 |
11 | 101 | 0 | 1 | 6 | 4 |
12 | 101 | 1 | 0 | 6 | 4 |
13 | 101 | 1 | 0 | 6 | 4 |
14 | 102 | 0 | 1 | 5 | 3 |
15 | 102 | 1 | 1 | 5 | 3 |
16 | 102 | 1 | 1 | 5 | 3 |
17 | 102 | 0 | 1 | 5 | 3 |
18 | 102 | 1 | 0 | 5 | 3 |
Looks like your have dataset already contained variable suma. Try this then:
data WANT;
suma = 0;
do until(last.c);
set have(drop=suma); by id c notsorted;
suma = suma + a;
end;
do until(last.c);
set have(drop=suma); by id c notsorted;
output;
end;
run;
Yep!!! Now it works wonders!!! Thank YOU!!!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: