I have a data series as follows. I'd like to come up with the sum for column A for each continuous series of C for each ID. For instance, the first continuous series in C for ID 101, the sum in A is 5; in the next continuous series in C for ID 101, the sum in A is 4. I cannot use proc sql to get the sum because I cannot group by ID and C as the number in C is not unique.My intuition is that I probably need some retain statement along with a loop to accomplish this. But I'm not sure how to make it work. I'd appreciate any help in coding it. Thanks!
| ID | A | B | C |
| 101 | 1 | 0 | 6 |
| 101 | 1 | 1 | 6 |
| 101 | 1 | 1 | 6 |
| 101 | 0 | 1 | 6 |
| 101 | 1 | 0 | 6 |
| 101 | 1 | 1 | 6 |
| 101 | 0 | 0 | 0 |
| 101 | 1 | 1 | 6 |
| 101 | 1 | 0 | 6 |
| 101 | 0 | 1 | 6 |
| 101 | 0 | 1 | 6 |
| 101 | 1 | 0 | 6 |
| 101 | 1 | 0 | 6 |
| 102 | 0 | 1 | 5 |
| 102 | 1 | 1 | 5 |
| 102 | 1 | 1 | 5 |
| 102 | 0 | 1 | 5 |
| 102 | 1 | 0 | 5 |
All you might need is to replace
output;
with
if C ne 0 then output;
in @ChrisNZ's code above.
Otherwise, it does exactly what you want.
Like this?
data WANT;
set HAVE;
by ID C notsorted;
SUMA+A;
if last.C then do;
output;
SUMA=0;
end
run;
Thank you for your quick reply, Chris. But I have a question. In your code, the last.c refers to the last observation in column C for an ID (obs #13), or the last observation in Column C in a continuous series where C is not zero (obs #6). I'd like to accumulate A by the latter. Thanks!
| obs | ID | A | B | C |
| 1 | 101 | 1 | 0 | 6 |
| 2 | 101 | 1 | 1 | 6 |
| 3 | 101 | 1 | 1 | 6 |
| 4 | 101 | 0 | 1 | 6 |
| 5 | 101 | 1 | 0 | 6 |
| 6 | 101 | 1 | 1 | 6 |
| 7 | 101 | 0 | 0 | 0 |
| 8 | 101 | 1 | 1 | 6 |
| 9 | 101 | 1 | 0 | 6 |
| 10 | 101 | 0 | 1 | 6 |
| 11 | 101 | 0 | 1 | 6 |
| 12 | 101 | 1 | 0 | 6 |
| 13 | 101 | 1 | 0 | 6 |
| 14 | 102 | 0 | 1 | 5 |
| 15 | 102 | 1 | 1 | 5 |
| 16 | 102 | 1 | 1 | 5 |
| 17 | 102 | 0 | 1 | 5 |
| 18 | 102 | 1 | 0 | 5 |
All you might need is to replace
output;
with
if C ne 0 then output;
in @ChrisNZ's code above.
Otherwise, it does exactly what you want.
Unsure I understand. Please provide the desired utput.
Maybe don't read the observations where C equals 0?
where C;
The change that PGStats suggested works! Thank you both!!!
I have a follow up programming question. With the codes above, I did get the output for each cumulative total of column A in a series. However, is there any way to populate this total to each continuous series. The desired output looks like the following:
| obs | ID | A | B | C | SUMA |
| 1 | 101 | 1 | 0 | 6 | 5 |
| 2 | 101 | 1 | 1 | 6 | 5 |
| 3 | 101 | 1 | 1 | 6 | 5 |
| 4 | 101 | 0 | 1 | 6 | 5 |
| 5 | 101 | 1 | 0 | 6 | 5 |
| 6 | 101 | 1 | 1 | 6 | 5 |
| 7 | 101 | 0 | 0 | 0 | 0 |
| 8 | 101 | 1 | 1 | 6 | 4 |
| 9 | 101 | 1 | 0 | 6 | 4 |
| 10 | 101 | 0 | 1 | 6 | 4 |
| 11 | 101 | 0 | 1 | 6 | 4 |
| 12 | 101 | 1 | 0 | 6 | 4 |
| 13 | 101 | 1 | 0 | 6 | 4 |
| 14 | 102 | 0 | 1 | 5 | 3 |
| 15 | 102 | 1 | 1 | 5 | 3 |
| 16 | 102 | 1 | 1 | 5 | 3 |
| 17 | 102 | 0 | 1 | 5 | 3 |
| 18 | 102 | 1 | 0 | 5 | 3 |
Sure.
data WANT;
suma = 0;
do until(last.c);
set have; by id c notsorted;
suma = suma + a;
end;
do until(last.c);
set have; by id c notsorted;
output;
end;
run;
(untested)
Thank you so much for the quick reply. I tested the codes and got the following results.
| obs | ID | A | B | C | SUMA |
| 1 | 101 | 1 | 0 | 6 | . |
| 2 | 101 | 1 | 1 | 6 | . |
| 3 | 101 | 1 | 1 | 6 | . |
| 4 | 101 | 0 | 1 | 6 | . |
| 5 | 101 | 1 | 0 | 6 | . |
| 6 | 101 | 1 | 1 | 6 | 5 |
| 7 | 101 | 0 | 0 | 0 | 0 |
| 8 | 101 | 1 | 1 | 6 | . |
| 9 | 101 | 1 | 0 | 6 | . |
| 10 | 101 | 0 | 1 | 6 | . |
| 11 | 101 | 0 | 1 | 6 | . |
| 12 | 101 | 1 | 0 | 6 | . |
| 13 | 101 | 1 | 0 | 6 | 4 |
| 14 | 102 | 0 | 1 | 5 | . |
| 15 | 102 | 1 | 1 | 5 | . |
| 16 | 102 | 1 | 1 | 5 | . |
| 17 | 102 | 0 | 1 | 5 | . |
| 18 | 102 | 1 | 0 | 5 | 3 |
Is there a way to fill those missing values with the accumulated total as below? Thank you!!!
| obs | ID | A | B | C | SUMA |
| 1 | 101 | 1 | 0 | 6 | 5 |
| 2 | 101 | 1 | 1 | 6 | 5 |
| 3 | 101 | 1 | 1 | 6 | 5 |
| 4 | 101 | 0 | 1 | 6 | 5 |
| 5 | 101 | 1 | 0 | 6 | 5 |
| 6 | 101 | 1 | 1 | 6 | 5 |
| 7 | 101 | 0 | 0 | 0 | 0 |
| 8 | 101 | 1 | 1 | 6 | 4 |
| 9 | 101 | 1 | 0 | 6 | 4 |
| 10 | 101 | 0 | 1 | 6 | 4 |
| 11 | 101 | 0 | 1 | 6 | 4 |
| 12 | 101 | 1 | 0 | 6 | 4 |
| 13 | 101 | 1 | 0 | 6 | 4 |
| 14 | 102 | 0 | 1 | 5 | 3 |
| 15 | 102 | 1 | 1 | 5 | 3 |
| 16 | 102 | 1 | 1 | 5 | 3 |
| 17 | 102 | 0 | 1 | 5 | 3 |
| 18 | 102 | 1 | 0 | 5 | 3 |
Looks like your have dataset already contained variable suma. Try this then:
data WANT;
suma = 0;
do until(last.c);
set have(drop=suma); by id c notsorted;
suma = suma + a;
end;
do until(last.c);
set have(drop=suma); by id c notsorted;
output;
end;
run;
Yep!!! Now it works wonders!!! Thank YOU!!!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.