I think this is a simple one.
This is what I have:
date | Type | Count |
Jan-21 | A | 100 |
Jan-21 | B | 200 |
Feb-21 | A | 300 |
Feb-21 | B | 400 |
This is what I want (an extra Type C row for each date, with Count equal to the sum of A and B):
date | Type | Count |
Jan-21 | A | 100 |
Jan-21 | B | 200 |
Jan-21 | C | 300 |
Feb-21 | A | 300 |
Feb-21 | B | 400 |
Feb-21 | C | 700 |
Thanks
How about this?
There is no standard format that displays the date as Jan-21, though, so with monyy6. it will be shown as JAN21.
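data have;
  input date :monyy6. Type $ Count;
  format date monyy6.;
  cards;
Jan-21 A 100
Jan-21 B 200
Feb-21 A 300
Feb-21 B 400
;
run;
/* BY-group processing requires the data to be sorted by date */
proc sort data=have;
  by date;
run;
data want;
  set have;
  by date;
  retain sum;
  if first.date then sum=0;   /* reset the running total at the start of each date */
  sum=sum+count;
  output;                     /* write the original row */
  if last.date then do;       /* after the last row of a date, add the total as Type C */
    Type='C';
    Count=sum;
    output;
  end;
  drop sum;
run;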
If every date has exactly two observations as in your example, and if the data are sorted by date, then:
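data want;
  set have;
  by date;
  output;                     /* write the original row */
  type='C';
  count+lag(count);           /* add the previous row's count; a missing lag is treated as 0 */
  if last.date then output;   /* write the total row once per date */
run;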
The "by date" statement not only tells SAS to expect the data to be sorted, but also generates two dummy variables: first.date and last.date, indicating whether the observation in hand is the first or last one for a given date.
The first output statement writes the current record unchanged. After that, type is set to 'C' and the count of the previous observation is added to count. The second output statement writes that modified record only for the last observation of each date.
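To see what first.date and last.date actually hold, a small sketch like the one below writes them to the log for every observation. It rebuilds the sample data from the question so that it can run on its own; the data set name have and the use of data _null_ are only choices made for this illustration.

```sas
/* Sample data from the question, already in date order */
data have;
  input date :monyy6. Type $ Count;
  format date monyy6.;
  cards;
Jan-21 A 100
Jan-21 B 200
Feb-21 A 300
Feb-21 B 400
;
run;

/* Write the BY-group flags to the log; no output data set is created */
data _null_;
  set have;
  by date;
  put date= Type= Count= first.date= last.date=;
run;
```

With this data, first.date is 1 on the two A rows and last.date is 1 on the two B rows, which is why the total row is written only after each B row.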
A slightly different take, reading in a DO loop:
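data want;
  do until (last.date);         /* one pass of the loop per row, one data step iteration per date */
    set have;
    by date;
    output;                     /* write the original row */
    _sum = sum(_sum,count);     /* _sum starts as missing for each date; sum() ignores missing */
  end;
  type = "C";
  count = _sum;
  output;                       /* write the total row for the date */
  drop _sum;
run;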
Note that SQL is very poor at processing sequences; it is aimed more at groups. So the tool of choice for your task is the data step (Maxim 14).
data have;
  input date :monyy6. Type $ Count;
  format date monyy6.;
  cards;
Jan-21 A 100
Jan-21 B 200
Feb-21 A 300
Feb-21 B 400
;
proc sql;
  create table want as
  select * from have
  union
  select date,'C',sum(count) from have group by date;
quit;
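One thing to keep in mind with the query above: UNION removes duplicate rows, so two identical rows in have would collapse into one. If that matters for your data, a possible variant is UNION ALL with an explicit ORDER BY. This is only a sketch; it assumes the have data set created above, and the column list and sort order are my own choices, not something from the original reply.

```sas
proc sql;
  create table want as
  /* original rows, duplicates kept */
  select date, Type, Count from have
  union all
  /* one total row per date, labelled as Type C */
  select date, 'C' as Type, sum(Count) as Count
    from have
    group by date
  /* UNION ALL does not sort, so request the order explicitly */
  order by date, Type;
quit;
```

Since date is stored as a numeric SAS date, ordering by it puts Jan-21 before Feb-21, and within each date the C row sorts after A and B.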