For every PHASE in every TYPE I'm trying to calculate the AVG and STD based on the last 2 quarters. This is my current code:
data have;
infile datalines truncover;
input type $ quarter phase counter percent;
datalines;
BB-B 202001 1 94 13
BB-B 202001 2 91 2
BB-B 202001 3 50 2
BB-B 202002 1 79 11
BB-B 202002 2 20 4
BB-B 202002 3 88 4
BB-B 202003 1 17 1
BB-B 202003 2 2 3
BB-B 202003 3 100 3
AA-C 202001 1 67 6
AA-C 202001 2 49 4
AA-C 202001 3 68 4
AA-C 202002 1 23 6
AA-C 202002 2 39 4
AA-C 202003 3 98 4
AA-C 202003 1 7 6
AA-C 202003 2 9 4
AA-C 202003 3 48 4
;
data want;
set have;
by type phase notsorted;
lcounter=lag(counter);
if last.phase then do
avg=mean(lcounter,counter);
std=std(lcounter,counter);
output;
end;
keep type phase avg std;
run;
This is my current output:
This is my desired output:
type phase AVG SDeviation
BB-B 1 (79+17)/2 (Sdeviation here)
BB-B 2 (20+2)/2 (Sdeviation here)
BB-B 3 (88/100)/2 (Sdeviation here)
AA-C 1 (23+7)/2 (Sdeviation here)
AA-C 2 (39+9)/2 (Sdeviation here)
AA-C 3 (98+48)/2 (Sdeviation here)
data have;
infile datalines truncover;
input type $ quarter phase counter percent;
datalines;
BB-B 202001 1 94 13
BB-B 202001 2 91 2
BB-B 202001 3 50 2
BB-B 202002 1 79 11
BB-B 202002 2 20 4
BB-B 202002 3 88 4
BB-B 202003 1 17 1
BB-B 202003 2 2 3
BB-B 202003 3 100 3
AA-C 202001 1 67 6
AA-C 202001 2 49 4
AA-C 202001 3 68 4
AA-C 202002 1 23 6
AA-C 202002 2 39 4
AA-C 202003 3 98 4
AA-C 202003 1 7 6
AA-C 202003 2 9 4
AA-C 202003 3 48 4
;
proc sql;
create table want as
select type,phase,mean(counter) as mean,std(counter) as std
from (
select * from have group by type having quarter=max(quarter) or quarter=max(quarter)-1
)
group by type,phase;
quit;
Your program is written to expect all the records for a given type/phase to be consecutive (even if not in ascending or descending sort order). But your data is not so grouped. Sort your data by type/phase first. Then run your code.
data have;
infile datalines truncover;
input type $ quarter phase counter percent;
datalines;
BB-B 202001 1 94 13
BB-B 202001 2 91 2
BB-B 202001 3 50 2
BB-B 202002 1 79 11
BB-B 202002 2 20 4
BB-B 202002 3 88 4
BB-B 202003 1 17 1
BB-B 202003 2 2 3
BB-B 202003 3 100 3
AA-C 202001 1 67 6
AA-C 202001 2 49 4
AA-C 202001 3 68 4
AA-C 202002 1 23 6
AA-C 202002 2 39 4
AA-C 202003 3 98 4
AA-C 202003 1 7 6
AA-C 202003 2 9 4
AA-C 202003 3 48 4
;
proc sql;
create table want as
select type,phase,mean(counter) as mean,std(counter) as std
from (
select * from have group by type having quarter=max(quarter) or quarter=max(quarter)-1
)
group by type,phase;
quit;
@Ksharp I used below code but it still only grabs the last 2 quarters instead of the last 4:
proc sql;
create table want as
select type,phase,mean(counter) as mean,std(counter) as std
from (
select * from outputdatasetname group by type having quarter=max(quarter) or quarter=max(quarter)-1 or quarter=max(quarter)-2 or quarter=max(quarter)-3
)
group by type,phase;
quit;
Do yourself a favor from now on ... don't try to write your own mean and standard deviation (and median and minimum and maximum and a whole lot of other things) code in a DATA step (except for extremely unusual circumstances, which this is not). SAS has already programmed these calculations, tested it, debugged it, and proved that they work in a gazillion real world situations. You are essentially re-inventing the wheel. You (or your company or university) is paying for all this work that SAS has already done, so use that work SAS has already done.
Use PROC SUMMARY or PROC SQL (I strongly prefer PROC SUMMARY)
proc summary data=have(where=(quarter>=202002)) nway;
class type phase;
var counter;
output out=want mean=avg stddev=std;
run;
@Andalusia wrote:
What if I want to do it for the last 4 quarters instead of the last 2 as I have right now?
Extract the last X observations for every type/phase group first, then run PROC SUMMARY:
%let n_quarters = 2;
proc sort data=have;
by type phase quarter;
run;
data extract;
counta = 0;
do until (last.phase);
set have;
by type phase;
counta + 1;
end;
countb = 0;
do until (last.phase);
set have;
by type phase;
countb + 1;
if countb gt (counta - &n_quarters.) then output;
end;
drop counta countb;
run;
This is a slight expansion of the solution I provided here.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.