Dear all,
I have one dataset as below.
Subj score
a 1
a
a 3
a 4
b
b 4
b
b 5
my requirement is that i want to calculate average of score only for subject where number of non missing values equal to 3 or greatee than .If number of non missing value is less than 3 then will set the mean as . and add the same as observation after last record for particular subj.
Below is output
a 1
a
a 3
a 4
average of a "value"
b
b 4
b
b 5
average of b .
The calculation can be done in SQL. But you also show an example of a report layout, which is probably better to do separately, like in PROC REPORT (and there are others that can you guide better than me).
For SQL, use agregate filtering with HAVING
i.e.
group by id
having sum(case when missing(score) then 0 else 1 end) ge 3
Hello,
This is producing what you want. But I completely agree with LinusH, ... you should create a report instead of 'enriching' your dataset with an extra line below each subject containing his/her/its average.
So I don't recommend using the last data step. You don't want to spoil a dataset by adding extra rows containing an average while the other rows contain raw scores.
data work.ab;
infile datalines truncover;
input Subj $ score;
datalines;
a 1
a
a 3
a 4
b
b 4
b
b 5
;
run;
proc means data=work.ab nway noprint;
class Subj;
var score;
output out=work.absummary mean= n= nmiss= / autoname;
run;
PROC SQL noprint;
create table work.abmerge as
select a.* , b.*
from work.ab a
, work.absummary b
where a.Subj = b.Subj
order by a.Subj , a.Score;
quit;
data work.abmerge;
length Subj $ 25;
set work.abmerge;
by Subj;
if last.Subj then do;
output;
if score_N >=3 then score=Score_Mean;
else score=.;
Subj="Avg. of above Subject";
output;
end;
if index(Subj,'Avg')=0 then output;
run;
/* end of program */
Cheers,
Koen
Echo with @LinusH and @sbxkoenk, this should be a report request. Here is a one-pass data step alternative:
data work.ab;
infile datalines truncover;
length Subj $ 50;
input Subj score;
datalines;
a 1
a
a 3
a 4
b
b 4
b
b 5
;
run;
data want;
do until (last.subj);
set ab;
by subj notsorted;
if score>=3 then
do;
_ct+1;
_tot+score;
end;
output;
end;
subj='Avg. of above Subject';
score=_tot/_ct;
output;
call missing(of _:);
drop _:;
run;
Want make a REPORT ?
data have;
input Subj $ score;
cards;
a 1
a .
a 3
a 4
b .
b 4
b .
b 5
;
run;
proc report data=have nowd ;
column Subj score ;
define Subj/order;
define score/analysis mean;
compute score;
if not missing(score.mean) then n+1;
if not missing(_BREAK_) then do;
if n lt 4 then call missing(score.mean);
n=0;
end;
endcomp;
break after Subj/summarize;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.