12-30-2015 06:31 AM
I have one dataset as below.
my requirement is that i want to calculate average of score only for subject where number of non missing values equal to 3 or greatee than .If number of non missing value is less than 3 then will set the mean as . and add the same as observation after last record for particular subj.
Below is output
average of a "value"
average of b .
12-30-2015 06:49 AM
The calculation can be done in SQL. But you also show an example of a report layout, which is probably better to do separately, like in PROC REPORT (and there are others that can you guide better than me).
For SQL, use agregate filtering with HAVING
group by id
having sum(case when missing(score) then 0 else 1 end) ge 3
12-30-2015 07:21 AM
This is producing what you want. But I completely agree with LinusH, ... you should create a report instead of 'enriching' your dataset with an extra line below each subject containing his/her/its average.
So I don't recommend using the last data step. You don't want to spoil a dataset by adding extra rows containing an average while the other rows contain raw scores.
data work.ab; infile datalines truncover; input Subj $ score; datalines; a 1 a a 3 a 4 b b 4 b b 5 ; run; proc means data=work.ab nway noprint; class Subj; var score; output out=work.absummary mean= n= nmiss= / autoname; run; PROC SQL noprint; create table work.abmerge as select a.* , b.* from work.ab a , work.absummary b where a.Subj = b.Subj order by a.Subj , a.Score; quit; data work.abmerge; length Subj $ 25; set work.abmerge; by Subj; if last.Subj then do; output; if score_N >=3 then score=Score_Mean; else score=.; Subj="Avg. of above Subject"; output; end; if index(Subj,'Avg')=0 then output; run; /* end of program */
12-30-2015 10:06 AM
data work.ab; infile datalines truncover; length Subj $ 50; input Subj score; datalines; a 1 a a 3 a 4 b b 4 b b 5 ; run; data want; do until (last.subj); set ab; by subj notsorted; if score>=3 then do; _ct+1; _tot+score; end; output; end; subj='Avg. of above Subject'; score=_tot/_ct; output; call missing(of _:); drop _:; run;
12-30-2015 09:55 PM
Want make a REPORT ?
data have; input Subj $ score; cards; a 1 a . a 3 a 4 b . b 4 b . b 5 ; run; proc report data=have nowd ; column Subj score ; define Subj/order; define score/analysis mean; compute score; if not missing(score.mean) then n+1; if not missing(_BREAK_) then do; if n lt 4 then call missing(score.mean); n=0; end; endcomp; break after Subj/summarize; run;