I have a dataset to calculate by the given formulae. Kindly suggest a code to resolve it.
data Have;
input ID Val Type$;
cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 11.2 abnormal
run;
Formula: count (abnormal)/sum(Val) * 1000 : i.e: 4/51.9*1000
What if there are no abnormal types?
proc sql;
select sum(type='abnormal') as N
, sum(case when (type='abnormal') then val else . end) as D
, calculated N/calculated D *1000 as want
from have
;
quit;
N D want ---------------------------- 4 51.9 77.07129
Hello @Sathish_jammy,
With PROC SQL you can do it in one step:
proc sql;
select divide(1000, m)
from (select mean(val) as m from have where type='abnormal');
quit;
What if there are no abnormal types?
proc sql;
select sum(type='abnormal') as N
, sum(case when (type='abnormal') then val else . end) as D
, calculated N/calculated D *1000 as want
from have
;
quit;
N D want ---------------------------- 4 51.9 77.07129
@Tom wrote:
What if there are no abnormal types?
And what if there is a missing VAL for an abnormal type?
Do you mean
(count (abnormal) / sum(Val)) * 1000
or
count (abnormal) / (sum(Val) * 1000)
?
What you are asking for is the reciprocal of the mean of VAL for the subgroup type='abnormal'.
If your task expands to generating such a calculation for many subgroups/many variables, or if you might simultaneously want this for a hierarchy of subgroups, you might consider using PROC SUMMARY followed by a data step. Let's say you want this not only for "abnormal" but also for "normal" and for the entire population, and maybe you have a second variable, NEWVAL:
data Have;
input ID Val Type$;
newval=val + 3*uniform(12531366);
cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 11.2 abnormal
run;
proc summary data=have ;
class type;
var val newval;
output out=need (where=(_stat_='MEAN'));
run;
data want;
set need (drop=_:);
val_result=1000*(1/val);
newval_result=1000*(1/newval);
run;
Run it and take a look at the final and intermediate result datasets.
data Have; input ID Val Type$; cards; 1 10.5 normal 2 12.4 abnormal 3 16.6 normal 4 10.8 normal 5 13.9 abnormal 6 17.1 normal 7 14.4 abnormal 8 11.2 abnormal ; run; proc sql; select count(val) as N , sum(val) as D , calculated N/calculated D *1000 as want from have where type='abnormal' ; quit;
@Ksharp wrote:
data Have; input ID Val Type$; cards; 1 10.5 normal 2 12.4 abnormal 3 16.6 normal 4 10.8 normal 5 13.9 abnormal 6 17.1 normal 7 14.4 abnormal 8 11.2 abnormal ; run; proc sql; select count(*) as N , sum(val) as D , calculated N/calculated D *1000 as want from have where type='abnormal' ; quit;
For other data sets, which may have missing values of variable VAL, the above code gives the wrong answer, while code using PROC MEANS or PROC SUMMARY still gives the right answer when missing values are present.
Ha Paige, I just edited my code to fix the problem you are talking about .
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.