Solved: How to calculate with n of the dataset

Sathish_jammy · Posted 09-01-2021 07:27 AM

I have a dataset and need to calculate a value using the formula given below. Kindly suggest code to do it.

data Have;
input ID Val Type$;
cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 11.2 abnormal
run;

Formula: count(abnormal) / sum(Val of the abnormal rows) * 1000, i.e. 4 / 51.9 * 1000

Tom · Posted 09-01-2021 08:01 AM

What if there are no abnormal types?

proc sql;
 select sum(type='abnormal') as N
      , sum(case when (type='abnormal') then val else . end) as D
      , calculated N/calculated D *1000 as want
   from have 
 ;
quit;

       N         D      want
----------------------------
       4      51.9  77.07129


FreelanceReinh · Posted 09-01-2021 07:46 AM

Hello @Sathish_jammy,

With PROC SQL you can do it in one step:

proc sql;
select divide(1000, m)
from (select mean(val) as m from have where type='abnormal');
quit;

mkeintz · Posted 09-01-2021 08:54 AM

@Tom wrote:

What if there are no abnormal types?

And what if there is a missing VAL for an abnormal type?
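To make that concern concrete, here is a minimal sketch of how the query above could be adjusted so that N only counts abnormal rows that actually contribute to D. The dataset name HAVE_MISS, the column N_all, and the missing value in row 8 are made up for illustration; this is one possible guard, not necessarily what the original posters intended.

```sas
/* Hypothetical test data: row 8 is abnormal but has a missing Val */
data have_miss;
  input ID Val Type $;
  cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 .    abnormal
;
run;

proc sql;
  select sum(type='abnormal')                      as N_all  /* also counts row 8 */
       , sum(type='abnormal' and not missing(val)) as N      /* only rows that feed D */
       , sum(case when type='abnormal' then val else . end) as D
       , divide(calculated N, calculated D) * 1000 as want   /* DIVIDE avoids a division-by-zero note */
    from have_miss
  ;
quit;
```

With this test data N_all is 4 while N is 3 and D is 40.7, so the two counts give different ratios. Which one is "right" depends on how a missing VAL should be treated, which the original question does not say.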

Kurt_Bremser · Posted 09-01-2021 08:01 AM

Do you mean

(count (abnormal) / sum(Val)) * 1000

or

count (abnormal) / (sum(Val) * 1000)

?
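For reference, a small sketch of how the two readings differ, using the N and D values from the posted data (4 and 51.9). The variable names are illustrative only. SAS gives * and / equal priority and evaluates them left to right, so the unparenthesized form matches the first reading.

```sas
data _null_;
  n = 4;       /* count of abnormal rows in the posted data */
  d = 51.9;    /* sum of Val over those rows */
  as_written = n / d * 1000;     /* evaluated left to right */
  grouped_1  = (n / d) * 1000;   /* same value as as_written */
  grouped_2  = n / (d * 1000);   /* a million times smaller */
  put as_written= grouped_1= grouped_2=;   /* write all three to the log */
run;
```

The example in the question (4/51.9*1000, roughly 77.07) corresponds to the first grouping.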

mkeintz · Posted 09-01-2021 08:50 AM

What you are asking for is 1000 times the reciprocal of the mean of VAL for the subgroup type='abnormal'.

If your task expands to generating such a calculation for many subgroups/many variables, or if you might simultaneously want this for a hierarchy of subgroups, you might consider using PROC SUMMARY followed by a data step. Let's say you want this not only for "abnormal" but also for "normal" and for the entire population, and maybe you have a second variable, NEWVAL:

data Have;
input ID Val Type$;
newval=val + 3*uniform(12531366);
cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 11.2 abnormal
run;

proc summary data=have ;
  class type;
  var  val newval;
  output out=need (where=(_stat_='MEAN'));
run;

data want;
  set need (drop=_:);
  val_result=1000*(1/val);
  newval_result=1000*(1/newval);
run;

Run it and take a look at the final and intermediate result datasets.

Ksharp · Posted 09-01-2021 09:47 AM

data Have;
input ID Val Type$;
cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 11.2 abnormal
;
run;


proc sql;
 select count(val) as N
      , sum(val) as D
      , calculated N/calculated D *1000 as want
   from have 
   where type='abnormal'
 ;
quit;

PaigeMiller · Posted 09-01-2021 09:52 AM

@Ksharp wrote:

data Have;
input ID Val Type$;
cards;
1 10.5 normal
2 12.4 abnormal
3 16.6 normal
4 10.8 normal
5 13.9 abnormal
6 17.1 normal
7 14.4 abnormal
8 11.2 abnormal
;
run;


proc sql;
 select count(*) as N
      , sum(val) as D
      , calculated N/calculated D *1000 as want
   from have 
   where type='abnormal'
 ;
quit;

For other data sets, which may have missing values of variable VAL, the above code gives the wrong answer, while code using PROC MEANS or PROC SUMMARY still gives the right answer when missing values are present.
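A minimal sketch of the PROC MEANS route, assuming the HAVE dataset from the question; the output dataset name STATS is made up for illustration. The N statistic counts only nonmissing values of VAL, so numerator and denominator are based on the same rows.

```sas
proc means data=have noprint;
  where type='abnormal';          /* restrict to the abnormal subgroup */
  var val;
  output out=stats n=N sum=D;     /* N = number of nonmissing Val, D = their sum */
run;

data want;
  set stats (keep=N D);
  /* compute the ratio only when the denominator is usable */
  if D not in (., 0) then want = N / D * 1000;
run;

proc print data=want noobs;
run;
```

For the posted data this should reproduce N=4 and D=51.9 from the accepted solution.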

--
Paige Miller

Ksharp · Posted 09-01-2021 09:56 AM

Ha Paige, I just edited my code to fix the problem you are talking about.
