DATA Step, Macro, Functions and more

sumif and a countif in sas

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

sumif and a countif in sas

Hi , 

 

I have a table like this below , from this i have to do two things 

1) i have to count all the one (1) that is present for each variable 

2) i have to add the all probability scores of those variable where the value is one 

 

i have given here a dummy  data , in the original data set which i am using is having around 300 variables like that and for all the variable i have to do the same task

 


data have;
infile cards dlm='09'x;
input age1-age7 probability;
cards;
0 0 1 0 0 0 0 0.037316109
0 0 1 0 0 0 0 0.012645846
0 0 1 0 0 0 0 0.030745011
1 0 0 0 0 0 0 0.109232968
0 1 0 0 0 0 0 0.121137734
0 0 0 1 0 0 0 0.064422562
0 0 1 0 0 0 0 0.030978301
0 0 1 0 0 0 0 0.030990311
0 0 1 0 0 0 0 0.15763914
1 0 0 0 0 0 0 0.05447492
1 0 0 0 0 0 0 0.032326732
0 0 1 0 0 0 0 0.100191375
0 0 1 0 0 0 0 0.015438986
1 0 0 0 0 0 0 0.026323937
0 1 0 0 0 0 0 0.069376926
0 1 0 0 0 0 0 0.030861443
0 0 0 0 1 0 0 0.019696911
;
run;

 

Want

 

Var namesCount of oneSum of Probability
age_140.222358557
age_230.221376102
age_380.415945079
age_410.064422562
age_510.019696911
age_600
age_700


  

 


Accepted Solutions
Solution
3 weeks ago
Super User
Posts: 6,629

Re: sumif and a countif in sas

Posted in reply to soham_sas

I think @PaigeMiller is moving in the right direction here.  However, with 300 variables, I would switch things around a little:

 

proc summary data=have;

var age1-age300;

weight probability;

output out=pr sum=;

run;

 

That gives you the total probability for each AGE variable.  Getting the count of the 1's is simpler ... get rid of the WEIGHT statement.

View solution in original post


All Replies
Respected Advisor
Posts: 2,812

Re: sumif and a countif in sas

Posted in reply to soham_sas

UNTESTED CODE

 

proc summary data=have;
    var probability;
    weight age1;
    output out=stats1 sum=sum_probability sumwgt=n_age1;
run;
/* repeat for age2 through age 7 */

 

--
Paige Miller
Solution
3 weeks ago
Super User
Posts: 6,629

Re: sumif and a countif in sas

Posted in reply to soham_sas

I think @PaigeMiller is moving in the right direction here.  However, with 300 variables, I would switch things around a little:

 

proc summary data=have;

var age1-age300;

weight probability;

output out=pr sum=;

run;

 

That gives you the total probability for each AGE variable.  Getting the count of the 1's is simpler ... get rid of the WEIGHT statement.

Frequent Contributor
Posts: 102

Re: sumif and a countif in sas

Posted in reply to Astounding

Hi @Astounding and @PaigeMiller, if you have got a couple of mins to spare, would you mind a small explanation for the weight statement plz? like what it does and why use it.

 

The doc is rather confusing me

Respected Advisor
Posts: 2,812

Re: sumif and a countif in sas

It computes (in this case) a weighted sum instead of the ordinary sum. Using @Astounding's code, the sum of the zeros and ones are weighted by the probability. So, if age1 is 0, this contributes 0*probability to the sum; if age1 is 1, this contributes 1*probability to the sum.

--
Paige Miller
Esteemed Advisor
Posts: 5,479

Re: sumif and a countif in sas

Posted in reply to soham_sas

Try this:

 

data temp;
set have;
age = whichn(1, of age1-age7);
run;

proc sql;
create table want as
select
    age,
    count(*) as n,
    sum(probability) as probability
from temp
group by age;
select * from want;
quit;
PG
PROC Star
Posts: 2,316

Re: sumif and a countif in sas

Posted in reply to soham_sas

Not the best method for 300 variables, but to answer your question about SUMIF, you can do something like this is in SAS to achieve SUMIF's logic:

data WANT;
  set HAVE end=LASTOBS;
  AGE_1+AGE1*PROBABILITY;
  if LASTOBS;
  putlog AGE_1=;
run;

AGE_1=0.222358557

The logic is:

AGE_1 + (any test you want goes here and 0 or 1 will be returned) * PROBABILITY;

 

 

Occasional Contributor
Posts: 15

Re: sumif and a countif in sas

Posted in reply to soham_sas
Please try below solution.
DATA TEMP;
infile cards;
input age1-age7 probability;
cards;
0 0 1 0 0 0 0 0.037316109
0 0 1 0 0 0 0 0.012645846
0 0 1 0 0 0 0 0.030745011
1 0 0 0 0 0 0 0.109232968
0 1 0 0 0 0 0 0.121137734
0 0 0 1 0 0 0 0.064422562
0 0 1 0 0 0 0 0.030978301
0 0 1 0 0 0 0 0.030990311
0 0 1 0 0 0 0 0.15763914
1 0 0 0 0 0 0 0.05447492
1 0 0 0 0 0 0 0.032326732
0 0 1 0 0 0 0 0.100191375
0 0 1 0 0 0 0 0.015438986
1 0 0 0 0 0 0 0.026323937
0 1 0 0 0 0 0 0.069376926
0 1 0 0 0 0 0 0.030861443
0 0 0 0 1 0 0 0.019696911
;
run;




PROC SORT DATA=TEMP;
BY probability;

PROC TRANSPOSE DATA=TEMP OUT=TEMP_1 NAME= Age_Col;
BY probability;
VAR Age:;
RUN;


PROC SQL;

SELECT
Age_col,
SUM(COL1) as Total_Count,
SUM(
CASE WHEN COL1=1 THEN probability
ELSE 0 END
) as Sum_Prob
FROM
TEMP_1
GROUP BY
Age_col
;
QUIT;




Respected Advisor
Posts: 2,812

Re: sumif and a countif in sas

Posted in reply to mahesh146

I just want to throw out a little bit of philosophy that you can adopt, or ignore, or argue with, about why I choose a PROC instead of a data step to do these calculations, which are really just addition and multiplication. (And I want to thank @Astounding for coming up with a much better solution that I had along similar lines)

 

My philosophy is this: SAS has already written code to compute sums and averages and weighted sums and weighted averages in several different PROCs. They have debugged it, and thousands/millions/tens of millions of applications later, we know the calculations inside these PROCs are correct. So I believe it is to my benefit to use the work that SAS has already done to compute sums and averages (and other similar statistics) — the coding is simpler, and if I write my own code to do this I can make errors

 

In the end, that's why we have SAS, isn't it? They do all the work to create these algorithms so we don't have to. 

 

Anyway, there's nothing wrong with doing it via data step or PROC SQL, if that's your choice. It's just not my choice to do that level of coding creating an algorithm to do sums and averages.

 

As an amusing side story, I once took over some code someone else had written. THere were several complicated data steps of about 15-20 lines of code, and it took me about 30 minutes to figure out what these data steps were doing. These complicated data steps were just adding up zeros and ones. Hence, another reason to use built in SAS procedures to do basic statistics is that the next person who has to make use of your code will understand quickly what is being done.

--
Paige Miller
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 179 views
  • 5 likes
  • 7 in conversation