- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I've the data (INPUT) as follows.
Now I want to create the target variable 'Total' using the similar formula as below. I've got the requirement using SUMIF function from excel and when I simplify it, I end with the calculation as shown below. I knew that we don't have SUMIF in SAS but I was wondering what will be the equal to it in SAS and also not sure how to achive the this calculation in one data step.
Final values of 'Total' should be,
825.666
971.823
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So it would be this:
proc sql;
select
sum(G) * sum(H) / sqrt(sum(G) ** 2 + sum(I) ** 2 + sum(G) * sum(I))
as wanted
from have;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data test;
/** I want to add a and c only if b > 0 **/
a=1;
b=2;
c=3;
sum=(a+c)*(b>0);
put sum=;
a=1;
b=-1;
c=3;
sum=(a+c)*(b>0);
put sum=;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You might as well go ahead and use
sum=sum(a,c)*(b>0);
The value of sum will be 0 when b is less than or equal to zero, which includes missing. Does that match your need?
If you are dividing anything by that sum you should check that the total is not zero to avoid the "divide by zero" errors.
Since it appears your "overall formula" did not involve counts of things then I suspect having the occasional 0 isn't going to be a problem.
or use
if b> 0 then sum= sum(a,c);
You likely want to consider what to do if one of A or C is missing. In SAS A+C is missing if either of the two variables has a missing value. If in that case you want the value of what either variable may have you should use the SUM function, which I think is going to be more like the Excel result.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
are you looking for such functionality like this:
data have;
input x y;
cards;
1 10
5 13
3 12
4 10
2 11
;
run;
data want;
set have end = EOF;
partial_sum + ifn(y>10, x, 0); /* agregate for all rows if condition is satisfy */
if EOF; /* at the end of dataset */
sum_square = partial_sum * partial_sum; /* generate final value */
output;
run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How about instead of providing "code" that has little bearing on how SAS works show an actual formulae given the shown data.
Since you do not tell us which column should be A or B or what ever, then the Excel formula is even more problematic. I would guess that your formula references more cells that you provided in the "example" since I see a $D, unless you actually intend to calculate something using the value of and ID variable?
I am not seeing any value to compare the cells to so am questioning the use of SUMIF at all. It looks like SUM would be the function
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@ballardw @yabwon Actual formula to derive the target variable (Total) is, shown below. Is there a way that you can help me with the sample data as I provided in the Initial post.
=SUMIF(INPUT!$A:$A;A2;INPUT!$G:$G)*SUMIF(INPUT!$A:$A;A2;INPUT!$H:$H)/SQRT(SUMIF(INPUT!$A:$A;A2;INPUT!$G:$G)^2+SUMIF(INPUT!$A:$A;A2;INPUT!$I:$I)^2+SUMIF(INPUT!$A:$A;A2;INPUT!$G:$G)*SUMIF(INPUT!$A:$A;A2;INPUT!$I:$I))
A refers to the field 'ID'
G refers to the field 'HNSLT_GR'
H refers to the field 'H_GR'
I refers to the field 'HSLT_GR'
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@David_Billa wrote:
@ballardw @yabwon Actual formula to derive the target variable (Total) is, shown below. Is there a way that you can help me with the sample data as I provided in the Initial post.
=SUMIF(INPUT!$A:$A;A2;INPUT!$G:$G)*SUMIF(INPUT!$A:$A;A2;INPUT!$H:$H)/SQRT(SUMIF(INPUT!$A:$A;A2;INPUT!$G:$G)^2+SUMIF(INPUT!$A:$A;A2;INPUT!$I:$I)^2+SUMIF(INPUT!$A:$A;A2;INPUT!$G:$G)*SUMIF(INPUT!$A:$A;A2;INPUT!$I:$I))A refers to the field 'ID'
G refers to the field 'HNSLT_GR'
H refers to the field 'H_GR'
I refers to the field 'HSLT_GR'
That's a start. Now, what exactly are we "summing"? I don't speak Excel that well and with only two rows of data you should be able to write out the equation such as (123 + 456) / (456) or what ever using the values you show.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why don't you formulate your issue in plain language, like "when condition a is met, do calculation b, otherwise do calculation c"?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If I simplify the Excel formula, then I get this.
=(sum of G)*(sum of H)/SQRT((sum of G)^2+(sum of I)^2+(sum of G)*(sum of I))
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What do you mean by "sum of"? Is "sum of G" a vertical summation of column G over all observations?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So it would be this:
proc sql;
select
sum(G) * sum(H) / sqrt(sum(G) ** 2 + sum(I) ** 2 + sum(G) * sum(I))
as wanted
from have;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Kurt_Bremser Now I've simplified the excel formula and updated the initial post. I would like to know how to derive the target variables in one data step.
- Tags:
- ku