data test;
input var1 $ var2 var3;
datalines;
abc 4 1
abc 4 0
abc 4 2
abc 4 0
abc 4 2
xyz 2 1
xyz 2 0
xyz 2 1
xyz 2 3
;
run;
I need to calculate var3 sum by selecting # of observations based on var2.
For example: abc sum should be 1+0+2+0 and xyz should sum should be 1+0
I tried to use first.variable command with do loop and it is giving me an error.
Any other thoughts?
data test;
input var1 $ var2 var3;
datalines;
abc 4 1
abc 4 0
abc 4 2
abc 4 0
abc 4 2
xyz 2 1
xyz 2 0
xyz 2 1
xyz 2 3
;
run;
I need to calculate var3 sum by selecting # of observations based on var2.
For example: abc sum should be 1+0+2+0 and xyz should sum should be 1+0
I tried to use first.variable command with do loop and it is giving me an error.
Any other thoughts?
I don't think this is a data step process.
PROC MEANS, SQL are more appropriate.
You may need to play with the GROUP BY to get what you want but hard to see. For example I don't know why you're excluding 1/3, the last two records for XYZ.
proc sql;
create table want as
select * , sum(var2) as var3
from test
group by var1, var2;
quit;
@dht115 wrote:
data test;
input var1 $ var2 var3;
datalines;
abc 4 1
abc 4 0
abc 4 2
abc 4 0
abc 4 2
xyz 2 1
xyz 2 0
xyz 2 1
xyz 2 3
;
run;
I need to calculate var3 sum by selecting # of observations based on var2.
For example: abc sum should be 1+0+2+0 and xyz should sum should be 1+0
I tried to use first.variable command with do loop and it is giving me an error.
Any other thoughts?
EDIT: https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
abc sum should be 1+0+2+0
... and xyz should sum should be 1+0
I am not following this part of your explanation. Should not the abc sum be 1+0+2+0+2? Should not the xyx sum be 1+0+1+3?
no - because var2 value for xyz variable is 2. Therefore we need to select first two observation for xyz.
I tried to use use first.variable option. I am able to get expected result by producing sum for all observation by group.
I am able to produce the result but not able to get the desire result by selecting number of the observation based on var2.
DATA test1;
set test;
by var1;
if first.var1 then do;
final = var3;
end;
else do;
retain final;
final = var3 + final;
end;
run;
I choose first.variable option because I also need to calculate other function as well beside sum.
I am trying to do following logic and it is not providing me desire output
DATA test1;
set test;
by var1;
if first.var1 then do;
final = var3;
end;
else do i = 1 to var2;
retain final;
final = var3 + final;
end;
run;
It really helps to say exactly how "the sum is based on Var2".
I am going to guess that you mean is that the "Var2 has the number of records to sum over for Var3".
Which brings up the question of exactly what your expected output would look like. One row for Var1='abc'? the current 5 rows but the sum appearing only on the 4th? Last? All the ABC records?
One way that may get started:
data possible; set test; by notsorted var1; retain loopcount sumtot; if first.var1 then do; loopcount=0; sumtot=0; end; loopcount+1; sumtot= sum(sumtot,(loopcount le var2)*var3); if last.var1; drop loopcount; run;
What do you expect to happen if Var2 is bigger than than the number of records for the values of Var1?
How would missing values in Var3 be handled?
@dht115 wrote:
data test;
input var1 $ var2 var3;
datalines;
abc 4 1
abc 4 0
abc 4 2
abc 4 0
abc 4 2
xyz 2 1
xyz 2 0
xyz 2 1
xyz 2 3
;
run;
I need to calculate var3 sum by selecting # of observations based on var2.
For example: abc sum should be 1+0+2+0 and xyz should sum should be 1+0
I tried to use first.variable command with do loop and it is giving me an error.
Any other thoughts?
Just count how many observations you have processed for this value of your ID variable (VAR1).
data want ;
do obs=1 by 1 until (last.var1);
set test ;
by var1 ;
if obs <= var2 then want=sum(want,var3);
end;
keep var1 want;
run;
Result:
Obs var1 want 1 abc 3 2 xyz 1
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.