BookmarkSubscribeRSS Feed
dht115
Calcite | Level 5

data test;

input var1 $ var2 var3;

datalines;

abc 4 1 

abc 4 0

abc 4 2

abc 4 0 

abc 4 2

xyz 2 1 

xyz 2 0

xyz 2 1

xyz 2 3

;

run;

 

I need to calculate var3 sum by selecting # of observations based on var2. 

For example: abc sum should be 1+0+2+0  and xyz should sum should be 1+0

 

I tried to use first.variable command with do loop and it is giving me an error. 

 

Any other thoughts? 

8 REPLIES 8
Reeza
Super User
Did you have a question?
dht115
Calcite | Level 5

data test;

input var1 $ var2 var3;

datalines;

abc 4 1 

abc 4 0

abc 4 2

abc 4 0 

abc 4 2

xyz 2 1 

xyz 2 0

xyz 2 1

xyz 2 3

;

run;

 

I need to calculate var3 sum by selecting # of observations based on var2. 

For example: abc sum should be 1+0+2+0  and xyz should sum should be 1+0

 

I tried to use first.variable command with do loop and it is giving me an error. 

 

Any other thoughts? 

Reeza
Super User

I don't think this is a data step process. 

PROC MEANS, SQL are more appropriate. 

 

You may need to play with the GROUP BY to get what you want but hard to see. For example I don't know why you're excluding 1/3, the last two records for XYZ.

 

proc sql;
create table want as
select * , sum(var2) as var3
from test
group by var1, var2;
quit;

 


@dht115 wrote:

data test;

input var1 $ var2 var3;

datalines;

abc 4 1 

abc 4 0

abc 4 2

abc 4 0 

abc 4 2

xyz 2 1 

xyz 2 0

xyz 2 1

xyz 2 3

;

run;

 

I need to calculate var3 sum by selecting # of observations based on var2. 

For example: abc sum should be 1+0+2+0  and xyz should sum should be 1+0

 

I tried to use first.variable command with do loop and it is giving me an error. 

 

Any other thoughts? 


EDIT: https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas

PaigeMiller
Diamond | Level 26

abc sum should be 1+0+2+0

... and xyz should sum should be 1+0

 

I am not following this part of your explanation. Should not the abc sum be 1+0+2+0+2? Should not the xyx sum be 1+0+1+3?

--
Paige Miller
dht115
Calcite | Level 5

no - because var2 value for xyz variable is 2. Therefore we need to select first two observation for xyz.

 

I tried to use use first.variable option. I am able to get expected result by producing sum for all observation by group. 

I am able to produce the result but not able to get the desire result by selecting number of the observation based on var2. 

 

DATA test1;

set test;

by var1;

if first.var1 then do;

final = var3;

end;

else do;

retain final;

final = var3 + final;

end;

run;

 

I choose first.variable option because I also need to calculate other function as well beside sum. 

 

I am trying to do following logic and it is not providing me desire output

 

DATA test1;

set test;

by var1;

if first.var1 then do;

final = var3;

end;

else do i = 1 to var2;

retain final;

final = var3 + final;

end;

run;

 

ballardw
Super User

It really helps to say exactly how "the sum is based on Var2".

I am going to guess that you mean is that the "Var2 has the number of records to sum over for Var3".

Which brings up the question of exactly what your expected output would look like. One row for Var1='abc'? the current 5 rows but the sum appearing only on the 4th? Last? All the ABC records?

One way that may get started:

data possible;
  set test;
  by notsorted var1;
  retain loopcount sumtot;
  if first.var1 then do;
      loopcount=0;
      sumtot=0;
  end;
  loopcount+1;
  sumtot= sum(sumtot,(loopcount le var2)*var3);
  if last.var1;
  drop loopcount;
run;

 

What do you expect to happen if Var2 is bigger than than the number of records for the values of Var1?

How would missing values in Var3 be handled?

 

 


@dht115 wrote:

data test;

input var1 $ var2 var3;

datalines;

abc 4 1 

abc 4 0

abc 4 2

abc 4 0 

abc 4 2

xyz 2 1 

xyz 2 0

xyz 2 1

xyz 2 3

;

run;

 

I need to calculate var3 sum by selecting # of observations based on var2. 

For example: abc sum should be 1+0+2+0  and xyz should sum should be 1+0

 

I tried to use first.variable command with do loop and it is giving me an error. 

 

Any other thoughts? 


 

Reeza
Super User
You win the guessing games today!
Tom
Super User Tom
Super User

Just count how many observations you have processed for this value of your ID variable (VAR1).

data want ;
  do obs=1 by 1 until (last.var1);
    set test ;
    by var1 ;
    if obs <= var2 then want=sum(want,var3);
  end;
  keep var1 want;
run;

Result:

Obs    var1    want

 1     abc       3
 2     xyz       1

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 901 views
  • 2 likes
  • 5 in conversation