Hello! I am trying to create the "want" column.
venue | fruit | number | want |
market | apple | 3 | 12 |
market | orange | 2 | 9 |
market | banana | 7 | 7 |
store | apple | 4 | 11 |
store | orange | 1 | 7 |
store | banana | 6 | 6 |
For the first entry, it first sums the number of apples, oranges, and bananas (for the market venue).
For the second entry it sums the oranges and bananas (for the market venue).
And the third entry sums the bananas (for the market venue).
This process then repeats for each type of venue.
Thanks in advance for the help!
Simple double DO loop:
data want;
do until (last.venue);
set have;
by venue;
want = sum(want,number);
end;
do until (last.venue);
set have;
by venue;
output;
want = sum(want,-number);
end;
run;
What code have you tried?
It looks like if you reverse the sort order, WANT is a running sum of NUMBER.
Try double DoW-loop, of course I assume data are properly sorted:
data Have;
input venue $ fruit $ number want;
cards;
market apple 3 12
market orange 2 9
market banana 7 7
store apple 4 11
store orange 1 7
store banana 6 6
run;
data Want;
sum = 0;
do _N_ = 1 by 1 until(last.venue);
set have(drop = want);
by venue;
sum + number;
end;
do _N_ = 0 to _N_ - 1;
set have(drop = want);
ln = lag(number);
if _N_ then want = want - ln;
else want = sum;
output;
end;
keep venue fruit number want;
run;
Bart
Good approach. Consider simplifying the bottom loop:
want = sum;
output;
sum = sum - number;
Reverse running sum. Weird way to do it.
1. Either reverse the data, calculate a running total and then reverse again.
2. Calculate an overall total by venue, merge it in and subtract as you go down the rows.
FYI - since order seems to matter here, I would have expected an order variable of some sort? Is it just alphabetical for fruit? Different orders will obviously change the resuts so something to think about.
proc sql;
create table want as
select *, sum(number) as total, calculated total - number as want
from have
group by venue;
quit;
@Cruiser13 wrote:
Hello! I am trying to create the "want" column.
venue fruit number want market apple 3 12 market orange 2 9 market banana 7 7 store apple 4 11 store orange 1 7 store banana 6 6
For the first entry, it first sums the number of apples, oranges, and bananas (for the market venue).
For the second entry it sums the oranges and bananas (for the market venue).
And the third entry sums the bananas (for the market venue).
This process then repeats for each type of venue.
Thanks in advance for the help!
If
data have;
input venue $ fruit $ number x;
datalines;
market apple 3 12
market orange 2 9
market banana 7 7
store apple 4 11
store orange 1 7
store banana 6 6
run;
data want;
set have;
by venue notsorted;
want+number; /*Make the total for the venue */
/* At end of venue, reread and output records*/
if last.venue then do until (last.venue);
set have;
by venue notsorted;
output;
want=want-number;
end;
run;
I read in your desired resulting values as variable X, and replicated by calculating variable WANT.
Each venue is read twice. Note that the first pass increments WANT, and at the end of the first pass, the second pass is programmed in a DO group, initiated by the statement.
if last.venue then do until (last.venue);
In this statement the first "last.venue" is tracking the first pass of the venue, but the second "last.venue" is associated with the second pass - i.e. the second SET and BY statements.
Eddited addtion. This program also solves the more general problem as stated in your subject line, i.e. the first obs sums the entire venue, the 2nd obs sums entire venue minus the first obs. And the i'th obs sum the entire venue except for obs 1 through i-1. You then presumablhy wouldn't have to worry about the order or number of fruits.
Simple double DO loop:
data want;
do until (last.venue);
set have;
by venue;
want = sum(want,number);
end;
do until (last.venue);
set have;
by venue;
output;
want = sum(want,-number);
end;
run;
You will have to read the data twice, e.g.:
data want;
want=0;
do until(last.venue);
set have;
by venue;
want+number;
end;
do until(last.venue);
set have;
by venue;
output;
want+-number; /* subtract after outputting */
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.