Hi,
I have a table something like this:
Customer | Number | Amount |
A | X | 10 |
A | X | 20 |
A | Y | 70 |
B | W | 40 |
B | Z | 20 |
B | Z | 30 |
I want to create this:
Customer | Number | Amount | Total Amount |
A | Y | 70 | 100 |
B | W | 40 | 90 |
The goal is to find the highesr amount for each customer then also create a column to sum up the total for each customer. How could I do that?
I can think about the first. option but I am not sure about the syntax. I'd appreciate any support. Thanks!
Try this
data have;
input Customer $ Number $ Amount;
datalines;
A X 10
A X 20
A Y 70
B W 40
B Z 20
B Z 30
;
proc sort data = have;
by Customer Amount;
run;
data want;
set have;
by Customer Amount;
if first.Customer then total = 0;
total + Amount;
if last.Customer;
run;
Try this
data have;
input Customer $ Number $ Amount;
datalines;
A X 10
A X 20
A Y 70
B W 40
B Z 20
B Z 30
;
proc sort data = have;
by Customer Amount;
run;
data want;
set have;
by Customer Amount;
if first.Customer then total = 0;
total + Amount;
if last.Customer;
run;
Did you run my code exactly as posted?
@znhnm wrote:
Hi,
I have a table something like this:
Customer
Number
Amount
A
X
10
A
X
20
A
Y
70
B
W
40
B
Z
20
B
Z
30
I want to create this:
Customer
Number
Amount
Total Amount
A
Y
70
100
B
W
40
90
The goal is to find the highesr amount for each customer then also create a column to sum up the total for each customer. How could I do that?
I can think about the first. option but I am not sure about the syntax. I'd appreciate any support. Thanks!
One way:
proc summary data=have nway; class customer; var amount; output out=want(drop=_:) max(amount)=Amount sum(amount)=totalamount maxid(amount(Number))=Number ; run;
The procedures Means and Summary have many options for getting statistics of multiple variables by groups.
They also include, as shown, ways to get the value of an identification variable (or multiple variables) that would be the identification value of a Max or Min value (which appears to be desired but was not even mentioned in your problem description).
The Drop=_: data set option removes to two other variables _type_ and _freq_ that would indicate the combination of class or by variables and the count of records used. The Nway option results in only the combination of all Class or By variables. The procedure if you did not include the option would have an additional row of data that included the summary variables for all records as well.
In my opinion, PROC SUMMARY is definitely the way to go, instead of using a DATA step. All of the necessary calculations have been built into PROC SUMMARY, debugged and tested, and handle missing values properly; whereas in a data step, you would have to do the debugging and testing and handling of missing values yourself.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.