Hello ,
I'm new to SAS EG , and i'm trying to perform some data cleaning.
I want to sum a column depending on the value of two other columns.
Here is a scope of the data I have :
CODE | CLIENT | AMOUNT |
M0013R | 201 | 300 |
M0013R | 201 | 500 |
M0013R | 201 | 600 |
M0013R | 200 | 900 |
M0013R | 200 | 300 |
M0013R | 201 | 400 |
M0013R | 200 | 200 |
T0198K | 200 | 700 |
T0198K | 200 | 200 |
U8910E | 201 | 600 |
U8910E | 200 | 300 |
P9823D | 200 | 400 |
P9823D | 200 | 1800 |
P9823D | 201 | 600 |
P9823D | 200 | 600 |
The Code is an alpha-numerical column ,
The CLIENT column can have only 2 values : 200 or 201 .
I want to vertically sum the amount column when the code is the same and the client is the same.
The data I want should look like this :
CODE | CLIENT | AMOUNT |
M0013R | 201 | 1800 |
M0013R | 200 | 1400 |
T0198K | 200 | 900 |
U8910E | 201 | 600 |
U8910E | 200 | 300 |
P9823D | 200 | 2800 |
P9823D | 201 | 600 |
Thank you in advance for your responses.
As far as I can see it is the same as the OP had in the post.
*Creating the dummy data;
data have;
input CODE $ CLIENT AMOUNT;
datalines;
M0013R 201 300
M0013R 201 500
M0013R 201 600
M0013R 200 900
M0013R 200 300
M0013R 201 400
M0013R 200 200
T0198K 200 700
T0198K 200 200
U8910E 201 600
U8910E 200 300
P9823D 200 400
P9823D 200 1800
P9823D 201 600
P9823D 200 600
;;;
run;
*This is the code that the Query Builder gives you when following my instructions;
%_eg_conditional_dropds(WORK.QUERY_FOR_HAVE);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_HAVE AS
SELECT DISTINCT t1.CODE,
t1.CLIENT,
/* SUM_of_AMOUNT */
(SUM(t1.AMOUNT)) AS SUM_of_AMOUNT
FROM WORK.HAVE t1
GROUP BY t1.CODE,
t1.CLIENT
ORDER BY t1.CODE,
t1.CLIENT;
QUIT;
Output data (same data, but different order):
CODE CLIENT SUM_of_AMOUNT
M0013R 200 1400
M0013R 201 1800
P9823D 200 2800
P9823D 201 600
T0198K 200 900
U8910E 200 300
U8910E 201 600
This is a standard PROC MEANS - CLASS or BY statements can be used to group your variables for analysis.
proc means data=have noprint nway;
class code client;
var amount;
output out=want sum=total_amount;
run;
proc print data=want;
run;
@LoPez_Diaz wrote:
Hello ,
I'm new to SAS EG , and i'm trying to perform some data cleaning.
I want to sum a column depending on the value of two other columns.
Here is a scope of the data I have :
CODE CLIENT AMOUNT M0013R 201 300 M0013R 201 500 M0013R 201 600 M0013R 200 900 M0013R 200 300 M0013R 201 400 M0013R 200 200 T0198K 200 700 T0198K 200 200 U8910E 201 600 U8910E 200 300 P9823D 200 400 P9823D 200 1800 P9823D 201 600 P9823D 200 600
The Code is an alpha-numerical column ,
The CLIENT column can have only 2 values : 200 or 201 .
I want to vertically sum the amount column when the code is the same and the client is the same.
The data I want should look like this :
CODE CLIENT AMOUNT M0013R 201 1800 M0013R 200 1400 T0198K 200 900 U8910E 201 600 U8910E 200 300 P9823D 200 2800 P9823D 201 600
Thank you in advance for your responses.
In EG Select:
Tasks | Describe | Summary Statistics
Drag to the Task Roles:
Amount to the Analysis Variables
Code and Client to the Classification Variables
Click Run
Or you can open the Query Builder, select the three variables Code, Client and Amount.
Then on the row for Amount you click on the third column and select sum. And on the bottom of the dialog you see the Select distinct rows only.
You might want to click on the tab Sort Data to sort them on Code and Client.
@heffo wrote:
Or you can open the Query Builder, select the three variables Code, Client and Amount.
Then on the row for Amount you click on the third column and select sum. And on the bottom of the dialog you see the Select distinct rows only.
You might want to click on the tab Sort Data to sort them on Code and Client.
That will not sum amount which the OP is doing.
As far as I can see it is the same as the OP had in the post.
*Creating the dummy data;
data have;
input CODE $ CLIENT AMOUNT;
datalines;
M0013R 201 300
M0013R 201 500
M0013R 201 600
M0013R 200 900
M0013R 200 300
M0013R 201 400
M0013R 200 200
T0198K 200 700
T0198K 200 200
U8910E 201 600
U8910E 200 300
P9823D 200 400
P9823D 200 1800
P9823D 201 600
P9823D 200 600
;;;
run;
*This is the code that the Query Builder gives you when following my instructions;
%_eg_conditional_dropds(WORK.QUERY_FOR_HAVE);
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_HAVE AS
SELECT DISTINCT t1.CODE,
t1.CLIENT,
/* SUM_of_AMOUNT */
(SUM(t1.AMOUNT)) AS SUM_of_AMOUNT
FROM WORK.HAVE t1
GROUP BY t1.CODE,
t1.CLIENT
ORDER BY t1.CODE,
t1.CLIENT;
QUIT;
Output data (same data, but different order):
CODE CLIENT SUM_of_AMOUNT
M0013R 200 1400
M0013R 201 1800
P9823D 200 2800
P9823D 201 600
T0198K 200 900
U8910E 200 300
U8910E 201 600
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.