I am trying to add the amount rows to get the total amount per ID, if the ID term and type match.
Here is what I have:
ID | term | amount | type |
1828111 | 2019FA | -859 | 110131 |
1828111 | 2019FA | 859 | 110131 |
999999 | 2019FA | -3108 | 170125 |
999999 | 2019FA | 3108 | 170125 |
8888888 | 2019SP | -14550 | 130133 |
8888888 | 2019SP | 14750 | 130133 |
Here is what I would like
ID | term | amount | type |
1828111 | 2019FA | 0 | 110131 |
999999 | 2019FA | 0 | 170125 |
8888888 | 2019SP | 200 | 130133 |
Looks like a job for proc summary if you want a new data set:
For something like you show:
Proc summary data= have nway; class id term type; var amount; output out=work.want (drop=_type_ _freq_) sum=; run;
The class statement tells the procedure which variables are used to define groups. Caution: if you have variable that has missing values that you need to treat as a "group" use the /missing option on the class statement.
Var statement tells SAS what variable you want to summarize.
The output statement says that you want to create output, names the data set and the SUM= will apply the Sum statistic to all variables on the Var statement and the name will be the same as the original variable. If you want multiple statistics for a variable you'll need to provide variable names or use the /autoname option to append the statistic name to the variables.
The odd looking NWAY on the Proc statement is one way of controlling the output so that only the full cross of the class variables is in the output. Without that option you would get additional summary lines in the data for 1) sum over all 2) sum of just the Id, term or type variables alone, 3) sum of the id & term, id & type and term & type as well as the sum will all three. The Drop option removes the variable _type_ which is an indicator of which of those combinations of variables were used for the particular output record in the data, and the _freq_ which is how many records were used for each output record.
Use Proc Print to display the results as desired.
You may not believe it now but this concept of doing multiple groups summaries is very powerful. There are a number of other options for creating specific groups as well. The procedure will in effect sort the data to create the groups if it is not sorted but there can be some issues with very large numbers of levels in multiple class variables.
If you want a report, these are things that people read, then perhaps:
Proc report data=have; column id term type amount; define id /group; define term/ group; define type/ group; define amount /sum; run;
Proc report builds from left to right. So for Type to be used for grouping it would have to appear to the left of Amount in column order.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.