Hi,
I have two columns. One is a column of numerical variables identifying the category of the asset (first number is main category, second number is subcategory eg. 31 is 3 for equity and 1 for common equity) the other column consists of numerical variables describing the total amount invested in the asset.
Example of dataset:
Id total value
31 10000
23 9930
37 2892
26 6833
Desired result:
Category 2 Category 3
16763 12892
I need to sum every main category and present the data in a table.
So far I have created subsets of the data using the data process as seen below:
data data (keep=asset rename=(asset=asset_category));
set original_dataset;
if 71<=Assetnumber<=79 then; else delete;
I end up with 10 different data sets.
How do I sum + combine the data the smartest way?
Is there a more convienient way to handle the entire dataset?
regards,
Thorius
I would consider using a format, something like this:
Proc format;
value Asset
0-<10='Junk'
10-<20='Reasonable '
20-<30='Sort of good'
/* etc. for other categories */
other='Unknown'
;
run;
proc summary data=have nway missing;
class ID;
format ID Asset.;
var value;
output out=want(drop=_TYPE_) sum=;
run;
Always a good idea to post sample data. So this code is untested:
proc sql;
create table want as
select substr(put(Assetnumber, 2.), 1, 1) as Main_Category
,sum(amount) as Amount_Sum
from original_dataset
group by calculated Main_Category;
quit;
Ok. Changed my response above 🙂
"the smartest way?" - start by posting some test data in the form of a datastep:
Then show what you want out from that data.
This code uses your sample data and creates the desired result:
data original_dataset;
input Id total_value;
datalines;
31 10000
23 9930
37 2892
26 6833
;
proc sql;
create table want as
select substr(put(Id, 2.), 1, 1) as Main_Category
,sum(total_value) as Amount_Sum
from original_dataset
group by calculated Main_Category;
quit;
@Thorius_Prime wrote:
Added a data sample now 🙂
We need a data sample that matches the words you used to describe the problem. 😞
Create a new group variable:
data intermediate;
set original;
groupvar = substr(put(asset,z2.),1,1);
run;
proc sql;
create table want as
select groupvar, sum(amount) as amount
from intermediate
group by groupvar;
quit;
Edit: changed the creation of groupvar.
Something like this (although I agree with the others, you haven't shown us a dataset that matches the problem you describe in words)
proc summary data=have;
class id;
var total_value;
output out=want sum=;
run;
I would consider using a format, something like this:
Proc format;
value Asset
0-<10='Junk'
10-<20='Reasonable '
20-<30='Sort of good'
/* etc. for other categories */
other='Unknown'
;
run;
proc summary data=have nway missing;
class ID;
format ID Asset.;
var value;
output out=want(drop=_TYPE_) sum=;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.