Hi,
I have two columns. One is a column of numerical variables identifying the category of the asset (first number is main category, second number is subcategory eg. 31 is 3 for equity and 1 for common equity) the other column consists of numerical variables describing the total amount invested in the asset.
Example of dataset:
Id total value
31 10000
23 9930
37 2892
26 6833
Desired result:
Category 2 Category 3
16763 12892
I need to sum every main category and present the data in a table.
So far I have created subsets of the data using the data process as seen below:
data data (keep=asset rename=(asset=asset_category));
set original_dataset;
if 71<=Assetnumber<=79 then; else delete;
I end up with 10 different data sets.
How do I sum + combine the data the smartest way?
Is there a more convienient way to handle the entire dataset?
regards,
Thorius
I would consider using a format, something like this:
Proc format;
value Asset
0-<10='Junk'
10-<20='Reasonable '
20-<30='Sort of good'
/* etc. for other categories */
other='Unknown'
;
run;
proc summary data=have nway missing;
class ID;
format ID Asset.;
var value;
output out=want(drop=_TYPE_) sum=;
run;
Always a good idea to post sample data. So this code is untested:
proc sql;
create table want as
select substr(put(Assetnumber, 2.), 1, 1) as Main_Category
,sum(amount) as Amount_Sum
from original_dataset
group by calculated Main_Category;
quit;
Ok. Changed my response above 🙂
"the smartest way?" - start by posting some test data in the form of a datastep:
Then show what you want out from that data.
This code uses your sample data and creates the desired result:
data original_dataset;
input Id total_value;
datalines;
31 10000
23 9930
37 2892
26 6833
;
proc sql;
create table want as
select substr(put(Id, 2.), 1, 1) as Main_Category
,sum(total_value) as Amount_Sum
from original_dataset
group by calculated Main_Category;
quit;
@Thorius_Prime wrote:
Added a data sample now 🙂
We need a data sample that matches the words you used to describe the problem. 😞
Create a new group variable:
data intermediate;
set original;
groupvar = substr(put(asset,z2.),1,1);
run;
proc sql;
create table want as
select groupvar, sum(amount) as amount
from intermediate
group by groupvar;
quit;
Edit: changed the creation of groupvar.
Something like this (although I agree with the others, you haven't shown us a dataset that matches the problem you describe in words)
proc summary data=have;
class id;
var total_value;
output out=want sum=;
run;
I would consider using a format, something like this:
Proc format;
value Asset
0-<10='Junk'
10-<20='Reasonable '
20-<30='Sort of good'
/* etc. for other categories */
other='Unknown'
;
run;
proc summary data=have nway missing;
class ID;
format ID Asset.;
var value;
output out=want(drop=_TYPE_) sum=;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.