Solved: Re: Group and sum observations from two columns

Thorius_Prime · Posted 08-29-2018 08:04 AM

Hi,

I have two columns. One is a column of numerical variables identifying the category of the asset (first number is main category, second number is subcategory eg. 31 is 3 for equity and 1 for common equity) the other column consists of numerical variables describing the total amount invested in the asset.

Example of dataset:
Id    total value
31   10000
23   9930
37   2892
26   6833

Desired result:
Category 2 Category 3
16763          12892

I need to sum every main category and present the data in a table.

So far I have created subsets of the data using the data process as seen below:

data data (keep=asset rename=(asset=asset_category));

set original_dataset;

if 71<=Assetnumber<=79 then; else delete;

I end up with 10 different data sets.

How do I sum + combine the data the smartest way?

Is there a more convienient way to handle the entire dataset?

regards,

Thorius

s_lassen · Posted 08-29-2018 08:22 AM

I would consider using a format, something like this:

Proc format;
  value Asset
    0-<10='Junk'
   10-<20='Reasonable '
   20-<30='Sort of good'
  /* etc. for other categories */
    other='Unknown'
    ;
run;

proc summary data=have nway missing;
  class ID;
  format ID Asset.;
  var value;
  output out=want(drop=_TYPE_) sum=; 
run;

View solution in original post

PeterClemmensen · Posted 08-29-2018 08:06 AM

Always a good idea to post sample data. So this code is untested:

proc sql;
   create table want as
   select substr(put(Assetnumber, 2.), 1, 1) as Main_Category
         ,sum(amount) as Amount_Sum
   from original_dataset
   group by calculated Main_Category;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

Thorius_Prime · Posted 08-29-2018 08:10 AM

Nope. It's usually just a two digit number. It could be converted without any issue if that makes it easier to handle.

PeterClemmensen · Posted 08-29-2018 08:16 AM

Ok. Changed my response above 🙂

The DATA to DATA Step Macro
Blog: SASnrd

RW9 · Posted 08-29-2018 08:09 AM

"the smartest way?" - start by posting some test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

Then show what you want out from that data.

Thorius_Prime · Posted 08-29-2018 08:21 AM

Added a data sample now 🙂

PeterClemmensen · Posted 08-29-2018 08:24 AM

This code uses your sample data and creates the desired result:

data original_dataset;
input Id total_value;
datalines;
31 10000
23 9930
37 2892
26 6833
;

proc sql;
    create table want as
    select substr(put(Id, 2.), 1, 1) as Main_Category
    ,sum(total_value) as Amount_Sum
    from original_dataset
    group by calculated Main_Category;
quit;

The DATA to DATA Step Macro
Blog: SASnrd

PaigeMiller · Posted 08-29-2018 08:25 AM

@Thorius_Prime wrote:
Added a data sample now 🙂

We need a data sample that matches the words you used to describe the problem. 😞

--
Paige Miller

Kurt_Bremser · Posted 08-29-2018 08:11 AM

Create a new group variable:

data intermediate;
set original;
groupvar = substr(put(asset,z2.),1,1);
run;

proc sql;
create table want as
select groupvar, sum(amount) as amount
from intermediate
group by groupvar;
quit;

Edit: changed the creation of groupvar.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

PaigeMiller · Posted 08-29-2018 08:16 AM

Something like this (although I agree with the others, you haven't shown us a dataset that matches the problem you describe in words)

proc summary data=have;
    class id;
    var total_value;
    output out=want sum=;
run;

--
Paige Miller

s_lassen · Posted 08-29-2018 08:22 AM

I would consider using a format, something like this:

Proc format;
  value Asset
    0-<10='Junk'
   10-<20='Reasonable '
   20-<30='Sort of good'
  /* etc. for other categories */
    other='Unknown'
    ;
run;

proc summary data=have nway missing;
  class ID;
  format ID Asset.;
  var value;
  output out=want(drop=_TYPE_) sum=; 
run;

The 2025 SAS Hackathon has begun!

SAS Training: Just a Click Away