DATA Step, Macro, Functions and more

Aggregating data using proc means

Accepted Solution Solved
Reply
Contributor
Posts: 65
Accepted Solution

Aggregating data using proc means

For a project, I have a large dataset of 1.5m entries, I am looking to aggregate some car loan data by some constraint variables such as:

 

Country, Currency, ID, Fixed or floating , performing ,  Initial Loan Value , Car Type , Car Make

 

I am wondering if it is possible to aggregate data by summing the initial loan value for the numeric and then condensing the similar variables into one row with the same observation such that I turn the first dataset into the second

 

Country Currency ID Fixed_or_Floating Performing Initial_Value Current_Value 

data have;
set have;
input country $ currency $ ID Fixed $ performing $ initial current;
datalines;

UK       GBP     1   Fixed     Performing          100    50    
UK       GBP     1   Fixed     Performing          150    30   
UK       GBP     1   Fixed     Performing          160    70   
UK       GBP     1   Floating Performing          150    30   
UK       GBP     1   Floating Performing          115    80   
UK       GBP     1   Floating Performing          110    60   
UK       GBP     1   Fixed     Non-Performing   100    50 
UK       GBP     1   Fixed     Non-Performing   120    30  
;
run;

data want;
set have;
input country $ currency $ ID Fixed $ performing $ initial current;
datalines;

UK GBP 1 Fixed Performing 410 150
UK GBP 1 Floating Performing 275 170
UK GBP 1 Fixed Non-performing 220 80
;
run;

Essentially looking for a way to sum the numeric values while concatenating the character variables.

I've tried this code

proc means data=have sum;
var initial current;
by country currency id fixed performing;
run;

Unsure If i'll have to use a proc sql (would be too slow for such a large dataset) or possibly a data step.

any help in concatenating would be appreciated.

Accepted Solutions
Solution
‎03-14-2018 08:34 AM
Respected Advisor
Posts: 3,294

Re: Aggregating data using proc means

proc summary data=have nway;
var initial current;
class country currency id fixed performing;
output out=sums sum=sum_initial sum_current;
run;
--
Paige Miller

View solution in original post


All Replies
Solution
‎03-14-2018 08:34 AM
Respected Advisor
Posts: 3,294

Re: Aggregating data using proc means

proc summary data=have nway;
var initial current;
class country currency id fixed performing;
output out=sums sum=sum_initial sum_current;
run;
--
Paige Miller
Contributor
Posts: 65

Re: Aggregating data using proc means

Posted in reply to PaigeMiller

perfect, thanks.

Super User
Posts: 13,950

Re: Aggregating data using proc means


@89974114 wrote:

perfect, thanks.


Note that with @PaigeMiller's solution if you remove NWAY you would get summaries for each level and combination of the class variables as well as one overall. This can be very useful if you are interested in the different groupings. The variable _type_ can be used to process or use subsets of this type of summary. An optional Classdata set could be used to indicate combinations of interest.

Respected Advisor
Posts: 3,294

Re: Aggregating data using proc means

Yes @ballardw, PROC SUMMARY is an extremely powerful procedure that does lots and lots of very useful things, and it isn't particularly well known by the SAS programmers that I interact with (your experience may be different). The Classdata data set option is a good one, the TYPES statement in PROC SUMMARY does essentially the same thing.

--
Paige Miller
Super User
Posts: 13,950

Re: Aggregating data using proc means

Posted in reply to PaigeMiller

@PaigeMiller I've been using Proc Summary since the time when Proc Means didn't produce output data sets so it is a tool I reach for fairly often though sometimes I forget the options added in the past 20 years or so Smiley Happy

Contributor
Posts: 65

Re: Aggregating data using proc means

Posted in reply to PaigeMiller

Quick supplementary question, i'm trying to get my head around the summary function

 

i'm trying to make a weighted variable such that it is the weight of each segment where weight = initial / total

 

Initial total weight
100   330  0.303
110   330  0.333
120   330  0.363

I tried using weight = (initial / total ) but in the summary function it doesn't understand this command.

 

 

 

 

 

Respected Advisor
Posts: 3,294

Re: Aggregating data using proc means

This has to be done in a data step

 

proc summary data=have nway;
var initial current;
class country currency id fixed performing;
output out=sums sum=sum_initial sum_current;
run;

data sums;
    set sums;
    weight=sum_initial/sum_current;
run;
--
Paige Miller
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 148 views
  • 1 like
  • 3 in conversation