DATA Step, Macro, Functions and more

Creating dynamic variable names & aggregating

Reply
Contributor
Posts: 47

Creating dynamic variable names & aggregating

Hi,

 

I've seen a couple of similar threads, but haven't been able to use them to answer my question. I have the following dataset:

 

Address    Year   Account   Amount
Address1   2015   Account1  100
Address1   2015   Account2  200
Address1   2016   Account1  50
Address2   2015   Account1  150
Address2   2015   Account2  10
Address2   2016   Account1  50
Address2   2016   Account1  30

I would like to transpose this, such that each year-account combination is a variable, as below:

 

Address    Year   Account   Amount _2015_Account1 _2015_Account2 _2016_Account1 _2016_Account2
Address1   2015   Account1  100    100            .              .              .     
Address1   2015   Account2  200    .              200            .              .
Address1   2016   Account1  50     .              .              50             .
Address2   2015   Account1  150    150            .              .              . 
Address2   2015   Account2  10     .              10             .              .
Address2   2016   Account1  50     .              .              50             .
Address2   2016   Account2  30     .              .              .              30

I would then like to aggregate such that each address is only shown once, as below:

Address    _2015_Account1 _2015_Account2 _2016_Account1 _2016_Account2
Address1   100            200            50             .     
Address2   150            10             50             30 

 

Any thoughts on how I could do the above by dynamically creating the columns? I'm currently creating each column manually, but there are 15 accounts, and 5 years Smiley Happy

 

 

 

Super User
Posts: 23,950

Re: Creating dynamic variable names & aggregating

Posted in reply to MikeFranz

PROC TRANSPOSE.

 

proc transpose data=have out=want;
by address;
var amount;
id year account;
run;

If this doesn't work, post your code, log and an explanation of exactly how it's not meeting your requirements.

 


MikeFranz wrote:

Hi,

 

I've seen a couple of similar threads, but haven't been able to use them to answer my question. I have the following dataset:

 

Address    Year   Account   Amount
Address1   2015   Account1  100
Address1   2015   Account2  200
Address1   2016   Account1  50
Address2   2015   Account1  150
Address2   2015   Account2  10
Address2   2016   Account1  50
Address2   2016   Account1  30

I would like to transpose this, such that each year-account combination is a variable, as below:

 

Address    Year   Account   Amount _2015_Account1 _2015_Account2 _2016_Account1 _2016_Account2
Address1   2015   Account1  100    100            .              .              .     
Address1   2015   Account2  200    .              200            .              .
Address1   2016   Account1  50     .              .              50             .
Address2   2015   Account1  150    150            .              .              . 
Address2   2015   Account2  10     .              10             .              .
Address2   2016   Account1  50     .              .              50             .
Address2   2016   Account2  30     .              .              .              30

I would then like to aggregate such that each address is only shown once, as below:

Address    _2015_Account1 _2015_Account2 _2016_Account1 _2016_Account2
Address1   100            200            50             .     
Address2   150            10             50             30 

 

Any thoughts on how I could do the above by dynamically creating the columns? I'm currently creating each column manually, but there are 15 accounts, and 5 years Smiley Happy

 

 

 


 

Contributor
Posts: 47

Re: Creating dynamic variable names & aggregating

That worked, thank you! Significantly more elegant than my 200 line code, very useful PROC Smiley Happy
Respected Advisor
Posts: 3,251

Re: Creating dynamic variable names & aggregating

Posted in reply to MikeFranz

In your desired output, before aggregation, you have Address2 2016 Account2

 

but there is no such combination in your input data. How can this be?

 

Why do you need columns? Why isn't the original layout where there is one record on a row sufficient? What are you going to do next that these columns are needed? They are not needed for aggregating by Address.

--
Paige Miller
Super User
Posts: 13,886

Re: Creating dynamic variable names & aggregating

Posted in reply to MikeFranz

And if you are looking for a report that people read without manipulating data at all.

 

proc tabulate data=have;

class address year account;

var amount;

table address,

         year='' *account=''*amount*sum=''*f=best10.;

run;

Ask a Question
Discussion stats
  • 4 replies
  • 206 views
  • 3 likes
  • 4 in conversation