DATA Step, Macro, Functions and more

reshape observations als variables ore viceversa

Reply
Frequent Contributor
Posts: 127

reshape observations als variables ore viceversa

Dear experts,

 

I have the following data sets:

 

DATA cost;
INPUT cost 4. category $2. product $4. year $4. ;
DATALINES;
061 1 ABC 2010
062 1 ABC 2011
067 1 ABC 2012
070 1 ABC 2013
100 1 DEF 2010
105 1 DEF 2011
050 2 GHI 2011
055 2 GHI 2012
060 2 GHI 2013
;run;

DATA revenues;
INPUT category $2. product $4. revenue_2010 4. revenue_2011 4. revenue_2012 4. revenue_2013 4. ;
DATALINES;
1 ABC 100 098 102 100
1 DEF 120 110 000 000
2 GHI 000 080 075 070
;run;

 

In order to calculate the difference between cost and revenues (it is an example, it is about how to do it not why) I need to reshape the both the tables in the same structure. I would like to reshape cost in the form of revenues and viceversal.

Any Ideas? I tried with poc mean, proc sql, proc transpose but I did not what I get in a authomatize, quick and elegant manne.

Thanks in advance for your reply.

BRs, SH

 

Super User
Posts: 10,548

Re: reshape observations als variables ore viceversa

Since transforming the Revenue data makes more sense as far as matching up :

data revenuewant (keep= category product year revenue);
   set revenues;
   array r {2010:2013} revenue: ;
   do year= 2010 to 2013;
      revenue = r[year];
      output;
   end;
run;
Super User
Super User
Posts: 7,430

Re: reshape observations als variables ore viceversa

Well, the problem here, as with most problems, is the data structure of the second dataset.  Using "data" as columns headers always leads to problems, and lots of unmanagable code. Simply putting that data into a normalised data structure with fixed columns names, and the data in the data part makes the coding so much simpler:

data cost;
  input cost 4. category $2. product $4. year $4. ;
datalines;
061 1 ABC 2010
062 1 ABC 2011
067 1 ABC 2012
070 1 ABC 2013
100 1 DEF 2010
105 1 DEF 2011
050 2 GHI 2011
055 2 GHI 2012
060 2 GHI 2013
;
run;
data revenues;
  input category $2. product $4. revenue_2010 4. revenue_2011 4. revenue_2012 4. revenue_2013 4. ;
datalines;
1 ABC 100 098 102 100
1 DEF 120 110 000 000
2 GHI 000 080 075 070
;
run;

proc transpose data=revenues out=t_revenues;
  by category product;
  var revenue:;
run;

data t_revenues;
  set t_revenues;
  year=scan(_name_,2,"_");
run;

proc sql;
  create table INTER as
  select  A.*,
          B.COST
  from    T_REVENUES A
  left join COST B
  on      A.CATEGORY=B.CATEGORY
  and     A.PRODUCT=B.PRODUCT
  and     A.YEAR=B.YEAR;
quit;
Super User
Posts: 6,971

Re: reshape observations als variables ore viceversa

I did that:

proc transpose data=revenues out=revenues_1 prefix=revenue name=revyear;
by category product;
var _numeric_;
run;

data revenues_2;
set revenues_1 (rename=(revenue1=revenue));
length year $4;
year = substr(revyear,9);
drop revyear;
run;

Now you can run your analyses by category, product and year.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 5,097

Re: reshape observations als variables ore viceversa

On the one hand, I agree with those who contend that COST has the "right" structure and REVENUE has the "wrong" structure.  On the other hand, it's not for me to pre-judge the requirements that you will encounter in your future programming.  With that said, here is how you might transpose the COST data into the REVENUE structure:

 

proc transpose data=cost prefix=cost_ out=want (drop=_name_);
by category product;
var cost;
id year;
run;

Ask a Question
Discussion stats
  • 4 replies
  • 273 views
  • 0 likes
  • 5 in conversation