Dear experts,
I have the following data sets:
DATA cost;
INPUT cost 4. category $2. product $4. year $4. ;
DATALINES;
061 1 ABC 2010
062 1 ABC 2011
067 1 ABC 2012
070 1 ABC 2013
100 1 DEF 2010
105 1 DEF 2011
050 2 GHI 2011
055 2 GHI 2012
060 2 GHI 2013
;run;
DATA revenues;
INPUT category $2. product $4. revenue_2010 4. revenue_2011 4. revenue_2012 4. revenue_2013 4. ;
DATALINES;
1 ABC 100 098 102 100
1 DEF 120 110 000 000
2 GHI 000 080 075 070
;run;
In order to calculate the difference between cost and revenues (it is an example, it is about how to do it not why) I need to reshape the both the tables in the same structure. I would like to reshape cost in the form of revenues and viceversal.
Any Ideas? I tried with poc mean, proc sql, proc transpose but I did not what I get in a authomatize, quick and elegant manne.
Thanks in advance for your reply.
BRs, SH
Since transforming the Revenue data makes more sense as far as matching up :
data revenuewant (keep= category product year revenue);
set revenues;
array r {2010:2013} revenue: ;
do year= 2010 to 2013;
revenue = r[year];
output;
end;
run;
Well, the problem here, as with most problems, is the data structure of the second dataset. Using "data" as columns headers always leads to problems, and lots of unmanagable code. Simply putting that data into a normalised data structure with fixed columns names, and the data in the data part makes the coding so much simpler:
data cost; input cost 4. category $2. product $4. year $4. ; datalines; 061 1 ABC 2010 062 1 ABC 2011 067 1 ABC 2012 070 1 ABC 2013 100 1 DEF 2010 105 1 DEF 2011 050 2 GHI 2011 055 2 GHI 2012 060 2 GHI 2013 ; run; data revenues; input category $2. product $4. revenue_2010 4. revenue_2011 4. revenue_2012 4. revenue_2013 4. ; datalines; 1 ABC 100 098 102 100 1 DEF 120 110 000 000 2 GHI 000 080 075 070 ; run; proc transpose data=revenues out=t_revenues; by category product; var revenue:; run; data t_revenues; set t_revenues; year=scan(_name_,2,"_"); run; proc sql; create table INTER as select A.*, B.COST from T_REVENUES A left join COST B on A.CATEGORY=B.CATEGORY and A.PRODUCT=B.PRODUCT and A.YEAR=B.YEAR; quit;
I did that:
proc transpose data=revenues out=revenues_1 prefix=revenue name=revyear;
by category product;
var _numeric_;
run;
data revenues_2;
set revenues_1 (rename=(revenue1=revenue));
length year $4;
year = substr(revyear,9);
drop revyear;
run;
Now you can run your analyses by category, product and year.
On the one hand, I agree with those who contend that COST has the "right" structure and REVENUE has the "wrong" structure. On the other hand, it's not for me to pre-judge the requirements that you will encounter in your future programming. With that said, here is how you might transpose the COST data into the REVENUE structure:
proc transpose data=cost prefix=cost_ out=want (drop=_name_);
by category product;
var cost;
id year;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.