Hi,
I have this table:
DATA HAVE;
INPUT CATEGORY TYPE$ INV_M1 INV_M2 INV_M3 ;
CARDS;
1 A 5 5 9
1 B 10 2 2
2 A 2 2 3
2 B 6 6 5
3 A 8 7 9
3 B 4 9 8
4 A 10 8 9
4 B 5 2 1
;
RUN;
For each CATEGORY and for each month (M1,M2,M3), I have to calculate the following result (TYPE B / TYPE A)
The result should be this:
What function can I use ?
One way given that data set:
data want; set have; by category; array in (*) INV_M1 INV_M2 INV_M3; array l (*) LINV_M1 LINV_M2 LINV_M3; Array r (*) Result_M1 Result_M2 Result_M3; LINV_M1 = Lag(INV_M1); LINV_M2 = Lag(INV_M2); LINV_M3 = Lag(INV_M3); if last.category then do; do i=1 to dim(In); r[i]= in[i]/l[i]; end; output; end; drop i Inv_: Linv: type; run;
However the data is poorly structured it would be better to be
Category Month B A. and likely the Months should actually be a date instead of some random number like 1, 2 or 3 Or 25...
Code like this can reshape the data to that nicer form for calculating the result:
proc transpose data=have out=trans (rename=(_name_=month)); by category type; var Inv_m1 Inv_m2 Inv_m3; run; proc sort data=trans; by category month type; run; proc transpose data=trans out=need (drop=_name_); by category month; id type; var col1; run;
then
data calc; set need; result= B / A; run;
Normally I would stop manipulating data and make a report for that set if you really need separate columns for the result such as one of these:
proc report data=calc; columns category month,result; define category /group; define month/across ""; define result/''; run; proc tabulate data=calc; class category month; var result; tables category='', month=''*result=''*sum='' /box='Category' ; run;
if you really need a poorly structured data set:
proc transpose data=calc out=want2 (drop=_name_) prefix=Result_; by category; id month; var result; run;
One way given that data set:
data want; set have; by category; array in (*) INV_M1 INV_M2 INV_M3; array l (*) LINV_M1 LINV_M2 LINV_M3; Array r (*) Result_M1 Result_M2 Result_M3; LINV_M1 = Lag(INV_M1); LINV_M2 = Lag(INV_M2); LINV_M3 = Lag(INV_M3); if last.category then do; do i=1 to dim(In); r[i]= in[i]/l[i]; end; output; end; drop i Inv_: Linv: type; run;
However the data is poorly structured it would be better to be
Category Month B A. and likely the Months should actually be a date instead of some random number like 1, 2 or 3 Or 25...
Code like this can reshape the data to that nicer form for calculating the result:
proc transpose data=have out=trans (rename=(_name_=month)); by category type; var Inv_m1 Inv_m2 Inv_m3; run; proc sort data=trans; by category month type; run; proc transpose data=trans out=need (drop=_name_); by category month; id type; var col1; run;
then
data calc; set need; result= B / A; run;
Normally I would stop manipulating data and make a report for that set if you really need separate columns for the result such as one of these:
proc report data=calc; columns category month,result; define category /group; define month/across ""; define result/''; run; proc tabulate data=calc; class category month; var result; tables category='', month=''*result=''*sum='' /box='Category' ; run;
if you really need a poorly structured data set:
proc transpose data=calc out=want2 (drop=_name_) prefix=Result_; by category; id month; var result; run;
I'm gonna go with the proc transpose solution... Thanks !
Transpose the INV_Μ: variables to long layout by category and type.
Then sort by category, _NAME_ and type, after which it is easy to compare the two successive values in a DATA step.
proc transpose data=have out=long;
by category type;
var inv_m:;
run;
proc sort data=long;
by category _name_ type;
run;
data want;
set long;
by category _name_;
l_col1= lag(col1);
if last._name_;
result = col1 / l_col1;
drop l_col1 col1;
run;
You can transpose the want dataset to wide,but I recommend to keep the long layout, as it is much more flexible for further analysis.
Adding:
I agree with @ballardw the data is poorly structured and would be much easier to handle with a better data structure, which he describes.
I know the data is structured in a strange way but there is a reason behind it. Can you elaborate on the idea SQL-macro variables because I have to do the calculation for the 12 months... It could be useful.
@sasuser_8 wrote:
I know the data is structured in a strange way but there is a reason behind it. Can you elaborate on the idea SQL-macro variables because I have to do the calculation for the 12 months... It could be useful.
Just because the data may arrive in that wide format does not mean you have to do analysis in that format. Doing analysis in the long format makes programming easier, as shown above by @Tom and others. The solution from Tom works without modification even if you have 100 INV_M variables. A word to the wise: work with long data sets.
To divide two numbers they need to be on the same observation.
Simplest way is to transpose your dataset so that the TYPE values, A and B, are the variable names then MONTH values are stored into a variable.
proc transpose data=have name=month out=have_t;
by category;
id type;
var inv_m1-inv_m3;
run;
data want;
set have_t;
result=b/a;
format result 5.2 ;
run;
You can then easily create a REPORT that looks like your request.
proc report;
column category month,(a b result);
define category / group;
define month / across ' ';
run;
Results
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.