What I am trying to do is to calculate the differences of Sales # for each department between 3 different dates. The codes seem work in the dataset "test".
However, the real dataset I have consists of 3 variables and 12 observations shown as below in the dataset "real". My question is is there any way to transpose the observation of date into variable to make the dataset looks the same as "test", so I can calculate the differences in Sales for each period. Or perhaps there's a way to achieve the same result directly from dataset "real"?
Any suggestions are gladly appreciated.
DATA TEST;
INPUT DEPARTMENT $ SALE1 SALE2 SALE3 ;
CARDS;
WEST 175 134 124
EAST 198 182 117
SOUTH 200 203 205
NORTH 199 194 106
;
RUN;
DATA XYZ;
SET TEST;
ARRAY SALE(*) SALE1 - SALE3;
ARRAY SALE_DIFF(2);
DO I = 1 TO DIM(SALE_DIFF);
SALE_DIFF(I) = SALE(I + 1) - SALE(I);
END;
DROP I;
RUN;
PROC PRINT DATA=XYZ NOOBS;
VAR DEPARTMENT SALE_DIFF1 - SALE_DIFF2;
RUN;
DATA REAL;
INPUT DATE $ 1-9 DEPARTMENT $ SALE ;
CARDS;
30NOV2016 WEST 175
31DEC2016 WEST 134
31JAN2017 WEST 124
30NOV2016 EAST 198
31DEC2016 EAST 182
31JAN2017 EAST 117
30NOV2016 SOUTH 200
31DEC2016 SOUTH 203
31JAN2017 SOUTH 205
30NOV2016 NORTH 199
31DEC2016 NORTH 194
31JAN2017 NORTH 106
;
RUN;
You are getting the odd stuff in your names because of how you are calling proc transpose.
If you want the final names to be sales_diff1 to sales_diff2, just use:
proc sort data=real;
by department date;
run;
proc transpose data=real prefix=sale_ out=realtrans (drop=_name_);
by department notsorted;
var sale;
run;
data realtrans (drop=sale_1-sale_3);
set realtrans;
sales_diff1=sale_1-sale_2;
sales_diff2=sale_2-sale_3;
run;
Art, CEO, AnalystFinder.com
proc transpose data=real prefix=D out=realtrans (drop=_name_);
by department;
id date;
var sale;
run;
If your data are already in the order shown, then the following code will work. If they're not in that order, then you would have to sort the file first:
proc transpose data=real prefix=sale out=realtrans (drop=_name_); by department notsorted; var sale; run;
Art, CEO, AnalystFinder.com
Hi Art297, I tried your approach and it worked perfectly, Thanks!
Since I am not that familiar with Proc Transpose, it seems the variable names look a little bit weird shown as below, and I need to change the array name to achieve the end result. Is there any better way to assign the array name in the dataset Final?
Again, appreciated for your advices.
DATA REAL;
INPUT DATE $ 1-10 DEPARTMENT $ SALE ;
CARDS;
30NOV2016 WEST 175
31DEC2016 WEST 134
31JAN2017 WEST 124
30NOV2016 EAST 198
31DEC2016 EAST 182
31JAN2017 EAST 117
30NOV2016 SOUTH 200
31DEC2016 SOUTH 203
31JAN2017 SOUTH 205
30NOV2016 NORTH 199
31DEC2016 NORTH 194
31JAN2017 NORTH 106
;
RUN;
proc sort data=real out=real1;
by department;
run;
proc transpose data=real1 prefix=D out=realtrans (drop=_name_);
by department;
id date;
var sale;
run;
TITLE ' AFTER TRANSPOSE';
PROC PRINT DATA=realtrans NOOBS;
RUN;
DATA FINAL;
SET realtrans;
ARRAY SALE(3) D30NOV2016 D31DEC2016 D31JAN2017; /* Is there any better or efficient way to assign the array name here ? I tried D30NOV2016 - D31JAN2017 but it didn't work*/
ARRAY SALE_DIFF(2);
DO I = 1 TO DIM(SALE_DIFF);
SALE_DIFF(I) = SALE(I + 1) - SALE(I);
END;
DROP I;
RUN;
TITLE 'FINAL REPORT';
PROC PRINT DATA=XYZ NOOBS;
VAR DEPARTMENT SALE_DIFF1 - SALE_DIFF2;
RUN;
AFTER TRANSPOSE | |||
DEPARTMENT | D30NOV2016 | D31DEC2016 | D31JAN2017 |
EAST | 198 | 182 | 117 |
NORTH | 199 | 194 | 106 |
SOUTH | 200 | 203 | 205 |
WEST | 175 | 134 | 124 |
FINAL REPORT | ||
DEPARTMENT | SALE_DIFF1 | SALE_DIFF2 |
WEST | -41 | -10 |
EAST | -16 | -65 |
SOUTH | 3 | 2 |
NORTH | -5 | -88 |
You are getting the odd stuff in your names because of how you are calling proc transpose.
If you want the final names to be sales_diff1 to sales_diff2, just use:
proc sort data=real;
by department date;
run;
proc transpose data=real prefix=sale_ out=realtrans (drop=_name_);
by department notsorted;
var sale;
run;
data realtrans (drop=sale_1-sale_3);
set realtrans;
sales_diff1=sale_1-sale_2;
sales_diff2=sale_2-sale_3;
run;
Art, CEO, AnalystFinder.com
The ID statement automatically creates the variable names based on distinct values present in the Date column. Since the values begin with a number and SAS does not allow variable names to begin with a number, we add on a prefix (in this case, D is the prefix I used). So the first variable is D+30NOV2016 = D30NOV2016, and so on. Without providing a prefix, SAS would just put an underscore at the beginning of the variable names like _30NOV2016.
With regard to automatically pulling in var names in your array, this is how you can do it:
proc transpose data=real prefix=D_ out=realtrans (drop=_name_); /* added an underscore so that query below works, want to distinguish these variables from Department */
by department notsorted;
id date;
var sale;
run;
proc sql;
select name into :namevarlist separated by ' '
from dictionary.columns
where memname='REALTRANS'
and substr(name,1,2) ='D_';
quit;
DATA FINAL;
SET realtrans;
ARRAY SALE(3) &namevarlist.;
ARRAY SALE_DIFF(2);
DO I = 1 TO DIM(SALE_DIFF);
SALE_DIFF(I) = SALE(I + 1) - SALE(I);
END;
DROP I;
RUN;
Keep the original order of the REAL dataset and use the lag() function.
data want;
set real;
by department notsorted;
diff = sale - lag(sale);
if first.department then diff = .;
run;
Difference is sales for store1-store2 and store3-store2
Nicely layed out question
Other solutions could use lag or 'by not sorted'
No need to transpose.
My solution will stop processing if you have more then three 'stores' .
inspired by
https://goo.gl/OgQQNz
https://communities.sas.com/t5/Base-SAS-Programming/Is-there-any-way-to-transpose-the-observation-into-variable/m-p/335664
HAVE
====
Up to 40 obs WORK.HAVE total obs=12
Obs DATE DEPARTMENT SALE
1 30NOV2016 WEST 175
2 31DEC2016 WEST 134
3 31JAN2017 WEST 124
4 30NOV2016 EAST 198
5 31DEC2016 EAST 182
6 31JAN2017 EAST 117
7 30NOV2016 SOUTH 200
8 31DEC2016 SOUTH 203
9 31JAN2017 SOUTH 205
10 30NOV2016 NORTH 199
11 31DEC2016 NORTH 194
12 31JAN2017 NORTH 106
WANT
====
Up to 40 obs from want total obs=4
SALE_ SALE_
DEPARTMENT DIFF1 DIFF2
WEST -41 -10
EAST -16 -65
SOUTH 3 2
NORTH -5 -88
DETAILS
=======
????
Obs DEPARTMENT DIF1 Sale1-Sale2 DIF2 Sale3-Sale2
1 WEST 41 175-134 -10 124-134
2 EAST 16 198-182 -65 117-182
3 SOUTH -3 200-203 2 205-203
4 NORTH 5 199-194 -88 106-194
WORKING CODE
===========
select;
when (mod(_n_,3)=2) vals[1]=sale;
when (mod(_n_,3)=1) vals[2]=sale;
when (mod(_n_,3)=0) do;
dif1=vals[2]-vals[1];
dif2=sale-vals[1];
output;
end;
end; * leave off otherwise to force error;
FULL SOLUTION
=============
DATA have;
INPUT DATE $ 1-9 DEPARTMENT $ SALE ;
CARDS;
30NOV2016 WEST 175
31DEC2016 WEST 134
31JAN2017 WEST 124
30NOV2016 EAST 198
31DEC2016 EAST 182
31JAN2017 EAST 117
30NOV2016 SOUTH 200
31DEC2016 SOUTH 203
31JAN2017 SOUTH 205
30NOV2016 NORTH 199
31DEC2016 NORTH 194
31JAN2017 NORTH 106
;
RUN;
data want;
array vals[2] _temporary_;
set have;
select;
when (mod(_n_,3)=2) vals[1]=sale;
when (mod(_n_,3)=1) vals[2]=sale;
when (mod(_n_,3)=0) do;
dif1=vals[2]-vals[1];
dif2=sale-vals[1];
output;
end;
end; * leave off otherwise to force error;
keep department dif:;
run;quit;
Up to 40 obs from want total obs=4
Obs DEPARTMENT DIF1 DIF2
1 WEST 41 -10
2 EAST 16 -65
3 SOUTH -3 2
4 NORTH 5 -88
Difference is sales for store1-store2 and store3-store2
Nicely layed out question
Other solutions could use lag or 'by not sorted'
No need to transpose.
My solution will stop processing if you have more then three 'stores' .
inspired by
https://goo.gl/OgQQNz
https://communities.sas.com/t5/Base-SAS-Programming/Is-there-any-way-to-transpose-the-observation-into-variable/m-p/335664
HAVE
====
Up to 40 obs WORK.HAVE total obs=12
Obs DATE DEPARTMENT SALE
1 30NOV2016 WEST 175
2 31DEC2016 WEST 134
3 31JAN2017 WEST 124
4 30NOV2016 EAST 198
5 31DEC2016 EAST 182
6 31JAN2017 EAST 117
7 30NOV2016 SOUTH 200
8 31DEC2016 SOUTH 203
9 31JAN2017 SOUTH 205
10 30NOV2016 NORTH 199
11 31DEC2016 NORTH 194
12 31JAN2017 NORTH 106
WANT
====
Up to 40 obs from want total obs=4
SALE_ SALE_
DEPARTMENT DIFF1 DIFF2
WEST -41 -10
EAST -16 -65
SOUTH 3 2
NORTH -5 -88
DETAILS
=======
????
Obs DEPARTMENT DIF1 Sale1-Sale2 DIF2 Sale3-Sale2
1 WEST 41 175-134 -10 124-134
2 EAST 16 198-182 -65 117-182
3 SOUTH -3 200-203 2 205-203
4 NORTH 5 199-194 -88 106-194
WORKING CODE
===========
select;
when (mod(_n_,3)=2) vals[1]=sale;
when (mod(_n_,3)=1) vals[2]=sale;
when (mod(_n_,3)=0) do;
dif1=vals[2]-vals[1];
dif2=sale-vals[1];
output;
end;
end; * leave off otherwise to force error;
FULL SOLUTION
=============
DATA have;
INPUT DATE $ 1-9 DEPARTMENT $ SALE ;
CARDS;
30NOV2016 WEST 175
31DEC2016 WEST 134
31JAN2017 WEST 124
30NOV2016 EAST 198
31DEC2016 EAST 182
31JAN2017 EAST 117
30NOV2016 SOUTH 200
31DEC2016 SOUTH 203
31JAN2017 SOUTH 205
30NOV2016 NORTH 199
31DEC2016 NORTH 194
31JAN2017 NORTH 106
;
RUN;
data want;
array vals[2] _temporary_;
set have;
select;
when (mod(_n_,3)=2) vals[1]=sale;
when (mod(_n_,3)=1) vals[2]=sale;
when (mod(_n_,3)=0) do;
dif1=vals[2]-vals[1];
dif2=sale-vals[1];
output;
end;
end; * leave off otherwise to force error;
keep department dif:;
run;quit;
Up to 40 obs from want total obs=4
Obs DEPARTMENT DIF1 DIF2
1 WEST 41 -10
2 EAST 16 -65
3 SOUTH -3 2
4 NORTH 5 -88
Hi, or this? Provided data input already ordered by date.
data real;
set real;
by department notsorted;
dif_sale = dif(sale);
if first.department then do; dif_sale = .; end;
input date $ 1-9 department $ sale ;
cards;
30nov2016 west 175
31dec2016 west 134
31jan2017 west 124
30nov2016 east 198
31dec2016 east 182
31jan2017 east 117
30nov2016 south 200
31dec2016 south 203
31jan2017 south 205
30nov2016 north 199
31dec2016 north 194
31jan2017 north 106
;
run;
⏰
Today is the last day to save with the early bird rate! Register today for just $695 - $100 off the standard rate.
Plus, pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.