Hello
LEt's say I have a data set called Have with Lonf structure.
I want to convert it to wide structure.
I know to convery long to side for one varaible using proc transpose.
My question-
Here I am doing x6 times proc transpose .
IS there a beter way to do it with less code that tranpsose Long to wide for 6 varaibles together?
Data have'
input CustID YYYYMM monIndex X W Z R Q Y ;
cards;
111 202401 1 10 11 13 15 19 1
111 202402 2 17 17 12 10 18 0
111 202403 3 19 18 12 12 11 0
111 202404 4 12 12 12 13 14 1
111 202405 5 16 15 14 18 17 1
111 202406 6 16 17 17 18 15 0
222 202401 1 11 11 11 11 11 0
222 202402 2 12 13 14 11 16 0
222 202403 3 19 19 18 16 19 1
222 202404 4 17 17 16 15 18 1
222 202405 5 16 15 16 16 17 0
222 202406 6 14 16 14 13 17 0
;
Run
proc transpose data=have out=wide_1;
by CustID ;
ID monIndex ;
Var X;
Run;
proc transpose data=have out=wide_2;
by CustID ;
ID monIndex ;
Var W;
Run;
proc transpose data=have out=wide_3;
by CustID ;
ID monIndex ;
Var Z;
Run;
proc transpose data=have out=wide_4;
by CustID ;
ID monIndex ;
Var R;
Run;
proc transpose data=have out=wide_5;
by CustID ;
ID monIndex ;
Var Q;
Run;
proc transpose data=have out=wide_6;
by CustID ;
ID monIndex ;
Var Y;
Run;
Data want;
Merge wide_1-wide_6;
by CustID;
Run;
VAR statement accepts multiple variables, no need to transpose each variable separately.
proc transpose data=have out=want prefix=Month_;
by CustID ;
ID monIndex ;
Var X W Z R Q Y;
Run;
Your WANT dataset is identical to WIDE_6, which presumably is not your intent. Please show what you expect WANT to look like.
you are right.
IT is because I forgot add prefix.
Data have;
input CustID YYYYMM monIndex X W Z R Q Y ;
cards;
111 202401 1 10 11 13 15 19 1
111 202402 2 17 17 12 10 18 0
111 202403 3 19 18 12 12 11 0
111 202404 4 12 12 12 13 14 1
111 202405 5 16 15 14 18 17 1
111 202406 6 16 17 17 18 15 0
222 202401 1 11 11 11 11 11 0
222 202402 2 12 13 14 11 16 0
222 202403 3 19 19 18 16 19 1
222 202404 4 17 17 16 15 18 1
222 202405 5 16 15 16 16 17 0
222 202406 6 14 16 14 13 17 0
;
Run
/**Task: Convert Long to Wide*****/
/***WAY1***/
proc transpose data=have out=wide_1 prefix=X;
by CustID ;
ID monIndex;
Var X;
Run;
proc transpose data=have out=wide_2 prefix=W;
by CustID ;
ID monIndex ;
Var W;
Run;
proc transpose data=have out=wide_3 prefix=Z;
by CustID ;
ID monIndex ;
Var Z;
Run;
proc transpose data=have out=wide_4 prefix=R;
by CustID ;
ID monIndex ;
Var R;
Run;
proc transpose data=have out=wide_5 prefix=Q;
by CustID ;
ID monIndex ;
Var Q;
Run;
proc transpose data=have out=wide_6 prefix=Y;
by CustID ;
ID monIndex ;
Var Y;
Run;
Data want;
Merge wide_1-wide_6;
by CustID;
DROP _name_;
Run;
VAR statement accepts multiple variables, no need to transpose each variable separately.
proc transpose data=have out=want prefix=Month_;
by CustID ;
ID monIndex ;
Var X W Z R Q Y;
Run;
Sorry but this is not the final solution, need one more transpose to get in it full wide structure
Since you have multiple variables your starting point is not truly a TALL dataset.
You should first convert your intermediate dataset into a TALL dataset. Then you can transpose that TALL dataset to a WIDE dataset.
proc transpose data=have out=tall;
by custid yyyymm monIndex;
var x -- y ;
run;
proc transpose data=tall out=want(drop=_name_);
by custid ;
id _name_ monindex;
var col1 ;
run;
Result
If you want the new variables ordered differently add a PROC SORT step in between.
proc sort;
by custid _name_ monindex;
run;
Result
Tom already gave you Double Transpose solution.
But if you have a very BIG table,could try MERGE skill proposed by me.
Data have;
input CustID YYYYMM monIndex X W Z R Q Y ;
cards;
111 202401 1 10 11 13 15 19 1
111 202402 2 17 17 12 10 18 0
111 202403 3 19 18 12 12 11 0
111 202404 4 12 12 12 13 14 1
111 202405 5 16 15 14 18 17 1
111 202406 6 16 17 17 18 15 0
222 202401 1 11 11 11 11 11 0
222 202402 2 12 13 14 11 16 0
222 202403 3 19 19 18 16 19 1
222 202404 4 17 17 16 15 18 1
222 202405 5 16 15 16 16 17 0
222 202406 6 14 16 14 13 17 0
;
proc sql noprint;
select distinct catt('have(where=(monIndex=',monIndex,') rename=(X=X',monIndex,' W=W',monIndex,' Z=Z',monIndex,' R=R',monIndex,' Q=Q',monIndex,' Y=Y',monIndex,'))')
into :merge separated by ' '
from have ;
quit;
data want;
merge &merge.;
by CustId;
drop YYYYMM monIndex;
run;
Very nice solution.
You wrote that if data set is big then this solution is better than using proc transpose?
Can you give a general rule of thumb what is a big data set? (How many rows, how many columns )
For example:
60 million rows and 10 columns is a big data set?
If you don't have gap in monIndex variable and dataset has been sorted by CustID and monIndex, Could try proc summary solution:
Data have;
input CustID YYYYMM monIndex X W Z R Q Y ;
cards;
111 202401 1 10 11 13 15 19 1
111 202402 2 17 17 12 10 18 0
111 202403 3 19 18 12 12 11 0
111 202404 4 12 12 12 13 14 1
111 202405 5 16 15 14 18 17 1
111 202406 6 16 17 17 18 15 0
222 202401 1 11 11 11 11 11 0
222 202402 2 12 13 14 11 16 0
222 202403 3 19 19 18 16 19 1
222 202404 4 17 17 16 15 18 1
222 202405 5 16 15 16 16 17 0
222 202406 6 14 16 14 13 17 0
;
proc sql noprint;
select max(n) into :n
from (select count(*) as n from have group by CustID);
quit;
proc summary data=have nway;
class CustID;
output out=want(drop=_type_ _freq_) idgroup(out[&n] (X W Z R Q Y)=);
run;
Thanks for all,
I summarize the provided solutions
Data have;
input CustID YYYYMM monIndex X W Z R Q Y ;
cards;
111 202401 1 10 11 13 15 19 1
111 202402 2 17 17 12 10 18 0
111 202403 3 19 18 12 12 11 0
111 202404 4 12 12 12 13 14 1
111 202405 5 16 15 14 18 17 1
111 202406 6 16 17 17 18 15 0
222 202401 1 11 11 11 11 11 0
222 202402 2 12 13 14 11 16 0
222 202403 3 19 19 18 16 19 1
222 202404 4 17 17 16 15 18 1
222 202405 5 16 15 16 16 17 0
222 202406 6 14 16 14 13 17 0
;
Run
/**Task: Convert Long to Wide*****/
/***WAY1***/
proc transpose data=have out=wide_1 prefix=X;
by CustID ;
ID monIndex;
Var X;
Run;
proc transpose data=have out=wide_2 prefix=W;
by CustID ;
ID monIndex ;
Var W;
Run;
proc transpose data=have out=wide_3 prefix=Z;
by CustID ;
ID monIndex ;
Var Z;
Run;
proc transpose data=have out=wide_4 prefix=R;
by CustID ;
ID monIndex ;
Var R;
Run;
proc transpose data=have out=wide_5 prefix=Q;
by CustID ;
ID monIndex ;
Var Q;
Run;
proc transpose data=have out=wide_6 prefix=Y;
by CustID ;
ID monIndex ;
Var Y;
Run;
Data want;
Merge wide_1-wide_6;
by CustID;
DROP _name_;
Run;
/***WAY2-Quick way--VAR statement accepts multiple variables, no need to transpose each variable separately***/
proc transpose data=have out=want prefix=X;
by CustID ;
ID monIndex;
Var X W Z R Q Y;
/*Var X--Y;*/
DROP _name_;
Run;
/***WAY3-Double Transpose solution***/
proc transpose data=have out=Longer;
by custid yyyymm monIndex;
var x -- y ;
run;
proc sort data=Longer;
by custid _name_ monindex;
run;
/*If you want the new variables ordered differently add a PROC SORT step in between*/
proc transpose data=Longer out=want(drop=_name_);
by custid ;
id _name_ monindex;
var col1 ;
run;
/***WAY4-IF you have very big data set****/
data want;
Merge
have(where=(monIndex=1) rename=(X=X1 W=W1 Z=Z1 R=R1 Q=Q1 Y=Y1))
have(where=(monIndex=2) rename=(X=X2 W=W2 Z=Z2 R=R2 Q=Q2 Y=Y2))
have(where=(monIndex=3) rename=(X=X3 W=W3 Z=Z3 R=R3 Q=Q3 Y=Y3))
have(where=(monIndex=4) rename=(X=X4 W=W4 Z=Z4 R=R4 Q=Q4 Y=Y4))
have(where=(monIndex=5) rename=(X=X5 W=W5 Z=Z5 R=R5 Q=Q5 Y=Y5))
have(where=(monIndex=6) rename=(X=X6 W=W6 Z=Z6 R=R6 Q=Q6 Y=Y6))
;
by CustId;
drop YYYYMM monIndex;
Run;
/***WAY4-IF you have very big data set--Using macro var tooo****/
proc sql noprint;
select distinct catt('have(where=(monIndex=',monIndex,') rename=(X=X',monIndex,' W=W',monIndex,' Z=Z',monIndex,' R=R',monIndex,' Q=Q',monIndex,' Y=Y',monIndex,'))')
into :merge separated by ' '
from have ;
quit;
%put &merge;
data want;
merge &merge.;
by CustId;
drop YYYYMM monIndex;
run;
/**Way5-If you don't have gap in monIndex variable and dataset has been sorted by CustID and monIndex, Could try proc summary solution***/
proc sql noprint;
select max(n) into :n
From (select count(*) as n from have group by CustID);
quit;
%put &n;
proc summary data=have nway;
class CustID;
output out=want(drop=_type_ _freq_) idgroup(out[&n] (X W Z R Q Y)=);
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.