Hello
I want to change the table from wide to long.
It means that for each ID there will be 3 rows (with information of X1 X2 X3 ).
I have 3 queations:
1-Since X3 is numeric and X1,X2 are char, should I need to do any convertion (Because in the long structure the values of X1 ,X2,X3 should be in same column)
2-Is there a better way to create the long structure table?
3-I want to add a new column to LongTbl that is called labelField that will get the value of label of X1/X2/X3.
What is the way to do it please in a way that it takes the label automatically ?(Here I type it but in real life I have many varaibles and I dont want to type the label for each varaible and I prefer to find a clever way to do it
Data WideTbl;
input ID date : date9. x1 $ x2 $ x3 ;
format date ddmmyy10.;
label x1='Team' x2='smoker' X3='status';
cards;
1 01jan2021 a y 1
2 01jan2021 b n 2
3 01jan2021 c n 2
4 01jan2021 a n 3
5 02jan2021 b y 1
6 02jan2021 a y 1
7 03jan2021 c y 2
8 03jan2021 a n 2
9 03jan2021 a y 3
;
run;
data LongTbl;
set WideTbl;
length category $3;
category=x1; field='x1';labelField ='Team'; output;
category=x2; field='x2'; labelField ='smoker';output;
category=x3; field='x3';labelField ='status'; output;
drop x1 - x3;
run;
Since you have unique BY variables just use PROC TRANSPOSE.
data WideTbl;
input ID date : date9. x1 $ x2 $ x3 ;
format date ddmmyy10.;
label x1='Team' x2='smoker' X3='status';
cards;
1 01jan2021 a y 1
2 01jan2021 b n 2
3 01jan2021 c n 2
4 01jan2021 a n 3
5 02jan2021 b y 1
6 02jan2021 a y 1
7 03jan2021 c y 2
8 03jan2021 a n 2
9 03jan2021 a y 3
;
proc transpose data=widetbl
out=LongTbl(rename=(_name_=field _label_=labelField col1=category))
;
by id date;
var x1-x3;
run;
If you have mixed numeric and character then CATEGORY will automatically become character.
label Obs ID date field Field category 1 1 01/01/2021 x1 Team a 2 1 01/01/2021 x2 smoker y 3 1 01/01/2021 x3 status 1 4 2 01/01/2021 x1 Team b 5 2 01/01/2021 x2 smoker n 6 2 01/01/2021 x3 status 2 7 3 01/01/2021 x1 Team c 8 3 01/01/2021 x2 smoker n 9 3 01/01/2021 x3 status 2
But you might want to add a step to left justify the strings generated from the numeric variables.
data LongTbl; set LongTbl;
category=left(category);
run;
label Obs ID date field Field category 1 1 01/01/2021 x1 Team a 2 1 01/01/2021 x2 smoker y 3 1 01/01/2021 x3 status 1 4 2 01/01/2021 x1 Team b 5 2 01/01/2021 x2 smoker n 6 2 01/01/2021 x3 status 2 7 3 01/01/2021 x1 Team c 8 3 01/01/2021 x2 smoker n 9 3 01/01/2021 x3 status 2
1. Yes, you'll need to convert your variables to the same type.
2. No, since you need to do type conversion, you may as well do it on in a data step
3. VLABEL() and VNAME() functions are what you're looking for here.
PROC TRANSPOSE has the IDLABEL variable but not sure how that works when you go wide to long.
Transposing data tutorials:
Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/
And sometimes a double transpose is needed for extra wide data sets:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Thanks, may you please show how to apply these solution here?
Will it work well?
Data WideTbl;
input ID date : date9. x1 $ x2 $ x3 ;
format date ddmmyy10.;
label x1='Team' x2='smoker' X3='status';
cards;
1 01jan2021 a y 1
2 01jan2021 b n 2
3 01jan2021 c n 2
4 01jan2021 a n 3
5 02jan2021 b y 1
6 02jan2021 a y 1
7 03jan2021 c y 2
8 03jan2021 a n 2
9 03jan2021 a y 3
;
run;
/*To convert X3 to char*/
Data WideTbl2;
set WideTbl;
X3_char=compress(X3);
drop X3;
rename X3_char=X3;
Run;
data LongTbl;
set WideTbl;
length category $3 labelField $20. ;
field='x1';category=x1;labelField = vlabel(x1);output;
field='x2';category=x2;labelField = vlabel(x2);output;
field='x3';category=x3;labelField = vlabel(x3);output;
drop x1 - x3;
run;
You really should avoid automatic numeric to character conversions that your code used. Explicitly PUT the numeric with an appropriate format AND add the -L modifier to left justify the value unless you want one or more leading spaces.
data LongTbl; set WideTbl; length category $3; field='x1';category=x1;labelField = vlabel(x1);output; field='x2';category=x2;labelField = vlabel(x2);output; field='x3';category=put(x3,best. -L);labelField = vlabel(x3);output; drop x1 - x3; run;
Since you have unique BY variables just use PROC TRANSPOSE.
data WideTbl;
input ID date : date9. x1 $ x2 $ x3 ;
format date ddmmyy10.;
label x1='Team' x2='smoker' X3='status';
cards;
1 01jan2021 a y 1
2 01jan2021 b n 2
3 01jan2021 c n 2
4 01jan2021 a n 3
5 02jan2021 b y 1
6 02jan2021 a y 1
7 03jan2021 c y 2
8 03jan2021 a n 2
9 03jan2021 a y 3
;
proc transpose data=widetbl
out=LongTbl(rename=(_name_=field _label_=labelField col1=category))
;
by id date;
var x1-x3;
run;
If you have mixed numeric and character then CATEGORY will automatically become character.
label Obs ID date field Field category 1 1 01/01/2021 x1 Team a 2 1 01/01/2021 x2 smoker y 3 1 01/01/2021 x3 status 1 4 2 01/01/2021 x1 Team b 5 2 01/01/2021 x2 smoker n 6 2 01/01/2021 x3 status 2 7 3 01/01/2021 x1 Team c 8 3 01/01/2021 x2 smoker n 9 3 01/01/2021 x3 status 2
But you might want to add a step to left justify the strings generated from the numeric variables.
data LongTbl; set LongTbl;
category=left(category);
run;
label Obs ID date field Field category 1 1 01/01/2021 x1 Team a 2 1 01/01/2021 x2 smoker y 3 1 01/01/2021 x3 status 1 4 2 01/01/2021 x1 Team b 5 2 01/01/2021 x2 smoker n 6 2 01/01/2021 x3 status 2 7 3 01/01/2021 x1 Team c 8 3 01/01/2021 x2 smoker n 9 3 01/01/2021 x3 status 2
The IDLABEL statement is for transposing the other way, when you are using the it to make new variables.
The fact that at least one of the variables being transposed had a LABEL attached to it is what caused PROC TRANSPOSE to create the _LABEL_ variable.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.