Hi.
i have the following long formatted data (2 variables: ID and var (with more than 32 bytes))
id var
1 my name is mike
1 8 z street
1 unit 5
2 my name is z
2 78 8th street
2 city oktown - old town of the fairf county
etc.
i want to create a dummy variable for each unique value of variable “var” (name of the dummy variables should be col1, col2 etc. and labels should come from the values of var). Can anyone help me?
thank you.
Hi Zula,
Assuming there is no duplicate observation exists for VAR variable.
data test;
infile datalines dsd dlm='';
input id var $50.;
datalines;
1 my name is mike
1 8 z street
1 unit 5
2 my name is z
2 78 8th street
2 city oktown - old town of the fairf county
;
run;
proc transpose data=test out=want (drop=_name_);
id var;
idlabel var;
var id;
by id notsorted;
run;Please show what you expect as output based on that input and what you've tried so far. Hint: PROC TRANSPOSE should be your starting point and look at ID and IDLABEL.
@Zula wrote:
Hi.
i have the following long formatted data (2 variables: ID and var (with more than 32 bytes))
id var
1 my name is mike
1 8 z street
1 unit 5
2 my name is z
2 78 8th street
2 city oktown - old town of the fairf county
etc.
i want to create a dummy variable for each unique value of variable “var” (name of the dummy variables should be col1, col2 etc. and labels should come from the values of var). Can anyone help me?
thank you.
The output should be as below (wide formatted):
ID col1 (labeled as my name is mike) col2 (labeled as 8 z street) col3 (labeled unit 5) col4 (labeled my name is z)etc.
1 1 1 1 0
2 0 0 0 1
thank you.
try doing a proc freq then a proc transpose and you should get what you are asking for.
Wow what a possibility for the n columns you will create.
Hi Zula,
Assuming there is no duplicate observation exists for VAR variable.
data test;
infile datalines dsd dlm='';
input id var $50.;
datalines;
1 my name is mike
1 8 z street
1 unit 5
2 my name is z
2 78 8th street
2 city oktown - old town of the fairf county
;
run;
proc transpose data=test out=want (drop=_name_);
id var;
idlabel var;
var id;
by id notsorted;
run;@singhsahab This is a bit closer to what @Zula wants.
data test;
   input id var $50.;
   datalines;
1 my name is mike
1 8 z street
1 unit 5
2 my name is z
2 78 8th street
2 city oktown - old town of the fairf county
;;;;
   run;
proc print;
   run;
proc glmmod noprint data=test outdesign=want0 order=data;
   class var;
   model id = var / noint;
   run;
proc summary data=want0 nway;
   class id;
   output out=want(drop=_type_ _freq_) sum(col:)=;
   run;
proc contents varnum data=want;
proc print data=want;
   run;
@data_null__ Good morning Guru, Thank you. However, while a somebody like me who is not even close to your level of expertise, could I bother you for some comments as to what it does. Sure, I am dig in with docs and follow the logic/results, I will eventually get there. Just asking plz
@novinosrin wrote:
@data_null__ Good morning Guru, Thank you. However, while a somebody like me who is not even close to your level of expertise, could I bother you for some comments as to what it does. Sure, I am dig in with docs and follow the logic/results, I will eventually get there. Just asking plz
@novinosrin this example is bit more general and make the variable labels with out the leading VAR. Just the value of VAR.
PROC GLMMOD is one of the ways that we can use to create dummy variables. In this case for each value of VAR a new dummy variable is created. Much easier than data step.
I've added the variable _OBS_ to the data set so we can merge the dummies back to the original data and serve as the Y variable in the model.
I use the OUTPARM data from GLMMOD to create a new data set call LABELS that will supply new labels for the COL variable with out variable name VAR as the beginning.
879  data test;
880     _obs_ + 1; /*this will be Y left side analysis variable*/
881     input id:$1. var $50.;
882     datalines;
NOTE: The data set WORK.TEST has 9 observations and 3 variables.
892  ;;;;
893     run;
894  proc print;
895     run;
Obs    _obs_    id    var
 1       1      1     my name is mike
 2       2      1     8 z street
 3       3      1     unit 5
 4       4      2     my name is z
 5       5      2     78 8th street
 6       6      2     city oktown - old town of the fairf county
 7       7      3     8 z street
 8       8      3     unit 5
 9       9      3     my name is z
896  /*Create dummy variables for each level of the variable VAR */
897  /*The default name of the new variables is COLn*/
898  proc glmmod noprint data=test outdesign=want0 order=data outparm=parm;
899     class var;
900     model _obs_ = var / noint; /*Don't need the intercept*/
901     run;
NOTE: There were 9 observations read from the data set WORK.TEST.
NOTE: The data set WORK.WANT0 has 9 observations and 7 variables.
NOTE: The data set WORK.PARM has 6 observations and 3 variables.
902  /*The default label for the COL variables is "effect name" value */
903  /*I don't want the effect name, use the OUTPARM data to create new labels */
904  proc print data=parm;
905     run;
Obs    _COLNUM_    EFFNAME    var
 1         1         var      my name is mike
 2         2         var      8 z street
 3         3         var      unit 5
 4         4         var      my name is z
 5         5         var      78 8th street
 6         6         var      city oktown - old town of the fairf county
906  /*I guess this is a bit of a trick to create numeric variables and label them*/
907  proc transpose data=parm out=labels(drop=_name_) prefix=COL;
908     id _colnum_;
909     idlabel var;
910     var;
911     run;
NOTE: No variables to transpose.
NOTE: There were 6 observations read from the data set WORK.PARM.
NOTE: The data set WORK.LABELS has 0 observations and 6 variables.
912  proc contents varnum;
913     run;
                        Variables in Creation Order
#    Variable    Type    Len    Label
1    COL1        Num       8    my name is mike
2    COL2        Num       8    8 z street
3    COL3        Num       8    unit 5
4    COL4        Num       8    my name is z
5    COL5        Num       8    78 8th street
6    COL6        Num       8    city oktown - old town of the fairf county
914
915  /*Merge the dummies with the input data*/
916  data want;
917     if 0 then set test labels; /*set order of variables and get new labels for the COLn variables*/
918     merge test want0;
919     by _obs_;
920     run;
NOTE: There were 9 observations read from the data set WORK.TEST.
NOTE: There were 9 observations read from the data set WORK.WANT0.
NOTE: The data set WORK.WANT has 9 observations and 9 variables.
921  proc contents varnum data=want;
                         Variables in Creation Order
 #    Variable    Type    Len    Label
 1    _obs_       Num       8
 2    id          Char      1
 3    var         Char     50
 4    COL1        Num       8    my name is mike
 5    COL2        Num       8    8 z street
 6    COL3        Num       8    unit 5
 7    COL4        Num       8    my name is z
 8    COL5        Num       8    78 8th street
 9    COL6        Num       8    city oktown - old town of the fairf county
922  proc print data=want;
923     run;
Obs    _obs_    id    var                                           COL1    COL2    COL3    COL4    COL5    COL6
 1       1      1     my name is mike                                 1       0       0       0       0       0
 2       2      1     8 z street                                      0       1       0       0       0       0
 3       3      1     unit 5                                          0       0       1       0       0       0
 4       4      2     my name is z                                    0       0       0       1       0       0
 5       5      2     78 8th street                                   0       0       0       0       1       0
 6       6      2     city oktown - old town of the fairf county      0       0       0       0       0       1
 7       7      3     8 z street                                      0       1       0       0       0       0
 8       8      3     unit 5                                          0       0       1       0       0       0
 9       9      3     my name is z                                    0       0       0       1       0       0
					
				
			
			
				
			
			
			
			
			
			
			
		Class act and Thank you sooooooooooooo much! Can't appreciate enough!!!!!!!!!!!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.
