I have a long dataset like the example below (in my real dataset I have thousands of ID's and approx one thousand distinct Dx values).
ID Dx
Pat1 A
Pat1 B
Pat1 C
Pat2 B
Pat2 D
Pat3 A
Pat3 D
And I want to generate a design matrix like the one below (obviously the real design matrix will be high dimensional with a column for each Dx value).
ID A B C D
Pat1 1 1 1 0
Pat2 0 1 0 1
Pat3 1 0 0 1
My original plan was to transpose the long dataset, and then let proc glm generate the design matrix using a class statement; however I can't get the data to transpose how proc glm would need it with distinct dx values in each dx column.
Using transpose I haven't been able to get anything better than the structure below.
ID Dx1 Dx2 Dx3
Pat1 A B C
Pat2 B D
Pat3 A
I've thought about approaches using loops, arrays, and proc sql, but I haven't been able to put the pieces together to get to the end design matrix I need.
Any help is appreciated, and while any answer that works is great, in my situation a proc sql solution would be optimal.
Thanks much!
... View more