<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: create design matrix from long dataset in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397511#M96074</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or if just need to generate a matrix.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc tabulate data=have;&lt;BR /&gt;class id dx;&lt;BR /&gt;table id,dx=""*N="";&lt;BR /&gt;run;&lt;/P&gt;</description>
    <pubDate>Wed, 20 Sep 2017 16:29:55 GMT</pubDate>
    <dc:creator>stat_sas</dc:creator>
    <dc:date>2017-09-20T16:29:55Z</dc:date>
    <item>
      <title>create design matrix from long dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397506#M96072</link>
      <description>&lt;P&gt;I have a long dataset like the example below (in my real dataset I have thousands of ID's and approx one thousand distinct Dx values). &amp;nbsp;&lt;/P&gt;
&lt;P&gt;ID &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Dx&lt;/P&gt;
&lt;P&gt;Pat1 &amp;nbsp; &amp;nbsp; A&lt;/P&gt;
&lt;P&gt;Pat1 &amp;nbsp; &amp;nbsp; B&lt;/P&gt;
&lt;P&gt;Pat1 &amp;nbsp; &amp;nbsp; C&lt;/P&gt;
&lt;P&gt;Pat2 &amp;nbsp; &amp;nbsp; B&lt;/P&gt;
&lt;P&gt;Pat2 &amp;nbsp; &amp;nbsp; D&lt;/P&gt;
&lt;P&gt;Pat3 &amp;nbsp; &amp;nbsp; A&lt;/P&gt;
&lt;P&gt;Pat3 &amp;nbsp; &amp;nbsp; D&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And I want to generate a design matrix like the one below (obviously the real design matrix will be high dimensional with a column for each Dx value). &amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;ID &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; A &amp;nbsp; &amp;nbsp; &amp;nbsp;B &amp;nbsp; &amp;nbsp; C &amp;nbsp; &amp;nbsp; D&lt;/P&gt;
&lt;P&gt;Pat1 &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;/P&gt;
&lt;P&gt;Pat2 &amp;nbsp; &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp;&lt;/P&gt;
&lt;P&gt;Pat3 &amp;nbsp; &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My original plan was to transpose the long dataset, and then let proc glm generate the design matrix using a class statement; however I can't get the data to transpose how proc glm would need it with distinct dx values in each dx column. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using transpose I haven't been able to get anything better than the structure below.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;ID &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Dx1 &amp;nbsp; &amp;nbsp; &amp;nbsp;Dx2 &amp;nbsp; &amp;nbsp; Dx3&lt;/P&gt;
&lt;P&gt;Pat1 &amp;nbsp; &amp;nbsp; &amp;nbsp; A &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; B &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; C&lt;/P&gt;
&lt;P&gt;Pat2 &amp;nbsp; &amp;nbsp; &amp;nbsp; B &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; D&lt;/P&gt;
&lt;P&gt;Pat3 &amp;nbsp; &amp;nbsp; &amp;nbsp; A&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've thought about approaches using loops, arrays, and proc sql, but I haven't been able to put the pieces together to get to the end design matrix I need. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any help is appreciated, and while any answer that works is great, in my situation a proc sql solution would be optimal. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks much!&lt;/P&gt;</description>
      <pubDate>Wed, 20 Sep 2017 16:15:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397506#M96072</guid>
      <dc:creator>bootstrap_armada</dc:creator>
      <dc:date>2017-09-20T16:15:18Z</dc:date>
    </item>
    <item>
      <title>Re: create design matrix from long dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397508#M96073</link>
      <description>&lt;P&gt;PROC TRANSPOSE can handle this.&amp;nbsp; You just need to use this statement:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;id Dx;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since Dx is no longer going to be in the VAR statement, you may need to take a couple of additional steps (try it and see if they are needed):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Before transposing, create a variable that is 1 on each observation, and use that in the VAR statement of PROC TRANSPOSE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;After the transpose go back and replace missing values with zeros.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One final consideration ... your incoming data set may may be to deduplicated.&amp;nbsp; You can't have multiple observations for the same ID with the same DX.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Sep 2017 16:22:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397508#M96073</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2017-09-20T16:22:26Z</dc:date>
    </item>
    <item>
      <title>Re: create design matrix from long dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397511#M96074</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or if just need to generate a matrix.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc tabulate data=have;&lt;BR /&gt;class id dx;&lt;BR /&gt;table id,dx=""*N="";&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Wed, 20 Sep 2017 16:29:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397511#M96074</guid>
      <dc:creator>stat_sas</dc:creator>
      <dc:date>2017-09-20T16:29:55Z</dc:date>
    </item>
    <item>
      <title>Re: create design matrix from long dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397778#M96157</link>
      <description>&lt;P&gt;This is a good, simple solution - I hadn't thought of that approach with proc transpose. &amp;nbsp;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 21 Sep 2017 13:56:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/create-design-matrix-from-long-dataset/m-p/397778#M96157</guid>
      <dc:creator>bootstrap_armada</dc:creator>
      <dc:date>2017-09-21T13:56:13Z</dc:date>
    </item>
  </channel>
</rss>

