<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Counting similar comma separated values in columns in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570695#M75250</link>
    <description>The trick with that is you need to know the maximum number of columns ahead of time. SAS can't add columns dynamically. So you either need to know that ahead of time, but then there's no easy way to count distinct within a row either or count duplicates so a long format gives you what you need. It is possible to do what you want, but it's more code.</description>
    <pubDate>Tue, 02 Jul 2019 18:32:59 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2019-07-02T18:32:59Z</dc:date>
    <item>
      <title>Counting similar comma separated values in columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570670#M75246</link>
      <description>&lt;P&gt;Greetings, I'm fairly new to SAS and I was wondering if this would be possible&lt;/P&gt;&lt;P&gt;On the image that I have attached, I am showing a column from an excel file in which several polymorphisms have been found in some patient samples and are separated by commas. I was wondering if there was a way to extract each polymorphism into its own column as and then count how many of those repeat or are unique.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30705iEAC9B0B6ED509237/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 02 Jul 2019 17:47:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570670#M75246</guid>
      <dc:creator>CliveWarren</dc:creator>
      <dc:date>2019-07-02T17:47:15Z</dc:date>
    </item>
    <item>
      <title>Re: Counting similar comma separated values in columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570673#M75247</link>
      <description>I would actually change your data structure by creating multiple lines, one for each polymorphism and then count them that way. &lt;BR /&gt;&lt;BR /&gt;COUNTW() will tell you the number of items you have. &lt;BR /&gt;SCAN() will extract each item&lt;BR /&gt;OUTPUT will create each line. &lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;data long;&lt;BR /&gt;set have;&lt;BR /&gt;ID = _n_; * you likely don't need this;&lt;BR /&gt;n_loop = count(pr_poly);&lt;BR /&gt;do i=1 to n_loop;&lt;BR /&gt;polyMorph = scan(pr_poly, i, ",");&lt;BR /&gt;output;&lt;BR /&gt;end;&lt;BR /&gt;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;To count distinct now:&lt;BR /&gt;&lt;BR /&gt;proc sql;&lt;BR /&gt;create table want as &lt;BR /&gt;select ID, count(distinct polyMorph) as n_distinct&lt;BR /&gt;from long&lt;BR /&gt;group by ID;&lt;BR /&gt;quit;&lt;BR /&gt;&lt;BR /&gt;Your exact data structure may require some modifications to the code, but hopefully that gives you a good idea.</description>
      <pubDate>Tue, 02 Jul 2019 17:51:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570673#M75247</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-02T17:51:55Z</dc:date>
    </item>
    <item>
      <title>Re: Counting similar comma separated values in columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570692#M75248</link>
      <description>&lt;P&gt;I modified the first part of the code because I think you meant to use COUNTW() instead of COUNT()&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have attached a screenshot with some of the results, i was wondering if the idea here was to extract only one value into the polyMorph column according to the value of i. Would it be possible of instead of having one row per polymoprhism, you could have several columns for each polymorphism.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data long;
	set mydata;
	id=_n_;
	n_loop=countw(Pr_poly, ",");

	do i=1 to n_loop;
		polyMorph=scan(pr_poly, i, ',');
		output;
	end;
run;&lt;BR /&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 560px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30707i88EF22B4B1D74420/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 02 Jul 2019 18:30:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570692#M75248</guid>
      <dc:creator>CliveWarren</dc:creator>
      <dc:date>2019-07-02T18:30:18Z</dc:date>
    </item>
    <item>
      <title>Re: Counting similar comma separated values in columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570695#M75250</link>
      <description>The trick with that is you need to know the maximum number of columns ahead of time. SAS can't add columns dynamically. So you either need to know that ahead of time, but then there's no easy way to count distinct within a row either or count duplicates so a long format gives you what you need. It is possible to do what you want, but it's more code.</description>
      <pubDate>Tue, 02 Jul 2019 18:32:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570695#M75250</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-07-02T18:32:59Z</dc:date>
    </item>
    <item>
      <title>Re: Counting similar comma separated values in columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570708#M75252</link>
      <description>&lt;P&gt;I sort of managed to take them into their own column using PROC TRANSPOSE. I suppose I could use other procedures that allow me to count each individual polymorphism&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc transpose data=long out=ordered prefix=polymorph;
	by id;
	var polymorph;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30709iE9B034DAD2DABB35/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Eventually I would like to associate all the different PR_polymorphisms under a single category" style="width: 386px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30710i56814C1859700224/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Eventually I would like to associate all the different PR_polymorphisms under a single category" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Eventually I would like to associate all the different PR_polymorphisms under a single category&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 02 Jul 2019 19:02:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/570708#M75252</guid>
      <dc:creator>CliveWarren</dc:creator>
      <dc:date>2019-07-02T19:02:04Z</dc:date>
    </item>
    <item>
      <title>Re: Counting similar comma separated values in columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/571034#M75287</link>
      <description>&lt;P&gt;To continue this thread in case somebody faces the same issue (or as a reference to myself who might need it later), I managed to count unique polymorphisms after I transposed them each one in their own column using the following code&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data PrWant;
	set hivfull.pr_polymorph;
	array pr_polymorph{24}pr_polymorph1-pr_polymorph24;
	array new {24} $20  _temporary_;
	do _n_=1 to 24;
		new{_n_}=pr_polymorph{_n_};
	end;
	call sortc(of new{*});
	count=(new{1}&amp;gt;'');

	do _n_=2 to 24;

		if new{_n_} ne new{_n_-1} then
			count + 1;
	end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/30731i407F8C973A22EC8A/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The next step would be to evaluate which polymorphisms are the most common regardless of where they are placed in the column, it is likely that polymorphism A in column 23 could be found in a different column number for another patient, which is why i think it would've been more useful to have each polymorphism as its own column name and then rows would check if it is present or not.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jul 2019 17:16:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Counting-similar-comma-separated-values-in-columns/m-p/571034#M75287</guid>
      <dc:creator>CliveWarren</dc:creator>
      <dc:date>2019-07-03T17:16:57Z</dc:date>
    </item>
  </channel>
</rss>

