<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: categorize a character variable with values not mutually exclusive in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651273#M22374</link>
    <description>&lt;P&gt;If each "category" really is represented by exactly one character, you can detect "more than one" with the LENGTH function, which returns the number of characters in fruit_code.&lt;/P&gt;
&lt;P&gt;Instead of&lt;/P&gt;
&lt;PRE&gt;else if index(fruit_code, 'AB') &amp;gt;0 then fruit=3;&lt;/PRE&gt;
&lt;P&gt;use&lt;/P&gt;
&lt;PRE&gt;else if length(fruit_code)&amp;gt;1 then fruit=3;
&lt;/PRE&gt;
&lt;P&gt;However, this comparison must come first in the kind of IF/ELSE chain you show, because when fruit_code = 'AB' both index(fruit_code, 'A') &amp;gt;0 and index(fruit_code, 'B') &amp;gt;0 are true, so an earlier branch would be taken before the 'AB' test is ever reached.&lt;/P&gt;
&lt;P&gt;With the INDEX function you would also have to test for 'BA', since you have not stated that it is an impossible value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your fruit code values are actually something else then you need to provide actual values as this can get messy quickly.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And whose decision was it to put multiple values into a single variable and then compound the issue by not providing a delimiter between values?&lt;/P&gt;</description>
    <pubDate>Thu, 28 May 2020 04:05:29 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2020-05-28T04:05:29Z</dc:date>
    <item>
      <title>categorize a character variable with values not mutually exclusive</title>
      <link>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651266#M22372</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to categorize a variable that can have up to 8 different values; in some instances more than one value has been saved for the same id.&lt;/P&gt;&lt;P&gt;Below is some sample data and my code, which works fine for mutually exclusive values but not when values have been mixed.&lt;/P&gt;&lt;P&gt;How do I code this to get a "More than one.." category?&lt;/P&gt;&lt;P&gt;Appreciate any help!&lt;/P&gt;&lt;P&gt;Margaret&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc format;&lt;BR /&gt;value fruit&lt;BR /&gt;1='1:Apple'&lt;BR /&gt;2='2:Blueberry'&lt;BR /&gt;3='3:More than one fruit';&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data fruit;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input id fruit_code $;&lt;BR /&gt;datalines;&lt;BR /&gt;101 A&lt;BR /&gt;102 B&lt;BR /&gt;103 AB&lt;BR /&gt;104&lt;BR /&gt;;&lt;BR /&gt;data a;&lt;BR /&gt;set fruit;&lt;BR /&gt;if index(fruit_code, 'A') &amp;gt;0 then fruit=1;&lt;BR /&gt;else if index(fruit_code, 'B') &amp;gt;0 then fruit=2;&lt;BR /&gt;else if index(fruit_code, 'AB') &amp;gt;0 then fruit=3;&lt;BR /&gt;else if index(fruit_code, ' ') &amp;gt;0 then fruit=.;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Thu, 28 May 2020 03:34:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651266#M22372</guid>
      <dc:creator>urban58</dc:creator>
      <dc:date>2020-05-28T03:34:38Z</dc:date>
    </item>
    <item>
      <title>Re: categorize a character variable with values not mutually exclusive</title>
      <link>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651271#M22373</link>
      <description>&lt;P&gt;Unsure what the issue is. Like this?&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;if&amp;nbsp; &amp;nbsp; &amp;nbsp; index(fruit_code, 'AB') then fruit=3;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;else if index(fruit_code, 'A' ) then fruit=1;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;else if index(fruit_code, 'B' ) then fruit=2;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;else&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;fruit=.;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 May 2020 03:52:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651271#M22373</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-28T03:52:21Z</dc:date>
    </item>
    <item>
      <title>Re: categorize a character variable with values not mutually exclusive</title>
      <link>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651273#M22374</link>
      <description>&lt;P&gt;If each "category" really is represented by exactly one character, you can detect "more than one" with the LENGTH function, which returns the number of characters in fruit_code.&lt;/P&gt;
&lt;P&gt;Instead of&lt;/P&gt;
&lt;PRE&gt;else if index(fruit_code, 'AB') &amp;gt;0 then fruit=3;&lt;/PRE&gt;
&lt;P&gt;use&lt;/P&gt;
&lt;PRE&gt;else if length(fruit_code)&amp;gt;1 then fruit=3;
&lt;/PRE&gt;
&lt;P&gt;However, this comparison must come first in the kind of IF/ELSE chain you show, because when fruit_code = 'AB' both index(fruit_code, 'A') &amp;gt;0 and index(fruit_code, 'B') &amp;gt;0 are true, so an earlier branch would be taken before the 'AB' test is ever reached.&lt;/P&gt;
&lt;P&gt;With the INDEX function you would also have to test for 'BA', since you have not stated that it is an impossible value.&lt;/P&gt;
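&lt;P&gt;As a minimal sketch of that ordering, assuming the data set fruit from the question and exactly one character per category: the length test goes first and the single-letter tests follow. LENGTH ignores trailing blanks and returns 1 for a blank value, so a missing fruit_code falls through to the final ELSE.&lt;/P&gt;
&lt;PRE&gt;data a;
   set fruit;
   /* Multi-value case first: any code longer than one character */
   if length(fruit_code) &amp;gt; 1 then fruit = 3;
   else if fruit_code = 'A' then fruit = 1;
   else if fruit_code = 'B' then fruit = 2;
   else fruit = .;   /* blank or unexpected single character */
run;
&lt;/PRE&gt;
&lt;P&gt;Because the first test looks at the length rather than at a specific substring, 'BA' is caught the same way as 'AB'.&lt;/P&gt;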
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your fruit code values are actually something else then you need to provide actual values as this can get messy quickly.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And whose decision was it to put multiple values into a single variable and then compound the issue by not providing a delimiter between values?&lt;/P&gt;</description>
      <pubDate>Thu, 28 May 2020 04:05:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651273#M22374</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-05-28T04:05:29Z</dc:date>
    </item>
    <item>
      <title>Re: categorize a character variable with values not mutually exclusive</title>
      <link>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651344#M22376</link>
      <description>&lt;P&gt;Thank you both for your replies.&lt;/P&gt;&lt;P&gt;The data is of course much more complicated as ballardw astutely points out.&lt;/P&gt;&lt;P&gt;Here is a more realistic dataset&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;for this problem, the characters pertaining to fruit are A,B,C,N,P,R,Y,O (and missing); the other characters (E,L) can be ignored (i.e. removed) for this problem&lt;/P&gt;&lt;P&gt;How do I adapt my code below to have another category fruit=9 (more than 1 fruit) with E,L removed&lt;/P&gt;&lt;P&gt;Many thanks!&lt;/P&gt;&lt;P&gt;Margaret&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data fruit;&lt;/P&gt;&lt;P&gt;infile datalines truncover;&lt;BR /&gt;input id fruit_code $;&lt;BR /&gt;datalines;&lt;BR /&gt;101&lt;BR /&gt;102 A&lt;BR /&gt;103 B&lt;BR /&gt;104 BB&lt;BR /&gt;105 BEB&lt;BR /&gt;105 BEBN&lt;BR /&gt;106 BECBN&lt;BR /&gt;107 E&lt;BR /&gt;108 EC&lt;BR /&gt;109 ECBN&lt;BR /&gt;110 ECN&lt;BR /&gt;111 LL&lt;BR /&gt;112 LLR&lt;BR /&gt;113 LLC&lt;BR /&gt;114 C&lt;BR /&gt;115 Y&lt;BR /&gt;116 YA&lt;BR /&gt;117 YEA&lt;BR /&gt;118 YECA&lt;BR /&gt;119 LPR&lt;BR /&gt;120 O&lt;BR /&gt;;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;data a;&lt;BR /&gt;set fruit;&lt;BR /&gt;if index(fruit_code, 'A') &amp;gt;0 then fruit=1;&lt;BR /&gt;else if index(fruit_code, 'B') &amp;gt;0 then fruit=2;&lt;BR /&gt;else if index(fruit_code, 'C') &amp;gt;0 then fruit=3;&lt;BR /&gt;else if index(fruit_code, 'N') &amp;gt;0 then fruit=4;&lt;BR /&gt;else if index(fruit_code, 'P') &amp;gt;0 then fruit=5;&lt;BR /&gt;else if index(fruit_code, 'R') &amp;gt;0 then fruit=6;&lt;BR /&gt;else if index(fruit_code, 'Y') &amp;gt;0 then fruit=7;&lt;BR /&gt;else if index(fruit_code, 'O') &amp;gt;0 then fruit=8;&lt;BR /&gt;else if index(fruit_code, ' ') &amp;gt;0 then fruit=.;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Thu, 28 May 2020 10:40:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651344#M22376</guid>
      <dc:creator>urban58</dc:creator>
      <dc:date>2020-05-28T10:40:00Z</dc:date>
    </item>
    <item>
      <title>Re: categorize a character variable with values not mutually exclusive</title>
      <link>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651419#M22380</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/322157"&gt;@urban58&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thank you both for your replies.&lt;/P&gt;
&lt;P&gt;The data is of course much more complicated as ballardw astutely points out.&lt;/P&gt;
&lt;P&gt;Here is a more realistic dataset&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;for this problem, the characters pertaining to fruit are A,B,C,N,P,R,Y,O (and missing); the other characters (E,L) can be ignored (i.e. removed) for this problem&lt;/P&gt;
&lt;P&gt;How do I adapt my code below to have another category fruit=9 (more than 1 fruit) with E,L removed&lt;/P&gt;
&lt;P&gt;Many thanks!&lt;/P&gt;
&lt;P&gt;Margaret&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data fruit;&lt;/P&gt;
&lt;P&gt;infile datalines truncover;&lt;BR /&gt;input id fruit_code $;&lt;BR /&gt;datalines;&lt;BR /&gt;101&lt;BR /&gt;102 A&lt;BR /&gt;103 B&lt;BR /&gt;104 BB&lt;BR /&gt;105 BEB&lt;BR /&gt;105 BEBN&lt;BR /&gt;106 BECBN&lt;BR /&gt;107 E&lt;BR /&gt;108 EC&lt;BR /&gt;109 ECBN&lt;BR /&gt;110 ECN&lt;BR /&gt;111 LL&lt;BR /&gt;112 LLR&lt;BR /&gt;113 LLC&lt;BR /&gt;114 C&lt;BR /&gt;115 Y&lt;BR /&gt;116 YA&lt;BR /&gt;117 YEA&lt;BR /&gt;118 YECA&lt;BR /&gt;119 LPR&lt;BR /&gt;120 O&lt;BR /&gt;;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;data a;&lt;BR /&gt;set fruit;&lt;BR /&gt;if index(fruit_code, 'A') &amp;gt;0 then fruit=1;&lt;BR /&gt;else if index(fruit_code, 'B') &amp;gt;0 then fruit=2;&lt;BR /&gt;else if index(fruit_code, 'C') &amp;gt;0 then fruit=3;&lt;BR /&gt;else if index(fruit_code, 'N') &amp;gt;0 then fruit=4;&lt;BR /&gt;else if index(fruit_code, 'P') &amp;gt;0 then fruit=5;&lt;BR /&gt;else if index(fruit_code, 'R') &amp;gt;0 then fruit=6;&lt;BR /&gt;else if index(fruit_code, 'Y') &amp;gt;0 then fruit=7;&lt;BR /&gt;else if index(fruit_code, 'O') &amp;gt;0 then fruit=8;&lt;BR /&gt;else if index(fruit_code, ' ') &amp;gt;0 then fruit=.;&lt;BR /&gt;run;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I suggest you show us the expected output for some of those values. I ask because you have added a significant complication with the phrase "the other characters (E,L) can be ignored (i.e. removed)" and with codes that are duplicated within a single string. For example, is "BB" supposed to count as more than one fruit or not?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
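&lt;P&gt;One possible approach, sketched under two assumptions that you would need to confirm: E and L carry no fruit information, and a repeated letter such as "BB" counts as a single fruit. It counts how many distinct fruit letters remain once E and L are removed; the helper variables codes, n_fruit and i are illustrative names only.&lt;/P&gt;
&lt;PRE&gt;data a;
   set fruit;
   /* Assumption: E and L can simply be dropped */
   codes = compress(fruit_code, 'EL');
   n_fruit = 0;
   fruit = .;
   do i = 1 to 8;
      /* Position i in 'ABCNPRYO' matches fruit numbers 1-8 from the question */
      if index(codes, substr('ABCNPRYO', i, 1)) &amp;gt; 0 then do;
         n_fruit = n_fruit + 1;
         fruit = i;
      end;
   end;
   /* Two or more distinct fruit letters: the "more than 1 fruit" category */
   if n_fruit &amp;gt; 1 then fruit = 9;
   drop i codes n_fruit;
run;
&lt;/PRE&gt;
&lt;P&gt;If "BB" should instead count as a multiple, the count would have to be based on the number of remaining characters rather than on distinct letters.&lt;/P&gt;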
&lt;P&gt;If you do not want to consider E and L when present then you can use&amp;nbsp; compress(fruit_code,'EL') instead of fruit_code for the comparisons. The compress function removes the individual characters in the second parameter from the string.&lt;/P&gt;</description>
      <pubDate>Thu, 28 May 2020 15:24:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/categorize-a-character-variable-with-values-not-mutually/m-p/651419#M22380</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-05-28T15:24:41Z</dc:date>
    </item>
  </channel>
</rss>

