<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Create ID groups by criteria in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680589#M205708</link>
    <description>&lt;P&gt;Hi All, I have the following data. I'd like to create 3 groups (1, 3, 5) based on the percent variable. I'd like the group = 1 to represent the closest approximation to 20%. For example, .18 and .23 exist but .18 is closer to .2 so I'd like that to be the cutoff. Essentially, the tie goes to the number that minimizes the difference from .2. The same logic applies for group 5 but instead I'd like the cutoff to be .8, again I'd like the tie to go towards the number that minimizes the difference from 0.8. For all the values in between, they can be assigned group = 3. The want data is below. Thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
   input percent;
   datalines;
.01
.05
.16
.18
.23
.46
.55 
.67
.69
.698
.74
.78
.81
.91
.82
;
run;

data want;
   input percent group;
   datalines;
.01 1
.05 1
.16 1
.18 1
.23 3
.46 3
.55 3
.67 3
.69 3
.698 3
.74 3
.78 3
.81 5
.91 5
.92 5
;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Tue, 01 Sep 2020 01:05:47 GMT</pubDate>
    <dc:creator>PeterBr</dc:creator>
    <dc:date>2020-09-01T01:05:47Z</dc:date>
    <item>
      <title>Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680589#M205708</link>
      <description>&lt;P&gt;Hi All, I have the following data. I'd like to create 3 groups (1, 3, 5) based on the percent variable. I'd like the group = 1 to represent the closest approximation to 20%. For example, .18 and .23 exist but .18 is closer to .2 so I'd like that to be the cutoff. Essentially, the tie goes to the number that minimizes the difference from .2. The same logic applies for group 5 but instead I'd like the cutoff to be .8, again I'd like the tie to go towards the number that minimizes the difference from 0.8. For all the values in between, they can be assigned group = 3. The want data is below. Thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
   input percent;
   datalines;
.01
.05
.16
.18
.23
.46
.55 
.67
.69
.698
.74
.78
.81
.91
.82
;
run;

data want;
   input percent group;
   datalines;
.01 1
.05 1
.16 1
.18 1
.23 3
.46 3
.55 3
.67 3
.69 3
.698 3
.74 3
.78 3
.81 5
.91 5
.92 5
;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 01 Sep 2020 01:05:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680589#M205708</guid>
      <dc:creator>PeterBr</dc:creator>
      <dc:date>2020-09-01T01:05:47Z</dc:date>
    </item>
    <item>
      <title>Re: Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680598#M205714</link>
      <description>&lt;P&gt;Sorry the .82 in the have data should have been .92, here it is again:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;data have;
   input percent;
   datalines;
.01
.05
.16
.18
.23
.46
.55 
.67
.69
.698
.74
.78
.81
.91
.92
;
run;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE class="language-sas"&gt;&lt;CODE&gt;data want;
   input percent group;
   datalines;
.01 1
.05 1
.16 1
.18 1
.23 3
.46 3
.55 3
.67 3
.69 3
.698 3
.74 3
.78 3
.81 5
.91 5
.92 5
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2020 02:28:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680598#M205714</guid>
      <dc:creator>PeterBr</dc:creator>
      <dc:date>2020-09-01T02:28:54Z</dc:date>
    </item>
    <item>
      <title>Re: Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680600#M205715</link>
      <description>&lt;P&gt;What if you have a tie, e.g.&amp;nbsp; one value of .18 and one value of .22?&amp;nbsp; They are the same distance from .2, so which one becomes your cutoff for group 1?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And what if your lowest value is, say, .35?&amp;nbsp;&amp;nbsp; Does it get put in group 1?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2020 02:39:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680600#M205715</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-09-01T02:39:21Z</dc:date>
    </item>
    <item>
      <title>Re: Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680619#M205727</link>
      <description>&lt;P&gt;From your description i expected 0.81 to be in group 3, because the distance to 0.8 is smaller than the distance between 0.78 and 0.8&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2020 06:23:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680619#M205727</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2020-09-01T06:23:19Z</dc:date>
    </item>
    <item>
      <title>Re: Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680620#M205728</link>
      <description>&lt;P&gt;Here is some SQL that finds the 'breakpoints' and uses them to apply the group values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The 1 and 5 groups are computed to contain the largest number of percents in case of ties, which means ties for group 1 break at value &amp;gt; 0.20 and ties for group 5 break at value &amp;lt; 0.80.&amp;nbsp; You will want to ROUND results to ensure ties are properly identified.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data have;&lt;BR /&gt;input percent;&lt;BR /&gt;datalines;&lt;BR /&gt;.01&lt;BR /&gt;.05&lt;BR /&gt;.16&lt;BR /&gt;.18&lt;BR /&gt;.22 /* higher tie */&lt;BR /&gt;.23&lt;BR /&gt;.46&lt;BR /&gt;.55 &lt;BR /&gt;.67&lt;BR /&gt;.69&lt;BR /&gt;.698&lt;BR /&gt;.74&lt;BR /&gt;.78 /* lower tie */&lt;BR /&gt;.82&lt;BR /&gt;.91&lt;BR /&gt;.92&lt;BR /&gt;;

proc sql;
  create table breakpoints as
  select percent, group, diff
  from 
  (
    ( select
      percent
    , 1 as group
    , round(abs(percent-0.20),0.01) as diff
    from have
    having diff = min(diff)
    )
    union
    (select
      percent
    , 5 as group
    , round(abs(percent-0.80),0.01) as diff
    from have
    having diff = min(diff)
    )
  )
  group by group
  having group = 1 and percent=max(percent) /* tie processing */
      or group = 5 and percent=min(percent)
  ;

  create table want as
  select 
    have.*
  , coalesce(group, 3) as group
  from have 
  left join breakpoints
  on 
  have.percent &amp;lt;= breakpoints.percent and group = 1
  or
  have.percent &amp;gt;= breakpoints.percent and group = 5
  order by percent
  ;&lt;/PRE&gt;</description>
      <pubDate>Tue, 01 Sep 2020 06:25:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680620#M205728</guid>
      <dc:creator>RichardDeVen</dc:creator>
      <dc:date>2020-09-01T06:25:17Z</dc:date>
    </item>
    <item>
      <title>Re: Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680843#M205861</link>
      <description>&lt;P&gt;Hi this code is almost perfect, I really appreciate it! One thing I noticed though is when there are duplicate percent values, the step when creating the want table, it deletes the duplicates so that only one unique value of percent can exist. With the have data below, your code produces the want data and see that .23 and .18 duplicates are both missing. Is there a way to resolve that? It says there there might be a cartesian product problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
   input percent;
   datalines;
.01
.05
.16
.18
.18
.23
.23
.46
.55
.67
.69
.698
.74
.78
.82
.91
.92
;
run;


data want;
   input percent group;
   datalines;
.01 1
.05 1
.16 1
.18 1
.23 3
.46 3
.55 3
.67 3
.69 3
.698 3
.74 3
.78 3
.82 5
.91 5
.92 5
;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 02 Sep 2020 01:37:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680843#M205861</guid>
      <dc:creator>PeterBr</dc:creator>
      <dc:date>2020-09-02T01:37:37Z</dc:date>
    </item>
    <item>
      <title>Re: Create ID groups by criteria</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680845#M205862</link>
      <description>&lt;P&gt;ignore my last comment it was a problem on my end, works great thanks so much!&lt;/P&gt;</description>
      <pubDate>Wed, 02 Sep 2020 01:42:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-ID-groups-by-criteria/m-p/680845#M205862</guid>
      <dc:creator>PeterBr</dc:creator>
      <dc:date>2020-09-02T01:42:17Z</dc:date>
    </item>
  </channel>
</rss>

