<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dropping Some Observations in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673121#M202433</link>
    <description>&lt;P&gt;So by "abbreviation" you don't mean things like "ltd" (for limited) or "dir." (for director).&amp;nbsp; Your examples seem to be such that one title is a sequence of characters that falls within the other title.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If so, and your data are sorted by jobcode:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
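&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have;      /* supplies first.jobcode and last.jobcode */
  by jobcode;
  /* one-to-one merge: current record plus a look-ahead at the next title only */
  merge have
        have (firstobs=2 keep=job_title rename=(job_title=nxt_job_title));
  prv_job_title = lag(job_title);   /* call LAG on every record so its queue stays in step */
  if (first.jobcode=0 and length(job_title)&amp;gt;length(prv_job_title))
  or (last.jobcode=0  and length(job_title)&amp;gt;length(nxt_job_title))
  or (first.jobcode=1 and last.jobcode=1);
  drop prv_job_title nxt_job_title;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The subsetting IF keeps a record when any of these three conditions holds:&lt;/P&gt;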
&lt;OL&gt;
&lt;LI&gt;The record-in-hand is the second for this jobcode and its jobtitle length is greater than the preceding (first for this jobcode) jobtitle.&lt;/LI&gt;
&lt;LI&gt;The record-in-hand is the first for this jobcode and its jobtitle length is greater than the following (second for this jobcode).&lt;/LI&gt;
&lt;LI&gt;The record-in-hand is the only one for this jobcode.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Note this program assumes you never have more than 2 records per jobcode.&lt;/P&gt;</description>
    <pubDate>Wed, 29 Jul 2020 13:45:22 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2020-07-29T13:45:22Z</dc:date>
    <item>
      <title>Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673109#M202428</link>
      <description>&lt;P&gt;Could sure use some help on this problem, please.&amp;nbsp;I have a data set with duplicate job codes where one observation has an abbreviated job title and the other has the complete job title. I need to keep the observation with the complete job title and drop the observation with the abbreviated one. As you can see, in some cases the abbreviated form comes first, but in others the complete form comes first.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="2020-07-29_8-30-39.jpg" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/47760iADA98A8B0F3BC235/image-size/large?v=v2&amp;amp;px=999" role="button" title="2020-07-29_8-30-39.jpg" alt="2020-07-29_8-30-39.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jul 2020 13:08:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673109#M202428</guid>
      <dc:creator>JeffreyLowe</dc:creator>
      <dc:date>2020-07-29T13:08:33Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673115#M202431</link>
      <description>&lt;P&gt;I assume that:&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;- the only difference between observations with the same job_code is the job title,&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;- the complete title is the longest one,&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;- the file is sorted by job_code.&lt;/P&gt;
&lt;P&gt;If so, the following code saves the wanted result:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have;
  by job_code;
  length title $50;   /* max expected length */
  retain title;       /* longest title seen so far for this job_code */
  if first.job_code then title = job_title;
  else if length(job_title) &amp;gt; length(title) then title = job_title;
  if last.job_code then do;
    job_title = title;   /* keep the longest title */
    output;
  end;
  drop title;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 29 Jul 2020 13:37:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673115#M202431</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2020-07-29T13:37:17Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673121#M202433</link>
      <description>&lt;P&gt;So by "abbreviation" you don't mean things like "ltd" (for limited) or "dir." (for director).&amp;nbsp; Your examples seem to be such that one title is a sequence of characters that falls within the other title.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If so, and your data are sorted by jobcode:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
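&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have;      /* supplies first.jobcode and last.jobcode */
  by jobcode;
  /* one-to-one merge: current record plus a look-ahead at the next title only */
  merge have
        have (firstobs=2 keep=job_title rename=(job_title=nxt_job_title));
  prv_job_title = lag(job_title);   /* call LAG on every record so its queue stays in step */
  if (first.jobcode=0 and length(job_title)&amp;gt;length(prv_job_title))
  or (last.jobcode=0  and length(job_title)&amp;gt;length(nxt_job_title))
  or (first.jobcode=1 and last.jobcode=1);
  drop prv_job_title nxt_job_title;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The subsetting IF keeps a record when any of these three conditions holds:&lt;/P&gt;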
&lt;OL&gt;
&lt;LI&gt;The record-in-hand is the second for this jobcode and its jobtitle length is greater than the preceding (first for this jobcode) jobtitle.&lt;/LI&gt;
&lt;LI&gt;The record-in-hand is the first for this jobcode and its jobtitle length is greater than the following (second for this jobcode).&lt;/LI&gt;
&lt;LI&gt;The record-in-hand is the only one for this jobcode.&lt;/LI&gt;
&lt;/OL&gt;
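&lt;P&gt;To try the program above, here is a small made-up HAVE data set. All values are hypothetical, and it assumes character variables named jobcode and job_title, at most two records per jobcode, and data already sorted by jobcode:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* hypothetical sample data, sorted by jobcode */
data have;
  infile datalines dsd truncover;
  input jobcode :$6. job_title :$40.;
datalines;
900001,Finance Dir
900001,Finance Director
900002,Payroll Manager
900002,Payroll Man
900003,Analyst
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The records you would want to keep from this sample are Finance Director, Payroll Manager and Analyst.&lt;/P&gt;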
&lt;P&gt;Note this program assumes you never have more than 2 records per jobcode.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jul 2020 13:45:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673121#M202433</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-07-29T13:45:22Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673143#M202443</link>
      <description>&lt;P&gt;Your solution worked great, except for one thing that I totally forgot to mention (my fault). In these examples, the job_code and job_title are duplicates but the BU_Level_01 values are different, so in these cases I would need to keep both. So sorry for the confusion. Other than that, the code you provided works perfectly! Thank you!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="2020-07-29_10-13-15.jpg" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/47764i8D8037D34E1D1253/image-size/large?v=v2&amp;amp;px=999" role="button" title="2020-07-29_10-13-15.jpg" alt="2020-07-29_10-13-15.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jul 2020 14:21:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673143#M202443</guid>
      <dc:creator>JeffreyLowe</dc:creator>
      <dc:date>2020-07-29T14:21:38Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673146#M202446</link>
      <description>&lt;P&gt;Thank you for your response. Yes, things like "Dir" would be considered a duplicate if there were a second observation with the same job_code but with "Director" spelled out, as seen here:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;100041 Learning &amp;amp; Development Dir&lt;BR /&gt;100041 Learning &amp;amp; Development Director&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is a duplicate: the first needs to be dropped, and the second is the observation I want to keep.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jul 2020 14:25:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673146#M202446</guid>
      <dc:creator>JeffreyLowe</dc:creator>
      <dc:date>2020-07-29T14:25:15Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673239#M202483</link>
      <description>&lt;P&gt;It is easy to check both job_code and BU_Level_01. Updated code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=have out=temp;
  by job_code BU_Level_01;
run;
data want;
  set temp;
  by job_code BU_Level_01;
  length title $50;   /* max expected length */
  retain title;       /* longest title seen so far in this group */
  if first.BU_Level_01 then title = job_title;
  else if length(job_title) &amp;gt; length(title) then title = job_title;
  if last.BU_Level_01 then do;
    job_title = title;   /* keep the longest title */
    output;
  end;
  drop title;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 29 Jul 2020 18:41:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673239#M202483</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2020-07-29T18:41:24Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping Some Observations</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673249#M202490</link>
      <description>&lt;P&gt;Perfect! Thank you so much Shmuel works like a charm!&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jul 2020 18:20:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dropping-Some-Observations/m-p/673249#M202490</guid>
      <dc:creator>JeffreyLowe</dc:creator>
      <dc:date>2020-07-29T18:20:56Z</dc:date>
    </item>
  </channel>
</rss>

