<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Code Bottom 10%, But Not Including Blank Cells! in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799170#M314220</link>
    <description>&lt;P&gt;Agreeing with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;See also these 3 blog posts (especially the first one):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Selecting the top n% and bottom n% of observations from a data set&lt;BR /&gt;By Kathryn McLawhorn on SAS Users July 21, 2017&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/sgf/2017/07/21/selecting-the-top-n-and-bottom-n-of-observations-from-a-data-set/" target="_blank"&gt;https://blogs.sas.com/content/sgf/2017/07/21/selecting-the-top-n-and-bottom-n-of-observations-from-a-data-set/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4 ways to find the k smallest and largest data values in SAS&lt;BR /&gt;By Rick Wicklin on The DO Loop January 26, 2022&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2022/01/26/k-smallest-largest-data.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2022/01/26/k-smallest-largest-data.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An easy way to make a "Top 10" table and bar chart in SAS&lt;BR /&gt;By Rick Wicklin on The DO Loop June 4, 2018&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2018/06/04/top-10-table-bar-chart.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2018/06/04/top-10-table-bar-chart.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
    <pubDate>Mon, 28 Feb 2022 16:36:35 GMT</pubDate>
    <dc:creator>sbxkoenk</dc:creator>
    <dc:date>2022-02-28T16:36:35Z</dc:date>
    <item>
      <title>Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799100#M314189</link>
      <description>&lt;P&gt;Terrific code from&amp;nbsp;&lt;SPAN&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;, to create 'dummy variables':&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Bottom 10% of a certain variable, code dummy variable as 1.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Otherwise, code the dummy variable as 0.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;But I've encountered a problem....&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;proc summary data=&amp;amp;etf..&amp;amp;etf._combined;
var
i_50401_Z
i_50402_Z
i_50403_Z
i_50404_Z
i_50405_Z
i_50408_Z
;

output out=&amp;amp;etf..&amp;amp;etf._combined_temp (drop= _:) p10= /autoname;
run;

Proc sql;
create table &amp;amp;etf..&amp;amp;etf._combined_2 as
select a.*,
(a.i_50401_Z &amp;lt;= b.i_50401_Z_P10) as i_50401_Z_bottom10pct ,
(a.i_50402_Z &amp;lt;= b.i_50402_Z_P10) as i_50402_Z_bottom10pct ,
(a.i_50403_Z &amp;lt;= b.i_50403_Z_P10) as i_50403_Z_bottom10pct ,
(a.i_50404_Z &amp;lt;= b.i_50404_Z_P10) as i_50404_Z_bottom10pct ,
(a.i_50405_Z &amp;lt;= b.i_50405_Z_P10) as i_50405_Z_bottom10pct ,
(a.i_50408_Z &amp;lt;= b.i_50408_Z_P10) as i_50408_Z_bottom10pct

from &amp;amp;etf..&amp;amp;etf._combined as a,  &amp;amp;etf..&amp;amp;etf._combined_temp as b;

quit;
&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The problem is, there is some missing data.&amp;nbsp; Blank cells.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;And the above code is dummy coding these blank cells as 1.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;So, I have to somehow tell the code above to only code for the lowest 10% of actual numerical values.&amp;nbsp; &lt;STRONG&gt;Ignore the blanks.&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Any help greatly appreciated.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Nicholas Kormanik&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 10:20:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799100#M314189</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2022-02-28T10:20:33Z</dc:date>
    </item>
    <item>
      <title>Re: Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799104#M314193</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;(a.i_50401_Z &amp;lt;= b.i_50401_Z_P10) and not missing(a.i_50401_Z) as i_50401_Z_bottom10pct&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;or maybe even better&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;case when missing(a.i_50401_Z) then . else a.i_50401_Z &amp;lt;= b.i_50401_Z_P10 end as i_50401_Z_bottom10pct&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;IMHO, this whole idea of finding the bottom 10pct would be better done in PROC RANK and arrays, rather than by tediously coding SQL conditions for each variable. In addition, PROC RANK has several methods of handling ties.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 11:20:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799104#M314193</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-02-28T11:20:14Z</dc:date>
    </item>
    <item>
      <title>Re: Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799170#M314220</link>
      <description>&lt;P&gt;Agreeing with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;See also these 3 blog posts (especially the first one):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Selecting the top n% and bottom n% of observations from a data set&lt;BR /&gt;By Kathryn McLawhorn on SAS Users July 21, 2017&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/sgf/2017/07/21/selecting-the-top-n-and-bottom-n-of-observations-from-a-data-set/" target="_blank"&gt;https://blogs.sas.com/content/sgf/2017/07/21/selecting-the-top-n-and-bottom-n-of-observations-from-a-data-set/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4 ways to find the k smallest and largest data values in SAS&lt;BR /&gt;By Rick Wicklin on The DO Loop January 26, 2022&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2022/01/26/k-smallest-largest-data.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2022/01/26/k-smallest-largest-data.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An easy way to make a "Top 10" table and bar chart in SAS&lt;BR /&gt;By Rick Wicklin on The DO Loop June 4, 2018&lt;BR /&gt;&lt;A href="https://blogs.sas.com/content/iml/2018/06/04/top-10-table-bar-chart.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2018/06/04/top-10-table-bar-chart.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 16:36:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799170#M314220</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-02-28T16:36:35Z</dc:date>
    </item>
    <item>
      <title>Re: Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799250#M314255</link>
      <description>&lt;P&gt;How about&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;select a.*,
( . &amp;lt; a.i_50401_Z &amp;lt;= b.i_50401_Z_P10) as i_50401_Z_bottom10pct ,
/* ...and so on for the remaining variables */&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 06:02:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799250#M314255</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2022-03-01T06:02:50Z</dc:date>
    </item>
    <item>
      <title>Re: Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799253#M314257</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/60547"&gt;@sbxkoenk&lt;/a&gt;&amp;nbsp;&lt;SPAN&gt;&lt;A href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884" target="_blank"&gt;@ballardw&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Both PaigeMiller's and ChrisNZ's solutions work.&amp;nbsp; Good job, guys.&amp;nbsp; They work in the sense that the blank cells are no longer coded 1, as they were in the original code.&amp;nbsp; They are coded 0.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That, though, does raise another matter, which I will overlook for now: the blank cells are being coded 0.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If logistic regression is modeling the 1 values, no worries, I suppose.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If logistic regression, or another method, is focusing on the 0 values, then all the blank cells coded as 0 may cause problems.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In fairness to ballardw, his code (found via a Google search) was for the &lt;STRONG&gt;highest&lt;/STRONG&gt; percent of a variable, not the lowest -- the highest percent coded as 1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've looked at using the Proc Rank approach and feel ballardw's coding is much easier to use by a long shot.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 07:04:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799253#M314257</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2022-03-01T07:04:42Z</dc:date>
    </item>
    <item>
      <title>Re: Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799263#M314266</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/22691"&gt;@NKormanik&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;That, though, does raise another matter, but for now will overlook it.&amp;nbsp; The matter is that blank cells are being coded 0.&lt;/P&gt;
&lt;P&gt;If Logistical Regression is looking for optimizing 1, no worries, I suppose.&lt;/P&gt;
&lt;P&gt;If LR, or other, is focusing on 0 values, then all the blank cells coded as 0 may cause problems.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Regardless of what your target event of interest is (1 versus 0), the fact that blanks are coded as 0 does matter for your final model, of course!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Cheers,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 09:12:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799263#M314266</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-03-01T09:12:14Z</dc:date>
    </item>
    <item>
      <title>Re: Code Bottom 10%, But Not Including Blank Cells!</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799277#M314277</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/22691"&gt;@NKormanik&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/60547"&gt;@sbxkoenk&lt;/a&gt;&amp;nbsp;&lt;SPAN&gt;&lt;A href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884" target="_blank" rel="noopener"&gt;@ballardw&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Both PaigeMiller and ChrisNZ solutions work.&amp;nbsp; Good job guys.&amp;nbsp; They work in the sense that the blank cells are no longer coded 1, as they were in the original code configuration.&amp;nbsp; They are coded 0.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That, though, does raise another matter, but for now will overlook it.&amp;nbsp; The matter is that blank cells are being coded 0.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Incorrect. I gave you two different ways to code your SQL, the second of which produces a missing value when the variable is missing.&lt;/P&gt;
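&lt;P&gt;To make the difference concrete, here is a minimal sketch on made-up data. The data set demo, the variable x and the cutoff p10 = 2 are hypothetical stand-ins, not taken from the real tables. It relies on the fact that SAS treats a numeric missing value as smaller than any number, which is why the unguarded comparison flags blanks as 1.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy data: one missing value and a made-up cutoff */
data demo;
  input x;
  p10 = 2;
  datalines;
.
1
2
5
;
run;

proc sql;
  select x,
         /* unguarded: a missing x compares as lower than p10 */
         (x &amp;lt;= p10)                          as naive,
         /* first suggestion: a missing x is coded 0 */
         ((x &amp;lt;= p10) and not missing(x))     as flag_zero,
         /* second suggestion: a missing x stays missing */
         case when missing(x) then .
              else (x &amp;lt;= p10) end            as flag_missing
  from demo;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Run as written, the row with the missing x should show naive = 1, flag_zero = 0 and flag_missing = . (missing), while the other three rows get the same 0 or 1 in all three columns.&lt;/P&gt;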
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;I've looked at using the Proc Rank approach and feel ballardw's coding is much easier to use by a long shot.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I certainly disagree. By a long shot.&lt;/P&gt;
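&lt;P&gt;For comparison, a minimal sketch of the PROC RANK and arrays approach, under stated assumptions: have and want are hypothetical data set names, r1 to r6 are hypothetical rank variables, and TIES=LOW is an arbitrary pick among the available tie-handling methods. GROUPS=10 assigns the nonmissing values to groups 0 through 9, so group 0 is roughly the bottom 10%, although its boundary may differ slightly from the P10 cutoff computed by PROC SUMMARY.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Decile ranks: 0 = lowest tenth; missing values keep a missing rank */
proc rank data=have out=ranked groups=10 ties=low;
  var   i_50401_Z i_50402_Z i_50403_Z i_50404_Z i_50405_Z i_50408_Z;
  ranks r1 r2 r3 r4 r5 r6;
run;

data want;
  set ranked;
  array r{6}    r1-r6;
  array flag{6} i_50401_Z_bottom10pct i_50402_Z_bottom10pct
                i_50403_Z_bottom10pct i_50404_Z_bottom10pct
                i_50405_Z_bottom10pct i_50408_Z_bottom10pct;
  do k = 1 to 6;
    /* a missing rank gives a missing flag rather than 0 or 1 */
    if missing(r{k}) then flag{k} = .;
    else flag{k} = (r{k} = 0);
  end;
  drop k r1-r6;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Adding another variable then means one more name in the VAR, RANKS and ARRAY lists (and a larger array dimension), rather than another hand-written SQL condition.&lt;/P&gt;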
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If Logistical Regression is looking for optimizing 1, no worries, I suppose.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If LR, or other, is focusing on 0 values, then all the blank cells coded as 0 may cause problems.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Are you planning to use these new variables, which are 1 when an observation is in the bottom 10%, as the response variable in logistic regression? Wouldn't it be better to treat the continuous variable as continuous and do linear regression, instead of arbitrarily turning these into 0s and 1s and doing logistic regression?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Why not explain the real problem? That is almost always helpful. Focusing on the mechanics of obtaining these variables loses the big picture, and it may be that what you are planning to do with these new variables is not a good idea. This appears to be another example of the &lt;A href="https://xyproblem.info/" target="_self"&gt;XY Problem&lt;/A&gt; in action.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 11:13:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Code-Bottom-10-But-Not-Including-Blank-Cells/m-p/799277#M314277</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-03-01T11:13:38Z</dc:date>
    </item>
  </channel>
</rss>

