<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic HPSPLIT customizing the predicated 1 in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/835847#M330472</link>
    <description>&lt;P&gt;Hey, I am running PROC HPSPLIT and want to customize the classification cutoff so that predicted = 1 is assigned to observations with P_FLAG1 &amp;gt;= 0.7 instead of 0.5. Any idea how I can make this happen? In addition, I want to check my outcomes on the validation set created by PARTITION FRACTION(VALIDATE=0.3 SEED=42);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Alternatively, could you help me create a flag that tells me which observations were used for training and which for validation? The output data doesn't show it.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="artyomkosyan_0-1664461435485.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/75707i04AA1400BEB5EA2C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="artyomkosyan_0-1664461435485.png" alt="artyomkosyan_0-1664461435485.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="artyomkosyan_1-1664461492096.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/75708iC37B20FE5377FFD6/image-size/medium?v=v2&amp;amp;px=400" role="button" title="artyomkosyan_1-1664461492096.png" alt="artyomkosyan_1-1664461492096.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
    <pubDate>Thu, 29 Sep 2022 14:25:05 GMT</pubDate>
    <dc:creator>artyomkosyan</dc:creator>
    <dc:date>2022-09-29T14:25:05Z</dc:date>
    <item>
      <title>HPSPLIT customizing the predicated 1</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/835847#M330472</link>
      <description>&lt;P&gt;Hey, I am running PROC HPSPLIT and want to customize the classification cutoff so that predicted = 1 is assigned to observations with P_FLAG1 &amp;gt;= 0.7 instead of 0.5. Any idea how I can make this happen? In addition, I want to check my outcomes on the validation set created by PARTITION FRACTION(VALIDATE=0.3 SEED=42);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Alternatively, could you help me create a flag that tells me which observations were used for training and which for validation? The output data doesn't show it.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="artyomkosyan_0-1664461435485.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/75707i04AA1400BEB5EA2C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="artyomkosyan_0-1664461435485.png" alt="artyomkosyan_0-1664461435485.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="artyomkosyan_1-1664461492096.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/75708iC37B20FE5377FFD6/image-size/medium?v=v2&amp;amp;px=400" role="button" title="artyomkosyan_1-1664461492096.png" alt="artyomkosyan_1-1664461492096.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2022 14:25:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/835847#M330472</guid>
      <dc:creator>artyomkosyan</dc:creator>
      <dc:date>2022-09-29T14:25:05Z</dc:date>
    </item>
    <item>
      <title>Re: HPSPLIT customizing the predicated 1</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/835886#M330492</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/435176"&gt;@artyomkosyan&lt;/a&gt;&amp;nbsp;and welcome to the SAS Support Communities!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let me first say that I have &lt;EM&gt;very&lt;/EM&gt; little experience with PROC HPSPLIT.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does the last section of&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/statug/15.2/statug_hpsplit_examples01.htm" target="_blank" rel="noopener"&gt;Example 67.1 Building a Classification Tree for a Binary Outcome&lt;/A&gt; (scroll down to the bottom of the page) answer your first question? In that example the probability cutoff is changed from 0.5 to 0.1.&lt;/P&gt;
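&lt;P&gt;A minimal sketch of applying a custom cutoff after scoring, assuming the OUTPUT statement wrote a dataset named HPSOUT that contains the target FLAG and the predicted probability P_FLAG1 (HPSOUT_CUT and PRED_FLAG are hypothetical names, and this is not necessarily how the documentation example does it):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumed names: hpsout (OUT= dataset), FLAG (target), P_FLAG1 (predicted probability of FLAG=1) */
data hpsout_cut;
   set hpsout;
   /* The comparison returns 1 when P_FLAG1 is at least 0.7, otherwise 0 (also 0 when P_FLAG1 is missing) */
   pred_flag = (P_FLAG1 ge 0.7);
run;

/* Cross-tabulate actual versus predicted class at the 0.7 cutoff */
proc freq data=hpsout_cut;
   tables FLAG*pred_flag / nocol nopercent;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because the cutoff is applied to the stored probabilities, trying a different threshold only requires rerunning the DATA step, not the tree.&lt;/P&gt;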
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As to your second question, I was surprised to see that, indeed, the OUT= output dataset (or any other output dataset) does not contain the information about the training-validation split. But a naïve guess was successful with the HMEQ sample dataset used in&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/statug/15.2/statug_hpsplit_examples04.htm" target="_blank" rel="noopener"&gt;Example 67.4 Creating a Binary Classification Tree with Validation Data&lt;/A&gt;:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I created an input dataset HMEQ2 containing the split information&lt;/P&gt;
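&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data hmeq2;
call streaminit(123); /* seed the RAND function so the split is reproducible */
set hmeq;
_partind_=rand('bern',0.3); /* 1 = validation (about 30%), 0 = training */
run;&lt;/CODE&gt;&lt;/PRE&gt;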
&lt;P&gt;and compared the results (both printed output and all three output files) of the original code from Example 67.4 (with an OUTPUT statement added)&lt;/P&gt;
&lt;PRE&gt;proc hpsplit data=hmeq maxdepth=5;
   class Bad Delinq Derog Job nInq Reason;
   model Bad(event='1') = Delinq Derog Job nInq Reason CLAge CLNo
               DebtInc Loan MortDue Value YoJ;
   prune costcomplexity;
   partition fraction(validate=0.3 seed=123);
   code file='hpsplexc.sas';
   rules file='rules.txt';
   &lt;STRONG&gt;output out=hpsout;&lt;/STRONG&gt;
run;&lt;/PRE&gt;
&lt;P&gt;to those obtained with the new input dataset and a correspondingly adapted PARTITION statement&lt;/P&gt;
&lt;PRE&gt;proc hpsplit data=&lt;FONT size="4" color="#3366FF"&gt;&lt;STRONG&gt;hmeq2&lt;/STRONG&gt;&lt;/FONT&gt; maxdepth=5;
   class Bad Delinq Derog Job nInq Reason;
   model Bad(event='1') = Delinq Derog Job nInq Reason CLAge CLNo
               DebtInc Loan MortDue Value YoJ;
   prune costcomplexity;
   partition &lt;FONT size="4" color="#3366FF"&gt;&lt;STRONG&gt;rolevar=_partind_(TRAIN='0' VALIDATE='1')&lt;/STRONG&gt;&lt;/FONT&gt;;
   code file='hpsplexc&lt;FONT size="4" color="#3366FF"&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;&lt;/FONT&gt;.sas';
   rules file='rules&lt;FONT size="4" color="#3366FF"&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;&lt;/FONT&gt;.txt';
   output out=hpsout&lt;FONT size="4" color="#3366FF"&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;&lt;/FONT&gt;;
run;

proc compare data=hpsout c=hpsout2;
run;&lt;/PRE&gt;
&lt;P&gt;The results were exactly identical in my single-machine environment (using 4 threads by default) using SAS/STAT 14.3. (I verified this also with a few other split probabilities and seed values.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But to be 100% sure about which observations were used for training and which for validation you can, of course, use a dataset (like HMEQ2) containing a variable such as _PARTIND_ above in the first place.&lt;/P&gt;
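&lt;P&gt;One way to carry that variable through to the scored data is sketched below. It rests on untested assumptions: that the ID statement of PROC HPSPLIT may list the partition variable and copies it to the OUT= dataset, that the target Bad is present in that dataset, and that the predicted probability of Bad=1 is named P_Bad1. HPSOUT3, HPSOUT3_CUT and PRED_BAD are hypothetical names.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc hpsplit data=hmeq2 maxdepth=5;
   class Bad Delinq Derog Job nInq Reason;
   model Bad(event='1') = Delinq Derog Job nInq Reason CLAge CLNo
               DebtInc Loan MortDue Value YoJ;
   prune costcomplexity;
   partition rolevar=_partind_(TRAIN='0' VALIDATE='1');
   id _partind_;   /* assumption: transfers the split flag to the OUT= dataset */
   output out=hpsout3;
run;

/* Apply the 0.7 cutoff to the assumed probability variable P_Bad1 */
data hpsout3_cut;
   set hpsout3;
   pred_bad = (P_Bad1 ge 0.7);
run;

/* One actual-by-predicted table per partition (0 = training, 1 = validation) */
proc freq data=hpsout3_cut;
   tables _partind_*Bad*pred_bad / nocol nopercent;
run;&lt;/CODE&gt;&lt;/PRE&gt;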
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2022 16:22:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/835886#M330492</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2022-09-29T16:22:54Z</dc:date>
    </item>
    <item>
      <title>Re: HPSPLIT customizing the predicated 1</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/836552#M330762</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;, this is helpful.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Oct 2022 15:54:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HPSPLIT-customizing-the-predicated-1/m-p/836552#M330762</guid>
      <dc:creator>artyomkosyan</dc:creator>
      <dc:date>2022-10-03T15:54:41Z</dc:date>
    </item>
  </channel>
</rss>

