<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: in-sample out-sample with equal failure proportion in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751298#M236486</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;This code didn't work.&lt;/P&gt;
&lt;P&gt;Why did you write sampsize=25?&lt;/P&gt;
&lt;P&gt;My source data set has 40,000 observations.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;1) It was dummy code for dummy data. I explained that it would select 25 records from each stratum.&lt;/P&gt;
&lt;P&gt;2) When I posted that, you had not said anything about the size of your data set, so I picked a small number hoping that it would at least run.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I do not know what you are attempting. I think that you need to consider generating a small example by hand of what you expect from a given example data set.&lt;/P&gt;</description>
    <pubDate>Wed, 30 Jun 2021 16:19:37 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2021-06-30T16:19:37Z</dc:date>
    <item>
      <title>in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750771#M236206</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;
&lt;P&gt;Rawtbl has one row for each customer, with explanatory variables and an indicator of failure in the following 12 months.&lt;/P&gt;
&lt;P&gt;I am using the code below to divide the population into in-sample and out-sample.&lt;/P&gt;
&lt;P&gt;This code divides the population at random.&lt;/P&gt;
&lt;P&gt;With this method, the proportion of failures in the following period is similar in the in-sample and the out-sample.&lt;/P&gt;
&lt;P&gt;However, I want to add one more condition to this division:&lt;/P&gt;
&lt;P&gt;I want the proportion of customers with failure in the following period to be &lt;STRONG&gt;&lt;U&gt;equal&lt;/U&gt;&lt;/STRONG&gt; in the in-sample and the out-sample.&lt;/P&gt;
&lt;P&gt;How can I do this?&lt;/P&gt;
&lt;P&gt;Please note that the reason I want this is that the Gini coefficient has significantly different values in the two samples.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Wanted;
   set Rawtbl;
   random = ranuni(1234);  /* uniform(0,1) draw with a fixed seed */
   if random &amp;gt;= 0.3 then outsample = 0;  /* about 70%: build the regression model here */
   else outsample = 1;                   /* about 30%: check the regression model here */
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 28 Jun 2021 10:09:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750771#M236206</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2021-06-28T10:09:19Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750773#M236207</link>
      <description>&lt;P&gt;The method is:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Determine the proportion of failures in the full data set, let's assume that it is k% (use PROC FREQ)&lt;/LI&gt;
&lt;LI&gt;Assign random numbers to the full data set, and then sort by failure status and, within each status, by the random number&lt;/LI&gt;
&lt;LI&gt;Take the sorted full data set and assign the first 70% of the failures to the insample and the remaining 30% to the outsample; likewise, assign the first 70% of the non-failures to the insample and the remaining 30% to the outsample. Both samples then have a failure proportion of k%, up to rounding&lt;/LI&gt;
&lt;/OL&gt;
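&lt;P&gt;The steps above can be sketched as follows. This is an illustrative sketch rather than code from the original reply: it assumes the data set Rawtbl and the 0/1 flag Ind_Touch_Failue posted elsewhere in this thread, and the names Shuffled, Counts, n_stratum and seq are made up for the example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Step 1: failure proportion (k%) in the full data set */
proc freq data=Rawtbl;
   tables Ind_Touch_Failue;
run;

/* Step 2: attach a random number, then sort by failure flag and random number */
data Shuffled;
   set Rawtbl;
   random = ranuni(1234);
run;

proc sort data=Shuffled;
   by Ind_Touch_Failue random;
run;

/* Number of rows in each stratum, needed to locate the 70% cut-off */
proc sql;
   create table Counts as
   select Ind_Touch_Failue, count(*) as n_stratum
   from Shuffled
   group by Ind_Touch_Failue
   order by Ind_Touch_Failue;
quit;

/* Step 3: first 70% of each stratum goes to in-sample, the rest to out-sample */
data Wanted;
   merge Shuffled Counts;
   by Ind_Touch_Failue;
   if first.Ind_Touch_Failue then seq = 0;  /* restart the counter per stratum */
   seq + 1;                                 /* sum statement: seq is retained */
   outsample = (seq &amp;gt; 0.7 * n_stratum);     /* 0 = in-sample, 1 = out-sample */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Running PROC FREQ on outsample*Ind_Touch_Failue afterwards is a quick way to check that the failure proportions match.&lt;/P&gt;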
&lt;P&gt;I think you can achieve the same using PROC SURVEYSELECT, but I have never done it that way.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jun 2021 11:14:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750773#M236207</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-06-28T11:14:34Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750779#M236210</link>
      <description>What does your "failure in following period" variable look like? &lt;BR /&gt;Most likely PROC SURVEYSELECT would work, with the (so far unnamed) variable that holds the failure-in-following-period information as a STRATA variable and a SAMPRATE of 50, or maybe a specific SAMPSIZE. But we need to know what sort of variable holds the information. &lt;BR /&gt;Or possibly variables; you may need to create a single variable that SURVEYSELECT can use. &lt;BR /&gt;&lt;BR /&gt;Details matter.</description>
      <pubDate>Mon, 28 Jun 2021 12:04:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750779#M236210</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-06-28T12:04:54Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750807#M236226</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data Rawtbl;
Input CustomerID X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 Ind_Touch_Failue;
Cards;
1 10 20 30 40 50 60 70 80 90 100 0
2 15 20 25 30 35 40 45 50 55 110 1
3 25 30 25 30 35 40 35 50 50 130 0
4 25 30 25 30 35 20 35 50 25 100 0
5 10 30 45 30 35 40 45 50 55 150 1
and so on
;
Run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 28 Jun 2021 13:38:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750807#M236226</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2021-06-28T13:38:42Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750808#M236227</link>
      <description>&lt;P&gt;Could you please show the code?&lt;/P&gt;
&lt;P&gt;It is much easier to understand with real code. Thanks.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jun 2021 13:39:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750808#M236227</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2021-06-28T13:39:28Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750822#M236234</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Could you please show the code?&lt;/P&gt;
&lt;P&gt;It is much easier to understand with real code. Thanks.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Why don't you take a try at it? I know you can do PROC FREQ, I know you can do PROC SORT, I know you can create random numbers. Show us what you have if it isn't working.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jun 2021 14:28:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750822#M236234</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-06-28T14:28:30Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750823#M236235</link>
      <description>&lt;P&gt;This would select 25 records from each stratum of Ind_Touch_Failue, if there are at least 25 of each, which means that the two levels in the selected set would have "equal proportion", i.e. 50% of each.&lt;/P&gt;
&lt;PRE&gt;/* SURVEYSELECT expects the input sorted by the STRATA variable */
proc sort data=rawtbl;
   by Ind_Touch_Failue;
run;

/* draw 25 records from each stratum */
proc surveyselect data=rawtbl out=want
   sampsize=25;
   strata Ind_Touch_Failue;
run;&lt;/PRE&gt;
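&lt;P&gt;For a proportional split instead of a fixed count per stratum, the same procedure can take a sampling rate. The following is an illustrative sketch, not part of the original reply: the 0.7 rate, the seed 1234, and the data set names rawtbl_sorted and split are assumptions chosen for the example. SAMPRATE= draws that fraction from every stratum, and OUTALL keeps all input rows and adds a 0/1 variable named Selected.&lt;/P&gt;
&lt;PRE&gt;/* sort by the stratum variable first */
proc sort data=rawtbl out=rawtbl_sorted;
   by Ind_Touch_Failue;
run;

/* take 70% of each stratum; OUTALL flags every row with Selected = 1 or 0 */
proc surveyselect data=rawtbl_sorted out=split
   method=srs samprate=0.7 seed=1234 outall;
   strata Ind_Touch_Failue;
run;

/* Selected=1 is the in-sample, Selected=0 the out-sample; compare failure rates */
proc freq data=split;
   tables Selected*Ind_Touch_Failue / nocol nopercent;
run;&lt;/PRE&gt;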
&lt;P&gt;If you want a repeatable selection then you want to set a SEED= option otherwise you'll likely get a different set if you rerun the code.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jun 2021 14:28:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750823#M236235</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-06-28T14:28:40Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750893#M236271</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data Rawtbl;
Input CustomerID X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 Ind_Touch_Failue;
Cards;
1 10 20 30 40 50 60 70 80 90 100 0
2 15 20 25 30 35 40 45 50 55 110 1
3 25 30 25 30 35 40 35 50 50 130 0
4 25 30 25 30 35 20 35 50 25 100 0
5 10 30 45 30 35 40 45 50 55 150 1
and so on
;
Run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;So what about this data indicates "following period".&amp;nbsp; &amp;nbsp;In fact, what indicates current period?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Jun 2021 00:31:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/750893#M236271</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2021-06-29T00:31:21Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751111#M236397</link>
      <description>X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 hold information from the current period (base month), and Ind_Touch_Failue indicates whether the customer "touches" failure in the next 12 months (the following period).</description>
      <pubDate>Wed, 30 Jun 2021 03:56:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751111#M236397</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2021-06-30T03:56:25Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751115#M236400</link>
      <description>I don't understand what you did here.&lt;BR /&gt;The real source data set contains 40,000 rows.&lt;BR /&gt;1,500 of them have a failure (Ind_Touch_Failue=1) and 38,500 have no failure (Ind_Touch_Failue=0).&lt;BR /&gt;I want to divide the rows into 2 populations:&lt;BR /&gt;In-Sample (25% of observations), so 25% of 40,000 is 10,000 rows.&lt;BR /&gt;Out-Sample (75% of observations), so 75% of 40,000 is 30,000 rows.&lt;BR /&gt;The only issue is that I need to add one more criterion to the division into 2 populations.&lt;BR /&gt;I need the proportion of failures in the 2 populations to be equal!&lt;BR /&gt;What is the code to do it, please?&lt;BR /&gt;The code below doesn't take the equal-proportion requirement into account.&lt;BR /&gt;&lt;BR /&gt;data wanted; &lt;BR /&gt;set have;&lt;BR /&gt;randomPop=ranuni(1234);&lt;BR /&gt;if randomPop&amp;gt;=0.3 then outsample=0;&lt;BR /&gt;else outsample=1;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 30 Jun 2021 04:37:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751115#M236400</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2021-06-30T04:37:46Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751117#M236402</link>
      <description>&lt;P&gt;This code didn't work.&lt;/P&gt;
&lt;P&gt;Why did you write sampsize=25?&lt;/P&gt;
&lt;P&gt;My source data set has 40,000 observations.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 04:39:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751117#M236402</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2021-06-30T04:39:22Z</dc:date>
    </item>
    <item>
      <title>Re: in-sample out-sample with equal failure proportion</title>
      <link>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751298#M236486</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;This code didn't work.&lt;/P&gt;
&lt;P&gt;Why did you write sampsize=25?&lt;/P&gt;
&lt;P&gt;My source data set has 40,000 observations.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;1) It was dummy code for dummy data. I explained that it would select 25 records from each stratum.&lt;/P&gt;
&lt;P&gt;2) When I posted that, you had not said anything about the size of your data set, so I picked a small number hoping that it would at least run.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I do not know what you are attempting. I think that you need to consider generating a small example by hand of what you expect from a given example data set.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jun 2021 16:19:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/in-sample-out-sample-with-equal-failure-proprtion/m-p/751298#M236486</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-06-30T16:19:37Z</dc:date>
    </item>
  </channel>
</rss>

