<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Comparison CDF curves in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454901#M23757</link>
    <description>&lt;P&gt;For each value of &lt;EM&gt;x&lt;/EM&gt; (ActualReturn), the variable cdf_y in the test_xy dataset holds the proportion of &lt;EM&gt;y&lt;/EM&gt; values (BootstrapReturn) that are less than or equal to &lt;EM&gt;x&lt;/EM&gt;.&lt;/P&gt;</description>
    <pubDate>Tue, 17 Apr 2018 19:15:04 GMT</pubDate>
    <dc:creator>PGStats</dc:creator>
    <dc:date>2018-04-17T19:15:04Z</dc:date>
    <item>
      <title>Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454536#M23743</link>
      <description>&lt;P&gt;Good evening everyone,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two variables (BootstrapReturn and ActualReturn) whose CDFs I have plotted on one graph so that I can compare them, using the following SAS code:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ods graphics on;&lt;BR /&gt;data Comparison;&lt;BR /&gt;set WORK.FAMA3CDF;&lt;BR /&gt;length varName $20;&lt;BR /&gt;ObsNum = _N_;&lt;BR /&gt;varName = "BootstrapReturn"; Value = BootstrapReturn; output; /* put VAR1 on this line */&lt;BR /&gt;varName = "ActualReturn"; Value = ActualReturn; output; /* put VAR2 on this line */&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;proc univariate data=Comparison;&lt;BR /&gt;class varName;&lt;BR /&gt;var Value;&lt;BR /&gt;cdfplot Value/ overlay;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;That being said, for each percentile I would like to know the proportion of bootstrap return observations that are higher/lower than the actual return. For example, I would like to know that, at the 95th percentile, 87% of the bootstrap return observations are lower than the actual return.&lt;BR /&gt;&lt;BR /&gt;Any suggestions? I really have no idea how I could do that...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks a lot in advance for your help!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Apr 2018 18:35:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454536#M23743</guid>
      <dc:creator>Max05</dc:creator>
      <dc:date>2018-04-16T18:35:48Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454551#M23744</link>
      <description>&lt;P&gt;An interesting problem. I must admit, I don't have a clue about how to do it either, but SAS gives us a great set of meccano pieces to stitch something together.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First question: are you expecting to see something graphical, or something tabular? In either case, could you sketch out what your desired result would look like? That will give us a target to aim at.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Tom&lt;/P&gt;</description>
      <pubDate>Mon, 16 Apr 2018 19:50:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454551#M23744</guid>
      <dc:creator>TomKari</dc:creator>
      <dc:date>2018-04-16T19:50:18Z</dc:date>
    </item>
    <item>
      <title>Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454554#M23745</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;Thanks a lot for your answer. I expect a table with the following structure:&lt;BR /&gt;&lt;BR /&gt;Percentile ActualReturn BootstrapReturn % &amp;gt; ActualReturn&lt;BR /&gt;10 -4.25 -3.18 8.12%&lt;BR /&gt;20 -3.5 -2.8 23%&lt;BR /&gt;30 -2 -1.9 61%&lt;BR /&gt;40&lt;BR /&gt;...&lt;BR /&gt;</description>
      <pubDate>Mon, 16 Apr 2018 19:56:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454554#M23745</guid>
      <dc:creator>Max05</dc:creator>
      <dc:date>2018-04-16T19:56:37Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454604#M23746</link>
      <description>&lt;P&gt;You could do that like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
call streaminit(896868);
do i = 1 to 200;
    x = rand("normal");
    y = rand("normal");
    output;
    end;
run;

proc sort data=test(keep=x) out=test_x(rename=x=u); by x; run;
proc sort data=test(keep=y) out=test_y(rename=y=u); by y; run;
proc rank data=test_y out=test_y fraction; var u; ranks ru; run;

data test_xy;
merge test_x(in=inx) test_y(in=iny);
by u;
retain cdf_y 0;
if iny then cdf_y = ru;
if inx;
drop ru;
rename u=x;
run;

proc sgplot data=test_xy;
step x=x y=cdf_y;
run;
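
/* Optional cross-check: an editorial sketch, not part of the original answer.
   It recomputes, by brute force, the proportion of y values at or below each x,
   so the result can be compared with cdf_y in test_xy.
   The names check_xy and cdf_y_check are illustrative assumptions.
   The Cartesian join is fine for 200 rows but grows quadratically with the data. */
proc sql;
create table check_xy as
select a.x, mean(b.y le a.x) as cdf_y_check
from test(keep=x) as a, test(keep=y) as b
group by a.x
order by a.x;
quit;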
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 17 Apr 2018 00:13:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454604#M23746</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-04-17T00:13:50Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454615#M23747</link>
      <description>&lt;P&gt;This will give you the side-by-side percentiles. The percentage column needs some more thought.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Tom&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc univariate data=FAMA3CDF;
	var ActualReturn BootstrapReturn;
	output out=Pctls pctlpre=AR BR pctlpts=10 to 100 by 10;
run;

proc transpose data=Pctls out=TrnsPctls;
run;

data TrnsPctls;
	set TrnsPctls(rename=(Col1=PctlValue));
	Category = substr(_name_, 1, 2);
	Percentile = input(subpad(_name_, 3), best3.);
	drop _name_ _label_;
run;

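/* PROC TRANSPOSE with a BY statement expects its input sorted by Percentile,
   so sort first to be safe */
proc sort data=TrnsPctls;
	by Percentile;
run;

proc transpose data=TrnsPctls out=ReTrnsPctls(drop=_name_);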
	var PctlValue;
	by Percentile;
	ID Category;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 17 Apr 2018 02:40:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454615#M23747</guid>
      <dc:creator>TomKari</dc:creator>
      <dc:date>2018-04-17T02:40:50Z</dc:date>
    </item>
    <item>
      <title>Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454628#M23748</link>
      <description>Yes that's it! Let me know if you have a solution for the last part &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;Thanks a lot&lt;BR /&gt;</description>
      <pubDate>Tue, 17 Apr 2018 05:54:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454628#M23748</guid>
      <dc:creator>Max05</dc:creator>
      <dc:date>2018-04-17T05:54:37Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454629#M23749</link>
      <description>&lt;P&gt;Hi PG,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I applied it to my data and obtained:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=fama3cdf(keep=ActualReturn) out=test_x(rename=ActualReturn=u); by ActualReturn; run;&lt;BR /&gt;proc sort data=fama3cdf(keep=BootstrapReturn) out=test_y(rename=BootstrapReturn=u); by BootstrapReturn; run;&lt;BR /&gt;proc rank data=test_y out=test_y fraction; var u; ranks ru; run;&lt;/P&gt;&lt;P&gt;data test_xy;&lt;BR /&gt;merge test_x(in=inx) test_y(in=iny);&lt;BR /&gt;by u;&lt;BR /&gt;retain cdf_y;&lt;BR /&gt;if iny then cdf_y = ru;&lt;BR /&gt;if inx;&lt;BR /&gt;drop ru;&lt;BR /&gt;rename u=x;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;proc sgplot data=test_xy;&lt;BR /&gt;series x=x y=cdf_y;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I guess I can skip the first part since I already have my dataset, but I don't understand the result I obtain. What is the meaning of the third table?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks a lot for your advice.&lt;/P&gt;</description>
      <pubDate>Tue, 17 Apr 2018 06:01:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454629#M23749</guid>
      <dc:creator>Max05</dc:creator>
      <dc:date>2018-04-17T06:01:57Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454891#M23756</link>
      <description>&lt;P&gt;Here's what I hope is the whole thing. I don't have the time to verify the results using my test data in detail, so I suggest that you evaluate it very carefully. I've created different temporary datasets all the way through so that you can see the partial results as they are created.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let me know!&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;Tom&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc univariate data=FAMA3CDF;
	var ActualReturn BootstrapReturn;
	output out=Pctls pctlpre=AR BR pctlpts=10 to 100 by 10;
run;

proc transpose data=Pctls out=TrnsPctls;
run;

data TrnsPctls;
	set TrnsPctls(rename=(Col1=PctlValue));
	Category = substr(_name_, 1, 2);
	Percentile = input(subpad(_name_, 3), best3.);
	drop _name_ _label_;
run;

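/* PROC TRANSPOSE with a BY statement expects its input sorted by Percentile,
   so sort first to be safe */
proc sort data=TrnsPctls;
	by Percentile;
run;

proc transpose data=TrnsPctls out=ReTrnsPctls(drop=_name_);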
	var PctlValue;
	by Percentile;
	ID Category;
run;

proc sql noprint;
	select count(*) into: RecordCount from work.FAMA3CDF;
quit;

proc sort data=WORK.FAMA3CDF out=FAMA3CDFSort;
	by BootstrapReturn;
run;

proc sql noprint;
	create table AllVars as
		select * from FAMA3CDFSort cross join Pctls;
quit;

%macro CalcStats;

	data Stats;
		set AllVars end=LastRec;

		%do i = 10 %to 100 %by 10; /* Set up an accumulator variable for each percentile */
			retain ARUnder&amp;amp;i 0;

			if BootstrapReturn &amp;lt; AR&amp;amp;i then
				ARUnder&amp;amp;i = ARUnder&amp;amp;i + 1;
		%end;

		if LastRec
			then do;

			%do i = 10 %to 100 %by 10; /* Convert each accumulated count to a proportion of all records */
				Percentile = &amp;amp;i;
				ARPctg&amp;amp;i = ARUnder&amp;amp;i / &amp;amp;RecordCount;
				keep ARPctg&amp;amp;i;
			%end;

			output;
		end;
	run;

%mend;

%CalcStats;

proc transpose data=Stats out=TrnsStats;
run;

data Percentages;
	set TrnsStats(rename=(Col1=Percentage));
	Percentile = input(subpad(_name_,7), best3.);
	drop _name_;
run;

proc sql noprint;
	create table Want as
		select r.Percentile, r.AR, r.BR, p.Percentage
			from ReTrnsPctls r inner join Percentages p
				on r.Percentile = p.Percentile;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 17 Apr 2018 18:28:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454891#M23756</guid>
      <dc:creator>TomKari</dc:creator>
      <dc:date>2018-04-17T18:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454901#M23757</link>
      <description>&lt;P&gt;For each value of &lt;EM&gt;x&lt;/EM&gt; (ActualReturn), the variable cdf_y in the test_xy dataset holds the proportion of &lt;EM&gt;y&lt;/EM&gt; values (BootstrapReturn) that are less than or equal to &lt;EM&gt;x&lt;/EM&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 17 Apr 2018 19:15:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/454901#M23757</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-04-17T19:15:04Z</dc:date>
    </item>
    <item>
      <title>Re: Comparison CDF curves</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/455615#M23763</link>
      <description>&lt;P&gt;I think PGStats's program gives you the proportion for each value of X. For large data sets, you might want to compute the proportions only at the percentiles of X. For example, the following SAS/IML program&amp;nbsp;uses PGStats's idea but requires only about 100 computations (one per percentile), regardless of the size of X:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
call streaminit(896868);
do i = 1 to 200;
    x = rand("normal", 0);
    y = rand("normal", 0.1);
    output;
    end;
run;

proc iml;
use test;  read all var {x y};  close;
call sort(x);
call sort(y);

/* for each percentile of x, what is the proportion of observations 
   in y that are less than or equal to that percentile? */
pctls = do(0.0, 1, 0.01);
call qntl(q, x, pctls);   /* convention: 0th pctl=min; 100th pctl=max */
prop = j(nrow(q), 1);
do i = 1 to nrow(q);
   prop[i] = mean(y &amp;lt;= q[i]);   /* or use &amp;gt;= */
end;

title "Proportion of Y that is less than or equal to quantile of X";
call scatter(q, prop) label={"Quantiles of X" "Proportion of Y That Is Less Than"};
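
/* Editorial sketch, not part of the original post: print the results at the
   deciles only, roughly in the table layout requested earlier in the thread.
   The names dec, qx, qy, and propLE are illustrative assumptions. */
dec = T(1:10) / 10;                 /* 0.1, 0.2, ..., 1.0 as a column vector */
call qntl(qx, x, dec);              /* deciles of x */
call qntl(qy, y, dec);              /* deciles of y */
propLE = j(nrow(qx), 1, .);
do i = 1 to nrow(qx);
   propLE[i] = mean(y &amp;lt;= qx[i]); /* proportion of y at or below the x decile */
end;
Percentile = 100 * dec;
print Percentile qx qy propLE;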
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 19 Apr 2018 14:19:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Comparison-CDF-curves/m-p/455615#M23763</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-04-19T14:19:47Z</dc:date>
    </item>
  </channel>
</rss>

