<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Calculating expression of a formula for each resample in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242599#M55819</link>
    <description>&lt;P&gt;I have a datafile called 'original' that contains 4 variables, call them a, b, c, d, with n observations each. I then use proc surveyselect to draw 1000 resamples from the 'original' dataset, each with sample size n_b = n/4 (rounded up). The code is as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
select ceil(count(*)/4) into :record_count
from original;
quit;


%put &amp;amp;record_count;
%let rep = 1000;
proc surveyselect data= original out=bootsample
     seed = 1234 method = urs
	 sampsize=&amp;amp;record_count outhits rep = &amp;amp;rep;
run;
ods listing close;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This produces a datafile named 'bootsample' which contains 1000 samples with sample size n_b of each variable (a, b, c, and d) from the 'original' dataset. Each observation's replication ID is given by the variable "Replicate" (ranging from 1 to 1000).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I need to do is this: take Replicate = 1 (i.e., the first replication sample) and the variable a as an example. I want to calculate the following value of t: (if the picture below doesn't show, please see attachment of the picture titled "formula")&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1404iE1033CAFDFF71747/image-size/original?v=mpbl-1&amp;amp;px=-1" alt="expression.png" title="expression.png" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;where mean(a) is the sample average of the variable a for replication sample 1, std(a) is the sample standard deviation of the variable a for replication sample 1, a_i represents each individual observation of the variable a for replication sample 1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then, I want to repeat the above procedure and calculate the value of t for all 1000 replication samples and all four variables: a, b, c, and d. I want to store the final result in a datafile called "result" that has 4 variables called a_t, b_t, c_t, and d_t (i.e., 4 columns) and 1000 rows, one value of t per variable for each replication sample. So, graphically, a datafile structured like this: (if the picture below doesn't show, please see attachment of the picture titled "result")&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1405i4B6FFD51361A0D8D/image-size/original?v=mpbl-1&amp;amp;px=-1" alt="result.png" title="result.png" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can anyone show me template code that achieves what I described above? I'm thinking maybe proc sql can do the trick, but I'm quite new to SAS and still don't know the syntax very well. Thanks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BR /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/12176i05F9E39202351BC6/image-size/large?v=1.0&amp;amp;px=600" border="0" alt="formula.png" title="formula.png" /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/12177iEADD4643717028A9/image-size/large?v=1.0&amp;amp;px=600" border="0" alt="result.png" title="result.png" /&gt;</description>
    <pubDate>Sun, 10 Jan 2016 06:50:38 GMT</pubDate>
    <dc:creator>TrueTears</dc:creator>
    <dc:date>2016-01-10T06:50:38Z</dc:date>
    <item>
      <title>Calculating expression of a formula for each resample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242599#M55819</link>
      <description>&lt;P&gt;I have a datafile called 'original' that contains 4 variables, call them a, b, c, d, with n observations each. I then use proc surveyselect to draw 1000 resamples from the 'original' dataset, each with sample size n_b = n/4 (rounded up). The code is as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
select ceil(count(*)/4) into :record_count
from original;
quit;


%put &amp;amp;record_count;
%let rep = 1000;
proc surveyselect data= original out=bootsample
     seed = 1234 method = urs
	 sampsize=&amp;amp;record_count outhits rep = &amp;amp;rep;
run;
ods listing close;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This produces a datafile named 'bootsample' which contains 1000 samples with sample size n_b of each variable (a, b, c, and d) from the 'original' dataset. Each observation's replication ID is given by the variable "Replicate" (ranging from 1 to 1000).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I need to do is this: take Replicate = 1 (i.e., the first replication sample) and the variable a as an example. I want to calculate the following value of t: (if the picture below doesn't show, please see attachment of the picture titled "formula")&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1404iE1033CAFDFF71747/image-size/original?v=mpbl-1&amp;amp;px=-1" alt="expression.png" title="expression.png" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;where mean(a) is the sample average of the variable a for replication sample 1, std(a) is the sample standard deviation of the variable a for replication sample 1, a_i represents each individual observation of the variable a for replication sample 1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then, I want to repeat the above procedure and calculate the value of t for all 1000 replication samples and all four variables: a, b, c, and d. I want to store the final result in a datafile called "result" that has 4 variables called a_t, b_t, c_t, and d_t (i.e., 4 columns) and 1000 rows, one value of t per variable for each replication sample. So, graphically, a datafile structured like this: (if the picture below doesn't show, please see attachment of the picture titled "result")&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1405i4B6FFD51361A0D8D/image-size/original?v=mpbl-1&amp;amp;px=-1" alt="result.png" title="result.png" border="0" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can anyone show me template code that achieves what I described above? I'm thinking maybe proc sql can do the trick, but I'm quite new to SAS and still don't know the syntax very well. Thanks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BR /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/12176i05F9E39202351BC6/image-size/large?v=1.0&amp;amp;px=600" border="0" alt="formula.png" title="formula.png" /&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/12177iEADD4643717028A9/image-size/large?v=1.0&amp;amp;px=600" border="0" alt="result.png" title="result.png" /&gt;</description>
      <pubDate>Sun, 10 Jan 2016 06:50:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242599#M55819</guid>
      <dc:creator>TrueTears</dc:creator>
      <dc:date>2016-01-10T06:50:38Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating expression of a formula for each resample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242601#M55820</link>
      <description>Look into PROC MEANS. BY-group processing tells the procedure how to define your groups. &lt;BR /&gt;Here's some sample code. &lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;proc means data=have stackods n mean median std max min;&lt;BR /&gt;by replicate;&lt;BR /&gt;var a b c d;&lt;BR /&gt;ods output summary=results;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;proc print data=results;&lt;BR /&gt;run;</description>
      <pubDate>Sun, 10 Jan 2016 06:36:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242601#M55820</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-01-10T06:36:59Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating expression of a formula for each resample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242602#M55821</link>
      <description>Sorry, didn't see your formulas in attachment. It helps to include or mention them in your question. Is your metric related to skewness?</description>
      <pubDate>Sun, 10 Jan 2016 06:39:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242602#M55821</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-01-10T06:39:02Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating expression of a formula for each resample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242603#M55822</link>
      <description>I included them as a picture in the post itself, maybe it doesn't show for you for some reason. I have attached both pictures as an attachment.&lt;BR /&gt;&lt;BR /&gt;Basically, the trouble I am having is how to code the formula and also how to output the results.&lt;BR /&gt;&lt;BR /&gt;Thanks for your help.</description>
      <pubDate>Sun, 10 Jan 2016 06:52:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242603#M55822</guid>
      <dc:creator>TrueTears</dc:creator>
      <dc:date>2016-01-10T06:52:43Z</dc:date>
    </item>
    <item>
      <title>Re: Calculating expression of a formula for each resample</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242612#M55826</link>
      <description>&lt;P&gt;Sure, PROC SQL can do the trick, but PROC MEANS (or PROC SUMMARY for that matter) can compute&amp;nbsp;more&amp;nbsp;statistics than PROC SQL, e.g. skewness (thanks,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;,&amp;nbsp;for the hint!). Please note that your &lt;EM&gt;S&lt;/EM&gt; statistic can be derived from the coefficient of variation: &lt;EM&gt;S&lt;/EM&gt;=100/CV. Your&amp;nbsp;γ&amp;nbsp;statistic can be calculated directly as skewness with the VARDEF=N option of the PROC MEANS statement, whereas your sample standard deviation would require VARDEF=DF (for the denominator &lt;EM&gt;n&lt;/EM&gt;-1, which I assume is what you want). Instead of merging two datasets with summary statistics (one for each setting of VARDEF), I decided to convert the default "DF skewness"&amp;nbsp;to "N skewness" by multiplying by the appropriate conversion factor &lt;EM&gt;f&amp;nbsp;&lt;/EM&gt;:= (&lt;EM&gt;n&lt;/EM&gt;-1)(&lt;EM&gt;n&lt;/EM&gt;-2)/&lt;EM&gt;n&lt;/EM&gt;² = 1 - 3/&lt;EM&gt;n&amp;nbsp;&lt;/EM&gt;+ 2/&lt;SPAN&gt;&lt;EM&gt;n&lt;/EM&gt;².&lt;/SPAN&gt;&lt;/P&gt;
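&lt;P&gt;For reference, a short sketch of where the conversion factor comes from. It assumes, as in the PROC SQL code further down in this post, that the skewness term (written γ here) divides the sum of cubed deviations by &lt;EM&gt;n&lt;/EM&gt; and uses the sample standard deviation &lt;EM&gt;s&lt;/EM&gt; with denominator &lt;EM&gt;n&lt;/EM&gt;-1. Here &lt;EM&gt;n&lt;/EM&gt; is the size of one resample, and the symbols &lt;EM&gt;z&lt;/EM&gt; and Skew_DF are my own notation:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-latex"&gt;\begin{gather*}
z_i = \frac{a_i - \bar{a}}{s}, \qquad
\mathrm{CV} = 100\,\frac{s}{\bar{a}} \;\Rightarrow\; S = \frac{\bar{a}}{s} = \frac{100}{\mathrm{CV}} \\
\mathrm{Skew}_{\mathrm{DF}} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} z_i^3 \\
\gamma = \frac{1}{n} \sum_{i=1}^{n} z_i^3
       = \frac{(n-1)(n-2)}{n^2}\,\mathrm{Skew}_{\mathrm{DF}}
       = f \cdot \mathrm{Skew}_{\mathrm{DF}} \\
f = \frac{n^2 - 3n + 2}{n^2} = 1 - \frac{3}{n} + \frac{2}{n^2}
\end{gather*}
&lt;/CODE&gt;&lt;/PRE&gt;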
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=bootsample;
by replicate;
var a b c d;
output out=stats(drop=_:) cv= skew= / autoname;
run;

%let f=%sysevalf(1-3/&amp;amp;record_count+2/&amp;amp;record_count**2);

data want;
set stats;
a_t=100/a_CV+&amp;amp;f*a_Skew;
b_t=100/b_CV+&amp;amp;f*b_Skew;
c_t=100/c_CV+&amp;amp;f*c_Skew;
d_t=100/d_CV+&amp;amp;f*d_Skew;
drop a_CV--d_Skew;
run;

ods html file="C:\Temp\t_stat.html";
ods listing close;
title 'The t Statistic';

proc print data=want label noobs;
run;

ods html close;
ods listing;
title;
&lt;/CODE&gt;&lt;/PRE&gt;
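&lt;P&gt;If more variables are added later, the four nearly identical assignments could also be written as a loop over arrays. This is only a minimal sketch: it assumes the a_CV ... d_Skew names produced by AUTONAME above and the macro variable f, while the dataset name want_arr and the array names cvs, sks and tval are hypothetical:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want_arr;
set stats;
/* statistics from PROC SUMMARY, one array element per variable */
array cvs{4}  a_CV   b_CV   c_CV   d_CV;
array sks{4}  a_Skew b_Skew c_Skew d_Skew;
/* new variables that receive the t values */
array tval{4} a_t    b_t    c_t    d_t;
do i=1 to 4;
  tval{i}=100/cvs{i}+&amp;amp;f*sks{i};
end;
/* keep only the replicate ID and the four t values */
keep replicate a_t b_t c_t d_t;
run;
&lt;/CODE&gt;&lt;/PRE&gt;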
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An equivalent PROC SQL version (perhaps for validation purposes or for comparing run times and numerical accuracy) could look like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want as
select replicate,
       m_a/s_a+sum(x_a)/(&amp;amp;record_count*s_a**3) as a_t,
       m_b/s_b+sum(x_b)/(&amp;amp;record_count*s_b**3) as b_t,
       m_c/s_c+sum(x_c)/(&amp;amp;record_count*s_c**3) as c_t,
       m_d/s_d+sum(x_d)/(&amp;amp;record_count*s_d**3) as d_t from
(select replicate, mean(a) as m_a, std(a) as s_a, (a-calculated m_a)**3 as x_a, 
                   mean(b) as m_b, std(b) as s_b, (b-calculated m_b)**3 as x_b,
                   mean(c) as m_c, std(c) as s_c, (c-calculated m_c)**3 as x_c,
                   mean(d) as m_d, std(d) as s_d, (d-calculated m_d)**3 as x_d
 from bootsample
 group by replicate)
group by replicate;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Minor differences between the two approaches (on the order of 1E-13, depending on your a, b, c, d values) are likely to occur.&lt;/P&gt;
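&lt;P&gt;One way to check the size of such differences is PROC COMPARE. The following is a sketch under assumptions: both steps above write to a dataset named want, so it presumes the two results were saved under different names (want_means and want_sql are hypothetical) and that both are sorted by replicate. The tolerance of 1e-9 is an arbitrary choice:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* report only values whose absolute difference exceeds 1e-9 */
proc compare base=want_means compare=want_sql
             method=absolute criterion=1e-9;
id replicate;   /* match rows by replication ID */
run;
&lt;/CODE&gt;&lt;/PRE&gt;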
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 10 Jan 2016 18:47:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Calculating-expression-of-a-formula-for-each-resample/m-p/242612#M55826</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-01-10T18:47:26Z</dc:date>
    </item>
  </channel>
</rss>

