<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Creating dataset that matches profile of another dataset in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/471175#M120658</link>
    <description>&lt;P&gt;That looks like it might be what I want. I'll have a bash at it in the morning.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Many thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andrew&lt;/P&gt;</description>
    <pubDate>Mon, 18 Jun 2018 17:30:59 GMT</pubDate>
    <dc:creator>andrewjmdata</dc:creator>
    <dc:date>2018-06-18T17:30:59Z</dc:date>
    <item>
      <title>Creating dataset that matches profile of another dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470775#M120490</link>
      <description>&lt;P&gt;Hi, I have a mailing file which I have profiled on age, gender, ethnicity and occupation.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;From the main database I want to select as big a sample as&amp;nbsp;possible that has the same characteristics as the mailing file based on age, gender, ethnicity and occupation.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can do this in a very long-winded way by matching the % of the database that is the same as in the mailing file, but it would be better if I could find a SAS procedure/methodology to do it, as I need an element of statistical rigour.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using SAS EG v7.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Sat, 16 Jun 2018 10:45:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470775#M120490</guid>
      <dc:creator>andrewjmdata</dc:creator>
      <dc:date>2018-06-16T10:45:18Z</dc:date>
    </item>
    <item>
      <title>Re: Creating dataset that matches profile of another dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470790#M120495</link>
      <description>&lt;P&gt;You could use proc sort nodupkey to reduce the mailing file to the combinations of age/sex/ethnicity/occupation that occur in it, and then merge this onto the master file.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* one row per combination found in the mailing file */&lt;/P&gt;&lt;P&gt;proc sort data=mailingfile out=template (keep=age sex ethnicity occupation) nodupkey;&lt;/P&gt;&lt;P&gt;by age sex ethnicity occupation;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=masterfile;&lt;/P&gt;&lt;P&gt;by age sex ethnicity occupation;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* keep master rows whose combination also appears in the template */&lt;/P&gt;&lt;P&gt;data new;&lt;/P&gt;&lt;P&gt;merge masterfile (in=m) template (in=a);&lt;/P&gt;&lt;P&gt;by age sex ethnicity occupation;&lt;/P&gt;&lt;P&gt;if m and a then output;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It depends a lot on exactly what you're doing, i.e. how they should be matched. You could do something more sophisticated, e.g. proc psmatch: &lt;A href="https://support.sas.com/documentation/onlinedoc/stat/142/psmatch.pdf" target="_blank"&gt;https://support.sas.com/documentation/onlinedoc/stat/142/psmatch.pdf&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Jun 2018 15:04:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470790#M120495</guid>
      <dc:creator>pau13rown</dc:creator>
      <dc:date>2018-06-16T15:04:59Z</dc:date>
    </item>
    <item>
      <title>Re: Creating dataset that matches profile of another dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470819#M120503</link>
      <description>&lt;P&gt;And if you don't have the latest version of SAS to use PSMATCH, see the Mayo Clinic's macro for greedy match algorithms.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Jun 2018 23:45:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470819#M120503</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-06-16T23:45:36Z</dc:date>
    </item>
    <item>
      <title>Re: Creating dataset that matches profile of another dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470859#M120522</link>
      <description>&lt;P&gt;Mmm! Not sure that will work, I probably haven't explained the guts of this too well.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If in the mail file I have the following profile of age sex ethnicity occupation...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE width="390"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="78"&gt;Age&lt;/TD&gt;
&lt;TD width="78"&gt;Sex&lt;/TD&gt;
&lt;TD width="78"&gt;Ethnicity&lt;/TD&gt;
&lt;TD width="78"&gt;Occupation&lt;/TD&gt;
&lt;TD width="78"&gt;% of total&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;20-30&lt;/TD&gt;
&lt;TD&gt;Male&lt;/TD&gt;
&lt;TD&gt;Black&lt;/TD&gt;
&lt;TD&gt;Plumber&lt;/TD&gt;
&lt;TD&gt;3.5%&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;30-40&lt;/TD&gt;
&lt;TD&gt;Male&lt;/TD&gt;
&lt;TD&gt;White&lt;/TD&gt;
&lt;TD&gt;Painter&lt;/TD&gt;
&lt;TD&gt;2.9%&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;40-50&lt;/TD&gt;
&lt;TD&gt;Male&lt;/TD&gt;
&lt;TD&gt;White&lt;/TD&gt;
&lt;TD&gt;Lawyer&lt;/TD&gt;
&lt;TD&gt;1.7%&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;20-30&lt;/TD&gt;
&lt;TD&gt;Female&lt;/TD&gt;
&lt;TD&gt;White&lt;/TD&gt;
&lt;TD&gt;Teacher&lt;/TD&gt;
&lt;TD&gt;2.1%&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;30-40&lt;/TD&gt;
&lt;TD&gt;Male&lt;/TD&gt;
&lt;TD&gt;Asian&lt;/TD&gt;
&lt;TD&gt;Doctor&lt;/TD&gt;
&lt;TD&gt;2.9%&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;40-50&lt;/TD&gt;
&lt;TD&gt;Male&lt;/TD&gt;
&lt;TD&gt;White&lt;/TD&gt;
&lt;TD&gt;Plumber&lt;/TD&gt;
&lt;TD&gt;1.7%&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I want is for the control I select from the main database to have the same profile by age, sex, ethnicity and occupation, in terms of the %.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this makes more sense now, and thanks for your help so far.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Sun, 17 Jun 2018 11:50:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470859#M120522</guid>
      <dc:creator>andrewjmdata</dc:creator>
      <dc:date>2018-06-17T11:50:14Z</dc:date>
    </item>
    <item>
      <title>Re: Creating dataset that matches profile of another dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470935#M120561</link>
      <description>&lt;P&gt;I think you need two columns: nMail and nPop with the mail sample sizes and the population sizes respectively. From those, you could do:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data counts;
input Age $ Sex $ Ethnicity $ Occupation $ nMail nPop; 
datalines;
20-30 Male Black Plumber 350 2000
30-40 Male White Painter 290 5000
40-50 Male White Lawyer 170 1200 
20-30 Female White Teacher 210 5000
30-40 Male Asian Doctor 290 500
40-50 Male White Plumber 170 3500
;

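proc sql;
/* Scale every cell by the tightest population/mail ratio so that no cell
   asks for more rows than the population holds; floor() keeps the
   stratum sample sizes whole numbers. */
create table sampsize as
select
    *,
    floor(min(nPop/nMail) * nMail) as SampleSize
from counts
order by Age, Sex, Ethnicity, Occupation;
select * from sampsize;
quit;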

proc surveyselect data=pop out=control sampsize=sampsize;
strata age sex ethnicity occupation;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note: The population dataset (pop) and the sample size dataset (sampsize) must both be sorted by the strata variables, in the same order.&lt;/P&gt;
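&lt;P&gt;The counts table above is typed in by hand. As a rough sketch, it could instead be built from the data itself. Here mailingfile and masterfile are assumed dataset names (borrowed from earlier in the thread), and the age, sex, ethnicity and occupation columns are assumed to exist in both with identical coding. The second step builds the pop dataset passed to proc surveyselect, restricted to cells that occur in the mailing file and sorted by the strata variables, on the assumption that surveyselect expects a sample size for every stratum it finds in its input.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: mailingfile and masterfile are assumed dataset names. */
proc sql;
/* Count each age/sex/ethnicity/occupation cell in both files.
   The inner join keeps only cells present in both. */
create table counts as
select m.age, m.sex, m.ethnicity, m.occupation, m.nMail, p.nPop
from (select age, sex, ethnicity, occupation, count(*) as nMail
      from mailingfile
      group by age, sex, ethnicity, occupation) as m
inner join
     (select age, sex, ethnicity, occupation, count(*) as nPop
      from masterfile
      group by age, sex, ethnicity, occupation) as p
on  m.age = p.age
and m.sex = p.sex
and m.ethnicity = p.ethnicity
and m.occupation = p.occupation;

/* Population rows for the cells in counts, sorted by the strata variables.
   counts has one row per cell, so the join does not duplicate rows. */
create table pop as
select p.*
from masterfile as p
inner join counts as c
on  p.age = c.age
and p.sex = c.sex
and p.ethnicity = c.ethnicity
and p.occupation = c.occupation
order by p.age, p.sex, p.ethnicity, p.occupation;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With counts and pop built this way, the proc sql step that computes SampleSize and the proc surveyselect call should run as written. Any cell in the mailing file with no match in the main database is dropped by the inner join, so the control cannot reproduce that part of the profile.&lt;/P&gt;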
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 17 Jun 2018 21:15:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/470935#M120561</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-06-17T21:15:38Z</dc:date>
    </item>
    <item>
      <title>Re: Creating dataset that matches profile of another dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/471175#M120658</link>
      <description>&lt;P&gt;That looks like it might be what I want. I'll have a bash at it in the morning.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Many thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Mon, 18 Jun 2018 17:30:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Creating-dataset-that-matches-profile-of-another-dataset/m-p/471175#M120658</guid>
      <dc:creator>andrewjmdata</dc:creator>
      <dc:date>2018-06-18T17:30:59Z</dc:date>
    </item>
  </channel>
</rss>

