<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to generate a dummy data set for testing without access to the original data set in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639550#M21687</link>
    <description>Unfortunately, using the original data with anonymized IDs is not possible. The variables are continuous, and I want to match at least the means and variances and show associations with other variables in the dummy data set (such as odds ratios) similar to what would be expected from the original data set. "Close" means not exactly the same, but realistic enough to the ordinary eye. Any more thoughts? Thanks much!</description>
    <pubDate>Mon, 13 Apr 2020 18:53:22 GMT</pubDate>
    <dc:creator>Ogee</dc:creator>
    <dc:date>2020-04-13T18:53:22Z</dc:date>
    <item>
      <title>How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639527#M21680</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;Could anyone share some simple, efficient code for creating a dummy data set that mimics the structure of an original data set that one may not have access to? The goal is that running PROC MEANS on the dummy data set should give results close to those from running PROC MEANS on the original data set. Any tips are welcome. Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 17:28:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639527#M21680</guid>
      <dc:creator>Ogee</dc:creator>
      <dc:date>2020-04-13T17:28:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639529#M21681</link>
      <description>&lt;P&gt;If you don't have access to the original data set, how do you expect to know its structure?&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 17:32:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639529#M21681</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-04-13T17:32:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639531#M21682</link>
      <description>Sorry, I should have been clearer. Suppose you know the variables and metadata of the original data set, or the original data set contains confidential information that prevents you from sharing the results. How can one generate a dummy data set that mimics the structure of the original, given the limited information one has about it? Thanks in advance.</description>
      <pubDate>Mon, 13 Apr 2020 17:38:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639531#M21682</guid>
      <dc:creator>Ogee</dc:creator>
      <dc:date>2020-04-13T17:38:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639548#M21685</link>
      <description>&lt;P&gt;Continuous or discrete values?&lt;/P&gt;
&lt;P&gt;Hint: with discrete values you may be more interested in the percentage of occurrence than in the "mean".&lt;/P&gt;
&lt;P&gt;Do you expect to show a similar association with any other variables?&lt;/P&gt;
&lt;P&gt;How many dummy observations do you intend to create? How "close" is close enough?&lt;/P&gt;
&lt;P&gt;Do you already know the mean, standard deviation (and maybe skewness and kurtosis) of the variables of interest?&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 18:42:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639548#M21685</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-04-13T18:42:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639549#M21686</link>
      <description>&lt;P&gt;You have to decide in advance which features of the original data you want to mimic. You may want to match means, variances, distributions, correlations, ... the list is endless. Is using the original data with anonymised IDs a possibility?&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 18:43:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639549#M21686</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-04-13T18:43:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639550#M21687</link>
      <description>Unfortunately, using the original data with anonymized IDs is not possible. The variables are continuous, and I want to match at least the means and variances and show associations with other variables in the dummy data set (such as odds ratios) similar to what would be expected from the original data set. "Close" means not exactly the same, but realistic enough to the ordinary eye. Any more thoughts? Thanks much!</description>
      <pubDate>Mon, 13 Apr 2020 18:53:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639550#M21687</guid>
      <dc:creator>Ogee</dc:creator>
      <dc:date>2020-04-13T18:53:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639556#M21689</link>
      <description>&lt;P&gt;If you have reason to believe that your continuous variables are close to normally distributed, you could do something similar to:&lt;/P&gt;
&lt;PRE&gt;/* Get the mean and standard deviation of each variable */
proc summary data=sashelp.class;
   var height weight;
   output out=classsummary mean= stddev= / autoname;
run;

/* Draw 20 observations per variable from normal distributions
   with the summarized means and standard deviations */
data trial;
   set classsummary;
   do i=1 to 20;
      modheight = rand('normal',height_mean,height_stddev);
      modweight = rand('normal',weight_mean,weight_stddev);
      output;
   end;
   keep modheight modweight;
run;

/* Compare the simulated statistics with the originals */
proc means data=trial mean stddev;
run;
&lt;/PRE&gt;
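&lt;P&gt;Note that the step above draws each variable independently, so any association between them is lost. If you also know (or can reasonably guess) the correlation, a minimal sketch for two roughly normal variables could look like the following. The data set name trial_corr, the seed, and every numeric value here are placeholders made up for illustration; substitute the statistics you actually know.&lt;/P&gt;
&lt;PRE&gt;data trial_corr;
   call streaminit(12345);        /* arbitrary seed so the run is reproducible */

   /* Placeholder summary statistics (assumptions, not real results) */
   mean_h = 62;   std_h = 5;      /* mean and std dev of first variable  */
   mean_w = 100;  std_w = 23;     /* mean and std dev of second variable */
   rho    = 0.8;                  /* target correlation between the two  */

   do i = 1 to 20;
      z1 = rand('normal');        /* independent standard normal draws */
      z2 = rand('normal');
      modheight = mean_h + std_h*z1;
      /* Mixing z1 and z2 this way gives the pair correlation rho */
      modweight = mean_w + std_w*(rho*z1 + sqrt(1 - rho**2)*z2);
      output;
   end;
   keep modheight modweight;
run;

/* Check the simulated means, standard deviations and correlation */
proc corr data=trial_corr;
   var modheight modweight;
run;
&lt;/PRE&gt;
&lt;P&gt;With only 20 observations the sample correlation can differ noticeably from rho; increasing the loop count should bring it closer. This only handles two variables; for more, a multivariate method would be needed.&lt;/P&gt;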
&lt;P&gt;If you know your data follow some other distribution, something similar may be possible by replacing 'normal' with the appropriate distribution from the RAND documentation, along with its required parameters. You would have to get those parameters from somewhere, e.g. PROC MEANS/SUMMARY or PROC UNIVARIATE run on your existing data.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 19:20:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639556#M21689</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-04-13T19:20:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639557#M21690</link>
      <description>&lt;P&gt;Data simulation is well covered in &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;'s book:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://support.sas.com/en/books/authors/rick-wicklin.html" target="_blank"&gt;https://support.sas.com/en/books/authors/rick-wicklin.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 19:20:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/639557#M21690</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-04-13T19:20:29Z</dc:date>
    </item>
    <item>
      <title>Re: How to generate a dummy data set for testing without access to the original data set</title>
      <link>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/640149#M21724</link>
      <description>&lt;P&gt;Thanks all for your thoughtful responses to my questions. Much appreciated!&lt;/P&gt;</description>
      <pubDate>Wed, 15 Apr 2020 17:44:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/How-to-generate-a-dummy-data-set-for-testing-without-access-to/m-p/640149#M21724</guid>
      <dc:creator>Ogee</dc:creator>
      <dc:date>2020-04-15T17:44:34Z</dc:date>
    </item>
  </channel>
</rss>

