<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dataset creation in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878674#M347174</link>
    <description>&lt;P&gt;Thank you all for your help. The problem has been solved.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 01 Jun 2023 14:10:24 GMT</pubDate>
    <dc:creator>hongjie76</dc:creator>
    <dc:date>2023-06-01T14:10:24Z</dc:date>
    <item>
      <title>Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878319#M346998</link>
      <description>&lt;P&gt;I tried to create a dataset of 200 individuals. There are 4 variables in the dataset, y, x1, x2, and x3. The value of y is predicted from x1, x2, and x3. There are also some correlations among x1, x2, and x3. I created the following SAS syntax. However, I got error messages and could not create the dataset. Would you please help me to see what went wrong here? Thank you in advance!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* Set the number of individuals */&lt;BR /&gt;%let num_individuals = 200;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* Set the correlation matrix */&lt;BR /&gt;%let correlation_matrix = 1, 0.5, 0.3,&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.5, 1, 0.2,&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0.3, 0.2, 1;&lt;/P&gt;&lt;P&gt;/* Create the dataset */&lt;BR /&gt;data my_dataset;&lt;BR /&gt;array x[3] x1-x3;&lt;BR /&gt;call streaminit(12345); /* Set the seed for random number generation */&lt;BR /&gt;&lt;BR /&gt;/* Generate correlated values for x1, x2, and x3 */&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;do i = 1 to &amp;amp;num_individuals;&lt;BR /&gt;x = rand("Multinormal", 0, &amp;amp;_correlation_matrix); /* Generate correlated values */&lt;BR /&gt;x1 = x[1];&lt;BR /&gt;x2 = x[2];&lt;BR /&gt;x3 = x[3];&lt;BR /&gt;&lt;BR /&gt;/* Calculate the value of y using x1, x2, and x3 */&lt;BR /&gt;y = 2 * x1 + 3 * x2 - 4 * x3 + rand("Normal", 0, 0.5); /* Add some random noise to the prediction */&lt;BR /&gt;&lt;BR /&gt;output; /* Output the current observation */&lt;BR 
/&gt;end;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;keep y x1 x2 x3; /* Keep only the specified variables */&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* Print the dataset */&lt;BR /&gt;proc print data=my_dataset;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I got the following error message:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;214 %let num_individuals = 200;&lt;BR /&gt;215&lt;BR /&gt;216 /* Set the correlation matrix */&lt;BR /&gt;217 %let correlation_matrix = 1, 0.5, 0.3,&lt;BR /&gt;218 0.5, 1, 0.2,&lt;BR /&gt;219 0.3, 0.2, 1;&lt;BR /&gt;220&lt;BR /&gt;221 /* Create the dataset */&lt;BR /&gt;222 data my_dataset;&lt;BR /&gt;223 array x[3] x1-x3;&lt;BR /&gt;224 call streaminit(12345); /* Set the seed for random number generation */&lt;BR /&gt;225&lt;BR /&gt;226 /* Generate correlated values for x1, x2, and x3 */&lt;BR /&gt;227 do i = 1 to &amp;amp;num_individuals;&lt;BR /&gt;228 x = rand("Multinormal", 0, &amp;amp;_correlation_matrix); /* Generate correlated values */&lt;BR /&gt;-&lt;BR /&gt;22&lt;BR /&gt;WARNING: Apparent symbolic reference _CORRELATION_MATRIX not resolved.&lt;BR /&gt;ERROR: Illegal reference to the array x.&lt;BR /&gt;ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string,&lt;BR /&gt;a numeric constant, a datetime constant, a missing value, INPUT, PUT.&lt;/P&gt;&lt;P&gt;229 x1 = x[1];&lt;BR /&gt;230 x2 = x[2];&lt;BR /&gt;231 x3 = x[3];&lt;BR /&gt;232&lt;BR /&gt;233 /* Calculate the value of y using x1, x2, and x3 */&lt;BR /&gt;234 y = 2 * x1 + 3 * x2 - 4 * x3 + rand("Normal", 0, 0.5); /* Add some random noise to the&lt;BR /&gt;234! prediction */&lt;BR /&gt;235&lt;BR /&gt;236 output; /* Output the current observation */&lt;BR /&gt;237 end;&lt;BR /&gt;238 keep y x1 x2 x3; /* Keep only the specified variables */&lt;BR /&gt;239 run;&lt;/P&gt;&lt;P&gt;NOTE: The SAS System stopped processing this step because of errors.&lt;BR /&gt;WARNING: The data set WORK.MY_DATASET may be incomplete. 
When this step was stopped there were&lt;BR /&gt;0 observations and 4 variables.&lt;BR /&gt;WARNING: Data set WORK.MY_DATASET was not replaced because this step was stopped.&lt;/P&gt;</description>
      <pubDate>Tue, 30 May 2023 21:26:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878319#M346998</guid>
      <dc:creator>hongjie76</dc:creator>
      <dc:date>2023-05-30T21:26:09Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878491#M347081</link>
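      <description>Hi, I haven't debugged the rest of the code, but it looks like you have a typo in the macro variable reference: the %let statement defines correlation_matrix, while the DATA step references &amp;amp;_correlation_matrix. Either add a leading underscore to the name in your %let statement or remove it from the reference.</description>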
      <pubDate>Wed, 31 May 2023 17:03:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878491#M347081</guid>
      <dc:creator>HarrySnart</dc:creator>
      <dc:date>2023-05-31T17:03:22Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878492#M347082</link>
      <description>&lt;P&gt;Thanks! After I changed the let statement to '%let _correlation_matrix ', I still got the error message:&lt;/P&gt;&lt;P&gt;"ERROR: Illegal reference to the array x."&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1 /* Set the number of individuals */&lt;BR /&gt;2 %let num_individuals = 200;&lt;BR /&gt;3&lt;BR /&gt;4&lt;BR /&gt;5&lt;BR /&gt;6 /* Set the correlation matrix */&lt;BR /&gt;7 %let _correlation_matrix = 1, 0.5, 0.3,&lt;BR /&gt;8 0.5, 1, 0.2,&lt;BR /&gt;9 0.3, 0.2, 1;&lt;BR /&gt;10&lt;BR /&gt;11 /* Create the dataset */&lt;BR /&gt;12 data my_dataset;&lt;BR /&gt;13 array x[3] x1-x3;&lt;BR /&gt;14 call streaminit(12345); /* Set the seed for random number generation */&lt;BR /&gt;15&lt;BR /&gt;16 /* Generate correlated values for x1, x2, and x3 */&lt;BR /&gt;17&lt;BR /&gt;18&lt;BR /&gt;19 do i = 1 to &amp;amp;num_individuals;&lt;BR /&gt;20 x = rand("Multinormal", 0, &amp;amp;_correlation_matrix); /* Generate correlated values */&lt;BR /&gt;ERROR: Illegal reference to the array x.&lt;BR /&gt;21 x1 = x[1];&lt;BR /&gt;22 x2 = x[2];&lt;BR /&gt;23 x3 = x[3];&lt;BR /&gt;24&lt;BR /&gt;25 /* Calculate the value of y using x1, x2, and x3 */&lt;BR /&gt;26 y = 2 * x1 + 3 * x2 - 4 * x3 + rand("Normal", 0, 0.5); /* Add some random noise to the&lt;BR /&gt;26 ! prediction */&lt;BR /&gt;27&lt;BR /&gt;28 output; /* Output the current observation */&lt;BR /&gt;29 end;&lt;BR /&gt;30&lt;BR /&gt;31&lt;BR /&gt;32 keep y x1 x2 x3; /* Keep only the specified variables */&lt;BR /&gt;33 run;&lt;/P&gt;&lt;P&gt;NOTE: The SAS System stopped processing this step because of errors.&lt;BR /&gt;WARNING: The data set WORK.MY_DATASET may be incomplete. When this step was stopped there were&lt;BR /&gt;0 observations and 4 variables.&lt;/P&gt;</description>
      <pubDate>Wed, 31 May 2023 17:13:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878492#M347082</guid>
      <dc:creator>hongjie76</dc:creator>
      <dc:date>2023-05-31T17:13:06Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878496#M347085</link>
      <description>&lt;PRE&gt;20 x = rand("Multinormal", 0, &amp;amp;_correlation_matrix); /* Generate correlated values */
ERROR: Illegal reference to the array x.&lt;/PRE&gt;
&lt;P&gt;I'm not sure where you got this syntax from, but a search of the documentation for SAS does not turn up a random number generator that has the distribution "Multinormal". There is the RANDNORMAL function in PROC IML, if that would be of help to you.&lt;/P&gt;</description>
      <pubDate>Wed, 31 May 2023 17:26:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878496#M347085</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-05-31T17:26:45Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878499#M347087</link>
      <description>&lt;P&gt;I think you're mixing IML and data step code.&amp;nbsp;&lt;/P&gt;
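&lt;P&gt;Also, X is declared as an array, but the statement x = rand(...) then assigns to X as if it were an ordinary variable. That isn't going to work, and it is what triggers "ERROR: Illegal reference to the array x."&lt;/P&gt;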
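&lt;P&gt;A minimal sketch of how array elements are referenced in a DATA step, assuming independent standard normal draws purely for illustration (this does not produce the correlated values the question asks for, and the index name j is arbitrary):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  array x[3] x1-x3;            /* X is an array name, not a variable */
  call streaminit(12345);      /* seed the RAND stream */
  do j = 1 to dim(x);
    x[j] = rand("Normal");     /* assign to one element at a time */
  end;
  put x1= x2= x3=;             /* write the three values to the log */
run;&lt;/CODE&gt;&lt;/PRE&gt;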
</description>
      <pubDate>Wed, 31 May 2023 17:30:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878499#M347087</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-05-31T17:30:03Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878519#M347094</link>
      <description>&lt;P&gt;If you don't have access to IML, you can use this technique (&lt;A href="https://blogs.sas.com/content/iml/2017/09/25/simulate-multivariate-normal-data-sas-simnormal.html" target="_self"&gt;Simulate multivariate normal data in SAS by using PROC SIMNORMAL&lt;/A&gt;) described by&amp;nbsp;Rick Wicklin.&amp;nbsp; It uses a DATA step, plus proc simnormal to generate a multinormal distribution for the independent variables.&amp;nbsp; In your case it would be something like&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data havecorr (type='CORR');
  input _TYPE_ $4.  @7 _NAME_ $4.  @10 x1 x2 x3 ;
datalines;
MEAN       0    0    0
STD        1    1    1
N          200 200 200
CORR   X1  1    0.5  0.3
CORR   X2  0.5  1    0.2
CORR   X3  0.3  0.2  1
run;

proc simnormal data=havecorr outsim=SimMVN
               numreal = 200           /* number of realizations = size of sample */
               seed = 12345  ;         /* random number seed */
   var x1-x3;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then, from dataset SimMVN, you can simulate Y from the generated X values.&lt;/P&gt;
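&lt;P&gt;A minimal sketch of that last step, assuming the coefficients and noise level from the original post (y = 2*x1 + 3*x2 - 4*x3 plus Normal(0, 0.5) noise). The output name MY_DATASET is taken from the question and the seed is arbitrary:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data my_dataset;
  set SimMVN;                               /* one row per simulated individual */
  if _n_ = 1 then call streaminit(12345);   /* seed the RAND stream once */
  /* linear predictor plus normal noise with standard deviation 0.5 */
  y = 2*x1 + 3*x2 - 4*x3 + rand("Normal", 0, 0.5);
  keep y x1 x2 x3;                          /* drop any other OUTSIM= variables */
run;&lt;/CODE&gt;&lt;/PRE&gt;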
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note that, per Rick's comment, you can also generate the HAVECORR dataset directly from original correlated data (using, say, PROC CORR).&lt;/P&gt;
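&lt;P&gt;As a rough sketch of that route, assuming the original correlated data sits in a dataset named ORIGINAL (a hypothetical name) with variables x1-x3, the OUTP= option of PROC CORR writes a TYPE=CORR dataset holding the means, standard deviations, N, and Pearson correlations:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* ORIGINAL is a placeholder for your own correlated data */
proc corr data=original outp=havecorr noprint;
   var x1-x3;
run;&lt;/CODE&gt;&lt;/PRE&gt;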
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can learn more about the proc at&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_simnorm_overview.htm" target="_self"&gt;The SIMNORMAL Procedure&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Jun 2023 14:45:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878519#M347094</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2023-06-01T14:45:09Z</dc:date>
    </item>
    <item>
      <title>Re: Dataset creation</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878674#M347174</link>
      <description>&lt;P&gt;Thank you all for your help. The problem has been solved.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Jun 2023 14:10:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dataset-creation/m-p/878674#M347174</guid>
      <dc:creator>hongjie76</dc:creator>
      <dc:date>2023-06-01T14:10:24Z</dc:date>
    </item>
  </channel>
</rss>

