Jerrynetwork
Obsidian | Level 7

I am interested in "Creating Synthetic Data with SAS/OR", which was posted as a SAS blog post:

https://blogs.sas.com/content/operations/2017/05/17/creating-synthetic-data-sasor/

 

The method uses a macro that contains 3 inner macros. When I ran the macro to create synthetic data, an error message came up from the IP step portion of the macro. Here is the log:

229  *IP Step;
230  %macro IPSTEP(OUTPUTDATA, MOMENTORDER, NUMOBS, MINNUMIPCANDS, MILPMAXTIME,
231   RELOBJGAP);
232   num numSynthObs init &NUMOBS;
233   if (numSynthObs = 0) then numSynthObs = nObs;
234   num momRange {mi in MOM_IDX_SET} = momUb[mi] - momLb[mi];
235   var Assigned {IPCANDS[1]} binary;
236   var ScaledEta {MOM_IDX_SET} >= 0;
237   var MaxError >= 0;
238   minimize IpObj = MaxError;
239   con MaxCon {mi in MOM_IDX_SET}:
240   MaxError >= ScaledEta[mi];
241   con UpperIP {mi in MOM_IDX_SET}:
242   (1/numSynthObs) *
243   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob] -
244   momRange[mi]*ScaledEta[mi] <= momUb[mi];
245   con LowerIP {mi in MOM_IDX_SET}:
246   (1/numSynthObs) *
247   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob] +
248   momRange[mi]*ScaledEta[mi] >= momLb[mi];
249   con NumAssigned:
250   sum {ob in IPCANDS[1]} Assigned[ob] = numSynthObs;
251   /* Set an initial solution, then solve */
252   for {i in 1..numSynthObs} Assigned[i] = 1;
253   for {mi in MOM_IDX_SET} ScaledEta[mi] = if momRange[mi] <= 0 then 0
254   else max(((1/numSynthObs) *
255   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]
256   - momUb[mi]) / momRange[mi], momLb[mi] - (1/numSynthObs) *
257   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0); /**/
258   MaxError = max {mi in MOM_IDX_SET: momRange[mi] > 0} ScaledEta[mi];
259   solve with MILP / maxtime=&MILPMAXTIME relobjgap=&RELOBJGAP
260   heuristics=3 primalin;
261   /* Save selected observations to data set */
262   set FINALOBS = 1..numSynthObs;
263   num finalObsVal {FINALOBS,VARS};
264   num obIdx init 0;
265   for {ob in IPCANDS[1]: Assigned[ob] > 0.5} do;
266   obIdx = obIdx + 1;
267   for {v in VARS}
268   finalObsVal[obIdx,v] = ipObVal[1,ob,varName2varIdx[v]];
269   end;
270   create data &OUTPUTDATA(drop=tmpvar) from [tmpvar]=FINALOBS
271   {j in VARS} <col(j)=finalObsVal[tmpvar,j]>;
272  %mend IPSTEP;

273  %GENDATA(INPUTDATA=OriginalData, METADATA=Metadata, NUMOBS=100,
274   MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);
NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 9 observations read from the data set WORK.METADATA.
NOTE: There were 5039 observations read from the data set WORK.ORIGINALDATA.
Number of IP step candidate observations: 209
NOTE: Line generated by the invoked macro "IPSTEP".
6    ,mi]*Assigned[ob])/momRange[mi],0);
                                    -
                                    22
                                    200
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, *, **, +, -, .., /, <, <=,
              <>, =, >, ><, >=, AND, BY, CROSS, DIFF, ELSE, IN, INTER, NOT, OR, SYMDIFF, TO, UNION,
              WITHIN, ^, ^=, |, ||, ~, ~=.

ERROR 200-322: The symbol is not recognized and will be ignored.

NOTE: Problem generation will use 4 threads.

I have posted this question to "Programming", here is the link for your reference:

https://communities.sas.com/t5/SAS-Programming/Synthetic-data/m-p/717429#M221871 

I would appreciate it very much if anyone can help fix the issue. Thank you!

Accepted Solution
FreelanceReinh
Jade | Level 19

Hi @Jerrynetwork,

 

I think you (and the SASGF paper) have an unbalanced parenthesis in the second argument of the MAX function, which is where the error occurs. Here is the expression with one opening parenthesis added immediately before momLb[mi]:

 

max(((1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]
- momUb[mi]) / momRange[mi], (momLb[mi] - (1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0)

I guess that the missing opening parenthesis belongs there, immediately before momLb[mi]. But I can't test it because SAS/OR (sadly) isn't included in my SAS license. Can you try to reproduce the results of the example using sashelp.heart in the paper after inserting that parenthesis?
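One way to sanity-check where the parenthesis belongs, without running SAS/OR, is to read the initial value off the UpperIP and LowerIP constraints in the posted macro. The shorthand below is introduced here only for illustration and is not from the paper: \(n\) stands for numSynthObs, \(u_i\) for momUb[mi], \(l_i\) for momLb[mi], \(r_i\) for momRange[mi], and \(\eta_i\) for ScaledEta[mi]. Assuming \(r_i > 0\), the smallest nonnegative \(\eta_i\) that satisfies both constraints would be:

```latex
\begin{align*}
\bar m_i &= \frac{1}{n}\sum_{ob \in \mathrm{IPCANDS}[1]} \mathrm{ipObMoms}[1,ob,i]\cdot \mathrm{Assigned}[ob],
\qquad r_i = u_i - l_i,\\
\text{UpperIP: } & \bar m_i - r_i\,\eta_i \le u_i \;\Longrightarrow\; \eta_i \ge \frac{\bar m_i - u_i}{r_i},\\
\text{LowerIP: } & \bar m_i + r_i\,\eta_i \ge l_i \;\Longrightarrow\; \eta_i \ge \frac{l_i - \bar m_i}{r_i},\\
\eta_i &= \max\!\left(\frac{\bar m_i - u_i}{r_i},\; \frac{l_i - \bar m_i}{r_i},\; 0\right).
\end{align*}
```

The second argument is the whole difference \(l_i - \bar m_i\) divided by \(r_i\), which is what the expression computes once the opening parenthesis is placed before momLb[mi]. This only shows that the fix is consistent with the constraints in the posted code; it does not confirm the authors' intent.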

 


12 REPLIES
Jerrynetwork
Obsidian | Level 7

I am reading "Creating Synthetic Data with SAS/OR" from SAS support here:

https://blogs.sas.com/content/operations/2017/05/17/creating-synthetic-data-sasor/

 

A paper including more detail can be found here:

https://support.sas.com/resources/papers/proceedings17/1224-2017.pdf

There are 4 macros for the entire process (including 3 inner macros).

When I run the macro to create synthetic data, an error message comes up for the IP step portion. It says:

ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, *, **, +, -, .., /, <, <=,
<>, =, >, ><, >=, AND, BY, CROSS, DIFF, ELSE, IN, INTER, NOT, OR, SYMDIFF, TO, UNION,
WITHIN, ^, ^=, |, ||, ~, ~=.

ERROR 200-322: The symbol is not recognized and will be ignored.

 

Here is the portion that the error message came from (the syntax issue is in the last row):

for {i in 1..numSynthObs} Assigned[i] = 1;
for {mi in MOM_IDX_SET} ScaledEta[mi] = if momRange[mi] <= 0 then 0
else max(((1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]
- momUb[mi]) / momRange[mi], momLb[mi] - (1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0);

 

The macro for the IP step is attached.

Is anyone familiar with synthetic data generation and can help fix the issue? Thank you.
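For a long expression like the one above, a quick delimiter scan can narrow down where the parser gives up. The sketch below is in Python rather than OPTMODEL so that it can be run without a SAS/OR license. It is not a SAS parser, and the function name `first_delimiter_problem` is made up for this illustration. It tracks (), [] and {} and reports the first unmatched closer or the first comma that appears outside every delimiter, which inside a function call such as max(...) suggests the call was closed too early.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}


def first_delimiter_problem(expr: str):
    """Return (index, message) for the first delimiter problem, or None."""
    stack = []  # currently open delimiters, innermost last
    for i, ch in enumerate(expr):
        if ch in "([{":
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack[-1] != PAIRS[ch]:
                return i, f"unmatched {ch!r}"
            stack.pop()
        elif ch == "," and not stack:
            # An argument separator with nothing open: the enclosing
            # call has already been closed.
            return i, "comma outside any parentheses"
    if stack:
        return len(expr), f"unclosed {stack[-1]!r}"
    return None


# The expression as posted (line breaks replaced by spaces).
original = (
    "max(((1/numSynthObs) * "
    "sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob] "
    "- momUb[mi]) / momRange[mi], momLb[mi] - (1/numSynthObs) * "
    "sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0)"
)
# Candidate fix: one extra "(" before momLb[mi].
fixed = original.replace(", momLb[mi]", ", (momLb[mi]", 1)

for label, expr in (("original", original), ("fixed", fixed)):
    problem = first_delimiter_problem(expr)
    if problem is None:
        print(label, "-> delimiters look balanced")
    else:
        idx, msg = problem
        print(label, "->", msg, "after", repr(expr[max(0, idx - 12):idx]))
```

On the posted expression the scan stops at the comma that follows the last momRange[mi], which is the same spot the log marks under ERROR 22-322. A balanced result only means the delimiters pair up; it says nothing about whether the arithmetic is what the authors intended.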

PaigeMiller
Diamond | Level 26

Many of us (including me) will not download Microsoft Office (or any other) attachments, as they can be a security threat.


Please do not show us portions of the log detached from the actual code used.

 

Please include the SAS log in the reply. Please show us the relevant section of the log, with nothing chopped out. That relevant section should show the code and the NOTEs, WARNINGs and ERRORs, in the sequence in which they appear in the log. Please post the log by clicking on the </> icon and pasting the log as text into the window that appears. This will preserve the formatting of the log and make it easier for everyone to read and understand. If you don't preserve the format of the log in this way, I usually don't bother reading it, and ask again for the proper format.

--
Paige Miller
Jerrynetwork
Obsidian | Level 7

Here is the log with error message, thank you!

272  %mend IPSTEP;

273  %GENDATA(INPUTDATA=OriginalData, METADATA=Metadata, NUMOBS=100,
274   MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);
NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 9 observations read from the data set WORK.METADATA.
NOTE: There were 5039 observations read from the data set WORK.ORIGINALDATA.
Number of IP step candidate observations: 209
NOTE: Line generated by the invoked macro "IPSTEP".
6    ,mi]*Assigned[ob])/momRange[mi],0);
                                    -
                                    22
                                    200
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, *, **, +, -, .., /, <, <=,
              <>, =, >, ><, >=, AND, BY, CROSS, DIFF, ELSE, IN, INTER, NOT, OR, SYMDIFF, TO, UNION,
              WITHIN, ^, ^=, |, ||, ~, ~=.

ERROR 200-322: The symbol is not recognized and will be ignored.

NOTE: Problem generation will use 4 threads.

  

mkeintz
PROC Star

This is the equivalent of a "clerical error".  

 

Take a look at the GENDATA macro declaration, especially its parameterization:

%macro GENDATA(INPUTDATA, METADATA, OUTPUTDATA=SyntheticData,MOMENTORDER=3, NUMOBS=0
, MINNUMIPCANDS=0, LPBATCHSIZE=10, LPGAP=1E-3,NUMCOFORTHREADS=1, MILPMAXTIME=600, RELOBJGAP=1E-4
, ALPHA=0.95,RANDSEED=0 );

The first two arguments are positional.  They must be specified in the order declared, and they must be specified as values only.  All the other parameters are of the "name=value" category, which can be specified in any order, and must be specified as "name=value" pairs.

 

But your call to %GENDATA is 

%GENDATA(INPUTDATA=OriginalData, METADATA=Metadata, NUMOBS=100,MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);

while it should be

%GENDATA(OriginalData,Metadata, NUMOBS=100,MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);

So one or both of the symbols INPUTDATA or METADATA were not passed to the macro, causing the message you reported.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Jerrynetwork
Obsidian | Level 7

Thank you for your response! But after I changed the first two arguments to positional values, the error message is still there.

sbxkoenk
SAS Super FREQ
You can post your question to the "Mathematical Optimization, Discrete-Event Simulation and OR" board, below the Analytics header. There is a good chance that the author of the blog / paper will read your question.
Jerrynetwork
Obsidian | Level 7

Thank you!

mkeintz
PROC Star
I've moved the original post to join with the more recent post in the Optimization forum.
PaigeMiller
Diamond | Level 26

Hello, @Jerrynetwork , next time, as I requested, could you just provide "the relevant section" of the log and not hundreds of lines before and after the error. This will result in you getting better and faster answers. Thanks.

--
Paige Miller
Jerrynetwork
Obsidian | Level 7

ok, it is edited. Thank you.

Jerrynetwork
Obsidian | Level 7

Yes, there was an unbalanced parenthesis. After I added the parenthesis, the error disappeared and the macro works. But I don't quite understand the calculation, so I hope the modification matches the authors' original design. Thank you!
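On the question of what the calculation does: judging from the UpperIP and LowerIP constraints in the posted macro, the initial ScaledEta appears to be the amount by which the average moment falls outside [momLb, momUb], expressed as a fraction of momRange. The Python sketch below uses hypothetical numbers and helper names invented for this illustration (it is not the authors' code) to check that the corrected expression produces a value that satisfies both constraints.

```python
def scaled_eta(avg: float, lb: float, ub: float) -> float:
    """Initial ScaledEta for one moment, following the corrected expression."""
    mom_range = ub - lb  # momRange[mi] in the macro
    if mom_range <= 0:
        return 0.0  # mirrors "if momRange[mi] <= 0 then 0"
    return max((avg - ub) / mom_range, (lb - avg) / mom_range, 0.0)


def satisfies_ip_constraints(avg, lb, ub, eta, tol=1e-12):
    """Check the UpperIP and LowerIP constraints for one moment."""
    mom_range = ub - lb
    upper_ok = avg - mom_range * eta <= ub + tol  # UpperIP
    lower_ok = avg + mom_range * eta >= lb - tol  # LowerIP
    return upper_ok and lower_ok


# Hypothetical bounds [2, 4] and three candidate average moments:
# below the interval, inside it, and above it.
for avg in (1.0, 3.0, 5.0):
    eta = scaled_eta(avg, lb=2.0, ub=4.0)
    print(avg, eta, satisfies_ip_constraints(avg, 2.0, 4.0, eta))
```

If the check holds for the starting point, the values assigned before the solve statement form a feasible initial solution, which is presumably why the macro passes the primalin option to the MILP solver.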
