Obsidian | Level 7

## Creating Synthetic Data with SAS/OR: encountered a problem

I am interested in the post "Creating Synthetic Data with SAS/OR", published here:

https://blogs.sas.com/content/operations/2017/05/17/creating-synthetic-data-sasor/

The method consists of one macro that calls 3 inner macros. When I ran the macro to create synthetic data, an error message came up in the IP step portion. Here is the log:

```
229  *IP Step;
230  %macro IPSTEP(OUTPUTDATA, MOMENTORDER, NUMOBS, MINNUMIPCANDS, MILPMAXTIME,
231   RELOBJGAP);
232   num numSynthObs init &NUMOBS;
233   if (numSynthObs = 0) then numSynthObs = nObs;
234   num momRange {mi in MOM_IDX_SET} = momUb[mi] - momLb[mi];
235   var Assigned {IPCANDS[1]} binary;
236   var ScaledEta {MOM_IDX_SET} >= 0;
237   var MaxError >= 0;
238   minimize IpObj = MaxError;
239   con MaxCon {mi in MOM_IDX_SET}:
240   MaxError >= ScaledEta[mi];
241   con UpperIP {mi in MOM_IDX_SET}:
242   (1/numSynthObs) *
243   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob] -
244   momRange[mi]*ScaledEta[mi] <= momUb[mi];
245   con LowerIP {mi in MOM_IDX_SET}:
246   (1/numSynthObs) *
247   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob] +
248   momRange[mi]*ScaledEta[mi] >= momLb[mi];
249   con NumAssigned:
250   sum {ob in IPCANDS[1]} Assigned[ob] = numSynthObs;
251   /* Set an initial solution, then solve */
252   for {i in 1..numSynthObs} Assigned[i] = 1;
253   for {mi in MOM_IDX_SET} ScaledEta[mi] = if momRange[mi] <= 0 then 0
254   else max(((1/numSynthObs) *
255   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]
256   - momUb[mi]) / momRange[mi], momLb[mi] - (1/numSynthObs) *
257   sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0); /**/
258   MaxError = max {mi in MOM_IDX_SET: momRange[mi] > 0} ScaledEta[mi];
259   solve with MILP / maxtime=&MILPMAXTIME relobjgap=&RELOBJGAP
260   heuristics=3 primalin;
261   /* Save selected observations to data set */
262   set FINALOBS = 1..numSynthObs;
263   num finalObsVal {FINALOBS,VARS};
264   num obIdx init 0;
265   for {ob in IPCANDS[1]: Assigned[ob] > 0.5} do;
266   obIdx = obIdx + 1;
267   for {v in VARS}
268   finalObsVal[obIdx,v] = ipObVal[1,ob,varName2varIdx[v]];
269   end;
270   create data &OUTPUTDATA(drop=tmpvar) from [tmpvar]=FINALOBS
271   {j in VARS} <col(j)=finalObsVal[tmpvar,j]>;
272  %mend IPSTEP;

274   MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);
NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 5039 observations read from the data set WORK.ORIGINALDATA.
Number of IP step candidate observations: 209
NOTE: Line generated by the invoked macro "IPSTEP".
6    ,mi]*Assigned[ob])/momRange[mi],0);
-
22
200
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, *, **, +, -, .., /, <, <=,
<>, =, >, ><, >=, AND, BY, CROSS, DIFF, ELSE, IN, INTER, NOT, OR, SYMDIFF, TO, UNION,
WITHIN, ^, ^=, |, ||, ~, ~=.

ERROR 200-322: The symbol is not recognized and will be ignored.

NOTE: Problem generation will use 4 threads.
```

I have also posted this question in "Programming"; here is the link for your reference:

I would appreciate it very much if anyone can help fix the issue. Thank you!

Obsidian | Level 7

## Synthetic data

I am reading "Creating Synthetic Data with SAS/OR" from SAS support here:

https://blogs.sas.com/content/operations/2017/05/17/creating-synthetic-data-sasor/

A paper with more detail can be found here:

https://support.sas.com/resources/papers/proceedings17/1224-2017.pdf

There are 4 macros for the entire process (including 3 inner macros).

When I run the macro to create synthetic data, an error message comes up for the IP step portion. It says:

ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, *, **, +, -, .., /, <, <=,
<>, =, >, ><, >=, AND, BY, CROSS, DIFF, ELSE, IN, INTER, NOT, OR, SYMDIFF, TO, UNION,
WITHIN, ^, ^=, |, ||, ~, ~=.

ERROR 200-322: The symbol is not recognized and will be ignored.

Here is the portion the error message came from (the syntax issue is on the last line):

```
for {i in 1..numSynthObs} Assigned[i] = 1;
for {mi in MOM_IDX_SET} ScaledEta[mi] = if momRange[mi] <= 0 then 0
else max(((1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]
- momUb[mi]) / momRange[mi], momLb[mi] - (1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0);
```

The macro for the IP step is attached.

Is anyone familiar with synthetic data generation who can help fix the issue? Thank you.

Diamond | Level 26

## Re: Synthetic data

Many of us (including me) will not download Microsoft Office (or any other) attachments, as they can be a security threat.

Please do not show us portions of the log detached from the actual code used.

--
Paige Miller
Obsidian | Level 7

## Re: Synthetic data

Here is the log with the error message. Thank you!

```
272  %mend IPSTEP;

274   MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);
NOTE: Writing HTML Body file: sashtml.htm
NOTE: There were 5039 observations read from the data set WORK.ORIGINALDATA.
Number of IP step candidate observations: 209
NOTE: Line generated by the invoked macro "IPSTEP".
6    ,mi]*Assigned[ob])/momRange[mi],0);
-
22
200
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, *, **, +, -, .., /, <, <=,
<>, =, >, ><, >=, AND, BY, CROSS, DIFF, ELSE, IN, INTER, NOT, OR, SYMDIFF, TO, UNION,
WITHIN, ^, ^=, |, ||, ~, ~=.

ERROR 200-322: The symbol is not recognized and will be ignored.

NOTE: Problem generation will use 4 threads.
```

## Re: Synthetic data

This is the equivalent of a "clerical error".

Take a look at the GENDATA macro declaration, especially its parameterization:

```
%macro GENDATA(INPUTDATA, METADATA, OUTPUTDATA=SyntheticData,MOMENTORDER=3, NUMOBS=0
, MINNUMIPCANDS=0, LPBATCHSIZE=10, LPGAP=1E-3,NUMCOFORTHREADS=1, MILPMAXTIME=600, RELOBJGAP=1E-4
, ALPHA=0.95,RANDSEED=0 );
```

The first two arguments are positional.  They must be specified in the order declared, and they must be specified as values only.  All the other parameters are of the "name=value" category, which can be specified in any order, and must be specified as "name=value" pairs.

But your call to %GENDATA is

``%GENDATA(INPUTDATA=OriginalData, METADATA=Metadata, NUMOBS=100,MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);``

while it should be

``%GENDATA(OriginalData,Metadata, NUMOBS=100,MOMENTORDER=2, MILPMAXTIME=60, RANDSEED=100);``

So one or both of the symbols INPUTDATA or METADATA were not passed to the macro, causing the message you reported.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Obsidian | Level 7

## Re: Synthetic data

Thank you for your response! But after changing the first two arguments to positional values, the error message is still there.

SAS Super FREQ

## Re: Synthetic data

You can post your question to:
Mathematical Optimization, Discrete-Event Simulation and OR
below the
There is a good chance that the author of the blog / paper will read your question.
Obsidian | Level 7

Thank you!

## Re: Synthetic data

I've moved the original post to join with the more recent post in the Optimization forum.
Diamond | Level 26

## Re: Synthetic data

Hello, @Jerrynetwork, next time, as I requested, could you just provide "the relevant section" of the log and not hundreds of lines before and after the error? This will result in you getting better and faster answers. Thanks.

--
Paige Miller
Obsidian | Level 7

## Re: Synthetic data

OK, it is edited. Thank you.

## Re: Synthetic data

I think you (and the SASGF paper) have an unbalanced parenthesis in the second argument of the MAX function, where the error occurs. The closing parenthesis just before the last `/momRange[mi]` has no matching opening parenthesis:

```max(((1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]
- momUb[mi]) / momRange[mi], (momLb[mi] - (1/numSynthObs) *
sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0)```

I guess that the missing opening parenthesis should go before `momLb[mi]`, where I inserted it above. But I can't test it because SAS/OR (sadly) isn't included in my SAS license. Can you try to reproduce the results of the example using sashelp.heart in the paper after inserting that parenthesis?
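
The parenthesis balance itself can be checked without SAS/OR. The following is only a minimal Base SAS sketch, not part of the paper's macros: the DATA step and the variable names `expr`, `depth`, `ch`, and `i` are made up for illustration. It scans the MAX call exactly as posted (joined into one string) and tracks the nesting depth.

```sas
/* Hypothetical helper, not from the paper: track parenthesis depth
   across the MAX(...) call as posted, before any correction. */
data _null_;
  length expr $ 400 ch $ 1;
  expr = 'max(((1/numSynthObs) * sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob]'
      || ' - momUb[mi]) / momRange[mi], momLb[mi] - (1/numSynthObs) *'
      || ' sum {ob in IPCANDS[1]} ipObMoms[1,ob,mi]*Assigned[ob])/momRange[mi],0)';
  depth = 0;
  do i = 1 to length(expr);
    ch = substr(expr, i, 1);
    if ch = '(' then depth = depth + 1;
    else if ch = ')' then do;
      depth = depth - 1;
      /* Depth 0 before the end means the MAX( call was closed too early */
      if depth = 0 and i < length(expr) then
        put 'Depth returns to 0 at position ' i 'before the end of the expression.';
      /* Negative depth means a closing parenthesis with no partner */
      if depth < 0 then
        put 'Unmatched closing parenthesis at position ' i;
    end;
  end;
  put 'Final depth: ' depth;
run;
```

For the expression as posted, the depth should return to 0 at the parenthesis right before `/momRange[mi],0`, so the MAX call ends there and the following `,0` is what the parser rejects. The final depth ends up at -1. With the extra opening parenthesis before `momLb[mi]`, the depth only returns to 0 at the very end.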

Obsidian | Level 7

## Re: Synthetic data

Yes, there is an unbalanced parenthesis. I added a parenthesis, the error disappeared, and it works. But I don't quite understand the calculation; I hope the modification matches the author's original design. Thank you!
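
On the calculation: the following is a sketch derived only from the `UpperIP` and `LowerIP` constraints in the posted macro, so it is a plausibility check and not a statement of the author's intent. The symbols are shorthand introduced here: $n$ = `numSynthObs`, $A_{ob}$ = `Assigned[ob]`, $m_{ob}$ = `ipObMoms[1,ob,mi]`, $u$ = `momUb[mi]`, $\ell$ = `momLb[mi]`, $r$ = `momRange[mi]` $= u - \ell$, and $\eta$ = `ScaledEta[mi]`. Write the average moment of the selected observations as

$$
\bar m = \frac{1}{n} \sum_{ob} m_{ob}\, A_{ob}.
$$

The two constraints and the bound on $\eta$ are

$$
\bar m - r\,\eta \le u, \qquad \bar m + r\,\eta \ge \ell, \qquad \eta \ge 0 .
$$

For $r > 0$, solving each one for $\eta$ gives

$$
\eta \ge \frac{\bar m - u}{r}, \qquad \eta \ge \frac{\ell - \bar m}{r}, \qquad \eta \ge 0,
$$

so the smallest value of $\eta$ that is feasible for a given assignment is

$$
\eta = \max\!\left( \frac{\bar m - u}{r},\; \frac{\ell - \bar m}{r},\; 0 \right).
$$

This is the three-argument MAX in the corrected statement: with the added parenthesis, the second argument divides the whole difference $\ell - \bar m$ by $r$, mirroring the first argument. The initial `MaxError` is then the largest such $\eta$, consistent with the `MaxCon` constraint.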
