Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 06-27-2018 12:44 AM
(840 views)

Hello, I am starting to work with proc optmodel and have a recurrent problem easy for solver but I want to do it in SAS. I have the daily temperatures during one month for 4 years and I need to create new binary variables CU1--CU4 (one for each year) defined as 1 if the temperature is in a specific range. However, I need to find the limits of that range that minimize the standard deviation between the four sum of CUs, I was trying something that probably is totally wrong because I do not know how to define the limits of the range and also if I have to put some constrains or impvar.

```
data temp;
input Log y2013 y2014 y2015 y2016;
cards;
1 28.2 28.3 18.2 27.8
2 29.6 29.7 19.9 29.2 . . .
proc optmodel;
set temp;
num y2013{temp};
num y2014{temp};
num y2015{temp};
num y2016{temp};
var a,b;
impvar CU1{p in temp}=if a<= y2013[p] <= b then 1 else 0;
impvar CU2{p in temp}=if a<= y2014[p] <= b then 1 else 0;
impvar CU3{p in temp}=if a<= y2015[p] <= b then 1 else 0;
impvar CU4{p in temp}=if a<= y2016[p] <= b then 1 else 0;
impvar S1=sum{p in temp} CU1[p]; impvar S2=sum{p in temp} CU2[p]; impvar S3=sum{p in temp} CU3[p]; impvar S4=sum{p in temp} CU4[p];
min SSE=sqrt(.25 * ((S1-(S1+S2+S3+S4)/4)**2+(S2-(S1+S2+S3+S4)/4)**2+(S3-(S1+S2+S3+S4)/4)**2+(S4-(S1+S2+S3+S4)/4)**2));
solve;
print a b;
quit;
run;
```

3 REPLIES 3

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Here is a more compact way to write your code:

```
proc optmodel;
set YEARS = 2013..2016;
set DAYS;
num temp{DAYS, YEARS};
read data temp into DAYS=[Log] {y in YEARS} <temp[Log,y]=col('y'||y)>;
var a,b;
impvar CU{p in DAYS, y in YEARS} = (if a <= temp[p,y] <= b then 1 else 0); /* even shorter: (a <= temp[p,y] <= b) */
impvar S{y in YEARS} = sum{p in DAYS} CU[p,y];
impvar mean = (sum{y in YEARS} S[y])/card(YEARS);
min SSE = sum{y in YEARS}(S[y]-mean)**2;
solve;
/* display solution */
print a b;
print CU S SSE;
quit;
```

But the optimal objective value is trivially 0. For example, if you take a to be the smallest data value and b to be the largest data value, then CU[p,y] is always 1, S[y] is always card(DAYS), and SSE = 0. Another optimal solution is to take a = b = an arbitrary value that does not appear in your data, say -999. In that case, CU[p,y] = 0, S[y] = 0, and again SSE = 0.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks for your help. However, I think something is still wrong, the point is that the length of the variable is not the same every year so if I use **a** as the smallest data value and **b** as the largest data value, the variation won't be 0. Anyways, I am now minimizing the coefficient of variation to avoid trivial solutions but I wanted to constrain that the mean is different to 0 and that **b** is greater than **a** but I get this response "The problem contains strict inequality or predicate constraints that reference non-integer variables".

Then I tried to approximately address the values of **a** and **b** as shown in the script (**a** GT 3 and **b** GT 16, in this case) but the response is just showing me the same values with some decimals according to 0 iterations.

```
proc optmodel;
set YEARS = 2013..2016;
set DAYS;
num temp{DAYS, YEARS};
read data temp into DAYS=[Log] {y in YEARS} <temp[Log,y]=col('y'||y)>;
var a>=3,b>=16;
impvar CU{p in DAYS, y in YEARS} = (if a <= temp[p,y] <= b then 1 else 0);
impvar S{y in YEARS} = sum{p in DAYS} CU[p,y];
impvar mean = (sum{y in YEARS} S[y])/card(YEARS);
minimize Coefv = (sqrt(sum{y in YEARS}((S[y]-mean)**2)/card(YEARS)))/mean;
solve;
/* display solution */
print a b;
print S mean coefv;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You cannot use strict inequalities with the NLP solver. But you could do this instead to force a positive mean:

```
con PositiveMean: mean >= 1e-6;
```

Similarly, you can force A < B like this:

```
con A_LT_B: A + 1e-6 <= B;
```

With or without these constraints, your new objective does not avoid all trivial solutions. For example, the following code yields an optimal solution with mean = 2 and objective value = 0:

```
proc optmodel;
set YEARS = 2013..2016;
set DAYS;
num temp{DAYS, YEARS};
read data temp into DAYS=[Log] {y in YEARS} <temp[Log,y]=col('y'||y)>;
var a,b;
impvar CU{p in DAYS, y in YEARS} = (if a <= temp[p,y] <= b then 1 else 0);
impvar S{y in YEARS} = sum{p in DAYS} CU[p,y];
impvar mean = (sum{y in YEARS} S[y])/card(YEARS);
minimize CoefvSquared = (sum{y in YEARS}((S[y]-mean)**2)/card(YEARS))/mean^2;
a = min {p in DAYS, y in YEARS} temp[p,y];
b = max {p in DAYS, y in YEARS} temp[p,y];
solve;
/* display solution */
print a b;
print S mean CoefvSquared;
quit;
```

Note that I squared the objective to avoid the SQRT function, which is nondifferentiable at 0. Also, the a and b values supplied as a starting solution are computed as min and max, respectively, across all observations.

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.