Dear community members,
I'm working on a selection plan that will be programmed in SAS. There will be various zones, but for simplicity I add here example with four zones (ZONE), each having a preset fixed amount of observations (FIXED).
I have one data table called ZONES with variables ZONE and FIXED with following values:
ZONE | FIXED |
1 | 200 |
2 | 120 |
3 | 80 |
4 | 100 |
Then I have second data table called RESULT with variables ZONE, CLASS and TARGET with following values:
ZONE | CLASS | TARGET |
1 | A | 32.4677 |
1 | B | 36.0680 |
1 | C | 62.6852 |
1 | D | 47.1945 |
1 | E | 31.5844 |
2 | A | 14.5252 |
2 | B | 14.1779 |
2 | C | 29.5432 |
2 | D | 6.7894 |
2 | E | 9.9574 |
3 | A | 51.1834 |
3 | B | 0.6335 |
3 | C | 36.9613 |
3 | D | 5.8361 |
3 | E | 15.3799 |
3 | F | 3.7684 |
4 | A | 39.8149 |
4 | B | 0.4815 |
4 | C | 4.7643 |
4 | D | 2.9040 |
4 | E | 12.9612 |
4 | F | 3.5365 |
Then the target estimate (TARGET) needs to be optimized (with a regulated method). A new variable OPT needs to be derived.
To calculate OPT, I need subtotals of variable TARGET by ZONE. I have done it by using this sample code:
PROC TABULATE data=result; CLASS ZONE /*Classification variable 1, 2, 3 or 4*/; VAR TARGET /*Target variable*/; TABLE ZONE (all), (TARGET)*(SUM) /*Subtotals by ZONE of TARGET*/; RUN;
The result is:
1 | 209.9999 |
2 | 74.9931 |
3 | 113.7625 |
4 | 64.4625 |
The new derived OPT is calculated for each ZONE. For example ZONE=1:
OPT = FIXED*TARGET divided by subtotal of TARGET in ZONE=1 (209.9999). Sample code I have cut below, I have used numbers either given in data table ZONES (for instance 100) or calculated subtotals (for instance 64.4625).
data result; set result; if ZONE=1 then OPT=200*TARGET/209.9999 ; if ZONE=2 then OPT=120*TARGET/74.9931 ; if ZONE=3 then OPT=80*TARGET/113.7625 ; if ZONE=4 then OPT=100*TARGET/64.4625 ; /* FIXED*TARGET divided by subtotals by ZONE of TARGET */ run;
The result is like this:
ZONE | CLASS | TARGET | OPT |
1 | A | 32.4677 | 30.9216 |
1 | B | 36.0680 | 34.3505 |
1 | C | 62.6852 | 59.7002 |
1 | D | 47.1945 | 44.9471 |
1 | E | 31.5844 | 30.0804 |
2 | A | 14.5252 | 23.2425 |
2 | B | 14.1779 | 22.6867 |
2 | C | 29.5432 | 47.2735 |
2 | D | 6.7894 | 10.8640 |
2 | E | 9.9574 | 15.9333 |
3 | A | 51.1834 | 35.9932 |
3 | B | 0.6335 | 0.4455 |
3 | C | 36.9613 | 25.9919 |
3 | D | 5.8361 | 4.1040 |
3 | E | 15.3799 | 10.8154 |
3 | F | 3.7684 | 2.6500 |
4 | A | 39.8149 | 61.7645 |
4 | B | 0.4815 | 0.7469 |
4 | C | 4.7643 | 7.3908 |
4 | D | 2.9040 | 4.5050 |
4 | E | 12.9612 | 20.1066 |
4 | F | 3.5365 | 5.4862 |
I have trouble finding a way to refer in tables with FIXED and calculated subtotals so that I could generate a new variable OPT that is different for each ZONE. I highlighted the issues:
data result; set result; if ZONE=1 then OPT=200*TARGET/209.9999 ; if ZONE=2 then OPT=120*TARGET/74.9931 ; if ZONE=3 then OPT=80*TARGET/113.7625 ; if ZONE=4 then OPT=100*TARGET/64.4625 ; /* FIXED*TARGET divided by subtotals by ZONE of TARGET */ run;
Kindly, help me out to find a solution.
1. The code below shows how your input data should be posted.
2. The code below does the calculation required.
3. Does this make sense?
data ZONES;
input ZONE FIXED;
cards;
1 200
2 120
3 80
4 100
run;
data RESULT;
input ZONE CLASS $ TARGET ;
cards;
1 A 32.4677
1 B 36.0680
1 C 62.6852
1 D 47.1945
1 E 31.5844
2 A 14.5252
2 B 14.1779
2 C 29.5432
2 D 6.7894
2 E 9.9574
3 A 51.1834
3 B 0.6335
3 C 36.9613
3 D 5.8361
3 E 15.3799
3 F 3.7684
4 A 39.8149
4 B 0.4815
4 C 4.7643
4 D 2.9040
4 E 12.9612
4 F 3.5365
run;
proc sql;
create table WANT as
select b.*
, FIXED*TARGET/SUM_BY_ID as OPT
from ZONES a
inner join
(select *, sum(TARGET) as SUM_BY_ID
from RESULT
group by ZONE) b
on a.ZONE = b.ZONE;
quit;
ZONE | CLASS | TARGET | OPT |
---|---|---|---|
1 | A | 32.4677 | 30.92165 |
1 | B | 36.068 | 34.35051 |
1 | C | 62.6852 | 59.70025 |
1 | E | 31.5844 | 30.08041 |
1 | D | 47.1945 | 44.94719 |
2 | B | 14.1779 | 22.68673 |
2 | A | 14.5252 | 23.24246 |
2 | E | 9.9574 | 15.93331 |
2 | D | 6.7894 | 10.86404 |
2 | C | 29.5432 | 47.27347 |
3 | D | 5.8361 | 4.104055 |
3 | E | 15.3799 | 10.81543 |
3 | C | 36.9613 | 25.99188 |
3 | B | 0.6335 | 0.445489 |
3 | A | 51.1834 | 35.99313 |
3 | F | 3.7684 | 2.65001 |
4 | F | 3.5365 | 5.486144 |
4 | E | 12.9612 | 20.1066 |
4 | D | 2.904 | 4.504952 |
4 | C | 4.7643 | 7.39082 |
4 | B | 0.4815 | 0.746947 |
4 | A | 39.8149 | 61.76453 |
I don't understand the question.
Where are TARGET and 209.9999 coming from?
Hello,
TARGET is variable that includes calculated estimates. TARGET variable is shown in the attached table. The figure 209.9999 is a subtotal for ZONE=1. The sum of red TARGET values in table below equals 209.9999.
ZONE | CLASS | TARGET | OPT |
1 | A | 32,4677 | 30,9216 |
1 | B | 36,0680 | 34,3505 |
1 | C | 62,6852 | 59,7002 |
1 | D | 47,1945 | 44,9471 |
1 | E | 31,5844 | 30,0804 |
2 | A | 14,5252 | 23,2425 |
2 | B | 14,1779 | 22,6867 |
2 | C | 29,5432 | 47,2735 |
2 | D | 6,7894 | 10,8640 |
2 | E | 9,9574 | 15,9333 |
3 | A | 51,1834 | 35,9932 |
3 | B | 0,6335 | 0,4455 |
3 | C | 36,9613 | 25,9919 |
3 | D | 5,8361 | 4,1040 |
3 | E | 15,3799 | 10,8154 |
3 | F | 3,7684 | 2,6500 |
4 | A | 39,8149 | 61,7645 |
4 | B | 0,4815 | 0,7469 |
4 | C | 4,7643 | 7,3908 |
4 | D | 2,9040 | 4,5050 |
4 | E | 12,9612 | 20,1066 |
4 | F | 3,5365 | 5,4862 |
>The sum of red TARGET values in table below equals 209.9999.
It isn't.
The sum is 2,099,998.
Oh I see. The comma is the decimal separator then?
And what data set contains the TARGET variable?
Hello,
data set 'result' contains variable TARGET. This data set is clipped into original post.
Hello,
TARGET |
32.4677 |
36.0680 |
62.6852 |
47.1945 |
31.5844 |
Sum of these figures equals 209.9999.
So table ZONES contains variables ZONE FIXED TARGET and CLASS? And OPT is derived?
Why this then: ?
data zones;
input ZONE FIXED;
datalines;
1 200
2 120
3 80
4 100
;
Please don't let us look for the pieces of the puzzle.
Provide a clear and comprehensive question.
Also use the {i} or running man icon to post code. And test the code you provide by copying and pasting it back into SAS.
Awaiting a replacement for the data step above. 🙂
I updated the original post to clarify your questions.
1. The code below shows how your input data should be posted.
2. The code below does the calculation required.
3. Does this make sense?
data ZONES;
input ZONE FIXED;
cards;
1 200
2 120
3 80
4 100
run;
data RESULT;
input ZONE CLASS $ TARGET ;
cards;
1 A 32.4677
1 B 36.0680
1 C 62.6852
1 D 47.1945
1 E 31.5844
2 A 14.5252
2 B 14.1779
2 C 29.5432
2 D 6.7894
2 E 9.9574
3 A 51.1834
3 B 0.6335
3 C 36.9613
3 D 5.8361
3 E 15.3799
3 F 3.7684
4 A 39.8149
4 B 0.4815
4 C 4.7643
4 D 2.9040
4 E 12.9612
4 F 3.5365
run;
proc sql;
create table WANT as
select b.*
, FIXED*TARGET/SUM_BY_ID as OPT
from ZONES a
inner join
(select *, sum(TARGET) as SUM_BY_ID
from RESULT
group by ZONE) b
on a.ZONE = b.ZONE;
quit;
ZONE | CLASS | TARGET | OPT |
---|---|---|---|
1 | A | 32.4677 | 30.92165 |
1 | B | 36.068 | 34.35051 |
1 | C | 62.6852 | 59.70025 |
1 | E | 31.5844 | 30.08041 |
1 | D | 47.1945 | 44.94719 |
2 | B | 14.1779 | 22.68673 |
2 | A | 14.5252 | 23.24246 |
2 | E | 9.9574 | 15.93331 |
2 | D | 6.7894 | 10.86404 |
2 | C | 29.5432 | 47.27347 |
3 | D | 5.8361 | 4.104055 |
3 | E | 15.3799 | 10.81543 |
3 | C | 36.9613 | 25.99188 |
3 | B | 0.6335 | 0.445489 |
3 | A | 51.1834 | 35.99313 |
3 | F | 3.7684 | 2.65001 |
4 | F | 3.5365 | 5.486144 |
4 | E | 12.9612 | 20.1066 |
4 | D | 2.904 | 4.504952 |
4 | C | 4.7643 | 7.39082 |
4 | B | 0.4815 | 0.746947 |
4 | A | 39.8149 | 61.76453 |
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.