Hi,
I am using SAS Enterprise Guide 5.1 version 9.3.
I am trying to create a new column by looking at other columns.
I have columns: amt_reached, amt_needed, Level and each are split by ID.
Currently if the amt_reached is not greater than or equal to amt_needed, then Level = 1, otherwise it is 0. I would like a new column to count Level up to 3 and then this is the max and all values after would be at 0 unless the following line the amt_reached is greater than or equals amt_needed. Then the count would restart.
Here is what I want:
| ID | Level | amt_reached | amt_needed | What I want |
| 123 | 0 | 231.71 | 482.73 | 1 |
| 123 | 1 | 251.02 | 235.1 | 2 |
| 123 | 1 | 235.1 | 476.33 | 3 |
| 123 | 1 | 241.23 | 478.95 | 0 |
| 123 | 1 | 237.72 | 470.33 | 0 |
| 123 | 0 | 465.77 | 465.77 | 0 |
| 123 | 1 | . | 215.89 | 1 |
| 123 | 1 | 215.89 | 460.7 | 2 |
| 123 | 1 | 244.81 | 486.22 | 3 |
| 123 | 1 | 241.41 | 477.02 | 0 |
| 123 | 1 | 235.61 | 469.03 | 0 |
| 123 | 1 | 233.42 | 475.1 | 0 |
| 234 | 1 | 139.35 | 278.7 | 1 |
| 234 | 1 | 139.35 | 278.7 | 2 |
| 234 | 1 | 139.35 | 278.7 | 3 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 234 | 1 | 139.35 | 278.7 | 0 |
| 156 | 1 | . | 18.99 | 1 |
| 156 | 1 | 18.99 | 39.07 | 2 |
| 156 | 1 | 20.08 | 40.26 | 3 |
| 156 | 1 | 20.18 | 40.46 | 0 |
| 156 | 1 | 20.28 | 40.63 | 0 |
| 156 | 1 | 20.35 | 40.8 | 0 |
| 156 | 1 | 20.45 | 40.97 | 0 |
| 156 | 0 | 61.64 | 41.12 | 0 |
| 156 | 1 | . | 18.02 | 1 |
| 156 | 0 | 37.09 | 37.09 | 0 |
| 156 | 1 | . | 17.68 | 1 |
| 156 | 1 | 17.68 | 36.4 | 2 |
SAS Code:
data WORK.COUNTWITHMAX;
infile datalines dsd truncover;
input ID:BEST12. Level:BEST12. amt_reached:BEST12. amt_needed:BEST12.;
format ID BEST12. Level BEST12. amt_reached BEST12. amt_needed BEST12.;
datalines;
123 0 231.71 482.73
123 1 251.02 235.1
123 1 235.1 476.33
123 1 241.23 478.95
123 1 237.72 470.33
123 0 465.77 465.77
123 1 . 215.89
123 1 215.89 460.7
123 1 244.81 486.22
123 1 241.41 477.02
123 1 235.61 469.03
123 1 233.42 475.1
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
156 1 . 18.99
156 1 18.99 39.07
156 1 20.08 40.26
156 1 20.18 40.46
156 1 20.28 40.63
156 1 20.35 40.8
156 1 20.45 40.97
156 0 61.64 41.12
156 1 . 18.02
156 0 37.09 37.09
156 1 . 17.68
156 1 17.68 36.4
;;;;Here is what I have tried so far:
retain MaxCount;
if first.ID
then MaxCount = 0;
if Level = 0
then MaxCount = 0;
else do;
MaxCount = mod(MaxCount + 1,4);
if MaxCount = 0 and amt_reached >= amt_needed then MaxCount = 1;
end;It does not currently work, but I can't quite finish it.
Any help would be greatly appreciated
Thanks!
Per your description, you would restart with 1 in observation 5, as in observation 6 amt_reached equals amt_needed. Please clarify.
Hi KurtBremser,
Thanks for your question.
As in observation 6 column amt_reached = amt_needed the new column remains at 0. It is observation 7 where the count can restart, not observation 5. If amt_reached in observation 6 had been less than amt_needed then the new column would remain at 0 until there was on occasion where amt_reached >= amt_needed.
The count always gets to 3 and remains at 0 for the next observations until amt_reached >= amt_needed.
Hope this clarifies.
So you mean to start a new series when the previous observation satisfied the condition?
But then I do no understand the last 4 observations:
156 1 . 18.02 1 156 0 37.09 37.09 0 156 1 . 17.68 1 156 1 17.68 36.4 2
Should't those values be 1 - 2 - 1 - 2?
I think I have found it. It creates your output:
data have;
input ID Level amt_reached amt_needed;
datalines;
123 0 231.71 482.73
123 1 251.02 235.1
123 1 235.1 476.33
123 1 241.23 478.95
123 1 237.72 470.33
123 0 465.77 465.77
123 1 . 215.89
123 1 215.89 460.7
123 1 244.81 486.22
123 1 241.41 477.02
123 1 235.61 469.03
123 1 233.42 475.1
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
156 1 . 18.99
156 1 18.99 39.07
156 1 20.08 40.26
156 1 20.18 40.46
156 1 20.28 40.63
156 1 20.35 40.8
156 1 20.45 40.97
156 0 61.64 41.12
156 1 . 18.02
156 0 37.09 37.09
156 1 . 17.68
156 1 17.68 36.4
;
data want;
set have;
by id notsorted;
_level = lag(level);
if first.id then new = 1;
else do;
if level = 0 then new = 0;
if new in (1,2) then new + 1;
else if new = 3 then new = 0;
else if _level = 0 then new = 1;
end;
run;
proc print data=want noobs;
run;
Result:
amt_ amt_ ID Level reached needed _level new 123 0 231.71 482.73 . 1 123 1 251.02 235.10 0 2 123 1 235.10 476.33 1 3 123 1 241.23 478.95 1 0 123 1 237.72 470.33 1 0 123 0 465.77 465.77 1 0 123 1 . 215.89 0 1 123 1 215.89 460.70 1 2 123 1 244.81 486.22 1 3 123 1 241.41 477.02 1 0 123 1 235.61 469.03 1 0 123 1 233.42 475.10 1 0 234 1 139.35 278.70 1 1 234 1 139.35 278.70 1 2 234 1 139.35 278.70 1 3 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 234 1 139.35 278.70 1 0 156 1 . 18.99 1 1 156 1 18.99 39.07 1 2 156 1 20.08 40.26 1 3 156 1 20.18 40.46 1 0 156 1 20.28 40.63 1 0 156 1 20.35 40.80 1 0 156 1 20.45 40.97 1 0 156 0 61.64 41.12 1 0 156 1 . 18.02 0 1 156 0 37.09 37.09 1 0 156 1 . 17.68 0 1 156 1 17.68 36.40 1 2
Thanks that does work for these cases. I have found a case where it doesn't work and I think perhaps because there is no reference to the amt_reached and amt_needed columns.
In reference to your previous question for the last 4 observations, the reason why I want it to be 1,0,1,2 is because the 3rd from last observation is as follows:
| ID | Level | amt_reached | amt_needed | What I want |
| 156 | 0 | 37.09 | 37.09 | 0 |
Since the columns amt_reached and amt_needed fall are equal, then the new column will be at 0. If amt_reached < amt_needed, then it should be 1,2,3,0 for the last 4 observations. Whenever Level = 0 and amt_reached >= amt_needed then that row should = 0 and the count restarts at the next observation.
However if Level = 1 and amt_reached >= amt_needed, then the new column should equal 1 at that point. These two columns are the main decision points.
Hope that is a bit clearer.
Here is an example of when it doesn't work with your new code:
| ID | _Level | new | Level | amt_reached | amt_needed | What I want |
| 111 | 0 | 1 | 1 | 125.27 | 247.92 | 1 |
| 111 | 1 | 2 | 1 | 122.65 | 242.71 | 2 |
| 111 | 1 | 3 | 1 | 120.06 | 237.33 | 3 |
| 111 | 1 | 0 | 1 | 117.27 | 232.1 | 0 |
| 111 | 1 | 0 | 1 | 114.83 | 227.21 | 0 |
| 111 | 1 | 0 | 1 | 112.38 | 222.43 | 0 |
| 111 | 1 | 0 | 1 | 110.05 | 217.77 | 0 |
| 111 | 1 | 0 | 0 | 213.27 | 213.27 | 0 |
| 111 | 0 | 1 | 0 | 97.51 | 97.51 | 0 |
| 111 | 0 | 1 | 0 | 95.02 | 95.02 | 0 |
| 111 | 0 | 1 | 0 | 92.67 | 92.67 | 0 |
| 111 | 0 | 1 | 0 | 90.3 | 90.3 | 0 |
Thanks again!
I think a small change in the code should do it:
data want;
set have;
by id notsorted;
_level = lag(level);
if first.id then new = 1;
else do;
if level = 0 then new = 0;
else if new in (1,2) then new + 1;
else if new = 3 then new = 0;
else if _level = 0 then new = 1;
end;
run;
Note the extra "else" in line 8 that makes it a whole if-then-else-if chain.
That did the trick.
The other thing is with the code saying if first.ID then new = 1, this is causing some cases where a new ID begins to have 1 on the first observation, even if Level doesn't equal 1. I also can see cases where new = 2 but the previous observation isn't there is at 0
Please see below:
| ID | _Level | new | Level | amt_reached | amt_needed | What I want |
| 222 | . | 1 | 0 | 21 | 20.59 | 0 |
| 222 | 0 | 0 | 0 | 21 | 20.25 | 0 |
| 222 | 0 | 0 | 0 | 21 | 20.45 | 0 |
| 222 | 1 | 0 | 0 | 22.54 | 22.54 | 0 |
| 222 | 0 | 0 | 0 | 21 | 20.06 | 0 |
| 222 | 0 | 0 | 0 | 20 | 19.46 | 0 |
| 222 | 0 | 0 | 0 | 20 | 19.92 | 0 |
| 222 | 0 | 1 | 1 | 18.5 | 18.95 | 1 |
| 222 | 0 | 0 | 0 | 23 | 21.59 | 0 |
| 222 | 0 | 0 | 0 | 30 | 19.14 | 0 |
| 222 | 0 | 0 | 0 | 21 | 20.97 | 0 |
| 222 | 0 | 0 | 0 | 21 | 19.71 | 0 |
| 333 | 0 | 1 | 0 | 8.99 | 7.2 | 0 |
| 333 | 0 | 0 | 0 | 7.69 | 5.13 | 0 |
| 333 | 0 | 0 | 0 | 5.44 | 5 | 0 |
| 333 | 0 | 0 | 0 | 7.54 | 5 | 0 |
| 333 | 0 | 0 | 0 | 10.51 | 5.53 | 0 |
| 333 | 0 | 0 | 0 | 10 | 5.9 | 0 |
| 333 | 0 | 0 | 0 | 7.67 | 5.13 | 0 |
| 333 | 1 | 0 | 0 | 6.77 | 5.34 | 0 |
| 333 | 0 | 0 | 0 | 10 | 5 | 0 |
| 333 | 0 | 1 | 1 | 6 | 5 | 1 |
| 333 | 0 | 0 | 0 | 7.36 | 5 | 0 |
| 333 | 0 | 0 | 0 | 7 | 5 | 0 |
| 444 | 1 | 2 | 1 | . | 5 | 1 |
| 444 | 1 | 0 | 0 | 77.7 | 10 | 0 |
| 444 | 0 | 1 | 1 | 20 | 5 | 1 |
| 444 | 0 | 0 | 0 | 20 | 5 | 0 |
| 444 | 0 | 1 | 0 | 20 | 12.09 | 0 |
| 444 | 0 | 1 | 1 | . | 5 | 1 |
| 444 | 0 | 0 | 0 | 20 | 5.52 | 0 |
| 444 | 1 | 0 | 0 | 20 | 10.34 | 0 |
| 444 | 0 | 1 | 1 | 20 | 5 | 1 |
| 444 | 0 | 0 | 0 | . | 0 | 0 |
| 444 | 1 | 2 | 1 | 50 | 13.8 | 1 |
Sorry, I realised I had ordered the data incorrectly for my last comment. The issue with new = 2 is fine. The issue is only with the starting ID = 1.
Apologies.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.