BookmarkSubscribeRSS Feed
1SasUser1
Calcite | Level 5

Hi,

 

I am using SAS Enterprise Guide 5.1 version 9.3.

 

I am trying to create a new column by looking at other columns.

 

I have columns: amt_reached, amt_needed, Level and each are split by ID.

 

Currently if the amt_reached is not greater than or equal to amt_needed, then Level = 1, otherwise it is 0. I would like a new column to count Level up to 3 and then this is the max and all values after would be at 0 unless the following line the amt_reached is greater than or equals amt_needed. Then the count would restart.

 

Here is what I want:

IDLevelamt_reachedamt_neededWhat I want
1230231.71482.731
1231251.02235.12
1231235.1476.333
1231241.23478.950
1231237.72470.330
1230465.77465.770
1231.215.891
1231215.89460.72
1231244.81486.223
1231241.41477.020
1231235.61469.030
1231233.42475.10
2341139.35278.71
2341139.35278.72
2341139.35278.73
2341139.35278.70
2341139.35278.70
2341139.35278.70
2341139.35278.70
2341139.35278.70
2341139.35278.70
2341139.35278.70
2341139.35278.70
2341139.35278.70
1561.18.991
156118.9939.072
156120.0840.263
156120.1840.460
156120.2840.630
156120.3540.80
156120.4540.970
156061.6441.120
1561.18.021
156037.0937.090
1561.17.681
156117.6836.42

 

SAS Code:

data WORK.COUNTWITHMAX;
  infile datalines dsd truncover;
  input ID:BEST12. Level:BEST12. amt_reached:BEST12. amt_needed:BEST12.;
  format ID BEST12. Level BEST12. amt_reached BEST12. amt_needed BEST12.;
datalines;
123 0 231.71 482.73
123 1 251.02 235.1
123 1 235.1 476.33
123 1 241.23 478.95
123 1 237.72 470.33
123 0 465.77 465.77
123 1 . 215.89
123 1 215.89 460.7
123 1 244.81 486.22
123 1 241.41 477.02
123 1 235.61 469.03
123 1 233.42 475.1
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
156 1 . 18.99
156 1 18.99 39.07
156 1 20.08 40.26
156 1 20.18 40.46
156 1 20.28 40.63
156 1 20.35 40.8
156 1 20.45 40.97
156 0 61.64 41.12
156 1 . 18.02
156 0 37.09 37.09
156 1 . 17.68
156 1 17.68 36.4
;;;;

Here is what I have tried so far:

retain MaxCount;
	if first.ID	
		then MaxCount = 0;
	if Level = 0
		then MaxCount = 0;
	else do;
		MaxCount = mod(MaxCount + 1,4);
if MaxCount = 0 and amt_reached >= amt_needed then MaxCount = 1;


end;

It does not currently work, but I can't quite finish it.

 

Any help would be greatly appreciated

 

Thanks!

 

8 REPLIES 8
1SasUser1
Calcite | Level 5

Hi KurtBremser,

 

Thanks for your question.

 

As in observation 6 column amt_reached = amt_needed the new column remains at 0. It is observation 7 where the count can restart, not observation 5. If amt_reached in observation 6 had been less than amt_needed then the new column would remain at 0 until there was on occasion where amt_reached >= amt_needed.

 

The count always gets to 3 and remains at 0 for the next observations until amt_reached >= amt_needed.

 

Hope this clarifies.

Kurt_Bremser
Super User

So you mean to start a new series when the previous observation satisfied the condition?

But then I do no understand the last 4 observations:

156	1	.	18.02	1
156	0	37.09	37.09	0
156	1	.	17.68	1
156	1	17.68	36.4	2

Should't those values be 1 - 2 - 1 - 2?

Kurt_Bremser
Super User

I think I have found it. It creates your output:

data have;
input ID Level amt_reached amt_needed;
datalines;
123 0 231.71 482.73
123 1 251.02 235.1
123 1 235.1 476.33
123 1 241.23 478.95
123 1 237.72 470.33
123 0 465.77 465.77
123 1 . 215.89
123 1 215.89 460.7
123 1 244.81 486.22
123 1 241.41 477.02
123 1 235.61 469.03
123 1 233.42 475.1
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
234 1 139.35 278.7
156 1 . 18.99
156 1 18.99 39.07
156 1 20.08 40.26
156 1 20.18 40.46
156 1 20.28 40.63
156 1 20.35 40.8
156 1 20.45 40.97
156 0 61.64 41.12
156 1 . 18.02
156 0 37.09 37.09
156 1 . 17.68
156 1 17.68 36.4
;

data want;
set have;
by id notsorted;
_level = lag(level);
if first.id then new = 1;
else do;
  if level = 0 then new = 0;
  if new in (1,2) then new + 1;
  else if new = 3 then new = 0;
  else if _level = 0 then new = 1;
end;
run;

proc print data=want noobs;
run;

Result:

                  amt_      amt_
 ID    Level    reached    needed    _level    new

123      0       231.71    482.73       .       1 
123      1       251.02    235.10       0       2 
123      1       235.10    476.33       1       3 
123      1       241.23    478.95       1       0 
123      1       237.72    470.33       1       0 
123      0       465.77    465.77       1       0 
123      1          .      215.89       0       1 
123      1       215.89    460.70       1       2 
123      1       244.81    486.22       1       3 
123      1       241.41    477.02       1       0 
123      1       235.61    469.03       1       0 
123      1       233.42    475.10       1       0 
234      1       139.35    278.70       1       1 
234      1       139.35    278.70       1       2 
234      1       139.35    278.70       1       3 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
234      1       139.35    278.70       1       0 
156      1          .       18.99       1       1 
156      1        18.99     39.07       1       2 
156      1        20.08     40.26       1       3 
156      1        20.18     40.46       1       0 
156      1        20.28     40.63       1       0 
156      1        20.35     40.80       1       0 
156      1        20.45     40.97       1       0 
156      0        61.64     41.12       1       0 
156      1          .       18.02       0       1 
156      0        37.09     37.09       1       0 
156      1          .       17.68       0       1 
156      1        17.68     36.40       1       2 
1SasUser1
Calcite | Level 5

Thanks that does work for these cases. I have found a case where it doesn't work and I think perhaps because there is no reference to the amt_reached and amt_needed columns.

 

In reference to your previous question for the last 4 observations, the reason why I want it to be 1,0,1,2 is because the 3rd from last observation is as follows:

 

IDLevelamt_reachedamt_neededWhat I want
156037.0937.090

 

Since the columns amt_reached and amt_needed fall are equal, then the new column will be at 0. If amt_reached < amt_needed, then it should be 1,2,3,0 for the last 4 observations. Whenever Level = 0 and amt_reached >= amt_needed then that row should = 0 and the count restarts at the next observation. 

 

However if Level = 1 and amt_reached >= amt_needed, then the new column should equal 1 at that point. These two columns are the main decision points.

 

Hope that is a bit clearer.

 

Here is an example of when it doesn't work with your new code:

 

ID_LevelnewLevelamt_reachedamt_neededWhat I want
111011125.27247.921
111121122.65242.712
111131120.06237.333
111101117.27232.10
111101114.83227.210
111101112.38222.430
111101110.05217.770
111100213.27213.270
11101097.5197.510
11101095.0295.020
11101092.6792.670
11101090.390.30

 

Thanks again!

Kurt_Bremser
Super User

I think a small change in the code should do it:

data want;
set have;
by id notsorted;
_level = lag(level);
if first.id then new = 1;
else do;
  if level = 0 then new = 0;
  else if new in (1,2) then new + 1;
  else if new = 3 then new = 0;
  else if _level = 0 then new = 1;
end;
run;

Note the extra "else" in line 8 that makes it a whole if-then-else-if chain.

1SasUser1
Calcite | Level 5

That did the trick.

 

The other thing is with the code saying if first.ID then new = 1, this is causing some cases where a new ID begins to have 1 on the first observation, even if Level doesn't equal 1. I also can see cases where new = 2 but the previous observation isn't there is at 0

 

Please see below:

 

ID_LevelnewLevelamt_reachedamt_neededWhat I want
222.102120.590
2220002120.250
2220002120.450
22210022.5422.540
2220002120.060
2220002019.460
2220002019.920
22201118.518.951
2220002321.590
2220003019.140
2220002120.970
2220002119.710
3330108.997.20
3330007.695.130
3330005.4450
3330007.5450
33300010.515.530
333000105.90
3330007.675.130
3331006.775.340
3330001050
333011651
3330007.3650
333000750
444121.51
44410077.7100
4440112051
4440002050
4440102012.090
444011.51
444000205.520
4441002010.340
4440112051
444000.00
4441215013.81

 

1SasUser1
Calcite | Level 5

Sorry, I realised I had ordered the data incorrectly for my last comment. The issue with new = 2 is fine. The issue is only with the starting ID = 1.

 

Apologies.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 815 views
  • 0 likes
  • 2 in conversation