BookmarkSubscribeRSS Feed
OscarUvalle
Obsidian | Level 7

 

Hi, I'm having a headache computing new variables based on the observation above. 

I have variables  P_1, bad and goods in a merged table, from I am calculating as follows:

Cumbad is 1 for the first observations, then Cumbad + bad. Is the cumulative values of bad.

Cumgood is the same as cumbad.

CumPercentBad is Cumbad / total of bads, which do not know how to calculate it. I tried max(bads) not success at all. So, just put the whole number as such.

CumPercentGood the same as above.

Total accepted is cumulative of the total of bads and goods in descending order. If it the first observation then total accepted 5000, else, total accepted minus bad (from the above observation), minus good (from the above observations).

Solving the problem from total accepted should fix the problem with the rest of the variables. 

This is the code I tried.

I uploaded a txt if you want to give it a try.

 

data matrix;
	set merged;
	* by P_1;
	* retain CumBad CumGood; 
	retain _all_ ;
		  If _n_ = 1 then CumBad + Bad; else CumBad = CumBad + Bad; * works fine;
		  If _n_ = 1 then CumGood + Good; else CumGood = CumGood + Good; * works fine;
		  Cum_Perc_Bad = CumBad / 747; * Total of Bads, in fact, 747 is the max value of CumBads. works fine;
		  Cum_Perc_Good = CumGood / 4253; * Total of Gods; 4253 is the max value of CumGood. works fine;
		  Abs_diff = abs(Cum_Perc_Bad - Cum_Perc_Good); * works fine;
		  if _n_ = 1 then Total_Accepted + 5000 ; else Total_Accepted = Total_Accepted - Bad - Good; * Total accepted starts in 5000. esle Total_Accepted - bad[n-1] and good [n-1]. It fails, as it takes the current bad and good value observations instead of the above ;
		  Accepted_rate = Total_Accepted / 5000 ; * 5000 total number of goods and bads, in fact, the sum of bads and goods. Works fine;
		  If _n_ = 1 then Num_Bads_among_Acc +  747; else Num_Bads_among_Acc = Num_Bads_among_Acc - Bad; * 747 is the total os bads. It fails, as it takes bad[n] instead of bad [n-1];	  
Bad_Rate_among_Acc = Num_Bads_among_Acc / Total_Accepted;
	      If _n_ = 1 then Num_Goods_among_Acc + 4253; else Num_Goods_among_Acc = Num_Goods_among_Acc - Good ; * 4253 is the total of goods. It fails, as it takes good[n] instead of good [n-1];
		  Marg_BR = Good / Total_Accepted ; * It works ;
run; 
proc print data = matrix; format Accepted_rate Bad_Rate_among_Acc percent8.2; run;

What I am getting is:

Output.PNG

 

 The desired output would be like:

Capture.PNG

 

Thank you in advance.

3 REPLIES 3
Tom
Super User Tom
Super User

Your RETAIN statement is doing nothing.  The _ALL_ keyword is evaluated when the statement is compiled.  At that point the only variables that have been defined to the compiler are the ones being read from the input dataset MERGED.  But variables being read from source datasets are always retained.  So it does nothing.

 

Either list the actual variables that you want to retain. Or move it to the end of the data step where the _ALL_ variable list will actually include ALL of the variables.

OscarUvalle
Obsidian | Level 7

Yes Tom, I have now stated all the variables I need to retain, yet not having the desired output. 

All the values are ok for the first row but not for the rest. The problem I think is in the "else" section where I need to take the three values from the row above, instead, it is taking the values within the current row. 

 

data matrix;
	set merged;
	retain CumBad CumGood Bad Good Total_Accepted 5000 Num_Bads_among_Acc 747  Num_Goods_among_Acc 4253;
		  If _n_ = 1 then CumBad + Bad; else CumBad = CumBad + Bad; * connected to retain;
		  If _n_ = 1 then CumGood + Good; else CumGood = CumGood + Good; * connected to retain;
		  Cum_Perc_Bad = CumBad / 747; * Total of Bads;
		  Cum_Perc_Good = CumGood / 4253; * Total of Gods;
		  IF _n_ = 1 then Total_Accepted + 0 ; else Total_Accepted = Total_Accepted - Bad - Good ;
		  Abs_diff = abs(Cum_Perc_Bad - Cum_Perc_Good);
		  Accepted_rate = Total_Accepted / 5000 ; * 5000 total number of goods and bads;
		  If _n_ = 1 then Num_Bads_among_Acc +  0; else Num_Bads_among_Acc = Num_Bads_among_Acc - Bad; * connected to retain;
		  Bad_Rate_among_Acc = Num_Bads_among_Acc / Total_Accepted  ;
	      If _n_ = 1 then Num_Goods_among_Acc + 0; else Num_Goods_among_Acc = Num_Goods_among_Acc - Good ; * connected to retain;
		  Marg_BR = Good / Total_Accepted ;
run; 
proc print data = matrix; format Accepted_rate Bad_Rate_among_Acc percent8.2; run;
Tom
Super User Tom
Super User
I do not see where you are MAKING this new variable BAD that you want to retain from the previous observation.
If BAD is a variable that is being read from the input data then you need assign it to a new variable that you can retain. Or use the LAG() function. Make sure to NOT use the LAG() conditionally.
Something like:
lag_bad=lag(bad);
if xxxx then yyy = zzz + bad;
else yyy = zzz + lag_bad;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 646 views
  • 1 like
  • 2 in conversation