BookmarkSubscribeRSS Feed
OscarUvalle
Obsidian | Level 7

 

Hi, I'm having a headache computing new variables based on the observation above. 

I have variables  P_1, bad and goods in a merged table, from I am calculating as follows:

Cumbad is 1 for the first observations, then Cumbad + bad. Is the cumulative values of bad.

Cumgood is the same as cumbad.

CumPercentBad is Cumbad / total of bads, which do not know how to calculate it. I tried max(bads) not success at all. So, just put the whole number as such.

CumPercentGood the same as above.

Total accepted is cumulative of the total of bads and goods in descending order. If it the first observation then total accepted 5000, else, total accepted minus bad (from the above observation), minus good (from the above observations).

Solving the problem from total accepted should fix the problem with the rest of the variables. 

This is the code I tried.

I uploaded a txt if you want to give it a try.

 

data matrix;
	set merged;
	* by P_1;
	* retain CumBad CumGood; 
	retain _all_ ;
		  If _n_ = 1 then CumBad + Bad; else CumBad = CumBad + Bad; * works fine;
		  If _n_ = 1 then CumGood + Good; else CumGood = CumGood + Good; * works fine;
		  Cum_Perc_Bad = CumBad / 747; * Total of Bads, in fact, 747 is the max value of CumBads. works fine;
		  Cum_Perc_Good = CumGood / 4253; * Total of Gods; 4253 is the max value of CumGood. works fine;
		  Abs_diff = abs(Cum_Perc_Bad - Cum_Perc_Good); * works fine;
		  if _n_ = 1 then Total_Accepted + 5000 ; else Total_Accepted = Total_Accepted - Bad - Good; * Total accepted starts in 5000. esle Total_Accepted - bad[n-1] and good [n-1]. It fails, as it takes the current bad and good value observations instead of the above ;
		  Accepted_rate = Total_Accepted / 5000 ; * 5000 total number of goods and bads, in fact, the sum of bads and goods. Works fine;
		  If _n_ = 1 then Num_Bads_among_Acc +  747; else Num_Bads_among_Acc = Num_Bads_among_Acc - Bad; * 747 is the total os bads. It fails, as it takes bad[n] instead of bad [n-1];	  
Bad_Rate_among_Acc = Num_Bads_among_Acc / Total_Accepted;
	      If _n_ = 1 then Num_Goods_among_Acc + 4253; else Num_Goods_among_Acc = Num_Goods_among_Acc - Good ; * 4253 is the total of goods. It fails, as it takes good[n] instead of good [n-1];
		  Marg_BR = Good / Total_Accepted ; * It works ;
run; 
proc print data = matrix; format Accepted_rate Bad_Rate_among_Acc percent8.2; run;

What I am getting is:

Output.PNG

 

 The desired output would be like:

Capture.PNG

 

Thank you in advance.

3 REPLIES 3
Tom
Super User Tom
Super User

Your RETAIN statement is doing nothing.  The _ALL_ keyword is evaluated when the statement is compiled.  At that point the only variables that have been defined to the compiler are the ones being read from the input dataset MERGED.  But variables being read from source datasets are always retained.  So it does nothing.

 

Either list the actual variables that you want to retain. Or move it to the end of the data step where the _ALL_ variable list will actually include ALL of the variables.

OscarUvalle
Obsidian | Level 7

Yes Tom, I have now stated all the variables I need to retain, yet not having the desired output. 

All the values are ok for the first row but not for the rest. The problem I think is in the "else" section where I need to take the three values from the row above, instead, it is taking the values within the current row. 

 

data matrix;
	set merged;
	retain CumBad CumGood Bad Good Total_Accepted 5000 Num_Bads_among_Acc 747  Num_Goods_among_Acc 4253;
		  If _n_ = 1 then CumBad + Bad; else CumBad = CumBad + Bad; * connected to retain;
		  If _n_ = 1 then CumGood + Good; else CumGood = CumGood + Good; * connected to retain;
		  Cum_Perc_Bad = CumBad / 747; * Total of Bads;
		  Cum_Perc_Good = CumGood / 4253; * Total of Gods;
		  IF _n_ = 1 then Total_Accepted + 0 ; else Total_Accepted = Total_Accepted - Bad - Good ;
		  Abs_diff = abs(Cum_Perc_Bad - Cum_Perc_Good);
		  Accepted_rate = Total_Accepted / 5000 ; * 5000 total number of goods and bads;
		  If _n_ = 1 then Num_Bads_among_Acc +  0; else Num_Bads_among_Acc = Num_Bads_among_Acc - Bad; * connected to retain;
		  Bad_Rate_among_Acc = Num_Bads_among_Acc / Total_Accepted  ;
	      If _n_ = 1 then Num_Goods_among_Acc + 0; else Num_Goods_among_Acc = Num_Goods_among_Acc - Good ; * connected to retain;
		  Marg_BR = Good / Total_Accepted ;
run; 
proc print data = matrix; format Accepted_rate Bad_Rate_among_Acc percent8.2; run;
Tom
Super User Tom
Super User
I do not see where you are MAKING this new variable BAD that you want to retain from the previous observation.
If BAD is a variable that is being read from the input data then you need assign it to a new variable that you can retain. Or use the LAG() function. Make sure to NOT use the LAG() conditionally.
Something like:
lag_bad=lag(bad);
if xxxx then yyy = zzz + bad;
else yyy = zzz + lag_bad;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 600 views
  • 1 like
  • 2 in conversation