BookmarkSubscribeRSS Feed
rahulsaha2127
Fluorite | Level 6

I have this SAS code using first. and recursive if else loops which I need to convert to an SQL query. But I am not able to understand how to do it because of the retain statement and calling numerous variables in multiple loops. 

Can someone please help me to convert it to SQL query?

 

data new_data;
	set raw_data(where=(appr_date ne .));
	by loan_number period_dt;
	retain prev_appr_date prev_upd_appr_date appraisal_cross;

	if first.loan_number then do;
		if appr_date > period_dt then do;
			prev_appr_date=appr_date;
			upd_appr_date=period_dt;
			prev_upd_appr_date=upd_appr_date;
			appraisal_cross=1;
			x=1;
		end;
		else if appr_date <= period_dt then do; 
			prev_appr_date=appr_date;
			upd_appr_date=appr_date;
			prev_upd_appr_date=appr_date;
			appraisal_cross=0;
			x=2;
		end;
	end;

	else do;
		if appr_date<prev_upd_appr_date and appr_date ne prev_appr_date then do;
			upd_appr_date=max(period_dt,prev_upd_appr_date);
			appraisal_cross=1;
			x=3;
		end;
		else if appr_date=prev_appr_date and appraisal_cross=1 then do;
			upd_appr_date=prev_upd_appr_date;
			appraisal_cross=1;
			x=4;
		end;
		else if appr_date>=prev_upd_appr_date and appr_date>period_dt then do;
			upd_appr_date=period_dt;
			appraisal_cross=1;
			x=5;
		end;		
		else if appr_date>=prev_upd_appr_date and appr_date <=period_dt then do;
			upd_appr_date=appr_date;
			appraisal_cross=0;
			x=6;
		end;
	
		prev_upd_appr_date=upd_appr_date;
		prev_appr_date=appr_date;
	end;
run;

Thank you in advance. Hope to get some help on this.

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Some things in a data step have no exact equivalent in PROC SQL. In particular, first.loan_number has no SQL equivalent, and there may be others things in your code that also don't translate.

--
Paige Miller
Tom
Super User Tom
Super User

SQL is not the right tool to replicate this algorithm since it does not have the concept of FIRST or LAST.  It just operates on SETs.

 

Can you explain the purpose of the code? Or is your actual question to understand what the code is doing?

 

PS  There are no loops in that code, other than the implied loop over all of the observations being read from RAW_DATA.

rahulsaha2127
Fluorite | Level 6
I have to convert this code to Pyspark. Since I am unable to do that I was trying to convert it to sql, and then pass the sql query in Pyspark
Tom
Super User Tom
Super User

@rahulsaha2127 wrote:
I have to convert this code to Pyspark. Since I am unable to do that I was trying to convert it to sql, and then pass the sql query in Pyspark

So please explain in words what the code is doing. And perhaps better WHY it is doing that as part of the larger problem you are trying to solve.

Reeza
Super User

Use panda's instead or a method in pyspark that supports looping through each row. 

 

Or break the process down to it's steps and redesign it entirely using a SQL methodology which may involve multiple steps and reads.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1313 views
  • 0 likes
  • 4 in conversation