Desktop productivity for business analysts and programmers

data steps with nested do loops

Accepted Solution Solved
Reply
Contributor
Posts: 62
Accepted Solution

data steps with nested do loops

Hi

 

I am trying to run these do loops on this data , there are 21 variables and under each variable some credit codes apply,  my query is find the volume of these rules on these variables.

 

Three rules are : Affordability,  Character and Policy. How many application have affordability or character or policy hits.

 

Under each rule there are certain codes which determine that these rules are applied on the application.

 

I have written this code for each rule and then run proc freq to get the volumes.

 

Question 1:  Is there a better way to write do loops for these rules as I might have to add more rules later on. As there are 21 variables

to test these rules , is there any other way to combine them or not.

 

Question 2:  if I have to exclude those applications which are withdrawn,

I put where statement in the end, but when I tried to combine 02 statements then it gives me error:

like if I say where assessordecision <> "withdrawn"  and <> ' ' ;  it works if I exclude the missing value and keep it to withdrawn

only, how do I combine the statements with where.

 

code is attached as an attachment its a long one.

 

 


Accepted Solutions
Solution
‎12-04-2017 04:35 AM
Super User
Posts: 6,921

Re: data steps with nested do loops

Maybe I should have spelled out a more complete program.  Let me take what you have done so far, and nudge it in the right direction.  Here would be a better version:

 

data tk.volume_sbos_sbl;

set sastrain.application_extract;

format affordability_refer 1.;

affordability_refer = 0 ;

 

/*Capacity rule*/

 

array aff{*} RB_postBurreasoncdetable_01-RB_postBurreasoncdetable_21;

if AppSystem in ('SBLoans') then do i=1 to 21 until (affordability_refer=1);

if aff{i} in   

('RA83','RA51','RA92','RA93','RA12','RA13','RA57','RA29') then affordability_refer= 1;

end;

run;

 

The DO loop uses UNTIL so it can stop checking the remaining items in the array, as soon as it finds a match on one item.

 

View solution in original post


All Replies
Valued Guide
Posts: 538

Re: data steps with nested do loops

It is always a good idea to suspect optimizations exist if you notice repetition in your code.

 

Without trying to do the coding itself I can give some pointers:

 

1) Variable lists with a numeric postfix (but also others) are good candidates stick in an array. You can then loop over that array. See this older but stil relevant SGF paper. It is easy to add or change your arrays at a later stage.

2) These groups of codes that map to a single flag can be handled by a format. Your "if value in ('a', 'b', ...) then" can be transfored into "flag = put(value, $mappingfmt.)". The  proc format at the top of your code would be the single place to add or change mappings.

 

Hope this helps you on your way,

-- Jan

Valued Guide
Posts: 538

Re: data steps with nested do loops

Posted in reply to jklaverstijn
An entirely different approach could be to transpose your data. If you have so many similar columns consider transforming that to a dataset with just one variable plus an extra indicator for the suffix that you would remove. The aforementioned technique with arrays and do loops could be used for the purpose of transposing if proc transpose doesn't cut it. You would then not have to repeat the mapping for each variable in each row but just the single variable in each row. This wide to long transposition usually gives you a dataset that is easier to analyze so you may benefit from this approach in other scenario's as well.

- Jan.
Valued Guide
Posts: 538

Re: data steps with nested do loops

And on question 2: since your variable is character, the use of "<>" is not appropriate. Also, the variable your testing must be repeated after the AND:

 

AssessorDecision ne "Withdrawn" and AssessorDecision ne ' '

Regards,

- Jan.

Super User
Posts: 6,921

Re: data steps with nested do loops

This may or may not run faster.  But it's a whole lot easier to write, read, and QC:

 

proc format;

invalue afford 

'RA83', 'RA51', 'RA92', 'RA93', 'RA12', 'RA13', 'RA57', 'RA29' = 1  other=0;
run;
 
array aff {*} RB_PostBurReasonCdeTable_01 - RB_PostBurReasonCdeTable_21;
 
if AppSystem in ('SBLoans') then do _n_=1 to dim(aff) until (affordability_refer=1);
   affordability_refer = input(aff{_n_}, afford.);
end;
 
Similar code needed for the other two measures.  This is untested, so may need to be tweaked.
Valued Guide
Posts: 538

Re: data steps with nested do loops

Posted in reply to Astounding

Yes very much as I suggested. Kudo's for using an informat instead of a format. Wouldn't use the name _n_ for the loop index however as this is a special variable.

 

regards,

-- Jan

Super User
Posts: 6,921

Re: data steps with nested do loops

Posted in reply to jklaverstijn

Just a side note really ...

 

I often use _n_ as the index for an array.  The minor benefit is that I don't have to remember to drop it later.  But the major benefit occurs in this sort of code:

 

array nums {*} _numeric_;

do i=1 to dim(nums);

   * something, anything that changes nums{i};

end;

 

If by some oversight the list _NUMERIC_ includes the variable i, big trouble results.  Either the results are wrong (likely with no note or warning), or else an infinite loop results (if, for example, the loop changes nums{i}=0).  So it's a choice I make which to me is the lesser of evils.

Contributor
Posts: 62

Re: data steps with nested do loops

Posted in reply to Astounding

Hi

 

Thanks for your reply,  I tried the code:

 

but comes up with errors, I am expert on arrays so not sure how to fix it.

 

 

17 invalue afford

18

19 'RA83', 'RA51', 'RA92', 'RA93', 'RA12', 'RA13', 'RA57', 'RA29' = 1 other = 0;

NOTE: Informat AFFORD is already on the library.

NOTE: Informat AFFORD has been output.

19 ! .

20

19 'RA83', 'RA51', 'RA92', 'RA93', 'RA12', 'RA13', 'RA57', 'RA29' = 1 other = 0;.

_

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

21 run;

NOTE: The previous statement has been deleted.

21 ! .

22

21 run;.

_

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

23 .

24

25 array aff {*} RB_PostBurReasonCdeTable_01 - RB_PostBurReasonCdeTable_21;

NOTE: The previous statement has been deleted.

25 ! .

26

25 array aff {*} RB_PostBurReasonCdeTable_01 - RB_PostBurReasonCdeTable_21;.

_

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

27 .

28

29

30 if AppSystem in ('SBLoans') then do _n_=1 to dim(aff) until (affordability_refer=1);

NOTE: The previous statement has been deleted.

2 The SAS System 11:21 Sunday, December 3, 2017

31

32 affordability_refer = input(aff{_n_}, afford.);

___________________

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

NOTE: The previous statement has been deleted.

33

34 end;

___

180

ERROR 180-322: Statement is not valid or it is used out of proper order.

NOTE: The previous statement has been deleted.

35 run;

 

 

******************************************

 

 

I also tried this code : didn't work;

 

data tk.volume_sbos_sbl;

set sastrain.application_extract;

format affordability_refer 1.;

affordability_refer = 0 ;

 

/*Capacity rule*/

 

array aff{*} RB_postBurreasoncdetable_01-RB_postBurreasoncdetable_21;

if AppSystem in ('SBLoans') then do;

if aff{*} in ('RA83','RA51','RA92','RA93','RA12','RA13','RA57','RA29') then affordability_refer= 1;

do i = 1 to 21;

end;

end;

run;

 

Solution
‎12-04-2017 04:35 AM
Super User
Posts: 6,921

Re: data steps with nested do loops

Maybe I should have spelled out a more complete program.  Let me take what you have done so far, and nudge it in the right direction.  Here would be a better version:

 

data tk.volume_sbos_sbl;

set sastrain.application_extract;

format affordability_refer 1.;

affordability_refer = 0 ;

 

/*Capacity rule*/

 

array aff{*} RB_postBurreasoncdetable_01-RB_postBurreasoncdetable_21;

if AppSystem in ('SBLoans') then do i=1 to 21 until (affordability_refer=1);

if aff{i} in   

('RA83','RA51','RA92','RA93','RA12','RA13','RA57','RA29') then affordability_refer= 1;

end;

run;

 

The DO loop uses UNTIL so it can stop checking the remaining items in the array, as soon as it finds a match on one item.

 

Contributor
Posts: 62

Re: data steps with nested do loops

Thanks Astounding, this is perfect as I wanted, it worked perfectly...it created a new column i which I removed with drop statement . All the results are good with the proc freq statement. Just one question , I have got these numbers from yearly data , so proc freq has given me the volume that how many times these rules are hit , if I want to see a trend during the whole year how these rules are hit to this data what will be the easy option to achieve that..
Super User
Posts: 6,921

Re: data steps with nested do loops

You can only do that if you have some version of a date in the data set.  Then you could pick out the month (how you do that depends on what the date variable contains).  Give the month a value from 1 through 12, and use that as one dimension of the table in a PROC FREQ.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 405 views
  • 0 likes
  • 3 in conversation