DATA Step, Macro, Functions and more

if then efficiency / options

Accepted Solution Solved
Reply
Valued Guide
Posts: 860
Accepted Solution

if then efficiency / options

I have the code below, it works fine but with a larger dataset it will be innefficient.  If anyone can help get the same results, I want to do something if else/then but am not sure how to do it with the multiple filters.  The segment_id and product_type will change each line, but I only care about the one per line.  The selection should account for the entire population.

 

_0005 = missing(segment_id);
if substr(segment_id,1,2) in ('01','02','03','04','05') then do;
_0006 = 0;else _0006 = 1;
if substr(segment_id,1,2) = '01' and product_type = 'Secured-Revolving' then
_0007 = 1;else _0007 = 1;
if substr(segment_id,1,2) = '02' and product_type = 'Secured-Installment' then
_0008 = 0;else _0008 = 1;
if substr(segment_id,1,2) = '03' and product_type = 'Unsecured-Revolving' then
_0009 = 0;else _0009 = 1;
if substr(segment_id,1,2) = '04' and product_type = 'Unsecured-Installment' then
_0010 = 0;else _0010 = 1;
if substr(segment_id,1,2) = '05' and product_type = 'Overdraft' then
_0011 = 0;else _0011 = 1;


Accepted Solutions
Solution
‎03-17-2016 11:45 AM
Respected Advisor
Posts: 4,925

Re: if then efficiency / options

Posted in reply to Steelers_In_DC
_0005 = missing(segment_id);
_0006 = 0; /* Might not be necessary since Missing and 0 are both FALSE */
select (substr(segment_id, 1, 2));
	when ("01") _0007 = not (product_type = 'Secured-Revolving');
	when ("02") _0008 = not (product_type = 'Secured-Installment');
	when ("03") _0009 = not (product_type = 'Unsecured-Revolving');
	when ("04") _0010 = not (product_type = 'Unsecured-Installment');
	when ("05") _0011 = not (product_type = 'Overdraft');
	otherwise _0006 = 1;
end;

(untested)

PG

View solution in original post


All Replies
Super User
Posts: 5,509

Re: if then efficiency / options

Posted in reply to Steelers_In_DC

Your biggest cost here isn't the IF/THEN logic.  It's the repeated use of SUBSTR.  You would do much better along these lines:

 

length newvar $ 2;

newvar = substr;

drop newvar;

_0005 = missing(segment_id);
if newvar in ('01','02','03','04','05') then do;
   _0006 = 0;else _0006 = 1;
   if newvar = '01' and product_type = 'Secured-Revolving' then
   _0007 = 1;else _0007 = 1;
   if newvar = '02' and product_type = 'Secured-Installment' then
   _0008 = 0;else _0008 = 1;
   if newvar = '03' and product_type = 'Unsecured-Revolving' then
   _0009 = 0;else _0009 = 1;
   if newvar = '04' and product_type = 'Unsecured-Installment' then
   _0010 = 0;else _0010 = 1;
   if newvar = '05' and product_type = 'Overdraft' then
   _0011 = 0;else _0011 = 1;

end;

else _0006 = 1;

 

I'm not sure if I got all the assignments right for all the flags, but this would definitely improve the speed.

 

Good luck.

Super User
Posts: 5,429

Re: if then efficiency / options

Posted in reply to Steelers_In_DC
What I ser from a data management perspective is still lot of hard coded logic. I would favour a lookup table with substr of segment and product type. Use that with a join or a format.
Another performance killer is I/O. Use s code for product type instead. Required less space, and makes it easier to rename in the future.
Are you going to use you flags for calculation? If so set the length to 3. Else char 1.
Data never sleeps
Solution
‎03-17-2016 11:45 AM
Respected Advisor
Posts: 4,925

Re: if then efficiency / options

Posted in reply to Steelers_In_DC
_0005 = missing(segment_id);
_0006 = 0; /* Might not be necessary since Missing and 0 are both FALSE */
select (substr(segment_id, 1, 2));
	when ("01") _0007 = not (product_type = 'Secured-Revolving');
	when ("02") _0008 = not (product_type = 'Secured-Installment');
	when ("03") _0009 = not (product_type = 'Unsecured-Revolving');
	when ("04") _0010 = not (product_type = 'Unsecured-Installment');
	when ("05") _0011 = not (product_type = 'Overdraft');
	otherwise _0006 = 1;
end;

(untested)

PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 270 views
  • 2 likes
  • 4 in conversation