I have the code below, it works fine but with a larger dataset it will be innefficient. If anyone can help get the same results, I want to do something if else/then but am not sure how to do it with the multiple filters. The segment_id and product_type will change each line, but I only care about the one per line. The selection should account for the entire population.
_0005 = missing(segment_id);
if substr(segment_id,1,2) in ('01','02','03','04','05') then do;
_0006 = 0;else _0006 = 1;
if substr(segment_id,1,2) = '01' and product_type = 'Secured-Revolving' then
_0007 = 1;else _0007 = 1;
if substr(segment_id,1,2) = '02' and product_type = 'Secured-Installment' then
_0008 = 0;else _0008 = 1;
if substr(segment_id,1,2) = '03' and product_type = 'Unsecured-Revolving' then
_0009 = 0;else _0009 = 1;
if substr(segment_id,1,2) = '04' and product_type = 'Unsecured-Installment' then
_0010 = 0;else _0010 = 1;
if substr(segment_id,1,2) = '05' and product_type = 'Overdraft' then
_0011 = 0;else _0011 = 1;
_0005 = missing(segment_id);
_0006 = 0; /* Might not be necessary since Missing and 0 are both FALSE */
select (substr(segment_id, 1, 2));
when ("01") _0007 = not (product_type = 'Secured-Revolving');
when ("02") _0008 = not (product_type = 'Secured-Installment');
when ("03") _0009 = not (product_type = 'Unsecured-Revolving');
when ("04") _0010 = not (product_type = 'Unsecured-Installment');
when ("05") _0011 = not (product_type = 'Overdraft');
otherwise _0006 = 1;
end;
(untested)
Your biggest cost here isn't the IF/THEN logic. It's the repeated use of SUBSTR. You would do much better along these lines:
length newvar $ 2;
newvar = substr;
drop newvar;
_0005 = missing(segment_id);
if newvar in ('01','02','03','04','05') then do;
_0006 = 0;else _0006 = 1;
if newvar = '01' and product_type = 'Secured-Revolving' then
_0007 = 1;else _0007 = 1;
if newvar = '02' and product_type = 'Secured-Installment' then
_0008 = 0;else _0008 = 1;
if newvar = '03' and product_type = 'Unsecured-Revolving' then
_0009 = 0;else _0009 = 1;
if newvar = '04' and product_type = 'Unsecured-Installment' then
_0010 = 0;else _0010 = 1;
if newvar = '05' and product_type = 'Overdraft' then
_0011 = 0;else _0011 = 1;
end;
else _0006 = 1;
I'm not sure if I got all the assignments right for all the flags, but this would definitely improve the speed.
Good luck.
_0005 = missing(segment_id);
_0006 = 0; /* Might not be necessary since Missing and 0 are both FALSE */
select (substr(segment_id, 1, 2));
when ("01") _0007 = not (product_type = 'Secured-Revolving');
when ("02") _0008 = not (product_type = 'Secured-Installment');
when ("03") _0009 = not (product_type = 'Unsecured-Revolving');
when ("04") _0010 = not (product_type = 'Unsecured-Installment');
when ("05") _0011 = not (product_type = 'Overdraft');
otherwise _0006 = 1;
end;
(untested)
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.