Yes sir, you are right that in your code, once a condition is satisfied, the remaining ELSE branches are skipped.
But in this case, we need all of the IF statements to execute.
If I use IF-THEN/ELSE, SAS executes only the first statement whose condition is true, and the ELSE statements that follow won't execute.
So the collapsing would not be done in all the regions.
I may be wrong about the SAS concepts, but I got the output only because of you, sir.
Thanks again.
Nice to meet you.
I disagree, as only one of the conditions can be true for any given record. However, it will work both ways, so mark the question as answered and let's move on to the next question someone might have.
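For anyone following along, here is a minimal sketch of the difference being discussed. The data set `demo` and the variables `a` and `b` are made up for illustration, and the conditions are deliberately overlapping, which is not the case in the generated code from this thread. With non-overlapping conditions, as noted above, both forms give the same result.

```sas
data demo;
  input group;
  a = group;   /* recoded with independent IF statements */
  b = group;   /* recoded with an IF-THEN/ELSE chain */

  /* Independent IFs: every condition is tested, each against the current value of a */
  if a in (1, 2) then a = 2;
  if a in (2, 3) then a = 3;

  /* ELSE chain: testing stops at the first condition that is true */
  if b in (1, 2) then b = 2;
  else if b in (2, 3) then b = 3;
  datalines;
1
2
3
;
run;

/* Resulting values: group=1 gives a=3, b=2; group=2 gives a=3, b=2; group=3 gives a=3, b=3 */
```

The two columns differ only because the value 2 appears in both conditions. When each value can match at most one condition, the ELSE chain simply skips tests that could not have been true anyway.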
Thanks for the help.
I learned so many new concepts from you.
SELECT statements should be faster than so many non-overlapping IF or IF/ELSE statements. Give this a try. I did not spend time making it pretty...
data _null_;
  file "C:\changes_new.sas";
  set data.collapse end=last;
  by region26 var;
  /* open the outer SELECT once, then one WHEN block per region */
  if _n_=1 then put 'select(region26);';
  if first.region26 then put 'when(' region26 ') do;';
  /* open an inner SELECT for each variable within the region */
  if first.var then put 'select(' var ');';
  select(var);
    when ('marriage') do;
      select(group);
        when (2,5) put 'when(2,5) ' var '=2;';
        when (3,4) put 'when(3,4) ' var '=3;';
        otherwise;
      end;
    end;
    when ('income') do;
      select(group);
        when (1,2) put 'when(1,2) ' var '=2;';
        when (3,4) put 'when(3,4) ' var '=4;';
        when (5,6) put 'when(5,6) ' var '=6;';
        otherwise;
      end;
    end;
    otherwise;
  end;
  /* close the inner SELECT, the region DO block, and the outer SELECT */
  if last.var then put 'otherwise; end;';
  if last.region26 then put 'end;';
  if last then put 'otherwise; end;';
run;
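To apply the generated statements, the file can be pulled into a DATA step with %include, the same way it is done for the IF/ELSE version later in this thread. A minimal sketch, where the output name `want_new` is my own placeholder and `coded` is the test data set used further down:

```sas
data want_new;   /* want_new is a hypothetical output name */
  set coded;
  /* the generated SELECT block is inserted here at compile time */
  %include "C:\changes_new.sas";
run;
```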
I did some tests with expanding your provided data:
data coded;
  input E3010 income hh_size region10 region26 minority_i hh_size_i income_i marriage_i own_rent_i E112_n marriage marriage_r own_rent minority total_i fadj f2 w0;
  do i=1 to 100000;
    output; *simulate more data;
  end;
  cards;
19000100000000 5 4 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
19000100000000 3 4 1 1 1 0 0 0 0 96 2 5 1 0 1 1.10534711 0.26676473 0.2948688
19000100000000 4 1 1 1 0 0 0 0 0 96 2 5 1 0 0 1.10534711 0.26676473 0.2948688
19000100000000 1 4 1 1 0 0 0 0 0 96 1 1 0 1 0 1.10534711 0.26676473 0.2948688
19000100000000 3 3 1 1 1 0 0 0 1 96 1 1 1 0 2 1.10534711 0.26676473 0.2948688
19000100000000 7 3 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
19000100000000 6 3 1 1 0 0 1 0 0 96 2 5 1 0 1 1.10534711 0.26676473 0.2948688
19000100000000 6 3 1 1 0 0 0 0 0 96 2 5 0 0 0 1.10534711 0.26676473 0.2948688
19000200000000 7 2 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 7 2 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 7 2 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 6 2 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 7 3 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 5 1 1 1 0 0 0 0 0 96 3 2 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 1 2 1 1 0 0 1 0 0 96 2 5 0 0 1 1.10534711 0.26676473 0.2948688
21005000000000 5 1 1 1 0 0 0 0 0 96 3 3 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 4 4 1 1 0 0 0 0 0 96 2 5 0 0 0 1.10534711 0.26676473 0.2948688
21005000000000 7 2 1 1 0 0 0 0 0 96 1 1 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 6 1 1 1 0 0 0 0 0 96 2 2 1 0 0 1.10534711 0.26676473 0.2948688
21005000000000 7 2 1 1 1 0 1 0 0 96 1 1 1 0 2 1.10534711 0.26676473 0.2948688
;
run;
Average performance over 5 runs:
Urvish Method 0.73 cpu time
Art Method 0.99 cpu time (I agree that Art's method should be more efficient, so this is very strange to me. The efficiency would increase if the nested ELSEs were ordered by likelihood...)
FE Method 0.60 cpu time (I win! )
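The post does not say how these timings were collected. One common way to get CPU time per step is the FULLSTIMER system option, which adds detailed resource statistics to the log. A minimal sketch under that assumption, with `want_new` as a placeholder output name:

```sas
options fullstimer;   /* write detailed timing and resource statistics to the log */

data want_new;        /* hypothetical output name */
  set coded;
  %include "C:\changes_new.sas";
run;

/* Read the cpu time figures for this step in the log, rerun a few times, and average by hand */
```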
Matt,
I sit corrected! While if then else CAN improve performance, given this particular set of sample data it won't unless one can optimize the constructs more efficiently than I did.
Art
Matt,
I disagree that select is faster than if then. Here is a more optimized version of my original code. In my tests, it runs neck and neck with your version:
data _null_;
  file "c:\changes_newer.sas";
  set collapse;
  by region26 var;
  if _n_ eq 1 then counter=0;
  /* count is not assigned in this step; if it is not a variable in collapse
     it is missing, and a missing value compares as less than 10 */
  if count lt 10 then do;
    if first.region26 then do;
      counter+1;
      /* the first region opens with IF, every later region with ELSE IF */
      if counter eq 1 then put "if region26 eq " @;
      else put "else if region26 eq " @;
      put region26 " then do;";
    end;
  end;
  if var eq 'income' then do;
    if first.var then put "if income in (";
    else put "else if income in (";
    if group in (1, 2) then put "1, 2) then income=2;";
    else if group in (3, 4) then put "3, 4) then income=4;";
    else if group in (5, 6) then put "5, 6) then income=6;";
  end;
  if var eq 'marriage' then do;
    if first.var then put "if marriage in (";
    else put "else if marriage in (";
    if group in (2, 5) then put "2, 5) then marriage=2;";
    else if group in (3, 4) then put "3, 4) then marriage=3;";
  end;
  if last.region26 then put "end;";
run;
data want_newer;
  set coded;
  %include "c:\changes_newer.sas";
run;
It is not meant as a blanket statement. In my experience it holds true depending on the number of nested levels, their order, and so on. A well-optimized IF/THEN can definitely outperform a SELECT statement. Everything with performance is always case based.
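As a rough illustration of what ordering by likelihood could look like, one option is to check the value frequencies first and then write the ELSE IF chain with the most common values at the top. This is only a sketch: the ordering shown below is hypothetical and not taken from the actual frequencies in the data, and `want_ordered` is a made-up name.

```sas
/* List the values of income from most to least frequent */
proc freq data=coded order=freq;
  tables income / nocum;
run;

data want_ordered;   /* hypothetical output name */
  set coded;
  /* Hypothetical ordering: suppose 5 and 6 turned out to be the most common values */
  if income in (5, 6) then income = 6;
  else if income in (3, 4) then income = 4;
  else if income in (1, 2) then income = 2;
run;
```

The result is the same as with any other ordering of these non-overlapping conditions. Only the average number of comparisons per observation changes.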
See this note from the select statement syntax document.
Comparisons
Use IF-THEN/ELSE statements for programs with few statements. Use subsetting IF statements without a THEN clause to continue processing only those observations or records that meet the condition that is specified in the IF clause.
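To make the terms in that note concrete, here is a small sketch that uses both a subsetting IF and a SELECT group on the test data from this thread. The output name `want_select` and the choice of region 1 are placeholders for illustration only.

```sas
data want_select;    /* hypothetical output name */
  set coded;
  /* Subsetting IF (no THEN clause): only observations from region 1 continue */
  if region26 = 1;
  /* SELECT group equivalent to the income IF-THEN/ELSE chain */
  select (income);
    when (1, 2) income = 2;
    when (3, 4) income = 4;
    when (5, 6) income = 6;
    otherwise;       /* leave all other values unchanged */
  end;
run;
```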