Suppose I have one data set of economic data by US States (named "master"), and I want to divide this data set by individual states. I want to be able to perform DATA STEP merge later using the individual state data, so I need PROC SORT (or SQL order by) before I can do this. Considering the size of my "master" data (50 GB), I am curious which of the following will be more efficient and why. For demonstration purposes, I am only pulling data for 4 US states: NC, SC, GA, and FL. The target file that I want is called sorted_NC, sorted_SC, sorted_GA, and sorted_FL, respectively. OPTION 1: Passing Through "Master" Table Multiple Times %macro sort_state(location);
proc sort data = master (where=(state="&location"))
out = sorted_&location;
by var1 var2;
run;
%mend sort_state;
%sort_state(NC)
%sort_state(SC)
%sort_state(GA)
%sort_state(FL) OPTION 2: Passing Through "Master" Table One Time Only, Then Sort data NC SC GA FL;
set master;
if state = "NC" then output NC;
if state = "SC" then output SC;
if state = "GA" then output GA;
if state = "FL" then output FL;
run;
%macro sort2(location);
proc sort data = &location.
out = sorted_&location.;
by var1 var2;
run;
%mend;
%sort2(NC)
%sort2(SC)
%sort2(GA)
%sort2(FL) I really appreciate any input/comment you have. -AH
... View more