Lapis Lazuli | Level 10


PROC SORT DATA=lib.data1 OUT=data2;
WHERE mine=1 ;
BY va1 va3 ;
RUN;



I need advice on how to make my code faster. I have to run the code above first, and then about 500 more lines of code.

I need to compare results across three situations: mine=1, mine=0, and the whole population.

If I only needed to compare mine=1 with mine=0, it would be easy to write a SAS macro for that. But I also need to compare these two with the overall population, which means that to get numbers for the whole population I have to remove the WHERE mine= statement.

Any suggestions for using a macro here? Thank you.

Diamond | Level 26

There's no way that comparing "differences", whatever that means, takes 500 lines of code.


So for us to help you, we'd need to see your code; or for you to explain what comparing "differences" means in a lot more detail.


Also, if the program is taking a long time, that could mean that you have a huge amount of data (not an unusual situation) and so it takes a long time to process. You haven't even told us how much total data you have ... and of course, you haven't been specific in what "faster" means ...

Paige Miller
Super User

What "numbers" do you need? It may well be that one of the report procedures will do what you need. For instance, suppose I am looking at the SAS-supplied SASHELP.CLASS data set and want to compare the mean and standard deviation of the Height and Weight variables, both overall and for each level of the variable Sex:



proc tabulate data=sashelp.class;
   class sex;
   var height weight;
   table all sex,
         (Height weight)*(mean std);
run;

This will have 3 rows of output, one for the total and one for each of the two levels of Sex, with means and standard deviations. Changing the order in the TABLE statement would instead put the statistics in rows and Sex and ALL in columns.
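As a rough sketch of that second layout, swapping the two dimensions of the TABLE statement should be enough. This uses the same SASHELP.CLASS data and nothing beyond what the step above already uses:

```sas
/* Statistics as rows, ALL and each level of Sex as columns */
proc tabulate data=sashelp.class;
   class sex;
   var height weight;
   table (height weight)*(mean std),
         all sex;
run;
```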


Diamond | Level 26 RW9

Tip: Post test data in the form of a datastep, and what you want the output to look like. 

There really is not enough information in your post to formulate much of a guess. That being said, it may be that you can transpose your data up and then compare:

data want;
  set have;
  /* 100 numeric variables, mineresults1-mineresults100, retained across rows */
  array mineresults{100} 8;
  retain mineresults:;
run;

Macro language will not speed up anything here. In fact, it will likely slow things down. Macro language saves programming time: once you have a working process for one data set, you can apply it easily to another data set. But it doesn't speed up the processing of the original data set.


Since you haven't given us any of the 500 lines of code, let me give you an example using just the PROC SORT code that you supplied. If you had a macro to generate and sort each subgroup, you would never think along these lines ...


Consider sorting the original data set permanently:


PROC SORT DATA=lib.data1;
BY mine va1 va3 ;
RUN;



At that point, you can get statistics for each subgroup by including a BY statement (BY mine;) in the analysis. Or, you could use:
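To make the BY-statement idea concrete, here is a minimal sketch. The analysis variable x is hypothetical, since the original post does not name the variables being summarized, and PROC MEANS is only one of several procedures that could be used. The second step is an alternative that does not need the sort: a CLASS statement with TYPES requests the overall row and the per-group rows in one pass.

```sas
/* Assumes lib.data1 is already sorted BY mine va1 va3;
   x is a hypothetical analysis variable */
proc means data=lib.data1 n mean std;
   by mine;
   var x;
run;

/* Alternative without sorting: overall row plus one row per level of mine.
   MISSING keeps observations whose mine value is missing. */
proc means data=lib.data1 n mean std;
   class mine / missing;
   types () mine;
   var x;
run;
```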


PROC SORT DATA=lib.data1;
BY va1 va3 ;
RUN;

data mine0 mine1;
set lib.data1;
if mine=0 then output mine0;
else output mine1;
run;



Now you have only sorted once. But all three data sets are in sorted order.


These are the sorts of things you have to plan around to speed things up. Macros won't help you think through the process.
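If separate copies of the subgroups are not actually needed, one possible variation is to read a subset straight from the sorted data with the WHERE= data set option. The sketch below assumes lib.data1 has already been sorted BY va1 va3, and again uses a hypothetical analysis variable x:

```sas
/* Subset on the fly; the sort order of lib.data1 is preserved,
   so BY va1 va3 processing still works. x is hypothetical. */
proc means data=lib.data1(where=(mine=1)) n mean std;
   by va1 va3;
   var x;
run;
```

Whether this beats writing out mine0 and mine1 depends on how many later steps read each subgroup.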


