Steelers_In_DC
Barite | Level 11

I am using the following code.  Please notice the drop= _d: and keep= _:

 

A large number of columns are affected by this. This step generates a table with 1,999,410 observations and 465 variables that later needs to be sorted, and it is by far the slowest part of my process. Right now I'm using Excel copy/paste to write out a SQL UNION statement with an ORDER BY after it, and I'm going to test how that performs in comparison. I am wondering if there is a better way.

 

 

data first_lien_combine_&monyear(drop=_d:);
set First_Lien_Rules_ETL_&propdate (keep= sys loan_number Run_Date reporting_month _: )
    cp_rules_&monyear (keep = sys loan_number Run_Date reporting_month _: );
run;

 

Thank You in Advance,

 

Mark
quit;

 

1 ACCEPTED SOLUTION

Reeza
Super User

You haven't provided enough information for us to help. 

 

You're trying to append a set of tables it sounds like? 

 

The SAS code is very simple - it stacks two datasets together so there isn't much that can be optimized there. Can you presort your small files before appending? It will probably still take the same amount of time overall. 
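
A minimal sketch of the presort idea, under assumptions the thread does not confirm: the sort key is taken to be loan_number (the actual sort criteria are never stated), and the two WORK dataset names are hypothetical. If both inputs are sorted by the same key, a SET statement with a BY statement interleaves them, so the combined table comes out already in key order and a separate sort of the full result may not be needed.

```sas
/* Assumption: loan_number is the sort key; substitute the real BY variables. */
/* work.first_lien_sorted and work.cp_rules_sorted are hypothetical names.    */
proc sort data=First_Lien_Rules_ETL_&propdate
               (keep=sys loan_number Run_Date reporting_month _:)
          out=work.first_lien_sorted;
  by loan_number;
run;

proc sort data=cp_rules_&monyear
               (keep=sys loan_number Run_Date reporting_month _:)
          out=work.cp_rules_sorted;
  by loan_number;
run;

/* SET with BY interleaves the two sorted inputs, so the output */
/* is written in loan_number order as it is built.              */
data first_lien_combine_&monyear (drop=_d:);
  set work.first_lien_sorted
      work.cp_rules_sorted;
  by loan_number;
run;
```

Whether this beats a single sort of the combined table depends on the data and the system, so it is worth timing both approaches.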


5 REPLIES
ballardw
Super User

Since you do not show how the _d or _ variables are created it is a tad difficult to make suggestions.

 

You also don't mention what the following sort criteria could be in terms of those indeterminate variables.

Steelers_In_DC
Barite | Level 11

The list of keeps is 480 of this nature:

'_131_All Loans'n,
'_132_All Loans'n,
'_132.1_All Loans'n,
'_133_All Loans'n,
'_133.2_All Loans'n,
'_133.3_All Loans'n,
'_134_All Loans'n,
'_135_All Loans'n,

 

The list of drops is df0 to df38.

 

On a side note, my union all isn't working, the error says:

 

ERROR: Ambiguous reference, column '_14.1_All loans'n is in more than one table.

 

But it's in both tables, as it should be. I copied and pasted the list from Excel rather than write out all those variables. I know it's ambiguous, that's why I'm using UNION ALL... I'm having problems all around.
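
The failing query is not shown, so the following is only a guess at one possible cause: PROC SQL typically reports an ambiguous reference when a single SELECT names two tables in its FROM clause and a column exists in both. A sketch that avoids this gives each table its own SELECT and joins the two with a set operator. OUTER UNION CORR is used here because it lines columns up by name and keeps columns that appear in only one table, which is closest to what the DATA step SET does. The ORDER BY key loan_number is an assumption.

```sas
proc sql;
  /* DROP= on the output table mirrors the DATA step's (drop=_d:). */
  create table first_lien_combine_&monyear (drop=_d:) as
  select *
    from First_Lien_Rules_ETL_&propdate
         (keep=sys loan_number Run_Date reporting_month _:)
  outer union corr
  select *
    from cp_rules_&monyear
         (keep=sys loan_number Run_Date reporting_month _:)
  order by loan_number;   /* assumed sort key; replace as needed */
quit;
```

Using KEEP= with the _: prefix list on each input avoids typing out the several hundred name literals by hand.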


Steelers_In_DC
Barite | Level 11

Sorting both datasets before the append, then sorting after, made an exceptional difference. That is going to improve the total run time by a lot. Thanks very much. Have a great weekend. I have another question within the same code but will post it in another thread.

 

Thanks!

Astounding
PROC Star

I think this is worth a try:

 

data first_lien_combine_&monyear;
set First_Lien_Rules_ETL_&propdate (keep= sys loan_number Run_Date reporting_month _: drop=_d: )
    cp_rules_&monyear (keep = sys loan_number Run_Date reporting_month _:  drop=_d: );
run;

 

The idea is that you are bringing in all the _d variables, never using them, and then dropping them.  While this will need to be tested, I think this will bring in all the _ variables except for the _d variables.
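
One cheap way to run that test, sketched under a few assumptions: OBS=0 reads no rows, so the step only builds the variable list and finishes almost instantly. The dataset name work._check_vars is hypothetical, and the DICTIONARY.COLUMNS query assumes the check table is written to the WORK library.

```sas
/* OBS=0 copies only the variable definitions, not the data. */
data work._check_vars;   /* hypothetical scratch table */
  set First_Lien_Rules_ETL_&propdate
        (obs=0 keep=sys loan_number Run_Date reporting_month _: drop=_d:)
      cp_rules_&monyear
        (obs=0 keep=sys loan_number Run_Date reporting_month _: drop=_d:);
run;

/* List any variables whose names still start with _d. */
/* No rows returned means the _d variables were excluded. */
proc sql;
  select name
    from dictionary.columns
   where libname = 'WORK'
     and memname = '_CHECK_VARS'
     and upcase(substr(name, 1, 2)) = '_D';
quit;
```

If the query lists any names, the KEEP=/DROP= combination did not behave as hoped and the original output-side (drop=_d:) is the safer form.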
