Re: SAS technique to optimize the memory issue

Sourav_sas · Posted 03-23-2018 08:14 AM

Hi

I am looking for some optimize techniques to solve the memory issue in Workspace library.

I am using couple of techniques like

1.Option Compress = Yes;

2.DROP in SQL statement.

Anyone has any other way pls let me know.

Regards

Sourav

Kurt_Bremser · Posted 03-23-2018 08:33 AM

First of all, the WORK library is not a memory issue, but a disk storage issue.

Since there are a gazillion possible reasons for an overflow in the WORK library, you need to be much more specific.

Which operation fails, and at which part of the code? Post the log.

If the intent of your code is not clearly discernable from the code you post, add a description of the task. May well be that another method to achieve your goal needs less resources.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Criptic · Posted 03-23-2018 08:42 AM

Just to add what @Kurt_Bremser said:
Try and look at the % of storage used of the work directory and if it hits 100% than you have to increase your storage or decrease your data.
If you are on AIX you can use df -g .

Sourav_sas · Posted 03-24-2018 05:39 AM

Hey Kurt

Thanks for your observation here.

I am giving scenario here.

I am using this code inside a report

PROC SQL;
		CREATE TABLE WORK.TECH AS 
			SELECT 	a.*						,
				B.IP_SICSID  				,
				b.MAIN_CURRISOALPHA3_ID		,
			 	CASE WHEN b.LEVEL_OF_BUS_ID = "IAB" THEN "inward"	
					 WHEN b.LEVEL_OF_BUS_ID = "ORP" THEN "outward"
	                 ELSE '' 
				END as inw_outw format $10. ,
			 	b.BUS_ID					,
			    b.TYPE_OF_BUS_ID			,
			 	b.BUS_TTL					,
				%IF &PROFIT_CENTER = Y %THEN %DO;
					b.PRFT_CENT_TTL         ,
				%END;
		/* DWH-2811 Change starts */
				%IF &INSURED_ID_TTL = Y %THEN %DO;
					b.INS_ID                ,
					b.INS_TTL               ,
				%END;
				%IF &IP_FROM_TO = Y %THEN %DO;
					b.IP_FROM               ,
					b.IP_TO                 ,
				%END;
				b.PAYM_PARTN_ID             ,
				b.PAYM_PARTN_TTL            ,
				b.ACCOUNT_GROUP_ID          ,
				b.LEVEL_OF_BUS_ID 			,
				b.BUS_PARTN_ID_BOOK			,
				b.ORIG_CDNT_ID				,
				b.ORIG_CDNT_TTL 			,
				b.ORIG_CDNT_COUNTRY			 
			FROM WORK.AGING_TECHNICAL a INNER JOIN 
				WORK.IP_SECT_SMALL b on
				a.SECTION_SICSID=b.SECT_SICSID 
	;
	QUIT;

This code is part of the report.

After running this part I ma getting observation with maximum selection as

We don't have much space/memory in DEV environment Workspace. For that it is giving us error as

In the whole report I have many other joins like the above there also observations like millions of. I am thinking if I will go for data step instead of SQL join, but there data will not be same, because merging and inner join both are not same... For that I am looking for some optimized method.. I can not remove the variables also, because we need in the report minimum 50 column. It will be varied as per the selection, sometime more than 60.

Please suggest me some way according to you.. If we can do something code except increase the space technically, then it will be great.

I want try the best level I can..

Thanks

Regards

Sourav

SASKiwi · Posted 03-24-2018 06:01 AM

What about using a permanent SAS library instead of WORK for some of your datasets - is that an option or are you space-limited there as well?

Kurt_Bremser · Posted 03-24-2018 08:26 AM

I see no dataset tech_m in your code, but in the log. Please post code and log that correspond to each other, or we will unnecessarily be forced to make guesses.

From the log, it looks as if there's a one-to-many relationship, which would allow the use of sort and data steps. These are usually less resource-hungry.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Astounding · Posted 03-24-2018 09:36 AM

Based on your sample program, here's one way to proceed. Break up the query into two steps. In the first step (saved as a view instead of a table), subset source B down to just the records and fields that will be needed. Then use a second step to join with A. The first step might look like this:

CREATE view TECH AS 
			SELECT 	
				IP_SICSID  				,
				MAIN_CURRISOALPHA3_ID		,
			 	CASE WHEN LEVEL_OF_BUS_ID = "IAB" THEN "inward"	
					 WHEN LEVEL_OF_BUS_ID = "ORP" THEN "outward"
	                 ELSE '' 
				END as inw_outw format $10. ,
			 	BUS_ID					,
			    TYPE_OF_BUS_ID			,
			 	BUS_TTL					,
				%IF &PROFIT_CENTER = Y %THEN %DO;
					PRFT_CENT_TTL         ,
				%END;
		/* DWH-2811 Change starts */
				%IF &INSURED_ID_TTL = Y %THEN %DO;
					INS_ID                ,
					INS_TTL               ,
				%END;
				%IF &IP_FROM_TO = Y %THEN %DO;
					IP_FROM               ,
					IP_TO                 ,
				%END;
				PAYM_PARTN_ID             ,
				PAYM_PARTN_TTL            ,
				ACCOUNT_GROUP_ID          ,
				LEVEL_OF_BUS_ID 			,
				BUS_PARTN_ID_BOOK			,
				ORIG_CDNT_ID				,
				ORIG_CDNT_TTL 			,
				ORIG_CDNT_COUNTRY			 
			FROM 	WORK.IP_SECT_SMALL 
				where SECT_SICSID in 
                                (select distinct SECTION_SICSID from WORK.AGING_TECHNICAL) 
	;

Then combining with A might be an inner join or might be simpler, along the lines of:

create table want as select * from TECH a, WORK.AGING_TECHNICAL b where a,SECTION_SICSID = b.SECT_SICSID;

Obviously this is untested. If the data set relative sizes are different, you might reverse the subsetting and use step 1 to select only the records from A that will be needed.

Patrick · Posted 03-24-2018 08:27 PM

@Sourav_sas

My first question would be: Why are you running out of Work/Utility space? Is there a quota set for you as user? Or is Work eventually cluttered with leftovers from crashed EG sessions? Have you already contacted your SAS admin to try and get the workspace you need.

As for your code: What's the size of the two tables (in GB). If one of the two tables is small enough to fit into memory then you could go for a hash lookup and avoid sorting at all.

~~Another option could be to create indices for the two tables you want to join. I believe that could avoid sorting as well.~~

Tried it. Didn't make a difference in my tests.

Astounding · Posted 03-23-2018 08:54 AM

For one-time-use-data-sets, create a view instead of a data set.

SASKiwi · Posted 03-23-2018 06:45 PM

Another option is to cleanup as you go, assuming you don't need any of the earlier data:

proc datasets library = work kill;
run;

At the end of the day these options don't help if you have way too little SAS WORK space to start with.

ChrisBrooks · Posted 03-23-2018 08:34 PM

Only keep variables that you absolutely MUST have in your data set

LinusH · Posted 03-25-2018 12:59 AM

Ny te cenys (of course including all the good suggestions from our fellow community users):

1. Go through your whole set of code and see if you can do it smarter, vs budiness requirements. It's near impossible for us to help you here, partly because you didn't post the code to create the input WORK tables.

2. Buy more disk! It's very cheap and would probably repay quickly comparing with your efforts and report delivery interruptions.

Data never sleeps

Registration is open

SAS Training: Just a Click Away