Architecting, installing and maintaining your SAS environment

Stop Hogging all my resource!

Reply
New Contributor
Posts: 2

Stop Hogging all my resource!

I'm running a merge on a rather big dataset. The problem is it sloooows my PC down it becomes annoying.

I've tried using bufsiz and bufno to reduce the amount SAS uses and in desperation set them to 1 but to no discernible difference.

I don't mind if the process takes longer, but is there a way so it only uses 1 GB of RAM?

The two tables are 17,396,594 and 1,763,513 rows and sit on my C Drive.

I'm using SAS Base 9.3 (Windows 7) and I don't want answers offering me a different 'Technical' solution (i.e. SAS Server etc.) but are there any settings I can change in my current environment?

Thanks for any help offered!

/* adding sequence count to records - in case incorrect #'s were given */

data clean.UPDT UCST;

  merge clean.UPDT (in=a)

  SEQ_LST (in=b rename=SEQUENCE_COUNT=LSEQ);

  by CUST_ID TMP_CUST;

  if SEQUENCE_COUNT < LSEQ then output UCST;

  else output clean.UPDT;

run;

Super User
Super User
Posts: 7,395

Re: Stop Hogging all my resource!

If its just a space issue, consider splitting the two datasets up, e.g. take half the distinct custids, and merge those, then take the other etc

proc sql;

  create table FIRST_BLOCK as

  select  A.*, 

          B.MAKE as B_MAKE

  from    SASHELP.CARS A

  left join SASHELP.CARS B

  on      A.MAKE=B.MAKE

  where   A.MAKE in (select MAKE from (select distinct MAKE from SASHELP.CARS) where MONOTONIC() <= 10);

quit;

Super User
Posts: 6,930

Re: Stop Hogging all my resource!

You're just I/O bound, mainly because:

- your disk is not suited for this kind of work

- all your files reside on the same disk, causing the disk heads to permanently jump between the physical positions of the infiles and the outfiles.

- on top of that, you may be the victim of disk fragmentation, causing I/O to slow down even more.

You can alleviate this by getting a faster disk (preferably SSD, that eliminates all kinds of latencies) or moving to several disks and allocating the libraries on physically separate drives.

You can also take a look at the hash object method. This lets you read the smaller file in one sweep before reading through the larger file. But you still have disk contention between reading and writing,

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 6,930

Re: Stop Hogging all my resource!

And oh yeah, doing a defragmentation of your C: drive may help a little.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
New Contributor RCW
New Contributor
Posts: 4

Re: Stop Hogging all my resource!

I'm inclined to agree with KurtBremser, but another possibility is to try UPDATE (SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition).

In addition to his suggestions, look at the location of your temporary storage, and clean out any residues directories there.

Trouble with the SQL is that it's memory intensive - good chance you'll run into memory problems there., since you mention the 1G limitation, the way to go is DATA step.

Other possibilities: have you checked COMPRESS= data set option (SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition) - depending on your data set you might squeeze space out there. Also the length of your variables (SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition). They won't make a difference for temporary data sets (WORK) - SAS expands thing during processing (or maybe that's my 'experience' showing), but will make a difference in your permanent data set.

Hope these suggestions help

Regular Contributor
Posts: 213

Re: Stop Hogging all my resource!

Hi danny,

If you want to restrict SAS to only use 1 GB of RAM, then use the -memsize 1GB option within your SAS Windows shortcut. I'm guessing that's how you launch/start your SAS session.

If you cannot create unique index for your smaller data set "SEQ_LST" (1,763,513) on CUST_ID TMP_CUST, then I would suggest looking into using Hash Object for as alternative. There are several papers showing how to do that. Check www.lexjansen.com

Hope this helps,

Ahmed

New Contributor RCW
New Contributor
Posts: 4

Re: Stop Hogging all my resource!

I just got home and started looking at your code again and a couple of things occurred to me.

In particular,

     if SEQUENCE_COUNT < LSEQ then output UCST

     else output clean.UPDT;


might have a problem by not checking for sequence_count > lseq

Another thing, you've got IN= set:


     merge clean.UPDT (in=a)  

     SEQ_LST (in=b rename=SEQUENCE_COUNT=LSEQ);

but you never test it, so is it possible that records are coming in from seq_lst when there is no corresponding updt?

In reference to the memory issue, unless told otherwise, in a data step SAS does it I/O directly - i.e. it reads a record(s) from disk, processes it, and writes to disk. There are exceptions (hash tables for example and Data set options that keep it loaded), but those are exceptions. It helps to remember that SAS was written for mainframes using punch cards (I know this personally). Memory was barely measured in K (hence the reference to K in some of the options), let alone M or G. As I stated before, its PROC SQL that tends to hog

memory.

Another possibility that updates a data set directly, hence is more dangerous, is the MODIFY statement.

http://support.sas.com/documentation/cdl/en/lestmtsref/63323/HTML/default/viewer.htm#n0g9jfr4x5hgsfn...

Go through a couple of examples and back up you data.

Super User
Super User
Posts: 6,499

Re: Stop Hogging all my resource!

Probably because you are reading and writing to the same disk that Windows is using as its swap disk.

Get another disk drive.

New Contributor
Posts: 2

Re: Stop Hogging all my resource!

Thanks all for suggestions.

The issue is the RAM that it's using so another drive won't help (i.e. I could use our network drive but that will be even slower).

I was hoping MEMSIZE would resolve my problems - (documentation isn't clear wither this is hard disk memory or RAM). But it said it was unable to process due the limitation!

Super User
Posts: 6,930

Re: Stop Hogging all my resource!

What exactly was the ERROR message? Please post the log including the code that caused the ERROR.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 9 replies
  • 556 views
  • 0 likes
  • 6 in conversation