BookmarkSubscribeRSS Feed
Kayalvizhi
Fluorite | Level 6

SPDE PROC SQL code utilising a large SAS work space. Please suggests me ideas to optimize and use the work space to minimal.

 

-In current scenario, I'm performing PROC SQL join on 36 datasets having 12 columns each.

 

Kindly consider below limitations:

-Without redirecting - work or - utiloc locations

-Without using system commands

 

Please let me know for further details.

9 REPLIES 9
Sajid01
Meteorite | Level 14

@Kayalvizhi 
Proc SQL  joins do have large workspace requirement. Why don't you try using data step. That works row by row.

Kayalvizhi
Fluorite | Level 6
I’m migrating huge set of codes from SAS to SPDE( using HDFS). So I doesn’t want to make much changes to existing code.

Can you please tell me why SPDE SQL using SAS workspace, and why not HDFS workspace.

Is there a way to use HDFS as workspace when using SPDE
SASKiwi
PROC Star

Please post your code so we can see the type of joins you are using. We can't offer much useful advice until we see what you are currently doing.

 

 

Kayalvizhi
Fluorite | Level 6
Proc sql;
Create table spde.new from select
*
<some functions are used>
from spde.temp %do I=1 %to 36;
Left join. Spde.month&i ; %end;
Run
SASKiwi
PROC Star

Without knowing more about the data and the joining, that would work better with appending using the SET statement:

data want;
  set spde.month1 - spde.month36;
run;

 

Kurt_Bremser
Super User

@Kayalvizhi wrote:
Proc sql;
Create table spde.new from select
*
<some functions are used>
from spde.temp %do I=1 %to 36;
Left join. Spde.month&i ; %end;
Run

This code produces only syntax errors.

  • there's a dot where it must not be
  • the semicolon at the end of the first spde.month&i terminates the CREATE TABLE, the others will be invalid
  • there are no ON conditions
  • PROC SQL needs to be terminated with a QUIT, and no RUNs are needed, as SQL statements are always executed immediately.

Please post something that at least resembles your real code, and give us at least some ideas of the datasets involved (variables used for matching, variables contained, match relationships).

LinusH
Tourmaline | Level 20

You can use the SPDEUTILLOC option to specify where SPDE utility files will be stored.

Unclear how this wirks with hdfs implmentaions of SPDE data.

Maybe the environment variable SPDE_HADOOP_WORK_PATH is the way to go: variable https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/engspdehdfsug/n07ieiyvjz2wrxn10n7bs7xupvkc.ht...

 

You can also use SPDESORTSIZE if you have available memory, which will impove performance and lessen the burdon on utility file areas.

Data never sleeps
Kayalvizhi
Fluorite | Level 6

Hi,

 

Thanks!!.

 

>SPDEUTILLOC- Not privileged to use options while invocation/change config file. So unable to use this option.

>SPDESORTSIZE- Increasing spdesortsize uses more/full workspace (which I don't want to happen).

>SPDE_HADOOP_WORK_PATH - I wanted to know more on how it works, because when I give a HDFS location, it still seems to use workspace and not HDFS path.

LinusH
Tourmaline | Level 20

I think the idea is to use as much RAM as possible by tweaking SPDESORTSIZE, hence use less disk space.

I haven't used SPDE in hdfs other than in a test environment (and that was quite a while ago), and I just refer to the documentation.

But since at least I think the documentation does not clarify how different SPDE options work in a hdfs scenario, I suggest that you contact SAS tech support to get advice/clarifications.

Data never sleeps

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 647 views
  • 1 like
  • 5 in conversation