BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dennis_oz
Quartz | Level 8

Hi ,

I am trying to perform a sort on a dataset on a oracle table by referencing the dataset 

as Oracle_schema.table1

 

*--------------------------------------------------------------------------------------*

data _null_;
if 0 then set DW.Table1 ;
dcl hash hh ( dataset: ' DW.Table1', ordered: 'A', multidata: 'Y') ;
dcl hiter hi ( 'hh' ) ;
hh.DefineKey ( 'ADMIT_MONTH','PAtientid') ;
hh.DefineData ('FUND_MAJOR_DESC' ,'generic_month','ADMIT_MONTH','PAtientid') ;
hh.DefineDone () ;


do rc = hi.first () by 0 while ( rc = 0 ) ;

hh.output (dataset: "x1234") ;
rc = hi.next () ;
end ;

stop ;
run ;

****------------------------------------------------------------------**

But  it's taking forever to execute . hasn't still completed execution. the table1 has 1.5 million records

 

I in parallel execute a proc sort for the above method but it took only  3 minutes to execute .

Could some one explain why the above hash code won't execute🤔

 

 

proc sort data= DW.Table1
(keep=FUND_MAJOR_DESC generic_month ADMIT_MONTH PAtientid)
out=x1234 ;
by ADMIT_MONTH PAtientid;
run;

 

*---------------------------------------------------------------------------

to confirm my hash code i executed with sashelp.cars and it works , so i presume the syntax is correct 

 

data _null_;
if 0 then set sashelp.cars ;
dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
dcl hiter hi ( 'hh' ) ;
hh.DefineKey ( 'make' ) ;
hh.DefineData ( 'invoice' , 'model','make') ;
hh.DefineDone () ;


do rc = hi.first () by 0 while ( rc = 0 ) ;
hh.output (dataset: "x1") ;
rc = hi.next () ;
end ;
put 13 * '-' ;
stop ;
run ;

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Exactly. At least for SASHELP.CARS it should definitely not take 10 minutes, see below.

13   data _null_;
14   if 0 then set sashelp.cars ;
15   dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
16
17   hh.DefineKey ( 'make' ) ;
18   hh.DefineData ( 'invoice' , 'model','make') ;
19   hh.DefineDone () ;
20
21   hh.output (dataset: "x1") ;
22   put 13 * '-' ;
23   stop ;
24   run ;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.X1 has 428 observations and 3 variables.
-------------
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

For a much larger dataset I would not be surprised if PROC SORT was the most efficient approach to sorting.

View solution in original post

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hi @dennis_oz,

 

Unlike the OUTPUT statement in a DATA step, the OUTPUT method of a hash object writes the entire hash table to the dataset specified in the "dataset:" argument tag, not only a single record. That is, you are trying to write the 1.5 million observations (unnecessarily) 1.5 million times to dataset x1234 (which is, of course, much more time-consuming than to create a copy of SASHELP.CARS and to overwrite it 427 times again).

 

So, you don't need the hash iterator object. Remove the DO loop and execute the OUTPUT method only once.

dennis_oz
Quartz | Level 8

do youmean to take out the  hiter statement and remove the do statement like below. 

I have submitted the code and it has been more than 10 minutes. 

 

*----------------------------------------------------------------------*

data _null_;
if 0 then set sashelp.cars ;
dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;

hh.DefineKey ( 'make' ) ;
hh.DefineData ( 'invoice' , 'model','make') ;
hh.DefineDone () ;

hh.output (dataset: "x1") ;
put 13 * '-' ;
stop ;
run ;

FreelanceReinh
Jade | Level 19

Exactly. At least for SASHELP.CARS it should definitely not take 10 minutes, see below.

13   data _null_;
14   if 0 then set sashelp.cars ;
15   dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
16
17   hh.DefineKey ( 'make' ) ;
18   hh.DefineData ( 'invoice' , 'model','make') ;
19   hh.DefineDone () ;
20
21   hh.output (dataset: "x1") ;
22   put 13 * '-' ;
23   stop ;
24   run ;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.X1 has 428 observations and 3 variables.
-------------
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

For a much larger dataset I would not be surprised if PROC SORT was the most efficient approach to sorting.

dennis_oz
Quartz | Level 8
it is working seamleasly .

thank u

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 617 views
  • 2 likes
  • 2 in conversation