BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dennis_oz
Quartz | Level 8

Hi ,

I am trying to perform a sort on a dataset on a oracle table by referencing the dataset 

as Oracle_schema.table1

 

*--------------------------------------------------------------------------------------*

data _null_;
if 0 then set DW.Table1 ;
dcl hash hh ( dataset: ' DW.Table1', ordered: 'A', multidata: 'Y') ;
dcl hiter hi ( 'hh' ) ;
hh.DefineKey ( 'ADMIT_MONTH','PAtientid') ;
hh.DefineData ('FUND_MAJOR_DESC' ,'generic_month','ADMIT_MONTH','PAtientid') ;
hh.DefineDone () ;


do rc = hi.first () by 0 while ( rc = 0 ) ;

hh.output (dataset: "x1234") ;
rc = hi.next () ;
end ;

stop ;
run ;

****------------------------------------------------------------------**

But  it's taking forever to execute . hasn't still completed execution. the table1 has 1.5 million records

 

I in parallel execute a proc sort for the above method but it took only  3 minutes to execute .

Could some one explain why the above hash code won't execute🤔

 

 

proc sort data= DW.Table1
(keep=FUND_MAJOR_DESC generic_month ADMIT_MONTH PAtientid)
out=x1234 ;
by ADMIT_MONTH PAtientid;
run;

 

*---------------------------------------------------------------------------

to confirm my hash code i executed with sashelp.cars and it works , so i presume the syntax is correct 

 

data _null_;
if 0 then set sashelp.cars ;
dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
dcl hiter hi ( 'hh' ) ;
hh.DefineKey ( 'make' ) ;
hh.DefineData ( 'invoice' , 'model','make') ;
hh.DefineDone () ;


do rc = hi.first () by 0 while ( rc = 0 ) ;
hh.output (dataset: "x1") ;
rc = hi.next () ;
end ;
put 13 * '-' ;
stop ;
run ;

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Exactly. At least for SASHELP.CARS it should definitely not take 10 minutes, see below.

13   data _null_;
14   if 0 then set sashelp.cars ;
15   dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
16
17   hh.DefineKey ( 'make' ) ;
18   hh.DefineData ( 'invoice' , 'model','make') ;
19   hh.DefineDone () ;
20
21   hh.output (dataset: "x1") ;
22   put 13 * '-' ;
23   stop ;
24   run ;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.X1 has 428 observations and 3 variables.
-------------
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

For a much larger dataset I would not be surprised if PROC SORT was the most efficient approach to sorting.

View solution in original post

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hi @dennis_oz,

 

Unlike the OUTPUT statement in a DATA step, the OUTPUT method of a hash object writes the entire hash table to the dataset specified in the "dataset:" argument tag, not only a single record. That is, you are trying to write the 1.5 million observations (unnecessarily) 1.5 million times to dataset x1234 (which is, of course, much more time-consuming than to create a copy of SASHELP.CARS and to overwrite it 427 times again).

 

So, you don't need the hash iterator object. Remove the DO loop and execute the OUTPUT method only once.

dennis_oz
Quartz | Level 8

do youmean to take out the  hiter statement and remove the do statement like below. 

I have submitted the code and it has been more than 10 minutes. 

 

*----------------------------------------------------------------------*

data _null_;
if 0 then set sashelp.cars ;
dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;

hh.DefineKey ( 'make' ) ;
hh.DefineData ( 'invoice' , 'model','make') ;
hh.DefineDone () ;

hh.output (dataset: "x1") ;
put 13 * '-' ;
stop ;
run ;

FreelanceReinh
Jade | Level 19

Exactly. At least for SASHELP.CARS it should definitely not take 10 minutes, see below.

13   data _null_;
14   if 0 then set sashelp.cars ;
15   dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
16
17   hh.DefineKey ( 'make' ) ;
18   hh.DefineData ( 'invoice' , 'model','make') ;
19   hh.DefineDone () ;
20
21   hh.output (dataset: "x1") ;
22   put 13 * '-' ;
23   stop ;
24   run ;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.X1 has 428 observations and 3 variables.
-------------
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

For a much larger dataset I would not be surprised if PROC SORT was the most efficient approach to sorting.

dennis_oz
Quartz | Level 8
it is working seamleasly .

thank u

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 639 views
  • 2 likes
  • 2 in conversation