BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dennis_oz
Quartz | Level 8

Hi ,

I am trying to perform a sort on a dataset on a oracle table by referencing the dataset 

as Oracle_schema.table1

 

*--------------------------------------------------------------------------------------*

data _null_;
if 0 then set DW.Table1 ;
dcl hash hh ( dataset: ' DW.Table1', ordered: 'A', multidata: 'Y') ;
dcl hiter hi ( 'hh' ) ;
hh.DefineKey ( 'ADMIT_MONTH','PAtientid') ;
hh.DefineData ('FUND_MAJOR_DESC' ,'generic_month','ADMIT_MONTH','PAtientid') ;
hh.DefineDone () ;


do rc = hi.first () by 0 while ( rc = 0 ) ;

hh.output (dataset: "x1234") ;
rc = hi.next () ;
end ;

stop ;
run ;

****------------------------------------------------------------------**

But  it's taking forever to execute . hasn't still completed execution. the table1 has 1.5 million records

 

I in parallel execute a proc sort for the above method but it took only  3 minutes to execute .

Could some one explain why the above hash code won't execute🤔

 

 

proc sort data= DW.Table1
(keep=FUND_MAJOR_DESC generic_month ADMIT_MONTH PAtientid)
out=x1234 ;
by ADMIT_MONTH PAtientid;
run;

 

*---------------------------------------------------------------------------

to confirm my hash code i executed with sashelp.cars and it works , so i presume the syntax is correct 

 

data _null_;
if 0 then set sashelp.cars ;
dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
dcl hiter hi ( 'hh' ) ;
hh.DefineKey ( 'make' ) ;
hh.DefineData ( 'invoice' , 'model','make') ;
hh.DefineDone () ;


do rc = hi.first () by 0 while ( rc = 0 ) ;
hh.output (dataset: "x1") ;
rc = hi.next () ;
end ;
put 13 * '-' ;
stop ;
run ;

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Exactly. At least for SASHELP.CARS it should definitely not take 10 minutes, see below.

13   data _null_;
14   if 0 then set sashelp.cars ;
15   dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
16
17   hh.DefineKey ( 'make' ) ;
18   hh.DefineData ( 'invoice' , 'model','make') ;
19   hh.DefineDone () ;
20
21   hh.output (dataset: "x1") ;
22   put 13 * '-' ;
23   stop ;
24   run ;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.X1 has 428 observations and 3 variables.
-------------
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

For a much larger dataset I would not be surprised if PROC SORT was the most efficient approach to sorting.

View solution in original post

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hi @dennis_oz,

 

Unlike the OUTPUT statement in a DATA step, the OUTPUT method of a hash object writes the entire hash table to the dataset specified in the "dataset:" argument tag, not only a single record. That is, you are trying to write the 1.5 million observations (unnecessarily) 1.5 million times to dataset x1234 (which is, of course, much more time-consuming than to create a copy of SASHELP.CARS and to overwrite it 427 times again).

 

So, you don't need the hash iterator object. Remove the DO loop and execute the OUTPUT method only once.

dennis_oz
Quartz | Level 8

do youmean to take out the  hiter statement and remove the do statement like below. 

I have submitted the code and it has been more than 10 minutes. 

 

*----------------------------------------------------------------------*

data _null_;
if 0 then set sashelp.cars ;
dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;

hh.DefineKey ( 'make' ) ;
hh.DefineData ( 'invoice' , 'model','make') ;
hh.DefineDone () ;

hh.output (dataset: "x1") ;
put 13 * '-' ;
stop ;
run ;

FreelanceReinh
Jade | Level 19

Exactly. At least for SASHELP.CARS it should definitely not take 10 minutes, see below.

13   data _null_;
14   if 0 then set sashelp.cars ;
15   dcl hash hh ( dataset: 'sashelp.cars', ordered: 'a',multidata: "Y") ;
16
17   hh.DefineKey ( 'make' ) ;
18   hh.DefineData ( 'invoice' , 'model','make') ;
19   hh.DefineDone () ;
20
21   hh.output (dataset: "x1") ;
22   put 13 * '-' ;
23   stop ;
24   run ;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.X1 has 428 observations and 3 variables.
-------------
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

For a much larger dataset I would not be surprised if PROC SORT was the most efficient approach to sorting.

dennis_oz
Quartz | Level 8
it is working seamleasly .

thank u

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 914 views
  • 2 likes
  • 2 in conversation