BookmarkSubscribeRSS Feed
My_SAS
Calcite | Level 5


HI i having a dataset with 72 variables and having 12lacs observation
i want to sort the data on two variables but do reduce the time i want to do it by hash hey
how can i do it.

proc sort data=temp;
by name date;
run;

9 REPLIES 9
FriedEgg
SAS Employee

Are you confusing a hash join with a type of sort?  Are you wanting to create a new key as a hash value of the concatenation of the name and date fields?  Or something else? A sort of a file with only 72 variables and 1,200,000 observations should be very quick already...  What kind of system are you using?  You could try using the tagsort option.

FriedEgg
SAS Employee

There is a balance in the performance offered by tagsorting.  If you use the tagsort option you remove the ability for the sort to be multithreaded.  So if your performance is hindered by I/O avaiablility for your utility files from the sort, then tagsort is a good option.  If you are not bounded by I/O or disk space for these files then you will likely see hindered sort performance.  You could try allocating additional memory to the sort procedure (see the sortsize option and other options in the memory/performace groups).  On my system, with default memory options a table with 4,000 columns and 1,000,000 obs took about 5 minutes with a lot of system competition.  However if I run the same sort with the tagsort option on it took over 10 minutes.

Ksharp
Super User

You only need to use argument ordered: 'a' in hash table constructor.

and put these name into definekey () .Hash will automatically sort them for you.

declare hash ha1(dataset:'temp',ordered: 'a');

  ha1.definekey('name','age');

Ksharp

My_SAS
Calcite | Level 5

data temp;

input name age;

declare hash ha1(dataset:'temp',ordered: 'a');

  ha1.definekey('name','age');

cards;

23 45

67 89

12 56

run;

proc print;

run;

I have tryed this but it did not worked how can i do it and how can i use the hash key in proc sort:

proc sort data=temp;

by ha1.definekey;

run;

if i have used this iam getting error. can any one help me

FriedEgg
SAS Employee

Good thinking Ksharp, I do believe this is the method the OP was trying to ask about utilizing.  However, I do not think he understands it's use or implementation.

Tom
Super User Tom
Super User

I am really not clear on what you mean by "use the hash key" in proc sort.

Why do you want to sort your data?

Would an index serve your needs?

Also how large is your dataset? What number does 12lacs represent?

art297
Opal | Level 21

Tom,  I can't answer why the OP would want to do this, but I can answer what the number represents: 1.2 million.  See: http://en.wikipedia.org/wiki/Lakh

Ksharp
Super User

You do not use Hash Table correctly.

See the followig code of example.

BTW. I am not sure this method will be more efficient than proc sort;

data temp;
input name age;
cards;
23 45
67 89
12 56
;
run;
data _null_;
*load dataset temp into Hash Table;
if 0 then set temp;
declare hash ha1(dataset:'temp',ordered: 'a');
  ha1.definekey('name','age');
  ha1.definedata('name','age');
  ha1.definedone();
*output Hash Table to a dataset;
  ha1.output(dataset:'want');
run;
*now see the data has already been sort by name age;
proc print data=want;run;

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1651 views
  • 2 likes
  • 5 in conversation