BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sebster24
Quartz | Level 8

Hello,

I am trying to create a hash code, which does the following:

1) A table lookup with output showing matches,

2) Provide a count of the output table (duplicate and nonduplicate) and store value into a variable,

I have provided the code in proc sql format which works. THis is the output that i want.

I inserted the hash code as well, for any corrections/ changes.

I am really keen on learning Hash programming - would u recommend any website/ books with examples?

All help would be deeply appreciated.

 

my proc sql code that works:

data sat;
  input client $;
  datalines;
531 
620
531
908
620
143
 run;

 data main;
  input keys $;
  datalines;
530 
620
532
909
620
142
 run;

proc sql;
create table duplicates as 
select keys
from main
where keys in (select distinct client from sat);
select count(1) into :dupcnt
from duplicates;
run;


proc sql;
create table nonduplicates as 
select distinct keys
from main
where keys in (select distinct client from sat);
select count(1) into :nondupcnt
from nonduplicates;
run;

%Put Duplicate count: &dupcnt.;
%Put nonduplicate count: &nondupcnt.;

 

 

 

 

My Hash code is as below: (not working)

data duplicates nonduplicates (keep=client);
if _n_=1 then
		do;
		if 0 then set sat;
			declare hash w (dataset:"work.sat");
			rc=w.definekey('client');
			rc=w.definedata('client');
			rc=w.definedone();
		  end;
set work.main;
	rc= w.find(key:keys);
	if rc eq 0 then output duplicates;
	else output nonduplicates;

run;

many thanks.

 

regards,

Sebastian

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Here are some sample code based on your data.

data sat;
	input client $;
	datalines;
531 
620
531
908
143
 run;

data main;
	input keys $;
	datalines;
530 
620
532
909
620
142
 run;

data _null_;
	if _n_=1 then
		do;
			dcl hash sat(dataset:'work.sat(keep=client)', multidata:'y');
			sat.definekey('client');
			sat.definedone();
			length client $ 3;
			dcl hash uni();
			uni.definekey('keys');
			uni.definedone();
		end;

	set main end=last;

	if sat.check(key:keys)=0 then
		do;
			rc=uni.check();

			if rc ne 0 then
				do;
					unique+1;
					rc=uni.add();
				end;

			duplicate+1;
		end;

	if last then
		do;
			call symputx('dupcnt',duplicate);
			call symputx('nondupcnt',unique);
		end;
run;

%Put Total Matched count: &dupcnt.;
%Put Total Matched Distinct count: &nondupcnt.;

Notice that I have used "Total Matched count" and "Total Matched Distinct count" at the final output. Dup or nondup is a bit confusing. Good Luck and Happy Learning! 

 

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Its not clear to me why you are going to all that trouble.  This will create three tables, both has matches, p_sat has those only in sat, p_main only those in p_main.  You can then run freq's and things off this data to get counts?

data sat;
  input client $;
  datalines;
531 
620
531
908
620
143
;
run;

 data main;
  input keys $;
  datalines;
530 
620
532
909
620
142
;
run;

proc sort data=sat;  
  by client;
run;
proc sort data=main (rename=(keys=client));
  by client;
run;

data both p_sat p_main;
  merge sat (in=sat) main (in=main);
  by client;
  if sat and main then output both;
  else if sat then output p_sat;
  else output p_main;
run;
sebster24
Quartz | Level 8

Thank you for your reply RW9.

You are right in your post. However, I have multiple massive datasets, with require long waiting times. I could brew my coffee between each session 🙂

After browsing a few forums and doing a bit of reading, i found HASH tables to be a quicker solution.

I also, wanted to incoporate a min() and max() function to show the min max values of a given variable, with some further´if statements, but want to solve this one step at a time by myself, after i understand the basic code.

 

Steelers_In_DC
Barite | Level 11

An inexpensive book with examples is SAS Hash Object Programming Made Easy by Michele M. Burlew.  I am not a hash expert and am just learning myself but it's been helpful.

 

Best,

 

Mark

sebster24
Quartz | Level 8

THank you Mark. I have already done the purchase on Amazon 🙂

Haikuo
Onyx | Level 15

Here are some sample code based on your data.

data sat;
	input client $;
	datalines;
531 
620
531
908
143
 run;

data main;
	input keys $;
	datalines;
530 
620
532
909
620
142
 run;

data _null_;
	if _n_=1 then
		do;
			dcl hash sat(dataset:'work.sat(keep=client)', multidata:'y');
			sat.definekey('client');
			sat.definedone();
			length client $ 3;
			dcl hash uni();
			uni.definekey('keys');
			uni.definedone();
		end;

	set main end=last;

	if sat.check(key:keys)=0 then
		do;
			rc=uni.check();

			if rc ne 0 then
				do;
					unique+1;
					rc=uni.add();
				end;

			duplicate+1;
		end;

	if last then
		do;
			call symputx('dupcnt',duplicate);
			call symputx('nondupcnt',unique);
		end;
run;

%Put Total Matched count: &dupcnt.;
%Put Total Matched Distinct count: &nondupcnt.;

Notice that I have used "Total Matched count" and "Total Matched Distinct count" at the final output. Dup or nondup is a bit confusing. Good Luck and Happy Learning! 

 

sebster24
Quartz | Level 8

THank you for this solution Haikuo. Its friday, and more time to crack some extra SAS features 🙂

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1104 views
  • 1 like
  • 4 in conversation