BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sebster24
Quartz | Level 8

Hello,

I am trying to create a hash code, which does the following:

1) A table lookup with output showing matches,

2) Provide a count of the output table (duplicate and nonduplicate) and store value into a variable,

I have provided the code in proc sql format which works. THis is the output that i want.

I inserted the hash code as well, for any corrections/ changes.

I am really keen on learning Hash programming - would u recommend any website/ books with examples?

All help would be deeply appreciated.

 

my proc sql code that works:

data sat;
  input client $;
  datalines;
531 
620
531
908
620
143
 run;

 data main;
  input keys $;
  datalines;
530 
620
532
909
620
142
 run;

proc sql;
create table duplicates as 
select keys
from main
where keys in (select distinct client from sat);
select count(1) into :dupcnt
from duplicates;
run;


proc sql;
create table nonduplicates as 
select distinct keys
from main
where keys in (select distinct client from sat);
select count(1) into :nondupcnt
from nonduplicates;
run;

%Put Duplicate count: &dupcnt.;
%Put nonduplicate count: &nondupcnt.;

 

 

 

 

My Hash code is as below: (not working)

data duplicates nonduplicates (keep=client);
if _n_=1 then
		do;
		if 0 then set sat;
			declare hash w (dataset:"work.sat");
			rc=w.definekey('client');
			rc=w.definedata('client');
			rc=w.definedone();
		  end;
set work.main;
	rc= w.find(key:keys);
	if rc eq 0 then output duplicates;
	else output nonduplicates;

run;

many thanks.

 

regards,

Sebastian

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Here are some sample code based on your data.

data sat;
	input client $;
	datalines;
531 
620
531
908
143
 run;

data main;
	input keys $;
	datalines;
530 
620
532
909
620
142
 run;

data _null_;
	if _n_=1 then
		do;
			dcl hash sat(dataset:'work.sat(keep=client)', multidata:'y');
			sat.definekey('client');
			sat.definedone();
			length client $ 3;
			dcl hash uni();
			uni.definekey('keys');
			uni.definedone();
		end;

	set main end=last;

	if sat.check(key:keys)=0 then
		do;
			rc=uni.check();

			if rc ne 0 then
				do;
					unique+1;
					rc=uni.add();
				end;

			duplicate+1;
		end;

	if last then
		do;
			call symputx('dupcnt',duplicate);
			call symputx('nondupcnt',unique);
		end;
run;

%Put Total Matched count: &dupcnt.;
%Put Total Matched Distinct count: &nondupcnt.;

Notice that I have used "Total Matched count" and "Total Matched Distinct count" at the final output. Dup or nondup is a bit confusing. Good Luck and Happy Learning! 

 

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Its not clear to me why you are going to all that trouble.  This will create three tables, both has matches, p_sat has those only in sat, p_main only those in p_main.  You can then run freq's and things off this data to get counts?

data sat;
  input client $;
  datalines;
531 
620
531
908
620
143
;
run;

 data main;
  input keys $;
  datalines;
530 
620
532
909
620
142
;
run;

proc sort data=sat;  
  by client;
run;
proc sort data=main (rename=(keys=client));
  by client;
run;

data both p_sat p_main;
  merge sat (in=sat) main (in=main);
  by client;
  if sat and main then output both;
  else if sat then output p_sat;
  else output p_main;
run;
sebster24
Quartz | Level 8

Thank you for your reply RW9.

You are right in your post. However, I have multiple massive datasets, with require long waiting times. I could brew my coffee between each session 🙂

After browsing a few forums and doing a bit of reading, i found HASH tables to be a quicker solution.

I also, wanted to incoporate a min() and max() function to show the min max values of a given variable, with some further´if statements, but want to solve this one step at a time by myself, after i understand the basic code.

 

Steelers_In_DC
Barite | Level 11

An inexpensive book with examples is SAS Hash Object Programming Made Easy by Michele M. Burlew.  I am not a hash expert and am just learning myself but it's been helpful.

 

Best,

 

Mark

sebster24
Quartz | Level 8

THank you Mark. I have already done the purchase on Amazon 🙂

Haikuo
Onyx | Level 15

Here are some sample code based on your data.

data sat;
	input client $;
	datalines;
531 
620
531
908
143
 run;

data main;
	input keys $;
	datalines;
530 
620
532
909
620
142
 run;

data _null_;
	if _n_=1 then
		do;
			dcl hash sat(dataset:'work.sat(keep=client)', multidata:'y');
			sat.definekey('client');
			sat.definedone();
			length client $ 3;
			dcl hash uni();
			uni.definekey('keys');
			uni.definedone();
		end;

	set main end=last;

	if sat.check(key:keys)=0 then
		do;
			rc=uni.check();

			if rc ne 0 then
				do;
					unique+1;
					rc=uni.add();
				end;

			duplicate+1;
		end;

	if last then
		do;
			call symputx('dupcnt',duplicate);
			call symputx('nondupcnt',unique);
		end;
run;

%Put Total Matched count: &dupcnt.;
%Put Total Matched Distinct count: &nondupcnt.;

Notice that I have used "Total Matched count" and "Total Matched Distinct count" at the final output. Dup or nondup is a bit confusing. Good Luck and Happy Learning! 

 

sebster24
Quartz | Level 8

THank you for this solution Haikuo. Its friday, and more time to crack some extra SAS features 🙂

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2382 views
  • 1 like
  • 4 in conversation