Solved: Re: SAS HASH update only blank records

AP718 · Posted 03-01-2021 02:48 PM

Using SAS hash to merge/update records. How can I use hash to only update the records that are missing on my set dataset. Which in the below example will be ID's - 1, 2, 4 and 6? And without setting ID# 3, 5, 7 and 8 to a blank as it's blank on my define dataset (doesn't need an update). Thanks!

ID#	Date
1	.
2	.
3	1/1/2021
4	.
5	7/10/2020
6	.
7	3/1/2019
8	5/2/2017

AP718 · Posted 03-02-2021 09:02 AM

I found the solution by adding a second hash. The first HASH retains the DATE values from Table 2 and the second hash updates only the DATE values on Table 2 where the DATE is not missing. Which is exactly what I wanted. Helps to be passionate or "obsessed" to solve problems and find solutions.

data Want;
drop rc;
length Flag $35 Comment $35;

if _n_=1 then
do;
declare hash lookup(dataset:"Have");
lookup.defineKey("ID");
lookup.defineData("Gender"
,"Code"
,"Class"
,"Flag"
,"Comment");
lookup.definedone();
call missing(Flag,Comment, DATE);

declare hash lookup2(dataset:"Have (where=(not missing(DATE))");

lookup2.defineKey("ID");
lookup2.defineData("Date")
lookup2.definedone();

end;

set Need_Update;

rc=lookup.find();

if rc eq 0 then do;

rc=lookup2.find();

end;

run;

View solution in original post

data_null__ · Posted 03-01-2021 02:59 PM

I'm guessing an IF missing(DATE) then do; UPDATE; end;

Don't you reckon?

AP718 · Posted 03-01-2021 03:07 PM

OKay----where in the hash statement will this be declared? Below is the syntax I am using. Thanks!
data ;
drop rc;
length ;

if _n_=1 then
do;
declare hash lookup(dataset:"");
lookup.defineKey("");
lookup.defineData("");
lookup.definedone();
call missing();
end;
set ;
rc=lookup.find();
run;

data_null__ · Posted 03-01-2021 03:38 PM

   set ;
   if (this obs needs update) then rc=lookup.find();
   run;

You have not provided enough details for me to know what "this obs needs update" actually is. Perhaps

if missing(date) then rc=lookup.find();

AP718 · Posted 03-01-2021 04:14 PM

I have multiple vars I am attempting to update. For the Date var I only want to update where it's missing. Below is the code I am using and the result I get. How can I update all the vars and only the instance where the Date is missing? Thanks

data Want;
drop rc;
length Flag $35 Comment $35;

if _n_=1 then
do;
declare hash lookup(dataset:"Have");
lookup.defineKey("ID");
lookup.defineData("Date"
,"Gender"
,"Code"
,"Class"
,"Flag"
,"Comment");
lookup.definedone();
call missing(Flag,Comment);
end;

set Need_Update;

rc=lookup.find();
run;

Have

ID#	Date	Gender	Code	Class
1	.	F	AB
2	.	F
3	1/1/2021	M		MIDDLE
4	.			MIDDLE
5	7/10/2020	M	AB
6	.	M		HIGH
7	3/1/2019		AB
8	5/2/2017		AB	HIGH

Want

ID#	Date	Gender	Code	Class
1	7/1/2020	F	AB	LOW
2	7/2/2020	F	CD	LOW
3	1/1/2021	M	CD	MIDDLE
4	1/2/2021	F	CD	MIDDLE
5	7/10/2020	M	AB	LOW
6	7/11/2020	M	DE	HIGH
7	3/1/2019	F	AB	HIGH
8	5/2/2017	F	AB	HIGH

What I am getting
ID#	Date	Gender	Code	Class
1	7/1/2020	F	AB	LOW
2	7/2/2020	F	CD	LOW
3		M	CD	MIDDLE
4	1/2/2021	F	CD	MIDDLE
5		M	AB	LOW
6	7/11/2020	M	DE	HIGH
7		F	AB	HIGH
8		F	AB	HIGH

AP718 · Posted 03-01-2021 04:34 PM

Correction Below is what I Have and the output I am getting on my Need_Update dataset

Have
ID#	Date	Gender	Code	Class
1	.	F	AB	LOW
2	.	F	CD	LOW
3	1/1/2021	M	CD	MIDDLE
4	.	F	CD	MIDDLE
5	7/10/2020	M	AB	LOW
6	.	M	DE	HIGH
7	3/1/2019	F	AB	HIGH
8	5/2/2017	F	AB	HIGH

Need_Update
ID#	Date	Gender	Code	Class
1	7/1/2020	F	AB	LOW
2	7/2/2020	F	CD	LOW
3		M	CD	MIDDLE
4	1/2/2021	F	CD	MIDDLE
5		M	AB	LOW
6	7/11/2020	M	DE	HIGH
7		F	AB	HIGH
8		F	AB	HIGH

data_null__ · Posted 03-01-2021 05:27 PM

Which data set is MASTER and which is CHANGE/TRANSACTION?

AP718 · Posted 03-01-2021 08:50 PM

Sorry for the confusion. Let me try again:

Table 1 is the table I am using to update/merge the missing values in Table 2.

Table 3 is what I get as the result. The values from Table 1 replaces the values on Table 2. How can I only update values if missing on Table 2? In this instance ID# 1, 2, 4 and 6 will be retained after the merge.

Table 4 is what I want. Thank you

Table 1
ID#	Date	Gender	Code	Class
1	.	F	AB	LOW
2	.	F	CD	LOW
3	1/1/2021	M	CD	MIDDLE
4	.	F	CD	MIDDLE
5	7/10/2020	M	AB	LOW
6	.	M	DE	HIGH
7	3/1/2019	F	AB	HIGH
8	5/2/2017	F	AB	HIGH

Table 2
ID#	Date	Gender	Code	Class
1	7/1/2020	F	AB
2	7/2/2020	F
3		M		MIDDLE
4	1/2/2021			MIDDLE
5		M	AB
6	7/11/2020	M		HIGH
7			AB
8			AB	HIGH

Table 3
ID#	Date	Gender	Code	Class
1	.	F	AB	LOW
2	.	F	CD	LOW
3	1/1/2021	M	CD	MIDDLE
4	.	F	CD	MIDDLE
5	7/10/2020	M	AB	LOW
6	.	M	DE	HIGH
7	3/1/2019	F	AB	HIGH
8	5/2/2017	F	AB	HIGH

Table 4
ID#	Date	Gender	Code	Class
1	7/1/2020	F	AB	LOW
2	7/2/2020	F	CD	LOW
3	1/1/2021	M	CD	MIDDLE
4	1/2/2021	F	CD	MIDDLE
5	7/10/2020	M	AB	LOW
6	7/11/2020	M	DE	HIGH
7	3/1/2019	F	AB	HIGH
8	5/2/2017	F	AB	HIGH

Tom · Posted 03-02-2021 12:46 AM

I don't understand what HASH objects have to do with this problem.

A simple UPDATE statement should handle it, but you need to reverse how you are thinking about the data. You are not updating TWO with the looked up values from ONE. You are overriding the default values in ONE with the actual values from TWO. So TWO is your transaction file. Any non-missing value replace the value from ONE. And a missing value means leave the value from ONE in place.

ata one ;
  input ID Date :mmddyy. Gender $ Code $ Class $;
  format date yymmdd10.;
cards;
1 . F AB LOW
2 . F CD LOW
3 1/1/2021 M CD MIDDLE
4 . F CD MIDDLE
5 7/10/2020 M AB LOW
6 . M DE HIGH
7 3/1/2019 F AB HIGH
8 5/2/2017 F AB HIGH
;

data two;
  input ID Date :mmddyy. Gender $ Code $ Class $;
  format date yymmdd10.;
cards;
1 7/1/2020 F AB . 
2 7/2/2020 F . .
3        . M . MIDDLE
4 1/2/2021 . . MIDDLE
5        . M AB .  
6 7/11/2020 M .  HIGH
7  .  . AB  .
8  .  . AB HIGH
;
data want ;
  update two one;
  by id;
run;

Results:

AP718 · Posted 03-02-2021 06:56 AM

I am using Table 1 to update the missing values in Table 2 as most of the actual values don't exist in Table 2. But I only want to replace the missing values in Table 2 if missing and retain the values if not missing. I am using hash for processing as it's a large table. Is there a way to do this with a hash statement? Thanks

Kurt_Bremser · Posted 03-02-2021 07:09 AM

Don't be obsessed with a certain tool. See if @Tom's suggestion works. If not, supply data that illustrates where the result deviates from what you expected.

And do supply example data in usable form (data step(s) with datalines), as demonstrated.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

AP718 · Posted 03-02-2021 09:02 AM

I found the solution by adding a second hash. The first HASH retains the DATE values from Table 2 and the second hash updates only the DATE values on Table 2 where the DATE is not missing. Which is exactly what I wanted. Helps to be passionate or "obsessed" to solve problems and find solutions.

data Want;
drop rc;
length Flag $35 Comment $35;

if _n_=1 then
do;
declare hash lookup(dataset:"Have");
lookup.defineKey("ID");
lookup.defineData("Gender"
,"Code"
,"Class"
,"Flag"
,"Comment");
lookup.definedone();
call missing(Flag,Comment, DATE);

declare hash lookup2(dataset:"Have (where=(not missing(DATE))");

lookup2.defineKey("ID");
lookup2.defineData("Date")
lookup2.definedone();

end;

set Need_Update;

rc=lookup.find();

if rc eq 0 then do;

rc=lookup2.find();

end;

run;

Registration is open

SAS Training: Just a Click Away