BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
theponcer
Quartz | Level 8

I am having trouble getting a fuzzy hash merge to work on dates. The price table is a list of prices that have changed over time. For example, Item A cost $1 from 1/1/2018 to 6/1/2018, and $2 from 6/2/2018 to 12/31/2018. The receipts table is a list of items that were purchased, and the date they were purchased. The code below is attempting to match the two datasets, but it does not appear to be using the key fields to match - instead, it only matches on the date ranges. What am I doing wrong? Let me know if I need to provide more information. 

 

data merge;

if 0 then set prices;

  /* On the first iteration of the step, declare and instantiate   */
  /* the hash object H and hash iterator HITER.  Use the DATASET:  */
  /* argument tag to load WORK.TWO into H. Define the 'lookup key' */
  /* for the hash table H using DEFINEKEY.  Define the KEY's       */
  /* associated values with DEFINEDATA.  The KEY is not            */
  /* automatically output to the data set, so add STARTRANGE as    */
  /* data so it will be included in WORK.OUT.                      */ 
  if _n_=1 then do;
    declare hash h(dataset:'prices');
    h.definekey('location','item','start_date');
    h.definedata('location','price','start_date','end_date','name');
    declare hiter hiter('h');
    h.definedone();

  end;
  set receipts;

  /* Move to the first item in the hash table using the method FIRST()*/
  rc=hiter.first();

  /* For each observation, iterate through the hash table */
  /* until the value of date is between the value of start_date */
  /* and end_date.                                                    */
  do while (rc=0);
    if start_dt le date le end_dt then leave;
    rc=hiter.next();
  end;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

As @novinosrin suggests, this can be done.  But it would be a much simpler program if you would create a format instead of a hash table.  The format would define ranges, such as

 

start = location + item + start date (unformatted)

end = location + item + end date (unformatted)

label = price

 

Not only could you save the format permanently if you wanted to re-use it, but  you could retrieve a price with one long statement (or 2 short statements) in a DATA step.  (Creating the format originally takes a little more work.)

 

If you want more details, just ask.

 

 

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

HI @theponcer  Why don't you plz provide a sample of your INPUT and expected OUTPUT dataset so that somebody can code, test and post the appropriate solution 

theponcer
Quartz | Level 8

 

image.png

 

 

image.png

novinosrin
Tourmaline | Level 20

HI @theponcer  Thanks but can you post that as plain text to make it easy for copy/paste plz

theponcer
Quartz | Level 8

Input

 

 

 

 

Prices

 

 

 

 

Store

Item

Price

From

To

A

1

$1

1/1/2018

6/1/2018

A

1

$2

6/2/2018

12/31/2018

A

2

$10

1/1/2018

12/31/2018

B

1

$0.75

1/1/2018

12/31/2018

B

2

$8.00

1/1/2018

6/1/2018

B

2

$9

6/2/2018

12/31/2018

 

 

 

 

 

Receipts

 

 

 

 

Store

Item

Date

 

 

A

1

3/21/2018

 

 

B

1

12/20/2018

 

 

A

1

7/4/2018

 

 

 

Output

 

 

 

Merge Test

 

 

Store

Item

Date

Price

A

1

3/21/2018

$1

B

1

12/20/2018

$0.75

A

1

7/4/2018

$2

novinosrin
Tourmaline | Level 20

Hi @theponcer  See if this works

 




data Prices;
input Store	$ Item	Price :comma12.2	From :mmddyy10.	To :mmddyy10.;
format Price dollar10.2 From mmddyy10.	To mmddyy10.;
cards;
A	1	$1 	1/1/2018	6/1/2018
A	1	$2 	6/2/2018	12/31/2018
A	2	$10 	1/1/2018	12/31/2018
B	1	$0.75 	1/1/2018	12/31/2018
B	2	$8.00 	1/1/2018	6/1/2018
B	2	$9 	6/2/2018	12/31/2018
;

data Receipts;
input Store $	Item	Date :mmddyy10.;
format Date mmddyy10.;
cards;
A	1	3/21/2018
B	1	12/20/2018
A	1	7/4/2018
;


data want_merge;
if 0 then set prices;
  if _n_=1 then do;
    declare hash h(dataset:'prices',multidata:'y');
    h.definekey('store','item');
    h.definedata('from','to','price');
/*    declare hiter hiter('h');*/
	h.definedone();
 end;
 set Receipts;
 call missing(price);
 do rc=h.find() by 0 while(rc=0);
 if   from le date le to then leave;
 rc=h.find_next();
 end;
 drop rc from to;
 run;
Astounding
PROC Star

As @novinosrin suggests, this can be done.  But it would be a much simpler program if you would create a format instead of a hash table.  The format would define ranges, such as

 

start = location + item + start date (unformatted)

end = location + item + end date (unformatted)

label = price

 

Not only could you save the format permanently if you wanted to re-use it, but  you could retrieve a price with one long statement (or 2 short statements) in a DATA step.  (Creating the format originally takes a little more work.)

 

If you want more details, just ask.

 

 

theponcer
Quartz | Level 8

Awesome! I should've thought of this. Thank you!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1222 views
  • 1 like
  • 3 in conversation