Help using Base SAS procedures

Iterate over hash key

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Iterate over hash key

So I have a hash with id and date as its keys. I'd like to use these two keys to find data in "set" dataset. If the data can not be found using id and date, I'd like to see if it can be found using id and date+1. If not then try id and date+2, id and date+3 ... up to date+10. I'm not sure how to achieve this "if then" effect and how to iterate over hash keys. 

 

The dataset is really large, so I was suggested to use Hash rather than proc sql. Any idea?

 

Thanks!!


Accepted Solutions
Solution
‎11-01-2016 07:01 PM
Respected Advisor
Posts: 4,173

Re: Iterate over hash key

[ Edited ]

I'm not really sure that I understand what you have. What I've built on is: You have a very large data set and you have a not so large data set for look-up with {id,date}. You need to determine if an {id, date} combination in your very large data set matches with {id,date} in your lookup table - and you want to use a range from date to date+10 for this check.

 

The following code assumes that your large table is stored as a SAS dataset and not as a database table (else please tell us).

data have_rathersmall_lookup;
  do id=1,4,6;
    do date='25sep2016'd,'01oct2016'd;
      output;
    end;
  end;
  format date date9.;
run;

data have_large;
  do id=1 to 7;
    do date='01sep2016'd to '30oct2016'd;
      output;
    end;
  end;
  format date date9.;
run;

data want;
  set have_large;

  if _n_=1 then
    do;
      /* mapping step to ensure variables used in the hash have been defined */
      if 0 then set have_rathersmall_lookup(keep=id date);
      /* define hash */
      dcl hash h1(dataset:'have_rathersmall_lookup(keep=id date)', duplicate:'r');
      h1.defineKey('id','date');
      h1.defineDone();  
    end;

  found_flg=0;
  do date_inc=0 to 9;
    if h1.check(key:id, key:date+date_inc)=0 then 
      do;
        found_flg=1;
        leave;
      end;
  end;
  if date_inc=10 then call missing(date_inc);

run;

added comments to code for clarity based on follow up question

View solution in original post


All Replies
Solution
‎11-01-2016 07:01 PM
Respected Advisor
Posts: 4,173

Re: Iterate over hash key

[ Edited ]

I'm not really sure that I understand what you have. What I've built on is: You have a very large data set and you have a not so large data set for look-up with {id,date}. You need to determine if an {id, date} combination in your very large data set matches with {id,date} in your lookup table - and you want to use a range from date to date+10 for this check.

 

The following code assumes that your large table is stored as a SAS dataset and not as a database table (else please tell us).

data have_rathersmall_lookup;
  do id=1,4,6;
    do date='25sep2016'd,'01oct2016'd;
      output;
    end;
  end;
  format date date9.;
run;

data have_large;
  do id=1 to 7;
    do date='01sep2016'd to '30oct2016'd;
      output;
    end;
  end;
  format date date9.;
run;

data want;
  set have_large;

  if _n_=1 then
    do;
      /* mapping step to ensure variables used in the hash have been defined */
      if 0 then set have_rathersmall_lookup(keep=id date);
      /* define hash */
      dcl hash h1(dataset:'have_rathersmall_lookup(keep=id date)', duplicate:'r');
      h1.defineKey('id','date');
      h1.defineDone();  
    end;

  found_flg=0;
  do date_inc=0 to 9;
    if h1.check(key:id, key:date+date_inc)=0 then 
      do;
        found_flg=1;
        leave;
      end;
  end;
  if date_inc=10 then call missing(date_inc);

run;

added comments to code for clarity based on follow up question

Occasional Contributor
Posts: 5

Re: Iterate over hash key

[ Edited ]

Hi That's very helpful. But what does the line "if 0 then set want_lookup" mean and what is the want_lookup table? 

 

I tried this code, and it outputs a dataset that includes all the data. What I'd prefer is a dataset that include one indicator of the earlist date it is found. For example, if id and date are found in date0 ,there is no need to check whether it can be found in date 1, or 2 etc, and no need to output that info. Any idea?

Thanks!

Respected Advisor
Posts: 4,173

Re: Iterate over hash key

[ Edited ]

what does the line "if 0 then set want_lookup" mean

 

It should have been "if 0 then set have_rathersmall_lookup(keep=id date);"  That's a mapping step to define variables used by the hash. I've added some comments to the already posted code which hopefully will increase clarity.

 

"What I'd prefer is a dataset that include one indicator of the earlist date it is found."

Best would be if you could post some sample data and then explain the desired result. What would you expect for example if you have records for an ID where the earliest data doesn't match your look-up table? And what would you expect to get for ID's which never have a date in the range?

 

Whatever you need can be done - but it will be quite a bit of coding. To pre-sort your large table would make things much easier. How big is your table really? Is it worth all the extra coding logic.?

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 367 views
  • 0 likes
  • 2 in conversation