I'm new to SAS and have a problem that I would solve using SQL cursors in other languages but can't find any reference to them in SAS. Does SAS have a similar concept or can anyone recommend another technique to solve the following.
I have 2 datasets; the first containing a number of customer-related variables including UK postcode and the second containing a list of postcode changes (the UK post office does change geographical post codes from time to time). The postcode on the customer file is that recorded when the customer first registered. This could, subsequently, have changed and we need to run some analysis on current postcodes.
The second file contains 3 variables; old post code, new post code and date of change. It's possible for a postcode to change multiple times so postcode A could become postcode B, which could then later change to postcode C. In such a case we'd want to record the current postcode of any customer originally registered at postcode A as being postcode C.
I'd like to be able to take each record on the customer file in turn and walk the postcode changes file to be able to track the chain of changes made for that customer's postcode.
The only way of doing this that I've found so far is to have a series of iterative joins but there has to be a more efficient way of doing this.
Based on the work presented in SUGI paper 232-31, "Crafting your own index" (http://www2.sas.com/proceedings/sugi31/232-31.pdf), the following approach may work:
/*Let's create some dummy data:*/
/*1. Postcode changes */
input dt_chg date7. pc $ pc_new $;
format dt_chg date.;
01Jan08 A11AA A11AB
01Jan08 B11AA B11AB
01Jan09 A11AB A11AC
/*2. Customer addresses */
input nm $ dt_reg date7. pc $;
format dt_reg date.;
Smith 01Aug07 A11AA
Jones 01Aug08 A11AB
Brown 01Aug08 A11AA
Dixon 01Aug08 B11AA
/* Now we index the postcode change file on old post code */
create index pc on pc_chgs(pc);
/* Now the tricksy bit */
/* Use a data step with two input data sets: cust and pc_chgs */
/* The key= option to specify that pc_chgs is a lookup table */
/* Uncomment "and dt_reg lt dt_chg", below, if postcodes are re-used */
data cust_chgs(keep=nm dt_reg dt_chg pc_old pc_new);
set cust (rename = (pc = _pc));
length pc $8.;
if _pc = lag(_pc) then do;
pc = 'ZZZ';
_error_ = 0;
pc = _pc;
do _iorc_ = 0 by 0
until ( _iorc_ ne 0);
if _iorc_ eq 0 /* and dt_reg lt dt_chg */ then do;
pc_old = pc;
pc = pc_new;
else _error_ = 0;
fetch: set pc_chgs
This should give you three or six records, depending on whether you allow postcodes to be recycled.
Base SAS doesn't currently have cursor support. As far as I know, none of the non-base SAS products have it either, but I'm not certain about that. There may be something similar to cursors in a future version (something called DS2, but I haven't heard anything about that recently).
Many tasks that are sometimes solved with cursors can also be solved with joins, and that's the usual SAS approach. That doesn't work, though, if you need recursion or other iterative processing, and SAS's version of SQL doesn't support the standard recursion operators.
JackHamilton - I suspected that this was the case and had coded a recursive select similar to the example that you have provided, if nowhere near as elegant.
ProcMe - the lookup table approach seems a much more efficient way of tackling this problem and I've managed to tweak your code to get it to do exactly what I want. However, I'm not comfortable using this yet as I'm not sure I completely understand what's going on in a couple of places.
The main point of confusion is in checking the value of _pc against that in the previous observation. At first sight I couldn't figure out why this was necessary but then I attempted to write a modified version of your program using data where the customer postcode was the same in consecutive records and noticed that the value of _iorc_ was not zero for the second record. Is this "if" statement a way of preventing this by setting to a junk value and performing a lookup on this? Do you know why this is necessary?
Well done getting it to work, you're spot on: we need to set to a junk value to reset the index search - the SUGI paper covers this in detail and explains it better than I can.
Transferring from SQL to SAS can be frustrating until you start figuring out that SAS has better - or at least different - ways of doing things. One tool I found really helpful was the archive of SUGI papers. If you search for Proc SQL papers you'll find quite a wealth of information. Most of them are written from the point of view of introducing SAS coders to SQL, but Ifound they work quite well the other way round.