Hi,
I have two datasets, one of which is my main dataset and the other of which is essentially a lookup table. The main dataset is at the person/date level and has ID, date, the key variable that I want to connect to the lookup table, and a categorical variable. The lookup table has a variable matching the key variable (though it has a different name) and the variable that I want to pull in to give me information about the key variable.
Main Dataset
ID Date Key_var Categ_var
1 4/5/15 123 A
1 3/9/14 456 B
1 2/3/15 123
2 2/1/13 B
2 1/5/12 789 A
Lookup Table:
Lookup_key_var Oth_var
123 abc
456 def
789 ghi
The issue is that while I want to keep all of the rows in the main dataset, I only want to join to the lookup table those rows that have a value of A for the categorical variable (other rows have other values, including missing, for this variable).
Any help is much appreciated.
Use a format with conditional assignment
if cat ne 'A' then other_var=put(lookup, $char_fmt.);
Please show what you expect the output to be. There are a couple of different solutions depending on what that output looks like.
The output dataset would look like this:
Main Dataset
ID Date Key_var Categ_var
1 4/5/15 123 A
1 3/9/14 456 B
1 2/3/15 123
2 2/1/13 B
2 1/5/12 789 A
Lookup Table:
Lookup_key_var Oth_var
123 abc
456 def
789 ghi
Combined dataset:
ID Date Key_var Categ_var Oth_var
1 4/5/15 123 A abc
1 3/9/14 456 B
1 2/3/15 123
2 2/1/13 B
2 1/5/12 789 A ghi
In other words, exactly the same as the Main dataset except that for rows where Categ_var=A, the lookup table has provided values of Oth_var (linked to values of Key_var).
A left join would do the trick and IMHO the easiest to code.
You need a CASE statement in the Join I think, wouldn't you?
proc sql;
create table want as
select Main.*, lookup.oth_var
from main left join lookup on main.key_var = lookup.key_var;
quit;
How does that take into account when Categ_var=B?
proc sql;
create table want as
select Main.*, lookup.oth_var
from main left join lookup on main.key_var = lookup.key_var and categ_var = 'A';
quit;
Thanks
My pleasure
If the second data set is a reasonable size and stable build a table lookup with Proc Format and use the statement that Reeza suggested.
Proc format; value $char_fmt
‘123’=’abc’ ‘456’=’def’ ‘789’=’ghi’;
if cat_var =’A’ then oth_var=put(key_var,$char_fmt.); else oth_var=’ ‘;
Hi, a more prolix approach than the SQL idea (assumes variable in LOOKUP is named same as in MAIN, i.e. KEY_VAR)...
data main;
retain oth_var 'xxx';
declare hash h(dataset: "lookup");
h.defineKey('key_var');
h.defineData('oth_var');
h.definedone();
do until (last);
set main end=last;
if categ_var eq 'A' then h.find();
output;
call missing(oth_var);
end;
stop;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.