Hi,
I have two datasets, one of which is my main dataset and the other of which is essentially a lookup table. The main dataset is at the person/date level and has ID, date, the key variable that I want to connect to the lookup table, and a categorical variable. The lookup table has a variable matching the key variable (though it has a different name) and the variable that I want to pull in to give me information about the key variable.
Main Dataset
ID Date Key_var Categ_var
1 4/5/15 123 A
1 3/9/14 456 B
1 2/3/15 123
2 2/1/13 B
2 1/5/12 789 A
Lookup Table:
Lookup_key_var Oth_var
123 abc
456 def
789 ghi
The issue is that while I want to keep all of the rows in the main dataset, I only want to join to the lookup table those rows that have a value of A for the categorical variable (other rows have other values, including missing, for this variable).
Any help is much appreciated.
Use a format with conditional assignment
if cat ne 'A' then other_var=put(lookup, $char_fmt.);
Please show what you expect the output to be. There are a couple of different solutions depending on what that output looks like.
The output dataset would look like this:
Main Dataset
ID Date Key_var Categ_var
1 4/5/15 123 A
1 3/9/14 456 B
1 2/3/15 123
2 2/1/13 B
2 1/5/12 789 A
Lookup Table:
Lookup_key_var Oth_var
123 abc
456 def
789 ghi
Combined dataset:
ID Date Key_var Categ_var Oth_var
1 4/5/15 123 A abc
1 3/9/14 456 B
1 2/3/15 123
2 2/1/13 B
2 1/5/12 789 A ghi
In other words, exactly the same as the Main dataset except that for rows where Categ_var=A, the lookup table has provided values of Oth_var (linked to values of Key_var).
A left join would do the trick and IMHO the easiest to code.
You need a CASE statement in the Join I think, wouldn't you?
proc sql;
create table want as
select Main.*, lookup.oth_var
from main left join lookup on main.key_var = lookup.key_var;
quit;
How does that take into account when Categ_var=B?
proc sql;
create table want as
select Main.*, lookup.oth_var
from main left join lookup on main.key_var = lookup.key_var and categ_var = 'A';
quit;
Thanks
My pleasure
If the second data set is a reasonable size and stable build a table lookup with Proc Format and use the statement that Reeza suggested.
Proc format; value $char_fmt
‘123’=’abc’ ‘456’=’def’ ‘789’=’ghi’;
if cat_var =’A’ then oth_var=put(key_var,$char_fmt.); else oth_var=’ ‘;
Hi, a more prolix approach than the SQL idea (assumes variable in LOOKUP is named same as in MAIN, i.e. KEY_VAR)...
data main;
retain oth_var 'xxx';
declare hash h(dataset: "lookup");
h.defineKey('key_var');
h.defineData('oth_var');
h.definedone();
do until (last);
set main end=last;
if categ_var eq 'A' then h.find();
output;
call missing(oth_var);
end;
stop;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.