I have a table with observations and I want to assign those to specific categories based on rules. My sample data is e.g. the following: create table work.observations
(id char(2),
can_fly num,
furry num,
levitates num)
;
insert into work.observations
values('AB', 0, 0, 0)
values('AC', 1, 0, 0)
values('AD', 0, 1, 0)
values('BC', ., 1, 0)
values('BD', ., 0, 0)
; Now I want to assign the species. For this example, assume there are three possibilities, a penguin, sparrow, or a cat. If the animal is furry, I want to assign "cat". If it doesn't, it's a bird, which one is determined by whether or not it can fly. If I already know it's a cat, I don't care about the value of the "can_fly" column. I could do this for one species at a time in the following way: %let species = cat;
%let furry = (1);
%let can_fly = (0, 1, .);
%let levitates = (0);
proc sql;
create table work.assigned_observations as
select distinct
o.id,
o.can_fly,
o.furry,
o.levitates,
case when o.can_fly in &can_fly.
and o.furry in &furry.
and o.levitates in &levitates.
then "&species." end as species
from work.observations o
;
quit;run; However, I want to do this for all at once. I was thinking of writing the possible settings to a file (e.g. excel), then loading this into a table and perform a join. However, I haven't been able to get the possibilities as lists as in the above example, so the best I could get working includes string comparisons, which is inefficient. See the following: proc sql;
create table work.species
(species char(10),
can_fly char(10),
furry char(10),
levitates char(10))
;
insert into work.species
values('penguin', '(0)', '(0)', '(0)')
values('cat', '(0, 1, .)', '(1)', '(0)')
values('sparrow', '(1)', '(0)', '(0)')
;
title 'Different species'
select * from work.species
proc printto; run;
proc sql;
create table work.assigned_observations as
select distinct
o.id,
o.can_fly,
o.furry,
o.levitates,
s.species
from work.observations o
left join work.species s
on s.can_fly like catx('', '%', put(o.can_fly, 11.), '%')
and s.furry like catx('', '%', put(o.furry, 11.), '%')
and s.levitates like catx('', '%', put(o.levitates, 11.), '%')
;
quit;run; How can I improve this, while keeping the flexibility of having multiple options and keeping the input file (with the settings for the different species) readable?
... View more