ID Log_Code
11 A
12 A
12 B
13 A
14 A
14 B
I want to eliminate both rows that meet the criteria of having 'B' as a log code. So in this case I want to eliminate ID 12 and 14. How can I do this in a SAS datastep or proc sql
There are a few ways, but this is probably the most intuitive. It assumes your data are sorted by ID:
data want;
merge have
have (keep=id log_code where=(log_code='B') in=drop_these);
by id;
if drop_these then delete;
run;
It is conceivable you will need to ignore messages indicating a many-to-many merge.
The SQL coders here can show you a way that may be just as intuitive to you ... depends on whether SQL or DATA steps are a stronger area for you.
Good luck.
Maybe Im missing something but you should be able to just use a where statement in either case:
Data new;
set old;
where log_code ne "B";
run;
proc sql;
select *
from old
where log_code ne "B"
;
quit;
If the varying padded log_codes are how the data appears then you might have to use strip(log_code) to remove leading and trailing spaces.
Hope this helps!
EJ
There are a few ways, but this is probably the most intuitive. It assumes your data are sorted by ID:
data want;
merge have
have (keep=id log_code where=(log_code='B') in=drop_these);
by id;
if drop_these then delete;
run;
It is conceivable you will need to ignore messages indicating a many-to-many merge.
The SQL coders here can show you a way that may be just as intuitive to you ... depends on whether SQL or DATA steps are a stronger area for you.
Good luck.
SQL solution -
data have;
input id $3. log_code $3.;
datalines;
11 A
12 A
12 B
13 A
14 A
14 B
run;
proc sql;
create table want as
select *
from have
where id not in (select distinct id
from have
where log_code in ('B'));
quit;
You can also simply do:
proc sql;
create table want as
select * from have group by id having sum(log_code='B') <1; quit;
Haikuo
Like EJ, I'm having trouble understanding what needs done here. I know it's been answered multiple times now, but out of curiosity, I'd like to understand what's being asked here exactly.
As far as I can tell, we simply want to exclude anything with log_code = 'B'? Correct? Is that the case?
No, it's more complex. If log_code='B', delete all observations for that ID, including those that have log_code='A'.
Ok. I see. Now it makes sense. Thanks.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.