sweetpeaindeed
Calcite | Level 5
I'm trying to find a way to delete observations when particular codes are not found across more than one variable. For example, I want to delete observations that do not have the ICD-9 code for hip fracture (8208) in 9 different diagnosis code variables. In other words, if one observation doesn't have the code 8208 listed in the first variable (diagnosis_code_1) but does in the second variable (diagnosis_code_2), then that observation is deleted. My current code seems to delete the observation whenever the relevant codes are not listed in the first variable (diagnosis_code_1). What am I doing wrong?


data medicare_proc;
set medicare_final;
if diagnosis_code_1 not in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_2 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_3 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_4 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_5 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_6 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_8 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_9 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
drg_code not in (209 236)
then delete;
run;
2 REPLIES
sbb
Lapis Lazuli | Level 10
First, I suggest using a WHERE statement instead of IF, for a possible performance gain. You may also want to consider using a SAS view.
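As a rough illustration of the WHERE and view idea, assuming the diagnosis variables are numeric as in the original post: WHERE keeps rows rather than deleting them, so the condition is written the other way around from an IF ... THEN DELETE test. The sample rows, the view name medicare_view, and the particular condition are made up for the example and are not the poster's real selection rule.

```sas
/* Hypothetical sample rows, not the poster's real file */
data medicare_final;
  input diagnosis_code_1 diagnosis_code_2 drg_code;
  datalines;
82085 8208 209
8208 8208 209
8208 82005 236
8208 82005 100
;
run;

/* WHERE filters rows as they are read from medicare_final.        */
/* The VIEW= option stores the step as a view instead of a copy.   */
data medicare_view / view=medicare_view;
  set medicare_final;
  where diagnosis_code_1 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209)
    and drg_code in (209 236);
run;

/* The view is evaluated here, when it is actually read */
proc print data=medicare_view;
run;
```

Because the view holds no data of its own, the filter is re-applied against medicare_final each time the view is read.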

Then review your use of OR versus AND after your first test. Given what you have explained, you will need to surround the remaining tests with parentheses, so that the logic reads: the first variable is not-in-your-list *AND* one of the remaining SAS variables is-in-your-list.
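One possible reading of this advice, sketched with only three of the nine diagnosis variables and made-up sample rows: the first test is joined with AND, and the remaining IN tests are grouped inside parentheses. Whether this matches the selection rule the poster really wants is an assumption; the other diagnosis variables would follow the same pattern.

```sas
/* Hypothetical sample rows for illustration only */
data medicare_final;
  input diagnosis_code_1 diagnosis_code_2 diagnosis_code_3 drg_code;
  datalines;
82085 8208 4019 209
8208 8208 4019 209
82085 4019 4019 209
8208 4019 4019 100
;
run;

data medicare_proc;
  set medicare_final;
  /* Delete when code 1 is NOT in the list AND a later code IS in the list, */
  /* or when the DRG code is outside (209 236).                             */
  if (diagnosis_code_1 not in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) and
      (diagnosis_code_2 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
       diagnosis_code_3 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209)))
     or drg_code not in (209 236)
  then delete;
run;
```

Without the inner parentheses, each OR branch would be able to trigger the delete on its own, which is what the original IF statement does.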

Also, consider using a SAS macro variable (defined with a %LET statement) to hold the list of values you want to test, presuming the list is the same for every variable.
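A minimal sketch of the macro variable idea, using the code list from the question and made-up sample rows. The name hip_codes is hypothetical, and the IF logic is left exactly as posted (shortened to two diagnosis variables); only the repeated list is factored out.

```sas
/* Hypothetical macro variable name; the values are the codes from the question */
%let hip_codes = 82000 82001 82002 82003 82009 82020 82021 82022 8208 8209;

/* Hypothetical sample rows for illustration only */
data medicare_final;
  input diagnosis_code_1 diagnosis_code_2 drg_code;
  datalines;
82085 8208 209
8208 8208 209
8208 82005 209
;
run;

data medicare_proc;
  set medicare_final;
  /* The macro variable is replaced by the code list before the step compiles */
  if diagnosis_code_1 not in (&hip_codes) or
     diagnosis_code_2 in (&hip_codes) or
     drg_code not in (209 236)
  then delete;
run;
```

With the list defined once, a change to the set of codes only has to be made in the %LET statement.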

Scott Barry
SBBWorks, Inc.
Message was edited by: sbb
SPR
Quartz | Level 8
Hello SweetPeaIndeed,

I tried your program on a simplified version of your data and it works as expected:
[pre]
data medicare_final;
input diagnosis_code_1 diagnosis_code_2 drg_code;
datalines;
82085 8208 209
8208 8208 209
8208 82005 209
;
run;
data medicare_proc;
set medicare_final;
if diagnosis_code_1 not in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
diagnosis_code_2 in (82000 82001 82002 82003 82009 82020 82021 82022 8208 8209) or
drg_code not in (209 236)
then delete;
run;
[/pre]
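The output dataset contains only the last observation from medicare_final. The first observation is deleted because its code is not in the list for diagnosis_code_1 (it is in the list for diagnosis_code_2, which would also trigger the delete). The second one is deleted because, although diagnosis_code_1 is in the list, diagnosis_code_2 is in the list as well. So it looks like the problem is not in your program but may be in the data?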
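If the data is the suspect, one thing that may be worth checking, offered as a guess rather than a diagnosis, is whether the diagnosis variables are stored as numeric or character, since the IN lists in the program use numeric literals. A minimal sketch with the same made-up sample rows, using PROC CONTENTS and PROC FREQ:

```sas
/* Hypothetical sample rows; point the PROC steps at the real dataset instead */
data medicare_final;
  input diagnosis_code_1 diagnosis_code_2 drg_code;
  datalines;
82085 8208 209
8208 8208 209
8208 82005 209
;
run;

/* Reports each variable's type (Num or Char) and length */
proc contents data=medicare_final;
run;

/* Lists the values actually present, including missing values */
proc freq data=medicare_final;
  tables diagnosis_code_1 diagnosis_code_2 drg_code / missing;
run;
```

On a large file, the PROC FREQ tables could be restricted to one or two variables at a time.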
Sincerely,
SPR
