BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Cruise
Ammonite | Level 13

I'd like to look at the patients who did not take certain codes, in this example: 1,2,6 and create data WANT. So I will be able to assess the patients who did not take the codes.

 

data have;
input id code;
cards;
1	1
1	2
1	3
1	3
1	5
2	1
2	2
2	2
2	4
2	5
3	3
3	4
3	5
4   8
4   9
5   6
5   10
;

data want;
input id code;
cards;
3	3
3	4
3	5
4   8
4   9
;

 I don't patients' info anymore as long as they were defined having 1,2 or 6. I tried the code below, but it doesn't exclude the patients who has codes of 1,2 and 6.

 

proc sql;
create table want as
select * from have 
where id in 
    (select id from have where code not in (1,2,6));
quit;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
proc sql;
create table want as
select * from have
where id not in
(select id from have where code in (1,2,6));
quit;

What about that?

View solution in original post

11 REPLIES 11
Reeza
Super User
proc sql;
create table want as
select * from have
where id not in
(select id from have where code in (1,2,6));
quit;

What about that?
Cruise
Ammonite | Level 13
Thanks! I see a bad logic in my code.
ChrisNZ
Tourmaline | Level 20

Your logic is wrong.

ID=1 is kept since it has CODE=3.

This works:

proc sql;
  create table WANT as
  select * from HAVE 
  where ID not in 
    (select unique ID from HAVE where CODE in (1,2,6));
quit;

Boolean logic requires careful thinking. 🙂

Cruise
Ammonite | Level 13

@ChrisNZ and @Reeza

 

both codes produced the same result with N=46,840,259 using my actual datasets. 

 

proc sql;
create table want_Reeza as
select * from have
where id not in
(select id from have where code in (1,2,6));
quit;

proc sql;
create table WANT_cris as
select * from HAVE
where ID not in
(select unique ID from HAVE where CODE in (1,2,6));
quit;

Cruise
Ammonite | Level 13
I guess Chris's version would deduplicate ID to distinct ones had I had the same IDs multiple times.
ChrisNZ
Tourmaline | Level 20

I only return each ID once in the list. My reply is basically the same as @Reeza's, I didn't see there already was an answer.

Cruise
Ammonite | Level 13

@Reeza ,@ChrisNZ@

 

I'm PUZZLED here. So I expect to end up with mutually exclusive datasets to defined vs not defined. And n of rows in final datasets suppose to add up to n of rows in the SOURCE_DATA. However, it doesn't and also there is no missing in CODE variable in the source data. 

 

15,101+947,133 do NOT add up to 997,910

while

15,2014 + 608 do NOT add up to 17,068.

 

Here is something wrong.

 

Any hints, as to why? Any guess or pointers?

 

proc sql;
create table not_defined /*N=15,101*/ as
select * from SOURCE_DATA /*N=997,910*/
where display_id not in 
(select display_id from med_dx where substr(code,1,3) in 
('153','154','159','C18','C19','C78','D01','C20','C21','C7A','239',
 '197','209','230','211','235','Z12','199','D12','D37','D49'));
quit;

proc sql;
create table defined /*N=947,133*/ as
select * from SOURCE_DATA /*N=997,910*/
where display_id in 
(select display_id from med_dx_mmyy where substr(code,1,3) in ('153','154','159','C18','C19','C78','D01','C20','C21',
'C7A','239','197','209','230','211','235','Z12','199','D12','D37','D49'));
quit;

/*VALIDATING*/

proc sql;
select count(distinct display_id)
from SOURCE_DATA; /*N=17,068*/
quit;

proc sql;
select count(distinct display_id)
from defined; /*N=15,214 defined*/
quit;

proc sql;
select count(distinct display_id)
from not_defined; /*N=608 not defined*/
quit;

 

 

ChrisNZ
Tourmaline | Level 20

Are you querying SAS tables?

If not, the issue is probably null values, which satisfy neither condition.

Reeza
Super User
You may have some categories not included in your list. Check the codes in your not defined dataset.
ChrisNZ
Tourmaline | Level 20

Also note that a join would be much faster. Something like

proc sql;
  create table UNDEFINED as
  select * 
from SOURCE_DATA source left join (select unique DISPLAY_ID
from med_dx_mmyy
where substr(code,1,3) in ('153','154','159','C18','C19','C78','D01','C20','C21', 'C7A','239','197','209','230','211','235','Z12','199','D12','D37','D49')) list on source.DISPLAY_ID=list.DISPLAY_ID where list.DISPLAY_ID is null; quit;

 

 

Cruise
Ammonite | Level 13

@ChrisNZ@Reeza

 

This solved the problem


data; set;
code=substr(omm_dx_cd,1,3);
run;

 

proc sql;
create table undefined as
select * from source_data
where display_id not in
(select display_id from source_data where code in
('153','154','159','C18','C19','C78','D01','C20','C21','C7A','239', '197','209','230','211','235','Z12','199','D12','D37','D49'));
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1961 views
  • 5 likes
  • 3 in conversation