BookmarkSubscribeRSS Feed
prjadhav00
Calcite | Level 5

I am executing a query which has multiple columns in where clause which has multiple values. I know that in PROC SQL you can use IN condition to satisfy and get the correct output.

 

FAKE DATA

Question: CHF is defined using the following diagnosis codes:
398.91, 402.11, 402.91, 404.11, 404.13, 404.91, 404.93 (Codes are in CHARATER form)

 

1.Query: (I merge two tables)

proc sql;
create table BENE_CHF as
select distinct a.BENE_ID, b.ICD9_DGNS_CD_1, ICD9_DGNS_CD_2, ICD9_DGNS_CD_3, ICD9_DGNS_CD_4, ICD9_DGNS_CD_5,ICD9_DGNS_CD_6, ICD9_DGNS_CD_7, ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_1,
LINE_ICD9_DGNS_CD_2,
LINE_ICD9_DGNS_CD_3,
LINE_ICD9_DGNS_CD_4,
LINE_ICD9_DGNS_CD_5,                                              THIS QUERY IS WORKING
LINE_ICD9_DGNS_CD_6,
LINE_ICD9_DGNS_CD_7,
LINE_ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_9,
LINE_ICD9_DGNS_CD_10,
LINE_ICD9_DGNS_CD_11,
LINE_ICD9_DGNS_CD_12,
LINE_ICD9_DGNS_CD_13
from work.beneficiary_summary_file_2008 as a,
work.carrier_2008 as b
where a.BENE_ID = b.BENE_ID;
quit;

 

/***********************************************************************************************************/

Next sept is to sort the data according to ICD codes methioned above.

I tried different quries such as;

1.proc sql;
select *
from bene_chf
where ICD9_DGNS_CD_1 and
ICD9_DGNS_CD_2 and
ICD9_DGNS_CD_3 and
ICD9_DGNS_CD_4 and
ICD9_DGNS_CD_5 and
ICD9_DGNS_CD_6 and
ICD9_DGNS_CD_7 and
ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_1 and
LINE_ICD9_DGNS_CD_2 and
LINE_ICD9_DGNS_CD_3 and
LINE_ICD9_DGNS_CD_4 and
LINE_ICD9_DGNS_CD_5 and
LINE_ICD9_DGNS_CD_6 and
LINE_ICD9_DGNS_CD_7 and
LINE_ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_9 and
LINE_ICD9_DGNS_CD_10 and
LINE_ICD9_DGNS_CD_11 and
LINE_ICD9_DGNS_CD_12 and
LINE_ICD9_DGNS_CD_13 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830');
quit;

 

Not able to get a right output.

any suggestion or comments are highly appriciated.

4 REPLIES 4
Reeza
Super User

Data steps work better for this because Arrays allow you to loop through all the variables with a loop. SQL isn't a good solution. If you must use SQL a macro is what you want.

Your code as posted is not correct, it would need to be:

where ICD9_DGNS_CD_1 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830') or ICD9_DGNS_CD_2 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830') .....
;

Note that this wouldn't be a SORT though, it's a filter. If you need to sort the codes, you'll have to explain, but if you're trying to sort within a specific row for example, I'm not even sure SQL can do that easily. Again, this is trivial in a data step. 

 


@prjadhav00 wrote:

I am executing a query which has multiple columns in where clause which has multiple values. I know that in PROC SQL you can use IN condition to satisfy and get the correct output.

 

FAKE DATA

Question: CHF is defined using the following diagnosis codes:
398.91, 402.11, 402.91, 404.11, 404.13, 404.91, 404.93 (Codes are in CHARATER form)

 

1.Query: (I merge two tables)

proc sql;
create table BENE_CHF as
select distinct a.BENE_ID, b.ICD9_DGNS_CD_1, ICD9_DGNS_CD_2, ICD9_DGNS_CD_3, ICD9_DGNS_CD_4, ICD9_DGNS_CD_5,ICD9_DGNS_CD_6, ICD9_DGNS_CD_7, ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_1,
LINE_ICD9_DGNS_CD_2,
LINE_ICD9_DGNS_CD_3,
LINE_ICD9_DGNS_CD_4,
LINE_ICD9_DGNS_CD_5,                                              THIS QUERY IS WORKING
LINE_ICD9_DGNS_CD_6,
LINE_ICD9_DGNS_CD_7,
LINE_ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_9,
LINE_ICD9_DGNS_CD_10,
LINE_ICD9_DGNS_CD_11,
LINE_ICD9_DGNS_CD_12,
LINE_ICD9_DGNS_CD_13
from work.beneficiary_summary_file_2008 as a,
work.carrier_2008 as b
where a.BENE_ID = b.BENE_ID;
quit;

 

/***********************************************************************************************************/

Next sept is to sort the data according to ICD codes methioned above.

I tried different quries such as;

1.proc sql;
select *
from bene_chf
where ICD9_DGNS_CD_1 and
ICD9_DGNS_CD_2 and
ICD9_DGNS_CD_3 and
ICD9_DGNS_CD_4 and
ICD9_DGNS_CD_5 and
ICD9_DGNS_CD_6 and
ICD9_DGNS_CD_7 and
ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_1 and
LINE_ICD9_DGNS_CD_2 and
LINE_ICD9_DGNS_CD_3 and
LINE_ICD9_DGNS_CD_4 and
LINE_ICD9_DGNS_CD_5 and
LINE_ICD9_DGNS_CD_6 and
LINE_ICD9_DGNS_CD_7 and
LINE_ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_9 and
LINE_ICD9_DGNS_CD_10 and
LINE_ICD9_DGNS_CD_11 and
LINE_ICD9_DGNS_CD_12 and
LINE_ICD9_DGNS_CD_13 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830');
quit;

 

Not able to get a right output.

any suggestion or comments are highly appriciated.


 

prjadhav00
Calcite | Level 5
Hello Reeza,

Thanks for your suggestions. I tried the query that you provided but it's
not giving me the specific DGNS_CD which I am looking for. Could you
provide me with any other query related to this?
Reeza
Super User

@prjadhav00 wrote:
Hello Reeza,

Thanks for your suggestions. I tried the query that you provided but it's
not giving me the specific DGNS_CD which I am looking for. Could you
provide me with any other query related to this?

Post the code you tried. I only posted a partial snippet so it depends on how you completed the query. Please include the log as well. 

Astounding
PROC Star

It's possible you need to use SQL in your first step.  That depends on where the data resides.  But once the first step is complete, the second step could easily use a DATA step instead of SQL:

 

data want;
set bene_chf;
array codes {21} ICD9_DGNS_CD_1 - ICD9_DGNS_CD_8
LINE_ICD9_DGNS_CD_1 - LINE_ICD9_DGNS_CD_13;
do k=1 to 21;
   if codes{k} in ('39891', '40211', '40291', '40411', '40413', '40491', '400493', '42830') then do;
      output;
      delete;
end;
drop k;
run;

Since the original logic was not working, it's not 100% sure that this does exactly what you want.  But it uses good tools for the job, and it should be easy to tweak it if needed to get what you want.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

New Learning Events in April

 

Join us for two new fee-based courses: Administrative Healthcare Data and SAS via Live Web Monday-Thursday, April 24-27 from 1:00 to 4:30 PM ET each day. And Administrative Healthcare Data and SAS: Hands-On Programming Workshop via Live Web on Friday, April 28 from 9:00 AM to 5:00 PM ET.

LEARN MORE

Discussion stats
  • 4 replies
  • 2106 views
  • 2 likes
  • 3 in conversation