I am executing a query which has multiple columns in where clause which has multiple values. I know that in PROC SQL you can use IN condition to satisfy and get the correct output.
FAKE DATA
Question: CHF is defined using the following diagnosis codes:
398.91, 402.11, 402.91, 404.11, 404.13, 404.91, 404.93 (Codes are in CHARATER form)
1.Query: (I merge two tables)
proc sql;
create table BENE_CHF as
select distinct a.BENE_ID, b.ICD9_DGNS_CD_1, ICD9_DGNS_CD_2, ICD9_DGNS_CD_3, ICD9_DGNS_CD_4, ICD9_DGNS_CD_5,ICD9_DGNS_CD_6, ICD9_DGNS_CD_7, ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_1,
LINE_ICD9_DGNS_CD_2,
LINE_ICD9_DGNS_CD_3,
LINE_ICD9_DGNS_CD_4,
LINE_ICD9_DGNS_CD_5, THIS QUERY IS WORKING
LINE_ICD9_DGNS_CD_6,
LINE_ICD9_DGNS_CD_7,
LINE_ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_9,
LINE_ICD9_DGNS_CD_10,
LINE_ICD9_DGNS_CD_11,
LINE_ICD9_DGNS_CD_12,
LINE_ICD9_DGNS_CD_13
from work.beneficiary_summary_file_2008 as a,
work.carrier_2008 as b
where a.BENE_ID = b.BENE_ID;
quit;
/***********************************************************************************************************/
Next sept is to sort the data according to ICD codes methioned above.
I tried different quries such as;
1.proc sql;
select *
from bene_chf
where ICD9_DGNS_CD_1 and
ICD9_DGNS_CD_2 and
ICD9_DGNS_CD_3 and
ICD9_DGNS_CD_4 and
ICD9_DGNS_CD_5 and
ICD9_DGNS_CD_6 and
ICD9_DGNS_CD_7 and
ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_1 and
LINE_ICD9_DGNS_CD_2 and
LINE_ICD9_DGNS_CD_3 and
LINE_ICD9_DGNS_CD_4 and
LINE_ICD9_DGNS_CD_5 and
LINE_ICD9_DGNS_CD_6 and
LINE_ICD9_DGNS_CD_7 and
LINE_ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_9 and
LINE_ICD9_DGNS_CD_10 and
LINE_ICD9_DGNS_CD_11 and
LINE_ICD9_DGNS_CD_12 and
LINE_ICD9_DGNS_CD_13 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830');
quit;
Not able to get a right output.
any suggestion or comments are highly appriciated.
Data steps work better for this because Arrays allow you to loop through all the variables with a loop. SQL isn't a good solution. If you must use SQL a macro is what you want.
Your code as posted is not correct, it would need to be:
where ICD9_DGNS_CD_1 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830') or ICD9_DGNS_CD_2 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830') .....
;
Note that this wouldn't be a SORT though, it's a filter. If you need to sort the codes, you'll have to explain, but if you're trying to sort within a specific row for example, I'm not even sure SQL can do that easily. Again, this is trivial in a data step.
@prjadhav00 wrote:
I am executing a query which has multiple columns in where clause which has multiple values. I know that in PROC SQL you can use IN condition to satisfy and get the correct output.
FAKE DATA
Question: CHF is defined using the following diagnosis codes:
398.91, 402.11, 402.91, 404.11, 404.13, 404.91, 404.93 (Codes are in CHARATER form)
1.Query: (I merge two tables)
proc sql;
create table BENE_CHF as
select distinct a.BENE_ID, b.ICD9_DGNS_CD_1, ICD9_DGNS_CD_2, ICD9_DGNS_CD_3, ICD9_DGNS_CD_4, ICD9_DGNS_CD_5,ICD9_DGNS_CD_6, ICD9_DGNS_CD_7, ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_1,
LINE_ICD9_DGNS_CD_2,
LINE_ICD9_DGNS_CD_3,
LINE_ICD9_DGNS_CD_4,
LINE_ICD9_DGNS_CD_5, THIS QUERY IS WORKING
LINE_ICD9_DGNS_CD_6,
LINE_ICD9_DGNS_CD_7,
LINE_ICD9_DGNS_CD_8,
LINE_ICD9_DGNS_CD_9,
LINE_ICD9_DGNS_CD_10,
LINE_ICD9_DGNS_CD_11,
LINE_ICD9_DGNS_CD_12,
LINE_ICD9_DGNS_CD_13
from work.beneficiary_summary_file_2008 as a,
work.carrier_2008 as b
where a.BENE_ID = b.BENE_ID;
quit;
/***********************************************************************************************************/
Next sept is to sort the data according to ICD codes methioned above.
I tried different quries such as;
1.proc sql;
select *
from bene_chf
where ICD9_DGNS_CD_1 and
ICD9_DGNS_CD_2 and
ICD9_DGNS_CD_3 and
ICD9_DGNS_CD_4 and
ICD9_DGNS_CD_5 and
ICD9_DGNS_CD_6 and
ICD9_DGNS_CD_7 and
ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_1 and
LINE_ICD9_DGNS_CD_2 and
LINE_ICD9_DGNS_CD_3 and
LINE_ICD9_DGNS_CD_4 and
LINE_ICD9_DGNS_CD_5 and
LINE_ICD9_DGNS_CD_6 and
LINE_ICD9_DGNS_CD_7 and
LINE_ICD9_DGNS_CD_8 and
LINE_ICD9_DGNS_CD_9 and
LINE_ICD9_DGNS_CD_10 and
LINE_ICD9_DGNS_CD_11 and
LINE_ICD9_DGNS_CD_12 and
LINE_ICD9_DGNS_CD_13 in ('39891', '40211', '40291', '40411', '40413', '40491', '40493', '42830');
quit;
Not able to get a right output.
any suggestion or comments are highly appriciated.
@prjadhav00 wrote:
Hello Reeza,
Thanks for your suggestions. I tried the query that you provided but it's
not giving me the specific DGNS_CD which I am looking for. Could you
provide me with any other query related to this?
Post the code you tried. I only posted a partial snippet so it depends on how you completed the query. Please include the log as well.
It's possible you need to use SQL in your first step. That depends on where the data resides. But once the first step is complete, the second step could easily use a DATA step instead of SQL:
data want;
set bene_chf;
array codes {21} ICD9_DGNS_CD_1 - ICD9_DGNS_CD_8
LINE_ICD9_DGNS_CD_1 - LINE_ICD9_DGNS_CD_13;
do k=1 to 21;
if codes{k} in ('39891', '40211', '40291', '40411', '40413', '40491', '400493', '42830') then do;
output;
delete;
end;
drop k;
run;
Since the original logic was not working, it's not 100% sure that this does exactly what you want. But it uses good tools for the job, and it should be easy to tweak it if needed to get what you want.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need courses to help you with SAS Life Sciences Analytics Framework, SAS Health Cohort Builder, or other topics? Check out the Health and Life Sciences learning path for all of the offerings.