PROC SQL QUESTION

Reply
Super Contributor
Posts: 1,040

PROC SQL QUESTION

Hi,

I have a question with the SQL code below. The code runs fine but I am not very sure if its picking what I wanted.

Will the AND condition work fine and give me all the specified???

Importantly, My concern is that the data from "ACCOUNT" TABLE I am looking at might not have something with those codes but my Lookup table (DIM_Diag) might have those unused codes...

So wanted to check if I am doing correctly and the braces are put in a proper manner. also is there any better way to write the line indicated in RED??

proc sql;

create table NEW as

select

DDIAG.ICD9_Code

FROM Account FHA

left join Diagnosis AS FDIAG ON FDIAG.account = fha.account

left join DIM_Diag DDIAG on FDIAG.DiagKey = DDIAG.DiaKey

where DischKey between 20000101 and 20150430

and (DDIAG.ICD9_Code in ('042', '42') or  dx_code like '42.%' or dx_code like '042.%');

quit;

Contributor RVA
Contributor
Posts: 21

Re: PROC SQL QUESTION

Which table is the "DischKey" variable in?  If it is in the FHA table, it may be fine.  But if it is in the FDIAG or DDIAG table, using the "DischKey" in the where clause like you did will turn the left join query into an inner join query.  The same for "dx_code" variable.

Just a suggestion: it may be better if you put the table where a variable comes from in front of the variable in order to avoid confusion.  For example, the (?) below indicates where the source table should be referenced.

proc sql;

create table NEW as

select DDIAG.ICD9_Code

FROM Account FHA

left join Diagnosis FDIAG

ON FDIAG.account = FHA.account

left join DIM_Diag DDIAG

on FDIAG.DiagKey = DDIAG.DiaKey

where (?)DischKey between 20000101 and 20150430

and (DDIAG.ICD9_Code in ('042', '42') or  (?)dx_code like '42.%' or (?)dx_code like '042.%');

quit;

Super Contributor
Posts: 1,040

Re: PROC SQL QUESTION

Thanks so much for the reply.

Apart from the aliasing the table name is there any other syntax I need to correct ?especially in the AND statement and braces being used correctly???

Also OR conditions being used Within the AND condition??? etc etc

Regards

Super User
Super User
Posts: 7,424

Re: PROC SQL QUESTION

Its quite difficult to say without seeing the data, and knowing what you want to achieve.

where (?)DischKey between 20000101 and 20150430

and (DDIAG.ICD9_Code in ('042', '42') or  (?)dx_code like '42.%' or (?)dx_code like '042.%');

The above will work with the alias put in.  The where will take effect after the joins.  You could however simplify your code somewhat by putting the various restrictions in the table calls:

proc sql;

  create table NEW as

  select    DDIAG.ICD9_Code

  from      Account FHA

  left join  Diagnosis FDIAG

  on         FDIAG.account = fha.account

  left join  (select * from DIM_Diag where ICD9_CODE in ("042","42") or DX_CODE like '%42.%') DDIAG

  on         FDIAG.DiagKey = DDIAG.DiaKey

  where    DischKey between 20000101 and 20150430;

quit;

Also note, alignements, spacing indetation etc. does help readability.  I also recommend variables and tables in upper, all other in lower, but haven't changed that here. 

Super Contributor
Posts: 1,040

Re: PROC SQL QUESTION

I wanted

42.XXXXXXX(anything following 42.)

042.XXXXXXXXXX(anything following 042.)

When u used '%42.%' I will get unwanted 39242.937.

So I think we cannot avoid the two OR cases..

Could you please check this for me again

left join  (select * from DIM_Diag where ICD9_CODE in ("042","42") or DX_CODE like '%42.%') DDIAG

Super User
Super User
Posts: 7,424

Re: PROC SQL QUESTION

Yes:

proc sql;

  create table NEW as

  select    DDIAG.ICD9_Code

  from      Account FHA

  left join  Diagnosis FDIAG

  on         FDIAG.account = fha.account

  left join  (select * from DIM_Diag where ICD9_CODE in ("042","42") or DX_CODE like '42.%' or DX_CODE like '042.%') DDIAG

  on         FDIAG.DiagKey = DDIAG.DiaKey

  where    DischKey between 20000101 and 20150430;

quit;

Should work.  If ICD9_CODE is 042 or 42 then the row will be present regardless of DX_CODE.  Then the other or, then other  in that sequence.  I would be cautious on the text fields though, why would you have 042 and 42 as they are the same.  Might be worth having both a numerical representation of these as well as character.  It would also shrink your where:

  left join  (select * from DIM_Diag where int(ICD9_CODE)=42 or int(DX_CODE)=42) DDIAG

Ask a Question
Discussion stats
  • 5 replies
  • 268 views
  • 3 likes
  • 3 in conversation