Solved: Re: Multiple variables for same condition PROC SQL

eabc0351 · Posted 08-09-2017 08:46 PM

Hello, I have a large healthcare administrative dataset and have multiple variables that will use the same condition. Previously, I used the following code to create a variable for diabetes:

PROC SQL;
CREATE TABLE MOM_2 AS
SELECT *, CASE WHEN
MOM_ID IN (SELECT MOM_ID FROM MOM_1
WHERE COMPRESS (DX_CODE) BETWEEN "XX.XX" AND "XX.XX")
THEN 1
ELSE 0
END AS DIABETES
FROM MOM_1;
QUIT;

Now I have multiple DX_CODE variables associated with multiple visits. Example:

DX_CODE1_VISIT1

DX_CODE1_VISIT2

DX_CODE2_VISIT1

DX_CODE2_VISIT2

DX_CODE3_VISIT1

DX_CODE3_VISIT2

My previous PROC SQL statement is not working when I include multiple variables in the WHERE statement. For example:

PROC SQL;
CREATE TABLE MOM_1 AS
SELECT *, CASE WHEN
MOM_ID IN (SELECT MOM_ID FROM VISIT1
WHERE COMPRESS (DX_CODE1_VISIT1, DX_CODE2_VISIT1, DX_CODE1_VISIT2, DX_CODE2_VISIT1.....and so on ) BETWEEN "XX.XX" AND "XX.XX")
THEN 1
ELSE 0
END AS DIABETES
FROM MOM_2;
QUIT;

Is there a way to write multiple variables in the statement?

If so, is it possible to put a range for the DX_CODE variables, rather than writing them each separately (to save time)?. For example:

DX_CODE1_VISIT1-DX_CODE1_VISIT10, DX_CODE2_VISIT1-DX_CODE2_VISIT10...

If including multple variables is not possible, any suggestions?

Thank you.

OnSAS · Posted 08-09-2017 10:02 PM

You can go all macro on it, here is a sketch:

%macro t;

PROC SQL;
CREATE TABLE MOM_2 AS
SELECT *

%do i=1 %to 3;

%do j=1 %to 4;

,CASE WHEN
MOM_ID IN (SELECT MOM_ID FROM MOM_1
WHERE COMPRESS (DX_CODE&i._visit&j.) BETWEEN "XX.XX" AND "XX.XX")
THEN 1
ELSE 0
END AS DIABETES_&i._&j.

%end;

END AS DIABETES
FROM MOM_1;
QUIT;

%mend t;

%t;

View solution in original post

OnSAS · Posted 08-09-2017 10:02 PM

You can go all macro on it, here is a sketch:

%macro t;

PROC SQL;
CREATE TABLE MOM_2 AS
SELECT *

%do i=1 %to 3;

%do j=1 %to 4;

,CASE WHEN
MOM_ID IN (SELECT MOM_ID FROM MOM_1
WHERE COMPRESS (DX_CODE&i._visit&j.) BETWEEN "XX.XX" AND "XX.XX")
THEN 1
ELSE 0
END AS DIABETES_&i._&j.

%end;

END AS DIABETES
FROM MOM_1;
QUIT;

%mend t;

%t;

eabc0351 · Posted 08-10-2017 09:07 AM

Excellent, it worked, thank you. Next question, the table it created MOM_2 has multiple columns:

DIABETES_1_1 DIABETES_1_2... DIABETES_2_1...

How do I collapse the columns so that if the mom has diabetes in any of the diagnosis slots, in any of the visits, then she has diabetes (1=yes) . Basically I want to collapse all the columns I just created.

Thanks!

OnSAS · Posted 08-10-2017 02:30 PM

Here, I think you are clearly better of using a data step.

in the data step you can create a new variable that takes the min of all the ones you've created. For example:

IndDiab = min (of DIABETES_1_1 -- DIABETES_<n>_<m>);

replace n & m with values

IndDiab will take 0 when none of the variable in the list is 1.

good luck.

eabc0351 · Posted 08-10-2017 02:33 PM

Great, thanks so much for your help!

OnSAS · Posted 08-10-2017 05:15 PM

You are welcome! glad I was able to.

RW9 · Posted 08-10-2017 04:11 AM

Yes, the problem here is your fighting SQL. SQL is a language developed to work with normalised data, i.e. data going down a table rather than across. Its really not suited to using data which is transposed in nature - going across the dataset. Now in Base SAS you have constructs to work with this type of data, however its still not optimal.

So you have two options - and as you have not provided any Test data in the form of a datastep in the body of the post!! - I can't go further into other than this:

1) Still using SQL, then normalise the DX_CODE list so that visit is a separate variable that goes down th etable like:

VISIT DX_CODE

1 XYZ

...

This is how you should model your data if you want to use SQL (and note, if this is Healthcare, then why are you not using CDISC models anyways?)

2) Drop SQL, and goto datastep, something like:

data...;
  set mom_1;
  array dx_code{6};
  if visit=1 then x=1;
  if visit=2 then x=3;
  else x=5;
  if dx_code=dx_code{x} or dx_code=dx_code{x+1} then output;
run;

SAS Innovate 2025: Call for Content

Classroom Training Available!