Wolverine
Quartz | Level 8

I'm working with Medicaid data, and I have a series of variables that represent all of the diagnosis categories that the doctor billed for during each visit.  There are about 250 variables, and they each have a sequentially numbered variable name: DX_cat_1, DX_cat_2 ... DX_cat_250.  If any of these variables have a value of 5, then it is considered a visit that involved a mental health diagnosis.  So here is the code I attempted, which doesn't work:

DATA medicaid.&filename._DXcount;
/*Initialize variable to represent if the visit included a mental health diagnosis*/
MHvisit = 0;
DO i = 1 to 250 by 1;
    IF DX_cat_i = 5 THEN MHvisit = 1;
    OUTPUT;
END;
RUN;

Reeza
Super User

Close, but you need to declare an array for the variables. You can then loop over it or use the WHICHN function.

As written, you'll also output a line for every DX_cat variable, effectively transposing the data. Is that what you wanted? I'm assuming you only want the mental health visits.

DATA medicaid.&filename._DXcount;
/*Initialize variable to represent if the visit included a mental health diagnosis*/
MHvisit = 0;
array dx_cat(250) dx_cat_1-dx_cat_250;
DO i = 1 to 250 ;
    IF DX_cat(i) = 5 THEN MHvisit = 1;
END;
if MHvisit=1 then output;
RUN;

Or using the WHICHN function:

DATA medicaid.&filename._DXcount;
/*Initialize variable to represent if the visit included a mental health diagnosis*/
MHvisit = 0;
array dx_cat(250) dx_cat_1-dx_cat_250;
mhvisit=1;
if whichn(5, of dx_cat(*))>0 then output;
RUN;

Wolverine
Quartz | Level 8

No luck. I tried it both ways.  The DO loop outputs a file with 250 variables and 0 observations.  The whichn version outputs a file with 249 variables and 0 observations.

Here is the code I used (the DX_cat variables actually go up to 248).

DATA medicaid.&filename._DXcount;
/*Initialize variable to represent if the visit included a mental health diagnosis*/
MHvisit = 0;
array dx_cat(248) dx_cat_1-dx_cat_248;
DO i = 1 to 248;
    IF DX_cat(i) = 5 THEN MHvisit = 1;
END;
IF MHvisit=1 THEN output;
RUN;

DATA medicaid.&filename._DXcount;
/*Initialize variable to represent if the visit included a mental health diagnosis*/
MHvisit = 0;
array dx_cat(248) dx_cat_1-dx_cat_248;
mhvisit=1;
if whichn(5, of dx_cat(*))>0 then output;
RUN;

RW9
Diamond | Level 26

Hi,

Are the variables numeric?  That could be the problem.  Please post some test data where it doesn't work, because the following works fine:

data have;
  dx_cat1=1; dx_cat2=5; dx_cat3=4; output;
  dx_cat1=3; dx_cat2=2; dx_cat3=7; output;
run;

data want;
  set have;
  array dx_cat{3};
  if whichn(5, of dx_cat{*}) > 0 then output;
run;

Wolverine
Quartz | Level 8

Apparently the problem was that I didn't use separate input and output data sets.  Below is the final version of the code.  Thanks!

/*******************************************************************************/
DATA medicaid.&filename._MHvisit; SET medicaid.&filename._DXcount;
/*Initialize variable to represent if the visit included a mental health diagnosis*/
MHvisit = 0;
array dx_cat{248} dx_cat_1-dx_cat_248;
DO i = 1 to 248;
    IF DX_cat(i) = 5 THEN MHvisit = 1;
END;
IF MHvisit=1 THEN output;
RUN;
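
A note on why the earlier attempts produced 0 observations: those DATA steps had no SET statement, so no input rows were read. The step ran once, the ARRAY statement created the dx_cat variables with missing values, and the test for 5 could never be true. That also accounts for the variable counts reported above (248 dx_cat variables plus MHvisit and i make 250; without the loop counter, 249). The sketch below reproduces this on a small scale; the data set name work.no_input and the three-variable array are made up for illustration.

```sas
/* Hypothetical toy step: no SET statement, so nothing is read in */
data work.no_input;
    MHvisit = 0;
    /* The ARRAY statement creates these variables; all are missing */
    array dx_cat{3} dx_cat_1-dx_cat_3;
    do i = 1 to dim(dx_cat);
        if dx_cat{i} = 5 then MHvisit = 1;   /* missing never equals 5 */
    end;
    /* The explicit OUTPUT is never reached, so the result has
       5 variables (dx_cat_1-dx_cat_3, MHvisit, i) and 0 observations */
    if MHvisit = 1 then output;
run;
```

Adding a SET statement that points at the data set holding the DX_cat variables, as in the final version above, gives the step rows to test.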

RW9
Diamond | Level 26

And did you try the whichn() version?  It should be faster than a loop, and it is easier to read.
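
For reference, here is a minimal sketch of the whichn() approach with a separate input data set. The data set names work.have and work.want and the three-variable test data are assumptions for illustration, not the real Medicaid file. This variant keeps every visit and stores the result as a 0/1 flag rather than subsetting.

```sas
/* Hypothetical test data: three diagnosis-category variables */
data work.have;
    input dx_cat_1 dx_cat_2 dx_cat_3;
    datalines;
1 5 4
3 2 7
. . 5
;
run;

data work.want;
    set work.have;                       /* read the input rows */
    array dx_cat{*} dx_cat_1-dx_cat_3;
    /* WHICHN returns the position of the first argument equal to 5,
       or 0 if there is none; the comparison turns that into 1 or 0 */
    MHvisit = (whichn(5, of dx_cat{*}) > 0);
run;
```

With this test data, the first and third rows get MHvisit = 1 and the second gets 0. To keep only the mental health visits, a subsetting IF (`if MHvisit = 1;`) could be added after the assignment.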

ballardw
Super User

I would be very tempted to create an entirely new set of variables that are indicators.

Array dx_cat dx_cat: ;
array dx dx_1-dx_250 ;
do I = 1 to dim(dx_cat);
     /* index dx_cat with I; its value selects which indicator to set */
     if not missing(dx_cat[I]) then dx[dx_cat[I]]=1;
end;

dx_5 would be your mental health visits and a 1 indicates that visit had that treatment. I would hope you would be able to assign appropriate variable labels to all of the 250 categories.
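
A self-contained sketch of this indicator idea follows, under some assumptions that are not from the thread: the data set names, the three-variable test data, and an upper bound of 7 categories are all hypothetical, and the category codes are assumed to be whole numbers. It also sets the indicators to 0 first and guards the subscript, so that a missing or out-of-range code does not cause an array subscript error.

```sas
/* Hypothetical test data */
data work.have;
    input dx_cat_1 dx_cat_2 dx_cat_3;
    datalines;
1 5 4
3 2 7
;
run;

data work.indicators;
    set work.have;
    array dx_cat{*} dx_cat_1-dx_cat_3;
    array dx{7} dx_1-dx_7;               /* one indicator per category code */
    do i = 1 to dim(dx);
        dx{i} = 0;                       /* start every indicator at 0 */
    end;
    do i = 1 to dim(dx_cat);
        /* a missing code fails the range check, so it is skipped */
        if 1 <= dx_cat{i} <= dim(dx) then dx{dx_cat{i}} = 1;
    end;
    drop i;
run;
```

Here dx_5 is 1 for the first row and 0 for the second.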
