DATA Step, Macro, Functions and more

Array help

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Array help

In the program below, it works fine when I use just one variable (Diag_cd1) but when I try 5 variables in an array (Diag_cd1-Diag_cd5) it does not work giving me an error message about parsing WHERER clause with an underline under $DIAGL. Can an array be used for what I want to do? The Proc Format has been reduced here for posting. thanks. 

PROC FORMAT;

   VALUE $DIAGL

      "39891",

      "4280 "-"4289 " = "CHF"       /* Congestive heart failure */

  

      "09320"-"09324",

      "3940 "-"3971 ",

      "3979 ",

      "4240 "-"42499",

      "7463 "-"7466 ",

      "V422 ",

      "V433 "         = "VALVE"     /* Valvular disease */

      "41511"-"41519",

      "4160 "-"4169 ",

      "4179 "         = "PULMCIRC"  /* Pulmonary circ disorder;*/

 

"042  "-"0449 " = "AIDS"      /* HIV and AIDS */

;

data monday3;

set monday2;

where put (diag_cd1,$DIAGL.) eq 'AIDS';

run;

data

monday3;

set

monday2;

array

diagg(5) $ diag_cd1-diag_cd5;

DO

i = 1 TO 5;

where

putc(diagg(i),$DIAGL.) eq 'AIDS';

end

;

run

;


Accepted Solutions
Solution
‎11-17-2011 10:48 AM
PROC Star
Posts: 7,357

Array help

Your format only appears to contain the values for aids .. none of the other values appear to match your example data.  If you need to capture all of the records that contain any of the aids values as shown in your format, then you might be able to use something like:

data monday3;

  set monday2;

  array diagg(5) $ diag_cd1-diag_cd5;

  keep=0;

  do i = 1 TO 5;

    if put (diagg(i),$DIAGL5.) eq 'AIDS' then do;

      keep=1;

      leave;

    end;

  end;

  if keep;

run;

View solution in original post


All Replies
PROC Star
Posts: 7,357

Array help

An example dataset, in the form of a datastep, would help everyone to test your code.

Super User
Posts: 17,774

Array help

Can I assume you have you double tested your format code? I'm not sure about character ranges and how they map codes, specifically with the spaces as well.

I think there might be a bit of a logic flaw though trying to delete/keep records in the middle of a do loop without having iterated through the whole loop.

I'm guessing you want only patients who have a diag_cd that maps to AIDS. You could try using an if statement instead and creating a new variable keep that flags whether or not the record should be kept.

data

monday3;

set

monday2;

array

diagg(5) $ diag_cd1-diag_cd5;

keep_record=0;

DO

i = 1 TO 5;

if

putc(diagg(i),$DIAGL.) eq 'AIDS' then keep_record=1;

end;

if keep_record=1;

run;

Super User
Posts: 10,476

Array help

Did you notice that changed from a PUT to a PUTC?

Also are you wanting to include the record if any of the variables meet the condition or all of the variables meet the condition?

If so the loop approach you are using probably isn't going to work.

You may need to be a bit more explicit instead of using the array:

where [put(diag_cd1,$diagl.) eq 'AIDS'] OR [put(diag_cd2,$diagl.) eq 'AIDS'] or <continue through all variables>;

if ALL of the codes then replace the OR with AND.

Occasional Contributor
Posts: 10

Array help

Hello. Below I've included test data. The DIAG_CD1-DIAG_CD5 are all character 5 length. I am using PUTC because I understood that was for character. My intention with the coding is  if any of the five DIAG_CD are AIDS, I want to keep the observation. AIDS code is 04200. I will try the KEEP_RECORD suggestion. thanks. 

DIAG_CD1    DIAG_CD2    DIAG_CD3    DIAG_CD4    DIAG_CD5   

78210       70800       38290                                

E9390       78050       78050       25000       31100       

04200       78020       78020       30000       92000      

64683       64683       79650       04200                  

04200       72879       72879                                           

78900       33819       78900                               

V2210       V2389                                          

78060       78060       07999       78620                  

E9270       72450       84720                                

E8842       72450       92231                               

72950       72950                                           

78703       40190       78703       V4365                   

59970       59320       59970       59900       33829       

E9160       95940       92320       92310       30510      

E8842       95970       92420       98780                  

78620       47819                                          

84790       84790                                          

V0190       53081       V0190       V0190       53081       

E8160       95920       84090       84700       30510      

72380       72380       72310                              

78096       78096       72310       30000       49600     

78096       78096                                         

78330       78330       V4410       34390                 

34390       34390       34320                             

V6709       04200       59200       59390               

E8788       70780       70780       70920       74190    

52250       52250       25000       04200       V5867

Super Contributor
Posts: 1,636

Array help

data have;

infile cards missover;

input (DIAG_CD1    DIAG_CD2    DIAG_CD3    DIAG_CD4    DIAG_CD5) ($);   

cards;

78210       70800       38290                                

E9390       78050       78050       25000       31100       

04200       78020       78020       30000       92000      

64683       64683       79650       04200                  

04200       72879       72879                                           

78900       33819       78900                               

V2210       V2389                                          

78060       78060       07999       78620                  

E9270       72450       84720                                

E8842       72450       92231                               

72950       72950                                           

78703       40190       78703       V4365                   

59970       59320       59970       59900       33829       

E9160       95940       92320       92310       30510      

E8842       95970       92420       98780                  

78620       47819  

84790       84790                                          

V0190       53081       V0190       V0190       53081       

E8160       95920       84090       84700       30510      

72380       72380       72310                              

78096       78096       72310       30000       49600     

78096       78096                                         

78330       78330       V4410       34390                 

34390       34390       34320                             

V6709       04200       59200       59390               

E8788       70780       70780       70920       74190    

52250       52250       25000       04200       V5867

;

run;

data want;

  set have;

  if whichc('04200',of DIAG_CD1-DIAG_CD5) then output;

run;

Solution
‎11-17-2011 10:48 AM
PROC Star
Posts: 7,357

Array help

Your format only appears to contain the values for aids .. none of the other values appear to match your example data.  If you need to capture all of the records that contain any of the aids values as shown in your format, then you might be able to use something like:

data monday3;

  set monday2;

  array diagg(5) $ diag_cd1-diag_cd5;

  keep=0;

  do i = 1 TO 5;

    if put (diagg(i),$DIAGL5.) eq 'AIDS' then do;

      keep=1;

      leave;

    end;

  end;

  if keep;

run;

Occasional Contributor
Posts: 10

Array help

The coding from art297 works for me with the focus on Proc Format "AIDS" text. Thanks to all replying.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 265 views
  • 0 likes
  • 5 in conversation