BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
malena
Calcite | Level 5

I have a dataset of hospital discharges for 57 months (hd1-hd57) with character values (up to 20 in length) if there was a discharge. I want to select the records that have values. I wrote this code but it is not selecting all the records:

 

data kids_case;
set casemix_cancer;

 

array hd(57) cancer_HD1-cancer_HD57;

    do i=1 to dim(hd);
    date_hd=hd(i);
    if missing(date_hd) then cancer_hd=0;
    else cancer_hd=1;
     end;

 

if cancer_hd=1;

run;

 

the data is in this format

id $20, cancer_HD1-cancer_HD57 $20

 

So, one record has id=XXHGG and only has 2 discharges: cancer_HD1=2345 and cancer_HD45=58414528. The rest of cancer_HD are blanks.

 



1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Just to select the entire record, you could use:

 

data kids_case;

set casemix_cancer;

if cat(of cancer_hd1 - cancer_hd57) > ' ';

run;

 

If you are trying to take the existing records and find only the nonblank discharges, that would take a different approach.

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Post test data in the form of a datastep.   This really helps illustrate your problem.  At a guess, you could just concat the array and check for length?

if lengthn(cats(of hd_;)) > 0 then output;

I.e. if they are all blanks lengthn=0 so no output.

ballardw
Super User

The code you have tests all values in the array. The value of cancer_hd is always going to be the result of the last comparison after the do loop.

If you want to set cancer_hd=1 when any of the values are set then perhaps this is what you want;

data kids_case;
   set casemix_cancer;
   array hd(57) cancer_HD1-cancer_HD57;
   do i=1 to dim(hd);
      date_hd=hd(i);
      if missing(date_hd) then cancer_hd=0;
      else do;
         cancer_hd=1;
         leave;
      end;
   end;
   if cancer_hd=1;
run;

The LEAVE statement says to quit the do loop when encountered. So this will interupt the loop the first time if finds a non-missing value.

 

Tom
Super User Tom
Super User

Your logic is flawed. You should set the flag to false before the loop and then in the loop set it to true if any non missing value is found.  Your current code is effectly only testing the last value in the array.

 

But why not just use the CMISS() function to count how many variables have a missing value.

 cancer_hd = not (57 = cmiss(of cancerHD1-cancer_HD57));
Astounding
PROC Star

Just to select the entire record, you could use:

 

data kids_case;

set casemix_cancer;

if cat(of cancer_hd1 - cancer_hd57) > ' ';

run;

 

If you are trying to take the existing records and find only the nonblank discharges, that would take a different approach.

malena
Calcite | Level 5

thank you for all your suggestions! really appreciate it all your help

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 4585 views
  • 2 likes
  • 5 in conversation