BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
malena
Calcite | Level 5

I have a dataset of hospital discharges for 57 months (hd1-hd57) with character values (up to 20 in length) if there was a discharge. I want to select the records that have values. I wrote this code but it is not selecting all the records:

 

data kids_case;
set casemix_cancer;

 

array hd(57) cancer_HD1-cancer_HD57;

    do i=1 to dim(hd);
    date_hd=hd(i);
    if missing(date_hd) then cancer_hd=0;
    else cancer_hd=1;
     end;

 

if cancer_hd=1;

run;

 

the data is in this format

id $20, cancer_HD1-cancer_HD57 $20

 

So, one record has id=XXHGG and only has 2 discharges: cancer_HD1=2345 and cancer_HD45=58414528. The rest of cancer_HD are blanks.

 



1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Just to select the entire record, you could use:

 

data kids_case;

set casemix_cancer;

if cat(of cancer_hd1 - cancer_hd57) > ' ';

run;

 

If you are trying to take the existing records and find only the nonblank discharges, that would take a different approach.

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Post test data in the form of a datastep.   This really helps illustrate your problem.  At a guess, you could just concat the array and check for length?

if lengthn(cats(of hd_;)) > 0 then output;

I.e. if they are all blanks lengthn=0 so no output.

ballardw
Super User

The code you have tests all values in the array. The value of cancer_hd is always going to be the result of the last comparison after the do loop.

If you want to set cancer_hd=1 when any of the values are set then perhaps this is what you want;

data kids_case;
   set casemix_cancer;
   array hd(57) cancer_HD1-cancer_HD57;
   do i=1 to dim(hd);
      date_hd=hd(i);
      if missing(date_hd) then cancer_hd=0;
      else do;
         cancer_hd=1;
         leave;
      end;
   end;
   if cancer_hd=1;
run;

The LEAVE statement says to quit the do loop when encountered. So this will interupt the loop the first time if finds a non-missing value.

 

Tom
Super User Tom
Super User

Your logic is flawed. You should set the flag to false before the loop and then in the loop set it to true if any non missing value is found.  Your current code is effectly only testing the last value in the array.

 

But why not just use the CMISS() function to count how many variables have a missing value.

 cancer_hd = not (57 = cmiss(of cancerHD1-cancer_HD57));
Astounding
PROC Star

Just to select the entire record, you could use:

 

data kids_case;

set casemix_cancer;

if cat(of cancer_hd1 - cancer_hd57) > ' ';

run;

 

If you are trying to take the existing records and find only the nonblank discharges, that would take a different approach.

malena
Calcite | Level 5

thank you for all your suggestions! really appreciate it all your help

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 4110 views
  • 2 likes
  • 5 in conversation