JUMMY
Obsidian | Level 7
 data ss;
   infile datalines;
   input Patient_Number Encounter_Number Birth_Date Diagnosis_1$ Diagnosis_2$ Diagnosis_3$ Diagnosis_4$ Diagnosis_5$;
 datalines;
   
    1 1 31JUL1975 250.7 250.7 785.2
    1 2 31JUL1975 250.3 250.3 288.8 995.93 466 250.1
    1 3 31JUL1975 250.3 250.3 271.6                   288.8
    1 4 31JUL1975 250.3 250.3            250.1
    1 5 31JUL1975 250.1 250.1
   

  
Array diag {5} $1 diagnosis_1-diagnosis_5;
	keto=0;
	do i=1 to 5;
			if substr(diag[i],1,5) in ('250.1') then keto=1;
	end;
drop i;
run;

If the first five positions of the 5 diagnoses are "250.1", that indicates "XX". Using an ARRAY and the SUBSTR function, how do I generate a new "XX" indicator variable?

7 REPLIES
ballardw
Super User

It will really help to provide an example of what the desired result should be.

For instance what if multiple variables meet the condition? Do you want multiple results for "keto"?

 

If you only want to know "at least one of the variables has a value with 250.1" then this may work:

data ss;
   infile datalines truncover;
   input Patient_Number Encounter_Number Birth_Date :date9. Diagnosis_1$ Diagnosis_2$ Diagnosis_3$ Diagnosis_4$ Diagnosis_5$;
   format birth_date date9.;
   keto = index(catx('_',of diag:),'250.1')>0;

 datalines;
1 1 31JUL1975 250.7 250.7 785.2
1 2 31JUL1975 250.3 250.3 288.8 995.93 466 250.1
1 3 31JUL1975 250.3 250.3 271.6   .      .   .       288.8
1 4 31JUL1975 250.3 250.3   .     .    250.1
1 5 31JUL1975 250.1 250.1
;
run;

Note that a value such as 1250.1 would also set keto to 1, because INDEX finds '250.1' anywhere in the concatenated string, so this may not be appropriate. We don't know all your possible values.
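A minimal sketch of that caveat, using made-up values rather than the real data. The first row contains 1250.1 but no 250.1, and the plain INDEX test should still flag it. Prefixing the delimiter to both the joined string and the search text is one possible way to anchor the match at the start of a value. The data set name CHECK and the variables D1-D3, LOOSE and ANCHORED are hypothetical.

```sas
data check;
   length d1-d3 $8;
   input d1 $ d2 $ d3 $;
   /* plain search: finds '250.1' anywhere in the joined string */
   loose = index(catx('_', of d1-d3), '250.1') > 0;
   /* anchored search: the leading '_' only matches at the start of a value */
   anchored = index('_' || catx('_', of d1-d3), '_250.1') > 0;
datalines;
1250.1 785.2 .
250.3 250.1 .
785.2 . .
;
run;

proc print data=check;
run;
```

The anchored version still accepts longer codes that begin with 250.1, which matches the "first five positions" wording of the question.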

 

So

data ss;
   infile datalines truncover;
   input Patient_Number Encounter_Number Birth_Date :date9. Diagnosis_1$ Diagnosis_2$ Diagnosis_3$ Diagnosis_4$ Diagnosis_5$;
   format birth_date date9.;
   array d diagnosis:;
   keto=0;
   do i= 1 to dim(d);
      if d[i] =: '250.1' then do;
         keto=1;
         leave;
      end;
   end;

 datalines;
1 1 31JUL1975 250.7 250.7 785.2
1 2 31JUL1975 250.3 250.3 288.8 995.93 466 250.1
1 3 31JUL1975 250.3 250.3 271.6   .      .   .       288.8
1 4 31JUL1975 250.3 250.3   .     .    250.1
1 5 31JUL1975 250.1 250.1
;
run;

The =: is a "begins with" comparison.

 

Leave says to stop the loop as soon as the condition is found to be true.

You might want to keep the I variable in the data set, since after LEAVE it holds the index of the first diagnosis variable that met the condition. When no variable matches, the loop runs to completion and I ends up as dim(d)+1.
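A short sketch of keeping that index, on made-up values. FIRST_HIT is a hypothetical variable name, and it is set to missing when nothing matched so that the leftover loop counter is not mistaken for a position.

```sas
data first_match;
   input (d1-d3) (:$8.);
   array d[*] d1-d3;
   keto = 0;
   do i = 1 to dim(d);
      if d[i] =: '250.1' then do;
         keto = 1;
         leave;               /* i stays at the matching position */
      end;
   end;
   /* without a match the loop finishes and i equals dim(d) + 1 */
   if keto then first_hit = i;
   else first_hit = .;
   drop i;
datalines;
250.3 250.1 250.1
785.2 250.7 288.8
;
run;
```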

JUMMY
Obsidian | Level 7
@ballardw, I want only one variable created from this, called XX. But I want to use the "substr" function too. Yours doesn't include that function.
ballardw
Super User

@JUMMY wrote:
@ballardw, I want only one variable created from this, called XX. But I want to use the "substr" function too. Yours doesn't include that function.

You said "first five position of 5 diagnoses are "250.1"", which is why I proposed the =: operator. If you use SUBSTR to request 5 positions and the variable does not contain 5 positions, you have problems. See this code and the message it generates in the log.

data example;
   x='933';
   y = substr(x,1,5);
run;

By the time you add code to handle shorter variables, the code is 1) less efficient and 2) just plain longer.
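As a rough sketch of the kind of extra handling meant here, assuming the goal is simply to avoid asking SUBSTR for more positions than the variable has. The data set name EXAMPLE2 is hypothetical, and XX is the indicator name from the question.

```sas
data example2;
   length x $3 y $5;
   x = '933';
   /* cap the requested length at the defined length of x */
   y = substr(x, 1, min(5, vlength(x)));
   /* the comparison the question asks for */
   xx = (y = '250.1');
run;
```

VLENGTH is used rather than LENGTH because the limit that SUBSTR checks is the defined length of the variable, not the length of the trimmed value.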

 

 

If there is no reason to use a function, why force a solution using it? That way lies bureaucratic madness.

data_null__
Jade | Level 19

 

substrn(x,1,5);

 

or SUBPAD depending on the result needed.

 

Like @ballardw I see no use for SUBSTR or to iterate over the array.
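To illustrate the SUBSTRN and SUBPAD suggestion above, here is a small sketch on the same short value. As far as I know, SUBSTRN returns whatever is available when the requested length runs past the end of the string, while SUBPAD pads the result with blanks to the requested length. The data set and variable names are hypothetical.

```sas
data compare;
   length x $3 y_n y_p $5;
   x = '933';
   y_n = substrn(x, 1, 5);   /* tolerates a length past the end of x */
   y_p = subpad(x, 1, 5);    /* blank-pads the result to 5 characters */
   /* lengths of the raw function results, before assignment pads them */
   len_n = lengthc(substrn(x, 1, 5));
   len_p = lengthc(subpad(x, 1, 5));
run;

proc print data=compare;
run;
```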

Reeza
Super User

@ballardw wrote:

 

If there is no reason to use a function, why force a solution using it? That way lies bureaucratic madness.


It's homework. 

Tom
Super User

I don't understand the question. You already posted the code for when XX is KETO.  What do you want to do differently?

data_null__
Jade | Level 19
data ss;
   infile datalines missover;
   input Patient_Number Encounter_Number Birth_Date :date9. @;
   array DIAG[6] $5;
   input diag[*];
   keto = '250.1' in diag;
   format Bir: date9.;
   datalines;
 1 1 31JUL1975 250.7 250.7 785.2
 1 2 31JUL1975 250.3 250.3 288.8 995.93 466 250.1
 1 3 31JUL1975 250.3 250.3 271.6    .    .   288.8
 1 4 31JUL1975 250.3 250.3    .  250.1
 1 5 31JUL1975 250.1 250.1
;;;;
   run;
proc print;
   run;
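A note on why the exact-match IN test above can stand in for a "first five positions" test: because the array elements are declared as $5, list input should truncate longer codes to their first five characters as they are read. A minimal sketch on made-up values, with the hypothetical data set name IN_DEMO:

```sas
data in_demo;
   infile datalines missover;
   array diag[3] $5;
   input diag[*];
   /* 250.13 is stored as '250.1' because of the $5 length, so IN matches it */
   /* 1250.1 is stored as '1250.' and does not match */
   keto = '250.1' in diag;
datalines;
250.13 785.2
785.2 1250.1
;
run;

proc print data=in_demo;
run;
```

The trade-off is that the stored diagnosis codes themselves lose anything past the fifth character, so this fits best when only the indicator is needed.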

