BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Chris_LK_87
Quartz | Level 8

I have a quite large dataset with many columns (500+).

 

Each row contains information about one individual and their diagnosiscode.

 

I would like to select multiple diagnosiscodes and at the same time exclude certain diagnosiscodes  by using an array and put them in a group.

 

So for example I would like to create a variable for hypertension. So every individual that has one of the following diagnosiscode; 'C02', 'C03', 'C07','C08', 'C09'  will get a 1 assigned for the variable hypertension else will get 0. But if they have diagnosiscode 'C02AC02' or 'C07AA07' they

will not get a 1 for hypertension (but if any of their other diagnosiscode is 'C02', 'C03', 'C07','C08', 'C09' they will get a 1).

 

I have been using an array, see code below, but I haven’t been able to figure out how to select multiple codes while excluding other codes at the same time.

 

data have;

input

id$ atc1$ atc2$ atc3$;

cards;

1  J01  J01XX05  C07

2  J01XX05  A10A  C03C

3  C02  A10A  A10B

4  C07AA07  C02  A10B

5  C07AA07  C02AC02  J01

;

run;

 

 

 

This is the code I have been using but I don’t get it to work. I have got it to work if I want to select one code and exclude another code (as the one use for J01_flag).   

 

data want;

set have;

array code(500) $ atc1-atc500;

Hypertension=0;

J01_flag=0;

do i = 1 to 500 until (Hypertension=1 and J01_flag=1);

   if code(i)=: 'C02' or  'C03' or  'C07' or 'C08' or 'C09'  and code(i) ne 'C02AC02' or 'C07AA07' then Hypertension=1;

   if code(i)=: ‘J01’ and code(i) ne ' J01XX05' then J01_flag=1;

end;

drop i;

run;

 

 

data want;

id$ atc1$ atc2$ atc3$ Hypertoni$ J01_flag$;

cards;

1  J01  J01XX05  C07 1 1

2  J01XX05  A10A  C03C  1 0

3  C02 A10A  A10B 1 0

4  C07AA07  C02 A10B 1 0

5  C07AA07  C02AC02 J01 0 1

;

run;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Chris_LK_87 

Does this code meet your needs?

data want;
	set have;
	array code(*) $ atc1-atc500;
	Hypertension=0;
	J01_flag=0;

	do i=1 to dim(code);
		if prxmatch('/(C02|C03|C07|C08|C09).*/',code(i)) and code(i) not in ('C02AC02','C07AA07') then Hypertension=1;
		if prxmatch('/(J01).*/',code(i)) and code(i) ne 'J01XX05' then J01_flag=1;
	end;
	drop i;
run;

Best,

View solution in original post

3 REPLIES 3
ed_sas_member
Meteorite | Level 14

Hi @Chris_LK_87 

Does this code meet your needs?

data want;
	set have;
	array code(*) $ atc1-atc500;
	Hypertension=0;
	J01_flag=0;

	do i=1 to dim(code);
		if prxmatch('/(C02|C03|C07|C08|C09).*/',code(i)) and code(i) not in ('C02AC02','C07AA07') then Hypertension=1;
		if prxmatch('/(J01).*/',code(i)) and code(i) ne 'J01XX05' then J01_flag=1;
	end;
	drop i;
run;

Best,

Chris_LK_87
Quartz | Level 8

Thanks for replying. 

 

The code selects other diagnosiscodes that I am not looking for, for example everything that contains 'C02' like 'R03AC02'.

 

Is there a way around this?

Tom
Super User Tom
Super User

This doesn't make any sense.

  if code(i)=: 'C02' or  'C03' or  'C07' or 'C08' or 'C09' 

You are combining character strings with the boolean operator OR.

Are you looking for the IN operator instead?

  if code(i) in: ('C02'  'C03'  'C07'  'C08'  'C09')

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 966 views
  • 0 likes
  • 3 in conversation