BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Chris_LK_87
Quartz | Level 8

I have a quite large dataset with many columns (500+).

 

Each row contains information about one individual and their diagnosiscode.

 

I would like to select multiple diagnosiscodes and at the same time exclude certain diagnosiscodes  by using an array and put them in a group.

 

So for example I would like to create a variable for hypertension. So every individual that has one of the following diagnosiscode; 'C02', 'C03', 'C07','C08', 'C09'  will get a 1 assigned for the variable hypertension else will get 0. But if they have diagnosiscode 'C02AC02' or 'C07AA07' they

will not get a 1 for hypertension (but if any of their other diagnosiscode is 'C02', 'C03', 'C07','C08', 'C09' they will get a 1).

 

I have been using an array, see code below, but I haven’t been able to figure out how to select multiple codes while excluding other codes at the same time.

 

data have;

input

id$ atc1$ atc2$ atc3$;

cards;

1  J01  J01XX05  C07

2  J01XX05  A10A  C03C

3  C02  A10A  A10B

4  C07AA07  C02  A10B

5  C07AA07  C02AC02  J01

;

run;

 

 

 

This is the code I have been using but I don’t get it to work. I have got it to work if I want to select one code and exclude another code (as the one use for J01_flag).   

 

data want;

set have;

array code(500) $ atc1-atc500;

Hypertension=0;

J01_flag=0;

do i = 1 to 500 until (Hypertension=1 and J01_flag=1);

   if code(i)=: 'C02' or  'C03' or  'C07' or 'C08' or 'C09'  and code(i) ne 'C02AC02' or 'C07AA07' then Hypertension=1;

   if code(i)=: ‘J01’ and code(i) ne ' J01XX05' then J01_flag=1;

end;

drop i;

run;

 

 

data want;

id$ atc1$ atc2$ atc3$ Hypertoni$ J01_flag$;

cards;

1  J01  J01XX05  C07 1 1

2  J01XX05  A10A  C03C  1 0

3  C02 A10A  A10B 1 0

4  C07AA07  C02 A10B 1 0

5  C07AA07  C02AC02 J01 0 1

;

run;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @Chris_LK_87 

Does this code meet your needs?

data want;
	set have;
	array code(*) $ atc1-atc500;
	Hypertension=0;
	J01_flag=0;

	do i=1 to dim(code);
		if prxmatch('/(C02|C03|C07|C08|C09).*/',code(i)) and code(i) not in ('C02AC02','C07AA07') then Hypertension=1;
		if prxmatch('/(J01).*/',code(i)) and code(i) ne 'J01XX05' then J01_flag=1;
	end;
	drop i;
run;

Best,

View solution in original post

3 REPLIES 3
ed_sas_member
Meteorite | Level 14

Hi @Chris_LK_87 

Does this code meet your needs?

data want;
	set have;
	array code(*) $ atc1-atc500;
	Hypertension=0;
	J01_flag=0;

	do i=1 to dim(code);
		if prxmatch('/(C02|C03|C07|C08|C09).*/',code(i)) and code(i) not in ('C02AC02','C07AA07') then Hypertension=1;
		if prxmatch('/(J01).*/',code(i)) and code(i) ne 'J01XX05' then J01_flag=1;
	end;
	drop i;
run;

Best,

Chris_LK_87
Quartz | Level 8

Thanks for replying. 

 

The code selects other diagnosiscodes that I am not looking for, for example everything that contains 'C02' like 'R03AC02'.

 

Is there a way around this?

Tom
Super User Tom
Super User

This doesn't make any sense.

  if code(i)=: 'C02' or  'C03' or  'C07' or 'C08' or 'C09' 

You are combining character strings with the boolean operator OR.

Are you looking for the IN operator instead?

  if code(i) in: ('C02'  'C03'  'C07'  'C08'  'C09')

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1380 views
  • 0 likes
  • 3 in conversation