BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
udden2903
Obsidian | Level 7

Hi,

I have the following dataset:

ObsIDABC
111..
21.1.
311..



How do I make SAS look through columns A, B and C for each observation, to return a table as below:

 



ObsIDABCCategory
111..A
21.1.B
311..A
1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Here, after a minor tweak,

 

data have;
input Obs ID A B C ;
cards;
1 1 1 . . 
2 1 . 1 . 
3 1 1 . 1 
;

data want;
set have;
array _t a--c;
length category $ 100;
do over _t;
if _t=1 then category=catx('-',category,vname(_t));
/*category=vname(_t(whichn(1,of _t(*))));*/
end;
run;

View solution in original post

13 REPLIES 13
ballardw
Super User

Let's clarify the problem a little. Are you looking for the "first" variable of A B or C with ANY value or specifically the first with a value of 1?

What do you want for output if all of the values are missing?

udden2903
Obsidian | Level 7
Hi!

A, B and C are dummy variables. In the original dataset, each ID represents a unique patient, while A, B and C are different diagnoses.

I want to insert an additional column where the patient's diagnosis is given as a string variable. For example, if patient 1 had a value of 1 for the diagnosis B dummy, then I want a new variable that says B. If a patient has missing values for every dummy variable A, B and C, then I want the new variable to show missing as well. Similarly, if they had any combination of A, B and C, I want the new variable to show a combination: For example, if patient 1 has a value of 1 for A and B, then I want my new variable to say AB.
DanZ
Obsidian | Level 7
data have;
   input id:$1. A B C;
cards;
1 1 . .
1 . 1 .
1 1 . .
;
run;
data want(drop=str key lastcat);
format key $8. category $1.;
if _n_ = 1 then do;
	lastCat = 0;
	declare hash lu();
	lu.defineKey('key');
	lu.defineData('category');
	lu.defineDone();
end;
set have;
array vars {3} A B C;
str = 'ABCDFGHIJKLMNOPQRSTUVWXYZ';

key=catx(',',of vars{*});
if lu.find()=0 then output;
else do;
	lastCat + 1;
	category = substr(str,lastCat,1);
	lu.add();
	output;
end;
run;

The question of what the category value should be is a little tricky. If it has to be a letter, then this code will work for up to 26 categories.

 

It would be easiet if the category values were numbers, in which case you would just change category = lastCat instead of substr(), and remove the str variable creation.

udden2903
Obsidian | Level 7
Hi!

The categories refer to different diagnoses in my original dataset. These are made up of one letter followed by two digits, e.g. L40, M06, K50.
DanZ
Obsidian | Level 7
Do you require the category values to be letters?
udden2903
Obsidian | Level 7
No, I only need to be able to categorize the observations based on diagnosis.
DanZ
Obsidian | Level 7
And, I'm assuming that you're not concerned with the order of the diagnosis? Ie. the variables A and B with values L01/M04 would be the same as M04/L01.
udden2903
Obsidian | Level 7
The variables A and B are dummy variables. In my original dataset, these variables have names like L01 and M04, but the actual values are either 1 or "."
Haikuo
Onyx | Level 15

The following code choose the first '1' if you have multiple, tweak it to accomodate your real data:

data have;
input Obs ID A B C ;
cards;
1 1 1 . . 
2 1 . 1 . 
3 1 1 . . 
;

data want;
set have;
array _t a--c;
length category $ 10;
category=vname(_t(whichn(1,of _t(*))));
run;
udden2903
Obsidian | Level 7
Hi!

In my original dataset, an observation can have a value of 1 in more than one of the columns. In that case, I need SAS to return the name of both columns where the observation has a value of 1.
Haikuo
Onyx | Level 15

Here, after a minor tweak,

 

data have;
input Obs ID A B C ;
cards;
1 1 1 . . 
2 1 . 1 . 
3 1 1 . 1 
;

data want;
set have;
array _t a--c;
length category $ 100;
do over _t;
if _t=1 then category=catx('-',category,vname(_t));
/*category=vname(_t(whichn(1,of _t(*))));*/
end;
run;
udden2903
Obsidian | Level 7
This solved the problem. Many thanks for your help!
DanZ
Obsidian | Level 7
data have;
   input 
		id:$1. 
		M01 : 1. 
		M02 : 1. 
		L01 : 1.;
cards;
A 1 . .
B 1 . 1
C 1 . . 
D . 1 1
E 1 . 1
;
run;
data want(drop=key lastcat);
format key $60.;
if _n_ = 1 then do;
	lastCat = 0;
	declare hash lu();
	lu.defineKey('key');
	lu.defineData('category');
	lu.defineDone();
end;
set have;
array vars {3} M01 -- L01;
key=catQ('a',',',of vars{*});
if lu.find()=0 then output;
else do;
	lastCat +1;
	category = lastCat;
	lu.add();
	output;
end;
run;

Ok, I think this is what you're describing.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2487 views
  • 2 likes
  • 4 in conversation