BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
udden2903
Obsidian | Level 7

Hi,

I have the following dataset:

ObsIDABC
111..
21.1.
311..



How do I make SAS look through columns A, B and C for each observation, to return a table as below:

 



ObsIDABCCategory
111..A
21.1.B
311..A
1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Here, after a minor tweak,

 

data have;
input Obs ID A B C ;
cards;
1 1 1 . . 
2 1 . 1 . 
3 1 1 . 1 
;

data want;
set have;
array _t a--c;
length category $ 100;
do over _t;
if _t=1 then category=catx('-',category,vname(_t));
/*category=vname(_t(whichn(1,of _t(*))));*/
end;
run;

View solution in original post

13 REPLIES 13
ballardw
Super User

Let's clarify the problem a little. Are you looking for the "first" variable of A B or C with ANY value or specifically the first with a value of 1?

What do you want for output if all of the values are missing?

udden2903
Obsidian | Level 7
Hi!

A, B and C are dummy variables. In the original dataset, each ID represents a unique patient, while A, B and C are different diagnoses.

I want to insert an additional column where the patient's diagnosis is given as a string variable. For example, if patient 1 had a value of 1 for the diagnosis B dummy, then I want a new variable that says B. If a patient has missing values for every dummy variable A, B and C, then I want the new variable to show missing as well. Similarly, if they had any combination of A, B and C, I want the new variable to show a combination: For example, if patient 1 has a value of 1 for A and B, then I want my new variable to say AB.
DanZ
Obsidian | Level 7
data have;
   input id:$1. A B C;
cards;
1 1 . .
1 . 1 .
1 1 . .
;
run;
data want(drop=str key lastcat);
format key $8. category $1.;
if _n_ = 1 then do;
	lastCat = 0;
	declare hash lu();
	lu.defineKey('key');
	lu.defineData('category');
	lu.defineDone();
end;
set have;
array vars {3} A B C;
str = 'ABCDFGHIJKLMNOPQRSTUVWXYZ';

key=catx(',',of vars{*});
if lu.find()=0 then output;
else do;
	lastCat + 1;
	category = substr(str,lastCat,1);
	lu.add();
	output;
end;
run;

The question of what the category value should be is a little tricky. If it has to be a letter, then this code will work for up to 26 categories.

 

It would be easiet if the category values were numbers, in which case you would just change category = lastCat instead of substr(), and remove the str variable creation.

udden2903
Obsidian | Level 7
Hi!

The categories refer to different diagnoses in my original dataset. These are made up of one letter followed by two digits, e.g. L40, M06, K50.
DanZ
Obsidian | Level 7
Do you require the category values to be letters?
udden2903
Obsidian | Level 7
No, I only need to be able to categorize the observations based on diagnosis.
DanZ
Obsidian | Level 7
And, I'm assuming that you're not concerned with the order of the diagnosis? Ie. the variables A and B with values L01/M04 would be the same as M04/L01.
udden2903
Obsidian | Level 7
The variables A and B are dummy variables. In my original dataset, these variables have names like L01 and M04, but the actual values are either 1 or "."
Haikuo
Onyx | Level 15

The following code choose the first '1' if you have multiple, tweak it to accomodate your real data:

data have;
input Obs ID A B C ;
cards;
1 1 1 . . 
2 1 . 1 . 
3 1 1 . . 
;

data want;
set have;
array _t a--c;
length category $ 10;
category=vname(_t(whichn(1,of _t(*))));
run;
udden2903
Obsidian | Level 7
Hi!

In my original dataset, an observation can have a value of 1 in more than one of the columns. In that case, I need SAS to return the name of both columns where the observation has a value of 1.
Haikuo
Onyx | Level 15

Here, after a minor tweak,

 

data have;
input Obs ID A B C ;
cards;
1 1 1 . . 
2 1 . 1 . 
3 1 1 . 1 
;

data want;
set have;
array _t a--c;
length category $ 100;
do over _t;
if _t=1 then category=catx('-',category,vname(_t));
/*category=vname(_t(whichn(1,of _t(*))));*/
end;
run;
udden2903
Obsidian | Level 7
This solved the problem. Many thanks for your help!
DanZ
Obsidian | Level 7
data have;
   input 
		id:$1. 
		M01 : 1. 
		M02 : 1. 
		L01 : 1.;
cards;
A 1 . .
B 1 . 1
C 1 . . 
D . 1 1
E 1 . 1
;
run;
data want(drop=key lastcat);
format key $60.;
if _n_ = 1 then do;
	lastCat = 0;
	declare hash lu();
	lu.defineKey('key');
	lu.defineData('category');
	lu.defineDone();
end;
set have;
array vars {3} M01 -- L01;
key=catQ('a',',',of vars{*});
if lu.find()=0 then output;
else do;
	lastCat +1;
	category = lastCat;
	lu.add();
	output;
end;
run;

Ok, I think this is what you're describing.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1822 views
  • 2 likes
  • 4 in conversation