Convert character variable into numeric binary variables for each char...

leeklammer · Posted 11-07-2013 10:46 AM

Does anyone know of a quick way to convert several character variables into binary variables where the binary variables represents a yes/no for each character value?

For example, consider a dataset of student names and the letter grade they received on each quiz (with no numeric score information):

Student	Quiz1	Quiz2	Quiz3
Jack	A	C	D
Jill	B	A	A
John	D	F	C
Jane	B	B	F

I would like to convert as follows

Student	Quiz1	Quiz2	Quiz3	Quiz1_A	Quiz1_B	Quiz1_D	...
Jack	A	C	D	1	0	0	...
Jill	B	A	A	0	1	0	...
John	D	F	C	0	0	1	...
Jane	B	B	F	0	1	0	...

I realize that I could use a whole bunch of IF-THEN statements; but it would be great if there was a more efficient way, especially since I need to do it for 70+ variables that have up to 20 character values apiece.

Thanks!

-Lee

brsinco · Posted 11-07-2013 11:19 AM

Try using arrays.

Set up a character array and a corresponding numerical array.

Hope this helps.

Brandy

SteveDenham · Posted 11-08-2013 08:55 AM

PROC GLMMOD will do this. You'll need a dummy response variable (which you can use later to merge the results back against the original dataset).

data have;

set have;

resp=_n_;

run;

proc glmmod data=have outparm=parm;

class quiz1 quiz2 quiz3;

model resp=quiz1 quiz2 quiz3/noint;

run;

proc sql

create table want as select a.student, b.* from have a, parm b

where a.resp=b.resp;

quit;

This will generate only columns for grades that were actually scored on the quizzes. If there is a grade that no student achieved, it can be added in a data step.

Steve Denham

Message was edited by: Steve Denham

data_null__ · Posted 11-08-2013 09:04 AM

I wonder since the OP wants the variables named using source-variable-name+value that the features of PROC TRANSREG DESIGN might be a more direct path.

I wonder about 70 variables that will be lots of dummies.

You could add some dummy students to insure that all grade levels are represented.

SteveDenham · Posted 11-08-2013 09:32 AM

I agree, but if the OP only needs labels attached to the variables named col1-coln, this will work.

With TRANSREG, need to drop Intercept, _TYPE_ and _NAME_. The following code will work:

proc transreg data=have design;

model class(quiz1 quiz2 quiz3);

id student;

output out=coded;

run;

About the 70 variables, if that is for 70 quizzes, it won't be so hard to code given they all have the same name, but if they all have different names, it might be easier to put them into a macro string variable.

Steve Denham

JillChen0131 · Posted 01-17-2024 01:22 PM

I have patients' medication data, need to convert it to binary variables. Do you have any solutions?

what I have:

Patient	Medication1	Medication2	Medication3
ID1	A	B	C
ID2	C	E	F
ID3	E	G	H

what I want:

Patient	A	B	C	D	E	F	G	H
ID1	1	1	1
ID2			1		1	1
ID3					1		1	1

Rick_SAS · Posted 11-08-2013 02:47 PM

You said GLMMOD, but you wrote GLIMMIX.

SteveDenham · Posted 11-08-2013 02:54 PM

Thanks, Rick. Edited now. Sometimes fingers type what the muscle memory is, rather than what you really want...

Steve Denham

Reeza · Posted 11-08-2013 03:27 PM

You have a bunch of solutions as to how, I'm curious as to why?

I'd bet (not my salary ) that there's a better way to do whatever you're trying to do with this.

ballardw · Posted 11-08-2013 04:23 PM

I used to deal with a survey data collection package that defaulted to the output of binary output such as this for each level of a categorical variable. We were forever doing the reverse of this process to get back to the A,B,C .. values. So I also wonder what the next steps are unless this is just an exercise in data manipulation.

Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Re: Convert character variable into numeric binary variables for each character value

Classroom Training Available!