BookmarkSubscribeRSS Feed
sharonlee
Quartz | Level 8

I have the following dataset:

IDQuestionResponse
alpha1Y
alpha2N
alpha3Y
beta1N
beta2N

 

But I want this:

 

IDQ1Q2Q3
alphaYNY
betaNN 

 

I tried to use proc transpose as follows:

 

proc transpose data = have out = want;

by id;

var Question response;

run;

 

My output is 

idcol1col2col3
alpha123
alphaYNY
beta12 
betaNN 

 

I'm investigating arrays (admittedly my knowledge is not so good with arrays), but I don't think this is possible.

 

A workaround I think can be to create a dataset with just Q1 and then run a proc transpose.  However, I have over 100 questions, so even with a macro statement it is very onerous.

 

Any thoughts?

Thanks

6 REPLIES 6
japelin
Rhodochrosite | Level 12

how about this code using arrray statement.

data now;
  length id $10 Question 8 Response $1;
  infile cards;
  input id $ Question Response $;
cards;
ALPHA 1 Y
ALPHA 2 N
ALPHA 3 Y
BETA 1 N
BETA 2 N
;
run;

proc sort data=now out=tmp nodupkey;
  by descending Question;
run;
data _null_;
  set tmp(obs=1);
  call symputx('max_q',Question);/* set max of questions */
run;

proc sort data=now out=new;
  by id question;
run;

data new(keep=id q1-q&max_q);
  set new;
  by id question;
  array q{&max_q} $1;/* create array */
  retain q1-q&max_q;/* keep array variables to next observation */
  if first.id then do;/* init array value */
    do i=1 to &max_q;
      call missing(q{i});
    end;
  end;
  q{question}=Response;/* set response value to array variable by using question values as array subscript. */
  if last.id;/* output only last.id observations */
run;

 

Patrick
Opal | Level 21

May I ask why you want to transpose the data in first place? A long and narrow structure is better for most SAS procedures and often also easier to deal with in SAS data steps. 

novinosrin
Tourmaline | Level 20

Hi @sharonlee  , Most SAS users and I will concur @Patrick  to keep your dataset long and narrow. The long and narrow offers enormous convenience.

 

Anyways proc transpose is rather straight forward

 


data have;
input ID $	Question	Response $ ;
cards;
alpha	1	Y
alpha	2	N
alpha	3	Y
beta	1	N
beta	2	N
;

proc transpose data=have out=want(drop=_:) prefix=Q;
by id;
var response;
id Question;
run;
ed_sas_member
Meteorite | Level 14

Hi @sharonlee 

As proposed by @novinosrin, the ID statement enables you to identify a variable that will be used to name the new columns, which are col1, col2, ... by default. The option ‘prefix=Q’ will add the prefix ‘Q’ to these column names -> so Q1, Q2, ... as the modalities of question are 1, 2, ...)

Best,

 

proc transpose data=have out=want (drop=_name_) prefix=Q;
	by id;
	var response;
	id question;
run;

 

sharonlee
Quartz | Level 8
@Patrick and @novinosrin, thank you for replying. The reason why I want to convert to long is because I'm calculating a number of frequency tables and I find it easier if the data are long. Plus, I'll be doing some regression analyses and I have to have it long format.
novinosrin
Tourmaline | Level 20

Ok it's all good. Well, I posted the transpose in my previous post. I hope that helps.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 844 views
  • 1 like
  • 5 in conversation