BookmarkSubscribeRSS Feed
sharonlee
Quartz | Level 8

I have the following dataset:

IDQuestionResponse
alpha1Y
alpha2N
alpha3Y
beta1N
beta2N

 

But I want this:

 

IDQ1Q2Q3
alphaYNY
betaNN 

 

I tried to use proc transpose as follows:

 

proc transpose data = have out = want;

by id;

var Question response;

run;

 

My output is 

idcol1col2col3
alpha123
alphaYNY
beta12 
betaNN 

 

I'm investigating arrays (admittedly my knowledge is not so good with arrays), but I don't think this is possible.

 

A workaround I think can be to create a dataset with just Q1 and then run a proc transpose.  However, I have over 100 questions, so even with a macro statement it is very onerous.

 

Any thoughts?

Thanks

6 REPLIES 6
japelin
Rhodochrosite | Level 12

how about this code using arrray statement.

data now;
  length id $10 Question 8 Response $1;
  infile cards;
  input id $ Question Response $;
cards;
ALPHA 1 Y
ALPHA 2 N
ALPHA 3 Y
BETA 1 N
BETA 2 N
;
run;

proc sort data=now out=tmp nodupkey;
  by descending Question;
run;
data _null_;
  set tmp(obs=1);
  call symputx('max_q',Question);/* set max of questions */
run;

proc sort data=now out=new;
  by id question;
run;

data new(keep=id q1-q&max_q);
  set new;
  by id question;
  array q{&max_q} $1;/* create array */
  retain q1-q&max_q;/* keep array variables to next observation */
  if first.id then do;/* init array value */
    do i=1 to &max_q;
      call missing(q{i});
    end;
  end;
  q{question}=Response;/* set response value to array variable by using question values as array subscript. */
  if last.id;/* output only last.id observations */
run;

 

Patrick
Opal | Level 21

May I ask why you want to transpose the data in first place? A long and narrow structure is better for most SAS procedures and often also easier to deal with in SAS data steps. 

novinosrin
Tourmaline | Level 20

Hi @sharonlee  , Most SAS users and I will concur @Patrick  to keep your dataset long and narrow. The long and narrow offers enormous convenience.

 

Anyways proc transpose is rather straight forward

 


data have;
input ID $	Question	Response $ ;
cards;
alpha	1	Y
alpha	2	N
alpha	3	Y
beta	1	N
beta	2	N
;

proc transpose data=have out=want(drop=_:) prefix=Q;
by id;
var response;
id Question;
run;
ed_sas_member
Meteorite | Level 14

Hi @sharonlee 

As proposed by @novinosrin, the ID statement enables you to identify a variable that will be used to name the new columns, which are col1, col2, ... by default. The option ‘prefix=Q’ will add the prefix ‘Q’ to these column names -> so Q1, Q2, ... as the modalities of question are 1, 2, ...)

Best,

 

proc transpose data=have out=want (drop=_name_) prefix=Q;
	by id;
	var response;
	id question;
run;

 

sharonlee
Quartz | Level 8
@Patrick and @novinosrin, thank you for replying. The reason why I want to convert to long is because I'm calculating a number of frequency tables and I find it easier if the data are long. Plus, I'll be doing some regression analyses and I have to have it long format.
novinosrin
Tourmaline | Level 20

Ok it's all good. Well, I posted the transpose in my previous post. I hope that helps.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 566 views
  • 1 like
  • 5 in conversation