BookmarkSubscribeRSS Feed
sharonlee
Quartz | Level 8

I have the following dataset:

IDQuestionResponse
alpha1Y
alpha2N
alpha3Y
beta1N
beta2N

 

But I want this:

 

IDQ1Q2Q3
alphaYNY
betaNN 

 

I tried to use proc transpose as follows:

 

proc transpose data = have out = want;

by id;

var Question response;

run;

 

My output is 

idcol1col2col3
alpha123
alphaYNY
beta12 
betaNN 

 

I'm investigating arrays (admittedly my knowledge is not so good with arrays), but I don't think this is possible.

 

A workaround I think can be to create a dataset with just Q1 and then run a proc transpose.  However, I have over 100 questions, so even with a macro statement it is very onerous.

 

Any thoughts?

Thanks

6 REPLIES 6
japelin
Rhodochrosite | Level 12

how about this code using arrray statement.

data now;
  length id $10 Question 8 Response $1;
  infile cards;
  input id $ Question Response $;
cards;
ALPHA 1 Y
ALPHA 2 N
ALPHA 3 Y
BETA 1 N
BETA 2 N
;
run;

proc sort data=now out=tmp nodupkey;
  by descending Question;
run;
data _null_;
  set tmp(obs=1);
  call symputx('max_q',Question);/* set max of questions */
run;

proc sort data=now out=new;
  by id question;
run;

data new(keep=id q1-q&max_q);
  set new;
  by id question;
  array q{&max_q} $1;/* create array */
  retain q1-q&max_q;/* keep array variables to next observation */
  if first.id then do;/* init array value */
    do i=1 to &max_q;
      call missing(q{i});
    end;
  end;
  q{question}=Response;/* set response value to array variable by using question values as array subscript. */
  if last.id;/* output only last.id observations */
run;

 

Patrick
Opal | Level 21

May I ask why you want to transpose the data in first place? A long and narrow structure is better for most SAS procedures and often also easier to deal with in SAS data steps. 

novinosrin
Tourmaline | Level 20

Hi @sharonlee  , Most SAS users and I will concur @Patrick  to keep your dataset long and narrow. The long and narrow offers enormous convenience.

 

Anyways proc transpose is rather straight forward

 


data have;
input ID $	Question	Response $ ;
cards;
alpha	1	Y
alpha	2	N
alpha	3	Y
beta	1	N
beta	2	N
;

proc transpose data=have out=want(drop=_:) prefix=Q;
by id;
var response;
id Question;
run;
ed_sas_member
Meteorite | Level 14

Hi @sharonlee 

As proposed by @novinosrin, the ID statement enables you to identify a variable that will be used to name the new columns, which are col1, col2, ... by default. The option ‘prefix=Q’ will add the prefix ‘Q’ to these column names -> so Q1, Q2, ... as the modalities of question are 1, 2, ...)

Best,

 

proc transpose data=have out=want (drop=_name_) prefix=Q;
	by id;
	var response;
	id question;
run;

 

sharonlee
Quartz | Level 8
@Patrick and @novinosrin, thank you for replying. The reason why I want to convert to long is because I'm calculating a number of frequency tables and I find it easier if the data are long. Plus, I'll be doing some regression analyses and I have to have it long format.
novinosrin
Tourmaline | Level 20

Ok it's all good. Well, I posted the transpose in my previous post. I hope that helps.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1477 views
  • 1 like
  • 5 in conversation