Hi all--
I'm having trouble with this transpose. When I try to transpose the data set below, I get an error message saying there are duplicate variables in the BY group, so I need to rename the variables.
Id    | Question | responce1 | responce2 | responce3 | responce4 | responce5 | responce6 | responce7
fgg45 | S1       | YES       |           |           |           |           |           |
fgg45 | S2       | NO        | NO        | NO        | NO        | NO        | NO        |
fgg45 | G5       | YES       | YES       | YES       | YES       | YES       | YES       | YES
fgg45 | G6       | NO        |           |           |           |           |           |
gg453 | S1       | YES       | YES       |           |           |           |           |
gg453 | S2       | NO        | NO        | NO        | NO        | NO        | NO        | NO
gg453 | G5       | YES       | YES       | YES       |           |           |           |
gg453 | G6       | NO        | NO        | NO        | NO        |           |           |
What I need should look like this: each question name gets a _#x suffix, numbered by how many responses it has.
ID    | S1_1x | S1_2x | S2_1x | S2_2x | S2_3x | S2_4x | S2_5x | S2_6x | S2_7x
fgg45 | YES   |       | NO    | NO    | NO    | NO    | NO    | NO    |
gg453 | YES   | YES   | NO    | NO    | NO    | NO    | NO    | NO    | NO
Any assistance is greatly appreciated.
Matt--
Are you NOT interested in the G5 and G6 responses, and does it matter whether S1 has two or all seven levels?
Hi Arthur--
Sorry, I should have specified that. I am interested in all of the Question values; I just didn't add G5 and G6 to the sample table. There are actually over 100 questions.
Thanks--
Would something like the following suffice?:
data have;
  infile cards truncover;
  input (Id Question responce1-responce7) ($);
  cards;
fgg45 S1 YES
fgg45 S2 NO NO NO NO NO NO
fgg45 G5 YES YES YES YES YES YES YES
fgg45 G6 NO
gg453 S1 YES YES
gg453 S2 NO NO NO NO NO NO NO
gg453 G5 YES YES YES
gg453 G6 NO NO NO NO
;

proc sort data=have;
  by id question;
run;

/* go long: one row per id/question/response; the rename makes the
   _name_ values _1-_7 so the response number is easy to carry along */
proc transpose data=have (rename=(responce1-responce7=_1-_7))
               out=need;
  by id question;
  var _1-_7;
run;

/* build the wanted names, e.g. S1 + _1 + x -> S1_1x */
data need (drop=question);
  set need;
  _name_=catt(Question,_name_,"x");
run;

proc transpose data=need out=want (drop=_name_);
  by id;
  id _name_;
  var col1;
run;
Hi Art,
Has SAS agreed to reward you with a lifelong SAS license when you get 10,000 points?
Agreed! Only 5 points away.
Congratulations!!! I am looking for a different status. - Linlin
Thanks, Art. One last thing: there aren't necessarily 7 responses in each data set. There could be four, or there could be 100; it changes every time.
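(If it helps, here is a minimal sketch of one way to cope with a varying number of columns, assuming the variables are always named responce1, responce2, ... and that HAVE is already sorted by id and question as above: the colon wildcard picks up however many responce variables exist, and the digits in _NAME_ carry the response number.)

proc transpose data=have out=long;
  by id question;
  var responce:;   /* responce1-responce7, or however many there are */
run;

data long;
  set long;
  /* e.g. question=S2, _name_=responce3 -> S2_3x; compress keeps only the digits */
  _name_=cats(question,'_',compress(_name_,,'kd'),'x');
run;

proc transpose data=long out=want (drop=_name_);
  by id;
  id _name_;
  var col1;
  where not missing(col1);   /* skip the all-missing cells */
run;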
PROC TRANSPOSE has options that let you skip the data step that modifies the values of _NAME_. It even has an option to add the superfluous x at the end.
I wasn't sure whether it matters, but you also create variables that are all missing. This can be addressed with a WHERE statement or left as is.
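(For reference, a sketch of the options DN presumably means, on Art's data: a two-variable ID statement concatenates question and _name_ to build S1_1, SUFFIX= appends the x, and a WHERE statement drops the all-missing cells.)

proc transpose data=have (rename=(responce1-responce7=_1-_7)) out=long;
  by id question;
  var _1-_7;
run;

/* ID concatenates the values (S1 + _1), SUFFIX= appends the x -> S1_1x */
proc transpose data=long out=want (drop=_name_) suffix=x;
  by id;
  id question _name_;
  var col1;
  where not missing(col1);   /* otherwise every question gets all 7 columns */
run;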
Thanks, DN.
I chose to create labels within the data step because they were needed in the hash-of-hashes processing.
I'm very much aware of those useful features of Proc Transpose. As a matter of fact, if I were not hard pressed for time, I wouldn't have needed Proc Transpose at all.
Regards,
Joe
Joe, my comments were for Art's first version of this program, not yours.
OK. If I understand you correctly, something like the following should work:
data have;
  infile cards truncover;
  input (Id Question responce1-responce7) ($);
  cards;
fgg45 S1 YES
fgg45 S2 NO NO NO NO NO NO
fgg45 G5 YES YES YES YES YES YES YES
fgg45 G6 NO
gg453 S1 YES YES
gg453 S2 NO NO NO NO NO NO NO
gg453 G5 YES YES YES
gg453 G6 NO NO NO NO
;
run;

/* build one rename pair per question/variable combination,
   e.g. responce1=S1_1x, using the dictionary tables so the code
   works for any number of responce variables */
proc sql noprint;
  create table x as
    select question, name,
           catx('=',name,cats(tranwrd(cats(question,name),'responce','_'),'x')) as rename length=100
      from (select distinct question from have),
           (select lowcase(name) as name
              from dictionary.columns
             where libname='WORK' and memname='HAVE'
               and lowcase(name) like 'responce%');
quit;

/* collapse the pairs into one rename list per question */
data temp;
  set x;
  by question;
  length want $ 32767;
  retain want;
  want=catx(' ',want,rename);
  if last.question then do;
    output;
    call missing(want);
  end;
  drop rename name;
run;

/* generate a merge of one renamed subset of HAVE per question */
data _null_;
  set temp end=last;
  if _n_ eq 1 then call execute('data want(drop=question); merge ');
  call execute('have(where=(question="'||trim(question)||'") rename=('||trim(want)||')) ');
  if last then call execute(';run;');
run;
Ksharp
....and here is another approach that also works for any number of columns.
(I did this mainly as an academic exercise, testing the use of the data access functions (e.g., getvarc()) and the hash-of-hashes technique for group processing.)
data have;
  infile datalines truncover;
  input (Id Question responce1-responce7) ($);
  datalines;
fgg45 S1 YES
fgg45 S2 NO NO NO NO NO NO
fgg45 G5 YES YES YES YES YES YES YES
fgg45 G6 NO
gg453 S1 YES YES
gg453 S2 NO NO NO NO NO NO NO
gg453 G5 YES YES YES
gg453 G6 NO NO NO NO
;
run;
data _null_;
  if _N_=1 then do;
    length Responce $3 colx $12;
    if (1=2) then set have;   /* define the HAVE variables in the PDV */

    /* one flat hash that feeds the final transpose */
    declare hash final (ordered:"a",multidata:"Y");
    final.defineKey("id","colx");
    final.defineData("id","colx","responce");
    final.defineDone();

    /* hash of hashes: one question hash (and iterator) per id */
    declare hash ids (ordered:"a");
    ids.defineKey("id");
    ids.defineData("id","objQues","hobjQues");
    ids.defineDone();
    declare hiter hids("ids");

    declare hash objQues;
    declare hiter hobjQues;
    declare hash objResp;
    declare hiter hobjResp;
    call missing (of _all_);

    /* open HAVE with the data access functions */
    dsid=open("have");         put dsid=;
    nrows=attrn(dsid,"NOBS");  put nrows=;
    ncols=attrn(dsid,"NVARS"); put ncols=;
  end;

  do y=1 to nrows;
    rc=fetchobs(dsid,y);
    id=getvarc(dsid,1);
    question=getvarc(dsid,2);

    /* new id: create its question hash */
    if ids.find() ne 0 then do;
      objQues=_new_ hash(ordered:"a");
      objQues.defineKey("id","Question");
      objQues.defineData("Question","objResp","hobjResp");
      objQues.defineDone();
      hobjQues=_new_ hiter("objQues");
      rci=ids.replace();
    end;

    /* new question within the id: create its response hash and load it */
    if objQues.find() ne 0 then do;
      objResp=_new_ hash(ordered:"a",multidata:"Y");
      objResp.defineKey("id","Question");
      objResp.defineData("id","Question","colx","Responce");
      objResp.defineDone();
      hobjResp=_new_ hiter("objResp");
      rcq=objQues.replace();
      do n=3 to ncols;
        cx=varname(dsid,n);                  /* e.g. responce3         */
        dx=compress(cx,"0123456789","k");    /* keep the digits: 3     */
        colx=strip(Question)||"_"||dx||"x";  /* S2_3x, per the request */
        responce=getvarc(dsid,n);
        rcr=objResp.replace();
        rcf=final.replace();
      end;
    end;
  end;

  rc=close(dsid);
  final.output(dataset:"finaldata");
  stop;
run;
proc transpose data=finaldata out=want;
  by id;
  id colx;
  var responce;
run;