Help on PROC TRANSPOSE

SushilNayak · Posted 10-30-2008 08:03 AM

I've tried hard, but i think it's not possible so i thought why not ask you all before I come to any conclusion.

I have a data like this :::

Internal_id First_nm Last_nm Plan_id Country
X Sushil Nayak 10 IND
X Sushil Nayak 20 IND
Z Dummy1 Dummy1 10 USA
Z Dummy1 Dummy1 10 IND
Y Dummy2 Dummy2 10 CHI
Y Dummy2 Dummy2 20 CHI
Y Dummy2 Dummy2 30 CHI

With this kinda of data using proc transpose only once, can we make the output like this

Internal_id First_nm Last_nm Plan_id_1 Plan_id_2 Plan_id_3 Country_1 Country_2 Country_3
X Sushil Nayak 10 20 IND IND
Z Dummy1 Dummy1 10 10 USA IND
Y Dummy2 Dummy2 10 20 30 CHI CHI CHI

This is a sample data, my problem has plenty of other variables likes Country and Plan which force me in using PROC transpose many times. I've tried datastep manipulation as well using macro, but i wanted to know if there is any way i can convert the problematic data given above in the format like above using PROC TRANSPOSE only once.

Thanks!

Cynthia_sas · Posted 10-30-2008 06:06 PM

Hi:
I think that while you could do what you want with multiple transposes, you might as well use a DATA step program and ARRAYs to build your data set with numbered variables for Plan_id and Country.

The program below creates some simple data and does what you want. There are a few assumptions about the program:
1) there are no more than 5 possible rows for any unique internal_id
2) first and last names are the same for every unique internal_id

cynthia
[pre]
** 1) Make some test data;
data plandata;
length internal_id $1 first_nm last_nm $8
Plan_id $2 Country $3;
infile datalines;
input internal_id $ first_nm $ last_nm $ Plan_id $ Country $;
return;
datalines;
A Alan Adams 10 FRA
A Alan Adams 11 BEL
B Bob Baxter 11 USA
B Bob Baxter 12 MEX
B Bob Baxter 13 PR
C Carla Castle 21 CHI
C Carla Castle 22 JP
C Carla Castle 23 TH
C Carla Castle 24 CHI
C Carla Castle 25 JP
D Darla Danes 30 USA
D Darla Danes 31 USA
D Darla Danes 32 BEL
D Darla Danes 33 FRA
X Sushil Nayak 10 IND
X Sushil Nayak 20 IND
;
run;

** 2) Sort the Data;
proc sort data=plandata;
by internal_id first_nm last_nm Plan_id;
run;

** 3) Use a DATA Step program to create numbered variables;
** based on Internal_ID;
data make_arr;
** read in the sorted data file and turn "on" BY group;
** processing and drop unwanted variables;
set plandata;
by internal_id;
drop country plan_id i index;

** Retain the numbered variables that we want to treat as array "members";
** and the "index" variable that will be used in the ;
** arrays;
retain pid1-pid5
cntry1-cntry5 index;

** Declare the ARRAYs with a max of 5 possible values;
** for PID (Plan_ID) and CNTRY (Country);
array pid $2 pid1-pid5;
array cntry $3 cntry1-cntry5;

** Initialize the array members to spaces;
** at the first unique internal_id -- will happen once;
** for every unique value of internal_id;
** (assumes that first_nm and last_nm are the same for each ID);
if first.internal_id then do;
do i = 1 to 5 by 1;
pid(i) = ' ';
cntry(i) = ' ';
end;
** set the index variable to 0 after initializing;
index = 0;
end;

** for every row in the by group: ;
** initialize index by 1;
** assign the array member pid(index) the Plan_id value;
** assign the array member cntry(index) the Country value;
index+1;
pid(index) = plan_id;
cntry(index) = country;

** because the array members are retained, when we are;
** on the LAST of the group for internal_id, the arrays;
** are now "filled" and we can output the new observation;
if last.internal_id then do;
arrcnt = index;
output;
end;
run;

ods listing;
options nodate nonumber nocenter;
** 4) Print the new data set;
ods html file='c:\temp\makearr.html' style=sasweb;
proc print data=make_arr;
title 'All Plan and Country variables in numbered vars';
var internal_id first_nm last_nm arrcnt pid1-pid5 cntry1-cntry5;
run;
ods html close;
[/pre]

SushilNayak · Posted 10-30-2008 11:46 PM

Hi Cynthia,
Thanks for looking into my problem. I totally agree with the datastep solution that you have given, but i was wondering if this sort of outcome can be possible using PROC TRANSPOSE only ONCE. Just wanted confirm this up, as i believe this can't be done using PROC TRNASPOSE only once.

Thanks!

data_null__ · Posted 10-31-2008 12:27 PM

If you have less than 100 observations per "flipped group" you can use PROC SUMMARY in a one step solution. Well 2 to get the group dimension, which would be needed for data step / array also. Plus the variable attributes are inherited from the flipped variables.

I borrowed Cynthia's data thank you.

[pre]
data plandata;
length internal_id $1 first_nm last_nm $8
Plan_id $2 Country $3;
input internal_id $ first_nm $ last_nm $ Plan_id $ Country $;
label Plan_id = 'Plan ID' country='Country of origin';
format Plan_id $quote5.;
datalines;
A Alan Adams 10 FRA
A Alan Adams 11 BEL
B Bob Baxter 11 USA
B Bob Baxter 12 MEX
B Bob Baxter 13 PR
C Carla Castle 21 CHI
C Carla Castle 22 JP
C Carla Castle 23 TH
C Carla Castle 24 CHI
C Carla Castle 25 JP
D Darla Danes 30 USA
D Darla Danes 31 USA
D Darla Danes 32 BEL
D Darla Danes 33 FRA
X Sushil Nayak 10 IND
X Sushil Nayak 20 IND
;
run;

proc sql noprint;
select max(N) into :outN
from (select count(*) as N from plandata group by internal_id)
;
quit;
run;

proc summary nway missing data=plandata;
class internal_id first_nm last_nm;
output out=FlatPlan(drop=_:) idgroup(out[&outN](plan_id country)=);
run;

proc contents varnum;
proc print;
run;
[/pre]

SushilNayak · Posted 11-01-2008 09:15 AM

Hey Null 🙂 ,
gr8 solution. Honestly i never thought of proc summary to solve this problem....wow!!..solves my problem with just one time usage of proc sql and proc summary.

Thanks!

SushilNayak · Posted 11-05-2008 06:33 AM

Hey Null/Cynthia,
Just found this :: http://support.sas.com/kb/33/814.html ... a basic solution to my problem. but not with proc transpose...but a different approach..kinda like it

Thanks to SAMPLES & SAS NOTES for the solution ::

data visits;
input dovisit date9. person_id sex :$1. nvisit :$1. fvisit :$1. avisit :$1.;
datalines;
18dec2007 444 M T F F
18dec2007 444 M T F F
10jan2007 365 M T F F
10jan2007 365 M T F F
01feb2007 212 F T T T
01feb2007 212 F T F T
;
run;

/*create a data set of the duplicates using DUPOUT= option */

proc sort data=visits dupout=visits_dup nodupkey;
by person_id;
run;

/* Create a macro variable with the variable names that are to */
/* be merged. The variables considered BY variables are excluded */
/* from going into the macro variable using the NOT IN operator. */
/* The resulting macro variable is in the format varname=varname_2 */ */
proc sql noprint;
select trim(name) || '=' || trim(name) || '_2'
into :varlist separated by ' '
from DICTIONARY.COLUMNS
WHERE LIBNAME EQ "WORK" and MEMNAME EQ "VISITS"
and upcase(name) not in ('PERSON_ID' 'DOVISIT' 'SEX');
quit;

/*Merge the two data sets using the macro variable to rename the */
/*common variables in the second (duplicates) data set. */
data merged;
merge visits visits_dup (rename=(&varlist));
by person_id;
run;

proc print;
run;

Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Re: Help on PROC TRANSPOSE

Click image to register for webinar

Classroom Training Available!