Solved: Transpose or Array?

gpv2000 · Posted 07-18-2018 07:31 AM

I need to transpose each subset of subject. But after transpose I want each subject to have unique columns thus resulting in a wide data set. I don't think I can accomplish this with transpose alone.

data have;

   input subject count start end change;

datalines;

12345   1   20   35   1

12345   1   36    39  2

12345    1  40    45    0


23456    2  15   19   0

23456    2   20   24   1 

34567     3   33   44   1

34567     3   45  50    2

45678     1   15   18  0    
45678     1   19   22  1


run;

So for example below if the first subject occupies columns st_1, ed_1, chg_1 through to st_3, ed_3, chg_3
Then the second subject will commence with st_4, ed_4, chg_4 through st_5, ed_5, chg_5 etc. But columns can be reused once the count variable resets to 1. Count resets after every 3 distinct subjects.

Desired output;

subject	count	st_1	ed_1	chg_1	st_2	ed_2	chg_2	st_3	ed_3	chg_3	st_4	ed_4	chg_4	st_5	ed_5	chg_5	st_6	ed_6	chg_6	st_7	ed_7	chg_7
12345	1	20	35	1	36	39	2	40	45	0
23456	2										15	19	0	20	24	1
34567	3																33	44	1	45	50	2
45678	1	15	18	0	19	22	1

Any suggestions?

Kurt_Bremser · Posted 07-18-2018 08:43 AM

Since we want to preserve variable names, we need to do a transpose for each variable separately; we also need a preliminary step that creates the repeat count within each group, and a final macro to reorder the columns as wanted:

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;

data int;
set have;
by count notsorted;
retain vcount group 0;
if first.count and count = 1
then vcount = 1;
else vcount + 1;
run;

proc transpose data=int out=want1 (drop=_name_) prefix=start;
by subject;
id vcount;
var start;
run;

proc transpose data=int out=want2 (drop=_name_) prefix=end;
by subject;
id vcount;
var end;
run;

proc transpose data=int out=want3 (drop=_name_) prefix=change;
by subject;
id vcount;
var change;
run;

data want;
merge
  want1
  want2
  want3
;
by subject;
run;

proc sql noprint;
select max(vcount) into :vcount from int;
quit;

%macro order_vars;
data want;
format subject
%do i = 1 %to &vcount;
  start&i
  end&i
  change&i
%end;
;
set want;
run;
%mend;
%order_vars

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 07-18-2018 08:43 AM

Since we want to preserve variable names, we need to do a transpose for each variable separately; we also need a preliminary step that creates the repeat count within each group, and a final macro to reorder the columns as wanted:

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;

data int;
set have;
by count notsorted;
retain vcount group 0;
if first.count and count = 1
then vcount = 1;
else vcount + 1;
run;

proc transpose data=int out=want1 (drop=_name_) prefix=start;
by subject;
id vcount;
var start;
run;

proc transpose data=int out=want2 (drop=_name_) prefix=end;
by subject;
id vcount;
var end;
run;

proc transpose data=int out=want3 (drop=_name_) prefix=change;
by subject;
id vcount;
var change;
run;

data want;
merge
  want1
  want2
  want3
;
by subject;
run;

proc sql noprint;
select max(vcount) into :vcount from int;
quit;

%macro order_vars;
data want;
format subject
%do i = 1 %to &vcount;
  start&i
  end&i
  change&i
%end;
;
set want;
run;
%mend;
%order_vars

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Kurt_Bremser · Posted 07-18-2018 08:54 AM

I found a better method, using a double transpose:

(start with the int dataset from my previous post)

proc transpose data=int out=int2;
by subject vcount;
var start end change;
run;

data int3;
set int2;
_name_ = cats(_name_,strip(put(vcount,best.)));
run;

proc transpose data=int3 out=want (drop=_name_);
by subject;
var col1;
id _name_;
run;

no macro programming needed, everything sorts out by itself.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

gpv2000 · Posted 07-18-2018 08:59 AM

Thanks. I was trying your earlier method. But I will now try your new suggestion.

gpv2000 · Posted 07-18-2018 02:33 PM

your solution works great for the sample data i provided. I think I always oversimplify my sample data. I tried to get to work for my data and it gives me errors. I have posted more realistic data below now. The count is assigned for first 15 unique (cohort,subject) and then it resets.

data have;
input cohort subject count start end change;
datalines;

xxx_020 679910101 1 0 43 0
xxx_040 119910102 2 0 60 0
xxx_040 679910102 3 0 43 0
xxx_060 119930101 4 0 15 0
xxx_060 119930101 4 15 	35	1
xxx_060 119930101 4 35	50	0
xxx_060 119930101 4 50	57	1
xxx_060 119930101 4 57	64	0
xxx_060 119930101 4 64	71	1
xxx_060 119930101 4 71	78	0
xxx_060 119930101 4 78	84	1
xxx_060 119930101 4 84	99	0
xxx_060 119930101 4 99	141	1
xxx_060 119930101 4 141	148	2
xxx_060 119930101 4 148	168	1
xxx_060 119930101 4 168	176	2
xxx_060 119930101 4 176	197	1
xxx_060 119930101 4 197	204	2
xxx_060 119930101 4 204	225	1
xxx_060 119930101 4 225	268	2
xxx_060 139910101 5 0	23	0
xxx_100 119910103 6 0	22	0
xxx_100 119910103 6 22	36	1
xxx_100 119910103 6 36	58	2
xxx_100 119910103 6 58	67	0
xxx_100 119910103 6 67	71	1
xxx_100 119910103 6 71	78	0
xxx_100 119910103 6 78	87	1
xxx_100 119910103 6 87	108	0
xxx_100 119910103 6 108	134	1
xxx_100 119910103 6 134	149	0
xxx_100 119910103 6 149	155	2
xxx_100 119910103 6 155	169	0
xxx_100 119910103 6 169	184	1
xxx_100 119910103 6 184	190	2
xxx_100 119910103 6 190	197	0
xxx_100 119910103 6 197	211	1
xxx_100 119910103 6 211	218	0
xxx_100 119910103 6 218	233	1
xxx_100 139910102 7 0	163	0
xxx_100 139910102 7 163	169	2
xxx_100 139910102 7 169	176	1
xxx_100 139910102 7 176	183	0
xxx_100 139910102 7 183	190	2
xxx_100 139910102 7 190	205	1
xxx_100 139910102 7 205	211	2
xxx_100 139910102 7 211	218	0
xxx_100 139910102 7 218	233	2
xxx_100 139910102 7 233	234	0
xxx_140 119910104 8 0	7	0
xxx_140 119910104 8 7	15	2
xxx_140 119910104 8 15	22	3
xxx_140 119910104 8 22	36	2
xxx_140 119910104 8 36	38	3
xxx_140 119910104 8 38	43	4
xxx_140 119910104 8 43	50	1
xxx_140 119910104 8 50	57	2
xxx_140 119910104 8 57	64	3
xxx_140 119910104 8 64	78	1
xxx_140 119910104 8 78	92	2
xxx_140 119910104 8 92	113	3
xxx_140 119910104 8 113	120	2
xxx_140 119910104 8 120	148	0
xxx_140 119910104 8 148	149	3
xxx_140 119910105 9 0	15	0
xxx_140 119910105 9 15	29	1
xxx_140 119910105 9 29	36	2
xxx_140 119910105 9 36	51	1
xxx_140 119910105 9 51	97	0
xxx_140 119910106 10	0	16	0
xxx_140 119910106 10	16	30	2
xxx_140 119910106 10	30	37	4
xxx_140 119910106 10	37	52	0
xxx_140 139910104 11	0	8	0
xxx_140 139910104 11	8	10	3
xxx_140 139910104 11	10	17	4
xxx_140 139910104 11	17	20	2
xxx_140 139910104 11	20	29	0
xxx_140 139910104 11	29	31	3
xxx_140 139910104 11	31	34	0
xxx_140 139910105 12	0	8	0
xxx_140 139910105 12	8	22	2
xxx_140 139910105 12	22	26	3
xxx_140 139910105 12	26	36	0
xxx_140 139910105 12	36	43	2
xxx_140 139910105 12	43	50	3
xxx_140 139910105 12	50	86	2
xxx_140 139910105 12	86	87	0
xxx_140 139910106 13	0	8	0
xxx_140 139910106 13	8	15	1
xxx_140 139910106 13	15	23	2
xxx_140 139910106 13	23	30	1
xxx_140 139910106 13	30	44	2
xxx_140 139910106 13	44	45	3
xxx_140 679910104 14	0	7	0
xxx_140 679910104 14	7	28	2
xxx_140 679910104 14	28	36	3
xxx_140 679910104 14	36	42	0
xxx_140 679910104 14	42	49	2
xxx_140 679910104 14	49	105	0
xxx_140 679910104 14	105	112	2
xxx_140 679910104 14	112	126	0
xxx_140 679910104 14	126	133	1
xxx_140 679910104 14	133	140	2
xxx_140 679910104 14	140	148	0
xxx_140 679910105 15	0	24	0
xxx_140 679910105 15	24	28	1
xxx_140 679910105 15	28	42	2
xxx_140 679910105 15	42	92	0
xxx_160 119910107 1 0	8	0
xxx_160 119910107 1 8	15	1
xxx_160 119910107 1 15	22	2
xxx_160 119910107 1 22	25	0
xxx_160 119910107 1 25	29	1
xxx_160 119910107 1 29	30	0
xxx_160 119910108 2 0	1	0
xxx_160 119910108 2 1	10	1
xxx_160 119910108 2 10	17	3
xxx_160 119910108 2 17	39	2
xxx_160 139910107 3 0	17	0
xxx_160 139910107 3 17	31	2
xxx_160 139910107 3 31	38	0
xxx_160 139910107 3 38	43	2
xxx_160 139910107 3 43	48	0
        ;
run;

Kurt_Bremser · Posted 07-19-2018 02:30 AM

To me it looks like you have to use cohort wherever you also use subject.

Please fix your data step code so that it works, I get all kinds of messages and incomplete records. Make sure your delimiters are consistent (either all blanks or all tabs). I can only test code against solid data.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ksharp · Posted 07-18-2018 09:11 AM

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;
proc sql noprint;
select max(n) into : n
 from (select count(*) as n from have group by subject,count);
quit;
proc summary data=have;
by subject count;
output out=want idgroup(out[&n] (start end change)=);
run;
proc print noobs;run;

Transpose or Array?

Re: Transpose or Array?

Re: Transpose or Array?

Re: Transpose or Array?

Re: Transpose or Array?

Re: Transpose or Array?

Re: Transpose or Array?

Re: Transpose or Array?

Registration is open

Registration is open

SAS Training: Just a Click Away