BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gpv2000
Calcite | Level 5

I need to transpose each subset of subject. But after transpose I want each subject to have unique columns thus resulting in a wide data set. I don't think I can accomplish this with transpose alone.



data have;

   input subject count start end change;

datalines;

12345   1   20   35   1

12345   1   36    39  2

12345    1  40    45    0


23456    2  15   19   0

23456    2   20   24   1 

34567     3   33   44   1

34567     3   45  50    2

45678     1   15   18  0    
45678     1   19   22  1


run;

So for example below if the first subject occupies columns st_1, ed_1, chg_1 through to st_3, ed_3chg_3
Then the second subject will commence with st_4, ed_4, chg_4 through  st_5, ed_5, chg_5 etc. But columns can be reused once the count variable resets to 1. Count resets after every 3 distinct subjects.

Desired output;

 

subject countst_1ed_1 chg_1st_2ed_2chg_2st_3ed_3chg_3st_4ed_4chg_4st_5ed_5chg_5st_6ed_6chg_6st_7ed_7chg_7
123451203513639240450            
234562         1519020241      
345673               3344145502
4567811518019221               

    Any suggestions?

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Since we want to preserve variable names, we need to do a transpose for each variable separately; we also need a preliminary step that creates the repeat count within each group, and a final macro to reorder the columns as wanted:

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;

data int;
set have;
by count notsorted;
retain vcount group 0;
if first.count and count = 1
then vcount = 1;
else vcount + 1;
run;

proc transpose data=int out=want1 (drop=_name_) prefix=start;
by subject;
id vcount;
var start;
run;

proc transpose data=int out=want2 (drop=_name_) prefix=end;
by subject;
id vcount;
var end;
run;

proc transpose data=int out=want3 (drop=_name_) prefix=change;
by subject;
id vcount;
var change;
run;

data want;
merge
  want1
  want2
  want3
;
by subject;
run;

proc sql noprint;
select max(vcount) into :vcount from int;
quit;

%macro order_vars;
data want;
format subject
%do i = 1 %to &vcount;
  start&i
  end&i
  change&i
%end;
;
set want;
run;
%mend;
%order_vars

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Since we want to preserve variable names, we need to do a transpose for each variable separately; we also need a preliminary step that creates the repeat count within each group, and a final macro to reorder the columns as wanted:

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;

data int;
set have;
by count notsorted;
retain vcount group 0;
if first.count and count = 1
then vcount = 1;
else vcount + 1;
run;

proc transpose data=int out=want1 (drop=_name_) prefix=start;
by subject;
id vcount;
var start;
run;

proc transpose data=int out=want2 (drop=_name_) prefix=end;
by subject;
id vcount;
var end;
run;

proc transpose data=int out=want3 (drop=_name_) prefix=change;
by subject;
id vcount;
var change;
run;

data want;
merge
  want1
  want2
  want3
;
by subject;
run;

proc sql noprint;
select max(vcount) into :vcount from int;
quit;

%macro order_vars;
data want;
format subject
%do i = 1 %to &vcount;
  start&i
  end&i
  change&i
%end;
;
set want;
run;
%mend;
%order_vars
Kurt_Bremser
Super User

I found a better method, using a double transpose:

(start with the int dataset from my previous post)

proc transpose data=int out=int2;
by subject vcount;
var start end change;
run;

data int3;
set int2;
_name_ = cats(_name_,strip(put(vcount,best.)));
run;

proc transpose data=int3 out=want (drop=_name_);
by subject;
var col1;
id _name_;
run;

no macro programming needed, everything sorts out by itself.

gpv2000
Calcite | Level 5
Thanks. I was trying your earlier method. But I will now try your new suggestion.
gpv2000
Calcite | Level 5

your solution works great for the sample data i provided. I think I always oversimplify my sample data. I tried to get to work for my data and it gives me errors. I have posted more realistic data below now. The count is assigned for first 15 unique (cohort,subject) and then it resets.

data have;
input cohort subject count start end change;
datalines;

xxx_020 679910101 1 0 43 0
xxx_040 119910102 2 0 60 0
xxx_040 679910102 3 0 43 0
xxx_060 119930101 4 0 15 0
xxx_060 119930101 4 15 	35	1
xxx_060 119930101 4 35	50	0
xxx_060 119930101 4 50	57	1
xxx_060 119930101 4 57	64	0
xxx_060 119930101 4 64	71	1
xxx_060 119930101 4 71	78	0
xxx_060 119930101 4 78	84	1
xxx_060 119930101 4 84	99	0
xxx_060 119930101 4 99	141	1
xxx_060 119930101 4 141	148	2
xxx_060 119930101 4 148	168	1
xxx_060 119930101 4 168	176	2
xxx_060 119930101 4 176	197	1
xxx_060 119930101 4 197	204	2
xxx_060 119930101 4 204	225	1
xxx_060 119930101 4 225	268	2
xxx_060 139910101 5 0	23	0
xxx_100 119910103 6 0	22	0
xxx_100 119910103 6 22	36	1
xxx_100 119910103 6 36	58	2
xxx_100 119910103 6 58	67	0
xxx_100 119910103 6 67	71	1
xxx_100 119910103 6 71	78	0
xxx_100 119910103 6 78	87	1
xxx_100 119910103 6 87	108	0
xxx_100 119910103 6 108	134	1
xxx_100 119910103 6 134	149	0
xxx_100 119910103 6 149	155	2
xxx_100 119910103 6 155	169	0
xxx_100 119910103 6 169	184	1
xxx_100 119910103 6 184	190	2
xxx_100 119910103 6 190	197	0
xxx_100 119910103 6 197	211	1
xxx_100 119910103 6 211	218	0
xxx_100 119910103 6 218	233	1
xxx_100 139910102 7 0	163	0
xxx_100 139910102 7 163	169	2
xxx_100 139910102 7 169	176	1
xxx_100 139910102 7 176	183	0
xxx_100 139910102 7 183	190	2
xxx_100 139910102 7 190	205	1
xxx_100 139910102 7 205	211	2
xxx_100 139910102 7 211	218	0
xxx_100 139910102 7 218	233	2
xxx_100 139910102 7 233	234	0
xxx_140 119910104 8 0	7	0
xxx_140 119910104 8 7	15	2
xxx_140 119910104 8 15	22	3
xxx_140 119910104 8 22	36	2
xxx_140 119910104 8 36	38	3
xxx_140 119910104 8 38	43	4
xxx_140 119910104 8 43	50	1
xxx_140 119910104 8 50	57	2
xxx_140 119910104 8 57	64	3
xxx_140 119910104 8 64	78	1
xxx_140 119910104 8 78	92	2
xxx_140 119910104 8 92	113	3
xxx_140 119910104 8 113	120	2
xxx_140 119910104 8 120	148	0
xxx_140 119910104 8 148	149	3
xxx_140 119910105 9 0	15	0
xxx_140 119910105 9 15	29	1
xxx_140 119910105 9 29	36	2
xxx_140 119910105 9 36	51	1
xxx_140 119910105 9 51	97	0
xxx_140 119910106 10	0	16	0
xxx_140 119910106 10	16	30	2
xxx_140 119910106 10	30	37	4
xxx_140 119910106 10	37	52	0
xxx_140 139910104 11	0	8	0
xxx_140 139910104 11	8	10	3
xxx_140 139910104 11	10	17	4
xxx_140 139910104 11	17	20	2
xxx_140 139910104 11	20	29	0
xxx_140 139910104 11	29	31	3
xxx_140 139910104 11	31	34	0
xxx_140 139910105 12	0	8	0
xxx_140 139910105 12	8	22	2
xxx_140 139910105 12	22	26	3
xxx_140 139910105 12	26	36	0
xxx_140 139910105 12	36	43	2
xxx_140 139910105 12	43	50	3
xxx_140 139910105 12	50	86	2
xxx_140 139910105 12	86	87	0
xxx_140 139910106 13	0	8	0
xxx_140 139910106 13	8	15	1
xxx_140 139910106 13	15	23	2
xxx_140 139910106 13	23	30	1
xxx_140 139910106 13	30	44	2
xxx_140 139910106 13	44	45	3
xxx_140 679910104 14	0	7	0
xxx_140 679910104 14	7	28	2
xxx_140 679910104 14	28	36	3
xxx_140 679910104 14	36	42	0
xxx_140 679910104 14	42	49	2
xxx_140 679910104 14	49	105	0
xxx_140 679910104 14	105	112	2
xxx_140 679910104 14	112	126	0
xxx_140 679910104 14	126	133	1
xxx_140 679910104 14	133	140	2
xxx_140 679910104 14	140	148	0
xxx_140 679910105 15	0	24	0
xxx_140 679910105 15	24	28	1
xxx_140 679910105 15	28	42	2
xxx_140 679910105 15	42	92	0
xxx_160 119910107 1 0	8	0
xxx_160 119910107 1 8	15	1
xxx_160 119910107 1 15	22	2
xxx_160 119910107 1 22	25	0
xxx_160 119910107 1 25	29	1
xxx_160 119910107 1 29	30	0
xxx_160 119910108 2 0	1	0
xxx_160 119910108 2 1	10	1
xxx_160 119910108 2 10	17	3
xxx_160 119910108 2 17	39	2
xxx_160 139910107 3 0	17	0
xxx_160 139910107 3 17	31	2
xxx_160 139910107 3 31	38	0
xxx_160 139910107 3 38	43	2
xxx_160 139910107 3 43	48	0
        ;
run;
Kurt_Bremser
Super User

To me it looks like you have to use cohort wherever you also use subject.

Please fix your data step code so that it works, I get all kinds of messages and incomplete records. Make sure your delimiters are consistent (either all blanks or all tabs). I can only test code against solid data.

 

Ksharp
Super User
data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;
proc sql noprint;
select max(n) into : n
 from (select count(*) as n from have group by subject,count);
quit;
proc summary data=have;
by subject count;
output out=want idgroup(out[&n] (start end change)=);
run;
proc print noobs;run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1488 views
  • 0 likes
  • 3 in conversation