BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gpv2000
Calcite | Level 5

I need to transpose each subset of subject. But after transpose I want each subject to have unique columns thus resulting in a wide data set. I don't think I can accomplish this with transpose alone.



data have;

   input subject count start end change;

datalines;

12345   1   20   35   1

12345   1   36    39  2

12345    1  40    45    0


23456    2  15   19   0

23456    2   20   24   1 

34567     3   33   44   1

34567     3   45  50    2

45678     1   15   18  0    
45678     1   19   22  1


run;

So for example below if the first subject occupies columns st_1, ed_1, chg_1 through to st_3, ed_3chg_3
Then the second subject will commence with st_4, ed_4, chg_4 through  st_5, ed_5, chg_5 etc. But columns can be reused once the count variable resets to 1. Count resets after every 3 distinct subjects.

Desired output;

 

subject countst_1ed_1 chg_1st_2ed_2chg_2st_3ed_3chg_3st_4ed_4chg_4st_5ed_5chg_5st_6ed_6chg_6st_7ed_7chg_7
123451203513639240450            
234562         1519020241      
345673               3344145502
4567811518019221               

    Any suggestions?

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Since we want to preserve variable names, we need to do a transpose for each variable separately; we also need a preliminary step that creates the repeat count within each group, and a final macro to reorder the columns as wanted:

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;

data int;
set have;
by count notsorted;
retain vcount group 0;
if first.count and count = 1
then vcount = 1;
else vcount + 1;
run;

proc transpose data=int out=want1 (drop=_name_) prefix=start;
by subject;
id vcount;
var start;
run;

proc transpose data=int out=want2 (drop=_name_) prefix=end;
by subject;
id vcount;
var end;
run;

proc transpose data=int out=want3 (drop=_name_) prefix=change;
by subject;
id vcount;
var change;
run;

data want;
merge
  want1
  want2
  want3
;
by subject;
run;

proc sql noprint;
select max(vcount) into :vcount from int;
quit;

%macro order_vars;
data want;
format subject
%do i = 1 %to &vcount;
  start&i
  end&i
  change&i
%end;
;
set want;
run;
%mend;
%order_vars

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Since we want to preserve variable names, we need to do a transpose for each variable separately; we also need a preliminary step that creates the repeat count within each group, and a final macro to reorder the columns as wanted:

data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;

data int;
set have;
by count notsorted;
retain vcount group 0;
if first.count and count = 1
then vcount = 1;
else vcount + 1;
run;

proc transpose data=int out=want1 (drop=_name_) prefix=start;
by subject;
id vcount;
var start;
run;

proc transpose data=int out=want2 (drop=_name_) prefix=end;
by subject;
id vcount;
var end;
run;

proc transpose data=int out=want3 (drop=_name_) prefix=change;
by subject;
id vcount;
var change;
run;

data want;
merge
  want1
  want2
  want3
;
by subject;
run;

proc sql noprint;
select max(vcount) into :vcount from int;
quit;

%macro order_vars;
data want;
format subject
%do i = 1 %to &vcount;
  start&i
  end&i
  change&i
%end;
;
set want;
run;
%mend;
%order_vars
Kurt_Bremser
Super User

I found a better method, using a double transpose:

(start with the int dataset from my previous post)

proc transpose data=int out=int2;
by subject vcount;
var start end change;
run;

data int3;
set int2;
_name_ = cats(_name_,strip(put(vcount,best.)));
run;

proc transpose data=int3 out=want (drop=_name_);
by subject;
var col1;
id _name_;
run;

no macro programming needed, everything sorts out by itself.

gpv2000
Calcite | Level 5
Thanks. I was trying your earlier method. But I will now try your new suggestion.
gpv2000
Calcite | Level 5

your solution works great for the sample data i provided. I think I always oversimplify my sample data. I tried to get to work for my data and it gives me errors. I have posted more realistic data below now. The count is assigned for first 15 unique (cohort,subject) and then it resets.

data have;
input cohort subject count start end change;
datalines;

xxx_020 679910101 1 0 43 0
xxx_040 119910102 2 0 60 0
xxx_040 679910102 3 0 43 0
xxx_060 119930101 4 0 15 0
xxx_060 119930101 4 15 	35	1
xxx_060 119930101 4 35	50	0
xxx_060 119930101 4 50	57	1
xxx_060 119930101 4 57	64	0
xxx_060 119930101 4 64	71	1
xxx_060 119930101 4 71	78	0
xxx_060 119930101 4 78	84	1
xxx_060 119930101 4 84	99	0
xxx_060 119930101 4 99	141	1
xxx_060 119930101 4 141	148	2
xxx_060 119930101 4 148	168	1
xxx_060 119930101 4 168	176	2
xxx_060 119930101 4 176	197	1
xxx_060 119930101 4 197	204	2
xxx_060 119930101 4 204	225	1
xxx_060 119930101 4 225	268	2
xxx_060 139910101 5 0	23	0
xxx_100 119910103 6 0	22	0
xxx_100 119910103 6 22	36	1
xxx_100 119910103 6 36	58	2
xxx_100 119910103 6 58	67	0
xxx_100 119910103 6 67	71	1
xxx_100 119910103 6 71	78	0
xxx_100 119910103 6 78	87	1
xxx_100 119910103 6 87	108	0
xxx_100 119910103 6 108	134	1
xxx_100 119910103 6 134	149	0
xxx_100 119910103 6 149	155	2
xxx_100 119910103 6 155	169	0
xxx_100 119910103 6 169	184	1
xxx_100 119910103 6 184	190	2
xxx_100 119910103 6 190	197	0
xxx_100 119910103 6 197	211	1
xxx_100 119910103 6 211	218	0
xxx_100 119910103 6 218	233	1
xxx_100 139910102 7 0	163	0
xxx_100 139910102 7 163	169	2
xxx_100 139910102 7 169	176	1
xxx_100 139910102 7 176	183	0
xxx_100 139910102 7 183	190	2
xxx_100 139910102 7 190	205	1
xxx_100 139910102 7 205	211	2
xxx_100 139910102 7 211	218	0
xxx_100 139910102 7 218	233	2
xxx_100 139910102 7 233	234	0
xxx_140 119910104 8 0	7	0
xxx_140 119910104 8 7	15	2
xxx_140 119910104 8 15	22	3
xxx_140 119910104 8 22	36	2
xxx_140 119910104 8 36	38	3
xxx_140 119910104 8 38	43	4
xxx_140 119910104 8 43	50	1
xxx_140 119910104 8 50	57	2
xxx_140 119910104 8 57	64	3
xxx_140 119910104 8 64	78	1
xxx_140 119910104 8 78	92	2
xxx_140 119910104 8 92	113	3
xxx_140 119910104 8 113	120	2
xxx_140 119910104 8 120	148	0
xxx_140 119910104 8 148	149	3
xxx_140 119910105 9 0	15	0
xxx_140 119910105 9 15	29	1
xxx_140 119910105 9 29	36	2
xxx_140 119910105 9 36	51	1
xxx_140 119910105 9 51	97	0
xxx_140 119910106 10	0	16	0
xxx_140 119910106 10	16	30	2
xxx_140 119910106 10	30	37	4
xxx_140 119910106 10	37	52	0
xxx_140 139910104 11	0	8	0
xxx_140 139910104 11	8	10	3
xxx_140 139910104 11	10	17	4
xxx_140 139910104 11	17	20	2
xxx_140 139910104 11	20	29	0
xxx_140 139910104 11	29	31	3
xxx_140 139910104 11	31	34	0
xxx_140 139910105 12	0	8	0
xxx_140 139910105 12	8	22	2
xxx_140 139910105 12	22	26	3
xxx_140 139910105 12	26	36	0
xxx_140 139910105 12	36	43	2
xxx_140 139910105 12	43	50	3
xxx_140 139910105 12	50	86	2
xxx_140 139910105 12	86	87	0
xxx_140 139910106 13	0	8	0
xxx_140 139910106 13	8	15	1
xxx_140 139910106 13	15	23	2
xxx_140 139910106 13	23	30	1
xxx_140 139910106 13	30	44	2
xxx_140 139910106 13	44	45	3
xxx_140 679910104 14	0	7	0
xxx_140 679910104 14	7	28	2
xxx_140 679910104 14	28	36	3
xxx_140 679910104 14	36	42	0
xxx_140 679910104 14	42	49	2
xxx_140 679910104 14	49	105	0
xxx_140 679910104 14	105	112	2
xxx_140 679910104 14	112	126	0
xxx_140 679910104 14	126	133	1
xxx_140 679910104 14	133	140	2
xxx_140 679910104 14	140	148	0
xxx_140 679910105 15	0	24	0
xxx_140 679910105 15	24	28	1
xxx_140 679910105 15	28	42	2
xxx_140 679910105 15	42	92	0
xxx_160 119910107 1 0	8	0
xxx_160 119910107 1 8	15	1
xxx_160 119910107 1 15	22	2
xxx_160 119910107 1 22	25	0
xxx_160 119910107 1 25	29	1
xxx_160 119910107 1 29	30	0
xxx_160 119910108 2 0	1	0
xxx_160 119910108 2 1	10	1
xxx_160 119910108 2 10	17	3
xxx_160 119910108 2 17	39	2
xxx_160 139910107 3 0	17	0
xxx_160 139910107 3 17	31	2
xxx_160 139910107 3 31	38	0
xxx_160 139910107 3 38	43	2
xxx_160 139910107 3 43	48	0
        ;
run;
Kurt_Bremser
Super User

To me it looks like you have to use cohort wherever you also use subject.

Please fix your data step code so that it works, I get all kinds of messages and incomplete records. Make sure your delimiters are consistent (either all blanks or all tabs). I can only test code against solid data.

 

Ksharp
Super User
data have;
input subject count start end change;
datalines;
12345   1   20   35   1
12345   1   36    39  2
12345    1  40    45    0
23456    2  15   19   0
23456    2   20   24   1 
34567     3   33   44   1
34567     3   45  50    2
45678     1   15   18  0
45678     1   19   22  1
;
run;
proc sql noprint;
select max(n) into : n
 from (select count(*) as n from have group by subject,count);
quit;
proc summary data=have;
by subject count;
output out=want idgroup(out[&n] (start end change)=);
run;
proc print noobs;run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 591 views
  • 0 likes
  • 3 in conversation