BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
renanm
Obsidian | Level 7

Hello everyone,

 

I've been trying to do a transformation which I though would be quite simple. After trying most approaches shared in this forum I'm still not getting the expected results so looking for some help/guidance.

 

I have a dataset that looks like:

data have;
input id tr1 tr2 tr3 tr4;
cards;
1 AAA BBB GGG FFF
2 DDD CCC AAA BBB
3 BBB AAA JJJ VVV
;
run;

 

The only difference is my real data is that variable TRX goes from 1 to 125 (tr1....tr125).

 

I would like the data to look like:

 

data want;
input id tr;
cards;
1 AAA
1 BBB
1 GGG
1 FFF
2 DDD
2 CCC
2 AAA
2 BBB
3 BBB
3 AAA
3 JJJ
3 VVV
;
run;

In essence, instead of multiple columns I'd like multiple rows.

 

The following code seems to work in very similar cases but not working for this one:

 

proc transpose data=have out=have_trans;
   by ID;
   var tr1-tr4;
run;

This yield a dataset with the following columns:

ID _NAME_ COL1 COL2 COL3 COL4 COL5........

 

How can I make it so that it yields only ID _NAME_ COL1 with all values previous in multiple columns contained in COL1.

 

Thanks !

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Examine your data. I will bet that you find that you have duplicate values of one or more values of ID.

Example: ID=1 appears twice, so we get col1 and col2 , one column for each duplicate of id. So if your output includes COL25 then at least one value of ID is duplicated 25 times in the input set.

data have;
input id tr1 $ tr2 $ tr3 $ tr4 $;
cards;
1 AAA BBB GGG FFF
1 DDD CCC AAA BBB
3 BBB AAA JJJ VVV
;
run;

proc transpose data=have out=have_trans;
   by id;
   var tr1-tr4;
run;

 

 

View solution in original post

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Works fine?

 

data have;
input id tr1 $ tr2 $ tr3 $ tr4 $;
cards;
1 AAA BBB GGG FFF
2 DDD CCC AAA BBB
3 BBB AAA JJJ VVV
;
run;

proc transpose data=have out=have_trans(drop=_NAME_ rename=(COl1=tr));
   by ID;
   var tr1-tr4;
run;
renanm
Obsidian | Level 7

Thanks for your response.

 

I had previously tried this, but unfortunately it doesn't work as expected even though it does work in this sample dataset.

 

Essentially, in my real data this returns the following structure:

 

ID TR COL2 COL3 COL4 COL5.....COL125

 

Not sure why the remaining columns continue to show up.

 

I need the final data to have only the following with all values contained in the variable TR:

ID TR

 

Thanks for the support.

 

 

 

renanm
Obsidian | Level 7

interestingly, I I just re-ran the same code on a smaller dataset (for speed purposes) and it now returns only up to COL3:

ID TR COL2 COL3

 

@Kurt_Bremser, here's the log:

 

;*';*";*/;quit;run;
2 OPTIONS PAGENO=MIN;
3 %LET _CLIENTTASKLABEL='Tracking File';
4 %LET _CLIENTPROJECTPATH='';
5 %LET _CLIENTPROJECTNAME='';
6 %LET _SASPROGRAMFILE=;
7
8 ODS _ALL_ CLOSE;
9 OPTIONS DEV=ACTIVEX;
10 GOPTIONS XPIXELS=0 YPIXELS=0;
11 FILENAME EGSR TEMP;
12 ODS tagsets.sasreport13(ID=EGSR) FILE=EGSR
13 STYLE=HtmlBlue
14 STYLESHEET=(URL="file:///C:/Program%20Files/SASHome2/SASEnterpriseGuide/7.1/Styles/HtmlBlue.css")
15 NOGTITLE
16 NOGFOOTNOTE
17 GPATH=&sasworklocation
18 ENCODING=UTF8
19 options(rolap="on")
20 ;
NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
21
22 GOPTIONS ACCESSIBLE;
23 proc transpose data=have out=want_trans (drop=_NAME_ rename=(COl1=tr));
24 by ID;
25 var TR0-TR125;
26 run;

NOTE: There were 15824 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT_TRANS has 276423 observations and 4 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.10 seconds
cpu time 0.09 seconds

27
28 GOPTIONS NOACCESSIBLE;
29 %LET _CLIENTTASKLABEL=;
30 %LET _CLIENTPROJECTPATH=;
31 %LET _CLIENTPROJECTNAME=;
32 %LET _SASPROGRAMFILE=;
33
34 ;*';*";*/;quit;run;
35 ODS _ALL_ CLOSE;
36
37
38 QUIT; RUN;
39

ballardw
Super User

Examine your data. I will bet that you find that you have duplicate values of one or more values of ID.

Example: ID=1 appears twice, so we get col1 and col2 , one column for each duplicate of id. So if your output includes COL25 then at least one value of ID is duplicated 25 times in the input set.

data have;
input id tr1 $ tr2 $ tr3 $ tr4 $;
cards;
1 AAA BBB GGG FFF
1 DDD CCC AAA BBB
3 BBB AAA JJJ VVV
;
run;

proc transpose data=have out=have_trans;
   by id;
   var tr1-tr4;
run;

 

 

renanm
Obsidian | Level 7

@ballardw thanks for taking the time to help. 

 

You're absolutely right. After deduping the file it worked as expected. 

 

Glad to learn duplicates should be eliminated before running this procedure. 

 

Thanks again !

ballardw
Super User

@renanm wrote:

@ballardw thanks for taking the time to help. 

 

You're absolutely right. After deduping the file it worked as expected. 

 

Glad to learn duplicates should be eliminated before running this procedure. 

 

Thanks again !


Knowing the cause is helpful. Often you may actually want multiple columns like that for some purposes.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2738 views
  • 2 likes
  • 4 in conversation