andrewjason
Fluorite | Level 6

Hi, I have a dataset that repeats a row whenever there is more than one value in the final column, like below:

 

ID   fruit    quality
1    banana   yellow
2    apple    green
2    apple    round

 

What I want is for each row to be unique, like below: if the quality column has multiple entries for the same ID and fruit, each additional entry should go into a new column (quality_2, quality_3, quality_4, and so on).

 

ID   fruit    quality_1   quality_2
1    banana   yellow
2    apple    green       round

 

 

I have looked around the forum and in some textbooks for an answer and haven't found much. Any help would be appreciated. Thanks!

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

That is called transposing data from long to wide. PROC TRANSPOSE will do this in SAS.

For almost every purpose processing is easier in the long form.

Example: if both Apple and Banana have the quality "sweet", there is no way to ensure that "sweet" ends up in the same quality variable for both, so everything you do later means searching through many variables. And if you later have another data set to combine, the same quality value for Apple is very likely to appear in a different variable, and the number of quality variables may change. That complicates all of those searches through multiple variables to determine whether "sweet" is one of the qualities.
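To illustrate the point, here is a small sketch with made-up data. The data set names long, wide, and sweet_wide, the variable lengths, and the "sweet" rows are all assumptions for illustration and are not from the original question. In long form a single condition on one variable finds every fruit with a given quality. In wide form the same value can sit in a different quality_ variable for each fruit, so every one of them has to be searched.

```sas
/* Hypothetical long data: one row per id/fruit/quality */
data long;
    input id fruit :$10. quality :$10.;
    datalines;
1 banana yellow
1 banana sweet
2 apple green
2 apple round
2 apple sweet
;
run;

/* Long form: one condition on one variable */
proc print data=long noobs;
    where quality = 'sweet';
run;

/* Wide form of the same data (the rows above are already in id/fruit order) */
proc transpose data=long out=wide(drop=_name_) prefix=quality_;
    by id fruit;
    var quality;
run;

/* Wide form: "sweet" lands in quality_2 for banana but quality_3 for apple, */
/* so all quality_ variables must be searched                                */
data sweet_wide;
    set wide;
    if whichc('sweet', of quality_:) > 0;
run;
```

The wide-form search also depends on the quality_: variable list, which grows or shrinks with the data, while the long-form WHERE clause stays the same.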

 

Sort the data by id and fruit.

Then

Proc transpose data=have out=want prefix=quality_;
by id fruit;
var quality;
run;
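For anyone who wants to run the steps above end to end, here is a minimal sketch. The data set names have and want come from the code above and the DATALINES rows are copied from the question. The variable lengths, the DROP=_NAME_ option (which discards the automatic _NAME_ variable that PROC TRANSPOSE creates), and the final PROC PRINT are my additions.

```sas
/* Sample data from the question; the $10. lengths are an assumption */
data have;
    input id fruit :$10. quality :$10.;
    datalines;
1 banana yellow
2 apple green
2 apple round
;
run;

/* BY-group processing expects the data sorted by the BY variables */
proc sort data=have;
    by id fruit;
run;

/* One output row per id/fruit; PREFIX= names the new columns */
/* quality_1, quality_2, and so on                             */
proc transpose data=have out=want(drop=_name_) prefix=quality_;
    by id fruit;
    var quality;
run;

proc print data=want noobs;
run;
```

The result should have one row per id/fruit combination with columns quality_1 and quality_2, matching the layout requested in the question. The number of quality_ columns is determined by the BY group with the most rows.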


4 REPLIES
Reeza
Super User

Have you looked at PROC TRANSPOSE? That works fine for me and generates the output you indicated.

 

[Attached screenshot: delete_transpose.JPG]



andrewjason
Fluorite | Level 6
This is great! Thanks!

andrewjason
Fluorite | Level 6

Great explanation! Thank you!
