BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sarahsasuser
Quartz | Level 8

I have the following dataset (forgive the "fake" data) and want to merge the Content column (and have the data separated by a comma) based on the Type column.

Building     Type     Content

A22          Office     chair

A22          Office     desk

A22          Res.       bed

A25          Res.       bed

A25          Res.       lamp

A26          Office     chair

A26          Res.       chair

A26          Res.       bed

A26          Res.       lamp

I want to produce the following output:

Building TypeContent
A22Officechair, desk
A22Res.bed
A25Res.bed, lamp
A26Officechair, desk
A26Res.chair, bed, lamp

Is there a way to do this in proc report, or should I merge the content cells in a data step?

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Try something like the following, and then use the last (commented out right now) to keep only the last record.

proc sort data=have; by building type; run;

data want;

length content_all $200.;

set have;

by building type;

retain content_all;

if first.type then content_all=content;

else content_all=trim(content_all)|| ", "||trim(compress(content));

*If last.type then output;

run; 

View solution in original post

11 REPLIES 11
sascom10
Calcite | Level 5

I feel better to do this in data step. ll giv you  more handle on achieving conditional statement.

sarahsasuser
Quartz | Level 8

Thanks sascom10. I don't know how to do that though. Do you have an example code or paper you could share with me?

Reeza
Super User

Try something like the following, and then use the last (commented out right now) to keep only the last record.

proc sort data=have; by building type; run;

data want;

length content_all $200.;

set have;

by building type;

retain content_all;

if first.type then content_all=content;

else content_all=trim(content_all)|| ", "||trim(compress(content));

*If last.type then output;

run; 

sarahsasuser
Quartz | Level 8

Hi Reeza,

This was a great suggestion.  I realized that there is another variable that is important. However, for some reason it is associating the wrong records to each other. For instance, building A22 will be associated with ID3. So the data looks like this:

ID           Building     Type     Content    

1               A22          Office     chair    

1              A22          Office     desk    

2              A22          Res.       bed         

2              A25          Res.       bed    

2             A25          Res.       lamp    

3             A26          Office     chair

3             A26          Res.       chair

4              A26          Res.       bed

4              A26          Res.       lamp

Then I added the ID to the code:

proc sort data=have; by building id type; run;

data want;

length content_all $200.;

set have;

by building id type;

retain content_all;

if first.type then content_all=content;

else content_all=trim(content_all)|| ", "||trim(compress(content));

*If last.type then output;

run;

I can't get everything to match up with the right ID as well.

Reeza
Super User

That should work still, but you may need ID before building as you do in the data (BY ID Building type).

You'll need to post what your answer should be with the new variable to help figure out what's going wrong.

sarahsasuser
Quartz | Level 8

Hi Reeza,

I understand the problem now. Your solution is perfect. My data set is the problem. I have a couple of building id numbers that appear twice, but the second iteration has a trailing blank so that SAS thinks they are 2 separate ids. Do you have a solution for deleting trailing blanks?

Thanks,

Sarah

Reeza
Super User

id=trim(compress(id));

sarahsasuser
Quartz | Level 8

Thanks, I possibly have 2 trailing blanks because I've been using that solution but it did not work. Also, for some of the ids there are no trailing blanks.

Reeza
Super User

it should remove all trailing blanks regardless.

To see if that's the issue, run a proc freq and if they appear in different columns then they're different.

Another solution is to convert to numeric and then back to character. not sure if the following is correct, but that's the idea.

id=input(put(id, 8.)$8.);

sarahsasuser
Quartz | Level 8

Thanks, I'm not sure why it didn't work. I several if-then statements to recode the IDs that had blanks.

thanks.

sascom10
Calcite | Level 5

Try this -

data test1;
  input building $3. type $8. content $8.;
datalines;
A22 Office chair
A22 Office desk
A22 Res.   bed
A25 Res.   bed
A25 Res.   lamp
A26 Office chair
A26 Res.   chair
A26 Res.   bed
A26 Res.   lamp
run;


proc sort data=test1;
  by building type content;
run;

proc transpose data=test1 out=test1_wide;
  by building type;
  var content;
run;

data test2;
  length content $30.;
  set test1_wide;
  retain building type content;
 
  content = trim(col1)||' '||trim(col2)||' '||trim(col3);

  keep building type content;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3074 views
  • 0 likes
  • 3 in conversation