BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
soham_sas
Quartz | Level 8

Hi i have 3 tables called one two and three (in realtime i have around 40 to 50 different tables)

 

i have to summarize these tables based on the ID as the key variable , like ID 3 is in all the table with different dept name, so in the final summarized table the dept for Id 3 should be all the 3 dept separated by a comma (,) 

 

data i have below 

 

data one;
input id dept$;
cards;
1 del
2 mum
3 bang
;
run;

data two;
input id dept$;
cards;
3 bbsr
4 cal
;
run;

data three;
input id dept$;
cards;
5 pun
3 hyd
6 che
;
run;

 

data i want :

 

iddept
1del
2mum
3bang ,Hyd ,bbsr
4cal
5pun
6che

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

That would happen if you leave out the statement:

 

if last.id;

 

If you do have further issues, be sure to post the log.

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

since no duplicates, 

1. do a one to one merge 

2. catx with delim

3. keep the vars you want in final and you are done

Reeza
Super User

Here's an illustration of two different methods:

 

*create sample data for demonstration;
data have;
    infile cards dlm='09'x;
    input OrgID Product $   States $;
    cards;
1   football    DC
1   football    VA
1   football    MD
2   football    CA
3   football    NV
3   football    CA
;
run;

*Sort - required for both options;
proc sort data=have;
    by orgID;
run;

**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
    set have;
    by orgID;
    length combined $100.;
    retain combined;

    if first.orgID then
        combined=states;
    else
        combined=catx(', ', combined, states);

    if last.orgID then
        output;
run;

**********************************************************************;
*Transpose it to a wide format and then combine into a single field;
**********************************************************************;
proc transpose data=have out=wide prefix=state_;
    by orgID;
    var states;
run;

data want_option2;
    set wide;
    length combined $100.;
    combined=catx(', ', of state_:);
run;

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

 


@soham_sas wrote:

Hi i have 3 tables called one two and three (in realtime i have around 40 to 50 different tables)

 

i have to summarize these tables based on the ID as the key variable , like ID 3 is in all the table with different dept name, so in the final summarized table the dept for Id 3 should be all the 3 dept separated by a comma (,) 

 

data i have below 

 

data one;
input id dept$;
cards;
1 del
2 mum
3 bang
;
run;

data two;
input id dept$;
cards;
3 bbsr
4 cal
;
run;

data three;
input id dept$;
cards;
5 pun
3 hyd
6 che
;
run;

 

data i want :

 

id dept
1 del
2 mum
3 bang ,Hyd ,bbsr
4 cal
5 pun
6 che

 


 

Astounding
PROC Star

First, sort all three data sets by ID. 

 

I'm not sure why you want Hyd as the middle value instead of the last value (or if that is just a mistake).

 

After sorting, you could use:

 

data want;

set one two three;

by id;

length all_depts $ 100;

if first.id then all_depts = dept;

else all_depts = catx(', ', all_depts, dept);

if last.id;

drop dept;

run;

 

Notice that you need a new, longer variable to hold the full set of characters.  If you want to, you could add this as the final statement (before the run statement):

 

rename all_depts = dept;

 

If you really want Hyd as the middle value you would need to change the SET statement:

 

set one three two;

 

Finally, note that this is probably a bad idea.  The vast majority of the time it will be easier to use the data in its original form.

soham_sas
Quartz | Level 8

@Astounding i tried the code , its giving the output like below , but we need the 5th observation only for Id 3 (i.e 3 bang,bbsr,hyd)

 

please suggest 

 

also as there are many datasets so i am appendeing all the datsets , then sort it by Id and then i am applying  the code 

 

1 del
2 mum
3 bang
3 bang,bbsr
3 bang,bbsr,hyd
4 cal
5 pun
6 che

Astounding
PROC Star

That would happen if you leave out the statement:

 

if last.id;

 

If you do have further issues, be sure to post the log.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1168 views
  • 0 likes
  • 4 in conversation