Hi i have 3 tables called one two and three (in realtime i have around 40 to 50 different tables)
i have to summarize these tables based on the ID as the key variable , like ID 3 is in all the table with different dept name, so in the final summarized table the dept for Id 3 should be all the 3 dept separated by a comma (,)
data i have below
data one;
input id dept$;
cards;
1 del
2 mum
3 bang
;
run;
data two;
input id dept$;
cards;
3 bbsr
4 cal
;
run;
data three;
input id dept$;
cards;
5 pun
3 hyd
6 che
;
run;
data i want :
id | dept |
1 | del |
2 | mum |
3 | bang ,Hyd ,bbsr |
4 | cal |
5 | pun |
6 | che |
That would happen if you leave out the statement:
if last.id;
If you do have further issues, be sure to post the log.
since no duplicates,
1. do a one to one merge
2. catx with delim
3. keep the vars you want in final and you are done
Here's an illustration of two different methods:
*create sample data for demonstration;
data have;
infile cards dlm='09'x;
input OrgID Product $ States $;
cards;
1 football DC
1 football VA
1 football MD
2 football CA
3 football NV
3 football CA
;
run;
*Sort - required for both options;
proc sort data=have;
by orgID;
run;
**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
set have;
by orgID;
length combined $100.;
retain combined;
if first.orgID then
combined=states;
else
combined=catx(', ', combined, states);
if last.orgID then
output;
run;
**********************************************************************;
*Transpose it to a wide format and then combine into a single field;
**********************************************************************;
proc transpose data=have out=wide prefix=state_;
by orgID;
var states;
run;
data want_option2;
set wide;
length combined $100.;
combined=catx(', ', of state_:);
run;
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@soham_sas wrote:
Hi i have 3 tables called one two and three (in realtime i have around 40 to 50 different tables)
i have to summarize these tables based on the ID as the key variable , like ID 3 is in all the table with different dept name, so in the final summarized table the dept for Id 3 should be all the 3 dept separated by a comma (,)
data i have below
data one;
input id dept$;
cards;
1 del
2 mum
3 bang
;
run;data two;
input id dept$;
cards;
3 bbsr
4 cal
;
run;data three;
input id dept$;
cards;
5 pun
3 hyd
6 che
;
run;
data i want :
id dept 1 del 2 mum 3 bang ,Hyd ,bbsr 4 cal 5 pun 6 che
First, sort all three data sets by ID.
I'm not sure why you want Hyd as the middle value instead of the last value (or if that is just a mistake).
After sorting, you could use:
data want;
set one two three;
by id;
length all_depts $ 100;
if first.id then all_depts = dept;
else all_depts = catx(', ', all_depts, dept);
if last.id;
drop dept;
run;
Notice that you need a new, longer variable to hold the full set of characters. If you want to, you could add this as the final statement (before the run statement):
rename all_depts = dept;
If you really want Hyd as the middle value you would need to change the SET statement:
set one three two;
Finally, note that this is probably a bad idea. The vast majority of the time it will be easier to use the data in its original form.
@Astounding i tried the code , its giving the output like below , but we need the 5th observation only for Id 3 (i.e 3 bang,bbsr,hyd)
please suggest
also as there are many datasets so i am appendeing all the datsets , then sort it by Id and then i am applying the code
1 del
2 mum
3 bang
3 bang,bbsr
3 bang,bbsr,hyd
4 cal
5 pun
6 che
That would happen if you leave out the statement:
if last.id;
If you do have further issues, be sure to post the log.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.