So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID | Product | States |
1 | football | DC |
1 | football | VA |
1 | football | MD |
2 | football | CA |
3 | football | NV |
3 | football | CA |
My desired output data should look like
OrganizationID | Product | States |
1 | football | DC, VA, MD |
2 | football | CA |
3 | football | NV, CA |
There are several options, you can use a data step with BY group processing or you can TRANSPOSE and then use CATX().
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@LisaYIN9309 wrote:
So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID Product States 1 football DC 1 football VA 1 football MD 2 football CA 3 football NV 3 football CA
My desired output data should look like
OrganizationID Product States 1 football DC, VA, MD 2 football CA 3 football NV, CA
There are several options, you can use a data step with BY group processing or you can TRANSPOSE and then use CATX().
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@LisaYIN9309 wrote:
So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID Product States 1 football DC 1 football VA 1 football MD 2 football CA 3 football NV 3 football CA
My desired output data should look like
OrganizationID Product States 1 football DC, VA, MD 2 football CA 3 football NV, CA
Thank you Reeza, this is great!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.