So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
| OrganizationID | Product | States |
| 1 | football | DC |
| 1 | football | VA |
| 1 | football | MD |
| 2 | football | CA |
| 3 | football | NV |
| 3 | football | CA |
My desired output data should look like
| OrganizationID | Product | States |
| 1 | football | DC, VA, MD |
| 2 | football | CA |
| 3 | football | NV, CA |
There are several options, you can use a data step with BY group processing or you can TRANSPOSE and then use CATX().
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@LisaYIN9309 wrote:
So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID Product States 1 football DC 1 football VA 1 football MD 2 football CA 3 football NV 3 football CA
My desired output data should look like
OrganizationID Product States 1 football DC, VA, MD 2 football CA 3 football NV, CA
There are several options, you can use a data step with BY group processing or you can TRANSPOSE and then use CATX().
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@LisaYIN9309 wrote:
So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID Product States 1 football DC 1 football VA 1 football MD 2 football CA 3 football NV 3 football CA
My desired output data should look like
OrganizationID Product States 1 football DC, VA, MD 2 football CA 3 football NV, CA
Thank you Reeza, this is great!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.