So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
| OrganizationID | Product | States |
| 1 | football | DC |
| 1 | football | VA |
| 1 | football | MD |
| 2 | football | CA |
| 3 | football | NV |
| 3 | football | CA |
My desired output data should look like
| OrganizationID | Product | States |
| 1 | football | DC, VA, MD |
| 2 | football | CA |
| 3 | football | NV, CA |
There are several options, you can use a data step with BY group processing or you can TRANSPOSE and then use CATX().
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@LisaYIN9309 wrote:
So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID Product States 1 football DC 1 football VA 1 football MD 2 football CA 3 football NV 3 football CA
My desired output data should look like
OrganizationID Product States 1 football DC, VA, MD 2 football CA 3 football NV, CA
There are several options, you can use a data step with BY group processing or you can TRANSPOSE and then use CATX().
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
@LisaYIN9309 wrote:
So I've came across this task with a big dataset, I've make two brief tables to sum the core of this problem. I know if it's a very small table, I can just manually add more columns for each row to list all states for each organization ID and then use TRIM or CATX, but with big data, I couldn't figure out how to do it systematically, since the # of states an organization operates in varies, and the values are stored in different rows of the same column, rather than in different columns of the same row.
What I want to do is to bring States values to a higher level, as an aggregate of all states that this organization operate in.
My input data looks like:
OrganizationID Product States 1 football DC 1 football VA 1 football MD 2 football CA 3 football NV 3 football CA
My desired output data should look like
OrganizationID Product States 1 football DC, VA, MD 2 football CA 3 football NV, CA
Thank you Reeza, this is great!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.