I have a dataset of customer sales with customers and regions. There are only two regions and I'm trying to figure out how to subset my data so I only have customers that have sales in both regions.
My data looks something like this:
Customer | Region | Sales |
---|---|---|
1 | 2 | 12 |
1 | 2 | 14 |
1 | 2 | 11 |
1 | 2 | 16 |
2 | 2 | 15 |
2 | 2 | 22 |
2 | 6 | 21 |
3 | 2 | 14 |
3 | 6 | 16 |
3 | 6 | 15 |
3 | 6 | 14 |
3 | 2 | 15 |
4 | 6 | 15 |
4 | 6 | 11 |
4 | 6 | 21 |
4 | 6 | 20 |
So basically, in this case, I want to know how I can have a subset of this dataset that will only include customers 2 and 3, since they are the only customers with sales in both regions.
I'm using SAS Enterprise Guide, and I don't know if there is a way to do this using the GUI or if I should use a code node
Thanks!le
Try this.
proc sql;
create table want as
select * from have
group by customer
having count(distinct region)>1;
quit;
Do you need all the records or just the id of customer 2/3?
I need all the records for those customers
Try this.
proc sql;
create table want as
select * from have
group by customer
having count(distinct region)>1;
quit;
stat@sas,
Perfect! Thank you!
Here's another way with Query Builder. I'd go with code myself
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.