BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
umeshgiri48
Obsidian | Level 7

I have a dataset in which I am trying to find the Outlets which are selling only two category i.e 'CSP' and 'Stills' and not other categories

for eg:- 

 

24034743CSP
24034743Stills
27344795Water
27344795Stills
27344795CSP
24660690CSP


In this outlet 24034743 sells CSP and Stills, outlet 27344795 sells  CSP,Water and Stills and outlet 24660690 sells only CSP in this case I need only outlet which sells CSP and Stills.

 

I tried few methods not got success.

 

Please help me on this.

option compress=yes;

data test;
input Outlet $ Category $;
Cards;
27344795	CSP
29306680	Water
29106861	Coffee
24660690	CSP
25298585	CSP
27852713	CSP
27344795	Water
24034743	CSP
24792773	Coffee
2991031	        CSP
21362260	CSP
29003700	CSP
27314730	CSP
26678079	CSP
22625435	CSP
29702708	CSP
25459917	Stills
24010565	CSP
28131642	CSP
22078107	CSP
21501108	CSP
28393563	CSP
25784580	CSP
24976068	CSP
29306680	CSP
25784580	Stills
21038003	CSP
22145937	CSP
21906503	CSP
28506284	Stills
29993536	CSP
23915168	CSP
21153504	Coffee
24981847	CSP
21977529	Water
22684050	Water
29291387	Stills
25298585	Coffee
29106861	CSP
24976068	Stills
28393563	Coffee
28506284	CSP
22078107	Stills
29106861	Stills
21906503	Stills
21501108	Coffee
29003700	Stills
29306680	Stills
28393563	Stills
27314730	Stills
27344795	Stills
28506284	Water
27933637	CSP
2991031	        Stills
28131642	Water
24034743	Stills
22078107	Water
24976068	Coffee
29306680	Coffee
26857155	CSP
21362260	Water
;
run;

proc sort data =test out=abc nodupkey;
by Outlet Category;
where Category in ('CSP','Stills') and Category not in ('Water', 'Coffee');
run;

proc sort data =test out=abc nodupkey;
by Outlet Category;
where Category in ('CSP','Stills') and Category not in ('Water', 'Coffee');
run;

data abc;
set test;
where Category not in ('Water', 'Coffee') and Category in ('CSP','Stills');
run;

proc sql;
select count(distinct outlet) from test
where Category not in ('Water', 'Coffee') and Category in ('CSP','Stills');
run;

proc sql;
select count (distinct c.Outlet) as o_count
from test as c, test as s
where c.Outlet = s.Outlet
and s.Category like '%CSP%'
and c.Category like '%Stills%';
quit;
run;

proc sql;
 create table abc as select distinct b.Outlet as outlet 
 from  test a left join test b
  on a.Outlet=b.Outlet
  and a.Category=b.Category
  where b.Category in ('CSP','Stills') and a.Category not in ('Water', 'Coffee');
  quit;
  run;




 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

Hi @umeshgiri48 

 

To handle this use case, you can simply add the NODUPKEY option in the proc sort as below:

proc sort data=test out=test_sorted nodupkey;
	by Outlet Category;
run;
proc transpose data=test_sorted out=test_tr;
	by Outlet;
	var Category;
run;

data want (keep=outlet);
	set test_tr;
	Category = catx(" ",of col:);
	if find(Category,"CSP") and find(Category,"Stills") and countw(Category)=2 then output;
run;

View solution in original post

10 REPLIES 10
ed_sas_member
Meteorite | Level 14

Hi @umeshgiri48 

Here is one approach to do that:


proc sort data=test out=test_sorted;
	by Outlet Category;
run;
proc transpose data=test_sorted out=test_tr;
	by Outlet;
	var Category;
run;

data want (keep=outlet);
	set test_tr;
	Category = catx(" ",of col:);
	if find(Category,"CSP") and find(Category,"Stills") and countw(Category)=2 then output;
run;
umeshgiri48
Obsidian | Level 7

@ed_sas_member  this code is fine when there are no duplicate OUTLET/CATEGORY records

 

For instance, if an OUTLET has, say, 4 CSP's and 3 Still (and no other records), it won't be captured and we will get 0 output 

ed_sas_member
Meteorite | Level 14

Hi @umeshgiri48 

 

To handle this use case, you can simply add the NODUPKEY option in the proc sort as below:

proc sort data=test out=test_sorted nodupkey;
	by Outlet Category;
run;
proc transpose data=test_sorted out=test_tr;
	by Outlet;
	var Category;
run;

data want (keep=outlet);
	set test_tr;
	Category = catx(" ",of col:);
	if find(Category,"CSP") and find(Category,"Stills") and countw(Category)=2 then output;
run;
umeshgiri48
Obsidian | Level 7
Hi @ed_sas_member thanks for the solution, Is there is another way to do in Proc Sql
ed_sas_member
Meteorite | Level 14

Hi @umeshgiri48 

 

You can use this for example :

proc sql;
	select outlet
	from (select distinct *
	  	 from test
		 group by Outlet)
	group by outlet
	having count(distinct Category)=2 and sum(Category in ('CSP','Stills'))=2;
quit;
umeshgiri48
Obsidian | Level 7
It's not working
novinosrin
Tourmaline | Level 20

HI @umeshgiri48  This is a natural case  for PROC SQL  and just one step if i understand your requirement correctly

 


data test;
input Outlet $ Category $;
Cards;
24034743	CSP
24034743	Stills
27344795	Water
27344795	Stills
27344795	CSP
24660690	CSP
;
run;


proc sql;
create table want as
select *
from test
group by outlet
having count(distinct Category)=2 and sum(Category in ('CSP','Stills'))=2;
quit;

 

/*Or I am also assuming the instead of the previous HAVING filter, the one below could be what you are after*/

having count(distinct Category)>1 and count(*)= sum(Category in ('CSP','Stills'));
mkeintz
PROC Star

@novinosrin   I agree that this is a ready-made task for PROC SQL.

 

BUT … there is an embedded assumption in your code, namely that there are no duplicate OUTLET/CATEGORY records.

 

For instance, if an OUTLET has, say, 2 CSP's and 1 Still (and no other records), it won't be captured - presumably a false negative.

 

Of course, the assumption may very well be the case.  If so, the OP would be safe.

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20

Hi @mkeintz  You are right.  I had that doubt too. I often prefer any OP's to offer us a feedback and confirm if the communication is right for us to make edit/changes if needed. So yes, some OPs are interactive while others not so much. Let's see.

umeshgiri48
Obsidian | Level 7

@novinosrin  this query will only provide an outlet which is having CSP and Stills but I need only those Outlets which are selling only CSP and Stills not other categories like water and etc

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 946 views
  • 0 likes
  • 4 in conversation