Solved: How to use only condition in sas

umeshgiri48 · Posted 01-06-2020 09:10 AM

I have a dataset in which I am trying to find the Outlets which are selling only two category i.e 'CSP' and 'Stills' and not other categories

for eg:-

24034743	CSP
24034743	Stills
27344795	Water
27344795	Stills
27344795	CSP
24660690	CSP

In this outlet 24034743 sells CSP and Stills, outlet 27344795 sells CSP,Water and Stills and outlet 24660690 sells only CSP in this case I need only outlet which sells CSP and Stills.

I tried few methods not got success.

Please help me on this.

option compress=yes;

data test;
input Outlet $ Category $;
Cards;
27344795	CSP
29306680	Water
29106861	Coffee
24660690	CSP
25298585	CSP
27852713	CSP
27344795	Water
24034743	CSP
24792773	Coffee
2991031	        CSP
21362260	CSP
29003700	CSP
27314730	CSP
26678079	CSP
22625435	CSP
29702708	CSP
25459917	Stills
24010565	CSP
28131642	CSP
22078107	CSP
21501108	CSP
28393563	CSP
25784580	CSP
24976068	CSP
29306680	CSP
25784580	Stills
21038003	CSP
22145937	CSP
21906503	CSP
28506284	Stills
29993536	CSP
23915168	CSP
21153504	Coffee
24981847	CSP
21977529	Water
22684050	Water
29291387	Stills
25298585	Coffee
29106861	CSP
24976068	Stills
28393563	Coffee
28506284	CSP
22078107	Stills
29106861	Stills
21906503	Stills
21501108	Coffee
29003700	Stills
29306680	Stills
28393563	Stills
27314730	Stills
27344795	Stills
28506284	Water
27933637	CSP
2991031	        Stills
28131642	Water
24034743	Stills
22078107	Water
24976068	Coffee
29306680	Coffee
26857155	CSP
21362260	Water
;
run;

proc sort data =test out=abc nodupkey;
by Outlet Category;
where Category in ('CSP','Stills') and Category not in ('Water', 'Coffee');
run;

proc sort data =test out=abc nodupkey;
by Outlet Category;
where Category in ('CSP','Stills') and Category not in ('Water', 'Coffee');
run;

data abc;
set test;
where Category not in ('Water', 'Coffee') and Category in ('CSP','Stills');
run;

proc sql;
select count(distinct outlet) from test
where Category not in ('Water', 'Coffee') and Category in ('CSP','Stills');
run;

proc sql;
select count (distinct c.Outlet) as o_count
from test as c, test as s
where c.Outlet = s.Outlet
and s.Category like '%CSP%'
and c.Category like '%Stills%';
quit;
run;

proc sql;
 create table abc as select distinct b.Outlet as outlet 
 from  test a left join test b
  on a.Outlet=b.Outlet
  and a.Category=b.Category
  where b.Category in ('CSP','Stills') and a.Category not in ('Water', 'Coffee');
  quit;
  run;

ed_sas_member · Posted 01-07-2020 01:55 AM

Hi @umeshgiri48

To handle this use case, you can simply add the NODUPKEY option in the proc sort as below:

proc sort data=test out=test_sorted nodupkey;
	by Outlet Category;
run;
proc transpose data=test_sorted out=test_tr;
	by Outlet;
	var Category;
run;

data want (keep=outlet);
	set test_tr;
	Category = catx(" ",of col:);
	if find(Category,"CSP") and find(Category,"Stills") and countw(Category)=2 then output;
run;

View solution in original post

ed_sas_member · Posted 01-06-2020 09:24 AM

Hi @umeshgiri48

Here is one approach to do that:


proc sort data=test out=test_sorted;
	by Outlet Category;
run;
proc transpose data=test_sorted out=test_tr;
	by Outlet;
	var Category;
run;

data want (keep=outlet);
	set test_tr;
	Category = catx(" ",of col:);
	if find(Category,"CSP") and find(Category,"Stills") and countw(Category)=2 then output;
run;

umeshgiri48 · Posted 01-07-2020 01:31 AM

@ed_sas_member this code is fine when there are no duplicate OUTLET/CATEGORY records

For instance, if an OUTLET has, say, 4 CSP's and 3 Still (and no other records), it won't be captured and we will get 0 output

ed_sas_member · Posted 01-07-2020 01:55 AM

Hi @umeshgiri48

To handle this use case, you can simply add the NODUPKEY option in the proc sort as below:

proc sort data=test out=test_sorted nodupkey;
	by Outlet Category;
run;
proc transpose data=test_sorted out=test_tr;
	by Outlet;
	var Category;
run;

data want (keep=outlet);
	set test_tr;
	Category = catx(" ",of col:);
	if find(Category,"CSP") and find(Category,"Stills") and countw(Category)=2 then output;
run;

umeshgiri48 · Posted 01-07-2020 02:19 AM

Hi @ed_sas_member thanks for the solution, Is there is another way to do in Proc Sql

ed_sas_member · Posted 01-07-2020 02:29 AM

Hi @umeshgiri48

You can use this for example :

proc sql;
	select outlet
	from (select distinct *
	  	 from test
		 group by Outlet)
	group by outlet
	having count(distinct Category)=2 and sum(Category in ('CSP','Stills'))=2;
quit;

umeshgiri48 · Posted 01-07-2020 03:05 AM

It's not working

novinosrin · Posted 01-06-2020 10:01 AM

HI @umeshgiri48 This is a natural case for PROC SQL and just one step if i understand your requirement correctly


data test;
input Outlet $ Category $;
Cards;
24034743	CSP
24034743	Stills
27344795	Water
27344795	Stills
27344795	CSP
24660690	CSP
;
run;


proc sql;
create table want as
select *
from test
group by outlet
having count(distinct Category)=2 and sum(Category in ('CSP','Stills'))=2;
quit;

/*Or I am also assuming the instead of the previous HAVING filter, the one below could be what you are after*/

having count(distinct Category)>1 and count(*)= sum(Category in ('CSP','Stills'));

mkeintz · Posted 01-06-2020 05:09 PM

@novinosrin I agree that this is a ready-made task for PROC SQL.

BUT … there is an embedded assumption in your code, namely that there are no duplicate OUTLET/CATEGORY records.

For instance, if an OUTLET has, say, 2 CSP's and 1 Still (and no other records), it won't be captured - presumably a false negative.

Of course, the assumption may very well be the case. If so, the OP would be safe.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

novinosrin · Posted 01-06-2020 05:15 PM

Hi @mkeintz You are right. I had that doubt too. I often prefer any OP's to offer us a feedback and confirm if the communication is right for us to make edit/changes if needed. So yes, some OPs are interactive while others not so much. Let's see.

umeshgiri48 · Posted 01-07-2020 01:20 AM

@novinosrin this query will only provide an outlet which is having CSP and Stills but I need only those Outlets which are selling only CSP and Stills not other categories like water and etc

How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas

Re: How to use only condition in sas