Hello:
I have a sample dataset list below. I would like to keep all the text in column "Class", "Component", and "Product" over (>=) 3 times and keep them. The rest would delete. Please help, thank you.
data sample;
infile datalines dsd;
input Class : $20. Component : $20. Product : $20.;
datalines;
ANTI/INSULIN, , ,
ANTI/INSULIN, , ,
ANTI/VITA, , ,
PHENAZO, , ,
Large, , ,
ANTI/VITA, , ,
ANTI/VITA, , ,
BETA-ADRENERGIC, , ,
FLOWER, , ,
PHENAZO, , ,
PHENAZO, , ,
ANTI/INSULIN, , ,
ANTI/INSULIN, , ,
, ANTI/INSULIN, ,
, BETA-ADRENERGIC, ,
, ANTI/INSULIN, ,
, PSYCH, ,
, HERBS, ,
, PHENAZO, ,
, ANTI/VITA, ,
, ANTI/INSULIN, ,
, ANTI/INSULIN, ,
, HERBS, ,
, PSYCH, ,
, PSYCH, ,
, HERBS, ,
, HERBS, ,
, BETA-ADRENERGIC, ,
, BETA-ADRENERGIC, ,
, , BANZOCAINE,
, , BANZOCAINE,
, , FLOWER,
, , ANTI/INSULIN,
, , FLOWER,
, , ANTI/INSULIN,
, , ANTI/INSULIN,
, , ANTI/INSULIN,
, , HERBS,
, , BETA-ADRENERGIC,
, , SKIN,
, , BETA-ADRENERGIC,
, , PHENAZO,
, , PHENAZO,
, , FLOWER,
;
proc sort data=sample; by class component product; run;
data frequencyover3;
infile datalines dsd;
input Class : $20. Component : $20. Product : $20.;
datalines;
ANTI/INSULIN, , ,
ANTI/INSULIN, , ,
ANTI/VITA, , ,
PHENAZO, , ,
ANTI/VITA, , ,
ANTI/VITA, , ,
PHENAZO, , ,
PHENAZO, , ,
ANTI/INSULIN, , ,
ANTI/INSULIN, , ,
, ANTI/INSULIN, ,
, BETA-ADRENERGIC, ,
, ANTI/INSULIN, ,
, PSYCH, ,
, HERBS, ,
, ANTI/INSULIN, ,
, ANTI/INSULIN, ,
, HERBS, ,
, PSYCH, ,
, PSYCH, ,
, HERBS, ,
, HERBS, ,
, BETA-ADRENERGIC, ,
, BETA-ADRENERGIC, ,
, , FLOWER,
, , ANTI/INSULIN,
, , FLOWER,
, , ANTI/INSULIN,
, , ANTI/INSULIN,
, , ANTI/INSULIN,
, , FLOWER,
;
proc sort data=final; by class component product; run;
use HAVING in a SQL clause.
select *
from have
group by class
having count(class) >=3;
use HAVING in a SQL clause.
select *
from have
group by class
having count(class) >=3;
Hello:
I modified Reeza's code below. However, I got an error message from Log. Please help. Thanks.
data sample;
infile datalines dsd;
input Class : $20. Component : $20. Product : $20.;
datalines;
ANTI/INSULIN, PSYCH, BANZOCAINE,
ANTI/INSULIN, HERBS, BANZOCAINE,
ANTI/VITA, PHENAZO, FLOWER,
PHENAZO, ANTI/VITA, ANTI/INSULIN,
PHENAZO, ANTI/INSULIN, FLOWER,
Large, ANTI/INSULIN, ANTI/INSULIN,
ANTI/VITA, HERBS, ANTI/INSULIN,
ANTI/VITA, PSYCH, ANTI/INSULIN,
BETA-ADRENERGIC, PSYCH, HERBS,
FLOWER, SKIN, BETA-ADRENERGIC,
PHENAZO, HERBS, SKIN,
PHENAZO, HERBS, BETA-ADRENERGIC,
ANTI/INSULIN, BETA-ADRENERGIC, PHENAZO,
ANTI/INSULIN, BETA-ADRENERGIC, PHENAZO,
SUBST, ANTI/INSULIN, FLOWER,
BANZOCAINE, BETA-ADRENERGIC, SUBST,
PHENAZO, ANTI/INSULIN, BANZOCAINE,
;
proc sort data=sample; by class component product; run;
proc sql;
create table frequencyover3
select *
from sample
group by Class Component product
having count(Class or Component or product ) >=3;
quit;
147 proc sql;
148 create table frequencyover3
149 select *
------
73
ERROR 73-322: Expecting an AS.
150 from sample
151 group by Class Component product
---------
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, (, *, **, +, ',', -,
'.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, EXCEPT, GE, GET,
GT, GTT, HAVING, IN, INTERSECT, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN,
OR, ORDER, OUTER, UNION, ^, ^=, |, ||, ~, ~=.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
152 having count(Class or Component or product ) >=5;
153 quit;
NOTE: The SAS System stopped processing this step because of errors.
Like your log says: you're missing the string "as". i.e., your line should have been:
create table frequencyover3 as
Art, CEO, AnalystFinder.com
Before you spend more time fixing the syntax (and logic) of your proc sql statements, why not provide an example of what you expect the output to look like?
It's easy when you show what it looks like dealing with one column.
But you specified three variables for which you want to filter for values with frequency >= 3. Does that mean you want all the variables for every observation in which at least one of the variables has a value seen 3+ times? (even if the other 2 variables are seen fewer time)?
Or do you really want 3 tables, one per variable, each listed in alphabetic order, per your example?
I apologize that my sample dataset was post right at the first beginning. I correct it now. Please see the first message. I have edited it, and I add the final version I would like to create. However, I continue receiving error message when I add "as" in Proc SQL. Please help.
proc sql;
create table frequencyover5 as
select *
from sample
group by Class Component product
having count(Class or Component or product ) >=3;
quit;
243 proc sql;
244 create table frequencyover3 as
245 select *
246 from sample
247 group by Class Component product
---------
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, (, *, **, +, ',', -,
'.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, EXCEPT, GE, GET,
GT, GTT, HAVING, IN, INTERSECT, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN,
OR, ORDER, OUTER, UNION, ^, ^=, |, ||, ~, ~=.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
248 having count(Class or Component or product ) >=3;
249 quit;
NOTE: The SAS System stopped processing this step because of errors.
The variables in the GROUP BY statement in SQL need to be comma separated.
ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, (, *, **, +, ',', -,
'.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, EXCEPT, GE, GET,
GT, GTT, HAVING, IN, INTERSECT, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN,
OR, ORDER, OUTER, UNION, ^, ^=, |, ||, ~, ~=.
That's what the error is telling you, it expects comma's between the variables in the line indicated.
Thank you so much for everyone's great suggestion. I got it worked!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.