BookmarkSubscribeRSS Feed
Ranjeeta
Pyrite | Level 9

 

Hello,

Would appreciate if someone can advise the cause of the error code pasted below as well 

thanks

proc sql ;
222 create table Insttable_without_DupKey as
223 select DISTINCT(Reporting_INST_NAME), Reporting_INST_AT, Reporting_INST_AM, Reporting_FAC_NUM,FAC_NUM,INST_NAME2,INST_NAME,INST_AM, INST_AT,
224 Strokecentre2,Strokeunit2,TPA_IS,EVT1718_IS
225 group by Reporting_INST_NAME
-----
22
76
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, AS, BETWEEN, CONTAINS,
EQ, EQT, FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=,
|, ||, ~, ~=.

ERROR 76-322: Syntax error, statement will be ignored.

226 having (Reporting_INST_NAME) = MAX(Reporting_INST_NAME)
227 AND Reporting_INST_AT = MAX(Reporting_INST_AT)
228 AND Reporting_INST_AM = MAX(Reporting_INST_AM)
229 AND Strokeunit2 = MAX(Strokeunit2)
230 ;
231 quit;

proc sql ;
 create table Insttable_without_DupKey as
 select DISTINCT(Reporting_INST_NAME), Reporting_INST_AT, Reporting_INST_AM, Reporting_FAC_NUM,FAC_NUM,INST_NAME2,INST_NAME,INST_AM, INST_AT,
 Strokecentre2,Strokeunit2,TPA_IS,EVT1718_IS
 group by Reporting_INST_NAME
 having (Reporting_INST_NAME) = MAX(Reporting_INST_NAME)
 AND Reporting_INST_AT = MAX(Reporting_INST_AT)
 AND Reporting_INST_AM = MAX(Reporting_INST_AM)
 AND Strokeunit2 = MAX(Strokeunit2)
;
quit;

 

 

 

4 REPLIES 4
ed_sas_member
Meteorite | Level 14

Hi @Ranjeeta 

 

It seems that the "FROM" clause indicating the input dataset is missing before the "GROUP BY" clause.

 

Best,

Ranjeeta
Pyrite | Level 9
proc sql ;
 create table Insttable_without_DupKey1 as
 select DISTINCT(Reporting_INST_NAME), Reporting_INST_AT, Reporting_INST_AM, Reporting_FAC_NUM,FAC_NUM,INST_NAME2,INST_NAME,INST_AM, INST_AT,
 Strokecentre2,Strokeunit2,TPA_IS,EVT1718_IS,StrokeUnit_FYEAR
 from insttable
 group by Reporting_INST_NAME
 having (Reporting_INST_NAME) = MAX(Reporting_INST_NAME)
 AND Reporting_INST_AT = MAX(Reporting_INST_AT)
 AND Reporting_INST_AM = MAX(Reporting_INST_AM)
 AND Strokeunit2 = MAX(Strokeunit2)
 and StrokeUnit_FYEAR =max(StrokeUnit_FYEAR)
;
quit;/*256*/

Thankyou 

I would appreciate if someone could explain how the query above works?

Would the select distinct select unique rows for combination of all the columns in the select statement or only Reporting_INST_NAME?

Also the having clause works on the data grouped by ie Reporting_INST_NAME does that mean that if two same ReportingINST_NAMES exist then the row with the higher value of the variables in the having clause would be selected ?

Please advise how the query works?

Tom
Super User Tom
Super User

The DISTINCT keyword is for selecting distinct observations.  Wrapping those parentheses around the first variable in the select list does nothing, other than confuse people.  Just as 4+5 is the same as (4+5) so is (Reporting_INST_NAME) the sames as Reporting_INST_NAME.

 

The first condition in your HAVING clause does nothing. Since you are grouping on Reporting_INST_NAME there is only one value of Reporting_INST_NAME in each group so that value is also the MAX() and the MIN() value in the group.

 

The effect of the other conditions is that only rows that have the max of ALL 4 of those variables (within the group of observations with this value of Reporting_INST_NAME) will be selected.  So you migth not get any observations for some values of Reporting_INST_NAME.  Consider the case when there are exactly four observations where Reporting_INST_NAME='AA' and it so happens that the first one has the maximum values of Reporting_INST_AT among the four and the second has the maximum value of Reporting_INST_AM , etc. Then none of the four has the maximum value of all of them.  Or you might get more than one observation in the group if there are more than one observations that does have the max for all four being tested, but the values of the other variables being kept are different so the DISTINCT operation does not weed them out.

 

What is it that you WANT to do?

Ranjeeta
Pyrite | Level 9

I have some duplicates in the data and i want to remove the record that has blank for the variables in the Having clause

I want to do this for every unique combination of Reporting_INST_NAME Reporting_INST_AT Reporting_INST_AM FAC_NUM

I think the code below is achieving the same?

proc sql ;
 create table Insttable_without_DupKey as
 select DISTINCT Reporting_INST_NAME, Reporting_INST_AT, Reporting_INST_AM, Reporting_FAC_NUM,FAC_NUM,INST_NAME2,INST_NAME,INST_AM, INST_AT,
 Strokecentre2,Strokeunit2,TPA_IS,EVT1718_IS,StrokeUnit_FYEAR
 from insttable
 group by Reporting_INST_NAME, Reporting_INST_AT, Reporting_INST_AM, FAC_NUM
 having Strokeunit2 = MAX(Strokeunit2)
 and StrokeUnit_FYEAR = max(StrokeUnit_FYEAR)
;
quit;/*259*/

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1017 views
  • 5 likes
  • 3 in conversation