BookmarkSubscribeRSS Feed
ganatbasa
Calcite | Level 5

The following code runs perfectly in aginity/netezza sql, but does not work in

 

proc sql: 
Create table new_table as 
select * from (select * from( select a.* from (select *, rank () over (partition by var_a order by var_id) as rank1 from old_table) a where a.rank1=1) a) b

 

 

When I try to run this code in proc sql, it won't let me do this because of problems in the "over" statement. This code is used for removing duplicates : for groups of duplicate records within the data, it leaves only one record. Is there a way to modify this sql code so that in runs in proc sql? Thanks!

3 REPLIES 3
Tom
Super User Tom
Super User

That is not valid ANSI 92 SQL syntax so it cannot work in PROC SQL.

 

If you want to take only the first observation in a BY group use a DATA step instead of trying to trick the set logic of SQL to handle observation order.

So if you have a dataset that is sorted by GROUP and DATE you can use a step like this to find the earliest observation for each group.

proc sort data=have;
  by group date;
run;
data first_per_group;
  set have;
  by group date ;
  if first.group;
run;

Looks like your SQL is using var_a as the GROUP variable and var_id as the sub-ordering variable within the groups, like the DATE variable in my example data step.

ballardw
Super User

Please post code or log examples in a text box opened on the forum using the </> icon above the message box. That will separate out code from question or problem descriptions and make it easier to follow. Also code pasted into a text box will retain formatting, such as code indentation, or diagnostic characters correctly if posting SAS log text as the main message windows will reformat pasted text.

 

Your SQL uses several items that are not available in SAS besides "over" : "Rank()" and "partition".

 

The SAS procedure SORT will remove duplicate observations.

Here is a brief example creating a small data set and then sort code to create wanted data without duplicate records.

data example;
  input x y z;
datalines;
1  2  3
2  3  4
1  2  3
2  3  5
3  4  5
1  2  3
;

proc sort data=example out=want nodup;
  by x;
run;

Note that the duplicate rows for 1 2 3 are removed.

The data step as shown to create the example data set is the way to provide an example of your data if needed. If this basic sort does not do what you want then provide an example starting data set and the desired result made from the example describing any rules.

 

HB
Barite | Level 11 HB
Barite | Level 11

If you simply must have an SQL solution use DISTINCT.

 

Using BallardW's data statement you can do 

 

proc sql;
select
distinct a.*
from
example a;
quit;

to get


x y z
1 2 3
2 3 4
2 3 5
3 4 5

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 496 views
  • 0 likes
  • 4 in conversation