BookmarkSubscribeRSS Feed
karrora1201
Calcite | Level 5

Hi Everyone, Can anybody help me with the below issue?

-> wanted to convert the following datastep (containing first. last. functionality) into Proc sql

 

data readin;
input ID Name $ Score;
cards;
1     David   45
1     David   74
2     Sam     45
2     Ram     54
3     Bane    87
3     Mary    92
3     Bane    87
4     Dane    23
5     Jenny   87
5     Ken     87
6     Simran  63
8     Priya   72
;
run;

proc sort data=readin;
by ID Name;
run; data b; set readin; by ID Name; if last.ID or last.Name then output; run;
5 REPLIES 5
Ksharp
Super User
data readin;
input ID Name $ Score;
cards;
1     David   45
1     David   74
2     Sam     45
2     Ram     54
3     Bane    87
3     Mary    92
3     Bane    87
4     Dane    23
5     Jenny   87
5     Ken     87
6     Simran  63
8     Priya   72
;
run;

proc sql;
create table want as
select distinct *
 from (select monotonic() as _n_,* from readin)
  group by id,name
   having _n_=max(_n_);
quit;
PaigeMiller
Diamond | Level 26

The only problem using the MONOTONIC() function in PROC SQL is that it is unsupported. It works today, it may not work in the next release (or some future release) of SAS. And so, if you are writing code with an unsupported feature, this can lead to problems and the original problem is probably best left as a DATA step.

--
Paige Miller
Astounding
PROC Star

There is no such thing.  SQL does not guarantee the order in which it selects observation, so the "last" observation is unreliable.  It could change from run to run.

 

There is an unsupported, unguaranteed "monotonic" feature of SQL if you want to go that route.

Kurt_Bremser
Super User

Maxim 14: Use the Right Tool, which is the data step for such tasks. SQL code would be much more complicated, harder to maintain, and have worse performance. SQL does not have the concept of sequence that the data step has, and so that must be forced, usually with a sub-select which has very bad performance, or an intermediate step that creates a sequence number through use of the undocumented (and therefore not advisable) MONOTONIC() function.

 

BTW last.ID implies last.Name, so it is sufficient to check for last.Name in your example.

 

A possible SQL code might be

proc sql;
create table want as
  select
    id,
    name,
    score
  from (
    select *, monotonic() as seq
    from readin
  )
  group by id, name
  having seq = max(seq)
;
quit;

but that is NOT guaranteed to create the same result as the SORT/DATA steps.

novinosrin
Tourmaline | Level 20

I agree to both @Astounding  and @PaigeMiller for their professional neat advice. Genius @Ksharp's solution is something that came to my mind too. However, this certainly  is a not a question that's production code related and a fun question perhaps? 

 

The subject matter though that's worth learning is whether or not there is a clean way of reading a dataset in a sequential order. Once we have this, a possible conditional construct utility may surface. I'm afraid Proc SQL doesn't quite have the row order functionality and in SQL of any variant requires explicit determination of a row order unlike the datastep.

 

The author of the best-selling SQL book of all time Ben Forta writes, "the retrieved data  from underlying tables  is not displayed in a mere random order. If unsorted, data will typically be displayed in the order in which it appears in the underlying tables. This could be the order in which the data was added to the tables initially. However, if data was subsequently updated or deleted, the order will be affected by how the DBMS reuses reclaimed storage space. The end result is that you cannot (and should not) rely on the sort order if you do not explicitly control it. Relational database design theory states that the sequence of retrieved data cannot be assumed to have significance if ordering was not explicitly specified" 

 

Note: Some even go as far to believe almighty perhaps created EF Codd to relate and normalise tables and then Ben forta to play with it. If Ben wrote it, I believe him.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1556 views
  • 3 likes
  • 6 in conversation