BookmarkSubscribeRSS Feed
thanikondharish
Calcite | Level 5

Data Treatment ;

input subid $ Date : date9. Dose $;

Format Date : date9. ;

cards;

P101    13Jan2005       Asp-05mg

P102    14Jan2005       Asp-10mg

P101    19Jan2005       Asp-10mg

P103    19Jan2005       Asp-15mg

P102    20Jan2005       Asp-05mg

P101    25Jan2005       Asp-15mg

P103    25Jan2005       Asp-10mg

P102    26Jan2005       Asp-15mg

P104    26Jan2005       Asp-10mg

P101    31Jan2005       Asp-05mg

P103    30Jan2005       Asp-10mg

P102    01Feb2005      Asp-15mg

P104    02Feb2005      Asp-10mg

P105    02Feb2005      Asp-20mg

Run;

 

 

 

how to extract records except first and last records in each group by using sql?

4 REPLIES 4
PeterClemmensen
Tourmaline | Level 20

Why do you want to limit yourself to PROC SQL when doing this in a data step is much simpler. SQL deals with set operations. The data step deals with individual rows. If you absolutely want to do this in SQL, create a variable to keep track of first/last occurrences within by groups and use SQL like this

 

Data Treatment ;
input subid $ Date : date9. Dose $;
Format Date : date9. ;
cards;
P101 13Jan2005 Asp-05mg
P102 14Jan2005 Asp-10mg
P101 19Jan2005 Asp-10mg
P103 19Jan2005 Asp-15mg
P102 20Jan2005 Asp-05mg
P101 25Jan2005 Asp-15mg
P103 25Jan2005 Asp-10mg
P102 26Jan2005 Asp-15mg
P104 26Jan2005 Asp-10mg
P101 31Jan2005 Asp-05mg
P103 30Jan2005 Asp-10mg
P102 01Feb2005 Asp-15mg
P104 02Feb2005 Asp-10mg
P105 02Feb2005 Asp-20mg
Run;

data temp;
   set Treatment;
   n=_N_;
run;

proc sql;
   create table want as
   select * from temp
   group by subid
   having n ne max(n) & n ne min(n);
quit;
thanikondharish
Calcite | Level 5
I know how to extract in dataset block but I want to know in process SQL.
PeterClemmensen
Tourmaline | Level 20

Is your data sorted by Date? Ie does the earliest/latest date imply the first/last record within each By Group? In that case, do

 

proc sql;
   create table want as
   select * from Treatment
   group by subid
   having Date ne max(Date) and Date ne min(Date);
quit;
hashman
Ammonite | Level 13

@thanikondharish:

 

1. You didn't state what "group" means. Is it SUBID?

2. If it is, you didn't present your desired output. Do you want to (a) keep the groups with fewer than 3 rows or (b) drop them? Because if a group contains 1 or 2 rows, your criterion of keeping only the rows in except the first and the last would mean that you'd keep no rows from that group at all.

3. Are your dates within the group (c) unique as in your data sample or (d) they can be duplicate?

 

If the answers are (a) and (c), then you can code:

data treatment ;                   
  input subid $ date:date. dose $ ;
  format date date9. ;             
  cards ;                          
P101    13Jan2005     Asp-05mg     
P102    14Jan2005     Asp-10mg     
P101    19Jan2005     Asp-10mg     
P103    19Jan2005     Asp-15mg     
P102    20Jan2005     Asp-05mg     
P101    25Jan2005     Asp-15mg     
P103    25Jan2005     Asp-10mg     
P102    26Jan2005     Asp-15mg     
P104    26Jan2005     Asp-10mg     
P101    31Jan2005     Asp-05mg     
P103    30Jan2005     Asp-10mg     
P102    01Feb2005     Asp-15mg     
P104    02Feb2005     Asp-10mg     
P105    02Feb2005     Asp-20mg     
run ;        
                      
proc sql ;                                                       
   create table want as                                          
   select *
from treatment group subid having (date < max(date) and date > min(date)) or count(*) < 3 ; quit ;

 If the answers are (a) and (d), you can't rely on the dates and have to rely on a surrogate unique row number (RID below) instead:

proc sql ;                                                     
   create table want as                                        
   select *                                                    
   from (select *, monotonic() as rid from treatment)          
   group  subid                                                
   having (rid < max (rid) and rid > min (rid)) or count(*) < 3
   ;                                                           
quit ;                                                         

This approach is generally safer since you while may think that your DATE values per SUBID are unique, the actual data may know otherwise. Finally, if your answer is (b), i.e. you don't want to keep the groups with fewer than 3 rows, then eliminate the text:

 

or count(*) < 3

 

from both queries shown above.

 

Paul D.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1001 views
  • 0 likes
  • 3 in conversation