SAS Programming

thanikondharish · Posted 08-08-2018 01:14 AM

Data Treatment ;

input subid $ Date : date9. Dose $;

Format Date : date9. ;

cards;

P101 13Jan2005 Asp-05mg

P102 14Jan2005 Asp-10mg

P101 19Jan2005 Asp-10mg

P103 19Jan2005 Asp-15mg

P102 20Jan2005 Asp-05mg

P101 25Jan2005 Asp-15mg

P103 25Jan2005 Asp-10mg

P102 26Jan2005 Asp-15mg

P104 26Jan2005 Asp-10mg

P101 31Jan2005 Asp-05mg

P103 30Jan2005 Asp-10mg

P102 01Feb2005 Asp-15mg

P104 02Feb2005 Asp-10mg

P105 02Feb2005 Asp-20mg

Run;

how to extract records except first and last records in each group by using sql?

PeterClemmensen · Posted 08-08-2018 01:57 AM

Why do you want to limit yourself to PROC SQL when doing this in a data step is much simpler. SQL deals with set operations. The data step deals with individual rows. If you absolutely want to do this in SQL, create a variable n to keep track of first/last occurrences within by groups and use SQL like this

Data Treatment ;
input subid $ Date : date9. Dose $;
Format Date : date9. ;
cards;
P101 13Jan2005 Asp-05mg
P102 14Jan2005 Asp-10mg
P101 19Jan2005 Asp-10mg
P103 19Jan2005 Asp-15mg
P102 20Jan2005 Asp-05mg
P101 25Jan2005 Asp-15mg
P103 25Jan2005 Asp-10mg
P102 26Jan2005 Asp-15mg
P104 26Jan2005 Asp-10mg
P101 31Jan2005 Asp-05mg
P103 30Jan2005 Asp-10mg
P102 01Feb2005 Asp-15mg
P104 02Feb2005 Asp-10mg
P105 02Feb2005 Asp-20mg
Run;

data temp;
   set Treatment;
   n=_N_;
run;

proc sql;
   create table want as
   select * from temp
   group by subid
   having n ne max(n) & n ne min(n);
quit;

The DATA to DATA Step Macro
Blog: SASnrd

thanikondharish · Posted 08-08-2018 02:54 AM

I know how to extract in dataset block but I want to know in process SQL.

PeterClemmensen · Posted 08-08-2018 02:58 AM

Is your data sorted by Date? Ie does the earliest/latest date imply the first/last record within each By Group? In that case, do

proc sql;
   create table want as
   select * from Treatment
   group by subid
   having Date ne max(Date) and Date ne min(Date);
quit;

The DATA to DATA Step Macro
Blog: SASnrd

hashman · Posted 08-08-2018 03:23 AM

@thanikondharish:

1. You didn't state what "group" means. Is it SUBID?

2. If it is, you didn't present your desired output. Do you want to (a) keep the groups with fewer than 3 rows or (b) drop them? Because if a group contains 1 or 2 rows, your criterion of keeping only the rows in except the first and the last would mean that you'd keep no rows from that group at all.

3. Are your dates within the group (c) unique as in your data sample or (d) they can be duplicate?

If the answers are (a) and (c), then you can code:

data treatment ;                   
  input subid $ date:date. dose $ ;
  format date date9. ;             
  cards ;                          
P101    13Jan2005     Asp-05mg     
P102    14Jan2005     Asp-10mg     
P101    19Jan2005     Asp-10mg     
P103    19Jan2005     Asp-15mg     
P102    20Jan2005     Asp-05mg     
P101    25Jan2005     Asp-15mg     
P103    25Jan2005     Asp-10mg     
P102    26Jan2005     Asp-15mg     
P104    26Jan2005     Asp-10mg     
P101    31Jan2005     Asp-05mg     
P103    30Jan2005     Asp-10mg     
P102    01Feb2005     Asp-15mg     
P104    02Feb2005     Asp-10mg     
P105    02Feb2005     Asp-20mg     
run ;        
                      
proc sql ;                                                       
   create table want as                                          
   select *   from   treatment                                       
   group  subid                                                  
   having (date < max(date) and date > min(date)) or count(*) < 3
   ;                                                             
quit ;

If the answers are (a) and (d), you can't rely on the dates and have to rely on a surrogate unique row number (RID below) instead:

proc sql ;                                                     
   create table want as                                        
   select *                                                    
   from (select *, monotonic() as rid from treatment)          
   group  subid                                                
   having (rid < max (rid) and rid > min (rid)) or count(*) < 3
   ;                                                           
quit ;

This approach is generally safer since you while may think that your DATE values per SUBID are unique, the actual data may know otherwise. Finally, if your answer is (b), i.e. you don't want to keep the groups with fewer than 3 rows, then eliminate the text:

or count(*) < 3

from both queries shown above.

Paul D.

SAS Programming

how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Follow Us

What is...

SAS Programming

how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

Re: how to extract records expect first and last records in each group by using sql

🚨 Early Bird Rate Extended!

SAS Training: Just a Click Away

Follow Us

What is...