BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
joon1
Quartz | Level 8

Hi Madam/Sir,

 

I would like to choose the last observation out of two observations in the same firm-year. For instance, I would like to choose observations # 2, 4, and 6 and drop observations #1,2, and 3. It will be highly appreciative if you can provide the programming code. Thank you. Joon1

Obs CIKCODE cyear ANNUALREPORTDATE dirind123456
000000202420162016010110
000000202420162016120111
00000102542014201403013
00000102542014201412015
00000181692011201101014
00000181692011201112015
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
proc sort data=have;
by cikcode cyear annualReportdate;
run;

data want;
set have;
by cikcode cyear annualreportdate;
if last.cyear;
run;

@joon1 wrote:

Hi Madam/Sir,

 

I would like to choose the last observation out of two observations in the same firm-year. For instance, I would like to choose observations # 2, 4, and 6 and drop observations #1,2, and 3. It will be highly appreciative if you can provide the programming code. Thank you. Joon1

Obs CIKCODE cyear ANNUALREPORTDATE dirind123456
0000002024 2016 20160101 10
0000002024 2016 20161201 11
0000010254 2014 20140301 3
0000010254 2014 20141201 5
0000018169 2011 20110101 4
0000018169 2011 20111201 5

 

View solution in original post

3 REPLIES 3
Reeza
Super User
proc sort data=have;
by cikcode cyear annualReportdate;
run;

data want;
set have;
by cikcode cyear annualreportdate;
if last.cyear;
run;

@joon1 wrote:

Hi Madam/Sir,

 

I would like to choose the last observation out of two observations in the same firm-year. For instance, I would like to choose observations # 2, 4, and 6 and drop observations #1,2, and 3. It will be highly appreciative if you can provide the programming code. Thank you. Joon1

Obs CIKCODE cyear ANNUALREPORTDATE dirind123456
0000002024 2016 20160101 10
0000002024 2016 20161201 11
0000010254 2014 20140301 3
0000010254 2014 20141201 5
0000018169 2011 20110101 4
0000018169 2011 20111201 5

 

ballardw
Super User

if the data are in order as your example implies:

 

Data want;
   set have;
   by cikcode cyear;
   if last.cyear;
run;
hashman
Ammonite | Level 13

@joon1:

If your file is already properly sorted, just use the DATA step posted by @ballardw and @Reeza.

 

If it's not sorted, a single-step solution can be:

data have ;                                                                                                                                                                                                                                                     
  input CIKCODE cyear ANNUALREPORTDATE dirind123456 ;                                                                                                                                                                                                           
  cards ;                                                                                                                                                                                                                                                       
0000002024  2016  20160101  10                                                                                                                                                                                                                                  
0000002024  2016  20161201  11                                                                                                                                                                                                                                  
0000010254  2014  20140301   3                                                                                                                                                                                                                                  
0000010254  2014  20141201   5                                                                                                                                                                                                                                  
0000018169  2011  20110101   4                                                                                                                                                                                                                                  
0000018169  2011  20111201   5                                                                                                                                                                                                                                  
;                                                                                                                                                                                                                                                               
run ;                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                
data _null_ ;                                                                                                                                                                                                                                                   
  dcl hash h (dataset:"have", duplicate:"r", ordered:"a") ;                                                                                                                                                                                                     
  h.definekey ("cikcode", "cyear") ;                                                                                                                                                                                                                            
  h.definedata (all:"y") ;                                                                                                                                                                                                                                      
  h.definedone () ;                                                                                                                                                                                                                                             
  h.output (dataset:"want") ;                                                                                                                                                                                                                                   
  stop ;                                                                                                                                                                                                                                                        
  set have ;                                                                                                                                                                                                                                                    
run ;                             

This assumes that you have enough memory to hold the output data set, which for the sheet in your Excel workbook is guaranteed. But if in reality you have a much bigger file, here is another version requiring the hash table to hold only the record identifier in its data portion (rather than all the non-key variables):

data want ;                                                                                                                                                                                                                                                     
  dcl hash h (ordered:"a") ;                                                                                                                                                                                                                                    
  h.definekey ("cikcode", "cyear") ;                                                                                                                                                                                                                            
  h.definedata ("rid") ;                                                                                                                                                                                                                                        
  h.definedone () ;                                                                                                                                                                                                                                             
  do rid = 1 by 1 until (z) ;                                                                                                                                                                                                                                   
    set have end = z ;                                                                                                                                                                                                                                          
    h.replace() ;                                                                                                                                                                                                                                               
  end ;                                                                                                                                                                                                                                                         
  dcl hiter hi ("h") ;                                                                                                                                                                                                                                          
  do while (hi.next() = 0) ;                                                                                                                                                                                                                                    
    set have point = rid ;                                                                                                                                                                                                                                      
    output ;                                                                                                                                                                                                                                                    
  end ;                                                                                                                                                                                                                                                         
run ;                 

Suit yourself.

 

Kind regards

Paul D.      

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1146 views
  • 1 like
  • 4 in conversation