How to find first N and last N obs with By variable?

Accepted Solution Solved
Reply
Contributor QLi
Contributor
Posts: 59
Accepted Solution

How to find first N and last N obs with By variable?

Hello,

I have a problem to find first n and last n obs for each month. I figured out how to find first n obs. Could anybody have an efficient way to get last n obs based on each month?

Thanks in advance.

The below is the code for the question.

data have;

length MOn $ 7 ;

informat opendate yymmdd10. ;

format opendate yymmdd10.;

input Mon Opendate acctNum;

datalines;

2014-01 2014-01-02 2384

2014-01 2014-01-02 4723

2014-01 2014-01-02 5954

2014-01 2014-01-06 2947

2014-01 2014-01-06 4349

2014-01 2014-01-06 4478

2014-01 2014-01-09 6139

2014-01 2014-01-09 7655

2014-01 2014-01-27 2065

2014-01 2014-01-28 7038

2014-02 2014-02-06 3120

2014-02 2014-02-07 2864

2014-02 2014-02-10 1409

2014-02 2014-02-11 0456

2014-02 2014-02-11 2704

2014-02 2014-02-11 5842

2014-02 2014-02-18 0934

2014-02 2014-02-18 7630

2014-02 2014-02-27 3931

2014-03 2014-03-04 4751

2014-03 2014-03-04 8732

2014-03 2014-03-13 4792

2014-03 2014-03-17 8868

2014-03 2014-03-25 8168

2014-04 2014-04-08 6910

2014-04 2014-04-09 4496

2014-04 2014-04-15 3525

2014-04 2014-04-17 6023

2014-04 2014-04-18 7453

2014-04 2014-04-23 3930

2014-04 2014-04-24 4925

2014-04 2014-04-28 3405

2014-04 2014-04-28 4038

2014-04 2014-04-28 8540

2014-04 2014-04-29 3964

2014-04 2014-04-30 2561

2014-04 2014-04-30 8133

2014-05 2014-05-05 6331

2014-05 2014-05-05 9796

2014-05 2014-05-09 5393

2014-05 2014-05-12 2715

2014-05 2014-05-19 0007

2014-05 2014-05-19 2741

2014-05 2014-05-19 5780

2014-05 2014-05-21 1717

2014-05 2014-05-21 5595

2014-05 2014-05-21 7656

2014-05 2014-05-28 1779

2014-05 2014-05-30 2382

2014-05 2014-05-30 5113

2014-05 2014-05-30 9139

2014-06 2014-06-12 2133

2014-06 2014-06-13 4578

2014-06 2014-06-16 7189

2014-06 2014-06-18 3996

2014-06 2014-06-23 7799

2014-06 2014-06-24 9880

2014-06 2014-06-30 8321

2014-07 2014-07-01 1332

2014-07 2014-07-08 4770

2014-07 2014-07-09 0097

2014-07 2014-07-10 7365

2014-07 2014-07-15 5766

2014-07 2014-07-15 8149

2014-07 2014-07-17 8945

2014-07 2014-07-18 9757

2014-07 2014-07-21 4171

2014-07 2014-07-21 4569

2014-07 2014-07-22 8733

2014-07 2014-07-23 9421

2014-07 2014-07-24 6270

2014-07 2014-07-30 6779

2014-08 2014-08-01 4313

2014-08 2014-08-05 6366

2014-08 2014-08-08 7315

2014-08 2014-08-11 7244

2014-08 2014-08-11 8788

2014-08 2014-08-12 2978

2014-08 2014-08-13 0967

2014-08 2014-08-13 4598

2014-08 2014-08-14 1587

2014-08 2014-08-14 6579

2014-08 2014-08-14 9235

2014-08 2014-08-19 6960

2014-08 2014-08-19 7424

2014-08 2014-08-21 4643

2014-08 2014-08-22 8183

2014-08 2014-08-25 6993

2014-08 2014-08-26 5970

2014-08 2014-08-26 7337

2014-08 2014-08-28 1918

2014-08 2014-08-28 2548

2014-08 2014-08-28 9709

2014-08 2014-08-28 9821

2014-09 2014-09-01 0401

2014-09 2014-09-03 5498

2014-09 2014-09-04 4562

2014-09 2014-09-08 3869

2014-09 2014-09-11 3966

2014-09 2014-09-11 5163

2014-09 2014-09-15 3879

2014-09 2014-09-15 6084

2014-09 2014-09-16 6347

2014-09 2014-09-18 7511

2014-09 2014-09-22 6510

2014-09 2014-09-29 2145

2014-09 2014-09-30 3317

2014-10 2014-10-01 4439

2014-10 2014-10-03 4036

2014-10 2014-10-08 4278

2014-10 2014-10-17 6996

2014-10 2014-10-28 1506

2014-11 2014-11-06 4125

2014-11 2014-11-07 5094

2014-11 2014-11-13 5534

2014-11 2014-11-17 4154

2014-11 2014-11-18 8332

2014-11 2014-11-18 9681

2014-11 2014-11-21 5800

2014-11 2014-11-24 0591

2014-11 2014-11-24 9726

2014-12 2014-12-01 9144

2014-12 2014-12-03 4830

2014-12 2014-12-03 7171

2014-12 2014-12-04 7629

2014-12 2014-12-12 0703

2014-12 2014-12-15 3130

2014-12 2014-12-29 0229

;

run;

proc sort data=have out=have_s; by Opendate acctNum; run;

data want;

set have_s;

count+1;

by Mon;

if first.mon then count=1;

run;

** to find first 2 obs for each month **;

data first2obs (drop=count);

set want;

if count le 2;

run;


Accepted Solutions
Solution
‎04-02-2015 04:20 PM
Super User
Posts: 5,353

Re: How to find first N and last N obs with By variable?

You will need to read the data twice ... once to find the total number of records in a month and a second time to select the proper records.  Here is a way you can do this in a single DATA step.  After sorting BY MON OPENDATE (or possibly BY MON OPENDATE ACCTNUM) :

data want;

   monthly_n=0;

   do until (last.month);

      set have_s;

      by month;

      monthly_n + 1;

   end;

   count=0;

   do until (last.month);

      set have_s;

      by month;

      count + 1;

      if count <= 2 or count >= monthlY_n - 1 then output;

   end;

   ** if desired, drop count and monthly_n;

run;

If you are looking through the literature to find more examples of this technique, it's often called a DOW loop.  And if you only have three observations for a MON, it will output all three but will not output any of them more than once.

Good luck.

View solution in original post


All Replies
Super User
Posts: 11,105

Re: How to find first N and last N obs with By variable?

LAST.MON same as FIRST.MON .

Contributor QLi
Contributor
Posts: 59

Re: How to find first N and last N obs with By variable?

  I tried Last.mon and can't get what i want. Could you give some detail?  Thanks,

Super User
Posts: 11,105

Re: How to find first N and last N obs with By variable?

You'll need to provide some example of what you actually want and the rules around it. An example of the input and the expected output will help. If it involves specific accounts you likely need to sort the data differently, possibly by account then mon.

Solution
‎04-02-2015 04:20 PM
Super User
Posts: 5,353

Re: How to find first N and last N obs with By variable?

You will need to read the data twice ... once to find the total number of records in a month and a second time to select the proper records.  Here is a way you can do this in a single DATA step.  After sorting BY MON OPENDATE (or possibly BY MON OPENDATE ACCTNUM) :

data want;

   monthly_n=0;

   do until (last.month);

      set have_s;

      by month;

      monthly_n + 1;

   end;

   count=0;

   do until (last.month);

      set have_s;

      by month;

      count + 1;

      if count <= 2 or count >= monthlY_n - 1 then output;

   end;

   ** if desired, drop count and monthly_n;

run;

If you are looking through the literature to find more examples of this technique, it's often called a DOW loop.  And if you only have three observations for a MON, it will output all three but will not output any of them more than once.

Good luck.

Contributor QLi
Contributor
Posts: 59

Re: How to find first N and last N obs with By variable?

It worked. thanks,

Super User
Posts: 9,856

Re: How to find first N and last N obs with By variable?

proc sort data=have out=have_s; by Opendate acctNum; run;
data want;
merge have_s have_s(firstobs=2 keep=Mon rename=(Mon=_Mon));
if mon ne lag2(mon) or mon ne _mon;
drop _mon;
run;

Xia Keshan

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 287 views
  • 0 likes
  • 4 in conversation