BookmarkSubscribeRSS Feed
sasphd
Lapis Lazuli | Level 10

Hello, 

I want to select the N first observations by id. I tried these programs but they did not work 

/*****select 1200 first observations by id*/
proc sql outobs=1200;
create table US_Stock_IPO_1200 as
select *
from US_stock
group by GVKEY;
run;
data US_Stock_IPO_1200; 
set US_stock; 
if _N_<=1200 then output; 
by GVKEY;
run;
3 REPLIES 3
Reeza
Super User
_N_ does not exist in SQL only in data steps. It's a counter for the number of loops the data set has iterated, not a row counter though often synomous with that.

_N_ does not use ay by groups but you say you want it by ID so _N_ will not work.

You need to create your own counter variable and use that.

https://stats.oarc.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/
ballardw
Super User

You need to establish your own counter. The automatic variable _n_ is how many times the data step iterates and does not reset for by variables.

 

I think this may help but does require the  US_Stock data set to be sorted by GVKEY. If your records are grouped but not sorted you could use the By Notsorted GVKEY;

data US_Stock_IPO_1200; 
   set US_stock; 
   retain reccount;
   by GVKEY;
   if first.gvkey then reccount=0;
   reccount+1;
   if reccount le 1200;
   /* drop reccount; after verifying code is working*/
run;

Retain establishes a variable whose values are kept across data step boundaries. The FIRST. is a boolean value for variables on a By statement. So you can use that to test if the current record is the first of a group to reset the counter.

 

 

 


@sasphd wrote:

Hello, 

I want to select the N first observations by id. I tried these programs but they did not work 

/*****select 1200 first observations by id*/
proc sql outobs=1200;
create table US_Stock_IPO_1200 as
select *
from US_stock
group by GVKEY;
run;
data US_Stock_IPO_1200; 
set US_stock; 
if _N_<=1200 then output; 
by GVKEY;
run;

 

Tom
Super User Tom
Super User

Since you use the terminology "N first observations" let's assume that the number can vary.  Put the number into a macro variable to make it easier to modify.

%let N=1200;

Then use the value of that macro variable to decide whether or not to output any given observation.

You need to make your own counter for how many observations you have seen for this BY group. 

If you don't want that counter to become part of the resulting dataset use _N_ as the variable name.

data US_Stock_IPO_&n; 
  do _n_=1 by 1 until (last.gvkey);
    set US_stock; 
    by GVKEY;
    if _n_ <=&N then output;
  end;
run;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1422 views
  • 2 likes
  • 4 in conversation