mahler_ji
Obsidian | Level 7

Hello All,

I have data with many variables and many observations, and I want to keep the first n observations within each BY group, for whatever set of BY variables I choose.

If I sort the data by (name), then I want the first n observations for each name, but if I sort the data by (name date), then I want the first n observations for each name on a given date. I am hoping that this makes sense.

Please see the example below:

Obs   name   date
1     a      x
2     a      x
3     a      x
4     a      x
5     a      y
6     a      y
7     b      x
8     b      x
9     b      y
10    b      y
11    b      y

So if I chose my by variable to be just (name) and my n to be 2, then I would retain observations 1, 2, 7 and 8.  But if I chose the by variables to be (name, date) and my n to be two, then I would retain observations 1, 2, 5, 6, 7, 8, 9 and 10.

Any help is greatly appreciated!!!

Thanks,

John

LinusH
Tourmaline | Level 20

Just use a DATA step with a BY statement, and reset a counter each time you encounter a new BY group.

Pair this with a conditional, explicit OUTPUT statement.

You could probably embed this quite easily in a macro (if you, or someone you can get help from, knows macro programming) so that it is easy to change the rules for each run.
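
A minimal sketch of the counter idea, assuming the data look like the example in the question. The data set names HAVE and WANT and the counter variable c are made up for illustration. FIRST.date is set to 1 on the first row of each name/date combination, so that is where the counter goes back to zero.

```sas
/* Hypothetical data set HAVE, mirroring the example in the question */
data have;
   input name $ date $;
   datalines;
a x
a x
a x
a x
a y
a y
b x
b x
b y
b y
b y
;
run;

/* BY-group processing needs the data sorted by the BY variables */
proc sort data=have;
   by name date;
run;

data want;
   set have;
   by name date;
   /* FIRST.date is 1 whenever name or date changes */
   if first.date then c = 0;
   c + 1;                    /* sum statement: c is retained across rows */
   if c <= 2 then output;    /* keep only the first 2 rows per group */
   drop c;
run;
```

With n = 2 this should keep observations 1, 2, 5, 6, 7, 8, 9 and 10. To keep the first two rows per name only, use BY name and test first.name instead, which should leave observations 1, 2, 7 and 8.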

Data never sleeps
mahler_ji
Obsidian | Level 7

Hey,

How would I get the counter to restart at the beginning of a by variable?

This is what I originally tried to do but couldn't figure it out.

Thanks!

mahler_ji
Obsidian | Level 7

Thank you!!!

data_null__
Jade | Level 19

This is somewhat generic and uses an array of FIRST. variables. Poor choice of example data, but you get the idea.



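%let obs=2;
%let data=class;
%let by=sex age;

/* Name literals such as 'first.'n need VALIDVARNAME=ANY */
options validvarname=any;

proc sort data=sashelp.&data out=&data;
   by &by;
run;
proc print;
run;

data keepobs;
   set &data;
   by &by;
   /* Collect every FIRST. variable into an array; the last element
      is the one for the last BY variable */
   array _by[*] 'first.'n:;
   if _by[dim(_by)] then c = 0;   /* new innermost BY group: reset the counter */
   c + 1;
   if c le &obs then output;
run;
proc print;
run;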
Haikuo
Onyx | Level 15

Nice! I didn't know you could refer to those automatic variables using name literals, and this is the first time I have seen them used as array elements. Happy to learn! Thanks for sharing, John.

Haikuo

data_null__
Jade | Level 19

You have to use VALIDVARNAME=ANY to refer to them as in the example. If not, you can use FIRST: but that is not so safe. Sometimes we forget that arrays are just variable lists, and that FIRST.this or FIRST.that are just variables.

It is not my original idea. I have no original ideas. :)

Reeza
Super User

But do you ever forget anything? ;)

Ksharp
Super User

John, how about:

%let obs=2;
%let data=class;
%let by=sex age;

proc sort data=sashelp.&data out=&data;
   by &by;
run;
proc print;
run;

data keepobs;
   set &data;
   by &by;
   /* %scan(&by,-1) resolves to the last BY variable, here age */
   if first.%scan(&by,-1) then c = 0;
   c + 1;
   if c le &obs then output;
run;
proc print;
run;
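
Following the macro suggestion earlier in the thread, the %scan approach could be wrapped up roughly like this. Treat it as a sketch only: the macro name first_n, its parameters, and the temporary counter _c are all made up for illustration, and it assumes the BY variables are separated by blanks.

```sas
%macro first_n(data=, by=, n=, out=);
   /* The last BY variable identifies the innermost BY group */
   %local last;
   %let last = %scan(&by, -1);

   proc sort data=&data out=&out;
      by &by;
   run;

   data &out;
      set &out;
      by &by;
      if first.&last then _c = 0;   /* reset at the start of each group */
      _c + 1;
      if _c <= &n then output;      /* keep the first &n rows per group */
      drop _c;
   run;
%mend first_n;

/* Example call: first 2 rows per sex/age combination */
%first_n(data=sashelp.class, by=sex age, n=2, out=keepobs)
```

Changing the rules for a run then only means changing the by= and n= arguments in the call.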
    
    

Xia Keshan

data_null__
Jade | Level 19

What's the fun in that? :P

Ksharp
Super User

Nothing. Just another way.
