12-17-2014 02:35 PM
I have a dataset with many variables and many observations, and I want to keep the first n observations within each group defined by whichever BY variables I choose.
If I sort the data by (name), then I want the first n observations for each name; if I sort the data by (name date), then I want the first n observations for each name on a given date. I hope this makes sense.
Please see the example below:
Obs name date
1 a x
2 a x
3 a x
4 a x
5 a y
6 a y
7 b x
8 b x
9 b y
10 b y
11 b y
So if I chose my by variable to be just (name) and my n to be 2, then I would retain observations 1, 2, 7 and 8. But if I chose the by variables to be (name, date) and my n to be two, then I would retain observations 1, 2, 5, 6, 7, 8, 9 and 10.
Any help is greatly appreciated!!!
12-17-2014 02:46 PM
Just use a DATA step with a BY statement, and restart a counter each time you encounter a new BY group.
Pair this with a conditional, explicit OUTPUT statement.
You could probably embed this quite easily in a macro (if you, or someone you can get help from, knows macro programming) so that it is easy to change the rules for each run.
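A minimal sketch of that idea, using the example data from the question. The dataset names (have, want_name, want_name_date), the macro name first_n, its parameters, and the work dataset _sorted are all made up for illustration; treat it as a starting point rather than a finished solution.

```sas
/* Example data from the question */
data have;
  input name $ date $;
  datalines;
a x
a x
a x
a x
a y
a y
b x
b x
b y
b y
b y
;
run;

/* Hypothetical macro: keep the first &n observations of each BY group */
%macro first_n(data=, out=, by=, n=);
  /* _sorted is an assumed scratch dataset name */
  proc sort data=&data out=_sorted;
    by &by;
  run;

  data &out;
    set _sorted;
    by &by;
    /* FIRST. of the last BY variable is 1 whenever any BY variable changes */
    if first.%scan(&by, -1) then _count = 0;
    _count + 1;                    /* sum statement, so _count is retained */
    if _count <= &n then output;   /* explicit, conditional output */
    drop _count;
  run;
%mend first_n;

%first_n(data=have, out=want_name,      by=name,      n=2);
%first_n(data=have, out=want_name_date, by=name date, n=2);
```

This relies on PROC SORT keeping the original relative order of observations within each BY group, which is its default behavior (the EQUALS option). With that, the first call should keep observations 1, 2, 7 and 8, and the second should keep 1, 2, 5, 6, 7, 8, 9 and 10, as described in the question.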
12-17-2014 03:26 PM
12-17-2014 03:18 PM
This is somewhat generic and uses an array of first dot variables. Poor choice of example data, but you get the idea.
12-17-2014 04:03 PM
Nice! I didn't know you could refer to those automatic variables using name literals, and this is the first time I have seen them used as array elements! Happy to learn! Thanks for sharing, John.
12-17-2014 04:17 PM
You have to use VALIDVARNAME=ANY to refer to them as in the example. If not, you can use FIRST: but that is not so safe. Sometimes we forget that ARRAYs are just variable lists, and that FIRST dot this or that are just variables.
It is not my original idea. I have no original ideas.
12-18-2014 08:13 AM
John, how about:
%let obs=2;
%let data=class;
%let by=sex age;

proc sort data=sashelp.&data out=&data;
  by &by;
run;
proc print;
run;

data keepobs;
  set &data;
  by &by;
  if first.%scan(&by,-1) then c = 0;
  c + 1;
  if c le &obs then output;
run;
proc print;
run;