06-06-2016 02:46 PM
I'm trying to take the first n observations of a dataset, read the name stored in each one, and use that name to do something else. For example:
Because the names are different, I want to refer to them by observation number (i.e. Obs 1 for jkg) and then loop through the names to create subsetted data in another dataset. So Obs 1 should get me jkg, and I should then be able to search another dataset for jkg and subset all records with that name into a new dataset. Then it should move on to the Obs for ddg and do the same, and so on.
I know this can be done by searching the whole dataset for each name of interest. But even if I wrapped that in a macro, I think there should be a more succinct way of doing it, and nothing I've tried is really working.
06-06-2016 04:01 PM
It sounds like you are making this difficult by insisting on separate subsets for each NAME. Why not create one larger subset with all 10 names? The extract would look something like this:
data subset;
   set all_names (obs=10 keep=name);
run;
proc sql;
   create table want as
   select * from another_dataset where name in (select name from subset) order by name;
quit;
That (or something very much like it) should give you the subset with all records matching the first 10 names. And in order. So subsequent processing could use a BY statement:
proc reg data=want;
   by name;
   /* MODEL statement and any other options go here */
run;
The BY statement lets you loop through a separate analysis for each NAME, while keeping all your data together in one data set.
06-07-2016 04:59 AM
I am not sure I follow what you're asking. The only reason jkg is at logical observation number 1 is that the data is not sorted, so why would you assume that logical observation 1 = jkg? Relying on that doesn't make sense, because any change to the dataset (a sort, an added row) could break your logic. Your logic should work on the data, not on an abstract concept such as row position. As @Astounding has stated, if you just want the first 10 obs then use the OBS= option; if you specifically want certain data items, then use a WHERE clause.
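To make the difference concrete, here is a minimal sketch of the two approaches. It reuses the dataset name all_names and the names jkg and ddg from earlier in the thread; the output dataset names first10 and picked are made up for illustration.

```sas
/* Positional: keeps whichever rows happen to come first in all_names. */
/* Re-sorting or adding rows changes which names are selected.          */
data first10;
   set all_names (obs=10);
run;

/* Value-based: keeps the same rows regardless of how the data is ordered. */
data picked;
   set all_names;
   where name in ('jkg', 'ddg');
run;
```

The first step answers "what is at the top of the list right now", the second answers "where are these particular names", and only the second is stable if the dataset changes.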
06-08-2016 01:28 PM
I know the Obs order is only that way because that's how the dataset was originally output. That's what I want. I should be able to swap different datasets in and out, so that the code takes those positions from whichever dataset it is given, along with the names stored there. So if jkg is at Obs 1 in one dataset, and in a different dataset that same Obs holds something called 'hggg', the code should search the larger dataset for 'hggg' instead. That way, if I have different 'top 10' lists based on different criteria, I shouldn't have to change the code dramatically. Ideally I could put this into a macro and just switch out the dataset names to avoid clunky code.
06-08-2016 03:49 PM
It's a little difficult to tell what you are asking for, but this might be a step in the right direction.
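%macro subset (obsno);
   /* WANT and LARGER_DATASET are placeholder names; substitute your own */
   data want;
      retain _obsno_ &obsno;
      if _n_=1 then set have point=_obsno_;
      set larger_dataset;
      if variable_from_larger_dataset = name;
      drop _obsno_ name;
   run;
%mend subset;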
When calling the macro with OBSNO=5, the program reads the 5th observation of HAVE (where NAME is "sadf") and then keeps all matching observations from your larger data set.
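If the goal is to swap in different 'top 10' lists without editing the code, one possible variation is to pass the dataset names as macro parameters and pull all n names in one step. This is only a sketch: the macro name subset_top, its parameters (list=, master=, out=, n=), and the dataset names top10_by_sales and want_sales are all hypothetical, and it assumes both datasets store the name in a variable called NAME.

```sas
/* Hypothetical macro: all parameter and dataset names are illustrative. */
%macro subset_top(list=, master=, out=, n=10);
   proc sql;
      create table &out as
      select *
      from &master
      /* OBS= limits the subquery to the first &n rows of the list */
      where name in (select name from &list (obs=&n))
      order by name;
   quit;
%mend subset_top;

/* Example call: match the first 10 names in one list against the larger dataset */
%subset_top(list=top10_by_sales, master=another_dataset, out=want_sales);
```

Because the result is ordered by NAME, it can go straight into a procedure with a BY statement, and switching to a different list only means changing the LIST= and OUT= arguments.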