teelov
Quartz | Level 8

Good afternoon,

I have a very large data set that I want to process. The processing can only be done on one isolated subset of the data set at a time.

 

As you can see from the list below, I want to be able to:

 

choose rows 2-6

rename the subset with an increment

run my processing %mymacro on this subset only

go back to the main data-set

 

choose rows 7-14

 ...

 ... 

 %mymacro

 ...

choose rows 15-22

 ...

 ...

 %mymacro

 ...

choose rows 23-30

 

And so on, until the macro has processed every subset, with the last one running from row 199 to EOF (the last _n_ is not in the list for some reason).

 

 

startfinish.PNG

7 REPLIES
Reeza
Super User
It's not clear what your question is here. Which part are you having trouble with?

For writing a macro I usually recommend you start with your working code and then macrotize it one step at a time. There's a short tutorial here.
https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md

Yours will vary for the last step because you want to choose only a few records each time. That's easy enough to do by either looping or subsetting each time. If that's the part you're having issues with, the following illustrates one method to split a data set into subsets though you could likely get away with much simpler logic.
teelov
Quartz | Level 8

My processing macro is complete.

I need to programmatically generate the FIRSTOBS= and OBS= values that create each subset of the data that needs processing.

So if a loop or macro were to go through the numbers above, it would generate something like:

 

dataexample.PNG

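/* OBS= gives the number of the last row to read, not a row count */
data D000100_1a;
set D000100 (firstobs=2 obs=6);    /* rows 2-6 */
run;

data D000100_1b;
set D000100 (firstobs=7 obs=14);   /* rows 7-14 */
run;

data D000100_1c;
set D000100 (firstobs=15 obs=22);  /* rows 15-22 */
run;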
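
One way the steps above could be generated rather than typed by hand is a small macro loop over the start and end rows. This is only a sketch: the macro name split_by_rows, its parameters, and the numeric suffix on the output names (instead of 1a, 1b, 1c) are assumptions for illustration, and the call to %mymacro is left as a comment because its parameters are not shown in this thread.

```sas
/* Hypothetical helper: name, parameters and output naming are assumptions */
%macro split_by_rows(ds=, starts=, ends=);
    %local i first last;

    /* One iteration per start row in the space-separated list */
    %do i = 1 %to %sysfunc(countw(&starts, %str( )));
        %let first = %scan(&starts, &i, %str( ));
        %let last  = %scan(&ends,   &i, %str( ));

        /* OBS= is the number of the last row to read, not a row count */
        data &ds._&i;
            set &ds (firstobs=&first obs=&last);
        run;

        /* The processing macro would be called here on &ds._&i */
    %end;
%mend split_by_rows;

/* Row ranges taken from the question: 2-6, 7-14, 15-22, 23-30 */
%split_by_rows(ds=D000100, starts=2 7 15 23, ends=6 14 22 30)
```

For the final subset that runs to the end of the data set, OBS=MAX is a valid value, so the last entry in the ends list could be max instead of a row number.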

 

ballardw
Super User

You would likely be way ahead in this project if you add a variable to identify the groups and then use BY group processing.

 

Example of creating output for each level of a variable using by group processing:

proc sort data=sashelp.class out=work.class;
   by age;
run;

proc means data=work.class max min mean;
   by age;
   var height weight;
run;

proc print data=work.class;
   by age;
   var name sex;
run;
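
The examples above assume the grouping variable already exists. As a rough sketch of how one might be derived from row ranges like those in the question, the steps below number the rows and join them to a small table of boundaries. The names work.bounds, work.numbered, work.tagged, subset_id, first_row, last_row and row are all made up for illustration, and the ranges are only the first four from the question.

```sas
/* Hypothetical boundary table: one line per subset */
data work.bounds;
    input subset_id first_row last_row;
    datalines;
1 2 6
2 7 14
3 15 22
4 23 30
;
run;

/* Keep the original row position as a variable */
data work.numbered;
    set D000100;
    row = _n_;
run;

/* Attach the subset identifier; rows outside every range
   (row 1, for example) are dropped by the inner join */
proc sql;
    create table work.tagged as
    select n.*, b.subset_id
    from work.numbered as n
         inner join work.bounds as b
         on n.row between b.first_row and b.last_row
    order by b.subset_id, n.row;
quit;

/* Any procedure that supports BY can now work one subset at a time */
proc print data=work.tagged;
    by subset_id;
run;
```

With this layout, changing a boundary means editing one line of work.bounds rather than rewriting a FIRSTOBS=/OBS= pair.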

And what happens to your "row"=1?

teelov
Quartz | Level 8
Your answer has no relevance to my question, and telling me I would be "further ahead" is slightly insulting.

ballardw
Super User

@teelov wrote:
Your answer has no relevance to my question, and telling me I would be "further ahead" is slightly insulting.


I am sorry that you feel that way but your original requirement included:

 the process can only be done when isolating subset of the entire

 

Which is exactly what BY group processing does.

 

Or use of the data set option WHERE with a group variable equal to the desired group (or groups).

Or use of a WHERE statement in the very many procedures that support it.
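
As a minimal sketch of the two WHERE forms just mentioned, using sashelp.class and its age variable as a stand-in for a data set with a group identifier:

```sas
/* WHERE= data set option: only rows with age 12 reach the procedure */
proc means data=sashelp.class(where=(age=12)) max min mean;
    var height weight;
run;

/* WHERE statement: same idea, here selecting two groups at once */
proc print data=sashelp.class;
    where age in (13, 14);
    var name sex age;
run;
```

Neither form creates a new data set; the subset exists only for the duration of the step.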

 

None of these three approaches require creating multiple data sets, which you now have to reference explicitly and maintain.

 

Plus, by default BY group processing tells you which group is being processed, and it sets values that can be used for things such as enhancing TITLE statements or naming output tabs in spreadsheets. If you want or need any of that functionality with separate data sets, you have to add additional coding.
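
For instance, the current BY value can be placed in a title with #BYVAL. This is a small sketch on sashelp.class; the title text itself is just an illustration.

```sas
proc sort data=sashelp.class out=work.class;
    by age;
run;

/* Suppress the default BY line so the title carries the group value */
options nobyline;
title "Height and weight for age #byval(age)";

proc means data=work.class max min mean;
    by age;
    var height weight;
run;

/* Restore the defaults afterwards */
title;
options byline;
```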

 

Another issue is that if you are defining your output sets based on arbitrary observation numbers, your process is extremely fragile if you have to repeat it, because you have to redefine every single start/end pair.

Consider what happens if your project manager comes up and says, by the way, these three records were missed previously and need to be incorporated. You will likely have to do a lot of work to get them into the correct group. If a group identifier variable were in your data set, you would only have to add the group identifier for the three records, append them to the data, sort for BY group processing and go.

Consider your project manager asking to repeat the analysis but combine some arbitrary collections of records as different groups.

 

Reeza
Super User

@teelov wrote:
Your answer has no relevance to my question, and telling me I would be "further ahead" is slightly insulting.


Why does it have no relevance to your question? It's a method to do iterative processing that is pretty much the standard in SAS.

 

If you want someone to code a specific answer, that's a consultant's job, not the job of a public user forum. The purpose of the forum is to help answer questions, but it's primarily volunteers answering the questions here, and they owe you absolutely nothing.

teelov
Quartz | Level 8
My apologies for the abrupt reply. It was out of character. It has been somewhat of a challenging day, to say the least. My response is not the way I carry myself.

I understand your example, and in most situations I would have considered that approach.

My other post about processing hierarchical data now has most of my code complete. However, I have hit a brick wall. I just wanted to test this approach and, for the life of me, hit a coding block.

Again, I apologise. Thank you for looking at the post.
