BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nstdt
Quartz | Level 8

Hi all,

My problem is to break up a large data-set into smaller ones as below.

I have a data set containing about ~1000 variables. I converted each of these variable names to macro-variables called: varname1,varname2,,,varname100,...etc.

 

Now I need to generate smaller data-sets , each containing only 5 variables. For example, I tried the following SAS macro code but I get an error.

 

%macro batch_Chk(start,fin);

%do i =&start %to &fin by 5;
%let res=%evalf(&i. + 4);

	%do j =&i. %to &res. by 1;

		data words_&i.;   /* Words_i are the sub-data-sets with 5 variables each */
		set transf_vars; /* Transf_vars is the Original data */
		%put &res;       /* check res value...*/
		keep &&varname&i.- &&varname&res.;
	%end;
run;
%end;
%mend;

%batch_Chk (1,10);

 

 

 

SYMBOLGEN: Macro variable START resolves to 1
SYMBOLGEN: Macro variable FIN resolves to 10
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric
operand is required. The condition was: &fin by 5
ERROR: The %TO value of the %DO I loop is invalid.
ERROR: The macro BATCH_CHK will stop executing.

 

My goal is to have the 1000 variables (along with their content)  broken into smaller data-sets with just 5 variables each.

I initially thought I could change the variable names to macro-variables and manipulate them inside a DO loop - but I am not getting the results I hoped for. Would really appreciate any advise or help about this problem!

Thanks very much!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Now that can be solved with an inner loop:

%macro batch_Chk(start,fin);
%do i = &start %to &fin %by 5;
  %let res=%eval(&i. + 4);
  data words_&i.; /* words_i is a subset data with 5 variables*/
  set transf_vars; /* transf_vars is the Original Data w/all Variables*/
  keep
  %do j = &i %to &res;
    &&varname&j.
  %end;
  ;
  run;
%end;
%mend;

 

View solution in original post

20 REPLIES 20
Kurt_Bremser
Super User

Proper visual formatting will make solving your issues easier:

%macro batch_Chk(start,fin);
%do i = &start %to &fin by 5;
  %let res=%evalf(&i. + 4);
  %do j =&i. %to &res. by 1;
    data words_&i.; /* words_i is a subset data with 5 variables*/
    set transf_vars; /* transf_vars is the Original Data w/all Variables*/
    %put &res;
    %put &res;
    keep &&varname&i.- &&varname&res.;
  %end;
    run;
%mend;

Now you can immediately see that a %end is missing.

But your macro already coughs up at "by", as that is not a macro keyword.

Fixing that:

%macro batch_Chk(start,fin);
%do i = &start %to &fin %by 5;
  %let res=%evalf(&i. + 4);
  %do j =&i. %to &res. by 1;
    %put &res;
    data words_&i.; /* words_i is a subset data with 5 variables*/
    set transf_vars; /* transf_vars is the Original Data w/all Variables*/
    keep &&varname&i.- &&varname&res.;
    run;
  %end;
%end;
%mend;

Moved the %put &res; to a place where it is easier to understand, and removed the second identical statement.

 

But where do you get the 1000 macro variables varname1 to varname1000 from?

nstdt
Quartz | Level 8
Hi Kurt,
thanks for your reply. I noticed the %END was missing myself - and then from your post I have fixed the %BY also.

But I still get an ERROR:

SYMBOLGEN: Macro variable START resolves to 1
SYMBOLGEN: Macro variable FIN resolves to 10
WARNING: Apparent invocation of macro EVALF not resolved.
SYMBOLGEN: Macro variable I resolves to 1
SYMBOLGEN: Macro variable I resolves to 1
SYMBOLGEN: Macro variable RES resolves to %evalf(1 + 4)
WARNING: Apparent invocation of macro EVALF not resolved.
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric
operand is required. The condition was: &res.
ERROR: The %TO value of the %DO J loop is invalid.
ERROR: The macro BATCH_CHK will stop executing.

Kurt_Bremser
Super User

That's because there is no %evalf function, you probably meant %sysevalf, but that is overkill, as we only work with integer values here:

%macro batch_Chk(start,fin);
%do i = &start %to &fin %by 5;
  %let res=%eval(&i. + 4);
  %put &res;
  data words_&i.; /* words_i is a subset data with 5 variables*/
  set transf_vars; /* transf_vars is the Original Data w/all Variables*/
  keep &&varname&i.- &&varname&res.;
  run;
%end;
%mend;
nstdt
Quartz | Level 8

Thanks for the help!

Your version runs, though it throws an error which I should have anticipated:

SYMBOLGEN: Macro variable VARNAME5 resolves to AVGEARN_NSA
ERROR: Missing numeric suffix on a numbered variable list (AUTO_REPO_DIRECT-AVGEARN_NSA).

 

 

The problem is that though I have the macro-variables names set up as varname1, varname2,...,varname1000,

when SAS dereferences them, the ordering is lost -the actual names inside varname1 , varname2 , etc. are not similar to each other at all.

 

 

Thanks again for your help , but if you have any ideas how I go about creating these sus-data-set form the original data (actually it has exactly 726 names), please do  help!

 

Thanks!

Kurt_Bremser
Super User

Now that can be solved with an inner loop:

%macro batch_Chk(start,fin);
%do i = &start %to &fin %by 5;
  %let res=%eval(&i. + 4);
  data words_&i.; /* words_i is a subset data with 5 variables*/
  set transf_vars; /* transf_vars is the Original Data w/all Variables*/
  keep
  %do j = &i %to &res;
    &&varname&j.
  %end;
  ;
  run;
%end;
%mend;

 

nstdt
Quartz | Level 8

Hi Kurt - thank , this works!

 

I am now going to try and extend this to my original data-set, which actually has 726 variables (not 1000, and so the last loop may  have some issues). Will try to work around or post back here!

nstdt
Quartz | Level 8

Hi Kurt,

 

I took your solution, but I have to add that that I shifted the indexes to 6 and 5 instead of 5 and 4 - since I actually have 726 variables (divisible by 6 and only one more that 5, so the final report can still be read - just barely), and not 1000.

 

Thanks very much again! Especially for catching all the small mistakes which I hadn't noticed despite staring at them for a long time!!!

 

If you think of a better way to break-up a data-set with ~1000 variables into smaller datasets, please let me know on this thread!

 

Reeza
Super User

Use the SASHELP.VTABLE data which includes the VARNUM so you can control the ordering of your variables.

Kurt_Bremser
Super User

Ups, just saw that the inner %do loop makes no sense at all:

%macro batch_Chk(start,fin);
%do i = &start %to &fin %by 5;
  %let res=%evalf(&i. + 4);
  %put &res;
  data words_&i.; /* words_i is a subset data with 5 variables*/
  set transf_vars; /* transf_vars is the Original Data w/all Variables*/
  keep &&varname&i.- &&varname&res.;
  run;
%end;
%mend;
nstdt
Quartz | Level 8

The outer loop is supposed to handle the next group of 5 words.

The inner loop indexed by J  runs from ` to 5, then the outer index I becomes 6: so that J now runs from 6 to 10.

 

Obviously eventually I hope to extend it to handle 1000 variables (in my dataset transf_vars).

 

I'm not sure how I would do the breaking down (keeping adjacent sets of 5 variables together)

Kurt_Bremser
Super User

@nstdt wrote:

The outer loop is supposed to handle the next group of 5 words.

The inner loop indexed by J  runs from ` to 5, then the outer index I becomes 6: so that J now runs from 6 to 10.

 

Obviously eventually I hope to extend it to handle 1000 variables (in my dataset transf_vars).

 

I'm not sure how I would do the breaking down (keeping adjacent sets of 5 variables together)


You don't need the "j" loop at all.

The first iteration of the outer loop will create dataset words_1 with variables 1 to 5, the next will create words_6 with vars 6 to 10, and so on.

ballardw
Super User

 

As posted that code will not compile the macro as there is an unclosed %do statement.

 

You want %by not by in the loop

 

Since you aren't actually using the macro variable J any where why is it there? As written it will overwrite the data set words&i 5 times (if the by issue gets fixed)

 

I strongly suggest that before actually manipulating your data that you just print out the index values and resolved values of the variable names.

 

I have a hard time seeing a good use for this operation though. Especially since most of the data sets following the logic you attempted would not have identification variables to allow easy realignment.

 

Tested code would require an example input data set and the macro variables varname1 to varnamennn.

 

And if the number of variables is not an integer multiple of 5 you'll get errors for the last "set".

 

nstdt
Quartz | Level 8

You said:

I have a hard time seeing a good use for this operation though. Especially since most of the data sets following the logic you attempted would not have identification variables to allow easy realignment.

 

 

You're right, the resolved names of the macro-variables do not have any oredring pattern and I get errors.

So I'm back to Square-One - How do I break down a dataset of 726 variables into smaller data-sets with just 5 variables each?

Reeza
Super User

Why are you doing this in the first place? If there's a valid reason, there's usually a way to organize the data that makes sense contextually. Are you trying to make a fact/dim type structure to reduce the size of the data set?

Breaking things into groups of 5 seems like a strange decision. Why not 10 or 12?

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 12027 views
  • 3 likes
  • 5 in conversation