BookmarkSubscribeRSS Feed
michellel
Calcite | Level 5

Hi,

 

I have a question about saving more than 1 macro variables at the same time. I have 80 files merged_1, merged_2,....., merged_80. I want to count how many observations for each file, and save the number for each file for further use. Thank you in advance for your help!

 

If I have only 1 file merged_1, I will write codes as below and it works.

proc sql;

select count(student_id) into: obs_num

from merged_1;

quit;

%put &obs_num.;

 

Now I have 80 files. I tried the following codes, but it does not work.

%macro boots;

%do i = 1 %to 80;

proc sql;

select count(student_id) into: obs_num_&i

from merged_&i;

quit;

%put &obs_num_&i;

%end;

%mend;

%boots;

 

 

Thanks, Astounding! That works.

 

I have a follow up question about how to use that macro variable. Thank you in advance for your answer!

 

I want to use bootstrap to run 1,000 times with replacement for each of 80 files. The count of observation for each file is different, saved in that macro variable &&obs_num_&i. So basically, for example, if merge_1 has 200 obs, then I want to create 1000 files, each of which has 200 obs, sampled from merge_1 with replacement. (My final target is to calculate mean and s.d. for each of 1000 files per each of 80 files. This part does not include in the following code.)

 

My code is shown below. It has error for the macro variable &&obs_num_&i. The error shows below as an example for each of variables generated.

 

------

50 &obs_num_1

_

22

WARNING: Apparent symbolic reference OBS_NUM_1 not resolved.

ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, a numeric constant, a datetime constant,

2 The SAS System 14:01 Tuesday, August 23, 2016

a missing value, INPUT, PUT.

------

 

 

%macro boots2;

%do i = 1 %to 2; /*will change 2 to 80*/

%do j = 1 %to 3; /*will change 3 to 1000*/

data select_&i._&j;

do sel = 1 to &&obs_num_&i;

 

rn = ranuni(&j. * 100000 + 20151102);

id = round(rn * 100000);

output;

end;

keep id;

run;

 

proc sort data=select_&i._&j;

by id;

run;

 

data merged_fnl_&i._&j;

merge merged_&i(in=a) select_&i._&j(in=b);

by id;

if b;

run;

%end;

%end;

%mend;

%boots2;

 

 

 

11 REPLIES 11
Astounding
PROC Star

You got it right except for one small detail.  You need another ampersand in the %PUT statement:

 

%put &&obs_num_&i;

 

That's the only issue that I see here.

michellel
Calcite | Level 5

Thanks, Astounding! That works. I have a follow up question about how to use that macro variable. Thank you in advance for your answer!

 

I want to use bootstrap to run 1,000 times with replacement for each of 80 files. The count of observation for each file is different, saved in that macro variable &&obs_num_&i. So basically, for example, if merge_1 has 200 obs, then I want to create 1000 files, each of which has 200 obs, sampled from merge_1 with replacement. (My final target is to calculate mean and s.d. for each of 1000 files per each of 80 files. This part does not include in the following code.)

 

My code is shown below. It has error for the macro variable &&obs_num_&i. The error shows below as an example for each of variables generated.

 

------

50 &obs_num_1

_

22

WARNING: Apparent symbolic reference OBS_NUM_1 not resolved.

ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, a numeric constant, a datetime constant,

2 The SAS System 14:01 Tuesday, August 23, 2016

a missing value, INPUT, PUT.

------

 

 

%macro boots2;

%do i = 1 %to 2; /*will change 2 to 80*/

%do j = 1 %to 3; /*will change 3 to 1000*/

data select_&i._&j;

do sel = 1 to &&obs_num_&i;

 

rn = ranuni(&j. * 100000 + 20151102);

id = round(rn * 100000);

output;

end;

keep id;

run;

 

proc sort data=select_&i._&j;

by id;

run;

 

data merged_fnl_&i._&j;

merge merged_&i(in=a) select_&i._&j(in=b);

by id;

if b;

run;

%end;

%end;

%mend;

%boots2;

 

Astounding
PROC Star

When SQL creates a macro variable, it always uses the local symbol table.  Once %BOOTS finishes, all those macro variables are erased.  A simple solution would be to change the definition of %BOOTS by adding the line in red:

 

%do i=1 %to 80;

%global obs_num_&i;

 

That way, a global variable will exist that SQL can utilize.

michellel
Calcite | Level 5

Thanks Astounding! It works after adding that line, but I found another problem, which is the random number generated in variable 'id'. I plan to generate random numbers, which are less than the count of observation for each file. For example, if merge_1 has 200 observation, I want all random numbers in variable 'id' <200, for example maybe 2 4s, 3 198s, and so on. So that when merge back with merge_1 file, I could get randomly selected rows with replacement from the original merge_1 file.

 

The result for random numbers in variable 'id' from My code are not limited up to the count of observation. This lead to a problem when merging back with merge_1 file. 

Reeza
Super User

Look into proc surveyselect and a paper called Don't Be Loopy by David Cassell. 

 

I recommend a combination approach, using proc surveyselect and the method looping over each file.

Astounding
PROC Star

That does look like a significantly different question than the original. 

 

For each original data set, why would you want 1,000 separate files?  Why wouldn't you create 1 file per original data set, with a variable named SAMPLE that ranges from 1 to 1,000?

Astounding
PROC Star

Here's a much simpler approach.  It involves little to no macro language, and certainly doesn't require the first macro at all.

 

data merged_1_samples;

do sample=1 to 1000;

   do recno=1 to _nobs_;

      get_this_one = ceil(_nobs_ * ranuni(12345));

      set merged_1 nobs=_nobs_ point=get_this_one;

      output;

   end;

end;

stop;

run;

 

You can always vary the seed to RANUNI if you would like.  But NOBS= retrieves the number of observations in the data set (no macro variable needed) as part of the same DATA step that performs the sampling.

ballardw
Super User

If when you "merged" the files you had kept some sort of source information you could use the data source as a strata variable in surveyselect. And specify a sample size or rate per strata. The REPS= option would create the specified number of sample replicates of given size per strata.

Reeza
Super User

But...better yet, you can query sashelp.vtable to get these results.

proc sql;
create table table_count as
select v.memname, v.libname, v.nobs
from sashelp.vtable as v
where libname='WORK' and %upper(memname) like 'MERGED_%';
quit;

 

michellel
Calcite | Level 5

Thanks Reeza, but there are errors for '%' in code shown as below.

 

_

22

200

WARNING: Apparent invocation of macro UPPER not resolved.

ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, a numeric constant, a datetime constant,

a missing value, (, +, -, BTRIM, CALCULATED, CASE, EXISTS, INPUT, NOT, PUT, SUBSTRING, TRANSLATE, USER, ^, ~.

ERROR 200-322: The symbol is not recognized and will be ignored.

Reeza
Super User

It should be UPCASE not UPPER, I always mix the two up.

 

proc sql;
create table table_count as
select v.memname, v.libname, v.nobs
from sashelp.vtable as v
where libname='WORK' and %upper(memname) like 'MERGED_%';
quit;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1959 views
  • 0 likes
  • 4 in conversation