Hello:
My sample code is listed below. In my actual data, the groups range from 100 to 800. Is there a better way to create these subgroups? Thanks.
data classgroup100 classgroup200 classgroup300;
set test;
if 100 <= group < 200 then output classgroup100;
if 200 <= group < 300 then output classgroup200;
if 300 <= group < 400 then output classgroup300;
run;
data maingroup100 subgroup100;
set classgroup100;
if group=100 then output maingroup100;
else output subgroup100;
run;
data maingroup200 subgroup200;
set classgroup200;
if group=200 then output maingroup200;
else output subgroup200;
run;
data maingroup300 subgroup300;
set classgroup300;
if group=300 then output maingroup300;
else output subgroup300;
run;
I'll demo one group; use it as a model for the rest:
data classgroup100 subgroup100 maingroup100;
set test;
if 100 <= group < 200 then
do;
output classgroup100;
if group=100 then output maingroup100;else output subgroup100;
end;
run;
Most of the time, the "best" way is not to create them at all. It's very easy to subset the observations as you need them, for example:
where group=100;
or:
where (100 < group < 200);
Is there something about your intended analysis that makes you want to keep three copies of the data?
PROC FORMAT?
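A rough sketch of how the PROC FORMAT idea might look, assuming the goal is to analyze by main group and subgroup without creating separate data sets. The format name grpfmt and the labels are made up for illustration; test and group come from the original post.

```sas
/* Hypothetical format: label each main value and the range that follows it. */
/* 100 <-< 200 excludes both endpoints, so it covers 101 through 199.        */
proc format;
  value grpfmt
    100         = 'main 100'
    100 <-< 200 = 'sub 100'
    200         = 'main 200'
    200 <-< 300 = 'sub 200'
    300         = 'main 300'
    300 <-< 400 = 'sub 300';
run;

/* Procedures group by the formatted value, so no split data sets are needed. */
/* Values not covered by the format (400 and above here) print unformatted.   */
proc freq data=test;
  format group grpfmt.;
  tables group;
run;
```

The remaining ranges up to 800 would follow the same pattern.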
%macro generate_statements;
%do n=100 %to 700 %by 100;
if &n <= group < %eval(&n+100) then
do;
output classgroup&n;
if group=&n then output maingroup&n;else output subgroup&n;
end;
%end;
%mend generate_statements;
/*call the macro in your datastep;*/
data w;
set have;
%generate_statements;
run;
If you are completely unsatisfied with all of the above suggestions and want something super ultra hyper dynamic, we'll have to jump to a hash of hashes, subject to memory availability. I'll hold that solution for later.
@novinosrin let's not reinvent the wheel for every question.
This topic is well covered in this post.
http://www.sascommunity.org/wiki/Split_Data_into_Subsets
or here:
https://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
I modified the code as shown below. The log showed some error messages.
%macro generate_statements;
%do n=100 %to 700 %by 100;
if &n <= group < %eval(&n+100) then
do;
output classgroup&n;
if group=&n then output maingroup&n;
else output subgroup&n;
end;
%end;
%mend generate_statements;
/*call the macro in your datastep;*/
data classgroup&n. maingroup&n. subgroup&n.;
set test;
%generate_statements;
run;
159 data classgroup&n. maingroup&n. subgroup&n.;
- - -
22 22 22
200 200 200
WARNING: Apparent symbolic reference N not resolved.
WARNING: Apparent symbolic reference N not resolved.
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, (, /, ;,
_DATA_, _LAST_, _NULL_.
ERROR 200-322: The symbol is not recognized and will be ignored.
I also changed it to &&n., but I still got a warning message.
WARNING: Apparent symbolic reference N not resolved.
Please refer to the links provided by @Reeza
This topic is well covered in this post.
http://www.sascommunity.org/wiki/Split_Data_into_Subsets
or here:
https://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/
I read the posts, but I still couldn't find a way to resolve the macro variable N.
Declare n as a global macro variable and it will resolve:
%global n;
Also keep your DATA statement inside the macro.
To resolve n in data classgroup&n. maingroup&n. subgroup&n.; you need n to be a global macro variable, not a local one. I am sorry that I didn't do that in my demo. I assumed you would grasp that.
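A minimal sketch of the "keep your DATA statement inside the macro" idea, under the assumption that the %DO index n only holds the loop values while the macro is running, so both the data set list and the IF blocks are generated inside one macro. The macro name split_groups and its parameters are hypothetical; test and group are taken from the thread.

```sas
/* Hypothetical macro: generates the DATA statement and the IF blocks together. */
%macro split_groups(in=test, from=100, to=700, by=100);
  %local n;

  /* First loop: list every output data set on the DATA statement. */
  data
    %do n=&from %to &to %by &by;
      classgroup&n maingroup&n subgroup&n
    %end;
  ;  /* this semicolon ends the DATA statement */
    set &in;

    /* Second loop: one IF block per range, writing to the matching data sets. */
    %do n=&from %to &to %by &by;
      if &n <= group < %eval(&n+&by) then do;
        output classgroup&n;
        if group=&n then output maingroup&n;
        else output subgroup&n;
      end;
    %end;
  run;
%mend split_groups;

%split_groups(in=test)
```

Running it with options mprint; shows the generated DATA step in the log, which makes it easy to check that every data set named in an OUTPUT statement also appears on the DATA statement.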
Hi Novinosrin:
I added %global n as shown below. It still didn't work.
%macro generate_statements;
%do n=100 %to 700 %by 100;
if &n <= group < %eval(&n+100) then
do;
output classgroup&n;
if group=&n then output maingroup&n;
else output subgroup&n.;
end;
%end;
%mend generate_statements;
/*call the macro in your datastep;*/
%global n;
data classgroup&n. maingroup&n. subgroup&n.;
set test;
%generate_statements;
run;
MLOGIC(GENERATE_STATEMENTS): %DO loop index variable N is now 700; loop will iterate again.
SYMBOLGEN: Macro variable N resolves to 700
SYMBOLGEN: Macro variable N resolves to 700
MPRINT(GENERATE_STATEMENTS): if 700 <= group < 800 then do;
SYMBOLGEN: Macro variable N resolves to 700
MPRINT(GENERATE_STATEMENTS): output classgroup700;
SYMBOLGEN: Macro variable N resolves to 700
NOTE: Line generated by the macro variable "N".
1 classgroup700
-------------
455
SYMBOLGEN: Macro variable N resolves to 700
NOTE: Line generated by the macro variable "N".
1 maingroup700
------------
455
MPRINT(GENERATE_STATEMENTS): if group=700 then output maingroup700;
SYMBOLGEN: Macro variable N resolves to 700
NOTE: Line generated by the macro variable "N".
1 subgroup700
-----------
455
MPRINT(GENERATE_STATEMENTS): else output subgroup700;
MPRINT(GENERATE_STATEMENTS): end;
MLOGIC(GENERATE_STATEMENTS): %DO loop index variable N is now 800; loop will not iterate again.
MLOGIC(GENERATE_STATEMENTS): Ending execution.
ERROR 455-185: Data set was not specified on the DATA statement.
@ybz12003 Please accept my apologies for the carelessness on my part.
See if this helps and let me know. I have split it into two macros to make it easier to follow.
%macro data_statement;
%global macrovar;
%do n=100 %to 700 %by 100;
%if %length(&macrovar)=0 %then %let macrovar=%sysfunc(catx( %str( ) , classgroup&n., maingroup&n. subgroup&n.));
%else %let macrovar=%sysfunc(catx( %str( ) ,&macrovar. classgroup&n., maingroup&n. subgroup&n.));
%end;
%put &macrovar;
%mend data_statement;
%data_statement
%macro generate_if_statements;
%do n=100 %to 700 %by 100;
if &n <= group < %eval(&n+100) then
do;
output classgroup&n;
if group=&n then output maingroup&n;else output subgroup&n;
end;
%end;
%mend generate_if_statements;
/*your final*/
data &macrovar;
set have;
%generate_if_statements
run;
data test;
call streaminit(12345678);
do i=1 to 10000;
group=100+ceil(800*rand('uniform')); output;
end;
drop i;
run;
data test;
set test;
id=int(group/100) ;
run;
proc sort data=test;
by id group;
run;
data _null_;
if _n_=1 then do;
if 0 then set test;
declare hash h(multidata:'y');
h.definekey('id');
h.definedata('group');
h.definedone();
end;
do until(last.id);
set test;
by id;
h.add();
end;
h.output(dataset:cats('classgroup_',id*100));
h.clear();
run;
data _null_;
set sashelp.vmember(where=(libname='WORK' and upcase(memname) like 'CLASSGROUP%')) end=last;
temp=scan(memname,-1,'_');
call execute(catt('data maingroup_',temp ,' subgroup_',temp,';set classgroup_',temp,'; '));
call execute(catt('if group=',temp,' then output maingroup_',temp,';else output subgroup_',temp,';'));
call execute('run;');
run;
Thanks for your great suggestion, Ksharp. Unfortunately, the code didn't work. Still, I appreciate your effort to help.
A million thanks for everyone's expert suggestions. I am overwhelmed by tons of macro code! It takes me a long time to digest it. Although I still don't understand some of it, novinosrin's code worked. LOL!