08-13-2015 12:14 AM
I saw the following question and answer from
My question is: which option(s) should I add to Hai.kuo's code so that the final datasets class_1 and class_2 are sorted by year?
I sorted the dataset 'temp' by class and year before running Hai.kuo's code, but the output datasets class_1 and class_2 are still not sorted by year.
data temp;
input CLASS VAR year;
datalines;
1 1 1990
1 2 1996
1 . 1992
2 6 1998
2 3 1993
2 0 1995
;
run;
/*Hai.kuo's code*/
data _null_;
if _n_=1 then do;
declare hash h();
h.definekey('_n_');
h.definedata('class','var','year'); /*I add "year" here to include variable "year"*/
h.definedone();
end;
set temp;
by class notsorted;
rc=h.replace();
if last.class then do;
rc=h.output(dataset:cats('class_',class));
rc=h.clear();
end;
run;
08-13-2015 08:29 AM
I will do it for HaiKuo.
data temp;
input CLASS VAR year;
datalines;
1 1 1990
1 2 1996
1 . 1992
2 6 1998
2 3 1993
2 0 1995
;
run;
data _null_;
if _n_=1 then do;
declare hash h(multidata:'y',ordered:'y');
h.definekey('class','year');
h.definedata('class','var','year');
h.definedone();
end;
set temp;
by class ;
h.add();
if last.class then do;
h.output(dataset:cats('class_',class));
h.clear();
end;
run;
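For comparison, here is a minimal sketch of the smallest change to Hai.kuo's original step. It rests on two assumptions: 'temp' is first sorted by class and year, and the only option added is ordered:'a' on the hash declaration. Without an ordered option the hash object returns rows in an undefined order, which is why pre-sorting 'temp' alone did not help. With ordered:'a' and _n_ as the key, rows should come out in ascending _n_, that is, in the order they were read.

```sas
/* Assumption: sort the input first so that reading order = year order within class */
proc sort data=temp;
  by class year;
run;

data _null_;
  if _n_=1 then do;
    declare hash h(ordered:'a');          /* only change: output in ascending key order */
    h.definekey('_n_');
    h.definedata('class','var','year');
    h.definedone();
  end;
  set temp;
  by class;
  rc=h.replace();                         /* _n_ is unique, so every row is kept */
  if last.class then do;
    rc=h.output(dataset:cats('class_',class));
    rc=h.clear();
  end;
run;
```

The version above with 'class' and 'year' as keys does not need the input sorted by year, so it is the more robust choice if 'temp' is only grouped by class.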
08-13-2015 02:27 AM
Before doing something complicated that is hard to decipher for the next person who has to maintain your code (which could mean you!), just use proc sort to create the order you want.
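A minimal sketch of that approach, assuming the two class values (1 and 2) are known in advance and using the hypothetical dataset name temp_sorted:

```sas
/* Sort once by class and year; temp_sorted is an illustrative name */
proc sort data=temp out=temp_sorted;
  by class year;
run;

/* Split with explicit OUTPUT statements; rows keep the sorted order */
data class_1 class_2;
  set temp_sorted;
  if class=1 then output class_1;
  else if class=2 then output class_2;
run;
```

Because each output dataset receives its rows in the order they are read, class_1 and class_2 end up sorted by year. The trade-off is that the class values are hard-coded.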
08-13-2015 04:36 AM
I quite agree with KurtBremser here. The first question is why you want to split the dataset: what benefit do you expect that outweighs losing BY-group processing and aggregation on a single dataset? Most of the time you are far better off keeping the one dataset. For example, if you split the dataset, every operation thereafter has to be duplicated for each split dataset.
As for splitting a dataset, if it's based on class:
proc sort data=have;
by class year;
run;
data _null_;
set have;
by class; /* required so that first.class is defined */
if first.class then call execute(cats('data want',class,'; set have (where=(class=',class,')); run;'));
run;
08-14-2015 09:43 PM
Thank you all for your great ideas and comments. They were very helpful.