Hi Experts,
I need to select random entries from a large data set, with the condition that the sum of their values is close to a target value I choose at the beginning. It is easiest to explain with the following test data set.
Data test;
input name $ value;
datalines;
hjk 500
lku 985
ldu 689
lll 951
lkh 147
qwe 653
lkt 566
sads 658
;
run;
I want to randomly select names so that the sum of their values is close to 3000. For example, I could choose the names hjk, ldu, qwe, lkt, and sads, whose values sum to 3066.
How can I do this in SAS with a very large data set?
Thanks in advance.
Hi @Myurathan,
Try the following, step by step:
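/* Test data: VALUE is read as numeric so that it can be summed. */
data test;
  input name $ value;
  datalines;
hjk 500
lku 985
ldu 689
lll 951
lkh 147
qwe 653
lkt 566
sads 658
;
run;
/* Step 1: store each row's observation number and a random sort key. */
data test1;
  set test(keep=value) curobs=co;
  curobs = co;
  if _N_ = 1 then call streaminit(123); /* seed once, for a reproducible shuffle */
  r = rand("uniform");
run;
/* Step 2: shuffle the rows by sorting on the random key. */
proc sort data=test1;
  by r;
run;
/* Step 3: accumulate values in shuffled order until the total exceeds 3000. */
data test1(keep=curobs);
  set test1 end=eof;
  s + value;
  output;
  if s > 3000 or eof then
    do;
      /* NOBS = number of rows selected; also set at end of file so it always exists */
      call symputx("NOBS", _N_, "G");
      stop;
    end;
run;
/* Step 4: fetch the selected rows from TEST by observation number. */
data sample;
  array S[&NOBS.] _temporary_;
  do until(eof);
    set test1 end=eof;
    _I_ + 1;
    S[_I_] = curobs;
  end;
  drop _I_ curobs;
  call sortn(of S[*]); /* read TEST in its original row order */
  do _I_ = lbound(S) to hbound(S);
    point = S[_I_];
    set test point=point;
    output;
  end;
  stop;
run;
All the best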
Bart
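A possible variation, offered only as a minimal sketch: the steps above stop once the running total has passed 3000, so the result always overshoots. The code below instead stops as soon as adding the next shuffled row would move the total further from the target. It assumes the TEST data set from the question with VALUE stored as numeric. The macro variable TARGET and the data set names SHUFFLED and SAMPLE2 are placeholders of mine, not names from the thread.

```sas
/* Hypothetical macro variable holding the target sum (not in the original post). */
%let target = 3000;

/* Attach a random key to every row of TEST (assumed to exist, VALUE numeric). */
data shuffled;
  set test;
  if _N_ = 1 then call streaminit(123);  /* seed once, for a reproducible shuffle */
  r = rand("uniform");
run;

/* Shuffle the rows by sorting on the random key. */
proc sort data=shuffled;
  by r;
run;

/* Keep rows while each one brings the running total S closer to the target. */
data sample2(drop=r s);
  set shuffled;
  if abs(s + value - &target.) < abs(s - &target.) then
    do;
      s + value;   /* sum statement: S is retained and starts at 0 */
      output;
    end;
  else stop;       /* the next row would move the total away from the target */
run;

/* Quick check of how many rows were picked and what they add up to. */
proc sql;
  select count(*) as n_selected, sum(value) as total_value
  from sample2;
quit;
```

Stopping at the first row that does not help means the step reads only a prefix of the shuffled data. Replacing `else stop;` with nothing would scan every row and also pick up later, smaller values that still fit, at the cost of a full pass. Both versions still sort the whole data set once, which is usually the expensive part for very large inputs.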