If I have a proc sql query with multiple columns but I want to tell it to give me the top 1000 claim numbers regardless if that claim returns 20 rows associated with it, would I put that in the where statement as an outobs or would I do something in the select statement saying I only want the first 1000 claim numbers?
I tried proc sql outobs = 1000;
but that only gave me a total of 77 claim numbers with 1000 rows and there were duplicate claim numbers because the claims have varying data associated. I actually want the first 1000 claim numbers regardless of if it produces 77000 rows in the long run.
try this:
proc sql outobs=1000;
create table temp as
select distinct claim_number
from have;
proc sql;
create table want as
select * from have
where claim_number
in (select claim_number from temp);
drop table temp;
quit;
Linlin
updated after Haikuo's comments
Can you post some date like what you have right now and how you want it? It would be easier to solve that way.
It is a lot more controlable if use DOW and _n_ counter:
data have;
input a @@;
cards;
1 2 2 3 3 3 4 5 5 5 6 6
run;
data want;
do until (last.a);
set have;
by a;
output;
end;
if _n_=3 then stop;
run;
proc print;run;
Regards,
Haikuo
data have;
input claim_no name $;
cards;
12345 test
23456 test1
23456 test2
12345 test1
45678 test3
45678 test4
;
run;
proc sql;
create table have1 as select distinct(claim_no) from have (obs=2);
quit;
DATA WANT;
LENGTH NAME $10.;
IF _N_=1 THEN DO;
IF 0 THEN SET HAVE1 (KEEP=CLAIM_NO);
DECLARE HASH HH(DATASET: 'HAVE1');
HH.DEFINEKEY ('CLAIM_NO');
HH.DEFINEDONE();
END;
DO UNTIL(EOF);
SET HAVE END=EOF;
IF HH.FIND()=0 THEN OUTPUT;
ELSE DO;
DELETE;
OUTPUT;
END;
END;
STOP;
RUN;
PROC PRINT; RUN;
try this:
proc sql outobs=1000;
create table temp as
select distinct claim_number
from have;
proc sql;
create table want as
select * from have
where claim_number
in (select claim_number from temp);
drop table temp;
quit;
Linlin
updated after Haikuo's comments
Linlin,
outobs=1000 will only get you at most 1000 obs no matte what, not 1000 claim_numbers each with multiple obs.
Haikuo
Hi Haikuo,
Thank you!
This one works:
data have;
input id @@;
cards;
1 1 8 8 3 3 4 4 5 5 6 6
;
proc sql outobs=3;
create table temp as
select distinct id
from have;
proc sql;
create table want as
select * from have
where id
in (select id from temp);
quit;
proc print data=want;
run;
There you go!Good Job!
One thing I am not positive is that SQL can't guarantee you are getting the 'Top 3' based on whatever existing order. it is easier to go with data step, in which maintaining the order of obs is a default setting.
Regards,
Haikuo
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.