Data have;
input id;
cards;
1
1
1
2
2
44
44
44
44
; run;
duplicate records in output dataset by using only proc sql?
like below
data want
id
1
1
2
44
44
44
can i get code for this ? thanks in advance.
Data have;
input id;
cards;
1
1
1
2
2
44
44
44
44
; run;
duplicate records in output dataset by using only proc sql?
like below
data want
id
1
1
2
44
44
44
can i get code for this ? thanks in advance.
Why limit yourself to SQL?
What have you tried so far?
Something like this:
Data have;
input id;
cards;
1
1
1
2
2
44
44
44
44
; run;
proc sql;
create table tmp as
select id, monotonic() as i from have
order by id
;
select a.*
from
tmp as a
group by a.id
having a.i > min(a.i)
;
delete table tmp;
quit;
Not the very optimal I'd say.
Bart
What if, in addition to variable ID, you have other "satellite" variables A, B, and C, which vary within a given ID? What criterion would you use to select which records are the extra ones?
Not using a data step for this is outright stupid:
data want;
set have;
by id;
if first.id or not last.id;
run;
@Saikiran_Mamidi wrote:
In proc sql only ?
Why be stupid? Isn't it much better to do the clever thing?
Well, if have has 600 million records but only 1 million unique key value, you would would want to avoid sorting the 600 million records.
If that is the scenario, you could read WANT sequentially only once, use a datastep with hashing to look up whether you have a match.
If so, insert the first match into the hash table (of course the other non-key saved variables might not be want you want).
At then at the end you output the hash table to a physical table.
You need to ensure that the million full rows will fit into memory. If that is not possible then
create a macro that read HAVE in n pieces using options firstobs and obs.
Then you need a final part that re-reads the reduced pieces and consolidates them into a final HAVE.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.