Data have;
input id;
cards;
1
1
1
2
2
44
44
44
44
; run;
duplicate records in output dataset by using only proc sql?
like below
data want
id
1
1
2
44
44
44
can i get code for this ? thanks in advance.
Data have;
input id;
cards;
1
1
1
2
2
44
44
44
44
; run;
duplicate records in output dataset by using only proc sql?
like below
data want
id
1
1
2
44
44
44
can i get code for this ? thanks in advance.
Why limit yourself to SQL?
What have you tried so far?
Something like this:
Data have;
input id;
cards;
1
1
1
2
2
44
44
44
44
; run;
proc sql;
create table tmp as
select id, monotonic() as i from have
order by id
;
select a.*
from
tmp as a
group by a.id
having a.i > min(a.i)
;
delete table tmp;
quit;
Not the very optimal I'd say.
Bart
What if, in addition to variable ID, you have other "satellite" variables A, B, and C, which vary within a given ID? What criterion would you use to select which records are the extra ones?
Not using a data step for this is outright stupid:
data want;
set have;
by id;
if first.id or not last.id;
run;
@Saikiran_Mamidi wrote:
In proc sql only ?
Why be stupid? Isn't it much better to do the clever thing?
Well, if have has 600 million records but only 1 million unique key value, you would would want to avoid sorting the 600 million records.
If that is the scenario, you could read WANT sequentially only once, use a datastep with hashing to look up whether you have a match.
If so, insert the first match into the hash table (of course the other non-key saved variables might not be want you want).
At then at the end you output the hash table to a physical table.
You need to ensure that the million full rows will fit into memory. If that is not possible then
create a macro that read HAVE in n pieces using options firstobs and obs.
Then you need a final part that re-reads the reduced pieces and consolidates them into a final HAVE.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.