I have some data with 500 records. I need to sum the numbers of the duplicate record. I did it using PROC SQL, but it change the order of the records at output.
Can anyone suggest me how PROC SQL plays with records. Also if i want to keep the original order of the records, how can i do it.
Thanks in advance.
Example
Var1 Var2 Var3
100 135 ase
123 143 ase
124 53 per
400 250 fge
100 300 fge
What i need is:
Var1 Var2 Var3
223 278 ase
124 53 per
500 550 fge
Hi @DaRE
I suggest to create a temporary variable to assign the initial order:
data have;
input Var1 Var2 Var3 $;
datalines;
100 135 ase
123 143 ase
124 53 per
400 250 fge
100 300 fge
;
run;
data have2;
set have;
by Var3 notsorted;
if first.Var3 then count+1;
run;
proc sql;
create table want as
select sum(Var1) as Var1, sum(Var2) as Var2, Var3
from have2
group by Var3, count
order by count;
quit;
proc print;
Hope this helps!
Best,
Yes, SQL does that.
Try using the ORDER BY clause in PROC SQL.
Good morning Sir @PaigeMiller Very clever. Honestly,I didn't believe at first that the order by could be used with monotonic() to keep the original order considering group by does a presort and the likelihood of keeping the original order would be highly ambiguous for SQL processor to choose which observation number within a by group(being unique)
So For what's it's worth, I tested adding an A group as the last group on purpose. It seemed to work:)
data have;
input Var1 Var2 Var3 $;
datalines;
100 135 ase
123 143 ase
124 53 per
400 250 fge
100 300 fge
100 135 a
123 143 a
;
run;
proc sql;
create table want as
select sum(Var1) as Var1, sum(Var2) as Var2, Var3
from (select *,monotonic() as rn from have)
group by var3
order by rn;
quit;
I did not suggest using MONOTONIC(). In fact, I always advise against using it, as it is undocumented and unsupported and could produce incorrect results.
Yes sorry. What I meant was the idea of ORDER BY clause. However, for ORDER BY clause to work here, we need a way to have the rowcount for each group as the OP wants the output in the original order. Of course this is easy should we resort to pre datastep, but assuming OP may want an all PROC SQL solution, I can't think of an alternative to monotonic() prior to order by.
You are right ... there's no obvious way to maintain the order of the records in SQL, some part of this must be done in a data step. I didn't look carefully enough.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.