## how to find duplicated observations by amount

Occasional Contributor
Posts: 18

# how to find duplicated observations by amount

Hi, Pros:

Here is my senario:

Data structure as follows:
*******************************************
id date service type amt
1 12/1/2009 station1 A +5
1 12/1/2009 station1 A +15
1 12/1/2009 station1 A -15

2 12/1/2009 station1 B +5
2 12/1/2009 station1 B -5
2 12/1/2009 station1 B +25
2 12/1/2009 station1 B -25
2 12/1/2009 station1 B 20
*****************************************

Identify unique observation by sort by id date service type;

The following result is expected. Many thanks. Tulip
***************************************
id date service type amt
1 12/1/2009 station1 A +5
2 12/1/2009 station1 B 20
***************************************
Super Contributor
Posts: 3,176

## Re: how to find duplicated observations by amount

Please provide specific expectation so one will not make their own interpretation. Specifically, I see where you chose the "first" occurence of variables "id date service type" with the first grouping but you chose the "last" occurences of the 2nd group.

For this objective, regardless, you should explore using PROC SORT and a DATA step approach using a BY statement and IF FIRST. and/or IF LAST. to perform your data selection.

data step by group processing site:sas.com

Scott Barry
SBBWorks, Inc.
Not applicable
Posts: 0

## Re: how to find duplicated observations by amount

Hi tulip

I'm not sure if I understand the problem exactly, however this code perform expected results.

DATA station;
INPUT id date:\$10. service \$ type \$ amt;
CARDS;
1 12/1/2009 station1 A +5
1 12/1/2009 station1 A +15
1 12/1/2009 station1 A -15
2 12/1/2009 station1 B +5
2 12/1/2009 station1 B -5
2 12/1/2009 station1 B +25
2 12/1/2009 station1 B -25
2 12/1/2009 station1 B 20
;
RUN;

proc sql;
select id, date, service, type, amt
from station
group by id, date, service, type, abs(amt)
having count(*) < 2
;
quit;

best regards

Tomaž
Occasional Contributor
Posts: 18

## Re: how to find duplicated observations by amount

Hi, guys:

It is so nice of you. I appreciate your help very much.

The issue in this case can not be taken care of by using sort and by first....

The final observation needs to contain the correct amount.

DATA station;
INPUT id date:\$10. service \$ type \$ amt;
CARDS;
1 12/1/2009 station1 A +5
1 12/1/2009 station1 A +15
1 12/1/2009 station1 A -15

2 12/1/2009 station1 B +5
2 12/1/2009 station1 B -5

2 12/1/2009 station1 B 20
2 12/1/2009 station1 B +25
2 12/1/2009 station1 B -25

3 12/1/2009 station1 B +55
3 12/1/2009 station1 B -55
3 12/1/2009 station1 B 55
;
RUN;

proc sql;
select id, date, service, type, amt
from station
group by id, date, service, type, abs(amt)
having count(*) < 2
;
quit;

final expects the following.
1 12/1/2009 station1 A +5

2 12/1/2009 station1 B 20

3 12/1/2009 station1 B 55 Message was edited by: tulip
Occasional Contributor
Posts: 18

## Re: how to find duplicated observations by amount

Hi, Tomaz:

there is length limit so I do not show all.
DATA station;
INPUT id date:\$10. service \$ type \$ amt;
CARDS;
3 12/1/2009 station1 B +55
3 12/1/2009 station1 B -55
3 12/1/2009 station1 B 55
;
RUN;

proc sql;
select id, date, service, type, amt
from station
group by id, date, service, type, abs(amt)
having count(*) < 2
;
quit;

final expects the following.
1 12/1/2009 station1 A +5

2 12/1/2009 station1 B 20

3 12/1/2009 station1 B 55
Occasional Contributor
Posts: 18

## Re: how to find duplicated observations by amount

final show is 3 12/1/2009 station1 B +55
Not applicable
Posts: 0

## Re: how to find duplicated observations by amount

I see
maybe using sum instead abs is more appropriate.

Tomaž
Occasional Contributor
Posts: 18

## Re: how to find duplicated observations by amount

Belated thanks, Tomaz. Further help is needed.

when I replace abs to sum function. Error message came up. Please advise. Thanks.

DATA station;
INPUT id date:\$10. service \$ type \$ amt;
CARDS;
3 12/1/2009 station1 B +55
3 12/1/2009 station1 B -55
3 12/1/2009 station1 B +55
;
RUN;
proc sql;
select id, date, service, type, amt
from station
group by id, date, service, type, abs(amt)
having count(*) < 2 ;
;
quit; Message was edited by: tulip
Super Contributor
Posts: 3,176

## Re: how to find duplicated observations by amount

As I stated previously, you can use IF LAST. THEN OUTPUT; with a sorted file using a BY statement and being the last variable in the BY. From your explanation, I believe the BY variables are ID, DATE (suggest inputting as a SAS NUMERIC date, not character), SERVICE and TYPE. Use EQUALS on the SORT to ensure the incoming order is maintained for your AMT values.

Again, see the recommended SAS support site DOC references for details, and also the EQUALS parameter discussion for PROC SORT.

Scott Barry
SBBWorks, Inc.
Discussion stats
• 8 replies
• 211 views
• 0 likes
• 3 in conversation