BookmarkSubscribeRSS Feed
Satori
Quartz | Level 8

this is my data:

 

data have;
input obs id $12. code $2. date $10. months $2. curr $7. assets $12.;
cards;
1 AU000668057 U1 20181231 6 0.7058 299965
2 AU000668057 U1 20180630 6 0.7391 264598
3 AU000721380 C1 20180930 3 0.7222 1
4 AU000721380 U2 20180930 3 0.7222 0
5 AU088749008 C1 20181231 6 0.7058 56058871
6 AU088749008 U1 20181231 6 1 7942 6000
7 AU097088689 C1 20181231 6 0.7058 67325556
8 AU097088689 U2 20181231 6 0.7058 346680489
9 AU099555290 C1 20181231 6 0.7058 271027
10 AU099555290 U1 20181231 6 0.7058 270949
11 AU117763443 C1 20181231 6 0.7058 6453835
12 AU117763443 U1 20181231 6 0.7058 6453835
13 AU120658497 C1 20190331 9 0.7087 67327
14 AU120658497 U1 20190331 9 0.7087 67859
15 AU121539375 C1 20181231 6 0.7058 2221858
16 AU121539375 U1 20181231 6 0.7058 15512072
17 AU145951622 C1 20181231 6 0.7058 3731565
18 AU145951622 U1 20181231 6 0.7058 3731440
19 AU151363129 C1 20181230 6 0.7058 26589603
20 AU151363129 U1 20181231 6 0.7058 26589603
21 AU603037341 C1 20181231 6 0.7058 5021061
22 AU603037341 U1 20181231 6 0.7058 5021061
23 AU617838418 C1 20180630 16 0.7391 352551
24 AU617838418 U1 20180630 16 0.7391 352798
25 AU619182786 U1 20181231 6 0.7058 354405
26 AU619182786 U1 20180630 13 0.7391 221771
27 AU620316252 U1 20181231 18 0.7058 3082142
28 AU620316252 U1 20180630 18 0.7391 3228000
29 AU620606840 U1 20180930 14 0.7222 86664
30 AU620606840 U1 20180630 14 0.7391 89000
31 AU620758358 C1 20180630 11 0.7391 0
32 AU620758358 U1 20180630 11 0.7391 0
33 AU620888548 C1 20180630 10 0.7391 791576
34 AU620888548 U1 20180630 11 0.7391 791576
35 AU621067678 C1 20180630 10 0.7391 0
36 AU621067678 U1 20180630 11 0.7391 0
37 AU621105824 C1 20180630 11 0.7391 1944572
38 AU621105824 U1 20180630 11 0.7391 1944572
39 AU621402588 C1 20180630 10 0.7391 91441442
40 AU621402588 U1 20180630 10 0.7391 968960
41 AU621541204 C1 20180630 10 0.7391 815966
42 AU621541204 U1 20180630 10 0.7391 816332
43 AU621969855 U1 20181231 6 0.7058 5293500
44 AU621969855 U1 20180630 9 0.7391 6547686
45 AU622014788 U1 20190331 9 0.7087 588612
46 AU622014788 U1 20180630 9 0.7391 779905
47 AU622116992 U1 20181231 6 0.7058 10986483
48 AU622116992 U1 20180630 9 0.7391 19967523
49 AU622132530 U1 20181231 6 0.7058 217065
50 AU622132530 U1 20180630 9 0.7391 118698
;

I want to remove the observations with same id, same date and same months. like for example observations 3 and 4, and keep the other observations that do not behave like this.

6 REPLIES 6
Satori
Quartz | Level 8
done! sorry, forgot to add the formats
SASJedi
SAS Super FREQ

If you want to keep ONE of the duplicates and remove the rest, you could use PROC SORT.  This code does that, and saves the deleted rows in a data set named deleted for your review:

proc sort data=have 
               out=want 
               dupout=deleted nodupkey;
	by id date months;
run;

If you want to get rid of ALL of the records when there are duplicates, I'd use PROC SQL and a DATA step:

proc sql noprint ;
select quote(ID), date, months
	into :droplist separated by ','
	from have
	group by ID, date, months
	having count(*) >1
;
quit;
%put NOTE: Dropping these IDs: %superq(droplist);
data want;
	set have;
	where ID not in (&droplist);
run;

Hope this helps.

Mark

 

Check out my Jedi SAS Tricks for SAS Users
Satori
Quartz | Level 8
I got the following ERROR: The length of the value of the macro variable DROPLIST (65540) exceeds the maximum length (65534). The value has been truncated to 65534 characters.

and below I have many of these:
ERROR: The text expression length (65535) exceeds maximum length (65534). The text expression has been truncated to 65534 characters.

Kurt_Bremser
Super User

Keep the data in a dataset:

proc sql noprint ;
create table drop as
  select ID, date, months
  from have
  group by ID, date, months
  having count(*) >1
;
quit;

data want;
set have;
if _n_ = 1
then do;
  declare hash drop (dataset:"drop");
  drop.definekey("id","date","months");
  drop.definedone();
end;
if drop.check() ne 0;
run;
s_lassen
Meteorite | Level 14

If you want to remove all the records in duplicate sets, I would suggest a data step solution:

proc sort data=have;                                                                                                                    
  by id date months;                                                                                                                    
run;                                                                                                                                    
                                                                                                                                        
data want;                                                                                                                              
  set have;                                                                                                                             
  by id date months;                                                                                                                    
  if first.months and last.months;                                                                                                      
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 508 views
  • 6 likes
  • 4 in conversation