SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
Asli_A
Fluorite | Level 6

Hello,

I am looking for a way to add the most frequent text value by group in my selection code. The selection that I am trying to do is in this order; 

1) The most frequent one

2) Common name

3) Data Source

3) Recency-the higher id the more recent data (I want to apply this in order to get rid of 1:2 frequency ratio bias) 

As an example; from the table below;

 

idNewidNameCommon name DataSource
11XXXX11
21XXYX10
31XXXX11
42AAA01
52AA00
62AAA00
73BBB10
83BB11


to the table below.

 

idNewidNameCommon name DataSource
11XXXX11
42AAA01
83BB11


Thank you very much!

4 REPLIES 4
andreas_lds
Jade | Level 19

Please add an explanation why three rows in your expected result are in the table. And you should add some more lines, so that each case you describe is fulfilled at least once.

 

Last point: if you want to get answers faster, than post the example data as datastep. This avoids unnecessary guessing data-types etc.

KachiM
Rhodochrosite | Level 12
data have;
input id Newid Name $ Common_name DataSource;
datalines;
1  1  XXXX  1  1
2  1  XXYX  1  0
3  1  XXXX  1  1
4  2  AAA   0  1
5  2  AA 0  0
6  2  AAA   0  0
7  3  BBB   1  0
8  3  BB    1  1
;
run;

data want;
   set have;
   by Newid;
   if last.Newid;
run;

Obs 	id 	Newid 	Name 	Common_name 	DataSource
1 	3 	1 	XXXX 	1 	1
2 	6 	2 	AAA 	0 	0
3 	8 	3 	BB 	1 	1

Looks to be simple yet does this meet your need.

 

regards.

DataSP

Asli_A
Fluorite | Level 6

Hello datasp,

Actually the example does not represent my master dataset. The most frequent name could be in the first, last or in between therefore I can't use the last function. A snapshot from the real dataset from which I should be able to get 'ANJELICA' 🙂

 

idNewidNameCommon name DataSource
217378742282ANGELICA00
91459642282ANGELICA00
61877542282ANJELCA00
193432142282ANJELICA00
225409242282ANJELICA00
153948542282ANJELICA00
147802842282ANJELICA00
121908642282ANJELICA00
73023342282ANJELICA00
127514242282ANJELICA00
14875942282ANJELICA00
157816042282ANJELICA00
67469642282ANJELIKA00
19403642282ANJELIKA00
37865442282ANJELIKA00
novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
input id :$10.	Newid $	Name :$10.	Commonname 	DataSource;
cards;
2173787	42282	ANGELICA	0	0
914596	42282	ANGELICA	0	0
618775	42282	ANJELCA	0	0
1934321	42282	ANJELICA	0	0
2254092	42282	ANJELICA	0	0
1539485	42282	ANJELICA	0	0
1478028	42282	ANJELICA	0	0
1219086	42282	ANJELICA	0	0
730233	42282	ANJELICA	0	0
1275142	42282	ANJELICA	0	0
148759	42282	ANJELICA	0	0
1578160	42282	ANJELICA	0	0
674696	42282	ANJELIKA	0	0
194036	42282	ANJELIKA	0	0
378654	42282	ANJELIKA	0	0
;

proc sql;
create table want(drop=_c) as
select *
from (select *,count( name) as _c from have group by newid,name) 
group by newid
having max(_c)=_c;
quit;

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 4983 views
  • 1 like
  • 4 in conversation