Quartz | Level 8

## Most frequent observation by row & number of distinct values by row

``````data have;
input id date1:date9. date2:date9. date3:date9.;
format date1 date9. date2 date9. date3 date9.;
cards;
1 01JUN2001 01JUN2001 01JUN2001
2 01JUN2001 01JUN2001 05MAY1990
3 05MAY1990 05MAY1990 01JUN2001
4 01JUN2001 05MAY1990 .
;
run;

data want;
input id date1:date9. date2:date9. date3:date9. date_adj:date9. date_count:best8.;
format date1 date9. date2 date9. date3 date9. date_adj date9. date_count best8.;
cards;
1 01JUN2001 01JUN2001 01JUN2001 01JUN2001 1
2 01JUN2001 01JUN2001 05MAY1990 01JUN2001 2
3 05MAY1990 05MAY1990 01JUN2001 05MAY1990 2
4 01JUN2001 05MAY1990 . . 2
;
run;``````

From data have to data want:

• Create a new column named 'date_adj', it would be the most frequent observation in that row; if same frequency, then output blank, or first/last appeared non-blank value (not really fussy about this output).
• Create a new column named 'date_count', count the number of distinct values by that row.

I understand SAS can do this by column, not sure how to achieve it by rows.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Quartz | Level 8

## Re: Most frequent observation by row & number of distinct values by row

The following codes proved to be working:

``````%macro checkUnique(fileName=,col=,ID=);
%global colList;

/* get all date columns' names */
proc sql noprint;
select name into: colList separated by ' '
from dictionary.columns
where libname = 'WORK' and memname contains upcase("&fileName") and name contains upcase("&col");
quit;

/* transpose back to flat table */
set work.&fileName.(rename= (&ID.=ID));
array &col._array{*} &colList.;
do i = 1 to dim(&col._array);
frequent = &col._array{i};
position = i;
output;
end;
format frequent date9.;
run;

/* get the frequency of each date value */
proc sql noprint;
create table temp2 as select
frequent,
count(frequent) as count
/* count(distinct(frequent)) as num_unique */
from temp1
quit;

proc sort data= temp2(where=(frequent ne .));
by ID_adj descending count descending frequent;
run;

/* get the most frequent date */
set temp2;
format frequent date9.;
run;

%mend checkUnique;

%checkUnique(fileName=test,col=date,ID=id);``````

7 REPLIES 7
Super User

## Re: Most frequent observation by row & number of distinct values by row

Transpose to a row structure via PROC TRANSPOSE, obtain solution, and merge it back into main dataset.

Rhodochrosite | Level 12

## Re: Most frequent observation by row & number of distinct values by row

By using case statements but we careful in case statement if below query has missed any scenario. But you get idea by seeing this query.

```proc sql;
create table want as
select date1, date2, date3,
case when date1=date2 and date1 =date3 then date1
when date1 =date2 and date1<>date3 then date1
when date1 =date3 and date1 <>date2 then date1
when date2= date3 and date1<>date3 then date2
when date1<>date2 and date2<>date3 then .
case when date1=date2 and date1 =date3 then 1
when date1 =date2 and date1<>date3 then 2
when date1 =date3 and date1 <>date2 then 2
when date2= date3 and date1<>date3 then 2
when date1<>date2 and date2<>date3 then 2
end as date_count
from have ;
quit;```

Quartz | Level 8

## Re: Most frequent observation by row & number of distinct values by row

Hi @kiranv_

Thanks for your approach. Because in my real data I have 8 columns, it might be too hard to build the logic by using 'Case'.

Rhodochrosite | Level 12

## Re: Most frequent observation by row & number of distinct values by row

Yes. @Ksharp approach is best to follow

Super User

## Re: Most frequent observation by row & number of distinct values by row

``````data have;
input id date1:date9. date2:date9. date3:date9.;
format date1 date9. date2 date9. date3 date9.;
cards;
1 01JUN2001 01JUN2001 01JUN2001
2 01JUN2001 01JUN2001 05MAY1990
3 05MAY1990 05MAY1990 01JUN2001
4 01JUN2001 05MAY1990 .
;
run;
proc iml;
use have(keep=date:) nobs nobs;
read all var _all_ into x;
close;
date_count=j(nobs,1,.);
do i=1 to nobs;
temp=x[i,];
call tabulate(level,freq,temp);
if sum(freq=max(freq)) ^= 1 then date_adj[i]=.;
date_count[i]=ncol(level);
end;
append;
close;
run;

data want;
merge have temp;
run;``````
Quartz | Level 8

## Re: Most frequent observation by row & number of distinct values by row

Thanks @Ksharp, I believe your codes can help a lot of people tackle similar problems. We don't have IML however.

Quartz | Level 8

## Re: Most frequent observation by row & number of distinct values by row

The following codes proved to be working:

``````%macro checkUnique(fileName=,col=,ID=);
%global colList;

/* get all date columns' names */
proc sql noprint;
select name into: colList separated by ' '
from dictionary.columns
where libname = 'WORK' and memname contains upcase("&fileName") and name contains upcase("&col");
quit;

/* transpose back to flat table */
set work.&fileName.(rename= (&ID.=ID));
array &col._array{*} &colList.;
do i = 1 to dim(&col._array);
frequent = &col._array{i};
position = i;
output;
end;
format frequent date9.;
run;

/* get the frequency of each date value */
proc sql noprint;
create table temp2 as select
frequent,
count(frequent) as count
/* count(distinct(frequent)) as num_unique */
from temp1
quit;

proc sort data= temp2(where=(frequent ne .));
by ID_adj descending count descending frequent;
run;

/* get the most frequent date */
set temp2;
format frequent date9.;