Solved
Contributor
Posts: 57

# Single Mining View

Hi All,

I am wondering if its possible to solve the following problem in SAS:

I have a dataset as follows -

 id var1 var2 8250 1 23987 8250 45 13841 8250 79 13223 8250 24 10031 8250 54 8849 8250 88 8284 8250 51 4733 8250 71 4109 14984 80 1695 14984 55 195 14984 83 184 14984 26 166 14984 84 141 14984 40 127 14984 17 122 14984 96 120 14984 10 96

I just would like to have a unique record per column id which takes the max value from var 2 and gives -

 id var1_2 8250 1 45 79 24 54 14984 80 55 83 26 84

Is thsi possible in Base SAS?

KR

SK

Accepted Solutions
Solution
‎07-25-2016 11:50 PM
Super User
Posts: 10,787

## Re: Single Mining View

```So you want the max five value for each ID ?

data have;
infile cards expandtabs truncover;
input id	var1	var2;
datalines;
8250	1	23987
8250	51	4733
8250	45	13841
8250	71	4109
8250	24	10031
8250	54	8849
8250	88	8284
8250	79	13223
14984	10	96
14984	55	195
14984	26	166
14984	84	141
14984	40	127
14984	83	184
14984	17	122
14984	80	1695
14984	96	120
;
run;
proc sort data=have;
by id descending var2;
run;
data temp;
set have;
by id;
if first.id then n=0;
n+1;
if n le 5;
run;
data want;
length var1_2 \$ 2000;
do until(last.id);
set temp;
by id;
var1_2=catx(' ',var1_2,var1);
end;
keep id var1_2;
run;

```

All Replies
Super Contributor
Posts: 326

## Re: Single Mining View

[ Edited ]

[1] Your input data seems to be sorted both ID and VAR2. Here is a solution which requires data to be sorted by ID alone.

[2] You seem to want the first 5 largest values of Var2, although it is not mentioned. The present solution can handle as many values as you want.

[3] An array approch is used. It requires the largest ID-Group size which is 9 in your data. Again this size can be done programmatically. Proc Sql may be used but I use a data step which will be faster than SQL when data set is large.

[4] Your input data set is slightly disordered by Var2 within ID.

Here is the program:

``````data have;
input id	var1	var2;
datalines;
8250	1	23987
8250	51	4733
8250	45	13841
8250	71	4109
8250	24	10031
8250	54	8849
8250	88	8284
8250	79	13223
14984	10	96
14984	55	195
14984	26	166
14984	84	141
14984	40	127
14984	83	184
14984	17	122
14984	80	1695
14984	96	120

;
run;

data _null_;
retain max  0;
do i = 1 by 1 until(last.id);
set have end = eof;
by id;

end;
if i > max then max = i;
if eof then call symputx('groupsize', max);
run;

%let h = 5;

data want;
array k[&groupsize] _temporary_;
array m[&groupsize] _temporary_;
length var1_2 \$50;
do i = 1 by 1 until(last.id);
set have;
by id;
k[i] = var2;
m[i] = var1;
if first.id then call missing(var1_2);
end;
call catx(' ', var1_2, id);
do i = 1 to &h;
b = largest(i, of k[*]);
v = m[whichn(b, of k[*])];
call catx(' ', var1_2, v);
end;
keep var1_2;
run;
``````

I hope you will not have any problem to understand the program and this will be an acceptable solution to you.

Edited . . . . .

In the above program ID goes as part of new variable, VAR1_2. If you don't want it, delete

``call catx(' ', var1_2, id);``

and change the last statement to

``keep ID var1_2;``

If you don't want a general program but just a simple program for the given data(where ID and VAR2 are sorted), few statements will do as in:

``````data want1;
do i = 1 by 1 until(last.id);
set have;
by id;
length var1_2 \$50;
if i < 6 then call catx(' ', var1_2, var1);
else continue;
end;
keep id var1_2;
run;``````

Solution
‎07-25-2016 11:50 PM
Super User
Posts: 10,787

## Re: Single Mining View

```So you want the max five value for each ID ?

data have;
infile cards expandtabs truncover;
input id	var1	var2;
datalines;
8250	1	23987
8250	51	4733
8250	45	13841
8250	71	4109
8250	24	10031
8250	54	8849
8250	88	8284
8250	79	13223
14984	10	96
14984	55	195
14984	26	166
14984	84	141
14984	40	127
14984	83	184
14984	17	122
14984	80	1695
14984	96	120
;
run;
proc sort data=have;
by id descending var2;
run;
data temp;
set have;
by id;
if first.id then n=0;
n+1;
if n le 5;
run;
data want;
length var1_2 \$ 2000;
do until(last.id);
set temp;
by id;
var1_2=catx(' ',var1_2,var1);
end;
keep id var1_2;
run;

```
Super Contributor
Posts: 326