BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Siddharth123
Obsidian | Level 7

Hi All,

I am wondering if its possible to solve the following problem in SAS:

I have a dataset as follows -

id var1 var2
8250 1 23987
8250 45 13841
8250 79 13223
8250 24 10031
8250 54 8849
8250 88 8284
8250 51 4733
8250 71 4109
14984 80 1695
14984 55 195
14984 83 184
14984 26 166
14984 84 141
14984 40 127
14984 17 122
14984 96 120
14984 10 96



I just would like to have a unique record per column id which takes the max value from var 2 and gives -

id var1_2
8250 1 45 79 24 54
14984 80 55 83 26 84

 

 

Is thsi possible in Base SAS?

 

KR

SK

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
So you want the max five value for each ID ?



data have; 
infile cards expandtabs truncover;
input id	var1	var2;
datalines;
8250	1	23987
8250	51	4733
8250	45	13841
8250	71	4109
8250	24	10031
8250	54	8849
8250	88	8284
8250	79	13223
14984	10	96
14984	55	195
14984	26	166
14984	84	141
14984	40	127
14984	83	184
14984	17	122
14984	80	1695
14984	96	120
;
run;
proc sort data=have;
 by id descending var2;
run;
data temp;
 set have;
 by id;
 if first.id then n=0;
 n+1;
 if n le 5;
run;
data want;
 length var1_2 $ 2000;
 do until(last.id);
  set temp;
  by id;
  var1_2=catx(' ',var1_2,var1);
 end; 
 keep id var1_2;
run;


View solution in original post

3 REPLIES 3
KachiM
Rhodochrosite | Level 12

[1] Your input data seems to be sorted both ID and VAR2. Here is a solution which requires data to be sorted by ID alone.

[2] You seem to want the first 5 largest values of Var2, although it is not mentioned. The present solution can handle as many values as you want.

[3] An array approch is used. It requires the largest ID-Group size which is 9 in your data. Again this size can be done programmatically. Proc Sql may be used but I use a data step which will be faster than SQL when data set is large.

[4] Your input data set is slightly disordered by Var2 within ID.

Here is the program:

 

data have; 
input id	var1	var2;
datalines;
8250	1	23987
8250	51	4733
8250	45	13841
8250	71	4109
8250	24	10031
8250	54	8849
8250	88	8284
8250	79	13223
14984	10	96
14984	55	195
14984	26	166
14984	84	141
14984	40	127
14984	83	184
14984	17	122
14984	80	1695
14984	96	120

;
run;

data _null_;
   retain max  0;
   do i = 1 by 1 until(last.id);
      set have end = eof;
      by id;
      
   end;
   if i > max then max = i;
   if eof then call symputx('groupsize', max);
run;


%let h = 5;

data want;
   array k[&groupsize] _temporary_;
   array m[&groupsize] _temporary_;
   length var1_2 $50;
   do i = 1 by 1 until(last.id);
      set have;
      by id;
      k[i] = var2;
      m[i] = var1;
      if first.id then call missing(var1_2);
   end;
   call catx(' ', var1_2, id);
   do i = 1 to &h;
      b = largest(i, of k[*]);
      v = m[whichn(b, of k[*])];
      call catx(' ', var1_2, v);
   end;
keep var1_2;   
run;
 

I hope you will not have any problem to understand the program and this will be an acceptable solution to you.

 

Edited . . . . .

 

In the above program ID goes as part of new variable, VAR1_2. If you don't want it, delete

call catx(' ', var1_2, id);

and change the last statement to

 

keep ID var1_2;

 

If you don't want a general program but just a simple program for the given data(where ID and VAR2 are sorted), few statements will do as in:

data want1;
   do i = 1 by 1 until(last.id);
      set have;
      by id;
   length var1_2 $50;
   if i < 6 then call catx(' ', var1_2, var1);
   else continue;
   end;
keep id var1_2;
run;

 

Ksharp
Super User
So you want the max five value for each ID ?



data have; 
infile cards expandtabs truncover;
input id	var1	var2;
datalines;
8250	1	23987
8250	51	4733
8250	45	13841
8250	71	4109
8250	24	10031
8250	54	8849
8250	88	8284
8250	79	13223
14984	10	96
14984	55	195
14984	26	166
14984	84	141
14984	40	127
14984	83	184
14984	17	122
14984	80	1695
14984	96	120
;
run;
proc sort data=have;
 by id descending var2;
run;
data temp;
 set have;
 by id;
 if first.id then n=0;
 n+1;
 if n le 5;
run;
data want;
 length var1_2 $ 2000;
 do until(last.id);
  set temp;
  by id;
  var1_2=catx(' ',var1_2,var1);
 end; 
 keep id var1_2;
run;


KachiM
Rhodochrosite | Level 12

You seemed to have not gone through my answers. Just see the end of my answer. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 882 views
  • 0 likes
  • 3 in conversation