search across columns for min/max, then find value in another column

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

search across columns for min/max, then find value in another column

[ Edited ]

Hi.  I have what I suspect is a complicated question.  I have a transposed dataset that is not properly sorted by dates.  For each row ("patient") of this dataset, I'd like to search across multiple "date" columns for a minimum (or maximum) value, and then use the name of the "date_n" column containing this min value to identify the value in corresponding "test_n" and "exam_n" columns. 

 

In other words, suppose I have a dataset that looks like this:

patient     date_1         date_2         date_3        test_1    test_2    test_3    exam_1    exam_2     exam_3

A          11/01/2013   09/01/2000   01/01/2015    pos        neg        pos        pass          fail             fail

B          11/01/2012   07/01/2003                         neg        neg                     pass          pass     

C          03/01/2012   04/01/2013                         ind         neg                      fail            pass

D          08/01/2015   02/01/2013   01/01/2015    ind         pos        neg         fail           pass          fail

 

Is there a way to get a dataset that contains the following info (+/- columns in original dataset)?

patient       date_first     date_last          test_first      test_last      exam_first        exam_last

A              09/01/2000    01/01/2015       neg             pos               fail                   fail

B              07/01/2003    11/01/2012        neg            neg               pass                pass     

C              03/01/2012    04/01/2013        ind             neg               fail                   pass

D              02/01/2013    08/01/2015        pos            ind                pass                fail

 

Thanks so much for your help & instruction.  You all are amazing.


Accepted Solutions
Solution
‎03-09-2016 04:05 PM
Grand Advisor
Posts: 10,210

Re: search across columns for min/max, then find value in another column

[ Edited ]

I think this does what you are requesting:

data want;
   set have;
   array d date_1 - date_3;
   array t test_1 - test_3;
   array e exam_1 - exam_3;
   pos = whichn( min (of d[*]),of d[*]);
   date_first = d[pos];
   test_first = t[pos];
   exam_first = e[Pos];
   pos = whichn( max (of d[*]),of d[*]);
   date_last =d[pos];
   test_last =t[pos];
   exam_last =e[Pos];
run;

 

 

you could use two POS variables to keep track of First and Last if needed.

 

View solution in original post


All Replies
Solution
‎03-09-2016 04:05 PM
Grand Advisor
Posts: 10,210

Re: search across columns for min/max, then find value in another column

[ Edited ]

I think this does what you are requesting:

data want;
   set have;
   array d date_1 - date_3;
   array t test_1 - test_3;
   array e exam_1 - exam_3;
   pos = whichn( min (of d[*]),of d[*]);
   date_first = d[pos];
   test_first = t[pos];
   exam_first = e[Pos];
   pos = whichn( max (of d[*]),of d[*]);
   date_last =d[pos];
   test_last =t[pos];
   exam_last =e[Pos];
run;

 

 

you could use two POS variables to keep track of First and Last if needed.

 

Contributor
Posts: 25

Re: search across columns for min/max, then find value in another column

Thank you, @ballardwSmiley Happy

Esteemed Advisor
Esteemed Advisor
Posts: 7,211

Re: search across columns for min/max, then find value in another column

Firstly, please post test data in the form of a datastep, its not good to type all that in just to provide working code.  Whilst @ballardw has given you a great working solution using arrays, I would ask why you are working with data in a transposed format anyways?  Transposed data is fine for output reports, that is where its mainly used for people to look at.  You will find your programs, and logic will be far simpler if you work with normalised datastructures - and if needed for a report, then transpose before the report, i.e. don't work on transposed data.  I provide some code below which normalises your data into a long format, as it looks like patient data I am sure your familiar with CDISC standards, who also use normalised data - for a reason.  The output of that should be familiar to you, parameter/response once per observation.  One of the benfits of such a strcuture is that you need a very simple proc sort to find earliest date, and then all relevant data is on that observation.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 336 views
  • 2 likes
  • 3 in conversation