BookmarkSubscribeRSS Feed
michan22
Quartz | Level 8

Hi guys! I have another question.

The same dataset:

var1   var1_date   var2   var2_date        var3    var3_date       var4   var4_date     .......   diagnosis_date

200    08/09/16     300      05/02/15       400        06/10/14       500    01/20/13   .......     02/01/16

250    07/06/16     200      04/10/14                                                                                  05/10/14

100    01/02/17     150      06/01/15       550        02/10/15                                               03/30/15

.

.

.

 

I need to find the lowest numerical value and its associated date for each observation, but I do not want to count or accept the values/dates if it happens before the diagnosis date (e.g. for obs 1 the lowest value would be 300 and date is 05/02/15, for obs 2 lowest value is 200 with date 04/10/14 and so on). I have a rough idea which is to use array and add if then statement within the array but do not know how to code exactly to get the value and its date.

Thank you and I realy appreciate all of your advice!

 

10 REPLIES 10
Reeza
Super User

Post the code you have so far. This will require a loop. Also, post sample data as a data step, likely your previous thread has examples from where users created fake data to replicate your issue. 

michan22
Quartz | Level 8

Using the code that @Ksharp  provided in my last post, I think I have to add the if/then statement within the following code:

 

array x{*} newcd4_1-newcd4_17;

idx=whichn(min(of x{*}),of x{*});

min_date=vvaluex(cats(vname(x{idx}),'_date'));

run;

 

I am not sure where to add the if/then statement/do loop to let SAS know to ignore the values that are linked to dates before the the diagnosis_date?

Thanks!

Astounding
PROC Star

With the additional requirement related to diagnosis date, you will need to locate the date differently.  The MIN function can't do the job alone.  One way:

 

array x {*} ... as before, use whatever is the correct set of variable names ...;

array dates {*} var1_date var2_date var3_date .....;

do _n_=1 to dim(dates);

   if diagnosis_date <= dates{_n_} < mindate then do;

      mindate = dates{_n_};

      minvalue = x{_n_};

   end;

end;

michan22
Quartz | Level 8

Thank you! I tried:

 

array x {*} newcd4_1-newcd4_16;

array dates {*} newcd4_1_date--newcd4_16_date;

do _n_=1 to dim(dates);

if diagnosis_date <= dates{_n_} < mindate then do;

mindate = dates{_n_};

minvalue = x{_n_};

end;

end;

 

When I ran a proc freq for mindate and minvalue nothing shows up, everything is missing.

Reeza
Super User

@michan22 wrote:

Thank you! I tried:

 

array x {*} newcd4_1-newcd4_16;

array dates {*} newcd4_1_date--newcd4_16_date;

do _n_=1 to dim(dates);

if diagnosis_date <= dates{_n_} < mindate then do;

mindate = dates{_n_};

minvalue = x{_n_};

end;

end;

 

When I ran a proc freq for mindate and minvalue nothing shows up, everything is missing.


This is different than the code you posted originally so it has different issues. Pick one and go with that...there are multiple solutions but trying to debug various different sets of code isn't particularily helpful to anyone. 

 

 

Reeza
Super User

 

 

1. Need a do loop to iterate over arrays. 

2. Find first, lowest date after diagnosis. 

 

Min_index will have the index of the smallest value that's greater than the diagnosis date. 

 

 

Min_index = 1;

Do i = 1 to dim(x);

if x(i) > diagnosis_date /* after diagnosis date*/

And x(I) ne . /* not missing */

and x(I) < x(min_index)  /* smaller than current smallest value*/

them do;

min_index = I;

end;

 

end;

 

 

Also, this problem is infinitely easier if you keep your data in a wide structure not a long structure. 

michan22
Quartz | Level 8

I put:

 

array x{*} newcd4_1-newcd4_17;

Min_index = 1;

Do i = 1 to dim(x);

if x(i) > diagnosis_date and x(i) ^= . and x(i) < x(min_index)  

then do;

min_index = i;

end;

end; 

idx=whichn(min(of x{*}),of x{*});

min_date=vvaluex(cats(vname(x{idx}),'_date'));

run;

 

log says "array subscript out of range at line 400 column 29".

Is it problem with my data?

Reeza
Super User

Post the full code and the log.

michan22
Quartz | Level 8

I changed to and no more error in log:

 

array x{*} newcd4_1-newcd4_17;

Min_index = 1;

Do i = 1 to dim(x);

if x(i) > diagnosis_date and x(i) ^= . and x(i) < x(min_index)  

then do;

min_index = i;

 

idx=whichn(min(of x{*}),of x{*});

min_date=vvaluex(cats(vname(x{idx}),'_date'));

end;

end; 

run;

 

what would be the variable created for the lowest value with a date later than the diagnosis date? it's not min_index right?

 

Reeza
Super User

You're still using the MIN function, but you need to use the minimum index you've found instead of the MIN function. Please read up on the WHICHN, VVALUEX and MIN/MAX functions. 

 

If you comment the code it will probably help with your understanding. Pseudocoding a solution before you code is also a good idea.

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 721 views
  • 0 likes
  • 3 in conversation