BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ohemgee
Calcite | Level 5

Hi All,

 

As per title, I need help in selecting the observation with the latest date (from 5 date variables, no hierarchy) and from there selecting the one with the highest VarId by group.

 

Example Data:

Group,VarID,VarDate1,VarDate2,VarDate3,VarDate4,VarDate5

A,15,Oct2018,Oct2018,.,.,Oct2018

A,2,.,.,.,.,Jan2019

A,3,.,.,Sept2018,.,.,

A,14,.,.,.,.,Dec2018

B,45,July2018,.,.,.,July2018

B,46,.,.,.,.,.July2018

 

Result Data should be:

A,2,.,.,.,.,Jan2019

B,46,.,.,.,.,.July2018

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Assuming that you have actual dates in those variables and not the strings you are showing then you can just make new variable with the maximum of the individual date variables.  If you just have strings then you will need to convert them to actual dates first.

 

You can then use the new variable to order the data and find the last value per group.

data step1;
  set have ;
  max_date=max(of vardate1-vardate5);
  format max_date yymmdd10.;
run;
proc sort data=step1 out=want ;
   by group max_date varid ;
run;
data want;
  set want;
  by group;
  if last.group;
run;

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

Hi @ohemgee  Welcome to SAS communities. Assuming those are all SAS numeric dates, the following may work for your requirement though untested

 

data want;
do _n_=1 by 1 until(last.group);
 set have;
 by group;
 array t(*) VarDate1--VarDate5;
 max=max(max,max(of t(*)));
end;
do _n_=1 to _n_;
 set have;
 if max in t then output;
end;
run;
Tom
Super User Tom
Super User

Assuming that you have actual dates in those variables and not the strings you are showing then you can just make new variable with the maximum of the individual date variables.  If you just have strings then you will need to convert them to actual dates first.

 

You can then use the new variable to order the data and find the last value per group.

data step1;
  set have ;
  max_date=max(of vardate1-vardate5);
  format max_date yymmdd10.;
run;
proc sort data=step1 out=want ;
   by group max_date varid ;
run;
data want;
  set want;
  by group;
  if last.group;
run;
Reeza
Super User

A data step works here as well:

 

data want;
set have;

by group;

retain _temp;
if first.group then call missing(_temp);

_temp = max(of vardate1-vardate5);
_current_max = max(_temp, _current_max);

*keep right varID;
if _current_max = _temp then newVarID = varID;


if last.group then output;

keep group newvarID _current_max;
run;





@ohemgee wrote:

Hi All,

 

As per title, I need help in selecting the observation with the latest date (from 5 date variables, no hierarchy) and from there selecting the one with the highest VarId by group.

 

Example Data:

Group,VarID,VarDate1,VarDate2,VarDate3,VarDate4,VarDate5

A,15,Oct2018,Oct2018,.,.,Oct2018

A,2,.,.,.,.,Jan2019

A,3,.,.,Sept2018,.,.,

A,14,.,.,.,.,Dec2018

B,45,July2018,.,.,.,July2018

B,46,.,.,.,.,.July2018

 

Result Data should be:

A,2,.,.,.,.,Jan2019

B,46,.,.,.,.,.July2018

 

Thank you!




PeterClemmensen
Tourmaline | Level 20

Welcome to the SAS Community 🙂

 

data have(drop=i);
	array VarDate{5};
	do ID="A", "B", "C";
		do varID=1 to 5;
			do i=1 to 5;
				VarDate[i]=rand('integer', '01jan2019'd, today());
			end;
			output;
		end;
	end;
	format VarDate: date9.;
run;

data want;
	do until (last.ID);
		set have;
		by ID;
		if max(of VarDate:) ge _iorc_ then _iorc_=max(of VarDate:);
	end;
	do until (last.ID);
		set have;
		by ID;
		if max(of VarDate:)=_iorc_ then output;
	end;
	_iorc_=.;
run;
ohemgee
Calcite | Level 5

Thanks a lot for all your help. I still can't make it work though. Modifying a bit so that it sort of represents the data wherein Group is actually 3 variables. And yes, those are SAS numeric dates. Also looking for the most efficient way as this will run through millions of records.

 

Example Data:

Group1, Group 2, Group3, VarID,VarDate1,VarDate2,VarDate3,VarDate4,VarDate5

A,AA,AAA,15,Oct2018,Oct2018,.,.,Oct2018

A,AA,BBB,2,.,.,.,.,Jan2019

A,AA,CCC,3,.,.,Sept2018,.,.,

A,AA,DDD,14,.,.,.,.,Dec2018

B,AA,AAA,45,July2018,.,.,.,July2018

B,AA,BBB,46,.,.,.,.,.July2018

 

Result Data should be:

Group1, Group 2, Group3, VarID,VarDate1,VarDate2,VarDate3,VarDate4,VarDate5

A,AA,BBB,2,.,.,.,.,Jan2019

B,AA,BBB,46,.,.,.,.,.July2018

PeterClemmensen
Tourmaline | Level 20

If your data is representable of your actual data, which means that your data is sorted by Group1, Group2, Group3, then you can do something like this

 

data have(drop=i);
	array VarDate{5};
	do Group1="A", "B", "C", "D";
       do Group2="AA", "BB", "CC", "DD";
	      do Group3="AAA", "BBB", "CCC", "DDD";
             VarID=ceil(rand('Uniform')*100);
             do i=1 to 5;
		        VarDate[i]=ceil(rand('Uniform', '01jan2019'd, today()));
             end;
             output;
	      end;
	   end;		
	end;
	format VarDate: date9.;
run;

data want(drop=max);
	do until (last.Group1);
		set have;
		by Group1;
        max=max(of VarDate:);
		if max gt _iorc_ then _iorc_=max;
	end;
	do until (last.Group1);
		set have;
		by Group1;
		if max(of VarDate:)=_iorc_ then output;
	end;
	_iorc_=.;
run;
ohemgee
Calcite | Level 5

Went with the ff. code: Let me know what you think and a more efficient way is much appreciated. 🙂

 

data want;

   set have;

   max_dte = max(of vardate1-vardate5);

run;

 

proc sort data=want out=want2;

   by group1 group2 group3 descending max_dte descending varid;

run;

 

data want3;

   set want2;

   by group1 group2 group3 descending max_dte descending varid;

   if first.max_dte;

run;

 

proc sort nodupkey data=want3

                                 out=want4 (index=(indx = (group 1 group2 group3)/unique));

by group1 group2 group3;

quit;

   

Reeza
Super User
Toms solution is shorter and will have the same results as your code.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 917 views
  • 3 likes
  • 5 in conversation