DATA Step, Macro, Functions and more

Create New MAX Value Variables

Accepted Solution Solved
Reply
Contributor
Posts: 42
Accepted Solution

Create New MAX Value Variables

Hi Guy's,

 

I have a dataset example (below) and I need to create 2 new variables

1. Max_Date_New

2. Variable_Name_New 

 

I need to work out how to populate the new variables highlighted in RED. 

 

If the count value is the same for multiple variables (like in the first row), I then need the max date from the date variables which is F_date so I need the 2 new variables to produce F_Count and F_Date.

 

I hope that makes sense. 

 

Variable Name - ValueVariable Name - DateReturned Variables 
A_CountB_CountC_CountD_CountE_CountF_CountG_CountA_DateB_DateC_DateD_DateE_DateF_DateG_DateMax_Date_NewVariable_Name_New
80001010601/01/201701/02/2017..03/03/201704/03/201702/02/201704/03/2017F_Count
9111064102/01/201801/06/201605/06/201709/02/201711/11/201716/12/201708/03/201609/02/2017D_Count
90109001028/10/2016.25/01/201716/10/2017..22/06/201725/01/2017C_Count
200000018/08/2016......18/08/2016A_Count
246580024/04/201722/05/201617/07/201723/03/201628/07/2017..28/07/2017E_Count

Accepted Solutions
Solution
‎02-13-2018 09:47 AM
PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

[ Edited ]
Posted in reply to Brandon16

Something like this?

 

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .
;

data want;
	set have;
	c=0;
	array counts{*} A_Count B_Count C_Count D_Count E_Count F_Count G_Count;
	array dates{*} A_Date B_Date C_Date D_Date E_Date F_Date G_Date;
	array maxdummies{*} dummy1-dummy7;
	
	maxcount=max(of counts[*]);
	maxpos=whichn(maxcount, of counts[*]);

	do i=1 to dim(counts);
		if counts[i]=maxcount then do; 
			c+1;
			maxdummies[i]=1;
		end;
	end;
	
	Variable_Name_New=vname(counts[maxpos]);
	Max_Date_New=dates[maxpos];
	if c>1 then do j=1 to dim(dates);
		if maxdummies[j]=1 then do;
			if dates[j] > Max_Date_New then do;
				Max_Date_New=dates[j];
				Variable_Name_New=vname(counts[j]);
			end;
		end;
	end;

	format max_date_new ddmmyy10.;
	drop c i j maxcount maxpos dummy1-dummy7;
run;

 

View solution in original post


All Replies
Frequent Contributor
Posts: 109

Re: Create New MAX Value Variables

[ Edited ]
Posted in reply to Brandon16
data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .
;
run;

proc contents data=have out=contents noprint;
run;

proc sql;
select name into :countvar separated by ' ' from contents where upcase(name) like '%COUNT';
select name into :datevar separated by ',' from contents where upcase(name) like '%DATE';;
quit;

%macro test(dsn,vars,func); 
data &dsn; 
 set &dsn;
 format Max_Date_New date9.;
 array list(*) &vars; 
 Max_Date_New=&func(&datevar.);
 Variable_Name_New = vname(list[whichn(&func(of list[*]), of list[*])]); 
run; 
%mend test; 
 
/** retrieve maximum value from a b and c **/ 
%test(have,&countvar.,max);

Doc: http://support.sas.com/kb/46/471.html

 

This Works. 

Solution
‎02-13-2018 09:47 AM
PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

[ Edited ]
Posted in reply to Brandon16

Something like this?

 

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .
;

data want;
	set have;
	c=0;
	array counts{*} A_Count B_Count C_Count D_Count E_Count F_Count G_Count;
	array dates{*} A_Date B_Date C_Date D_Date E_Date F_Date G_Date;
	array maxdummies{*} dummy1-dummy7;
	
	maxcount=max(of counts[*]);
	maxpos=whichn(maxcount, of counts[*]);

	do i=1 to dim(counts);
		if counts[i]=maxcount then do; 
			c+1;
			maxdummies[i]=1;
		end;
	end;
	
	Variable_Name_New=vname(counts[maxpos]);
	Max_Date_New=dates[maxpos];
	if c>1 then do j=1 to dim(dates);
		if maxdummies[j]=1 then do;
			if dates[j] > Max_Date_New then do;
				Max_Date_New=dates[j];
				Variable_Name_New=vname(counts[j]);
			end;
		end;
	end;

	format max_date_new ddmmyy10.;
	drop c i j maxcount maxpos dummy1-dummy7;
run;

 

PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

Posted in reply to Brandon16

@Brandon16, I get a slightly different result than you desire, but I fail to see the logic to why the Max_Date_New value in row 3 should be 25/01/2017 and not 22/06/2017?

 

Shouldn't it be the maximum of the dates that resemble the tied maximum count values?

Contributor
Posts: 42

Re: Create New MAX Value Variables

Apologies, that was my fault, good spot!

I'm still trying to get my head around this, is the first statement just a test as I don't want to key in the variables each time on the datalines?
PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

Posted in reply to Brandon16

What "first statement" are you refering to?

Contributor
Posts: 42

Re: Create New MAX Value Variables

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .;
PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

Posted in reply to Brandon16

I simply transformed your inserted sample data to a data step, so you can insert my entire code and see that you get the desired result.

 

Makes sense?

Contributor
Posts: 42

Re: Create New MAX Value Variables

I have plugged the code it and I do have a question on some outputs.

A_Count was the only variable to populate a value which variable_name_new captures, however the the new variable MAX_DATE_NEW has a blank date, when it should be 02/12/2015 as it is in variable a_date, any idea why that might be please?
PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

Posted in reply to Brandon16

Are you sure you ran the entire code I posted. 

 

Please run this and post the result

 

proc print data=want;
	var max_date_new Variable_Name_New;
run;
Contributor
Posts: 42

Re: Create New MAX Value Variables

No I didn't run the code where you mentioned
Dataline with my sample data as I thought that was just a test.
Contributor
Posts: 42

Re: Create New MAX Value Variables

Posted in reply to Brandon16

Another issue is I should see the following results returned below........

 

A_CountB_CountC_CountD_CountE_CountF_CountG_CountA_DateB_DateC_DateD_DateE_DateF_DateG_DateMax_Date_NewVariable_Name_New
808880020/04/2015.18/03/201718/03/2017...20/04/2015A_Count
                
Instead I get this…              
                
A_CountB_CountC_CountD_CountE_CountF_CountG_CountA_DateB_DateC_DateD_DateE_DateF_DateG_DateMax_Date_NewVariable_Name_New
808880020/04/2015.18/03/201318/03/2013...20/04/2015C_Count
PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

Posted in reply to Brandon16

I do not follow your logic? Why do you want Max_Date_New to be 20/04/2015 when 18/03/2017 is larger (C_Date). 

 

Isn't that the whole point of your logic, that you want the date values to be the tie breakers, when there are more than 1 max value i your count variables?

 

 

Contributor
Posts: 42

Re: Create New MAX Value Variables

Sorry, I hadn't updated the table correctly, the years are 2013 NOT 2017. So C_date and D_date are both 18/03/2013 not 18/03/2017.

 

if any of the values match within A_count - G counts, I want the value that displays the latest date

 

So in this example, as A_Count relates to A_date (which has the latest date) I want Variable_New_Name to show A_COUNT

PROC Star
Posts: 1,190

Re: Create New MAX Value Variables

Posted in reply to Brandon16

My code does that exactly? Please run the below code 

 

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0	8 8	8 0	0 20/04/2015 . 18/03/2013 18/03/2013 . . .
;

data want;
	set have;
	c=0;
	array counts{*} A_Count B_Count C_Count D_Count E_Count F_Count G_Count;
	array dates{*} A_Date B_Date C_Date D_Date E_Date F_Date G_Date;
	array maxdummies{*} dummy1-dummy7;
	
	maxcount=max(of counts[*]);
	maxpos=whichn(maxcount, of counts[*]);

	do i=1 to dim(counts);
		if counts[i]=maxcount then do; 
			c+1;
			maxdummies[i]=1;
		end;
	end;
	
	Variable_Name_New=vname(counts[maxpos]);
	Max_Date_New=dates[maxpos];
	if c>1 then do j=1 to dim(dates);
		if maxdummies[j]=1 then do;
			if dates[j] > Max_Date_New then do;
				Max_Date_New=dates[j];
				Variable_Name_New=vname(counts[j]);
			end;
		end;
	end;

	format max_date_new ddmmyy10.;
	drop c i j maxcount maxpos dummy1-dummy7;
run;

proc print data=want;
	var Max_Date_New Variable_Name_New;
run;

 

It gives you the output

Capture.PNG

 

 

 

 

 

 

 

 

 

Which is exactly what you are asking for?

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 19 replies
  • 190 views
  • 0 likes
  • 3 in conversation