BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Brandon16
Obsidian | Level 7

Hi Guy's,

 

I have a dataset example (below) and I need to create 2 new variables

1. Max_Date_New

2. Variable_Name_New 

 

I need to work out how to populate the new variables highlighted in RED. 

 

If the count value is the same for multiple variables (like in the first row), I then need the max date from the date variables which is F_date so I need the 2 new variables to produce F_Count and F_Date.

 

I hope that makes sense. 

 

Variable Name - ValueVariable Name - DateReturned Variables 
A_CountB_CountC_CountD_CountE_CountF_CountG_CountA_DateB_DateC_DateD_DateE_DateF_DateG_DateMax_Date_NewVariable_Name_New
80001010601/01/201701/02/2017..03/03/201704/03/201702/02/201704/03/2017F_Count
9111064102/01/201801/06/201605/06/201709/02/201711/11/201716/12/201708/03/201609/02/2017D_Count
90109001028/10/2016.25/01/201716/10/2017..22/06/201725/01/2017C_Count
200000018/08/2016......18/08/2016A_Count
246580024/04/201722/05/201617/07/201723/03/201628/07/2017..28/07/2017E_Count
1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Something like this?

 

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .
;

data want;
	set have;
	c=0;
	array counts{*} A_Count B_Count C_Count D_Count E_Count F_Count G_Count;
	array dates{*} A_Date B_Date C_Date D_Date E_Date F_Date G_Date;
	array maxdummies{*} dummy1-dummy7;
	
	maxcount=max(of counts[*]);
	maxpos=whichn(maxcount, of counts[*]);

	do i=1 to dim(counts);
		if counts[i]=maxcount then do; 
			c+1;
			maxdummies[i]=1;
		end;
	end;
	
	Variable_Name_New=vname(counts[maxpos]);
	Max_Date_New=dates[maxpos];
	if c>1 then do j=1 to dim(dates);
		if maxdummies[j]=1 then do;
			if dates[j] > Max_Date_New then do;
				Max_Date_New=dates[j];
				Variable_Name_New=vname(counts[j]);
			end;
		end;
	end;

	format max_date_new ddmmyy10.;
	drop c i j maxcount maxpos dummy1-dummy7;
run;

 

View solution in original post

19 REPLIES 19
Satish_Parida
Lapis Lazuli | Level 10
data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .
;
run;

proc contents data=have out=contents noprint;
run;

proc sql;
select name into :countvar separated by ' ' from contents where upcase(name) like '%COUNT';
select name into :datevar separated by ',' from contents where upcase(name) like '%DATE';;
quit;

%macro test(dsn,vars,func); 
data &dsn; 
 set &dsn;
 format Max_Date_New date9.;
 array list(*) &vars; 
 Max_Date_New=&func(&datevar.);
 Variable_Name_New = vname(list[whichn(&func(of list[*]), of list[*])]); 
run; 
%mend test; 
 
/** retrieve maximum value from a b and c **/ 
%test(have,&countvar.,max);

Doc: http://support.sas.com/kb/46/471.html

 

This Works. 

PeterClemmensen
Tourmaline | Level 20

Something like this?

 

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .
;

data want;
	set have;
	c=0;
	array counts{*} A_Count B_Count C_Count D_Count E_Count F_Count G_Count;
	array dates{*} A_Date B_Date C_Date D_Date E_Date F_Date G_Date;
	array maxdummies{*} dummy1-dummy7;
	
	maxcount=max(of counts[*]);
	maxpos=whichn(maxcount, of counts[*]);

	do i=1 to dim(counts);
		if counts[i]=maxcount then do; 
			c+1;
			maxdummies[i]=1;
		end;
	end;
	
	Variable_Name_New=vname(counts[maxpos]);
	Max_Date_New=dates[maxpos];
	if c>1 then do j=1 to dim(dates);
		if maxdummies[j]=1 then do;
			if dates[j] > Max_Date_New then do;
				Max_Date_New=dates[j];
				Variable_Name_New=vname(counts[j]);
			end;
		end;
	end;

	format max_date_new ddmmyy10.;
	drop c i j maxcount maxpos dummy1-dummy7;
run;

 

PeterClemmensen
Tourmaline | Level 20

@Brandon16, I get a slightly different result than you desire, but I fail to see the logic to why the Max_Date_New value in row 3 should be 25/01/2017 and not 22/06/2017?

 

Shouldn't it be the maximum of the dates that resemble the tied maximum count values?

Brandon16
Obsidian | Level 7
Apologies, that was my fault, good spot!

I'm still trying to get my head around this, is the first statement just a test as I don't want to key in the variables each time on the datalines?
PeterClemmensen
Tourmaline | Level 20

What "first statement" are you refering to?

Brandon16
Obsidian | Level 7
data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;8 0 0 0 10 10 6 01/01/2017 01/02/2017 . . 03/03/2017 04/03/2017 02/02/2017
9 1 1 10 6 4 1 02/01/2018 01/06/2016 05/06/2017 09/02/2017 11/11/2017 16/12/2017 08/03/2016
9 0 10 9 0 0 10 28/10/2016 . 25/01/2017 16/10/2017 . . 22/06/2017
2 0 0 0 0 0 0 18/08/2016 . . . . . .
2 4 6 5 8 0 0 24/04/2017 22/05/2016 17/07/2017 23/03/2016 28/07/2017 . .;
PeterClemmensen
Tourmaline | Level 20

I simply transformed your inserted sample data to a data step, so you can insert my entire code and see that you get the desired result.

 

Makes sense?

Brandon16
Obsidian | Level 7
I have plugged the code it and I do have a question on some outputs.

A_Count was the only variable to populate a value which variable_name_new captures, however the the new variable MAX_DATE_NEW has a blank date, when it should be 02/12/2015 as it is in variable a_date, any idea why that might be please?
PeterClemmensen
Tourmaline | Level 20

Are you sure you ran the entire code I posted. 

 

Please run this and post the result

 

proc print data=want;
	var max_date_new Variable_Name_New;
run;
Brandon16
Obsidian | Level 7
No I didn't run the code where you mentioned
Dataline with my sample data as I thought that was just a test.
Brandon16
Obsidian | Level 7

Another issue is I should see the following results returned below........

 

A_CountB_CountC_CountD_CountE_CountF_CountG_CountA_DateB_DateC_DateD_DateE_DateF_DateG_DateMax_Date_NewVariable_Name_New
808880020/04/2015.18/03/201718/03/2017...20/04/2015A_Count
                
Instead I get this…              
                
A_CountB_CountC_CountD_CountE_CountF_CountG_CountA_DateB_DateC_DateD_DateE_DateF_DateG_DateMax_Date_NewVariable_Name_New
808880020/04/2015.18/03/201318/03/2013...20/04/2015C_Count
PeterClemmensen
Tourmaline | Level 20

I do not follow your logic? Why do you want Max_Date_New to be 20/04/2015 when 18/03/2017 is larger (C_Date). 

 

Isn't that the whole point of your logic, that you want the date values to be the tie breakers, when there are more than 1 max value i your count variables?

 

 

Brandon16
Obsidian | Level 7

Sorry, I hadn't updated the table correctly, the years are 2013 NOT 2017. So C_date and D_date are both 18/03/2013 not 18/03/2017.

 

if any of the values match within A_count - G counts, I want the value that displays the latest date

 

So in this example, as A_Count relates to A_date (which has the latest date) I want Variable_New_Name to show A_COUNT

PeterClemmensen
Tourmaline | Level 20

My code does that exactly? Please run the below code 

 

data have;
input A_Count B_Count C_Count D_Count E_Count F_Count G_Count (A_Date B_Date C_Date D_Date E_Date F_Date G_Date)(:ddmmyy10.);
format A_Date B_Date C_Date D_Date E_Date F_Date G_Date ddmmyy10.;
datalines;
8 0	8 8	8 0	0 20/04/2015 . 18/03/2013 18/03/2013 . . .
;

data want;
	set have;
	c=0;
	array counts{*} A_Count B_Count C_Count D_Count E_Count F_Count G_Count;
	array dates{*} A_Date B_Date C_Date D_Date E_Date F_Date G_Date;
	array maxdummies{*} dummy1-dummy7;
	
	maxcount=max(of counts[*]);
	maxpos=whichn(maxcount, of counts[*]);

	do i=1 to dim(counts);
		if counts[i]=maxcount then do; 
			c+1;
			maxdummies[i]=1;
		end;
	end;
	
	Variable_Name_New=vname(counts[maxpos]);
	Max_Date_New=dates[maxpos];
	if c>1 then do j=1 to dim(dates);
		if maxdummies[j]=1 then do;
			if dates[j] > Max_Date_New then do;
				Max_Date_New=dates[j];
				Variable_Name_New=vname(counts[j]);
			end;
		end;
	end;

	format max_date_new ddmmyy10.;
	drop c i j maxcount maxpos dummy1-dummy7;
run;

proc print data=want;
	var Max_Date_New Variable_Name_New;
run;

 

It gives you the output

Capture.PNG

 

 

 

 

 

 

 

 

 

Which is exactly what you are asking for?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 19 replies
  • 1750 views
  • 0 likes
  • 3 in conversation