BookmarkSubscribeRSS Feed
rjenn
Calcite | Level 5

Hi everyone, I'm currently having some difficulty coding this macro to do what I need and I'm hoping someone more experienced than I may be able to help. This coding problem is fairly complex so I hope I explain this clearly enough as well. I have a dataset of the following form:

 

ID            VAR1               VAR2              VAR3            VAR4

1           "Lighting"          "HVAC"

2           "Lighting"

3           "HVAC"         

4           "Lighting"         "Envelope"      "VFD"

5           "Lighting"         "HVAC"           "Envelope"      "VFD"

 

My goal is to select values for a new variable (VAR5) using the following logic:

 

if VAR1 = "Lighting" then
     if VAR2 = "Lighting"
          if VAR3 = "Lighting"
               if VAR4 = "Lighting"
                    VAR5 = VAR4
          else VAR5 = VAR3
     else VAR5 = VAR2
else VAR5 = VAR1

So essentially, never prefer "Lighting" unless it is the only option. On top of this, I need to select for a second criteria: that the value selected always be the value that occurs the least in a given column. For example, since "Envelope" occurs the least in column 2, row 4 would select that value for the new variable. On the other hand, since in row 5 column 2, "HVAC" has the highest frequency, we would want to jump to column 3 and select "Envelope" (since that occurs 1:2 times).

 

 

So it all boils down to both:

  1. prefer non-lighting
  2. prefer the left-most column possible
  3. prefer the lowest frequency in that column - else, jump to the next column.

Here is what I have written so far (btw, I'm aware that this code is incomplete):

 

%macro NEI_select(dsn=);
	%do i=1 %to 4;
		/*running frequency on each measure*/
		proc freq
		  data = &dsn nlevels;
		  table m&i._meas_name / out= freq_m&i._t list missing nopercent; 
		run;
		/*removing lighting from preference list*/
		data freq_m&i;
			set freq_m&i._t;
			where m&i._meas_name ne "Lighting";
			m&i._meas_name = compress(m&i._meas_name," ");
		run;
		/*sorting by least frequent occurence*/
		proc sort data = freq_m&i ; by count; run;
		/*putting list into macro variable*/
		proc sql noprint;
			select m&i._meas_name into :m&i._measure separated by ' '
			from freq_m&i;
		quit;
	%end;

	data &dsn._t1;
	set &dsn;

		/*selecting NEI question measure*/
/*if the second column is empty assign var (since there is nowhere left to look)*/ if m2_meas_name = "" then NEI_measure = m1_meas_name;
/*if the first column is already non-lighting, pick that*/ else if m1_meas_name ne "Lighting" then NEI_measure = m1_meas_name;
/*if the first column is lighting (here is where things get tricky)*/ else if m1_meas_name = "Lighting" then do;
/*looping through columns*/ %do a=2 %to 4;
/*looping through obs in macro variables defined earlier*/ %do j=1 %to %sysfunc(countw(&&m&a._measure)); %let next_meas = %scan(&&m&a._measure, &j); if compress(m&a._meas_name," ") = &next_meas and &next_meas ne "Lighting" and NEI_measure = "" then; NEI_measure = m&a._meas_name; %end; %end; end; else NEI_measure = m1_meas_name; run; %mend; %NEI_select(dsn=data_v4_wide);

The top part of the code (frequencies+proc SQL) runs perfectly, but the second half (where the selection actually happens) is where I run into problems. If you need something cleared up or have any questions please feel free to ask and I'll do my best to explain. Hope someone on here can help me out! Thanks so much Smiley Happy

2 REPLIES 2
Reeza
Super User
Assuming the exact input as shown in your first post, what would you expect as output?
Tom
Super User Tom
Super User


Your problem description is showing a dataset and logic to generating a new variable in that dataset.

data want ;
  set have ;
  array xx var1-var4 ;
  do i=1 to dim(xx) while (missing(var5));
     if xx(i) ne 'Lighting' then var5=xx(i);
  end;
  var5=coalesce(of var5 var1-var4);
run;

What is the purpose of the macro?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 628 views
  • 0 likes
  • 3 in conversation