BookmarkSubscribeRSS Feed
buechler66
Barite | Level 11

Hi. I'm using dynamic sql to create a sas program with a CASE statuement containing many, many CASE WHEN statements. I use an array to produce each WHEN statement. The program flow is controlled by use of 'if _N_' and 'if EOF' statements to build the output sas file.  The 'if _N_' creates the top half of the program and the 'if EOF' creates the lower portion.  

 

As it's written now the code works perfectly. Below is the code and the output (I've simplified the output to include only 2 CASE WHEN statements): 

 

		data _null_;
		   set QueryRules end=eof;
		   by Rule_Order;
		   file '/home/ssbuechl/input_whens.sas';

			array rule_nm_array {1} $ 58 rule_nm;
			array rule_array {1} $ 65 rule;

		   if _n_=1 then put 'proc sql;';
		   if _n_=1 then put 'create table QueryData&ZIP5 as ';
		   if _n_=1 then put 'select DISTINCT ';
		   if _n_=1 then put "     a.actual_dlvry_date, ";
		   if _n_=1 then put "     a.imb_code length = 31, ";
		   if _n_=1 then put "     a.imb_dlvry_zip_5, ";
		   if _n_=1 then put "     CASE";
			   do h = 1 to dim(rule_nm_array);
					do j = 1 to dim(rule_array);
							rule_nm=rule_nm_array{h};
							rule=rule_array{j};
						   PUT '           WHEN ('RULE') THEN trim("'RULE_NM'")';
					end;
			   end;
		   if eof then put "     ELSE ' '";
		   if eof then put "     END as RULE_NM,";
		   if eof then put %nrstr("from QueryDataBase&ZIP5 as a ");
		   if eof then put %nrstr("inner join QueryDataBase&ZIP5 as b ");
		   if eof then put "on a.imb_code=b.imb_code ";
		   if eof then put "where a.source='A' and b.source='B'; ";
		   if eof then put "quit; ";
		run;
proc sql;
create table QueryData&ZIP5 as
select DISTINCT
     a.actual_dlvry_date as ad_dt,
     a.imb_code length = 31,
     a.imb_dlvry_zip_5,
     CASE
           WHEN ( A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL  ) THEN trim( "ACTUAL DELIVERY DATE MISSING IN IV " )
           WHEN ( A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE  ) THEN trim( "ACTUAL DELIVERY DATE LATER IN IV " )
     ELSE ' '
     END asRULE_NM length = 58
from QueryDataBase&ZIP5 as a inner join QueryDataBase&ZIP5 as b on a.imb_code=b.imb_code where a.source='A' and b.source='B'; quit;
 

 

However, now I need to include an additional CASE statement containing it's own WHEN statements. Since I only have program control using the 'if _N_' and 'if EOF' statements, the two CASE statements in the output sas file get all mixed together.  My desired output would look like this:

 

proc sql;
create table QueryData&ZIP5 as
select DISTINCT
     a.actual_dlvry_date as ad_dt,
     a.imb_code length = 31,
     a.imb_dlvry_zip_5,
     CASE
           WHEN ( A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL  ) THEN trim( "ACTUAL DELIVERY DATE MISSING IN IV " )
           WHEN ( A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE  ) THEN trim( "ACTUAL DELIVERY DATE LATER IN IV " )
     ELSE ' '
     END as RULE_NM length = 58,
     CASE
           WHEN ( A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL  ) THEN 1.0
           WHEN ( A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE  ) THEN 1.5
     ELSE .
     END as Rule_Order length = 5 format=4.1
from QueryDataBase&ZIP5 as a
inner join QueryDataBase&ZIP5 as b
on a.imb_code=b.imb_code
where a.source='A' and b.source='B';
quit; 

Is there some way form me to introduce the second CASE statement between the 'if _N_' and 'if EOF' program flow statements? Any suggestions would be greatly appreciated.

 

As a work-around I currently create 2 output sas files - one for the top half of the program and one for the botton half of the program. Then I 'X' out of the program and copy the two files together at an OS command line.  This is inefficient as it requires two separate passes over a very, very large dataset. I'd like to build the entire output sas file it in a single pass if possible.

8 REPLIES 8
ballardw
Super User

Your QueryRules dataset could be structed to include a variable that indicated which WHICH case statement is involved, maybe named RuleGroup  to follow your example and the Name of the variable to assign.

Then the BY statement would be:

By RuleGroup RuleOrder;

 

And use First.RuleGroup to start the group processing which would look something like:

if First.RuleGroup then do:
   put "     CASE";
   do h = 1 to dim(rule_nm_array);
      do j = 1 to dim(rule_array);
         rule_nm=rule_nm_array{h};
         rule=rule_array{j};
         PUT '           WHEN ('RULE') THEN trim("'RULE_NM'")';
      end;
   end;
   put "     ELSE ' '";
   put "     END as " RULEVar ",";
end;

where RuleVar is the name of the target variable for the "rules" involved

 

buechler66
Barite | Level 11

Hmm...I'm not certain I'm completely clear on your suggestion. In my QueryRules dataset every record is unique.  Rule_Nm, Rule, and Rule_Order are never repeated (see the attached file for a snippet of what the data looks like).

 

Then I Set the QueryRules dataset 'by Rule_Order'.  So every record will then be both first.rule_order and last.rule_order?  I guess that is where I'm confused. Each CASE statement involves every record in the dataset.  

 

Am I making this too hard?  What aren't I seeing?

 

		data _null_;
		   set QueryRules end=eof;
		   by Rule_Order;
		   file '/home/ssbuechl/input_whens.sas';

 

 


Capture.JPG
ballardw
Super User

@buechler66 wrote:

Hmm...I'm not certain I'm completely clear on your suggestion. In my QueryRules dataset every record is unique.  Rule_Nm, Rule, and Rule_Order are never repeated (see the attached file for a snippet of what the data looks like).

 

Then I Set the QueryRules dataset 'by Rule_Order'.  So every record will then be both first.rule_order and last.rule_order?  I guess that is where I'm confused. Each CASE statement involves every record in the dataset.  


And you'll have to clarify what "Each CASE statement involves every record in the dataset.  " In your QueryRules control data set or the target result of the query?

If you are appling the SAME rule to multiple variables in a data set then maybe you should just move to a data step instead of a very convolute SQL approach.

It might also help to provide what some of these "rules" are. If they are a bunch of recoding statments then look ups or formats might be a better approach to the core problem.

 

 

buechler66
Barite | Level 11

Do you mean doing something like this?

 

		data _null_;
		   set QueryRules end=eof;
		   by Rule_Order;
		   file '/home/ssbuechl/input_whens.sas';

			array rule_array {1} $ 65 rule;
			array rule_order_array {1} rule_order; array rule_nm_array {1} $ 58 rule_nm;

		   if _n_=1 then put "     CASE";
				if first.rule_order = 1 then do;
					do h = 1 to dim(rule_nm_array);
							do j = 1 to dim(rule_array);
									rule_nm=rule_nm_array{h};
									rule=rule_array{j};
								   PUT '           WHEN ('RULE') THEN trim("'RULE_NM'")';
							end;
					end;
				end;

		   if eof then put "     CASE";
				if first.rule_order = 0 then do;
					do j = 1 to dim(rule_array);
						do k = 1 to dim(rule_order_array);
							rule=rule_array{j};
							rule_order=rule_order_array{k};
					   	   PUT '           WHEN ('RULE') THEN 'RULE_ORDER;
						end;
			   		end;
				end;

		   if eof then put "     ELSE .";
		   if eof then put "     END as Rule_Order";
		   if eof then put %nrstr("from QueryDataBase&ZIP5 as a ");
		   if eof then put %nrstr("inner join QueryDataBase&ZIP5 as b ");
		   if eof then put "on a.imb_code=b.imb_code ";
		   if eof then put "where a.source='A' and b.source='B'; ";
		   if eof then put "quit; ";
		run;
ballardw
Super User

Does your control table have all the names of the variables that will be in the "As variablename" for the target of the CASE statement?

 

Then have a separate array to contain those and loop over them with the body of the case inside.

do j=1 to dim(outputvariablesarray);

   put "Case";

   do i=1 to dim(rules);

      put "when" Rules [i] "then " value ;

   end;

   put "else . ";

   put "end as " outputvariablesarray [ j ] " , ";

end;

 

You'll likely need to have an end of j loop test about whether to have the last one end with a comma unless there is other code you aren't showing.

 

  

 

But lots of guessing without an example dataset.

buechler66
Barite | Level 11
Let me work on something like this. Ty very much.
Patrick
Opal | Level 21

Below a fully working code sample which demonstrates how you could approach this.

data QueryRules;
  Rule_Order=1;
  rule_nm='ACTUAL DELIVERY DATE MISSING IN IV';
  rule='A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL';
  output;
  rule_nm='ACTUAL DELIVERY DATE LATER IN IV';
  rule='A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE';
  output;
run;

filename header temp;
filename footer temp;
filename case1 temp;
filename case2 temp;

data _null_;
  set QueryRules end=eof;
  by Rule_Order;

  /*  file '/home/ssbuechl/input_whens.sas';*/
  array rule_nm_array {1} $ 58 rule_nm;
  array rule_array {1} $ 65 rule;

  /*** Header ***/
  if _n_=1 then
    do;
      file header;
      put 'proc sql;';
      put 'create table QueryData&ZIP5 as ';
      put 'select DISTINCT ';
      put "     a.actual_dlvry_date, ";
      put "     a.imb_code length = 31, ";
      put "     a.imb_dlvry_zip_5, ";
    end;

  /*** Cases ***/

  /* Case 1 */
  file case1 mod;
  if _n_=1 then
    do;
      put "     CASE";
    end;
  do h = 1 to dim(rule_nm_array);
    do j = 1 to dim(rule_array);
      rule_nm=rule_nm_array{h};
      rule=rule_array{j};
      PUT '           WHEN (' RULE ') THEN trim("' RULE_NM '")';
    end;
  end;
  if eof then
    do;
      put "     ELSE ' '";
      put "     END as RULE_NM_1,";
  end;

  /* Case 2 */
  file case2 mod;
  if _n_=1 then
    do;
      put "     CASE";
    end;
  do h = 1 to dim(rule_nm_array);
    do j = 1 to dim(rule_array);
      rule_nm=rule_nm_array{h};
      rule=rule_array{j};
      PUT '           WHEN (' RULE ') THEN trim("' RULE_NM '")';
    end;
  end;
  if eof then
    do;
      put "     ELSE ' '";
      put "     END as RULE_NM_2";
  end;

  /*** Footer ***/
  if eof then
    do;
      file footer;
      put %nrstr("from QueryDataBase&ZIP5 as a ");
      put %nrstr("inner join QueryDataBase&ZIP5 as b ");
      put "on a.imb_code=b.imb_code ";
      put "where a.source='A' and b.source='B'; ";
      put "quit; ";
    end;
run;


filename all temp;
/*filename all '/home/ssbuechl/input_whens.sas';*/
data _null_;
  file all;
  /* header */
  last=0;
  do until(last);
    infile header end=last;
    input;
    put _infile_;
  end;

  /* case 1 */
  last=0;
  do until(last);
    infile case1 end=last;
    input;
    put _infile_;
  end;

  /* case 2 */
  last=0;
  do until(last);
    infile case2 end=last;
    input;
    put _infile_;
  end;

  /* footer */
  last=0;
  do until(last);
    infile footer end=last;
    input;
    put _infile_;
  end;
  stop;
run;

data _null_;
  file print;
  infile all;
  input;
  put _infile_;
run;

filename header clear;
filename footer clear;
filename case1 clear;
filename case2 clear;
buechler66
Barite | Level 11

Wow! Thank you so much for this code example. With only slight alterations I was able to get exactly what I was after. And I learned a lot too! Thanks so much for taking the time to help. I really do appreciate it.

 

Slightly modified code:

data QueryRules;
  Rule_Order=1;
  rule_nm='ACTUAL DELIVERY DATE MISSING IN IV';
  rule='A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL';
  output;
  Rule_Order=1.5;
  rule_nm='ACTUAL DELIVERY DATE LATER IN IV';
  rule='A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE';
  output;
run;

filename header temp;
filename footer temp;
filename case1 temp;
filename case2 temp;

data _null_;
  set QueryRules end=eof;
  by Rule_Order;

  /*  file '/home/ssbuechl/input_whens.sas';*/
  array rule_nm_array {1} $ 58 rule_nm;
  array rule_array {1} $ 65 rule;
  array rule_order_array {1} rule_order;

  /*** Header ***/
  if _n_=1 then
    do;
      file header;
      put 'proc sql;';
      put 'create table QueryData&ZIP5 as ';
      put 'select DISTINCT ';
      put "     a.actual_dlvry_date, ";
      put "     a.imb_code length = 31, ";
      put "     a.imb_dlvry_zip_5, ";
    end;

  /*** Cases ***/

  /* Case 1 */
  file case1 mod;
  if _n_=1 then
    do;
      put "     CASE";
    end;
  do h = 1 to dim(rule_nm_array);
    do j = 1 to dim(rule_array);
      rule_nm=rule_nm_array{h};
      rule=rule_array{j};
      PUT '           WHEN (' RULE ') THEN trim("' RULE_NM '")';
    end;
  end;
  if eof then
    do;
      put "     ELSE ' '";
      put "     END as RULE_NM,";
  end;

  /* Case 2 */
  file case2 mod;
  if _n_=1 then
    do;
      put "     CASE";
    end;
  do x = 1 to dim(rule_order_array);
    do y = 1 to dim(rule_array);
      rule_nm=rule_nm_array{x};
      rule=rule_array{y};
      PUT '           WHEN (' RULE ') THEN trim("' RULE_ORDER '")';
    end;
  end;
  if eof then
    do;
      put "     ELSE ' '";
      put "     END as RULE_ORDER";
  end;

  /*** Footer ***/
  if eof then
    do;
      file footer;
      put %nrstr("from QueryDataBase&ZIP5 as a ");
      put %nrstr("inner join QueryDataBase&ZIP5 as b ");
      put "on a.imb_code=b.imb_code ";
      put "where a.source='A' and b.source='B'; ";
      put "quit; ";
    end;
run;


filename all temp;
/*filename all '/home/ssbuechl/input_whens.sas';*/
data _null_;
/*  file all;*/
FILE 'C:\TESTME.SAS';

  /* header */
  last=0;
  do until(last);
    infile header end=last;
    input;
    put _infile_;
  end;

  /* case 1 */
  last=0;
  do until(last);
    infile case1 end=last;
    input;
    put _infile_;
  end;

  /* case 2 */
  last=0;
  do until(last);
    infile case2 end=last;
    input;
    put _infile_;
  end;

  /* footer */
  last=0;
  do until(last);
    infile footer end=last;
    input;
    put _infile_;
  end;
  stop;
run;


/*data _null_;*/
/*  file print;*/
/*  infile all;*/
/*  input;*/
/*  put _infile_;*/
/*run;*/

filename header clear;
filename footer clear;
filename case1 clear;
filename case2 clear;

 

Output TestMe.sas:

proc sql;
create table QueryData&ZIP5 as 
select DISTINCT 
     a.actual_dlvry_date, 
     a.imb_code length = 31, 
     a.imb_dlvry_zip_5, 
     CASE
           WHEN (A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL ) THEN trim("ACTUAL DELIVERY DATE MISSING IN IV ")
           WHEN (A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE ) THEN trim("ACTUAL DELIVERY DATE LATER IN IV ")
     ELSE ' '
     END as RULE_NM,
     CASE
           WHEN (A.ACTUAL_DLVRY_DATE IS NULL AND B.ACTUAL_DLVRY_DATE IS NOT NULL ) THEN trim("1 ")
           WHEN (A.ACTUAL_DLVRY_DATE > B.ACTUAL_DLVRY_DATE ) THEN trim("1.5 ")
     ELSE ' '
     END as RULE_ORDER
from QueryDataBase&ZIP5 as a 
inner join QueryDataBase&ZIP5 as b 
on a.imb_code=b.imb_code 
where a.source='A' and b.source='B'; 
quit; 

sas-innovate-2024.png

Today is the last day to save with the early bird rate! Register today for just $695 - $100 off the standard rate.

 

Plus, pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1283 views
  • 3 likes
  • 3 in conversation