Hi:
The DROP statement applies to the OUTPUT data sets. The error message says it all:
ERROR: Variable Department is not on file WORK.SALES.
In the first program, the WHERE DATA set option is being applied to the OUTPUT dataset. The DROP statement impacts the OUTPUT dataset. Even though Department is in the INPUT dataset -- you ask for the WHERE to be applied to the OUTPUT dataset. And you gave explicit instructions (DROP) to keep Department out of the output dataset. Essentially, you ask for the WHERE condition to be applied using a variable that is just NOT in the OUTPUT dataset.
In the second program, Department is DROPPED from the output dataset. But the IF statement is acting on Department in the Program Data Vector (PDV) as it comes from the INPUT dataset. So the IF statement is working with the INPUT dataset.
You can DROP Product from the OUTPUT dataset and still use it from the INPUT dataset with this code. The WHERE on the INPUT dataset limits the input rows to ONLY the BED or SOFA rows:
[pre]
data bed1 sofa1;
set sashelp.prdsale(where=(Product in ('BED', 'SOFA')));
if product = 'BED' then output bed1;
else if product = 'SOFA' then output sofa1;
drop Product;
run;
[/pre]
However, you might want to benchmark the different techniques using your data. As you can see from my log (Windows), the second program with the IF statements took less CPU time than the first program (with the WHERE and the IF statements). (The SQL approach is shown in the SAS Log, too, for comparison purposes.)
cynthia
[pre]
436 data bed1 sofa1;
437 set sashelp.prdsale(where=(Product in ('BED', 'SOFA')));
438 if product = 'BED' then output bed1;
439 else if product = 'SOFA' then output sofa1;
440 drop Product;
441 run;
NOTE: There were 576 observations read from the data set SASHELP.PRDSALE.
WHERE Product in ('BED', 'SOFA');
NOTE: The data set WORK.BED1 has 288 observations and 9 variables.
NOTE: The data set WORK.SOFA1 has 288 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
442
443 data bed2 sofa2;
444 set sashelp.prdsale;
445 if Product='BED' then output bed2;
446 else if Product='SOFA' then output sofa2;
447 drop Product;
448 run;
NOTE: There were 1440 observations read from the data set SASHELP.PRDSALE.
NOTE: The data set WORK.BED2 has 288 observations and 9 variables.
NOTE: The data set WORK.SOFA2 has 288 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
449 proc sql;
450 create table bed3 as
451 select country, region, division, prodtype,
452 quarter, year, month, actual, predict
453 from sashelp.prdsale
454 where Product = 'BED';
NOTE: Table WORK.BED3 created, with 288 rows and 9 columns.
455
456 create table sofa3 as
457 select country, region, division, prodtype,
458 quarter, year, month, actual, predict
459 from sashelp.prdsale
460 where Product = 'SOFA';
NOTE: Table WORK.SOFA3 created, with 288 rows and 9 columns.
461 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.15 seconds
cpu time 0.03 seconds
[/pre]