I think the sql procedure should return the same number of observations the data step below, but it doesn't. Does anyone know why?
28 proc sql;
29 create table check1 as select *
30 from dat.PECOS_dat(firstobs=2)
31 where profit_status3 ne '';
NOTE: Table WORK.CHECK1 created, with 1 rows and 19 columns.
32 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
33
34 data check2;
35 set dat.PECOS_dat(firstobs=2);
36 if profit_Status3 ne '';
37 run;
NOTE: There were 7416 observations read from the data set DAT.PECOS_DAT.
NOTE: The data set WORK.CHECK2 has 2 observations and 19 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
Hello @Batman,
It's not the difference between PROC SQL and DATA step, but the difference between WHERE and IF: The WHERE filter (be it in a PROC SQL or a DATA step) is applied before the firstobs= dataset option, whereas the subsetting IF is applied after the dataset option. So, in the first case firstobs=2 removes one of the observations satisfying the WHERE condition profit_status3 ne '' from the result, unless there is none at all. The observation skipped with firstobs=2 in the second case might not satisfy the IF condition anyway.
Example:
data have;
input x profit_status3 $;
cards;
1 .
2 b
3 c
;
What kind of library is dat? SAS or a database? If in doubt, run
libname dat list;
Hello @Batman,
It's not the difference between PROC SQL and DATA step, but the difference between WHERE and IF: The WHERE filter (be it in a PROC SQL or a DATA step) is applied before the firstobs= dataset option, whereas the subsetting IF is applied after the dataset option. So, in the first case firstobs=2 removes one of the observations satisfying the WHERE condition profit_status3 ne '' from the result, unless there is none at all. The observation skipped with firstobs=2 in the second case might not satisfy the IF condition anyway.
Example:
data have;
input x profit_status3 $;
cards;
1 .
2 b
3 c
;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.