BookmarkSubscribeRSS Feed
sra2786
Calcite | Level 5

When I run the code below, I get the log Note  "The execution of this query involves performing one or more Cartesian product joins that can not be optimized." on the LEFT JOIN code highlighted in Bold.  Is there a better way to Join the dstDates and Sales tables?

 

My goal is to use the Daylight Savings Time when applicable.  The salesDate is in UTC.

 

data dstDates;
format date1 datetime18.0 date2 datetime18.0;
input date1 datetime18.0
date2 datetime18.0
;

cards;
10MAR2019:02:00:00 03NOV2019:02:00:00
;

data sales;
format localStandard datetime18.0 localDaylight datetime18.0;
input salesDate datetime18.0
;
localStandard = tzoneu2s(salesDate,'EST');
localDaylight = tzoneu2s(salesDate, 'EDT');
cards;
08MAR2019:02:00:00
12MAR2019:02:00:00
;
proc print data=sales;
title "Sales";
format salesDate datetime18.0;
PROC SQL;
CREATE TABLE work.final1 as
SELECT sales.salesDate,
localStandard,
localDaylight,
CASE WHEN date1 IS NOT NULL then localDaylight else localStandard END AS newDate format datetime18.0
FROM sales sales
LEFT JOIN dstDates dst ON (sales.localStandard >= dst.date1 AND
sales.localDaylight <= dst.date2);

proc print data=final1;
format salesDate datetime18.0;

 

6 REPLIES 6
koyelghosh
Lapis Lazuli | Level 10

Not sure if you wanted the output as attached in the image below.

Output as I seeOutput as I see

 

If the above is what you wanted, then I could get it done because you were missing many "RUN" and "QUIT" statements after DATA/PROC steps.

The corrected code is as below.

 

data dstDates;
	format date1 datetime18.0 date2 datetime18.0;
	input date1 datetime18.0 date2 datetime18.0;
	cards;
10MAR2019:02:00:00 03NOV2019:02:00:00
;
run;

data sales;
	format localStandard datetime18.0 localDaylight datetime18.0;
	input salesDate datetime18.0;
	localStandard=tzoneu2s(salesDate, 'EST');
	localDaylight=tzoneu2s(salesDate, 'EDT');
	cards;
08MAR2019:02:00:00
12MAR2019:02:00:00
;
run;

proc print data=sales;
	title "Sales";
	format salesDate datetime18.0;
run;

PROC SQL;
	CREATE TABLE work.final1 as SELECT sales.salesDate, localStandard, 
		localDaylight, CASE WHEN date1 IS NOT NULL then localDaylight else 
		localStandard END AS newDate format datetime18.0 FROM sales sales LEFT JOIN 
		dstDates dst ON (sales.localStandard >=dst.date1 AND 
		sales.localDaylight <=dst.date2);
quit;

proc print data=final1;
	format salesDate datetime18.0;
run;

Please let me know if this helps.

koyelghosh
Lapis Lazuli | Level 10

Apologies .. I did not solve your issue, in the above post!! Your issue is the log message (which I am getting as well) and not the fact that the program is not running.

 

I think the problem is in the statement following LEFT JOIN. You want to do LEFT JOIN of sales with dstDates but in the ON clause you are selecting both the common columns from sales. Also you are saying FROM sales sales.

ON (sales.localStandard >=dst.date1 AND 
		sales.localDaylight <=dst.date2)

I am sorry, I am not a SAS expert and thus can not understand what you want to achieve there but doing an alteration helped me get rid of the log. Below is the code and the output remains same. However this may not be what you want. There is some discussion on this error here.

 

data dstDates;
	format date1 datetime18.0 date2 datetime18.0;
	input date1 datetime18.0 date2 datetime18.0;
	cards;
10MAR2019:02:00:00 03NOV2019:02:00:00
;
run;

data sales;
	format localStandard datetime18.0 localDaylight datetime18.0;
	input salesDate datetime18.0;
	localStandard=tzoneu2s(salesDate, 'EST');
	localDaylight=tzoneu2s(salesDate, 'EDT');
	cards;
08MAR2019:02:00:00
12MAR2019:02:00:00
;
run;

proc print data=sales;
	title "Sales";
	format salesDate datetime18.0;
run;

PROC SQL;
	CREATE TABLE work.final1 as 
		SELECT 
		sales.salesDate, 
		localStandard, 
		localDaylight, 
		CASE WHEN date1 IS NOT NULL then localDaylight 
		else localStandard END AS newDate format datetime18.0 
			FROM sales 
				LEFT JOIN 
				dstDates dst ON (sales.localStandard =dst.date1 AND sales.localDaylight =dst.date2);
quit;

proc print data=final1;
	format salesDate datetime18.0;
run;
Tom
Super User Tom
Super User

Why did you insert extra RUN statements after the first two DATA steps?  That can confuse novice users as there were two recent questions where users tried to put data step statements after the in-line data.

koyelghosh
Lapis Lazuli | Level 10

@Tom I inserted RUN statements at the end of each DATA step because I used to think that it is a markup of a block that is run at once. I used to think that it is a good practice rather (as a beginner/novice) to close your beginning DATA and PROC step with its corresponding RUN statement but your post has made me think otherwise. Will you please point me to the recent discussion (that you mentioned in your post and the one that created confusion)? I will get to learn something new.

 

In the meanwhile you are right. While the output does not change, the LOG says differently. The Memory consumption and CPU time remains "almost" identical but other details are varying.

 

Below is the pertinent log if you omit the first two run statements. It will be followed by another log, if you keep the first two run statements.

 

 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              674.25k
       OS Memory           28072.00k
       Timestamp           24/06/2019 12:06:43 AM
       Step Count                        49  Switch Count  2
       Page Faults                       0
       Page Reclaims                     90
       Page Swaps                        0
       Voluntary Context Switches        11
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264

 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.01 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              2015.18k
       OS Memory           29096.00k
       Timestamp           24/06/2019 12:06:43 AM
       Step Count                        50  Switch Count  1
       Page Faults                       0
       Page Reclaims                     226
       Page Swaps                        0
       Voluntary Context Switches        7
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           8

 NOTE: PROCEDURE SQL used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              5867.87k
       OS Memory           34480.00k
       Timestamp           24/06/2019 12:06:43 AM
       Step Count                        51  Switch Count  2
       Page Faults                       0
       Page Reclaims                     345
       Page Swaps                        0
       Voluntary Context Switches        15
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           280

 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.01 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              717.09k
       OS Memory           29608.00k
       Timestamp           24/06/2019 12:06:43 AM
       Step Count                        52  Switch Count  0
       Page Faults                       0
       Page Reclaims                     79
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0

and this is the log if you keep the first two run statements.

 

 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              669.25k
       OS Memory           28328.00k
       Timestamp           24/06/2019 12:10:35 AM
       Step Count                        58  Switch Count  2
       Page Faults                       0
       Page Reclaims                     151
       Page Swaps                        0
       Voluntary Context Switches        13
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264

 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.01 seconds
       memory              675.25k
       OS Memory           28328.00k
       Timestamp           24/06/2019 12:10:35 AM
       Step Count                        59  Switch Count  2
       Page Faults                       0
       Page Reclaims                     90
       Page Swaps                        0
       Voluntary Context Switches        10
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264

 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.01 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              2015.65k
       OS Memory           29096.00k
       Timestamp           24/06/2019 12:10:35 AM
       Step Count                        60  Switch Count  1
       Page Faults                       0
       Page Reclaims                     160
       Page Swaps                        0
       Voluntary Context Switches        6
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           16

 NOTE: PROCEDURE SQL used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              5867.71k
       OS Memory           34480.00k
       Timestamp           24/06/2019 12:10:35 AM
       Step Count                        61  Switch Count  2
       Page Faults                       0
       Page Reclaims                     340
       Page Swaps                        0
       Voluntary Context Switches        14
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           280

 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.01 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              696.18k
       OS Memory           29608.00k
       Timestamp           24/06/2019 12:10:35 AM
       Step Count                        62  Switch Count  0
       Page Faults                       0
       Page Reclaims                     74
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0

Any thoughts on what is going on? Please let me know.

Thank you.

Tom
Super User Tom
Super User

Timing has nothing to do with it. It takes the compiler very little time to ignore the non-step generated by extra RUN statements.

Here is a link to a question by a users confused by the presence of the superfluous RUN statement after a DATA step.

https://stackoverflow.com/questions/56191122/why-is-month-function-is-not-working-here/56191425#5619...

 

The idea behind always ending your steps is to make it easier for the humans reading your code and not the compiler.  Only in an interactive session will leaving out the termination of the steps matter because if you leave off the last termination the compiler will not run the step since it is waiting for you to finish sending it the code for the step. 

 

The idea is to always explicitly end the step rather than just letting SAS auto detect that it has finished being specified because a new step is being requested by the presence of the next DATA or PROC statement. 

 

But what code it is that ends a step depends on what step you are running.  If you are running a data step that uses in-line data then the line with the semi-colon (or four semi-colons) that marks the end of the in-line data marks the end of the data step.  For other DATA steps the RUN statement marks the end.  For some procs a RUN statement marks the end, but for others a QUIT is required because it can "run" multiple groups of statements. 

koyelghosh
Lapis Lazuli | Level 10
Thank you. Makes sense. Learnt something new about SAS.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 13446 views
  • 1 like
  • 3 in conversation