BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tugtrog
Fluorite | Level 6

I am trying to write a query that will select MIN and MAX timestamp on a single day. I am able to get a vast majority of my data to pull, but there are issues on a few of the rows where I can pull the MIN date but when I change it to MAX I get a page break as the results. There are multiple timestamps per day, but if not it still should select the same timestamp for MIN and MAX.

 

My code is extremely long or I would post it.

 

I am using a having statement to select the min and max similar to this:

 

 

proc sql;

SELECT distinct t2.no,
					T1._ID,
					t2.id, 
					t3.First_Name
					t3.last_name
					t3.TITLE, 
					t3.02_LOCATION
					t3.Level_01, 
					t3.Level_02, 
					T1.M_TS
					T1.O_TS
					t3.employee
					T1.O_A
				FROM TEST.DAY T2
					INNER JOIN B.LE T1 ON (T2.ID = T1._ID and (PUT(DATEPART(T2.ADate),date9.) = PUT(DATEPART(T1.M_TS),date9.)))
					INNER JOIN B.E T3 ON (T1.O_A = T3.EMPLOYEE AND (t3.O_TS > '15Dec9999:0:0:0'dt))
					WHERE T2.NO = '11111-11111'
						GROUP BY T1._ID
							HAVING M_TS = MAX(T1.M_TS) and 
								O_TS = MIN(T1.O_TS) AND
								FIRST_NAME = scan(t3.first_name,1,' ') AND 
								LAST_NAME = SCAN(T3.LAST_NAME,1,' ')
;
run;

 

 

I change nothing in the code but the MAX to MIN and the code works. Does anyone have an idea of what could cause this issue?

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

MH 2 cents thought:

 

If you are at an organization that uses SAS at enterprise level on a commercial level, I would rather prefer to deal a lot of this in piecemeal preferably taking advantage with a datastep approach unless something is preferred by egoistic imbecile bosses who think they have achieved a great deal after an explicit sql connect pass through  writing tons of variable names with the potential  leading to typos and plus the risk of "remerge"  that may cause so much duplication and other unwarranted intricacies causing inaccuracy and integrity issues.

 

Taking advantage of the beautiful PDV block with exquisite block of addresses that holds the operands reading the descriptor portion giving us the breathing space to play with variables of interest for create, insert,  update, delete alone is priceless while the other variables can just sleep and still be neatly written to the output buffer.

 

Of course, I love proc sql and have/will use when it's not unwieldy and that involves cleaner column processing involving less lexicon. Sure Proc Sql also has to have a memory address block and even the great guru Paul Dorfman has mentioned the APP functions can be put to effect in proc sql which might of my personal interest at some point in the future. Nevertheless, long select clauses with many nested in it will certainly cause someone a heartache for sure. HTH

 

View solution in original post

21 REPLIES 21
tugtrog
Fluorite | Level 6
I posted the code, but I am not getting an error I am just having nulls returned in the columns when I put it in a created table, and page brake when looking at results.
novinosrin
Tourmaline | Level 20

Got it.

HAVING M_TS = MAX(T1.M_TS) and 
								O_TS = MIN(T1.O_TS) AND
								FIRST_NAME = scan(t3.first_name,1,' ') AND 
								LAST_NAME = SCAN(T3.LAST_NAME,1,' ')
;

Nothing wrong with the above syntactically. From a business logic or data point of view can't tell. However,  simple conclusion is that no records satisfy all the conditions specified in "having" filter.

 

tugtrog
Fluorite | Level 6
Thanks a lot for your response and time. I think it might be an issue in the data, but I figured I would check here to see if it was something else I was missing.
novinosrin
Tourmaline | Level 20

MH 2 cents thought:

 

If you are at an organization that uses SAS at enterprise level on a commercial level, I would rather prefer to deal a lot of this in piecemeal preferably taking advantage with a datastep approach unless something is preferred by egoistic imbecile bosses who think they have achieved a great deal after an explicit sql connect pass through  writing tons of variable names with the potential  leading to typos and plus the risk of "remerge"  that may cause so much duplication and other unwarranted intricacies causing inaccuracy and integrity issues.

 

Taking advantage of the beautiful PDV block with exquisite block of addresses that holds the operands reading the descriptor portion giving us the breathing space to play with variables of interest for create, insert,  update, delete alone is priceless while the other variables can just sleep and still be neatly written to the output buffer.

 

Of course, I love proc sql and have/will use when it's not unwieldy and that involves cleaner column processing involving less lexicon. Sure Proc Sql also has to have a memory address block and even the great guru Paul Dorfman has mentioned the APP functions can be put to effect in proc sql which might of my personal interest at some point in the future. Nevertheless, long select clauses with many nested in it will certainly cause someone a heartache for sure. HTH

 

kiranv_
Rhodochrosite | Level 12

@novinosrin I love your answers and learn from you, i agree that  Data step gives more flexibility in terms of Data processing  but

 

1. Explicit sql has significant impact in run times and they are very important when they are lots of records.

2. Other than Proc sql , no other SQL implementation(as far as I know) allow remerge but multiple passes

 

novinosrin
Tourmaline | Level 20

@kiranv_ --"1. Explicit sql has significant impact in run times and they are very important when they are lots of records."

 

Not necessarily.  I will prove that to you in time. Unfortunately not at 6:10 chicago time while on a hurry to catch red line CTA to get back home.

 

I agree to 2.  

 

Finally Thank you for the kind note. 

kiranv_
Rhodochrosite | Level 12

sure. I worked  on code to process close to 200 millions records with close to 3000 lines of SAS code to Teradata sql, difference was from 20 min to 18 hours. major time issue is because of  I/0 

novinosrin
Tourmaline | Level 20
That depends upon the quality of SAS user and not the product/language plus taking into account he objective/need.and optimum approaches.
Reeza
Super User

@novinosrin wrote:
That depends upon the quality of SAS user and not the product/language plus taking into account he objective/need.and optimum approaches.

Usually it has to do with running on a server versus running on a desktop is my guess. 

It's a good workaround when you can't afford a SAS server, but need the power of a server. You are limited by the type of work though, because it's mostly standard data manipulation. Until SQL introduced PIVOT (2008ish) and some of the more statistical techniques it was more of an issue.

novinosrin
Tourmaline | Level 20

Thank you @Reeza. Well noted, The discussion might next slip into performance of  proc sql vs datastep(libname) at engine level as the latter indeed has a higher overhead measuring direct extracts/in general column processing as opposed to row, fewer joins/set operators/indexes (more flexible in proc sql than datastep or even proc datasets for obvious reasons. Reference; complete guide to indexes - by author Micheal raithel) utmost.

 

Datastep is indeed powerful for various tasks that can be accomplished. However, saying that how extensive, variety/complexity it can get pretty much seems unlimited especially  to the readers eye on this very forum when demonstrated by some whom we look up to as role models.( don't wanna keep repeating the ones I adore lol)

 

 

The matter of typing and maintaining convenience, control and the kind of objective./business requirement beyond the set of functionality described in paragraph one by proc sql and the control of handling variables to the tune of a  SAS user be row level processing or column level processing(I must emphasize the required proficiency hash, array(multi) and dow etc) is what makes SAS the classy gem, a kind of its own and yet one can be obliterate to the descriptors(meta portion) i.e not having to worry about the variables that's that not required for processing which doesn't hinder. 

 

On the other hand in proc sql, if fewer select clauses with small number of vars  is fine, however long clauses with a huge list of variables, and joins made on multiple variables(with several AND operations) , aliases, ambiguous references and many more unnecessary nits to take care of. And on top of the that, Cartesian is another sweetheart to beware. 

 

So to conclude, besides the server and other things, my view is that different objectives may warrant efficient approaches that would test the individual super user to his/her merits  in coming up with the most optimum quality solution. 

 

Best Regards!

 

PS IMHO,  if SAS didn't provide the scope of competitiveness to pursue further far and wide, the charm and joy of coding wouldn't exist. 🙂 Lastly, I use sql when that happens to be  the most laziest option to type(code)  yet comparable to other options.

 

 

novinosrin
Tourmaline | Level 20

Hey @tugtrog  Sorry for the bother and I hope your day is going well. 

 

if you have a moment, take  a look at the below  new thread posted by a new member 🙂 

 

https://communities.sas.com/t5/New-SAS-User/Merging-Data-in-Proc-SQL/m-p/507812

 

What a deja vu  🙂  So that means our discussion has actually made sense 🙂

 

 

 

 

tugtrog
Fluorite | Level 6

Hey @novinosrin, you are not bothering me at all. I appreciate you sharing this with me, I am actually working on using a data step and PDV right now. Our discussion really did help me, and opened my mind to a whole new idea on how to quickly execute some massive queries on massive data sets that I don't know if I would have been exposed to otherwise. 

 

Thanks again! Have a great weekend .

 

PS feel free to share anything with me! 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 21 replies
  • 2732 views
  • 16 likes
  • 5 in conversation