BookmarkSubscribeRSS Feed
JamieTee
Fluorite | Level 6

I have a few datasets for which I have many various DATE variables. I'm looking to get the data distribution on each one. I've seen past posts which describe how to get min, max, nmissing, but I cannot seem to find a way to get the Q1 and Q3 marks (and if possible P1 and P99).

I need to use PROC SQL because I am accessing data through some data servers which operate through PROC SQL.

Thank you kindly!

4 REPLIES 4
Reeza
Super User

If you're going through SQL and must use SQL are you using Explicit Pass through? Otherwise you can use the standard PROC MEANS approach. 

 

If you are using explicit pass through, then your code needs to be SQL specific and you need to use the specific functions that align with your SQL variant. 

 

How do you know which variables are dates that you want to analyze? Based on variable type or a naming convention?

 


@JamieTee wrote:

I have a few datasets for which I have many various DATE variables. I'm looking to get the data distribution on each one. I've seen past posts which describe how to get min, max, nmissing, but I cannot seem to find a way to get the Q1 and Q3 marks (and if possible P1 and P99).

I need to use PROC SQL because I am accessing data through some data servers which operate through PROC SQL.

Thank you kindly!


 

JamieTee
Fluorite | Level 6

Hi @Reeza thank you for your reply!

 

I looked up explicit pass through (https://communities.sas.com/t5/SAS-Programming/Implicit-vs-Explicit-SQL-Pass-through-SQL-Query-in-SA...), and it looks like that's exactly what I'm using. 

 

I would be identifying the DATE variables based on their format, which is generally DATE9. - would it be possible to search by the 'Type' of the variable instead (in case something is in a different DATE format)? Thank you kindly for your help. 

Reeza
Super User

How I would do this is first to identify all my date variables. 

All DB have dictionary tables that indicate the variable name and type. You can query that to build your list of variables you need to apply it to. 

 

Second is to figure out what the function is you need for explicit pass through based on the SQL variant. Most have it, but they're never efficient at those calculations by the way.

 

If your data is in the smaller range (few million) I'd recommend downloading the data and then using a full SAS approach. That'll save you programming time but it'll take slightly longer to run - I suspect you won't make up the run time difference in this process but you'll learn how to do it I suppose. 

 

PGStats
Opal | Level 21

If the external DBMS doesn't provide the tools to calculate the quantiles, it could surely give you the record counts for each date. You could thus query for the dates and their frequencies and use proc univariate with a freq statement to get the quantiles. I think you would have to do this separately for each date variable.

 

hth

PG

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2330 views
  • 3 likes
  • 3 in conversation