MegJOH
Fluorite | Level 6

I'm trying to use the Query Builder to write a percentile (PCTL) function pseudo-manually, where I want to use a column of data from my data set in the expression (since the PCTL function requires the raw data from which you want to calculate the percentile). [Note: I'm finding the 25th percentile in this example, which I know can be done in other ways in EG, but I am wondering about this in general for percentiles not "offered" by Summary Statistics in the Tasks > Describe menu. It also applies to using the raw data in the Query Builder in general.]

To do this, I do the following:

  • Start at the Input Data window
  • Select Query Builder
  • Drag my variable of interest (column called Length) over to the Select Data pane in the Query Builder (shown below)

query01.jpg

  • Select the "Add a New Computed Column" button (that looks like a calculator)
  • Select radio button next to "Advanced expression" and click Next
  • I manually type in the "Enter an expression:" pane: PCTL(25,
  • Now I need to get my raw data (n = 46) into the PCTL function, separated by commas, so in the lower left pane, I double-click "Selected Columns" to see my variable Length drop down here (see image below)
  • Then, I see this message in the pane to the right: "The maximum number of rows to process for retrieving distinct values may be limited"
  • When I click "Get Values," sure enough, it only pulls the distinct values from my variable Length, so when I then "Select Values" to insert them in the PCTL function, I don't have all my data, only the distinct values (and no, I did not select the "Select distinct rows only" box in the original Query Builder window). In this case, I have n = 40 distinct values, so those are the only values that get inserted into the PCTL function.

Query.jpg

I can't have this if I want to calculate something like a percentile - I need all of the data, not just distinct values! Why does it do this? Is there a way to change this?

Any help would be greatly (greatly) appreciated.

Reeza
Super User

I think you're using it in a way that wasn't intended. That list is intended for WHERE or IF clauses, so having a unique list makes sense, and I doubt there's a way to change it.

I wouldn't recommend this method of calculating percentiles; it would be difficult for anyone else to maintain, explain, or follow.

MegJOH
Fluorite | Level 6

It seems unreasonable to ask Query Builder to insert raw data as an argument in a function?

Reeza
Super User

Not intended and unconventional. Generally, the purpose is to reference data sets and variables, not to include the raw data.

If you want to go this route, add a step that selects all the values into a macro variable and use that in your function.
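A minimal sketch of that macro-variable route, for anyone who wants to try it. The data set name WORK.MYDATA and the macro variable name LEN_VALUES are hypothetical; only the column name Length comes from the question. The FORMAT=BEST32. on the SELECT is my own addition, on the assumption that you want to avoid the default rounding when numeric values are written into a macro variable.

```sas
/* Hypothetical input: WORK.MYDATA with a numeric column Length (n = 46). */
proc sql noprint;
   /* Collect every row's value (not just distinct ones) into one
      comma-separated macro variable. */
   select Length format=best32.
      into :len_values separated by ','
      from work.mydata;
quit;

data _null_;
   /* PCTL takes the percentage first, then the raw values as arguments. */
   p25 = pctl(25, &len_values);
   put 'NOTE: 25th percentile of Length = ' p25;
run;
```

Because the SELECT has no DISTINCT, repeated values stay in the list, which is the behavior the "Get Values" button does not give you. For large data sets the macro variable can run into length limits, so this is really only practical for small samples like this one.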

MegJOH
Fluorite | Level 6

Okay. Thanks for letting me know. I'm not an EG user (I code in SAS), but am trying to use it for my intro stats class and it just seems intuitive to me that when you pull a variable over into the Query Builder that it should use that data as-is, or at least allow that option (especially when all functions, including those that require raw data, such as PCTL, are included in its menu in the Advanced Expression window), but it appears I don't understand the purpose of the Query Builder holistically.

I'm going to skip introducing them to Query Builder at this point and just wait until we use functions that don't require raw data (like finding p-values based on a test statistic from a known distribution).

Reeza
Super User

Query Builder essentially builds SQL code.

I'm sure you know, but some functions in SAS, such as PCTL and MEDIAN, don't work down a column (variable) in SAS SQL; they work across the arguments you pass them within a single row.

You could transpose the data (Transpose step) and then use the values that way, though it seems like more work than writing some SAS code, e.g. PROC UNIVARIATE.
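Both options can be sketched roughly as follows. The data set names (WORK.MYDATA, WORK.WIDE, WORK.P25_ROW, WORK.P25_COL) and the VAL and P prefixes are hypothetical; only the column name Length comes from the question.

```sas
/* Option 1: transpose so the 46 values sit in one row (val1, val2, ...). */
proc transpose data=work.mydata out=work.wide prefix=val;
   var Length;
run;

data work.p25_row;
   set work.wide;
   /* PCTL now sees every value as a separate argument in the same row. */
   p25 = pctl(25, of val:);
   keep p25;
run;

/* Option 2: let PROC UNIVARIATE compute the percentile down the column. */
proc univariate data=work.mydata noprint;
   var Length;
   /* PCTLPTS= accepts arbitrary percentiles; the result is named P25. */
   output out=work.p25_col pctlpts=25 pctlpre=P;
run;
```

PCTLPTS= is not limited to the quartiles, so the second option also covers percentiles that the Summary Statistics task does not offer. As far as I know, PCTL and PROC UNIVARIATE both default to percentile definition 5, so the two results should agree unless you change one of them.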

MegJOH
Fluorite | Level 6

That explains it. I didn't realize it was building SQL code per se. Thanks.
