Did you miss the Ask the Expert session on Using SQL within SAS? Not to worry, you can catch it on-demand at your leisure.
The session covers basic SQL concepts, including:
I have added the most recent Q&A below.
Q: Which SQL ANSI standard is supported?
A: PROC SQL follows most of the guidelines set by the American National Standards Institute (ANSI) in its implementation of SQL. However, it is not fully compliant with the current ANSI standard for SQL. Following is a link that will describe the differences:
Q: How do you select only the first 50 rows?
A: One way in which you could do this is to use the OBS= data set option. A portion of the syntax follows:
from health.claims_sample (obs=50)
Q: Since there is no CREATE TABLE statement in these examples, I'm assuming that it's not creating a physical SAS dataset, correct? Just a display in the output window?
A: Yes, in many of our examples the results set is a report. We added the CREATE TABLE statement in order to create an output table from the results set.
Q: What if CHGAMT is 0 in the calculation to create percentage (EERESP/CHGAMT)?
A: The resulting value is missing, as this would represent division by zero.
Q: Can calculated values be referenced by alias or does the calculation have to be specified again for additional uses. IE: PERCENTAGE is calculated, can I do CASE PERCENTAGE... or do I have to do CASE EERESP/CHGAMT....?
A: You can use the CALCULATED keyword in order to specify to SAS that you are using a calculated column.
Q: Why did the format=percent6.1 did not always show 1 decimal position in the output?
A: We could increase the column width and have all values accurately represented. The format requires a print position for all characters as well as the possibility of parentheses for negative numbers. For instance, format=percent10.2 would accurately show all decimal positions in our example. I have changed thee example to be more clear for future reference.
Q: Are xxx.DAT files supported?
A: PROC SQL does not read raw data files. SQL can read existing SAS data sets (tables) or relational databases if the appropriate SAS/Access engine is licensed.
Q: Can I download a free version of SAS and setup personally?
A: Yes, there is a free version of the SAS University for personal use. You can find it at this location https://www.sas.com/en_us/software/university-edition/download-software.html
Q: Is the CALCULATED keyword something that works only within the PROC SQL?
A: The CALCULATED keyword is a SAS enhancement to the ANSI standard.
Q: So to be clear, an outer join is basically inner join with additional parts of the table? (e.g. if you do an outer join you are by default including an inner join?)
A: The outer join does include the matching values. However, with outer joins, you are limited to 2 tables, whereas the same limit does not apply to inner joins.
Q: For joining two tables, which technique is preferable - a MERGE in the data step or a join in PROC SQL where both the tables are large in size?
A: Since the two processes are different, you must decide what the desired output should look like. PROC SQL does a Cartesian product on the matching values, and the data step does not. After you decide which process yields the desired results, then benchmarking is recommended to see which method is more efficient.
Q: The option NUMBER appears on the PROC SQL code as see in the following. What does it do?
PROC SQL NUMBER;
A: The NUMBER option provides line numbers in the output for ease of reading the results set.
Q: is there a link that shows all of the different statements for each type of join?
A: The following is a good place to start:
Q: I wonder the performance between SQL join and DATA step merge.
A: It is recommended that you perform benchmarking for production jobs to evaluate performance between PROC SQL and the data step.
Q: Can we use PROC SQL in place of first. and last. logic?
A: First. and Last. programming is best implemented in Data step programming.
Q: In the comparison between SQL and the data step, you mentioned "no equality required for SQL". I am not quite sure I understand.
A: SQL does not require that data be joined on an equality, for instance, a.providerID=b.providerID.
Q: same question regarding "coalesce". Is COALESCE SQL ANSI standard or is it a SAS enhancement in PROC SQL?
A: COALESCE is found in some implementations of the ANSI standard. In others the equivalent is IFNULL.
Q: Does INTO make the macro variable global to allow you to use it in the other PROC SQL steps, or was there something else happening?
A: In our example, the macro variable was created as a global variable.
Q: Can we use proc sql in SAS student edition?
A: Yes the university edition is for private use and can be downloaded from this link
Q: Doe all functions like INDEX,PIVOT and CAST work in PROC SQL?
Please see the following for comparisons between PROC SQL and the ANSI standard.
Q: Do we always have to use QUIT; for every proc SQL step?
A: No. PROC SQL will see QUIT; or the beginning of another step as a step boundary.
SAS SQL does not currently support windowing functions. Please see the following post for more information:
Pass-through queries can be used in SAS to send DBMS-specific statements to a DBMS for processing. The syntax for the connection string varies based upon which database you are accessing. The following introduces SQL pass-through and provides some examples relevant for connecting to Oracle.
There is also an Ask the Expert session that addresses this topic. It is entitled ‘The What, When, and Why of SAS/ACCESS Interfaces’:
These are not equivalent. A view is a type of SAS data set that stores a set of instructions for surfacing the data when the view is used. Views can be created with SQL or the data step. A dashboard provides views about a particular business process or metric, much more like a report. Please see the following for more information about SQL views:
For our introductory session, we focus on reading SAS data only. Please see the following for an introduction to reading Hive or Impala tables.
The SELECT statement in PROC SQL is appropriate for reading SAS data. It can also connect to a database provided that you have the appropriate SAS/Access engine licensed.
The QUIT statement is not required when the next statement is PROC SQL. It is included in the demonstration to allow the presenter a modular approach that is flexible for demonstration purposes.
Yes, on the FROM clause and following the name of the data set, you can use the DROP= data set option. Please see the following for more information:
This presentation does not contain a discussion of subqueries. Information on subqueries can be found here:
The 6 indicates that we are asking for 6 total columns to display the value. Anything that requires a position would be included in that number, including decimals and percent signs.
No, you are not bound by SAS naming conventions. Labels are designed to be descriptive and can include blanks and other special characters.
An introduction to this topic is addressed here:
A WHERE clause subsets one row at a time. The HAVING clause is used to subset a group of data, possibly after performing some sort of summarization.
The ELSE is used to specify a value that will be used if none of the previous conditions in a CASE expression are met.
Yes, the FROM clause is executed 1st.
You can view the log as usual. No special operations are required for PROC SQL.
Assign a permanent library using the LIBNAME statement. Then, on the CREATE statement in PROC SQL, choose a 2 level name such as the following:
create table perm.table1 ...
Yes, you can use SET operators to accomplish this. The kind of vertical join that we might be most familiar with is the equivalent of appending or concatenating data using the data step. This can be accomplished with the OUTER UNION CORR operation in PROC SQL.
PROC SQL does closely align with the ANSI SQL-92 standard. If you are referring to the syntax used for the joins, either of the following could be used:
…FROM one, two;
…FROM one INNER JOIN two;
It is a way for us to request that row numbers be displayed in the output.
This would likely require 2 SELECT statements in PROC SQL. This might be a time when PROC MEANS/SUMMARY with a CLASS would be a good fit for the task.
The data step process that is most like the FULL JOIN is the data step MERGE. Please be aware that the process by which SQL and the data step join their data is different. Thus the results will not always be the same between a FULL JOIN and a data step MERGE.
Yes. Since Base SAS includes PROC SQL, it runs on any platform.
Currently, there is no session that addresses concepts from this webinar in more detail. We do have a session that addresses reading database data. It is ‘The What, When and Why or SAS Access Engines’
Yes. On the CREATE statement, use the name of the table that you wish to replace.
Newvariable=(0.5 x √[SumX]) + (0.8 x √[SumY]) + (0.6 x (log([Z] +8))) + (0.06 x ([Zx 5])) + 0.32
This expression might be best accommodated by the data step. You will need to use the SQRT function to perform square roots, the * operator for multiplication, the + operator, and the LOG function.
This is an example of a task that might be more typically accomplished using the data step. If you would like to use PROC SQL, the following paper addresses this topic:
Please contact the person who sent the information to you in order to resolve this question.
It can be used in either situation. For example, it can be used to consolidate all X values into a single column for reporting purposes. Below are a few helpful links:
Below is an example from the session that includes an added ORDER BY clause. The GROUP BY indicates the grouping values that are to be used for summarization. The ORDER BY is used for ordering the results set. You can think of an ORDER BY as a sorting of the data.
proc sql number; select c.providerID, providerName, sum(chgamt) as sumChg, sum(lndiscamt) as sumdisc, sum(eeresp) as sumeeresp, sum(paidamt) as sumpaidamt from health.claims_sample as c, health.provider as p where c.providerID = p.providerID group by c.providerID, providerName order by providerID ; quit;
Below are some helpful links for further investigating PROC SQL and the macro language: