turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- PROC SQL VS DATA STEP FOR IF STATEMENT

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-29-2016 11:43 AM - edited 08-29-2016 11:44 AM

I am trying to eliminate some extra steps in my SAS code.

I need to use the same source data a few times to accurately count certain the numerator and denominator for a particular measure.

The first step in the process is to count distinct PERSONID by quarter for records where state=LA. This will be the count for my denominator.

```
DATA ONE (KEEP=RX_QTRYR PERSONID); SET O_TEST2012;
IF STATE='LA';
RUN;
PROC SQL;
CREATE TABLE ONE_LA AS
SELECT RX_QTRYR, COUNT(DISTINCT PERSONID) AS NUM_LA
FROM ONE
GROUP BY RX_QTRYR
;
QUIT;
```

In order to calculate the numerator, I have to re-sort the original data by PERSONID, DT_FILLED and DT_COMPLETE. Next, I have to use a lag function with a by group and an IF/THEN/DO statement to output only the First PERSONID in each grouping. Then, I to sort the data agian using NODUPKEY by PERSONID and RX_QTRYR.

```
PROC SORT DATA=O_TEST2012;
BY PERSONID DT_FILLED DT_COMPLETE;
RUN;
DATA TWO; SET O_TEST2012;
BY PERSONID DT_FILLED DT_COMPLETE;
P_DT_COMPLETE=LAG(DT_COMPLETE);
IF FIRST.PERSONID THEN DO;
OUTPUT;
END;
FORMAT P_DT_COMPLETE MMDDYY10.;
RUN;
PROC SORT DATA=THREE OUT=FOUR NODUPKEY;
BY PERSONID RX_QTRYR;
RUN;
```

Since I have to calculate the numerator by using the same original dataset, would there be a way to incorporate this distinct count of STATE=LA into the dataset to calculate the numerator from the beginning? I was thinking about using PROC SQL to eliminate some of the sorting, but was hoping to be able to include the counting piece as well.

```
PROC SQL;
CREATE TABLE ONE_V2 AS
SELECT
PERSONID,
DT_FILLED,
DT_COMPLETE,
RX_QTRYR,
STATE,
CASE
WHEN STATE='LA' THEN COUNT(DISTINCT PERSONID)
ELSE 0
END AS NUM_LA
FROM O_TEST2012
GROUP BY RX_QTRYR
ORDER BY PERSONID, DT_FILLED, DT_COMPLETE
;
QUIT;
```

Unfortunately, that case statement doesn't yield the same results as the count in table ONE_LA. Any assistance would be greatly appreciated.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to gdaymte

08-29-2016 01:31 PM

So, the denominator is the number of personid in LA in each quarter. What is the numerator?

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

08-29-2016 03:24 PM

YOu are correct about the denominator...The numerator takes dataset FOUR a couple of steps further. Ultimately, the numerator will be the distict count of PERSONID by RX_QTRYR from the resulting dataset.

```
PROC SQL;
CREATE TABLE FIVE AS
SELECT
A.PERSONID,
A.RX_QTRYR,
B.DT_FILLED ,
B.DT_COMPLETE
FROM FOUR A, O_TEST2012 B
WHERE
A.PERSONID=B.PERSONID AND
B.STATE='LA' AND
A.RX_QTRYR=B.RX_QTRYR AND
B.DT_FILLED >= A.DT_FILLED
ORDER BY A.PERSONID, A.RX_QTRYR
;
QUIT;
PROC SQL;
CREATE TABLE SIX AS
SELECT
RX_QTRYR,
COUNT(DISTINCT PERSONID) AS NUM_ON
FROM FIVE
GROUP BY RX_QTRYR
;
QUIT;
```