Is the following even possible?
PROC SQL;
CREATE TABLE TEMP AS
SELECT A.OBS1,
A.DATEPART(DATETIME) AS DATE FORMAT DATE9.,
A.HOUR(DATETIME) AS HOUR FORMATE TIME9.,
A.CASE
WHEN OBS3='GIJeff' THEN
(OBS4-OBS5)
END AS OBS2,
B.AVG(OBS2)
FROM TABLE1 A,
(SELECT OBS2
FROM TABLE1) B
WHERE OBS6 IN(2,3,4);
QUIT;
Basically, can you perform AVERAGE and CASE functions while trying to sub select 1 observations from another table?
No problem. Just remember that variables belong to tables, and not functions
PROC SQL;
CREATE TABLE TEMP AS
SELECT A.OBS1,
DATEPART(A.DATETIME) AS DATE FORMAT=DATE9.,
HOUR(A.DATETIME) AS HOUR FORMAT=TIME9.,
CASE
WHEN A.OBS3='GIJeff' THEN
A.OBS4-A.OBS5
END AS OBS2,
MEAN(B.OBS2) AS MEAN_OBS2
FROM TABLE1 A,
(SELECT OBS2
FROM TABLE1) B
WHERE A.OBS6 IN(2,3,4);
QUIT;
But the following would likely be more efficient
PROC SQL;
CREATE TABLE TEMP AS
SELECT
OBS1,
DATEPART(DATETIME) AS DATE FORMAT=DATE9.,
HOUR(DATETIME) AS HOUR FORMAT=TIME9.,
CASE
WHEN OBS3='GIJeff' THEN OBS4 - OBS5
ELSE .
END AS OBS2,
(SELECT MEAN(OBS2) FROM TABLE1) AS MEAN_OBS2
FROM TABLE1
WHERE OBS6 IN(2, 3, 4);
QUIT;
Note that MEAN_OBS2 will be the average of OBS2 over all observations in Table1, not just those with OBS6 in (2, 3, 4).
Hi,
Since you did not provide any sample data, I took the liberty of creating two fictitious datasets. I tried to keep many of your variable names the same, but you will see that I "ad-libbed" quite a bit. Hope that's okay.
This example assumes that Table1 has one observation per individual and Table2 may have multiple observations per individual (I made this assumption because you were trying to find the average).
DATA Table1;
LENGTH Obs1 8 Name $10 Obs4 Obs5 Obs6 8;
FORMAT Datetime DATETIME20.;
Datetime = DATETIME() - RANUNI(1234)*60*86400;
INPUT Obs1 Name $ Obs4 Obs5 Obs6;
DATALINES;
1 Catherine 22 3 3
3 Jeff 34 14 3
4 Aaron 79 10 2
8 Brian 34 9 .
10 Eugene 60 12 4
11 Anne 73 8 4
20 Ryan 59 11 1
;
RUN;
DATA Table2;
LENGTH Name $10 Day MilesDriven 8;
INPUT Name $ Day MilesDriven;
DATALINES;
Catherine 1 11
Catherine 2 33
Catherine 3 7
Catherine 4 31
Jeff 1 12
Jeff 2 36
Brian 1 32
Brian 2 21
Brian 3 22
Brian 4 21
Eugene 1 11
Eugene 2 52
Eugene 3 12
Eugene 4 11
Anne 1 32
Anne 2 13
Anne 3 48
Ryan 1 21
Ryan 2 38
Ryan 3 51
;
RUN;
PROC SQL;
CREATE TABLE TEMP AS
SELECT A.Obs1
, A.Name
, DATEPART(DATETIME) AS Date FORMAT=DATE9.
, HOUR(DATETIME) AS Hour FORMAT=TIME9.
, CASE WHEN A.Name='Jeff' THEN (OBS4-OBS5)
ELSE .
END AS Obs2
/* In my later edit, I added this snippet (borrowing from PGStats) ... */
, (SELECT AVG(C.MilesDriven)
FROM TABLE2 AS C) AS Avg_Miles_Driven_All_Drivers
, AVG(B.MilesDriven) AS Avg_Miles_Driven
, SUM(B.MilesDriven) AS Total_Miles_Driven
, MAX(B.MilesDriven) AS Max_Daily_Miles_Driven
, COUNT(DISTINCT B.Day) AS Num_Days_Driven
FROM TABLE1 AS A
LEFT JOIN TABLE2 AS B
ON A.Name = B.Name
WHERE Obs6 IN (2,3,4)
GROUP BY A.Obs1
, A.Name
, calculated Date
, calculated Hour
, calculated Obs2;
QUIT;
The fictitious output would look like this:
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.