I have the following data set C1 and I would like to select the highest value before 4 hours and the highest value after 4 hours for each subject. I tried proc seql to select values prior to 4 hrs with no luck. Can someone tell me how to properly do this with either proc seql or a data step?
PROC SQL;
CREATE TABLE CMAX AS
SELECT SUBJECT where time in (0 0.25 0.5 1 1.5 2 3 4 ), COUNT(SUBJECT) AS N,
DATA FORMAT=DATE7.,cMAX,
MAX(DV) AS CMAXHIGH,
FROM C1
GROUP BY SUBJECT;
QUIT;
SUBJECT | TIME | CMT | DV |
1 | 0 | 11 | 0 |
1 | 0.25 | 11 | 0 |
1 | 0.5 | 11 | 2.7695 |
1 | 1 | 11 | 3.4254 |
1 | 1.5 | 11 | 6.2236 |
1 | 2 | 11 | 5.7419 |
1 | 3 | 11 | 4.4998 |
1 | 4 | 11 | 5.3484 |
1 | 5 | 11 | 8.4798 |
1 | 6 | 11 | 10.843 |
1 | 6.5 | 11 | 10.545 |
1 | 7 | 11 | 6.8125 |
1 | 7.5 | 11 | 8.7324 |
1 | 8 | 11 | 7.752 |
1 | 10 | 11 | 5.8673 |
1 | 12 | 11 | 1.9105 |
1 | 16 | 11 | 1.4909 |
1 | 24 | 11 | 1.9336 |
2 | 0 | 11 | 0 |
2 | 0.25 | 11 | 0 |
2 | 0.5 | 11 | 2.0534 |
2 | 1 | 11 | 1.9252 |
2 | 1.5 | 11 | 1.1755 |
2 | 2 | 11 | 2.7677 |
2 | 3 | 11 | 4.9696 |
2 | 4 | 11 | 7.29 |
2 | 5 | 11 | 12.178 |
2 | 6 | 11 | 12.825 |
2 | 6.5 | 11 | 13.286 |
2 | 7 | 11 | 14.776 |
2 | 7.5 | 11 | 9.8487 |
2 | 8 | 11 | 7.5002 |
2 | 10 | 11 | 5.2042 |
2 | 12 | 11 | 5.1375 |
2 | 16 | 11 | 2.7229 |
2 | 24 | 11 | 0 |
Combining your example data, plus code from @PGStats and from me.
data c1;
infile datalines delimiter=' ';
input SUBJECT TIME CMT DV;
datalines;
1 0 11 0
1 0.25 11 0
1 0.5 11 2.7695
1 1 11 3.4254
1 1.5 11 6.2236
1 2 11 5.7419
1 3 11 4.4998
1 4 11 5.3484
1 5 11 8.4798
1 6 11 10.843
1 6.5 11 10.545
1 7 11 6.8125
1 7.5 11 8.7324
1 8 11 7.752
1 10 11 5.8673
1 12 11 1.9105
1 16 11 1.4909
1 24 11 1.9336
2 0 11 0
2 0.25 11 0
2 0.5 11 2.0534
2 1 11 1.9252
2 1.5 11 1.1755
2 2 11 2.7677
2 3 11 4.9696
2 4 11 7.29
2 5 11 12.178
2 6 11 12.825
2 6.5 11 13.286
2 7 11 14.776
2 7.5 11 9.8487
2 8 11 7.5002
2 10 11 5.2042
2 12 11 5.1375
2 16 11 2.7229
2 24 11 0
;
run;
data c1_new;
set c1;
time_cat = ifc(time<=4,"before","after");
run;
proc summary data=c1_new
max;
class subject time_cat;
var dv;
output out=result
(where=(_type_=3))
max=dv_max;
run;
proc sql;
create table CMAX as
select *
from
( select
*,
case when TIME <= 4
then "Before"
else "After"
end as Period
from C1 )
group by subject, period
having DV = max(DV)
order by subject, period desc;
select * from cmax;
quit;
Eyeballing, these look like they match the criteria you specified. But maybe there is a missing piece of the puzzle you left out?
Looks like you've got quite the soup of PROC SQL statements in there -- you must have been throwing them in hoping that one would do the trick 😉
Simplest SQL approach is two queries (untested -- sorry, might not have this exactly right):
proc sql;
create table cmax_before4 as
select subject, time, cmt, max(dv) from c1
where time <= 4
group by subject, time, cmt
having dv=max(dv);
create table cmax_after4 as
select subject, time, cmt, max(dv) from c1
where time > 4
group by subject, time, cmt
having dv=max(dv);
quit;
An even simpler approach might be a DATA step and then PROC SUMMARY. DATA step to assign a new class var: time_cat (values like "before" and "after", based on whether less than 4 hours or not). Then PROC SUMMARY to get max value per class: SUBJECT and time_cat. Again, this code is untested but hopefully gives an idea.
data c1_new;
set c1;
time_cat = ifc(time<=4,"before","after");
run;
proc summary data=c1_new
max;
class subject time_cat;
var dv;
output out=result max=dv_max;
run;
What are the requested max values, for your example data?
Combining your example data, plus code from @PGStats and from me.
data c1;
infile datalines delimiter=' ';
input SUBJECT TIME CMT DV;
datalines;
1 0 11 0
1 0.25 11 0
1 0.5 11 2.7695
1 1 11 3.4254
1 1.5 11 6.2236
1 2 11 5.7419
1 3 11 4.4998
1 4 11 5.3484
1 5 11 8.4798
1 6 11 10.843
1 6.5 11 10.545
1 7 11 6.8125
1 7.5 11 8.7324
1 8 11 7.752
1 10 11 5.8673
1 12 11 1.9105
1 16 11 1.4909
1 24 11 1.9336
2 0 11 0
2 0.25 11 0
2 0.5 11 2.0534
2 1 11 1.9252
2 1.5 11 1.1755
2 2 11 2.7677
2 3 11 4.9696
2 4 11 7.29
2 5 11 12.178
2 6 11 12.825
2 6.5 11 13.286
2 7 11 14.776
2 7.5 11 9.8487
2 8 11 7.5002
2 10 11 5.2042
2 12 11 5.1375
2 16 11 2.7229
2 24 11 0
;
run;
data c1_new;
set c1;
time_cat = ifc(time<=4,"before","after");
run;
proc summary data=c1_new
max;
class subject time_cat;
var dv;
output out=result
(where=(_type_=3))
max=dv_max;
run;
proc sql;
create table CMAX as
select *
from
( select
*,
case when TIME <= 4
then "Before"
else "After"
end as Period
from C1 )
group by subject, period
having DV = max(DV)
order by subject, period desc;
select * from cmax;
quit;
Eyeballing, these look like they match the criteria you specified. But maybe there is a missing piece of the puzzle you left out?
This approach worked just fine.
You can do it in a single query but it's not really pretty. It also depends if you want the time of the maximum, which Chris's solution does and mine does not.
proc sql;
create table want as
select subject,
max(case when time <=4 then dv else . end) as max_le_4,
max(case when time >4 then dv else . end) as max_gt_4
from have
group by subject;
quit;
If I understand you correctly, the code would work as is. You'd simply have 100 records as output: 50 with the before4hours value, and 50 with the after4hours value.
Again, What is the expected output??
I would propose:
proc sql;
create table CMAX as
select *
from
( select
*,
case when TIME <= 4
then "Before"
else "After"
end as Period
from C1 )
group by subject, period
having DV = max(DV)
order by subject, period desc;
select * from cmax;
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.