Hi! With the help of @PGStats I have this code - which works great. I need to now add in one more dimension. I need to calculate the difference between Q1- Visit1 and Q2_Visit2, and THEN get the descriptive statistics of the change.
something like sum(Q1)[visit2] -sum(Q1)[visit 1] as change....then all the descriptives would use 'change' (instead of totalscore). I'm just not sure where to indicate the visit value. Help is appreciated!!
proc sql;
create table sep_004 as
select visit, mean (totalscore) as mean, max(totalscore) as max, min(totalscore) as min,
std(totalscore) as std, nmiss(totalscore) as nmiss, n(totalscore) as n, ('004') as Study
from
(select PID, visit, sum(Q1) as totalscore
from sep004
group by PID, visit)
group by visit;
select * from sep_004;
quit;
You need two queries then and a self join:
proc sql;
create table totals as
select
id,
visit,
sum(score) as totalScore
from have
group by id, visit;
create table want as
select
mean(change) as meanChange
from
(select
b.id,
b.visit,
b.totalScore - a.totalScore as change
from
totals as a inner join
totals as b on a.id=b.id
where a.visit=1 and b.visit ne 1);
select * from want;
quit;
Sample data and expected output?
ID | Score | Visit | Sum_Visits | Change |
A | 1 | 1 | ||
A | 0 | 1 | ||
A | 1 | 1 | ||
A | 1 | 1 | 3 | |
A | 1 | 2 | ||
A | 0 | 2 | ||
A | 1 | 2 | ||
A | 1 | 2 | 3 | 0 |
B | 0 | 1 | ||
B | 1 | 1 | ||
B | 1 | 1 | ||
B | 0 | 1 | 2 | |
B | 1 | 2 | ||
B | 1 | 2 | ||
B | 1 | 2 | ||
B | 1 | 2 | 4 | 2 |
So the 'change would be sum_visit (2) -sum_visit (1) by ID.... then i need descriptive stats on the change variable.
Add an aggregation level:
proc sql;
create table want as
select
mean(change) as meanChange
from
(select
id,
sum( (case visit when 1 then -1 else 1 end) * totalScore ) as change
from
(select
id,
visit,
sum(score) as totalScore
from have
group by id, visit)
group by id);
select * from want;
quit;
You need two queries then and a self join:
proc sql;
create table totals as
select
id,
visit,
sum(score) as totalScore
from have
group by id, visit;
create table want as
select
mean(change) as meanChange
from
(select
b.id,
b.visit,
b.totalScore - a.totalScore as change
from
totals as a inner join
totals as b on a.id=b.id
where a.visit=1 and b.visit ne 1);
select * from want;
quit;
Do you have to be working with your full table?
Is there some reason you can't simply calculate the sum using PROC MEANS and then calculate the changes from baseline using a data step similar to your most recent question and a PROC MEANs?
It is possible to do it in a single data step but it seems like that's beyond your current skill set, so wouldn't it make sense to use this approach for now so you can modify it yourself as necessary? If you want to give the single data step approach a try, here's one a rough outline.
Steps for a single data step:
1. Use BY group process so make sure things are in order.
2. Use RETAIN to keep the baseline value if its the last record of visit=1.
3. Sum as you go down, and at end of each visit group, do the subtraction.
4. Summarize the final data using PROC MEANS.
It's one less proc means overall so I'm not sure you gain much with the single step approach.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.