I have a data set like this:
Subject Office Date Value Output
Patient 1 Visit 1 12 20 20
Patient 1 Followup 2 4 35 35-20
Patient 1 Followup 2 2 54 54-20
Patient 1 Followup 3 3 29 29-20
Patient 1 Followup 3 11 13 13-20
Patient 1 Followup 3 5 98 98-20
I am trying to use proc sql to subtract the numbers in the "VALUE" column based on the "SUBJECT" and "OFFICE" columns. For example, visit 1 is the baseline visit and it should be subtracted from the other visits. The result should form the "OUTPUT" column. Is this possible? I am open to other methods.
Thank you.
Easy with a data step.
data want;
set have;
by subject;
retain base;
if find(office,'Visit') = 1
then do;
output = value;
base = value;
end;
else output = value - base;
drop base;
run;
data have;
input Subject & $15. Office & $15. Date Value ;* Output;
cards;
Patient 1 Visit 1 12 20 20
Patient 1 Followup 2 4 35 35-20
Patient 1 Followup 2 2 54 54-20
Patient 1 Followup 3 3 29 29-20
Patient 1 Followup 3 11 13 13-20
Patient 1 Followup 3 5 98 98-20
;
proc sql;
create table want(drop=rn) as
select *,ifn(Office= 'Visit 1',value,value-max((Office= 'Visit 1')*value)) as output
from (select * ,monotonic() as rn from have)
group by subject
order by subject,rn;
quit;
Should your dataset be sorted in the order with VISIT 1 being the first observation for each subject as shown in the sample, you are certainly better off resorting to a datastep
data have;
input Subject & $15. Office & $15. Date Value ;* Output;
cards;
Patient 1 Visit 1 12 20 20
Patient 1 Followup 2 4 35 35-20
Patient 1 Followup 2 2 54 54-20
Patient 1 Followup 3 3 29 29-20
Patient 1 Followup 3 11 13 13-20
Patient 1 Followup 3 5 98 98-20
;
data want;
set have;
by subject;
if first.subject then do;
_iorc_=value;
output=_iorc_;
end;
else output=value-_iorc_;
run;
Why SQL? Why not just write normal SAS code?
proc sql;
create table WANT as
select a.*
, case when a.office = 'Visit 1' then a.value
when 0=nmiss(a.value,b.value) then a.value-b.value
else .
end as OUTPUT
from have a
left join have b
on a.subject = b.subject
and b.office = 'Visit 1'
;
quit;
Hi everyone, here is a snippet of a data set I have named “general” :
Row Subject Type Procedure Measurement Output
1 500 Intial Invasive 20 20
2 500 Initial Surface 35 35
3 500 Followup Invasive 54 54-20
4 428 Followup Outer 29 29-10
5 765 Followup Other 13 13-19
6 500 Followup Surface 98 98-35
7 428 Intial Outer 10 10
8 765 Intial Other 19 19
9 610 Third Invasive 66 66-17
10 610 Initial Invasive 17 17
The goal is to subtract the numbers in the "MEASUREMENT" column based on the "SUBJECT", "TYPE" and “PROCEDURE” columns. If two values in the “SUBJECT” column match and two values in the “PROCEDURE” column match, then the initial measurement should be subtracted from the other measurement. For example, the initial measurement in row 1 (20) should be subtracted from the followup measurement in row 3 (54) because the subject (500) and procedure (Invasive) match. The result should form the "OUTPUT" column.
Thank you in advance!
Hi everyone, I have a data set named “Tentative”:
Row Subject Type Procedure Measurement Output
1 500 Intial Invasive 20 20
2 500 Initial Surface 35 35
3 500 Followup Invasive 54 54-20
4 428 Followup Outer 29 29-10
5 765 Followup Other 13 13-19
6 500 Followup Surface 98 98-35
7 428 Intial Outer 10 10
8 765 Intial Other 19 19
9 610 Third Invasive 66 66-17
10 610 Initial Invasive 17 17
I was trying to use proc sql to subtract the numbers in the "MEASUREMENT" column based on the "SUBJECT", "TYPE" and “PROCEDURE” columns. If two values in the “SUBJECT” column match and two values in the “PROCEDURE” column match, then the initial measurement should be subtracted from the other measurement. For example, the initial measurement in row 1 (20) should be subtracted from the followup measurement in row 3 (54) because the subject (500) and procedure (Invasive) match. The result should form the "OUTPUT" column.
Thank you in advance!
Could you put this in a working DATA step? I see that you haven't even consistently spelled "Initial". If you provided your starting data in a data step, it would be a lot easier, and more reliable for us to help you.
1) Replace TYPE by a sequential number or add a new variable (TypeN)
2) Sort data by Subject TypeN
3) Execute data step BY Subject; Retain Initial values then at any other observation of same subject subtract values as desired.
Something like:
data temp
set have;
if Type = 'Initial' then TypeN = 0 ; else
if Type = 'Followup' then TypeN = 1 ; else
if Type = 'Third' then TypeN = 2 ;
run;
proc sort data=temp; bu subject TypeN;
data want(drop=measure1);
set temp;
by subject;
retain measure1;
if first.subject and TypeN=0 then measure1 = measurement;
else measurement = measurement - measure1;
run;
Hi everyone, here is a portion of a data set I have named “antibody” :
Row Subject Type Procedure Measurement Output
1 500 Intial Invasive 20 20
2 500 Initial Surface 35 35
3 500 Followup Invasive 54 54-20
4 428 Followup Outer 29 29-10
5 765 Seventh Other 13 13-19
6 500 Followup Surface 98 98-35
7 428 Initial Outer 10 10
8 765 Initial Other 19 19
9 610 Third Invasive 66 66-17
10 610 Initial Invasive 17 17
The goal is to subtract the numbers in the "MEASUREMENT" column based on the "SUBJECT", "TYPE" and “PROCEDURE” columns. If two values in the “SUBJECT” column match and two values in the “PROCEDURE” column match, then the initial measurement should be subtracted from the other measurement. For example, the initial measurement in row 1 (20) should be subtracted from the followup measurement in row 3 (54) because the subject (500) and procedure (Invasive) match. Furthermore, the initial measurement in row 8 (19) should be subtracted from the seventh measurement in row 5 (13) because the subject (765) and procedure (Other) match. The result should form the "OUTPUT" column.
Thank you in advance!
Why do you keep asking the same question?
Did you try the answers on your other questions?
I merged everything back into the original topic.
When adding additional elements to a question it is appropriate to ask a new question.
To get the best response it would be a good idea to show:
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.