Have
Subject ID | Visit | Score |
117961 | Visit1 | 0 |
117961 | Visit2 | 8 |
117961 | Visit3 | 4 |
117961 | Visit4 | 1 |
117961 | Visit5 | 1 |
117962 | Visit1 | 0 |
117962 | Visit2 | 6 |
117962 | Visit3 | 4 |
117962 | Visit4 | 9 |
117962 | Visit5 | 2 |
117963 | Visit1 | 0 |
117963 | Visit2 | 4 |
117963 | Visit3 | 2 |
117963 | Visit4 | 5 |
117963 | Visit5 | 3 |
117964 | Visit1 | 0 |
117964 | Visit2 | 3 |
117964 | Visit3 | 2 |
117964 | Visit4 | 8 |
117964 | Visit5 | 6 |
want
Subject ID | Visit | Score |
117962 | Visit1 | 0 |
117962 | Visit2 | 6 |
117962 | Visit3 | 4 |
117962 | Visit4 | 9 |
117962 | Visit5 | 2 |
117964 | Visit1 | 0 |
117964 | Visit2 | 3 |
117964 | Visit3 | 2 |
117964 | Visit4 | 8 |
117964 | Visit5 | 6 |
I want to have a subset of data by comparing the score of Visit 2 onward (I don't need to include Visit 1, since Visit 1 is baseline), where the difference between 2 cosecutive scores jump by greater than 4 points.
Thanks as always!!
Asking for:
" I want records by "Subject", which means I want ALL the records of "those Subjects" whose 2 consective visits scores jumped gretaer than 4 points, when I do the comparsion, I don't need to consider the jump between the Visit 1 and Visit 2, I only need to consider the jump (increase) between visit 2 to visit 3, visit 3 to visit 4, and visit 4 to visit 5, when I find that subject, I wnat all his(her) records in the output dataset." - try next code:
data subjects(keep=subject_ID);
set have;
by subject_ID;
if not first.subject_ID and dif(score) ge 4 then output;
run;
proc sql;
create table want as
select * from have
where subject_ID in
(select distinct subject_ID form subjects)
; quit;
In case of any issue please post the log and describe the issue you have.
Pay attention:
- DIF function computes the difference between current and previous value of a variable.
I want every records of those subject(s) who has a jump of score greater than 4
You need to make two passes through the data. One to identify the subjects. One to select the data for those subjects.
You could use a method that goes by the name DOW loop. Or double-DOW loop.
data want ;
do until (last.subjectid);
set have;
by subjectid vist;
if dif(score) > 4 and not first.subjectid then found=1;
end;
do until(last.subjectid);
set have;
by subjectid;
if found then output;
end;
run;
Why did you not include subject_id=117961 visit=visit2 sore=0 in the wanted results
where the difference to previous visit is 8-0=8 which is greater then 4 ?
and why you did include subject_id=117962 visit=visit3 score=4 where the difference to
previous visit is -2 which is less then 4 ?
Try next code:
data want;
set have;
by subject_ID;
if not first.subject_ID and dif(score) ge 4 then output;
run;
Within all subjects the first visit is "Visit1" so using "not first.subject_ID" omits it;
you asked for: "I don't need to include Visit 1, since Visit 1 is baseline" then why is
subject=117962 visit=visit1 score=0 is in the wanted output? and why
subject=117964 visit=visit1 score=0 is in the wanted output?
You posted as wanted output observations which do not fit your specifications.
you asked for: "I don't need to include Visit 1, since Visit 1 is baseline" then why is
subject=117962 visit=visit1 score=0 is in the wanted output? and why
subject=117964 visit=visit1 score=0 is in the wanted output?
I am really sorry fo rthe confusion. In the wanted output, I want records by "Subject", which means I want ALL the records of "those Subjects" whose 2 consective visits scores jumped gretaer than 4 points, when I do the comparsion, I don't need to consider the jump between the Visit 1 and Visit 2, I only need to consider the jump (increase) between visit 2 to visit 3, visit 3 to visit 4, and visit 4 to visit 5, when I find that subject, I wnat all his(her) records in the output dataset. I hope it is clear now.
Asking for:
" I want records by "Subject", which means I want ALL the records of "those Subjects" whose 2 consective visits scores jumped gretaer than 4 points, when I do the comparsion, I don't need to consider the jump between the Visit 1 and Visit 2, I only need to consider the jump (increase) between visit 2 to visit 3, visit 3 to visit 4, and visit 4 to visit 5, when I find that subject, I wnat all his(her) records in the output dataset." - try next code:
data subjects(keep=subject_ID);
set have;
by subject_ID;
if not first.subject_ID and dif(score) ge 4 then output;
run;
proc sql;
create table want as
select * from have
where subject_ID in
(select distinct subject_ID form subjects)
; quit;
In case of any issue please post the log and describe the issue you have.
Pay attention:
- DIF function computes the difference between current and previous value of a variable.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.