Have
| Subject ID | Visit | Score |
| 117961 | Visit1 | 0 |
| 117961 | Visit2 | 8 |
| 117961 | Visit3 | 4 |
| 117961 | Visit4 | 1 |
| 117961 | Visit5 | 1 |
| 117962 | Visit1 | 0 |
| 117962 | Visit2 | 6 |
| 117962 | Visit3 | 4 |
| 117962 | Visit4 | 9 |
| 117962 | Visit5 | 2 |
| 117963 | Visit1 | 0 |
| 117963 | Visit2 | 4 |
| 117963 | Visit3 | 2 |
| 117963 | Visit4 | 5 |
| 117963 | Visit5 | 3 |
| 117964 | Visit1 | 0 |
| 117964 | Visit2 | 3 |
| 117964 | Visit3 | 2 |
| 117964 | Visit4 | 8 |
| 117964 | Visit5 | 6 |
want
| Subject ID | Visit | Score |
| 117962 | Visit1 | 0 |
| 117962 | Visit2 | 6 |
| 117962 | Visit3 | 4 |
| 117962 | Visit4 | 9 |
| 117962 | Visit5 | 2 |
| 117964 | Visit1 | 0 |
| 117964 | Visit2 | 3 |
| 117964 | Visit3 | 2 |
| 117964 | Visit4 | 8 |
| 117964 | Visit5 | 6 |
I want to have a subset of data by comparing the score of Visit 2 onward (I don't need to include Visit 1, since Visit 1 is baseline), where the difference between 2 cosecutive scores jump by greater than 4 points.
Thanks as always!!
Asking for:
" I want records by "Subject", which means I want ALL the records of "those Subjects" whose 2 consective visits scores jumped gretaer than 4 points, when I do the comparsion, I don't need to consider the jump between the Visit 1 and Visit 2, I only need to consider the jump (increase) between visit 2 to visit 3, visit 3 to visit 4, and visit 4 to visit 5, when I find that subject, I wnat all his(her) records in the output dataset." - try next code:
data subjects(keep=subject_ID);
set have;
by subject_ID;
if not first.subject_ID and dif(score) ge 4 then output;
run;
proc sql;
create table want as
select * from have
where subject_ID in
(select distinct subject_ID form subjects)
; quit;
In case of any issue please post the log and describe the issue you have.
Pay attention:
- DIF function computes the difference between current and previous value of a variable.
I want every records of those subject(s) who has a jump of score greater than 4
You need to make two passes through the data. One to identify the subjects. One to select the data for those subjects.
You could use a method that goes by the name DOW loop. Or double-DOW loop.
data want ;
do until (last.subjectid);
set have;
by subjectid vist;
if dif(score) > 4 and not first.subjectid then found=1;
end;
do until(last.subjectid);
set have;
by subjectid;
if found then output;
end;
run;
Why did you not include subject_id=117961 visit=visit2 sore=0 in the wanted results
where the difference to previous visit is 8-0=8 which is greater then 4 ?
and why you did include subject_id=117962 visit=visit3 score=4 where the difference to
previous visit is -2 which is less then 4 ?
Try next code:
data want;
set have;
by subject_ID;
if not first.subject_ID and dif(score) ge 4 then output;
run;
Within all subjects the first visit is "Visit1" so using "not first.subject_ID" omits it;
you asked for: "I don't need to include Visit 1, since Visit 1 is baseline" then why is
subject=117962 visit=visit1 score=0 is in the wanted output? and why
subject=117964 visit=visit1 score=0 is in the wanted output?
You posted as wanted output observations which do not fit your specifications.
you asked for: "I don't need to include Visit 1, since Visit 1 is baseline" then why is
subject=117962 visit=visit1 score=0 is in the wanted output? and why
subject=117964 visit=visit1 score=0 is in the wanted output?
I am really sorry fo rthe confusion. In the wanted output, I want records by "Subject", which means I want ALL the records of "those Subjects" whose 2 consective visits scores jumped gretaer than 4 points, when I do the comparsion, I don't need to consider the jump between the Visit 1 and Visit 2, I only need to consider the jump (increase) between visit 2 to visit 3, visit 3 to visit 4, and visit 4 to visit 5, when I find that subject, I wnat all his(her) records in the output dataset. I hope it is clear now.
Asking for:
" I want records by "Subject", which means I want ALL the records of "those Subjects" whose 2 consective visits scores jumped gretaer than 4 points, when I do the comparsion, I don't need to consider the jump between the Visit 1 and Visit 2, I only need to consider the jump (increase) between visit 2 to visit 3, visit 3 to visit 4, and visit 4 to visit 5, when I find that subject, I wnat all his(her) records in the output dataset." - try next code:
data subjects(keep=subject_ID);
set have;
by subject_ID;
if not first.subject_ID and dif(score) ge 4 then output;
run;
proc sql;
create table want as
select * from have
where subject_ID in
(select distinct subject_ID form subjects)
; quit;
In case of any issue please post the log and describe the issue you have.
Pay attention:
- DIF function computes the difference between current and previous value of a variable.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.