Hello,
I have a tricky programming question.
How do I add new records based on the value of a variable in the subsequent record?
As an example, I have this input data
Subj Test Status Frq
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
I want the following output data
Subj Test Status Frq
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 2 Missed
1002 Test 3 Done 1
1003 Test 1 Missed
1003 Test 2 Missed
1003 Test 3 Done 1
So the records in red are added based on the value in Status column.
Can anyone help me create a dynamic SAS code to get the output dataset?
Many thanks!
Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.
data have;
input subj $ test & $ status & $20. frq;
datalines;
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
;
run;
data have_2 (drop=i freq);
set have (rename=(frq=freq));
if status = "Missed previous" then do;
status = "Done";
frq = 1;
output;
do i = 1 to freq;
test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
status = "Missed";
frq = .;
output;
end;
end;
else do;
frq = freq;
output;
end;
run;
proc sort data=have_2 out=want;
by subj test;
run;
Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.
data have;
input subj $ test & $ status & $20. frq;
datalines;
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
;
run;
data have_2 (drop=i freq);
set have (rename=(frq=freq));
if status = "Missed previous" then do;
status = "Done";
frq = 1;
output;
do i = 1 to freq;
test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
status = "Missed";
frq = .;
output;
end;
end;
else do;
frq = freq;
output;
end;
run;
proc sort data=have_2 out=want;
by subj test;
run;
data new;
infile datalines dlm=',';
informat id 4. test $12. Status $32. freq 3.;
input id Test $ Status $ freq;
datalines;
1001,Test 1,Done,1
1001,Test 2,Done,1
1001,Test 3,Done,1
1002,Test 1,Done,1
1002,Test 3,Missed previous,1
1003,Test 3,Missed previous,2
;;;
run;
proc sort; By id test;
run;
data want;
set new;
array _test{3} $12. ;
array _status{3} $12. ;
array _frq{3} ;
by id Test;
retain _test: _status: ' ' count _frq: 0;
if first.id then do;
call missing(of _test1 - _test3);
call missing(of _frq1 - _frq3);
call missing(of _status1 - _status3);
count=0;
end;
count=input(scan(test,-1),??4.);
_test[count] = test;
_status[count] = status;
_frq[count] = freq;
if last.id then do;
do i = 1 to 3;
if missing(_test[i]) then test = "Test "||put(i,??$1.);
else test = _test[i];
if missing(_status[i]) then status = "Missed";
else Status = "Done";
freq=_frq[i];
*put _all_;
output;
end;
end;
drop _test: _status: _frq: i count;
run;
It is possible for all tests for a given subject are missed? Suppose a subject 1004 had Test 1 and Test 2 (both missed). It may be adventitious to have a separate list of given tests and then compare if the original list has matching records in this bank list of all tests given by subject. I use SQL all day so I found it easier to use PROC SQL for this.
data have;
input subj $ test & $ status & $20. frq;
datalines;
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
;
run;
data bank;
input subj $ test & $;
datalines;
1001 Test 1
1001 Test 2
1001 Test 3
1002 Test 1
1002 Test 2
1002 Test 3
1003 Test 1
1003 Test 2
1003 Test 3
1004 Test 1
1004 Test 2
;
run;
PROC SQL;
Create Table temp AS
SELECT
tbl1.subj AS Subject
, tbl1.test AS Test
, CASE WHEN MISSING(Tbl2.subj) THEN 'Missed' ELSE 'Done' END AS Status
FROM work.bank as tbl1
LEFT JOIN work.have as tbl2
ON tbl1.subj = tbl2.subj
AND tbl1.test = tbl2.test
;
QUIT;
PROC SORT DATA = temp Out = Temp2;
BY Subject Test;
RUN;
PROC PRINT DATA = Temp2; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.