- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I have a tricky programming question.
How do I add new records based on the value of a variable in the subsequent record?
As an example, I have this input data
Subj Test Status Frq
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
I want the following output data
Subj Test Status Frq
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 2 Missed
1002 Test 3 Done 1
1003 Test 1 Missed
1003 Test 2 Missed
1003 Test 3 Done 1
So the records in red are added based on the value in Status column.
Can anyone help me create a dynamic SAS code to get the output dataset?
Many thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.
data have;
input subj $ test & $ status & $20. frq;
datalines;
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
;
run;
data have_2 (drop=i freq);
set have (rename=(frq=freq));
if status = "Missed previous" then do;
status = "Done";
frq = 1;
output;
do i = 1 to freq;
test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
status = "Missed";
frq = .;
output;
end;
end;
else do;
frq = freq;
output;
end;
run;
proc sort data=have_2 out=want;
by subj test;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.
data have;
input subj $ test & $ status & $20. frq;
datalines;
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
;
run;
data have_2 (drop=i freq);
set have (rename=(frq=freq));
if status = "Missed previous" then do;
status = "Done";
frq = 1;
output;
do i = 1 to freq;
test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
status = "Missed";
frq = .;
output;
end;
end;
else do;
frq = freq;
output;
end;
run;
proc sort data=have_2 out=want;
by subj test;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data new;
infile datalines dlm=',';
informat id 4. test $12. Status $32. freq 3.;
input id Test $ Status $ freq;
datalines;
1001,Test 1,Done,1
1001,Test 2,Done,1
1001,Test 3,Done,1
1002,Test 1,Done,1
1002,Test 3,Missed previous,1
1003,Test 3,Missed previous,2
;;;
run;
proc sort; By id test;
run;
data want;
set new;
array _test{3} $12. ;
array _status{3} $12. ;
array _frq{3} ;
by id Test;
retain _test: _status: ' ' count _frq: 0;
if first.id then do;
call missing(of _test1 - _test3);
call missing(of _frq1 - _frq3);
call missing(of _status1 - _status3);
count=0;
end;
count=input(scan(test,-1),??4.);
_test[count] = test;
_status[count] = status;
_frq[count] = freq;
if last.id then do;
do i = 1 to 3;
if missing(_test[i]) then test = "Test "||put(i,??$1.);
else test = _test[i];
if missing(_status[i]) then status = "Missed";
else Status = "Done";
freq=_frq[i];
*put _all_;
output;
end;
end;
drop _test: _status: _frq: i count;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your help. However, the variables Test, Status and Frq are not fixed at 3 levels.I put them at 3 to construct an example input data. But I can take the maximum level from Test, put that in a macro variable and use the code.
Thanks again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It is possible for all tests for a given subject are missed? Suppose a subject 1004 had Test 1 and Test 2 (both missed). It may be adventitious to have a separate list of given tests and then compare if the original list has matching records in this bank list of all tests given by subject. I use SQL all day so I found it easier to use PROC SQL for this.
data have;
input subj $ test & $ status & $20. frq;
datalines;
1001 Test 1 Done 1
1001 Test 2 Done 1
1001 Test 3 Done 1
1002 Test 1 Done 1
1002 Test 3 Missed previous 1
1003 Test 3 Missed previous 2
;
run;
data bank;
input subj $ test & $;
datalines;
1001 Test 1
1001 Test 2
1001 Test 3
1002 Test 1
1002 Test 2
1002 Test 3
1003 Test 1
1003 Test 2
1003 Test 3
1004 Test 1
1004 Test 2
;
run;
PROC SQL;
Create Table temp AS
SELECT
tbl1.subj AS Subject
, tbl1.test AS Test
, CASE WHEN MISSING(Tbl2.subj) THEN 'Missed' ELSE 'Done' END AS Status
FROM work.bank as tbl1
LEFT JOIN work.have as tbl2
ON tbl1.subj = tbl2.subj
AND tbl1.test = tbl2.test
;
QUIT;
PROC SORT DATA = temp Out = Temp2;
BY Subject Test;
RUN;
PROC PRINT DATA = Temp2; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content