SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kingCobra
Obsidian | Level 7

Hello, 

I have a tricky programming question.

How do I add new records based on the value of a variable in the subsequent record?

 

As an example, I have this input data

 

Subj         Test         Status                          Frq

1001        Test 1       Done                           1

1001        Test 2       Done                           1

1001        Test 3       Done                           1

1002        Test 1       Done                           1

1002        Test 3       Missed previous          1

1003        Test 3       Missed previous         2

 

I want the following output data

 

Subj         Test         Status                          Frq

1001        Test 1       Done                           1

1001        Test 2       Done                           1

1001        Test 3       Done                           1

1002        Test 1       Done                           1

1002         Test 2      Missed

1002        Test 3       Done                           1

1003         Test 1      Missed                      

1003        Test 2       Missed

1003        Test 3       Done                           1

 

So the records in red are added based on the value in Status column.

Can anyone help me create a dynamic SAS code to get the output dataset?

 

Many thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
mklangley
Lapis Lazuli | Level 10

Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.

data have;
    input subj $ test & $ status & $20. frq;
    datalines;
    1001   Test 1   Done             1
    1001   Test 2   Done             1
    1001   Test 3   Done             1
    1002   Test 1   Done             1
    1002   Test 3   Missed previous  1
    1003   Test 3   Missed previous  2
    ;
run;

data have_2 (drop=i freq);
    set have (rename=(frq=freq));
    if status = "Missed previous" then do;
        status = "Done";
        frq = 1;
        output;
        do i = 1 to freq;
            test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
            status = "Missed";
            frq = .;
            output;
        end;
    end;
    else do;
        frq = freq;
        output;
    end;
run;

proc sort data=have_2 out=want; 
    by subj test;
run;

View solution in original post

6 REPLIES 6
mklangley
Lapis Lazuli | Level 10

Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.

data have;
    input subj $ test & $ status & $20. frq;
    datalines;
    1001   Test 1   Done             1
    1001   Test 2   Done             1
    1001   Test 3   Done             1
    1002   Test 1   Done             1
    1002   Test 3   Missed previous  1
    1003   Test 3   Missed previous  2
    ;
run;

data have_2 (drop=i freq);
    set have (rename=(frq=freq));
    if status = "Missed previous" then do;
        status = "Done";
        frq = 1;
        output;
        do i = 1 to freq;
            test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
            status = "Missed";
            frq = .;
            output;
        end;
    end;
    else do;
        frq = freq;
        output;
    end;
run;

proc sort data=have_2 out=want; 
    by subj test;
run;
kingCobra
Obsidian | Level 7
Thanks for the help.
smantha
Lapis Lazuli | Level 10
data new;
infile datalines dlm=',';
informat id 4. test $12. Status $32. freq 3.;
input id Test $ Status $ freq;
datalines;
1001,Test 1,Done,1
1001,Test 2,Done,1
1001,Test 3,Done,1
1002,Test 1,Done,1
1002,Test 3,Missed previous,1
1003,Test 3,Missed previous,2
;;;
run;
proc sort; By id test;
run;

data want;
set new;
array _test{3} $12. ;
array _status{3} $12. ;
array _frq{3} ;
by id Test;
retain  _test: _status: ' ' count _frq: 0;
if first.id then do;
 call missing(of _test1 - _test3);
 call missing(of _frq1 - _frq3);
 call missing(of _status1 - _status3);
 count=0;
end;
count=input(scan(test,-1),??4.);

	_test[count] = test;
	_status[count] = status;
	_frq[count] = freq;

if last.id then do;
  do i = 1 to 3;
  	if missing(_test[i]) then test = "Test "||put(i,??$1.);
	   else test = _test[i];
    if missing(_status[i]) then status = "Missed";
	   else Status = "Done";
	freq=_frq[i];
    *put _all_;
	output;
  end;
end;
drop _test: _status: _frq: i count;
run;
kingCobra
Obsidian | Level 7
Hi smantha,

Thanks for your help. However, the variables Test, Status and Frq are not fixed at 3 levels.I put them at 3 to construct an example input data. But I can take the maximum level from Test, put that in a macro variable and use the code.
Thanks again.
BrennanBux
Fluorite | Level 6

It is possible for all tests for a given subject are missed? Suppose a subject 1004 had Test 1 and Test 2 (both missed). It may be adventitious to have a separate list of given tests and then compare if the original list has matching records in this bank list of all tests given by subject. I use SQL all day so I found it easier to use PROC SQL for this. 

data have;
    input subj $ test & $ status & $20. frq;
    datalines;
    1001   Test 1   Done             1
    1001   Test 2   Done             1
    1001   Test 3   Done             1
    1002   Test 1   Done             1
    1002   Test 3   Missed previous  1
    1003   Test 3   Missed previous  2
    ;
run;

data bank;
	input subj $ test & $;
	datalines;
    1001   Test 1   
    1001   Test 2   
    1001   Test 3   
    1002   Test 1   
    1002   Test 2   
    1002   Test 3   
    1003   Test 1   
    1003   Test 2   
    1003   Test 3   
    1004   Test 1   
    1004   Test 2   
	;
run;

PROC SQL;
	Create Table temp AS
	SELECT
		tbl1.subj AS Subject
		, tbl1.test AS Test
		, CASE WHEN MISSING(Tbl2.subj) THEN 'Missed' ELSE 'Done' END AS Status
	FROM work.bank as tbl1
	LEFT JOIN work.have as tbl2
		ON tbl1.subj = tbl2.subj
		AND tbl1.test = tbl2.test
	;
QUIT;		
PROC SORT DATA = temp Out = Temp2;
	BY Subject Test;
RUN;
PROC PRINT DATA = Temp2; run;
kingCobra
Obsidian | Level 7
Yes it is possible to construct an exhaustive list of all Tests available. I'll try your code (the Proc SQL way) .

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1419 views
  • 2 likes
  • 4 in conversation