BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kingCobra
Obsidian | Level 7

Hello, 

I have a tricky programming question.

How do I add new records based on the value of a variable in the subsequent record?

 

As an example, I have this input data

 

Subj         Test         Status                          Frq

1001        Test 1       Done                           1

1001        Test 2       Done                           1

1001        Test 3       Done                           1

1002        Test 1       Done                           1

1002        Test 3       Missed previous          1

1003        Test 3       Missed previous         2

 

I want the following output data

 

Subj         Test         Status                          Frq

1001        Test 1       Done                           1

1001        Test 2       Done                           1

1001        Test 3       Done                           1

1002        Test 1       Done                           1

1002         Test 2      Missed

1002        Test 3       Done                           1

1003         Test 1      Missed                      

1003        Test 2       Missed

1003        Test 3       Done                           1

 

So the records in red are added based on the value in Status column.

Can anyone help me create a dynamic SAS code to get the output dataset?

 

Many thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
mklangley
Lapis Lazuli | Level 10

Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.

data have;
    input subj $ test & $ status & $20. frq;
    datalines;
    1001   Test 1   Done             1
    1001   Test 2   Done             1
    1001   Test 3   Done             1
    1002   Test 1   Done             1
    1002   Test 3   Missed previous  1
    1003   Test 3   Missed previous  2
    ;
run;

data have_2 (drop=i freq);
    set have (rename=(frq=freq));
    if status = "Missed previous" then do;
        status = "Done";
        frq = 1;
        output;
        do i = 1 to freq;
            test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
            status = "Missed";
            frq = .;
            output;
        end;
    end;
    else do;
        frq = freq;
        output;
    end;
run;

proc sort data=have_2 out=want; 
    by subj test;
run;

View solution in original post

6 REPLIES 6
mklangley
Lapis Lazuli | Level 10

Here's one approach. Whenever a status of "Missed previous" is encountered, output records appropriately to fill in those missing tests. Then sort the data at the end.

data have;
    input subj $ test & $ status & $20. frq;
    datalines;
    1001   Test 1   Done             1
    1001   Test 2   Done             1
    1001   Test 3   Done             1
    1002   Test 1   Done             1
    1002   Test 3   Missed previous  1
    1003   Test 3   Missed previous  2
    ;
run;

data have_2 (drop=i freq);
    set have (rename=(frq=freq));
    if status = "Missed previous" then do;
        status = "Done";
        frq = 1;
        output;
        do i = 1 to freq;
            test = "Test " || put(input(char(test, 6), 1.) -1, 1.);
            status = "Missed";
            frq = .;
            output;
        end;
    end;
    else do;
        frq = freq;
        output;
    end;
run;

proc sort data=have_2 out=want; 
    by subj test;
run;
kingCobra
Obsidian | Level 7
Thanks for the help.
smantha
Lapis Lazuli | Level 10
data new;
infile datalines dlm=',';
informat id 4. test $12. Status $32. freq 3.;
input id Test $ Status $ freq;
datalines;
1001,Test 1,Done,1
1001,Test 2,Done,1
1001,Test 3,Done,1
1002,Test 1,Done,1
1002,Test 3,Missed previous,1
1003,Test 3,Missed previous,2
;;;
run;
proc sort; By id test;
run;

data want;
set new;
array _test{3} $12. ;
array _status{3} $12. ;
array _frq{3} ;
by id Test;
retain  _test: _status: ' ' count _frq: 0;
if first.id then do;
 call missing(of _test1 - _test3);
 call missing(of _frq1 - _frq3);
 call missing(of _status1 - _status3);
 count=0;
end;
count=input(scan(test,-1),??4.);

	_test[count] = test;
	_status[count] = status;
	_frq[count] = freq;

if last.id then do;
  do i = 1 to 3;
  	if missing(_test[i]) then test = "Test "||put(i,??$1.);
	   else test = _test[i];
    if missing(_status[i]) then status = "Missed";
	   else Status = "Done";
	freq=_frq[i];
    *put _all_;
	output;
  end;
end;
drop _test: _status: _frq: i count;
run;
kingCobra
Obsidian | Level 7
Hi smantha,

Thanks for your help. However, the variables Test, Status and Frq are not fixed at 3 levels.I put them at 3 to construct an example input data. But I can take the maximum level from Test, put that in a macro variable and use the code.
Thanks again.
BrennanBux
Fluorite | Level 6

It is possible for all tests for a given subject are missed? Suppose a subject 1004 had Test 1 and Test 2 (both missed). It may be adventitious to have a separate list of given tests and then compare if the original list has matching records in this bank list of all tests given by subject. I use SQL all day so I found it easier to use PROC SQL for this. 

data have;
    input subj $ test & $ status & $20. frq;
    datalines;
    1001   Test 1   Done             1
    1001   Test 2   Done             1
    1001   Test 3   Done             1
    1002   Test 1   Done             1
    1002   Test 3   Missed previous  1
    1003   Test 3   Missed previous  2
    ;
run;

data bank;
	input subj $ test & $;
	datalines;
    1001   Test 1   
    1001   Test 2   
    1001   Test 3   
    1002   Test 1   
    1002   Test 2   
    1002   Test 3   
    1003   Test 1   
    1003   Test 2   
    1003   Test 3   
    1004   Test 1   
    1004   Test 2   
	;
run;

PROC SQL;
	Create Table temp AS
	SELECT
		tbl1.subj AS Subject
		, tbl1.test AS Test
		, CASE WHEN MISSING(Tbl2.subj) THEN 'Missed' ELSE 'Done' END AS Status
	FROM work.bank as tbl1
	LEFT JOIN work.have as tbl2
		ON tbl1.subj = tbl2.subj
		AND tbl1.test = tbl2.test
	;
QUIT;		
PROC SORT DATA = temp Out = Temp2;
	BY Subject Test;
RUN;
PROC PRINT DATA = Temp2; run;
kingCobra
Obsidian | Level 7
Yes it is possible to construct an exhaustive list of all Tests available. I'll try your code (the Proc SQL way) .

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1018 views
  • 2 likes
  • 4 in conversation