BookmarkSubscribeRSS Feed
r3570
Obsidian | Level 7

Hi All,

 

I have a dataset as below:

 

SUBJECTPERIODDAYTESTRESULT
10000811Hb10
10000812Hb20
10000813Hb15
10000821Hb14
10000822Hb13
10000823Hb16
10000831Hb19
10000832Hb10
10000833Hb11
10000911Hb12
10000912Hb14
10000913Hb11
10001011Hb12
10001012Hb18
10001013Hb13
10001021Hb11
10001022Hb13
10001023Hb15
10001031Hb17
10001032Hb11
10001033Hb11

 

For example in this study there are 3 periods and each period is for 3 days. Here subject 100008 has completed all the three periods for all the 3 days. But subject 100009 has completed only period 1 and then got exited from the study due to some reason. Here i would like to add dummy records for all the other existing variables in dataset though subject has not completed all the visits. Output should be as below:

 

SUBJECTPERIODDAYTESTRESULT
10000811Hb10
10000812Hb20
10000813Hb15
10000821Hb14
10000822Hb13
10000823Hb16
10000831Hb19
10000832Hb10
10000833Hb11
10000911Hb12
10000912Hb14
10000913Hb11
10000921HbNA
10000922HbNA
10000923HbNA
10000931HbNA
10000932HbNA
10000933HbNA
10001011Hb12
10001012Hb18
10001013Hb13
10001021Hb11
10001022Hb13
10001023Hb15
10001031Hb17
10001032Hb11
10001033Hb11

 

It means though the subject has not visited the next days or period i need to create dummy records for all the available variables in the dataset.

 

Please help.

3 REPLIES 3
Kurt_Bremser
Super User
data want;
set have;
by subject;
output;
if last.subject then do period = period + 1 to 3;
  do day = 1 to 3;
    result = "NA";
    output;
  end;
end;
run;

Untested; for tested code, post your example data as a data step with datalines.

Tom
Super User Tom
Super User

Usually you can just create a skeleton dataset with all possible records and the default value and then overwrite the skeleton with the real data.

data period_day ;
  if 0 then set have(keep=period day test result);
  test = 'Hb';
  result='NA';
  do period=1 to 3;
    do day=1 to 3;
      output;
    end;
  end;
run;
proc sql ;
  create table skeleton as 
    select * 
    from (select distinct subject from have) a
        , period_day b
    order by subject, period, day
 ;
quit;

data want;
  merge skeleton have;
  by subject period day;
run;
PGStats
Opal | Level 21

If you want to take care of incomplete days as well as missing periods:

 

data want;
set have; by subject;
output;
if last.subject then do;
    call missing(result);
    do p = period to 3;
        do d = day+1 to 3;
            period = p;
            day = d;
            output;
            end;
        day = 0;
        end;
    end;
drop d p;
run;
PG

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 2959 views
  • 0 likes
  • 4 in conversation