I have the following dataset
data have;
input time group var1 count percent;
datalines;
0 1 1 5 50
0 1 3 5 50
3 1 1 2 20
3 1 3 8 80
3 2 1 3 30
3 2 3 7 70
11 2 1 4 40
11 2 3 6 60
;
run;
The variable group has two values 1 and 2. Per each time and group different combination, there are two rows. Var1 is either 1 or 3 and for each there is a count variable and a percent which were calculated. We need to interleave a row per each time and group combination where var1 is equal to 2 and count=0 and percent=0. The output should look like this:
time | group | var1 | count | percent |
0 | 1 | 1 | 5 | 50 |
0 | 1 | 2 | 0 | 0 |
0 | 1 | 3 | 5 | 50 |
3 | 1 | 1 | 2 | 20 |
3 | 1 | 2 | 0 | 0 |
3 | 1 | 3 | 8 | 80 |
3 | 2 | 1 | 3 | 30 |
3 | 2 | 2 | 0 | 0 |
3 | 2 | 3 | 7 | 70 |
11 | 2 | 1 | 4 | 40 |
11 | 2 | 2 | 0 | 0 |
11 | 2 | 3 | 6 | 60 |
Below one option that addresses the updated requirements. The source data must be sorted by time, group, var1
data have;
input time group var1 count percent;
datalines;
0 1 1 5 50
0 1 3 5 50
3 1 1 2 20
3 1 3 8 80
3 2 1 3 30
3 2 3 7 70
11 2 2 4 40
11 2 3 6 60
;
run;
data template;
set have;
by time group var1;
if first.group;
count=0;
percent=0;
do var1=1 to 3;
output;
end;
run;
data want;
merge template have;
by time group var1;
run;
proc print data=want;
run;
Your sample data and narrative indicates that you've got always two rows per time and group with var1 being either 1 or 3. It that's true then code as simple as below should do.
data have;
input time group var1 count percent;
datalines;
0 1 1 5 50
0 1 3 5 50
3 1 1 2 20
3 1 3 8 80
3 2 1 3 30
3 2 3 7 70
11 2 1 4 40
11 2 3 6 60
;
run;
data want;
set have;
output;
if var1=1 then
do;
var1=2;
count=0;
percent=0;
output;
end;
run;
proc print data=want;
run;
Thank you for your response. Your solution works with the assumption you mentioned. Sorry I missed this important aspect, I just realized from looking at another dataset, that var1 per combination can be missing either var1=1 or var1=2. But the output needs to have per time group combination, var1=1 var1=2 and var1=3. This is a sample dataset (please notice the change in the last two observations):
data have;
input time group var1 count percent;
datalines;
0 1 1 5 50
0 1 3 5 50
3 1 1 2 20
3 1 3 8 80
3 2 1 3 30
3 2 3 7 70
11 2 2 4 40
11 2 3 6 60;
run;
The wanted output:
time | group | var1 | count | percent |
0 | 1 | 1 | 5 | 50 |
0 | 1 | 2 | 0 | 0 |
0 | 1 | 3 | 5 | 50 |
3 | 1 | 1 | 2 | 20 |
3 | 1 | 2 | 0 | 0 |
3 | 1 | 3 | 8 | 80 |
3 | 2 | 1 | 3 | 30 |
3 | 2 | 2 | 0 | 0 |
3 | 2 | 3 | 7 | 70 |
11 | 2 | 1 | 0 | 0 |
11 | 2 | 2 | 4 | 40 |
11 | 2 | 3 | 6 | 60 |
This should run only when either var1=1 or var1=2 are missing per combination.
Below one option that addresses the updated requirements. The source data must be sorted by time, group, var1
data have;
input time group var1 count percent;
datalines;
0 1 1 5 50
0 1 3 5 50
3 1 1 2 20
3 1 3 8 80
3 2 1 3 30
3 2 3 7 70
11 2 2 4 40
11 2 3 6 60
;
run;
data template;
set have;
by time group var1;
if first.group;
count=0;
percent=0;
do var1=1 to 3;
output;
end;
run;
data want;
merge template have;
by time group var1;
run;
proc print data=want;
run;
Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.