Hi there,
I am using SAS v9.4. I have a dataset that looks like this:
refid | title | abstract | reviewer_A | reviewer_B | reviewer_C | reviewer_D |
1 | how to have fun | this is a study about how to have fun | include | include | ||
2 | gaming and being awesome | we found that gaming makes you awesome | include | exclude |
And I would like to transform it so it looks like this:
-
refid | title | abstract | reviewer_1 | reviewer_2 |
1 | how to have fun | this is a study about how to have fun | include | include |
2 | gaming and being awesome | we found that gaming makes you awesome | include | exclude |
That is, I would like to select the first TWO non-missing values across the range reviewer_A--reviewer_D in each row (each refid) and put the first in reviewer_1 and the second in reviewer_2.
Very much would appreciate any suggestions on how to program this using data steps or SQL. I have tried the proc sql coalescec approach but I haven't been able to get it to select the first and second values.
Given that you have only a few reviewers, this might be overkill. But it will work here (as well as when you have many more possibilities):
data want;
set have;
array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;
array new {2} $ 8 reviewer_1 reviewer_2;
i=0;
do j=1 to 4;
if abcd{j} > ' ' then do;
i + 1;
if i < 3 then new{i} = abcd{j};
end;
end;
drop i j reviewer_A reviewer_B reviewer_C reviewer_D;
run;
Given that you have only a few reviewers, this might be overkill. But it will work here (as well as when you have many more possibilities):
data want;
set have;
array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;
array new {2} $ 8 reviewer_1 reviewer_2;
i=0;
do j=1 to 4;
if abcd{j} > ' ' then do;
i + 1;
if i < 3 then new{i} = abcd{j};
end;
end;
drop i j reviewer_A reviewer_B reviewer_C reviewer_D;
run;
Thank you, Astounding.
Actually, this is a simplified example so your intuition is correct, there are in fact 7 reviewers so I would have reviewer_A to reviewer_G. Note that only two reviewers would ever review a single study.
In this case, would a correct code modification for the 7 reviewer situation be:
data want;
set have;
array abcdefg {7} reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;
array new {2} $ 8 reviewer_1 reviewer_2;
i=0;
do j=1 to 7;
if abcdefg{j} > ' ' then do;
i + 1;
if i < 3 then new{i} = abcdefg{j};
end;
end;
drop i j reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;
run;
??
Many thanks again.
Yes, that looks fine.
Your array names are up to you, so you could use "abc" instead of "abcdefg" ... just a choice you make like you choose names for new variables.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.