Hi there,
I am using SAS v9.4. I have a dataset that looks like this:
refid | title | abstract | reviewer_A | reviewer_B | reviewer_C | reviewer_D |
1 | how to have fun | this is a study about how to have fun | include | include | ||
2 | gaming and being awesome | we found that gaming makes you awesome | include | exclude |
And I would like to transform it so it looks like this:
-
refid | title | abstract | reviewer_1 | reviewer_2 |
1 | how to have fun | this is a study about how to have fun | include | include |
2 | gaming and being awesome | we found that gaming makes you awesome | include | exclude |
That is, I would like to select the first TWO non-missing values across the range reviewer_A--reviewer_D in each row (each refid) and put the first in reviewer_1 and the second in reviewer_2.
Very much would appreciate any suggestions on how to program this using data steps or SQL. I have tried the proc sql coalescec approach but I haven't been able to get it to select the first and second values.
Given that you have only a few reviewers, this might be overkill. But it will work here (as well as when you have many more possibilities):
data want;
set have;
array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;
array new {2} $ 8 reviewer_1 reviewer_2;
i=0;
do j=1 to 4;
if abcd{j} > ' ' then do;
i + 1;
if i < 3 then new{i} = abcd{j};
end;
end;
drop i j reviewer_A reviewer_B reviewer_C reviewer_D;
run;
Given that you have only a few reviewers, this might be overkill. But it will work here (as well as when you have many more possibilities):
data want;
set have;
array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;
array new {2} $ 8 reviewer_1 reviewer_2;
i=0;
do j=1 to 4;
if abcd{j} > ' ' then do;
i + 1;
if i < 3 then new{i} = abcd{j};
end;
end;
drop i j reviewer_A reviewer_B reviewer_C reviewer_D;
run;
Thank you, Astounding.
Actually, this is a simplified example so your intuition is correct, there are in fact 7 reviewers so I would have reviewer_A to reviewer_G. Note that only two reviewers would ever review a single study.
In this case, would a correct code modification for the 7 reviewer situation be:
data want;
set have;
array abcdefg {7} reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;
array new {2} $ 8 reviewer_1 reviewer_2;
i=0;
do j=1 to 7;
if abcdefg{j} > ' ' then do;
i + 1;
if i < 3 then new{i} = abcdefg{j};
end;
end;
drop i j reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;
run;
??
Many thanks again.
Yes, that looks fine.
Your array names are up to you, so you could use "abc" instead of "abcdefg" ... just a choice you make like you choose names for new variables.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.