Solved: How do I select and keep the first two non-missing values across a fix...

dantes · Posted 11-30-2016 05:08 PM

Hi there,

I am using SAS v9.4. I have a dataset that looks like this:

refid	title	abstract	reviewer_A	reviewer_B	reviewer_C	reviewer_D
1	how to have fun	this is a study about how to have fun	include	include
2	gaming and being awesome	we found that gaming makes you awesome			include	exclude

And I would like to transform it so it looks like this:

-

refid	title	abstract	reviewer_1	reviewer_2
1	how to have fun	this is a study about how to have fun	include	include
2	gaming and being awesome	we found that gaming makes you awesome	include	exclude

That is, I would like to select the first TWO non-missing values across the range reviewer_A--reviewer_D in each row (each refid) and put the first in reviewer_1 and the second in reviewer_2.

Very much would appreciate any suggestions on how to program this using data steps or SQL. I have tried the proc sql coalescec approach but I haven't been able to get it to select the first and second values.

Astounding · Posted 11-30-2016 05:20 PM

Given that you have only a few reviewers, this might be overkill. But it will work here (as well as when you have many more possibilities):

data want;

set have;

array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;

array new {2} $ 8 reviewer_1 reviewer_2;

i=0;

do j=1 to 4;

if abcd{j} > ' ' then do;

i + 1;

if i < 3 then new{i} = abcd{j};

end;

drop i j reviewer_A reviewer_B reviewer_C reviewer_D;

run;

View solution in original post

Astounding · Posted 11-30-2016 05:20 PM

Given that you have only a few reviewers, this might be overkill. But it will work here (as well as when you have many more possibilities):

data want;

set have;

array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;

array new {2} $ 8 reviewer_1 reviewer_2;

i=0;

do j=1 to 4;

if abcd{j} > ' ' then do;

i + 1;

if i < 3 then new{i} = abcd{j};

end;

drop i j reviewer_A reviewer_B reviewer_C reviewer_D;

run;

dantes · Posted 11-30-2016 05:33 PM

Thank you, Astounding.

Actually, this is a simplified example so your intuition is correct, there are in fact 7 reviewers so I would have reviewer_A to reviewer_G. Note that only two reviewers would ever review a single study.

In this case, would a correct code modification for the 7 reviewer situation be:

data want;

set have;

array abcdefg {7} reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;

array new {2} $ 8 reviewer_1 reviewer_2;

i=0;

do j=1 to 7;

if abcdefg{j} > ' ' then do;

i + 1;

if i < 3 then new{i} = abcdefg{j};

end;

drop i j reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;

run;

??

Many thanks again.

Astounding · Posted 11-30-2016 06:59 PM

Yes, that looks fine.

Your array names are up to you, so you could use "abc" instead of "abcdefg" ... just a choice you make like you choose names for new variables.

How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Registration is open