SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How do I select and keep the first two non-missing values across a fixed range of variables?

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

How do I select and keep the first two non-missing values across a fixed range of variables?

Hi there,

 

I am using SAS v9.4. I have a dataset that looks like this:

 

refidtitleabstractreviewer_Areviewer_Breviewer_Creviewer_D
1how to have funthis is a study about how to have funincludeinclude  
2gaming and being awesomewe found that gaming makes you awesome  includeexclude

 

And I would like to transform it so it looks like this:

-

refidtitleabstractreviewer_1reviewer_2
1how to have funthis is a study about how to have funincludeinclude
2gaming and being awesomewe found that gaming makes you awesomeincludeexclude

 

That is, I would like to select the first TWO non-missing values across the range reviewer_A--reviewer_D in each row (each refid) and put the first in reviewer_1 and the second in reviewer_2.

 

Very much would appreciate any suggestions on how to program this using data steps or SQL. I have tried the proc sql coalescec approach but I haven't been able to get it to select the first and second values.


Accepted Solutions
Solution
‎11-30-2016 07:05 PM
Super User
Posts: 5,083

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Given that you have only a few reviewers, this might be overkill.  But it will work here (as well as when you have many more possibilities):

 

data want;

set have;

array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;

array new {2} $ 8 reviewer_1 reviewer_2;

i=0;

do j=1 to 4;

   if abcd{j} > ' ' then do;

      i + 1;

      if i < 3 then new{i} = abcd{j};

   end;

end;

drop i j reviewer_A reviewer_B reviewer_C reviewer_D;

run;

View solution in original post


All Replies
Solution
‎11-30-2016 07:05 PM
Super User
Posts: 5,083

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Given that you have only a few reviewers, this might be overkill.  But it will work here (as well as when you have many more possibilities):

 

data want;

set have;

array abcd {4} reviewer_A reviewer_B reviewer_C reviewer_D;

array new {2} $ 8 reviewer_1 reviewer_2;

i=0;

do j=1 to 4;

   if abcd{j} > ' ' then do;

      i + 1;

      if i < 3 then new{i} = abcd{j};

   end;

end;

drop i j reviewer_A reviewer_B reviewer_C reviewer_D;

run;

New Contributor
Posts: 2

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

 

Thank you, Astounding.

 

Actually, this is a simplified example so your intuition is correct, there are in fact 7 reviewers so I would have reviewer_A to reviewer_G. Note that only two reviewers would ever review a single study.

 

In this case, would a correct code modification for the 7 reviewer situation be:

 

data want;

 

set have;

 

array abcdefg {7} reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;

 

array new {2} $ 8 reviewer_1 reviewer_2;

 

i=0;

 

do j=1 to 7;

 

   if abcdefg{j} > ' ' then do;

 

      i + 1;

 

      if i < 3 then new{i} = abcdefg{j};

 

   end;

 

end;

 

drop i j reviewer_A reviewer_B reviewer_C reviewer_D reviewer_E reviewer_F reviewer_G;

 

run;

 

 

??

 

Many thanks again.

Super User
Posts: 5,083

Re: How do I select and keep the first two non-missing values across a fixed range of variables?

Yes, that looks fine.  

 

Your array names are up to you, so you could use "abc" instead of "abcdefg" ... just a choice you make like you choose names for new variables.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 210 views
  • 2 likes
  • 2 in conversation