Data Have:
ID | Q_TYPE | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
A01 | A | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | A | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | A | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | A | NULL | NULL | NULL | 6 | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | A | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | A | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | A | NULL | NULL | NULL | NULL | NULL | NULL | 3 | NULL | NULL | NULL |
A01 | A | NULL | NULL | NULL | NULL | NULL | NULL | 4 | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | 5 | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | 8 | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | 5 | NULL | NULL | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | 6 | NULL | NULL | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | 3 | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | B | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | C | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | C | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | C | 3 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | 5 | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | 6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 5 |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 3 |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 5 | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 6 | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | NULL | 8 | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | NULL | 9 | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | 5 | NULL | NULL | NULL | NULL | NULL |
A01 | C | NULL | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | NULL |
A02 | D | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
A03 | E | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
DATA WANT:
ID | Q_TYPE | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 | COUNT |
A01 | A | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | A | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | A | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | A | NULL | NULL | NULL | 6 | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | A | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | A | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | A | NULL | NULL | NULL | NULL | NULL | NULL | 3 | NULL | NULL | NULL | |
A01 | A | NULL | NULL | NULL | NULL | NULL | NULL | 4 | NULL | NULL | NULL | 3 |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | 5 | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | 8 | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | NULL | 5 | NULL | NULL | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | NULL | 6 | NULL | NULL | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | 3 | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | B | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | NULL | NULL | 3 |
A01 | C | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | C | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | C | 3 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | 5 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | 6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 5 | |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 3 | |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 4 | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 5 | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 6 | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | NULL | 8 | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | NULL | 9 | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | 7 | NULL | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | 5 | NULL | NULL | NULL | NULL | NULL | |
A01 | C | NULL | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | NULL | 6 |
A02 | D | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 0 |
A03 | E | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 0 |
The variable Q_TYPE is the questionnaire type. Each questionnaire can have 10 questions or fewer; answered questions are listed with a value from 1 to 10, and unanswered ones are NULL.
For each ID and each questionnaire type, I want to count how many questions were answered. In some cases the same ID answered the same question of the same questionnaire several times; in that case keep the last answer as the final one, that is, add only 1 to the total count.
I hope I explained my question clearly enough.
Thank you, and happy new year!
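Here is a one-step DATA step approach:
data want;
  set have;
  by id q_type;
  /* assumes Q1-Q10 are character and hold the text 'NULL' when unanswered */
  array q q1-q10;
  /* _q(i)=1 while question i has not yet been counted in this group */
  array _q(1:10) _temporary_;
  if first.q_type then do;
    ct=0;
    do i=1 to dim(_q);
      _q(i)=1;
    end;
  end;
  do i=1 to dim(q);
    /* count each question only the first time it shows an answer */
    ct+(q(i) ne 'NULL')*_q(i);
    if q(i) ne 'NULL' then _q(i)=0;
  end;
  /* report the total only on the last row of each group */
  count=ifn(last.q_type,ct,.);
  drop i ct;
run;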
Haikuo
OK, so the way I understand this, you want the ID variable, the questionnaire type, and the number of questions that were answered. So essentially your final table should include only three variables: ID, questionnaire type, and count. Also, I didn't really understand your comment about adding 1 to the total, so I can't really address that.
To do this you have a couple of options, but I think the easiest and fastest way is to use both SQL and a DATA step.
proc sql;
  create table good as
  select distinct ID, q_type, (max(Q1)) as Q1, (max(Q2)) as Q2, (max(Q3)) as Q3, (max(Q4)) as Q4, (max(Q5)) as Q5,
         (max(Q6)) as Q6, (max(Q7)) as Q7, (max(Q8)) as Q8, (max(Q9)) as Q9, (max(Q10)) as Q10
  from bad
  group by ID, q_type;
quit;
/*The SQL code takes the maximum value of each response within a group, which leaves a null value for questions that only have null values*/
data good;
  set good;
  array Que (10) Q1-Q10;
  do i = 1 to 10;
    /* turn each answer into an indicator: 1 if answered, 0 if missing */
    que(i) = not missing(que(i));
  end;
  count = sum(of Q1-Q10);
  keep id q_type count;
run;
/*Converting the answers to 1's and the missing values to 0's lets sum count the answered questions, and then you can just keep the three variables that you wanted*/
If you actually wanted values for the responses then other code would need to be written.
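For the count alone, the SQL step can probably do the whole job by itself. This is only a sketch: it assumes Q1-Q10 are numeric with missing values for unanswered questions and that the input table is called HAVE, as in the original post. COUNTS is just a placeholder name.

```sas
proc sql;
  create table counts as
  select id, q_type,
         /* each comparison is 1 if the question was answered at least once
            in the group and 0 otherwise, so the sum is the number answered */
         (max(q1) is not missing) + (max(q2) is not missing) +
         (max(q3) is not missing) + (max(q4) is not missing) +
         (max(q5) is not missing) + (max(q6) is not missing) +
         (max(q7) is not missing) + (max(q8) is not missing) +
         (max(q9) is not missing) + (max(q10) is not missing) as count
  from have
  group by id, q_type;
quit;
```

Repeated answers to the same question collapse into a single MAX per group, so each question adds at most 1 to the count.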
This might be an issue that could use the UPDATE statement. It will allow you to collapse all of the variables to the last non-missing value.
data have ;
input id $ q_type $ q1-q10 ;
cards;
A01 A . . . . . . . . . .
A01 A 1 . . . . . . . . .
A01 A 2 . . . . . . . . .
A01 A . . . 6 . . . . . .
A01 A . . . 7 . . . . . .
A01 A . . . 4 . . . . . .
A01 A . . . . . . 3 . . .
A01 A . . . . . . 4 . . .
A01 B . . . . . . 5 . . .
A01 B . . . . . . 7 . . .
A01 B . . . . . . 8 . . .
A01 B . . . . . . . . . .
A01 B . . . . 5 . . . . .
A01 B . . . . 6 . . . . .
A01 B . . . . 7 . . . . .
A01 B . . . 3 . . . . . .
A01 B . . . 4 . . . . . .
A01 C 1 . . . . . . . . .
A01 C 2 . . . . . . . . .
A01 C 3 . . . . . . . . .
A01 C . . 5 . . . . . . .
A01 C . . 6 . . . . . . .
A01 C . . . . . . . . . 5
A01 C . . . . . . . . . 3
A01 C . . . . . . . 4 . .
A01 C . . . . . . . 5 . .
A01 C . . . . . . . 6 . .
A01 C . . . . . 7 . . . .
A01 C . . . . . 8 . . . .
A01 C . . . . . 9 . . . .
A01 C . . . . 7 . . . . .
A01 C . . . . 5 . . . . .
A01 C . . . . 4 . . . . .
A02 D . . . . . . . . . .
A03 E . . . . . . . . . .
run;
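data counts ;
  /* empty master + HAVE as transactions: within each BY group UPDATE keeps */
  /* the last non-missing value of every variable */
  update have(obs=0) have ;
  by id q_type ;
  if last.q_type then count=n(of q1-q10);
  /* explicit OUTPUT writes every row, not just the last one per group */
  output;
  keep count ;
run;
data want ;
  /* one-to-one merge: row i of HAVE is paired with row i of COUNTS */
  merge have counts;
  * NO BY STATEMENT ;
run;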
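If you also want the final answers themselves, here is a minimal sketch of the same UPDATE idea without the explicit OUTPUT. It assumes the HAVE data set created above (numeric Q1-Q10, sorted by ID and Q_TYPE); the name COLLAPSED is just a placeholder.

```sas
data collapsed;
  /* empty master, HAVE as the transactions: within each BY group,
     non-missing values overwrite earlier ones and missing values are ignored */
  update have(obs=0) have;
  by id q_type;
  /* with no OUTPUT statement, UPDATE writes one row per BY group,
     after the last transaction of the group has been applied */
  count = n(of q1-q10);
run;
```

This should give one row per ID and Q_TYPE, holding the last non-missing answer to each question plus the count of answered questions.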
I think that Haikuo's code does what you want. The following, methinks, is simply a slightly simpler version of the same approach:
data want;
  set have;
  by id q_type;
  array q q1-q10;
  array _q(10) _temporary_;
  if first.q_type then do;
    call missing(of _q(*));
  end;
  do _n_=1 to dim(q);
    if q(_n_) then _q(_n_)=1;
  end;
  if last.q_type then count=max(0,sum(of _q(*)));
run;
Definitely a nicer approach, Art! I think you may not need "retain _q:;", as a temporary array is retained by default?
Haikuo
Art
I think there is an argument for keeping the RETAIN as an explicit statement in this case. Because the statement is interpreted at compile time, it does not make the code any less efficient, but it does signal (to anyone who has to maintain the code) that the values are being retained.
Richard now back in NZ
And a prosperous New Year to you
PS was stoked at the poke in your Christmas joke