HAVE:
pnr | Var1 | Var4 | Var5 | Var7 | Var10 | Var14 |
111111 | b | b | b | c | c | b |
111112 | e | e | e | e | e | e |
111113 | a | a | b | b | d | d |
111114 | e | e | e | e | e | e |
Want:
pnr | Var1 | Var4 | Var5 | Var7 | Var10 | Var14 | a | b | c | d | e |
111111 | b | b | b | c | c | b | 0 | 4 | 2 | 0 | 0 |
111112 | e | e | e | e | e | e | 0 | 0 | 0 | 0 | 5 |
111113 | a | a | b | b | d | d | 2 | 2 | 0 | 2 | 0 |
111114 | e | e | c | c | e | e | 0 | 0 | 3 | 0 | 4 |
There must be a simpel way to do it! Im new in SAS, so please try to think of that, while you make an answer.
thank you so much for taking your time to answer this,
Using SAS 9.4
Rebekka
Post test data in the form of a datastep, using the code window (its the {i} above post area). Follow this post if you need help:
I would imagine the simplest method is to do a quick normalise of your data (wide to long - as a tip for the future, working with normalised data is far easier from a programming point of view - you can still transpose for output):
data have;
input pnr Var1 $ Var4 $ Var5 $ Var7 $ Var10 $ Var14 $;
datalines;
111111 b b b c c b
111112 e e e e e e
;
run;
proc transpose data=have out=inter;
by pnr;
var var:;
run;
proc sort data=inter;
by pnr col1;
run;
data inter;
set inter;
retain cnt 0;
by pnr col1;
cnt=ifn(first.col1,1,cnt+1);
if last.col1 then output;
run;
proc transpose data=inter out=t_inter;
by pnr;
var cnt;
id col1;
idlabel col1;
run;
data want;
merge have t_inter;
by pnr;
run;
Note that I have gone for a more generic approach, i.e. I have not hardcoded the letters in. If you only want a count of a, then you could hardcode it (by this I mean if the letter z appeared for instance, then you wouldn't get a column z with the count, as its not flexible):
data want; set have; a=count(catx(',',of var:),'a'); b=count(catx(',',of var:),'b'); ... run;
the var: means all variables with prefix of var.
Are the values of your variables restricted to a known set of values such as literal a, b, c etc. or are they actually not known at the start of the process? And are the values actually character?
data have;
input pnr Var1 $ Var4 $ Var5 $ Var7 $ Var10 $ Var14 $;
datalines;
111111 b b b c c b
111112 e e e e e e
;
run;
data want;
set have;
array v(*) var:;
array _v(*) a b c d e;
_k=cats(of v(*));
do _n_=1 to dim(_v);
_v(_n_) = countc(_k, vname(_v(_n_)));
end;
drop _:;
run;
@Rebekka wrote:
The values are characters that are known (they are actuallys numbers from 1
to 12, but defined as charaters in SAS). I hope that was answer enougth 🐵
Thanks Rebekka
Since digits alone are not valid variable names do you want the resulting variable names to be _1 _2 ... _12, start with something other than _ or require the cumbersome the '1'n or "2"n name literal constructs? Large economy sized hint: Name literals can lead to a number of issues maintaining code. And if you want the variable names to sort in order you might really want 01 instead of 1 such as NewV01 to NewV12.
You can use DO loops, two nested ones.
This assumes you know the values ahead of time, but there are ways to get those values either way.
Here's an example, based of @RW9 solution. Note that your 'output' for the second row is incorrect, you have 6 E's not 5.
data have;
input pnr Var1 $ Var4 $ Var5 $ Var7 $ Var10 $ Var14 $;
datalines;
111111 b b b c c b
111112 e e e e e e
;
run;
data want;
set have;
array _new(*) a b c d e (5*0);
array _old(*) var1--var14;
do i=1 to dim(_old);
do j=1 to dim(_new);
if vname(_new(j)) = _old(i) then _new(j)+1;
end;
end;
drop i j;
run;
data have;
input pnr Var1 $ Var4 $ Var5 $ Var7 $ Var10 $ Var14 $;
datalines;
111111 b b b c c b
111112 e e e e e e
;
run;
proc iml;
use have(keep=var:);
read all var _ALL_ into x;
close;
call tabulate(level,freq,x);
level=t(level);
create level var{level};
append;
close;
quit;
proc sql noprint;
select level into : list separated by ' ' from level;
quit;
data want;
set have;
array x{*} &list;
array y{*} $ var: ;
do k=1 to dim(x);
x{k}=0;
end;
do i=1 to dim(y);
do j=1 to dim(x);
if upcase(y{i})=upcase(vname(x{j})) then do;x{j}=x{j}+1;leave;end;
end;
end;
drop i j k;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.