Dear anyone
Once again I turn to this site for help.
I have data=want with the variables var1 and var2. Var1 is single choice, either a og b. Var2 is multiple choice, a-d. My data is organized as shown in table 1.
I want to cross-table var1 with var2, as shown in table 2. I have tried proc freq with var1*(var2i-var2iv), but that just gave me four different tables.
I could make a 0/1 variable for each possible answer (a-d), but I have many variables to analyze and it would take a lot of time and typing.
Any suggestions on how to get around this
Sincerely
Anders
Table 1 | ||||||
ID | Var1 | Var2i | Var2ii | Var2iii | Var2iv | |
11 | a | a | c | |||
12 | b | |||||
13 | b | b | c | |||
14 | a | a | b | c | d | |
15 | a | c | ||||
16 | b | c | d | |||
17 | b | b | d | |||
18 | b | |||||
19 | a | a | c | d | ||
20 | a | d | ||||
Table 2 | Var2 | |||||
a | b | c | d | |||
Var 1 | a | 3 | 1 | 4 | 3 | |
b | 0 | 2 | 2 | 2 | ||
I think the easiest way is to transpose your data so that the values of var2i-var2iv are all in 1 column. Doing the crosstab after that is then simple.
data have;
infile cards missover;
input id var1 $ var2i $ var2ii $ var2iii $ var2iv $;
cards;
11 a a c
12 b
13 b b c
14 a a b c d
15 a c
16 b c d
17 b b d
18 b
19 a a c d
20 a d
;
run;
data have_trans;
set have;
array trans_var{*} var2: ;
do i=1 to dim(trans_var);
if missing(trans_var{i}) then leave;
new_var=trans_var{i};
output;
end;
drop var2: i;
run;
proc freq data=have_trans;
table var1*new_var / nopercent norow nocol;
run;
I think the easiest way is to transpose your data so that the values of var2i-var2iv are all in 1 column. Doing the crosstab after that is then simple.
data have;
infile cards missover;
input id var1 $ var2i $ var2ii $ var2iii $ var2iv $;
cards;
11 a a c
12 b
13 b b c
14 a a b c d
15 a c
16 b c d
17 b b d
18 b
19 a a c d
20 a d
;
run;
data have_trans;
set have;
array trans_var{*} var2: ;
do i=1 to dim(trans_var);
if missing(trans_var{i}) then leave;
new_var=trans_var{i};
output;
end;
drop var2: i;
run;
proc freq data=have_trans;
table var1*new_var / nopercent norow nocol;
run;
Hi Keith
Awesome 🙂 I had a notion of transposing would be the right thing to do - but I had no idea of how to do it. You obviously had 🙂 Only, I had to put in a proc sort statement, so I ended up doing:.
data have;
infile cards missover;
input id var1 $ var2i $ var2ii $ var2iii $ var2iv $;
cards;
11 a a c
12 b
13 b b c
14 a a b c d
15 a c
16 b c d
17 b b d
18 b
19 a a c d
20 a d
;
run;
proc sort data=have; by id; run;
data have_trans;
set have;
by id;
array trans_var{*} var2: ;
do i=1 to dim(trans_var);
if missing(trans_var{i}) then leave;
new_var=trans_var{i};
output;
end;
drop var2: i;
run;
Thanks a 10^6
Sincerely
Anders
Glad to be of help. I actually made a small modification to my original code, the 'by ID' line in the data step is not strictly necessary so I took it out. Then you wouldn't then to pre-sort the dataset.
Or you could try a SQL way which don't need to transpose dataset.
If there were many variables, you can use a macro variable to wrap them all.
data have; infile cards missover; input id var1 $ var2i $ var2ii $ var2iii $ var2iv $; cards; 11 a a c 12 b 13 b b c 14 a a b c d 15 a c 16 b c d 17 b b d 18 b 19 a a c d 20 a d ; run; proc sql; create table want as select var1,sum(var2i='a')+sum(var2ii='a')+sum(var2iii='a')+sum(var2iv='a') as a, sum(var2i='b')+sum(var2ii='b')+sum(var2iii='b')+sum(var2iv='b') as b, sum(var2i='c')+sum(var2ii='c')+sum(var2iii='c')+sum(var2iv='c') as c, sum(var2i='d')+sum(var2ii='d')+sum(var2iii='d')+sum(var2iv='d') as d from have group by var1; quit;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.