Hello:
I have a data set as below. I want to create a two-table that each cell has a value as percentage of changes in values. E.G. for variable "x1", it has values: a, b, c, d (in exact data, it has more values. so using loop for all values are desired). I want to get the percent of "a" changing to "b", "c", "d", and percent of remaining in "a" (based on the values of variable "x2").
The data:
data test;
input x1 $1 x2 $1;
cards;
a a
a a
a b
a c
a d
b b
b a
b c
b d
c c
c c
c c
c a
d d
d a
d b
d b
;
run;
So my goal is to get a table like:
a b c d
a 2/5 1/5 1/5 1/5
b 1/4 1/4 1/4 1/4
c 1/4 3/4 0 0
d 1/4 2/4 0 1/4
Any help is great!
Thanks a lot!
data test;
input x1 $1. x2 :$1.;
cards;
a a
a a
a b
a c
a d
b b
b a
b c
b d
c c
c c
c c
c a
d d
d a
d b
d b
;
run;
proc sql;
create table t as
select x1,x2, count(x2) as c
from test
group by x1,x2;
quit;
proc transpose data=t out=want(drop=_name_) ;
by x1;
var c;
id x2;
run;
data final_want;
set want;
array t(*) _numeric_;
k=sum(of t(*));
do _n_=1 to dim(t);
if missing(t(_n_)) then t(_n_)=0;
else t(_n_)=t(_n_)/k*100;
end;
drop k;
run;
Sorry, i think my data code is somehow getting problems, Please use the following:
data have;
input x1 $ x2 $;
datalines;
a a
a a
a b
a c
a d
b b
b a
b c
b d
c c
c c
c c
c a
d d
d a
d b
d b
;
data test;
input x1 $1. x2 :$1.;
cards;
a a
a a
a b
a c
a d
b b
b a
b c
b d
c c
c c
c c
c a
d d
d a
d b
d b
;
run;
proc sql;
create table t as
select x1,x2, count(x2) as c
from test
group by x1,x2;
quit;
proc transpose data=t out=want(drop=_name_) ;
by x1;
var c;
id x2;
run;
data final_want;
set want;
array t(*) _numeric_;
k=sum(of t(*));
do _n_=1 to dim(t);
if missing(t(_n_)) then t(_n_)=0;
else t(_n_)=t(_n_)/k*100;
end;
drop k;
run;
Great! That's what I want.
Thank you very much!
or using proc freq and proc transpose?
proc freq data=have;
tables x1*x2 / noprint nofreq nocol nopercent sparse outpct out=temp(keep=x1 x2 PCT_ROW);
run;
proc transpose data=temp out=want(drop=_Name_ _Label_);
by x1;
id x2;
run;
proc print data=want noobs; run;
Hello Wong:
The code works well. Thanks!
Hello novinosrin:
I have a follow up question. When I ran the code on my real data, it gave me an error:
-----
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
The error is right after the code
proc sql;
create table t as
select x1,x2, count(x2) as c
from test
group by x1,x2;
My data is not sorted.I also got an error later as: Data set MYDATA.T is not sorted in ascending sequence.
Does the code only run if the data is sorted?
Thanks.
Does the code only run if the data is sorted?
No, Sql doesn't require for the dataset to be sorted. I am not sure why the code isn't working for your real data. However @Miracle 's code is rather robust as proc freq takes care of most of the work. I would suggest to try that as opposed to the longer process.
Thank you for your reply. You are right that Wong's code works fine. I am also interested in learning sql process. Not sure why the code not working on my real data. But your code also teaches me something I need in the future. Thanks!
Indeed that's why i am in this forum. The sql logic is too simple and aint a big deal. If it's not a matter of data security in business production environment, i would ask you to share more of your real so that I can debug and test from my side. Obviously, we can't do that against ethics.
Well, I cannot share the real data here. It is confidential. But it basically just like the test data. The only two variables needed are locations of a firm who may move to other locations or remain in original place in certain period. Also, it is not sorted. I didn't see anything wrong with the code. Just don't know why I got that error. I also notice that it creates a table "t" in the sql procedure. But the column "c" has only one value which is the number of observations.
data test;
input x1 $ x2 $;
cards;
a a
a a
a b
a c
a d
b b
b a
b c
b d
c c
c c
c c
c a
d d
d a
d b
d b
;
run;
proc tabulate data=test;
class x1 x2;
table x1='',x2=''*rowpctn=''/misstext='0';
run;
Thanks for your reply. It works. What if I also want the frequency count in the result?
OK. Easy for PROC TABULATE .
data test;
input x1 $ x2 $;
cards;
a a
a a
a b
a c
a d
b b
b a
b c
b d
c c
c c
c c
c a
d d
d a
d b
d b
;
run;
proc tabulate data=test;
class x1 x2;
table x1='',x2=''*(n='n' rowpctn='rowpctn%')/misstext='0';
run;
Thank you!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.