I am trying to create two tables with both numeric and percentage result. I was given two tables
ID | CC |
---|---|
1 | 2 |
1 | 5 |
1 | 40 |
2 | 55 |
2 | 2 |
2 | 130 |
2 | 177 |
3 | 20 |
3 | 55 |
3 | 40 |
4 | 30 |
4 | 100 |
ID | Brand |
---|---|
1 | Dell |
1 | Lenovo |
1 | HP |
2 | Sony |
2 | Dell |
2 | Acer |
2 | Other |
3 | Fujitsu |
3 | Sony |
3 | HP |
4 | Apple |
4 | Asus |
I have already created a table that looks like by using the attached code. In other words, how many id have cc1 also have cc2..cc177..etc. Now, I am wondering if it's doable to add percentage next to each number. For instance if CC1*CC1=264 (100%) then CC1*CC2= 5/264=1.9% I would like to know how to add percentage next to each number.
CC | CC1 | CC2 | CC177 |
---|---|---|---|
1 | 264 | 5 | 1 |
2 | 250 | 132 | 6 |
3 | |||
177 | 292 | 129 | 12 |
I would like my table looks like
CC | CC1 | CC2 | CC177 |
---|---|---|---|
1 | 264 (100%) | 5 (1.9%) | 1 |
2 | 250 | 132 | 6 |
3 | |||
177 | 292 | 129 | 12 |
data RESULT_PRE;
set ID_CC;
by ID;
retain CC1-CC177;
array CC_LIST(177) CC1-CC177;
if first.ID then do i=1 to 177;
CC_LIST(i)=0;
end;
CC_LIST(CC)=1;
if last.id then output;
run;
data RESULT;
set RESULT_PRE;
array CC_LIST(177) CC1-CC177;
array HCC_LIST(177) $ HCC1-HCC177;
do i=1 to 177;
percent=CC_LIST(i)/CC_LIST(_n_);
HCC_LIST(i)=cats(CC_LIST(i), "(", put(CC_LIST(i)/CC_LIST(_n_), percent8.1), ")");
end;
run;
Another table I am trying to create is to have description of each CC on the table. Each CC number stands for one brand. 2=Dell 177=Other, etc.
If I want to change the CC1 CC2 to characters, how do I modify the arrays? Eventually, I would like my table looks like
Description | Dell | Lenovo | HP |
---|---|---|---|
Dell | 264 (100%) | ||
Lenovo | 10 (2%) | ||
HP | 50(10%) | ||
Sony |
In other words, how many people have dell also have acer, sony, other, etc?
Please help me with this! I'm very desperate! I'm very new to SAS. Thank you!!
This may not be the most efficient but I think it's pretty straightforward. It works fine in my example, I simplified it to 10 companies. The final steps include renaming the variables as well.
You'll need to adapt it to your data of course. Make sure to verify that your data out of the proc corr procedure does have the 177 variables.
Good Luck!
*Generate sample data - up to 10 observations;
data have_random;
do i=1 to 100;
ID=floor(rand('uniform')*10+1);
CC=floor(rand('uniform')*10+1);
output;
end;
drop i;
run;
proc sort data=have_random nodupkey;
by ID CC;
run;
*Create indicator table with CC1-CC10;
data step1;
set have_random;
by ID;
retain CC1-CC10;
array CC_List(10) CC1-CC10;
if first.ID then do i=1 to 10;
CC_LIST(i)=0;
end;
CC_List(CC)=1;
drop CC;
if last.ID then output;
run;
*Create matrix of counts;
ods output sscp=step2;
ods select sscp;
proc corr data=step1 sscp;
var CC1-CC10;
run;
proc print data=step2;
run;
*Add in percentages to data;
data step3;
set step2;
array cc_list(10) cc1-cc10;
array dd_list(10) $20. dd1-dd10;
do i=1 to 10;
percent=cc_list(i)/cc_list(_n_);
dd_list(i)=catt(cc_list(i), " (", put(percent, percent8.1 -l), ")");
end;
drop i percent cc:;
run;
*Rename to old variables - not really necessary but makes things a touch easier;
data step4;
set step3;
rename dd1-dd10 = cc1-cc10;
run;
*Create sample data for company names;
data rename_list;
input CC Name $12.;
cards;
1 ASUS
2 DELL
3 APPLE
4 IBM
5 SAMSUNG
6 TOSHIBA
7 LG
8 GOOGLE
9 AIRBNB
10 INTEL
;
run;
*Create format to be used for rename and format;
data format_list;
set rename_list;
length label $20.;
start=cats("CC", put(CC, 2. -l));
label=Name;
fmtname="CC_Name";
type="C";
drop Name CC;
run;
*Create format to apply to variable;
proc format cntlin=format_list;
run;
*Create rename list from format variable - allows renaming of variables;
proc sql noprint;
select catx("=", start, label) into :rename_list separated by " "
from format_list;
quit;
%put &rename_list;
*Create final output table - recode company name and rename variables;
data step5;
length Company $20.;
set step4;
Company=put(variable, $cc_name.);
rename &rename_list;
drop Variable;
run;
*Display final output;
proc print data=step5;
run;
This may not be the most efficient but I think it's pretty straightforward. It works fine in my example, I simplified it to 10 companies. The final steps include renaming the variables as well.
You'll need to adapt it to your data of course. Make sure to verify that your data out of the proc corr procedure does have the 177 variables.
Good Luck!
*Generate sample data - up to 10 observations;
data have_random;
do i=1 to 100;
ID=floor(rand('uniform')*10+1);
CC=floor(rand('uniform')*10+1);
output;
end;
drop i;
run;
proc sort data=have_random nodupkey;
by ID CC;
run;
*Create indicator table with CC1-CC10;
data step1;
set have_random;
by ID;
retain CC1-CC10;
array CC_List(10) CC1-CC10;
if first.ID then do i=1 to 10;
CC_LIST(i)=0;
end;
CC_List(CC)=1;
drop CC;
if last.ID then output;
run;
*Create matrix of counts;
ods output sscp=step2;
ods select sscp;
proc corr data=step1 sscp;
var CC1-CC10;
run;
proc print data=step2;
run;
*Add in percentages to data;
data step3;
set step2;
array cc_list(10) cc1-cc10;
array dd_list(10) $20. dd1-dd10;
do i=1 to 10;
percent=cc_list(i)/cc_list(_n_);
dd_list(i)=catt(cc_list(i), " (", put(percent, percent8.1 -l), ")");
end;
drop i percent cc:;
run;
*Rename to old variables - not really necessary but makes things a touch easier;
data step4;
set step3;
rename dd1-dd10 = cc1-cc10;
run;
*Create sample data for company names;
data rename_list;
input CC Name $12.;
cards;
1 ASUS
2 DELL
3 APPLE
4 IBM
5 SAMSUNG
6 TOSHIBA
7 LG
8 GOOGLE
9 AIRBNB
10 INTEL
;
run;
*Create format to be used for rename and format;
data format_list;
set rename_list;
length label $20.;
start=cats("CC", put(CC, 2. -l));
label=Name;
fmtname="CC_Name";
type="C";
drop Name CC;
run;
*Create format to apply to variable;
proc format cntlin=format_list;
run;
*Create rename list from format variable - allows renaming of variables;
proc sql noprint;
select catx("=", start, label) into :rename_list separated by " "
from format_list;
quit;
%put &rename_list;
*Create final output table - recode company name and rename variables;
data step5;
length Company $20.;
set step4;
Company=put(variable, $cc_name.);
rename &rename_list;
drop Variable;
run;
*Display final output;
proc print data=step5;
run;
Reeza, you are such a life saver!! Thank you very much!! That's exactly what I need.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.