Hi,
Could someone help me with the code to solve this problem?
From data oc below, I would like to select the industry code(indscode) with the following conditions:
(1) In which at least 50% of the jobs are exposed to agent a(0=unexposed, 1=exposed).[I want all percentages to be displayed and from there I will select the >=50%].
(2) Couting number of subjects( lung cases(case) and controls(cont) for the selected indscode.
I used proc SQL, expecting the indscode variable to be displayed on the table results, but it didn't.
Coding help is needed to have the indscode count and % displayed on the table,please.
Please find my code and results below:
Thanks in advance.
ak.
data oc;
input id$ 1-5 job 6-7 a 8-9 b 10-11 indscode 12-16 lung$ 17-21;
datalines;
OS1 3 1 0 6198 case
OS2 2 1 1 6191 cont
OS3 3 1 0 6191 case
OS4 3 0 0 6198 cont
OS5 3 1 0 6191 case
OS6 3 1 0 6191 cont
OS6 4 1 1 6198 case
OS7 4 1 1 6191 case
OS8 1 1 1 6191 cont
OS9 1 0 0 6198 cont
OS10 1 1 1 6191 cont
OS11 3 1 1 4153 case
OS12 1 1 1 6191 case
OS13 3 1 1 8563 case
OS14 1 1 1 6191 case
OS14 2 0 0 6191 cont
OS14 5 1 1 6191 cont
OS15 1 1 1 4153 cont
OS16 1 1 0 8563 case
OS17 4 1 1 6191 case
OS17 2 1 0 6191 case
OS17 3 1 0 6191 cont
OS20 3 1 1 4153 cont
;
proc sql;
create table want as
select job, count(job)/(select count(*) from oc) as pct format=percent.
from oc where a=1 /*a=0 is unexposed to agent a; 1= exposed to agent a*/
group by indscode;
quit;
proc print data=want;
run;
Obs job pct
1 3 13%
2 3 13%
3 1 13%
4 5 57%
5 1 57%
6 3 57%
7 1 57%
8 3 57%
9 1 57%
10 4 57%
11 2 57%
12 4 57%
13 3 57%
14 2 57%
15 3 57%
16 1 57%
17 4 9%
18 3 9%
19 3 9%
20 1 9%
A variable that you want in the result dataset must be part of the SELECT.
I think you want this
select
indscode,
sum(a) / count(*) format=percent7.2 as pct
from oc.
group by indscode
Since a is either 1 or 0, sum(a) is an exact count of the 1's.
sum(lung = "case") and sum(lung = "cont") gives you the respective counts. Keep in mind that the boolean result of a comparison is also either 1 or 0.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Get started using SAS Studio to write, run and debug your SAS programs.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.