BookmarkSubscribeRSS Feed
ak2011
Fluorite | Level 6

Hi,

Could someone help me with the code to solve this problem?

From data oc below, I would like to select the industry code(indscode) with the following conditions:

 

(1) In which at least 50% of the jobs are exposed to agent a(0=unexposed, 1=exposed).[I want all percentages to be displayed and from there I will select the >=50%]. 

(2) Couting number of subjects( lung cases(case) and controls(cont) for the selected indscode.

 

I used proc SQL, expecting the indscode variable to be displayed on the table results, but it didn't. 

Coding help is needed to have the indscode  count and % displayed on the table,please.

Please find my code and  results below:

Thanks in advance.

ak.

data oc;
input id$ 1-5 job 6-7 a 8-9 b 10-11 indscode 12-16 lung$ 17-21;
datalines;
OS1 3 1 0 6198 case
OS2 2 1 1 6191 cont
OS3 3 1 0 6191 case
OS4 3 0 0 6198 cont
OS5 3 1 0 6191 case
OS6 3 1 0 6191 cont
OS6 4 1 1 6198 case
OS7 4 1 1 6191 case
OS8 1 1 1 6191 cont
OS9 1 0 0 6198 cont
OS10 1 1 1 6191 cont
OS11 3 1 1 4153 case
OS12 1 1 1 6191 case
OS13 3 1 1 8563 case
OS14 1 1 1 6191 case
OS14 2 0 0 6191 cont
OS14 5 1 1 6191 cont
OS15 1 1 1 4153 cont
OS16 1 1 0 8563 case
OS17 4 1 1 6191 case
OS17 2 1 0 6191 case
OS17 3 1 0 6191 cont
OS20 3 1 1 4153 cont
;

proc sql;
create table want as
select job, count(job)/(select count(*) from oc) as pct format=percent.
from oc where a=1 /*a=0 is unexposed to agent a; 1= exposed to agent a*/
group by indscode;
quit;

proc print data=want;
run;

 

Obs job pct
1 3 13%
2 3 13%
3 1 13%
4 5 57%
5 1 57%
6 3 57%
7 1 57%
8 3 57%
9 1 57%
10 4 57%
11 2 57%
12 4 57%
13 3 57%
14 2 57%
15 3 57%
16 1 57%
17 4 9%
18 3 9%
19 3 9%
20 1 9%

 

 

4 REPLIES 4
Kurt_Bremser
Super User

A variable that you want in the result dataset must be part of the SELECT.

 

I think you want this

select
  indscode,
  sum(a) / count(*) format=percent7.2 as pct
from oc.
group by indscode
ak2011
Fluorite | Level 6
Hi,
Thanks for your code. However, there are 2 more condition to be met:
(1) where a=1(exposed) and I should also know the n's (ie. the counts) that generated that %s.
(2) Count the number of cases and conts.
I don't know if proc freq too can solve this problem.
ak.
Kurt_Bremser
Super User

Since a is either 1 or 0, sum(a) is an exact count of the 1's.

sum(lung = "case") and sum(lung = "cont") gives you the respective counts. Keep in mind that the boolean result of a comparison is also either 1 or 0.

Catch up on SAS Innovate 2026

Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.

Explore Now →
Develop Code with SAS Studio

Get started using SAS Studio to write, run and debug your SAS programs.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1844 views
  • 0 likes
  • 2 in conversation