BookmarkSubscribeRSS Feed
ak2011
Fluorite | Level 6

Hi,

Could someone help me with the code to solve this problem?

From data oc below, I would like to select the industry code(indscode) with the following conditions:

 

(1) In which at least 50% of the jobs are exposed to agent a(0=unexposed, 1=exposed).[I want all percentages to be displayed and from there I will select the >=50%]. 

(2) Couting number of subjects( lung cases(case) and controls(cont) for the selected indscode.

 

I used proc SQL, expecting the indscode variable to be displayed on the table results, but it didn't. 

Coding help is needed to have the indscode  count and % displayed on the table,please.

Please find my code and  results below:

Thanks in advance.

ak.

data oc;
input id$ 1-5 job 6-7 a 8-9 b 10-11 indscode 12-16 lung$ 17-21;
datalines;
OS1 3 1 0 6198 case
OS2 2 1 1 6191 cont
OS3 3 1 0 6191 case
OS4 3 0 0 6198 cont
OS5 3 1 0 6191 case
OS6 3 1 0 6191 cont
OS6 4 1 1 6198 case
OS7 4 1 1 6191 case
OS8 1 1 1 6191 cont
OS9 1 0 0 6198 cont
OS10 1 1 1 6191 cont
OS11 3 1 1 4153 case
OS12 1 1 1 6191 case
OS13 3 1 1 8563 case
OS14 1 1 1 6191 case
OS14 2 0 0 6191 cont
OS14 5 1 1 6191 cont
OS15 1 1 1 4153 cont
OS16 1 1 0 8563 case
OS17 4 1 1 6191 case
OS17 2 1 0 6191 case
OS17 3 1 0 6191 cont
OS20 3 1 1 4153 cont
;

proc sql;
create table want as
select job, count(job)/(select count(*) from oc) as pct format=percent.
from oc where a=1 /*a=0 is unexposed to agent a; 1= exposed to agent a*/
group by indscode;
quit;

proc print data=want;
run;

 

Obs job pct
1 3 13%
2 3 13%
3 1 13%
4 5 57%
5 1 57%
6 3 57%
7 1 57%
8 3 57%
9 1 57%
10 4 57%
11 2 57%
12 4 57%
13 3 57%
14 2 57%
15 3 57%
16 1 57%
17 4 9%
18 3 9%
19 3 9%
20 1 9%

 

 

4 REPLIES 4
Kurt_Bremser
Super User

A variable that you want in the result dataset must be part of the SELECT.

 

I think you want this

select
  indscode,
  sum(a) / count(*) format=percent7.2 as pct
from oc.
group by indscode
ak2011
Fluorite | Level 6
Hi,
Thanks for your code. However, there are 2 more condition to be met:
(1) where a=1(exposed) and I should also know the n's (ie. the counts) that generated that %s.
(2) Count the number of cases and conts.
I don't know if proc freq too can solve this problem.
ak.
Kurt_Bremser
Super User

Since a is either 1 or 0, sum(a) is an exact count of the 1's.

sum(lung = "case") and sum(lung = "cont") gives you the respective counts. Keep in mind that the boolean result of a comparison is also either 1 or 0.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1326 views
  • 0 likes
  • 2 in conversation