BookmarkSubscribeRSS Feed
ak2011
Fluorite | Level 6

Hi,

Could someone help me with the code to solve this problem?

From data oc below, I would like to select the industry code(indscode) with the following conditions:

 

(1) In which at least 50% of the jobs are exposed to agent a(0=unexposed, 1=exposed).[I want all percentages to be displayed and from there I will select the >=50%]. 

(2) Couting number of subjects( lung cases(case) and controls(cont) for the selected indscode.

 

I used proc SQL, expecting the indscode variable to be displayed on the table results, but it didn't. 

Coding help is needed to have the indscode  count and % displayed on the table,please.

Please find my code and  results below:

Thanks in advance.

ak.

data oc;
input id$ 1-5 job 6-7 a 8-9 b 10-11 indscode 12-16 lung$ 17-21;
datalines;
OS1 3 1 0 6198 case
OS2 2 1 1 6191 cont
OS3 3 1 0 6191 case
OS4 3 0 0 6198 cont
OS5 3 1 0 6191 case
OS6 3 1 0 6191 cont
OS6 4 1 1 6198 case
OS7 4 1 1 6191 case
OS8 1 1 1 6191 cont
OS9 1 0 0 6198 cont
OS10 1 1 1 6191 cont
OS11 3 1 1 4153 case
OS12 1 1 1 6191 case
OS13 3 1 1 8563 case
OS14 1 1 1 6191 case
OS14 2 0 0 6191 cont
OS14 5 1 1 6191 cont
OS15 1 1 1 4153 cont
OS16 1 1 0 8563 case
OS17 4 1 1 6191 case
OS17 2 1 0 6191 case
OS17 3 1 0 6191 cont
OS20 3 1 1 4153 cont
;

proc sql;
create table want as
select job, count(job)/(select count(*) from oc) as pct format=percent.
from oc where a=1 /*a=0 is unexposed to agent a; 1= exposed to agent a*/
group by indscode;
quit;

proc print data=want;
run;

 

Obs job pct
1 3 13%
2 3 13%
3 1 13%
4 5 57%
5 1 57%
6 3 57%
7 1 57%
8 3 57%
9 1 57%
10 4 57%
11 2 57%
12 4 57%
13 3 57%
14 2 57%
15 3 57%
16 1 57%
17 4 9%
18 3 9%
19 3 9%
20 1 9%

 

 

4 REPLIES 4
Kurt_Bremser
Super User

A variable that you want in the result dataset must be part of the SELECT.

 

I think you want this

select
  indscode,
  sum(a) / count(*) format=percent7.2 as pct
from oc.
group by indscode
ak2011
Fluorite | Level 6
Hi,
Thanks for your code. However, there are 2 more condition to be met:
(1) where a=1(exposed) and I should also know the n's (ie. the counts) that generated that %s.
(2) Count the number of cases and conts.
I don't know if proc freq too can solve this problem.
ak.
Kurt_Bremser
Super User

Since a is either 1 or 0, sum(a) is an exact count of the 1's.

sum(lung = "case") and sum(lung = "cont") gives you the respective counts. Keep in mind that the boolean result of a comparison is also either 1 or 0.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 721 views
  • 0 likes
  • 2 in conversation