Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Studio
- /
- Re: SAS code for counting cases/control for unique id_idchem combinat...

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 01-01-2020 10:49 AM
(791 views)

Hi,

Please, could someone help me with the SAS code to verify the count(frequency) of ca case ca cont pop cont for each unique/distinct id and idchem combination. I have 29 obs, 3 pollutants idchem 990005=cla_exp, 990021 = bio_exp and 210701 = amo_exp. I need to count the number of the lung variables associated with each unique id_idchem combination. For obs 2 (Osaa13-idchem 210701) has 1 lung ca case. Same with obs 4 so I expect SAS to count obs 2 & 4 as frequency of 1 since they have the same id_idchem combination.

Thus, I would like the overall freq of unique id_idchem by lung for 210701 to be 2/8 for ca case (for example).

My SAS output Tables 1 & 2 are attached. The SAS code and log are found below.

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

72

73 data mg1b; set mgp(obs=29);

NOTE: There were 29 observations read from the data set WORK.MGP.

NOTE: The data set WORK.MG1B has 29 observations and 4 variables.

NOTE: DATA statement used (Total process time):

real time 0.01 seconds

cpu time 0.01 seconds

74 proc print data=mg1b;

75 Title " Table 1: Exposure to 3 pollutants";

76

NOTE: There were 29 observations read from the data set WORK.MG1B.

NOTE: PROCEDURE PRINT used (Total process time):

real time 0.27 seconds

cpu time 0.27 seconds

77 proc freq data=mg1b;

78 tables idchem*lung;

79 Title" Table 2: Frequency: id_idchem vs case/control for 3 pollutants "; run;

NOTE: There were 29 observations read from the data set WORK.MG1B.

NOTE: PROCEDURE FREQ used (Total process time):

real time 0.21 seconds

cpu time 0.20 seconds

80

81 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

93

Thanks very much.

ak.

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

proc sort with NODUPKEY is useful, but depending on the size of the data set, can be expensive. In this case, I'd suggest two proc freqs. The first produces a data set FREQS of frequencies of IDCHEM*LUNG (i.e. one obs for each combination with 2 additional variables COUNT and PERCENT). Then submit the new dataset to a proc freq with the same tables specification. This will yield counts of non-empty combinations:

```
proc freq data =mgp noprint;
tables idchem * lung /out=freqs;
run;
proc freq data=freqs;
tables idchem * lung / norow nocol;
run;
```

The advantage here is that proc freq doesn't need a sorted data set.

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

3 REPLIES 3

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi @ak2011

Happy new year!

I suggest that you run a PROC SORT with the NODUPKEY option before running the PROC FREQ:

```
proc sort data = env out = env_distinct nodupkey;
by lung id_idchem;
run;
proc freq data = env_distinct;
table lung * idchem / nocol norow nocol nopercent;
run;
```

Does that give you the expected results?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

proc sort with NODUPKEY is useful, but depending on the size of the data set, can be expensive. In this case, I'd suggest two proc freqs. The first produces a data set FREQS of frequencies of IDCHEM*LUNG (i.e. one obs for each combination with 2 additional variables COUNT and PERCENT). Then submit the new dataset to a proc freq with the same tables specification. This will yield counts of non-empty combinations:

```
proc freq data =mgp noprint;
tables idchem * lung /out=freqs;
run;
proc freq data=freqs;
tables idchem * lung / norow nocol;
run;
```

The advantage here is that proc freq doesn't need a sorted data set.

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for

Allow PROC SORT to output multiple datasets

--------------------------

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks very much. It works!

ak.

ak.

Are you ready for the spotlight? We're accepting content ideas for **SAS Innovate 2025** to be held May 6-9 in Orlando, FL. The call is **open **until September 25. Read more here about **why** you should contribute and **what is in it** for you!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.