Quartz | Level 8

## Running frequency

Hi all,

I have this dataset. I need help urgently on this.

 id characteristic id1 RDD id1 TSS id2 RDD id2 TSS

I need the final result table as:

 Characteristic N % RDD 2 100 TSS 2 100

how can I accomplish this

I tried by sorting the data-set by characteristic and it doesn't work.

proc freq data=have
by characteristic;
tables id/out=want
run;

Thanks.

10 REPLIES 10
Super User

## Re: Running frequency

Your percent value looks redundant and therefore useless to me.

Calculate the count like that:

``````data have;
input id \$ characteristic \$;
cards;
id1 RDD
id1 TSS
id2 RDD
id2 TSS
;
run;

proc sql;
create table want as
select
characteristic,
count(*) as N
from have
group by characteristic;
quit;``````

Note how I put your data in a data step with datalines for easy recreation by others. Do so yourself in the future, don't offload your work on somebody else.

Quartz | Level 8

## Re: Running frequency

Hi @Kurt_Bremser  I was looking to summarize the count but since there was a condition that could be repeated for more than an id I was concerned. Also in your code how can I ensure the counts of the condition(N) is among  all ids. I mean I don't see id mentioned in the code. or that is not needed.

proc sql;
create table want as select
characteristic,
count(*) as N
from have
group by characteristic;
quit;

Super User

## Re: Running frequency

``````proc print data=want noobs;
run;``````

to my previous code, the result

```characteristic    N

RDD          2
TSS          2
```

clearly matches what you posted as the expected result, minus the irrelevant percent values.

If you have factors that influences these percent values so that they are not 100 all the time, please show so with an expanded dataset.

Quartz | Level 8

## Re: Running frequency

Hi @KurtBremser I am back on this string. I do need help with the complete solution.  Right now, this code works for me that is edited from your first code.

proc sql;

create table want as

select characteristic, count(distinct id) as n

from have

group by characteristic;

quit;

However, I need the denominator to be the 'total number of distinct ids' to get the percent. How can I accomplish that as a nested select program in sql?

so basically I need :

proc sql;

create table want2 as

select count (distinct id) as n2

from have  /* original dataset*/

quit;

so this count of distinct ids are not grouped by characteristic and my percent needs to be

n/n2*100

Hope this makes sense.

I have my code as this but I know there is something missing in it

proc sql;

create table want as

select a.characteristic, a.n, b.n2

from (select characteristic, count(distinct id) as n from have group by ccs2)

(select count(distinct id) as n2 from have) b;

quit;

So I want the three columns in the final dataset characteristic, n, n2, percent

and just to expand the data:

 id characteristic id1 RDD id1 TSS id2 TSS id2 RDD id2 GGG id2 HHH

So the output dataset will be

 Characteristic N N2 % RDD 2 2 100 TSS 2 2 100 GGG 1 2 50 HHH 1 2 50

Super User

## Re: Running frequency

Take the data step from my post and expand it. I had to edit your copy/paste from the table viewer once to get that step running, I won't do it again.

Diamond | Level 26

## Re: Running frequency

Post test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

Also explain what is "not working".  I am going to assume you mean it is taking the percentage as sum from total obs.  Maybe something like:

```proc sql;
create table want as
select a.characteristic,
a.n,
(a.n/b.tot)*100 as perc
from   (select characteristic,count(distinct id) as n from have group by characteristic) a
full join (select characteristic,count(*) as tot from have group by characteristic) b
on     a.characteristic=b.characteristic;
quit;```
Quartz | Level 8

## Re: Running frequency

Hi @RW9 Its obvious by solution of proc freq was not helpful:

It gave an error the characteristic variable  is not sorted properly.

And, the denominator for the percent has to be the total number of distinct ids.

Super User

## Re: Running frequency

@dr2014 wrote:

Hi @RW9 Its obvious by solution of proc freq was not helpful:

It gave an error the characteristic variable  is not sorted properly.

Then sort the dataset.

And, the denominator for the percent has to be the total number of distinct ids.

Do you mean something like this:

``````data have;
input id \$ characteristic \$;
cards;
id1 RDD
id1 TSS
id2 RDD
id2 TSS
id2 RDD
;
run;

proc sql;
create table want as
select
characteristic,
count(distinct id) as N,
calculated N / count(*) as percent format=percent6.
from have
group by characteristic;
quit;

proc print data=want noobs;
run;``````
```characteristic    N    percent

RDD          2      67%
TSS          2     100%
```

Once again, meaningful and usable example data with expected result will help us greatly in helping you.

Diamond | Level 26

## Re: Running frequency

Then sort the data.  Also please refer to the guidance on posting new questions - post test data in the form of a datastep, show what "is not working" etc. It is not for me to type test data, then run code to get a log telling me "what is wrong with the code".

Simply use and alter the code I have provided.

Rhodochrosite | Level 12

## Re: Running frequency

one more to find counts

``````data have;
input id \$ characteristic \$;
datalines;
id1 RDD
id1 TSS
id2 RDD
id2 TSS
;

proc sort data = have out=have1;
by characteristic;
run;

data want;
do until(last.characteristic);
set have1(keep=characteristic);
by characteristic;
count=sum(count,1);;
end;
run;``````
Discussion stats
• 10 replies
• 1147 views
• 0 likes
• 4 in conversation