Hi, i have a dataset;
ID | VAR |
1 | A |
1 | B |
2 | B |
3 | A |
3 | C |
4 |
and I want to calculate the percentege of distinct value in var a for all distinct ID's:
for example: population is 4 (distinct ID), occurance of var "B" is 2 so the result is 50%
thank you
example output:
so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:
VAR | Percentage of occurance var/ entire poplulation (distinct ID) |
A | 50% |
B | 50% |
C | 25% |
data have; input ID VAR $; cards; 1 A 1 B 2 B 3 A 3 C 4 . ; proc sql; create table want as select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2 from have where var is not missing group by var; quit;
What do you mean by 'Population is 4' ?
What does your desired result look like given this data?
@Jedrzej wrote:
distinct ID
Since your "example" data has exactly zero occurrences of B for Id=4 I suggest that you provide an actual example of data where the B would actually be "50%". Better would be to provide expected output for ALL of the "example" data so we know what you expect us to do with missing values of Var.
@Jedrzej wrote:
Hi, i have a dataset;
ID VAR 1 A 1 B 2 B 3 A 3 4 A
and I want to calculate the percentege of distinct value in var a for all ID:
for example: population is 4, occurance of var "B" is 2 so the result is 50%
thank you
No, this does not make any sense at all.
You have six obs in the table, so the two B are 33% or 40% if the obs with missing VAR should be ignored.
So, if you want help, please show what you expect as result and explain the logic you want applied.
As I suggested previously show the entrie expected output for your example data.
like what is the result for A?
As phrased there are exactly 3 "distinct" values of Var: A, B and blank.
And maybe a larger completely worked example as your logic isn't really clear.
ok, so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:
VAR | Percentage of occurance var/ entire poplulation (distinct ID) |
A | 50% |
B | 50% |
C | 25% |
data have; input ID VAR $; cards; 1 A 1 B 2 B 3 A 3 C 4 . ; proc sql; create table want as select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2 from have where var is not missing group by var; quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.