Hi, i have a dataset;
| ID | VAR |
| 1 | A |
| 1 | B |
| 2 | B |
| 3 | A |
| 3 | C |
| 4 |
and I want to calculate the percentege of distinct value in var a for all distinct ID's:
for example: population is 4 (distinct ID), occurance of var "B" is 2 so the result is 50%
thank you
example output:
so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:
| VAR | Percentage of occurance var/ entire poplulation (distinct ID) |
| A | 50% |
| B | 50% |
| C | 25% |
data have; input ID VAR $; cards; 1 A 1 B 2 B 3 A 3 C 4 . ; proc sql; create table want as select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2 from have where var is not missing group by var; quit;
What do you mean by 'Population is 4' ?
What does your desired result look like given this data?
@Jedrzej wrote:
distinct ID
Since your "example" data has exactly zero occurrences of B for Id=4 I suggest that you provide an actual example of data where the B would actually be "50%". Better would be to provide expected output for ALL of the "example" data so we know what you expect us to do with missing values of Var.
@Jedrzej wrote:
Hi, i have a dataset;
ID VAR 1 A 1 B 2 B 3 A 3 4 A
and I want to calculate the percentege of distinct value in var a for all ID:
for example: population is 4, occurance of var "B" is 2 so the result is 50%
thank you
No, this does not make any sense at all.
You have six obs in the table, so the two B are 33% or 40% if the obs with missing VAR should be ignored.
So, if you want help, please show what you expect as result and explain the logic you want applied.
As I suggested previously show the entrie expected output for your example data.
like what is the result for A?
As phrased there are exactly 3 "distinct" values of Var: A, B and blank.
And maybe a larger completely worked example as your logic isn't really clear.
ok, so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:
| VAR | Percentage of occurance var/ entire poplulation (distinct ID) |
| A | 50% |
| B | 50% |
| C | 25% |
data have; input ID VAR $; cards; 1 A 1 B 2 B 3 A 3 C 4 . ; proc sql; create table want as select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2 from have where var is not missing group by var; quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.