BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jedrzej
Obsidian | Level 7

Hi, i have a dataset;

 

IDVAR
1A
1B
2B
3A
3C
4 

 

and I want to calculate the percentege of distinct value in var a for all  distinct ID's:

for example: population is 4 (distinct ID), occurance of var "B" is 2 so the result is 50%

 

thank you

 

example output:

 

 so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:

 

VAR Percentage of occurance var/ entire poplulation (distinct ID)
A50%
B50%
C25%
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input ID	VAR $;
cards;
1	A
1	B
2	B
3	A
3	C
4	 .
;
proc sql;
create table want as
select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2
 from have 
  where var is not missing
  group by var;
quit;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

What do you mean by 'Population is 4' ?

 

What does your desired result look like given this data?

ballardw
Super User

@Jedrzej wrote:
distinct ID

Since your "example" data has exactly zero occurrences of B for Id=4 I suggest that you provide an actual example of data where the B would actually be "50%". Better would be to provide expected output for ALL of the "example" data so we know what you expect us to do with missing values of Var.

andreas_lds
Jade | Level 19

@Jedrzej wrote:

Hi, i have a dataset;

 

ID VAR
1 A
1 B
2 B
3 A
3  
4 A

 

and I want to calculate the percentege of distinct value in var a for all ID:

for example: population is 4, occurance of var "B" is 2 so the result is 50%

 

thank you


No, this does not make any sense at all.

You have six obs in the table, so the two B are 33% or 40% if the obs with missing VAR should be ignored.

So, if you want help, please show what you expect as result and explain the logic you want applied.

Jedrzej
Obsidian | Level 7
No, I have 6 obs, but distinct ID's are 4 so the entire population is 4, one ID= one item. So there are two B, one for ID1 second for ID2. The question is: how many B are there for the entire population?


ballardw
Super User

As I suggested previously show the entrie expected output for your example data.

like what is the result for A?

 

As phrased there are exactly 3 "distinct" values of Var: A, B and blank.

 

And maybe a larger completely worked example as your logic isn't really clear.

 

 

Jedrzej
Obsidian | Level 7

ok, so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:

 

VAR Percentage of occurance var/ entire poplulation (distinct ID)
A50%
B50%
C25%
Ksharp
Super User
data have;
input ID	VAR $;
cards;
1	A
1	B
2	B
3	A
3	C
4	 .
;
proc sql;
create table want as
select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2
 from have 
  where var is not missing
  group by var;
quit;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1672 views
  • 1 like
  • 5 in conversation