BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jedrzej
Obsidian | Level 7

Hi, i have a dataset;

 

IDVAR
1A
1B
2B
3A
3C
4 

 

and I want to calculate the percentege of distinct value in var a for all  distinct ID's:

for example: population is 4 (distinct ID), occurance of var "B" is 2 so the result is 50%

 

thank you

 

example output:

 

 so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:

 

VAR Percentage of occurance var/ entire poplulation (distinct ID)
A50%
B50%
C25%
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input ID	VAR $;
cards;
1	A
1	B
2	B
3	A
3	C
4	 .
;
proc sql;
create table want as
select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2
 from have 
  where var is not missing
  group by var;
quit;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

What do you mean by 'Population is 4' ?

 

What does your desired result look like given this data?

ballardw
Super User

@Jedrzej wrote:
distinct ID

Since your "example" data has exactly zero occurrences of B for Id=4 I suggest that you provide an actual example of data where the B would actually be "50%". Better would be to provide expected output for ALL of the "example" data so we know what you expect us to do with missing values of Var.

andreas_lds
Jade | Level 19

@Jedrzej wrote:

Hi, i have a dataset;

 

ID VAR
1 A
1 B
2 B
3 A
3  
4 A

 

and I want to calculate the percentege of distinct value in var a for all ID:

for example: population is 4, occurance of var "B" is 2 so the result is 50%

 

thank you


No, this does not make any sense at all.

You have six obs in the table, so the two B are 33% or 40% if the obs with missing VAR should be ignored.

So, if you want help, please show what you expect as result and explain the logic you want applied.

Jedrzej
Obsidian | Level 7
No, I have 6 obs, but distinct ID's are 4 so the entire population is 4, one ID= one item. So there are two B, one for ID1 second for ID2. The question is: how many B are there for the entire population?


ballardw
Super User

As I suggested previously show the entrie expected output for your example data.

like what is the result for A?

 

As phrased there are exactly 3 "distinct" values of Var: A, B and blank.

 

And maybe a larger completely worked example as your logic isn't really clear.

 

 

Jedrzej
Obsidian | Level 7

ok, so as I wrote earlier the entire population=distinct ID, no matter if there are blanks in var. The output is to show percentage of occurance of var in the entire population, like this:

 

VAR Percentage of occurance var/ entire poplulation (distinct ID)
A50%
B50%
C25%
Ksharp
Super User
data have;
input ID	VAR $;
cards;
1	A
1	B
2	B
3	A
3	C
4	 .
;
proc sql;
create table want as
select var,count(distinct id)/(select count(distinct id) from have) as per format=percent8.2
 from have 
  where var is not missing
  group by var;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 657 views
  • 1 like
  • 5 in conversation