BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
amng
Calcite | Level 5

Hi, I have a dataset with about 7000 observations. Some are identified as twins, signaled by a 0/1 variable "TWIN", and a family variable ID "FAMILY" indicates whether they are part of the same family (e.g. twins would have the same Family ID). I would like to run a Kappa Statistic showing the concordance between their reports on a childhood variable "CHILD", but am unsure how to get the data set in the appropriate mode for that without manually looking and recognizing everyone with the same family variable as twins. Once I reorganize the data, I can run a Kappa Statistic without any issues.

 

Any help would be appreciated. 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

1. Filter only family that has twins as 1

2. Ensure that there are two records minimum for the twins identified (some have only 1???)

3. Take the lower ID as the first ID and maximum as second ID - will not work if you need to scale to triplets or other multiple children.

 

Will also not work if you have kids in a family with twins and the non-twins have the twin flag = 1. 

For the shown use cases it will work, if it doesn't work for you please post your log and the examples of cases where it doesn't work. 

 

proc sql;
create table want as
select familyID, min(childID) as child_twin_ID1, max(childID) as child_twin_ID2
from have
where twins = 1
group by familyID
having count(familyID) >=2;
quit;

@amng wrote:

Data attached as an example. On the lefthand side it an example of the data I have, and the righthand side is what I would like to have. Thanks so much!


 

View solution in original post

8 REPLIES 8
Reeza
Super User
You need to show your data set if you need help reformatting it.
amng
Calcite | Level 5

For example:

 

FAMILY ID TWIN CHILD

1                    1       4

2                    0       5

3                    1       3

3                    1       4

4                    0       2

4                    0       5

 

Reeza
Super User
And the expected output (exactly please) if this is the input.
amng
Calcite | Level 5

And the output dataset should just include twins, with reports side by side for each set of twins (individuals that indicate twin=1 and have the same family ID). For example:

 

 CHILD1  CHILD2  FAMILY ID

   5            6                1

   2            2                2

   3            4                3

   4            5                4

 

Reeza
Super User
I don't see how the data you provided maps to the data shown? Where does the Child1/2 come from, is that the Child record from the initial table?If so, where did the 6 come from?
amng
Calcite | Level 5

Sorry, this is not the actual dataset just an example of how it looks currently vs. how it needs to be organized to run my proposed analysis.

amng
Calcite | Level 5

Data attached as an example. On the lefthand side it an example of the data I have, and the righthand side is what I would like to have. Thanks so much!

Reeza
Super User

1. Filter only family that has twins as 1

2. Ensure that there are two records minimum for the twins identified (some have only 1???)

3. Take the lower ID as the first ID and maximum as second ID - will not work if you need to scale to triplets or other multiple children.

 

Will also not work if you have kids in a family with twins and the non-twins have the twin flag = 1. 

For the shown use cases it will work, if it doesn't work for you please post your log and the examples of cases where it doesn't work. 

 

proc sql;
create table want as
select familyID, min(childID) as child_twin_ID1, max(childID) as child_twin_ID2
from have
where twins = 1
group by familyID
having count(familyID) >=2;
quit;

@amng wrote:

Data attached as an example. On the lefthand side it an example of the data I have, and the righthand side is what I would like to have. Thanks so much!


 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2075 views
  • 0 likes
  • 2 in conversation