turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Comparing rows

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

11-02-2011 01:01 PM

I have a dataset that has multiple records (ID2) for each person (ID1).

Note: (1) each Id1 has multiple ID2, (2) also the number of ID2 varies within each ID1. For E.g ID1 1x has 6 ID2, ID1 ef3 has 4 ID2.

I want to compare the observations for variables v1-v3 for each id1. E.g. I want to check if value for v1 in row1 is same as value in row2. Again I want to compare only within the same person or ID1 All variables are categorical. Transposing the dataset is not an option for me due to the size of the dataset. Any help would be greatly appreciated.

Thanks in advance.

Header 1 | Header 2 | Header 3 | Header 4 | Header 5 |
---|---|---|---|---|

id1 | id2 | v1 | v2 | v3 |

1x | 23d | r1 | r2 | |

1x | ef | r1 | ||

1x | sdg5 | r2 | ||

1x | drt | r2 | r1 | |

1x | s5 | r1 | r2 | |

1x | gert | r2 | r3 | |

ef3 | sy45 | r1 | ||

ef3 | 6eje | r3 | ||

ef3 | 47st | r3 | r1 | |

ef3 | dy45 | r3 | r1 | |

d2 | srgw5 | r1 | r2 | |

d2 | sr | r2 | ||

d2 | srt | r3 | r2 | |

d2 | 5r5 | r3 | ||

d2 | 45j | |||

d2 | eyh | r4 | ||

d2 | 56t4 | r5 | ||

d2 | hr6 | r2 | ||

d2 | dh45 | r5 |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to rsva

11-02-2011 01:35 PM

sounds like a job for a DoW loop.

What info is wanted from these compares?

- all blank
- single non-blank value
- all non-blanks are the same
- differences present

or

3. # distinct values

hth

peter

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Peter_C

11-02-2011 02:05 PM

Hi Peter,

I have a variables that I did not add in my original post. For example, assume V1-3 as race1, race2, and race3. I have a overall variable called Race. So (1) if I have different values in race1-3 then I coded as Race=Multiple, (2) if I have only one value in race1 and race2-3 are missing then I code as Race=Single.

So for a person to be of multiple race he can have values in race1-3 in a single id2 or one value in first id2 and anther value in the next id2.

E.g Scenario 1:

id1 id2 v1 v2 v3 race

q d1 r1 r2 Multiple

q d2 r2

Scenario 2:

id1 id2 v1 v2 v3 race

q d1 r1

q d2 r2 Multiple

I already checked the values between v1-3 for each observation (scenario 1). Now I am trying to compare the value between rows for each person (scenario 2).

Hope this helps

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to rsva

11-02-2011 02:19 PM

Hi ... if the question is "are all non-missing values the same for variables V1-V3 within values of ID1",

I think this answers that question.

This will produce a "1" when they are the same and a "0" if different for variables MULT1-MULT3

(variable "o" maintains the ID1 order of the original data set).

**proc sql;**

**create table want (drop=o) as**

**select id1, monotonic() as o, (min(v1) eq max(v1)) as mult1, (min(v2) eq max(v2)) as mult2, (min(v3) eq max(v3)) as mult3**

**from have**

**group id1**

**order o**

**;**

**quit;**

In your posted data, no variable values meet the above criteria except for variable V3 where there there is only one non-missing value within each ID1.

**id1 mult1 mult2 mult3**

** 1x 0 0 1**

**ef3 0 0 1**

**d2 0 0 1**

If you want all the observations in the final data set with added variables MULT1-MULT3, change SELECT ID1 to SELECT * .

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to rsva

11-02-2011 01:42 PM

How about putting them into groups? Requires sorting so if your data is really large might not be reasonable.

data groups ;

set have;

by id1 v1-v3 ;

group + first.v3;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to rsva

11-03-2011 04:28 AM

You can use array to solve it.

Ksharp