BookmarkSubscribeRSS Feed
rsva
Fluorite | Level 6

I have a dataset that has multiple records (ID2) for each person (ID1).

Note: (1) each Id1 has multiple ID2, (2) also the number of ID2 varies within each ID1. For E.g ID1 1x has 6 ID2, ID1 ef3 has 4 ID2.

I want to compare the observations for variables v1-v3 for each id1. E.g. I want to check if value for v1 in row1 is same as value in row2.  Again I want to compare only within the same person or ID1 All variables are categorical. Transposing the dataset is not an option for me due to the size of the dataset. Any help would be greatly appreciated.

Thanks in advance.

Header 1Header 2Header 3Header 4Header 5
id1id2v1v2v3

1x

23dr1r2
1xefr1
1xsdg5r2
1xdrtr2r1
1xs5r1r2
1xgertr2r3
ef3sy45r1
ef36ejer3
ef347str3r1
ef3dy45r3r1
d2srgw5r1r2
d2srr2
d2srtr3r2
d25r5r3
d245j
d2eyhr4
d256t4r5
d2hr6r2
d2dh45r5
5 REPLIES 5
Peter_C
Rhodochrosite | Level 12

sounds like a job for a DoW loop.

What info is wanted from these compares?

  1. all blank
  2. single non-blank value
  3. all non-blanks are the same
  4. differences present

or

  3.  # distinct values

hth

peter

rsva
Fluorite | Level 6

Hi Peter,

I have a variables that I did not add in my original post. For example, assume V1-3 as race1, race2, and race3. I have a overall variable called Race. So (1) if I have different values in race1-3 then I coded as Race=Multiple, (2) if I have only one value in race1 and race2-3 are missing then I code as Race=Single.

So for a person to be of multiple race he can have values in race1-3 in a single id2 or one value in first id2 and anther value in the next id2.

E.g Scenario 1:

id1       id2         v1          v2          v3    race

q           d1         r1          r2                 Multiple

q           d2         r2

Scenario 2:

id1       id2        v1           v2           v3     race

q          d1        r1

q          d2        r2                                  Multiple

I already checked the values between v1-3 for each observation (scenario 1). Now I am trying to compare the value between rows for each person (scenario 2).

Hope this helps

MikeZdeb
Rhodochrosite | Level 12

Hi ... if the question is "are all non-missing values the same for variables V1-V3 within values of ID1",

I think this answers that question.

This will produce a "1" when they are the same and a "0" if different for variables MULT1-MULT3

(variable "o" maintains the ID1 order of the original data set).

proc sql;

create table want (drop=o) as

select id1, monotonic() as o, (min(v1) eq max(v1)) as mult1, (min(v2) eq max(v2)) as mult2, (min(v3) eq max(v3)) as mult3

from have

group id1

order o

;

quit;

In your posted data, no variable values meet the above criteria except for variable V3 where there there  is only one non-missing value within each ID1.

id1    mult1    mult2    mult3

1x       0        0        1

ef3      0        0        1

d2       0        0        1

If you want all the observations in the final data set with added variables MULT1-MULT3, change SELECT ID1 to SELECT * .

Tom
Super User Tom
Super User

How about putting them into groups?  Requires sorting so if your data is really large might not be reasonable.

data groups ;

   set have;

   by id1 v1-v3 ;

   group + first.v3;

run;

Ksharp
Super User

You can use array to solve it.

Ksharp

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1332 views
  • 0 likes
  • 5 in conversation