I need help checking if each observation contains a record from a list

Reply
Frequent Contributor
Posts: 82

I need help checking if each observation contains a record from a list

    For example, I have one dataset that has player_id 1=555, player_id_2=600, etc. through player_id_10

I have another dataset that contains a list of 1,000 player_id's. I want to check, for each observation in data set 1, if each player_id is contained in the list in dataset 2.

so if 555 is on the list, I want it to tell me so (maybe create a new variable that says player_id_1_check=1). If 600 isnt, maybe the new variable says 0.

Is that clear? Thanks for any help.

Regular Contributor
Posts: 151

Re: I need help checking if each observation contains a record from a list

Given the structure of your data, my instinct is to create a format from your list of player_id's and then read across your main dataset to see if the formatted value exists.  Here's an example using 3 player_id's..

data have1;

input player_id1-player_id3;

cards;

555 600 740

;

run;

data have2;

input player_id;

cards;

555

650

700

740

;

run;

data id_fmt;

set have2 (rename=(player_id=start));

retain fmtname 'id_fmt' type 'N';

label='check';

run;

proc format cntlin=id_fmt;

run;

data want;

set have1;

array pl{*} player_id1 - player_id3;

array plchk{*} player_check_id1 - player_check_id3;

  do i=1 to dim(pl);

  plchk{i}=put(pl{i},id_fmt.)='check';

  end;

drop i;

run;

Respected Advisor
Posts: 3,156

Re: I need help checking if each observation contains a record from a list

There are so many ways to do the lookup, besides Keith's format look up, you could also do the followings, of course, 'merge' will require presorting.

data h1;

  do id=1 to 10;

  output;

  end;

  run;

  data h2;

    do id=1 by 2 to 20;

      output;

  end;

  run;

  /*SQL*/

  proc sql;

    create table want as

         select id, case when  id in (select id from h2) then 1 else 0 end as flag from h1;

      quit;

      /*Merge*/

      data want;

        merge h1(in=h1) h2(in=h2);

        by id;

      if h1;

flag=h1*h2;

        run;

        /*hash*/

        data want;

          if _n_=1 then do;

if 0 then set h2;

declare hash h2(dataset:'h2');

h2.definekey('id');

h2.definedone();

            end;

            set h1;

            flag=not(h2.find());

            run;

Haikuo

Regular Contributor
Posts: 151

Re: I need help checking if each observation contains a record from a list

These solutions assume the data has been transposed, although I think a table structure that has Player_Id and Player_Number as the variables is preferable (in which case your solutions would be good).

Respected Advisor
Posts: 3,156

Re: I need help checking if each observation contains a record from a list

You are right. If using non-transposed data as you have presented, then that will still leave Hash() viable:

data have1;

input player_id1-player_id3;

cards;

555 600 740

;

data have2;

input player_id;

cards;

555

650

700

740

;

        /*hash*/

        data want (drop=player_id);

          if _n_=1 then do;

if 0 then set have2;

declare hash h2(dataset:'have2');

h2.definekey('player_id');

h2.definedone();

            end;

            set have1;

            array h1 player_id1-player_id3;

            array flag player_check_id1 - player_check_id3;

            do over h1;

flag=not(h2.check(key: h1));

end;

            run;

Haikuo

PROC Star
Posts: 7,474

Re: I need help checking if each observation contains a record from a list

And, another way to do the lookup would be to use the in function.  e.g.:

data have1;

input player_id1-player_id3;

cards;

555 600 740

;

run;

data have2;

input player_id;

cards;

555

650

700

740

;

run;

data want;

  if _n_ eq 1 then set have1;

  array players(*) player_id1-player_id3;

  retain players:;

  do until (eof);

    set have2 end=eof;

    if player_id in players then output;

  end;

run;

Frequent Contributor
Posts: 82

Re: I need help checking if each observation contains a record from a list

I am having trouble seeing what this does

if _n_ eq 1 then set have1;


obviously I know the set, but what use is the if _n_ eq1 then

PROC Star
Posts: 7,474

Re: I need help checking if each observation contains a record from a list

In this case it didn't really serve any purpose other than my own programming habits.  I like to be explicitly reminded that the first set statement is only going to be called once.

Frequent Contributor
Posts: 82

Re: I need help checking if each observation contains a record from a list

Do you think there is a way to do this in reverse, IE I need to know when each player appears

player

1

2

3

game player1id player2id player3id-player10id

1  1  5     7

so whereever player 1 appears, I ge tthe game #

player game1-gameX

1 1 3 4 10  (so he appears in games 1,3, 4, and 10)

does that make sense? I already have a data set checking which players are in each game, I want to check now what games each player is in. Since there are 1,000+ games, and 10 players in each game, will the array type function still work, with the format as Keith suggested in the original problem?

PROC Star
Posts: 7,474

Re: I need help checking if each observation contains a record from a list

Depend on what you want.  If you want a field created that contains the list, its length will depend upon the maximum number of games a particular player could be in.  The following assumes that the maximum could be 10 games, but it could be set to handle all 1,000.

data have;

  input game player1-player10;

  cards;

1 1 5 6 8 7 10 12 14 16 18

2 1 2 3 4 5 6 7 8 9 10

3 1 2 5 7 9 11 13 15 17 19

;

data need (keep=game player);

  set have;

  array players(*) player1-player10;

  do i=1 to 10;

    player=players(i);

    output;

  end;

run;

proc sort data=need;

  by player game;

run;

data want (keep=player games);

  set need;

  length games $70;

  retain games;

  by player;

  if first.player then games=game;

  else games=catx(",",games,game);

  if last.player then output;

run;

Respected Advisor
Posts: 3,156

Re: I need help checking if each observation contains a record from a list

Just in case you already have two tables, one keyed by game, with player1-play10; another table only has players. So I stole Art's data as well as some of his ideas:

data h1;

  input game player1-player10;

  cards;

1 1 5 6 8 7 10 12 14 16 18

2 1 2 3 4 5 6 7 8 9 10

3 1 2 5 7 9 11 13 15 17 19

;

data h2;

do player=1 to 19; output;end;

run;

data want;

  set h2;

  length games $100;

    do i=1 to nobs;

    set h1 point=i nobs=nobs;

     array players player1-player10;

      if player in players then games=catx(',', games,game);

    end;

keep player games;

run;

Haikuo

Frequent Contributor
Posts: 82

Re: I need help checking if each observation contains a record from a list

Alright, this worked, but I realized one small thing I need to add. In addition to the list of players there is a list of positions next to each.

So, from above:

data h1;

  input game player1-player10;

  cards;

1 1 5 6 8 7 10 12 14 16 18

2 1 2 3 4 5 6 7 8 9 10

3 1 2 5 7 9 11 13 15 17 19

;

the input would actually look like:

data h1;

  input game player1-player10;

input game player1 pos1 player2 pos2...and so on

  cards;

1 1 a 5 b 6 c 8 d 7 e 10 f 12 g 14 h 16 i 18 j

2 1 a 2 b 3 c 4 d 5 e 6 f 7 g 8 h 9 i 10 k

3 1 a 2 b 5 c 7 d 9 e 11 f 13 g 15 h 17 i 19 k

;

I am trying to simply add another array and put said array in the catx function to no success.

Respected Advisor
Posts: 3,156

Re: I need help checking if each observation contains a record from a list

More clarification is needed. What does the final outcome look like? Some samples please.

Haikuo

Frequent Contributor
Posts: 82

Re: I need help checking if each observation contains a record from a list

Similar to above,

player game1-gameX

1 1 3 4 10  (so he appears in games 1,3, 4, and 10)

I would need it to be somewhat like

player game1 pos1 game2 pos2.....

1 1 a 3 c 4 d 10 g.... and so forth.

it wouldnt matter if the variables are combined (for example, instead of game1 and pos1, you simply have game1 that is 1a and then game 2 is 3c and so forth. I basically need the same thing done, but the positions have to be connected to the games it found before.)

PROC Star
Posts: 7,474

Re: I need help checking if each observation contains a record from a list

data h1;

input game

      player1 pos1 $

      player2 pos2 $

      player3 pos3 $

      player4 pos4 $

      player5 pos5 $

      player6 pos6 $

      player7 pos7 $

      player8 pos8 $

      player9 pos9 $

      player10 pos10 $;

  cards;

1 1 a 5 b 6 c 8 d 7 e 10 f 12 g 14 h 16 i 18 j

2 1 a 2 b 3 c 4 d 5 e 6 f 7 g 8 h 9 i 10 k

3 1 a 2 b 5 c 7 d 9 e 11 f 13 g 15 h 17 i 19 k

;

data need (keep=game player pos);

  set h1;

  array players(*) player1-player10;

  array position(*) $ pos1-pos10;

  do i=1 to 10;

    player=players(i);

    pos=position(i);

    output;

  end;

run;

proc sort data=need;

  by player game;

run;

data want (keep=player games);

  set need;

  length games $70;

  retain games;

  by player;

  game_and_pos=catx(" ",game,pos);

  if first.player then games=game_and_pos;

  else games=catx(",",games,game_and_pos);

  if last.player then output;

run;

Ask a Question
Discussion stats
  • 22 replies
  • 600 views
  • 6 likes
  • 4 in conversation