BookmarkSubscribeRSS Feed
Xamius32
Calcite | Level 5

    For example, I have one dataset that has player_id 1=555, player_id_2=600, etc. through player_id_10

I have another dataset that contains a list of 1,000 player_id's. I want to check, for each observation in data set 1, if each player_id is contained in the list in dataset 2.

so if 555 is on the list, I want it to tell me so (maybe create a new variable that says player_id_1_check=1). If 600 isnt, maybe the new variable says 0.

Is that clear? Thanks for any help.

22 REPLIES 22
Keith
Obsidian | Level 7

Given the structure of your data, my instinct is to create a format from your list of player_id's and then read across your main dataset to see if the formatted value exists.  Here's an example using 3 player_id's..

data have1;

input player_id1-player_id3;

cards;

555 600 740

;

run;

data have2;

input player_id;

cards;

555

650

700

740

;

run;

data id_fmt;

set have2 (rename=(player_id=start));

retain fmtname 'id_fmt' type 'N';

label='check';

run;

proc format cntlin=id_fmt;

run;

data want;

set have1;

array pl{*} player_id1 - player_id3;

array plchk{*} player_check_id1 - player_check_id3;

  do i=1 to dim(pl);

  plchk{i}=put(pl{i},id_fmt.)='check';

  end;

drop i;

run;

Haikuo
Onyx | Level 15

There are so many ways to do the lookup, besides Keith's format look up, you could also do the followings, of course, 'merge' will require presorting.

data h1;

  do id=1 to 10;

  output;

  end;

  run;

  data h2;

    do id=1 by 2 to 20;

      output;

  end;

  run;

  /*SQL*/

  proc sql;

    create table want as

         select id, case when  id in (select id from h2) then 1 else 0 end as flag from h1;

      quit;

      /*Merge*/

      data want;

        merge h1(in=h1) h2(in=h2);

        by id;

      if h1;

flag=h1*h2;

        run;

        /*hash*/

        data want;

          if _n_=1 then do;

if 0 then set h2;

declare hash h2(dataset:'h2');

h2.definekey('id');

h2.definedone();

            end;

            set h1;

            flag=not(h2.find());

            run;

Haikuo

Keith
Obsidian | Level 7

These solutions assume the data has been transposed, although I think a table structure that has Player_Id and Player_Number as the variables is preferable (in which case your solutions would be good).

Haikuo
Onyx | Level 15

You are right. If using non-transposed data as you have presented, then that will still leave Hash() viable:

data have1;

input player_id1-player_id3;

cards;

555 600 740

;

data have2;

input player_id;

cards;

555

650

700

740

;

        /*hash*/

        data want (drop=player_id);

          if _n_=1 then do;

if 0 then set have2;

declare hash h2(dataset:'have2');

h2.definekey('player_id');

h2.definedone();

            end;

            set have1;

            array h1 player_id1-player_id3;

            array flag player_check_id1 - player_check_id3;

            do over h1;

flag=not(h2.check(key: h1));

end;

            run;

Haikuo

art297
Opal | Level 21

And, another way to do the lookup would be to use the in function.  e.g.:

data have1;

input player_id1-player_id3;

cards;

555 600 740

;

run;

data have2;

input player_id;

cards;

555

650

700

740

;

run;

data want;

  if _n_ eq 1 then set have1;

  array players(*) player_id1-player_id3;

  retain players:;

  do until (eof);

    set have2 end=eof;

    if player_id in players then output;

  end;

run;

Xamius32
Calcite | Level 5

I am having trouble seeing what this does

if _n_ eq 1 then set have1;


obviously I know the set, but what use is the if _n_ eq1 then

art297
Opal | Level 21

In this case it didn't really serve any purpose other than my own programming habits.  I like to be explicitly reminded that the first set statement is only going to be called once.

Xamius32
Calcite | Level 5

Do you think there is a way to do this in reverse, IE I need to know when each player appears

player

1

2

3

game player1id player2id player3id-player10id

1  1  5     7

so whereever player 1 appears, I ge tthe game #

player game1-gameX

1 1 3 4 10  (so he appears in games 1,3, 4, and 10)

does that make sense? I already have a data set checking which players are in each game, I want to check now what games each player is in. Since there are 1,000+ games, and 10 players in each game, will the array type function still work, with the format as Keith suggested in the original problem?

art297
Opal | Level 21

Depend on what you want.  If you want a field created that contains the list, its length will depend upon the maximum number of games a particular player could be in.  The following assumes that the maximum could be 10 games, but it could be set to handle all 1,000.

data have;

  input game player1-player10;

  cards;

1 1 5 6 8 7 10 12 14 16 18

2 1 2 3 4 5 6 7 8 9 10

3 1 2 5 7 9 11 13 15 17 19

;

data need (keep=game player);

  set have;

  array players(*) player1-player10;

  do i=1 to 10;

    player=players(i);

    output;

  end;

run;

proc sort data=need;

  by player game;

run;

data want (keep=player games);

  set need;

  length games $70;

  retain games;

  by player;

  if first.player then games=game;

  else games=catx(",",games,game);

  if last.player then output;

run;

Haikuo
Onyx | Level 15

Just in case you already have two tables, one keyed by game, with player1-play10; another table only has players. So I stole Art's data as well as some of his ideas:

data h1;

  input game player1-player10;

  cards;

1 1 5 6 8 7 10 12 14 16 18

2 1 2 3 4 5 6 7 8 9 10

3 1 2 5 7 9 11 13 15 17 19

;

data h2;

do player=1 to 19; output;end;

run;

data want;

  set h2;

  length games $100;

    do i=1 to nobs;

    set h1 point=i nobs=nobs;

     array players player1-player10;

      if player in players then games=catx(',', games,game);

    end;

keep player games;

run;

Haikuo

Xamius32
Calcite | Level 5

Alright, this worked, but I realized one small thing I need to add. In addition to the list of players there is a list of positions next to each.

So, from above:

data h1;

  input game player1-player10;

  cards;

1 1 5 6 8 7 10 12 14 16 18

2 1 2 3 4 5 6 7 8 9 10

3 1 2 5 7 9 11 13 15 17 19

;

the input would actually look like:

data h1;

  input game player1-player10;

input game player1 pos1 player2 pos2...and so on

  cards;

1 1 a 5 b 6 c 8 d 7 e 10 f 12 g 14 h 16 i 18 j

2 1 a 2 b 3 c 4 d 5 e 6 f 7 g 8 h 9 i 10 k

3 1 a 2 b 5 c 7 d 9 e 11 f 13 g 15 h 17 i 19 k

;

I am trying to simply add another array and put said array in the catx function to no success.

Haikuo
Onyx | Level 15

More clarification is needed. What does the final outcome look like? Some samples please.

Haikuo

Xamius32
Calcite | Level 5

Similar to above,

player game1-gameX

1 1 3 4 10  (so he appears in games 1,3, 4, and 10)

I would need it to be somewhat like

player game1 pos1 game2 pos2.....

1 1 a 3 c 4 d 10 g.... and so forth.

it wouldnt matter if the variables are combined (for example, instead of game1 and pos1, you simply have game1 that is 1a and then game 2 is 3c and so forth. I basically need the same thing done, but the positions have to be connected to the games it found before.)

art297
Opal | Level 21

data h1;

input game

      player1 pos1 $

      player2 pos2 $

      player3 pos3 $

      player4 pos4 $

      player5 pos5 $

      player6 pos6 $

      player7 pos7 $

      player8 pos8 $

      player9 pos9 $

      player10 pos10 $;

  cards;

1 1 a 5 b 6 c 8 d 7 e 10 f 12 g 14 h 16 i 18 j

2 1 a 2 b 3 c 4 d 5 e 6 f 7 g 8 h 9 i 10 k

3 1 a 2 b 5 c 7 d 9 e 11 f 13 g 15 h 17 i 19 k

;

data need (keep=game player pos);

  set h1;

  array players(*) player1-player10;

  array position(*) $ pos1-pos10;

  do i=1 to 10;

    player=players(i);

    pos=position(i);

    output;

  end;

run;

proc sort data=need;

  by player game;

run;

data want (keep=player games);

  set need;

  length games $70;

  retain games;

  by player;

  game_and_pos=catx(" ",game,pos);

  if first.player then games=game_and_pos;

  else games=catx(",",games,game_and_pos);

  if last.player then output;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 22 replies
  • 1356 views
  • 6 likes
  • 4 in conversation