BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Xamius32
Calcite | Level 5

So I want to be able to mark 3-4+ observations as the first based on them being tied on a date.

Data looks like this:

Player_IDDateTeamNumber
14/3/12A
24/3/12A
34/3/12A

So I am trying to mark all of them as  0 in the number column.

If I do first.team, only the first would be 0, same with Date. Is there an easy solution to this? I am probably just missing it.

1 ACCEPTED SOLUTION

Accepted Solutions
EdvinasS
Calcite | Level 5

data need;

retain triger 0;

set original; by team date player_id;

if first.team = 1 then triger = 1;

if last.team = 1 then triger = 0;

run;

something like that?

if you could give bigger dataset example and the outcome you want to achieve it would be much more easy to understand and explane. im working on something similar but i dont get your problem..

View solution in original post

14 REPLIES 14
UrvishShah
Fluorite | Level 6

Hi,

try the following code...

proc sort data = test;

      by team;

run;

data test;

       set test;

       by team;

       if first.team then do;

          retain number 0;

       end;

run;

Thanks,

Xamius32
Calcite | Level 5

How would that take into account the dates? For each team, there are about 100 observations, and 2-3 that I want to mark as the first, if they all have the same first date.

ballardw
Super User

Try sort by date and team then first.team. Then the first occurence of each team within a given date gets selected.

Xamius32
Calcite | Level 5

Problem is I want to select all the observations for each time that are in the first date. Does that make sense? It seems first. can only select one observation for a particular group.

Reeza
Super User

Yes it does, but  assuming you're grouping by something else you should be able to do it.

What does the following give you?

proc sort data=have;

by team date player;

data want;

set have;

by team date;

retain number;

if first.team then number=0;

if first.date then number+1;

run;

Xamius32
Calcite | Level 5

I see where you are going with it, and I am going to try a few things, but I probably should explain the back story.

I got the number variable as a way to see how many days it has been since a team has played, using the lag function. However, this of course didnt work fully as teams with multiple players on the same day would have a zero instead of a one. So I wanted to change all the zeroes to ones except for the first date for each team.

data gamedates4;

set gamedates2;

by team_name gamedate;

days=game_date-lag(game_date);

run;

Im going to try what you said, but I need to figure out how to not mess up the rest of it.

ballardw
Super User

Looks like you need to include player_id in the sort and the by statement: by team date player_id;

then First.date refers to the first player's record within the date.

Also your Days calculation is probably incorrect as you are getting some days, the first days value for each team (except the first team) reflects the days between the last game of the previous team and the first of the current team.

days=game_date-lag(game_date);

probably should have a:

if first.team then days=.; /* or what ever makes sense in your application*/

Xamius32
Calcite | Level 5

Yea, I got rid of those by putting all negative values as zero.

Anyways, wouldn't the first.date still only capture the one players, I need to capture all the players with the first date for each team. I am starting to think I need to rethink this.

Reeza
Super User

It will capture the first date and the retain will keep it at value until it changes again (another first.date) if you have BY variables in the proper sequence.

Xamius32
Calcite | Level 5

Okay, last question. This worked, but one thing I realize is that the lag function will create zeroes on the teams that have multiple players on the same date (IE the dates are the same, so after the first, the lag is zero.) How can I tell it to retain days until the next gamedate? IE I put the days=game_date-lag(game_date), then I need to retain that value until the gamedate changes. MY thought was use if first.game_date=0 then days=lag(days). but it doesnt work.

Reeza
Super User

why not retain days and only calculate it when the date changes and reset the lag at that point.

EdvinasS
Calcite | Level 5

Hi, i consume you are a tipster? please contact me on stajer@usa.com for bussines.

EdvinasS
Calcite | Level 5

data need;

retain triger 0;

set original; by team date player_id;

if first.team = 1 then triger = 1;

if last.team = 1 then triger = 0;

run;

something like that?

if you could give bigger dataset example and the outcome you want to achieve it would be much more easy to understand and explane. im working on something similar but i dont get your problem..

DBailey
Lapis Lazuli | Level 10

If you're truly trying to track team dates, then something like this might be better:

proc sql;

create table team_dates as

select distinct team, game_date from have

order by team, game_date;

quit;

data want;

set team_dates;

format game_lag 8.;

by team;

if first.team then game_lag=0;

else game_lag=game_date - lag(game_date);

run;

should produce something like

TeamGame_Dategame_lag
A4/3/120
A4/4/121
A4/6/122
A5/1/1224
B5/1/120
B5/5/124

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 4778 views
  • 6 likes
  • 6 in conversation