BookmarkSubscribeRSS Feed
Tom
Super User Tom
Super User

Here is your data with shorter variable names and values so it is easier to see what is happening.  There are four records, but they only come from 3 distinct combinations of ID,START and STOP so the output dataset will only have 3 observations.  The first observation in not output because it is not the last record for that combination of ID,FIRST and STOP values.


Notice that when LAST.ID is true then LAST.STOP must also be true because even if the next observation just happened to have the same value for STOP it would represent another group since it is for a DIFFERENT value of ID.  

data one ;

  length ID 8 START STOP $1 NAME ROLE $50;

  infile cards dsd dlm='|' ;

  input ID START STOP NAME ROLE ;

cards;

1|A|B|Christine Godden|Primary Planner

1|A|B|David Addison|Primary Planner

1|C|D|Rachael Mayne|Plan Support Coordinator

2|E|F|Annalise Korsch|Primary Planner

run;

data want;

  set one;

  by id start stop ;

  length new $200 ;

  retain new ;

  if first.stop then new = ' ';

  new = catx('; ',new,catx(' ',name,cats('(',role,')')));

  put (id start stop first.id first.stop last.stop last.id ) (=) / new = /;

  if last.stop ;

run;

ID=1 START=A STOP=B FIRST.ID=1 FIRST.STOP=1 LAST.STOP=0 LAST.ID=0

new=Christine Godden (Primary Planner)

ID=1 START=A STOP=B FIRST.ID=0 FIRST.STOP=0 LAST.STOP=1 LAST.ID=0

new=Christine Godden (Primary Planner); David Addison (Primary Planner)

ID=1 START=C STOP=D FIRST.ID=0 FIRST.STOP=1 LAST.STOP=1 LAST.ID=1

new=Rachael Mayne (Plan Support Coordinator)

ID=2 START=E STOP=F FIRST.ID=1 FIRST.STOP=1 LAST.STOP=1 LAST.ID=1

new=Annalise Korsch (Primary Planner)

NOTE: There were 4 observations read from the data set WORK.ONE.

NOTE: The data set WORK.WANT has 3 observations and 6 variables.


ven
Calcite | Level 5 ven
Calcite | Level 5

In fact if I can use the by variables to identify the correct record I can use the method detailed, will do some more testing and update.

ven
Calcite | Level 5 ven
Calcite | Level 5

Hi Tom,

I was able to use the By variables as explained above. Which helps to simplify the program, removing the merge as well and this in turn improved memory usage.

Thanks a lot for the help.

data pt_all_planteam_temp  ;

    set NDIAOUT.plan_mgmt_staffname_nodup_hist;

    by ParentPartyID g_snapshot_start_date g_snapshot_end_date ;

    length All_Planners $255;

    retain All_Planners;

    if first.g_snapshot_end_date then

        All_Planners=cat(trim(Plan_mgmt_team_member)," (",trim(Participant_Management_Role_Desc),")")

    ;

    else

        All_Planners=catx("; ",trim(All_Planners),cat(trim(Plan_mgmt_team_member)," (",trim(Participant_Management_Role_Desc),")"))

    ;

 

    if last.g_snapshot_end_date;

    keep ParentPartyID g_snapshot_start_date g_snapshot_end_date All_Planners;

run;

ven
Calcite | Level 5 ven
Calcite | Level 5

Hi Tom ,

Also noted within a given g_snapshot_start_date and g_snapshot_end_date there can be many ParentPartyId s .

Which means

if first.g_snapshot_end_date will not give the right output at all times.

if last.ParentPartyID needs to be checked as well.

Tom
Super User Tom
Super User

That is not right.  If you are using two or more variables in your BY statement they are nested.  So if you have BY A B then by definition when it is LAST.A is also LAST.B.  So if you want one record per Party Id, Start Time, End Time as your original code was doing then you should use FIRST.G_SNAPSHOT_END_DATE and LAST.G_SNAPSHOT_END_DATE to detect the beginning and ending of a new group of records.  If you just want one per PARTYID then you do not want to keep the date variables on the output record since they will just be the last set of dates for that PARTYID.

Ksharp
Super User

Tom,

OP has a big table . Not good for MERGE .

Tom
Super User Tom
Super User

Not sure that was the issue. The issue was that instead of merging the two table the code was looping over one of the tables and for every observation in that table it was reading all of the other table.  So it was essentially doing an full outer join.  That is what was causing it to take so long.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 3402 views
  • 6 likes
  • 4 in conversation