BookmarkSubscribeRSS Feed
SASdevAnneMarie
Barite | Level 11

Hello Experts,

I would like to Insert the line by "group by" before the first observation and after the last observation (I added the file).

I started the code but I don't know how to write the "insert" statement :

 

proc sort data=donnees;
by Numero;
run;

data donnees_2;
set donnees;
if first.Numero then 

SASdevAnneMarie_4-1685115680083.pngSASdevAnneMarie_5-1685115701926.png

Thank you!

6 REPLIES 6
Reeza
Super User

Insert them at the end and then sort by date. Note that if you change the numero variable you lose your ability to group/sort so I would also add another variable.

data donnees;
    input numero $ date : ddmmyy10.;
    format date date9.;
    cards;
K2 01/01/2023
K2 23/05/2023
K2 25/05/2023
P10 02/01/2023
P10 05/04/2023
;
run;

data donnees_2;
    length numero $20.;
    set donnees;
    by numero;
    numero_grouping=numero;
    output;

    if last.numero then
        do;
            numero=catx("_", "Debut", numero);
            date='31Dec2022'd;
            output;
            numero=catx("_", "Fin", numero);
            date='31Dec2023'd;
            output;
        end;
run;

proc sort data=donnees_2;
    by numero_grouping date;
run;
Tom
Super User Tom
Super User

It doesn't make much sense to add those lines into the DATASET.

Why not just add them into the REPORT?

data have;
  input numero $ date :ddmmyy.;
  format date date9.;
cards;
K2 01/01/2023
K2 23/05/2023
K2 25/05/2023
P10 02/01/2023
P10 05/04/2023
;

proc report data=have;
  columns numero date ;
  define numero / group;
  define date / display;
  compute before numero ; line 'BEFORE ' numero $8. ' 31DEC2022' ; endcomp;
  compute after numero ; line 'AFTER ' numero $8. ' 31DEC2023' ; endcomp;
run;
mkeintz
PROC Star

@Tom wrote:

It doesn't make much sense to add those lines into the DATASET.

Why not just add them into the REPORT?

This certainly seems to be more in the proc report wheelhouse.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ballardw
Super User

Probably a repeat:  Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

Where to the values for the "inserted" date come from?

 

Not going to write code with non-existent data but the general approach would be something like:

When you get the "first" value:

  1) store ALL the variables into temporary variables.

 2) assign the needed values to all the variables

 3) output to the data set

 4) reassign the values from the temporary variables back to the original variables

  5) output the record

When neither a first nor last, output the record

when last

1) output the record

2) assign the values to the variables

3) output the record

 

KEEP just the variables you want (or Drop the ones you don't)

 

One concern: You are adding enough text to the "Numero" variable it is very likely that you need to specify a longer length to hold the values before the SET statement using your existing values.

mkeintz
PROC Star

Use two SET statements, which provides a way to achieve your goal in a single DATA step.

 

Both SETs read all the observations - i.e. they are parallel "streams" of input data.  The first SET is tied to a BY NUMERO statement, allowing you to know when to construct and explicitly output (only) the DEBUT and FIN observations.  The second SET reads all observations and is followed by its own OUTPUT, thus providing a way to output the unmodified dates.

 

data donnees;
    input numero $ date : ddmmyy10.;
    format date date9.;
    cards;
K2 01/01/2023
K2 23/05/2023
K2 25/05/2023
P10 02/01/2023
P10 05/04/2023
run;

data want;
  set donnees;
  by numero;
  if first.numero then do;
    numero=cats('Debut_',numero);
    date='31dec2022'd;
    output;
  end;

  set donnees;
  output;

  if last.numero then do;
    date='31dec2023'd;
    numero=cats('Fin_',numero);
    output;
  end;
run;

 

 

You can use a slightly more familiar construct, with a single SET statement that has two arguments, both of which are dataset DONNEES.  The concept is similar, i.e. two streams, but now instead a 1:1 parallel stream, it is now block to block parallel (i.e. all of a given numero from one stream, followed by all of them for the second stream.  

 

data want2;
  set donnees (in=stream1) donnees (in=stream2);
  by numero;

  if first.numero then do;
    numero=cats('DEBUT_',numero);
    date='31dec2022'd;
    output;
  end;
  if stream2 then output;
  if last.numero then do;
    numero=cats('FIN_',numero);
    date='31dec2023'd;
    output;
  end;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

Don't ignore the fact that once you have succeeded in making the desired change, your data is no longer sorted by the values in variable NUMERO.   You won't be able to use "BY NUMERO" in the resulting data set.

 

You might prefer to keep the NUMERO variable unchanged, and simply add the desired new dates.  Or generate a new variable with values like "DEBUT', 'FIN' and some other indicator of a middle date.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1044 views
  • 3 likes
  • 5 in conversation