Hello Experts,
I would like to Insert the line by "group by" before the first observation and after the last observation (I added the file).
I started the code but I don't know how to write the "insert" statement :
proc sort data=donnees;
by Numero;
run;
data donnees_2;
set donnees;
if first.Numero then
Thank you!
Insert them at the end and then sort by date. Note that if you change the numero variable you lose your ability to group/sort so I would also add another variable.
data donnees;
input numero $ date : ddmmyy10.;
format date date9.;
cards;
K2 01/01/2023
K2 23/05/2023
K2 25/05/2023
P10 02/01/2023
P10 05/04/2023
;
run;
data donnees_2;
length numero $20.;
set donnees;
by numero;
numero_grouping=numero;
output;
if last.numero then
do;
numero=catx("_", "Debut", numero);
date='31Dec2022'd;
output;
numero=catx("_", "Fin", numero);
date='31Dec2023'd;
output;
end;
run;
proc sort data=donnees_2;
by numero_grouping date;
run;
It doesn't make much sense to add those lines into the DATASET.
Why not just add them into the REPORT?
data have;
input numero $ date :ddmmyy.;
format date date9.;
cards;
K2 01/01/2023
K2 23/05/2023
K2 25/05/2023
P10 02/01/2023
P10 05/04/2023
;
proc report data=have;
columns numero date ;
define numero / group;
define date / display;
compute before numero ; line 'BEFORE ' numero $8. ' 31DEC2022' ; endcomp;
compute after numero ; line 'AFTER ' numero $8. ' 31DEC2023' ; endcomp;
run;
@Tom wrote:
It doesn't make much sense to add those lines into the DATASET.
Why not just add them into the REPORT?
This certainly seems to be more in the proc report wheelhouse.
Probably a repeat: Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
Where to the values for the "inserted" date come from?
Not going to write code with non-existent data but the general approach would be something like:
When you get the "first" value:
1) store ALL the variables into temporary variables.
2) assign the needed values to all the variables
3) output to the data set
4) reassign the values from the temporary variables back to the original variables
5) output the record
When neither a first nor last, output the record
when last
1) output the record
2) assign the values to the variables
3) output the record
KEEP just the variables you want (or Drop the ones you don't)
One concern: You are adding enough text to the "Numero" variable it is very likely that you need to specify a longer length to hold the values before the SET statement using your existing values.
Use two SET statements, which provides a way to achieve your goal in a single DATA step.
Both SETs read all the observations - i.e. they are parallel "streams" of input data. The first SET is tied to a BY NUMERO statement, allowing you to know when to construct and explicitly output (only) the DEBUT and FIN observations. The second SET reads all observations and is followed by its own OUTPUT, thus providing a way to output the unmodified dates.
data donnees;
input numero $ date : ddmmyy10.;
format date date9.;
cards;
K2 01/01/2023
K2 23/05/2023
K2 25/05/2023
P10 02/01/2023
P10 05/04/2023
run;
data want;
set donnees;
by numero;
if first.numero then do;
numero=cats('Debut_',numero);
date='31dec2022'd;
output;
end;
set donnees;
output;
if last.numero then do;
date='31dec2023'd;
numero=cats('Fin_',numero);
output;
end;
run;
You can use a slightly more familiar construct, with a single SET statement that has two arguments, both of which are dataset DONNEES. The concept is similar, i.e. two streams, but now instead a 1:1 parallel stream, it is now block to block parallel (i.e. all of a given numero from one stream, followed by all of them for the second stream.
data want2;
set donnees (in=stream1) donnees (in=stream2);
by numero;
if first.numero then do;
numero=cats('DEBUT_',numero);
date='31dec2022'd;
output;
end;
if stream2 then output;
if last.numero then do;
numero=cats('FIN_',numero);
date='31dec2023'd;
output;
end;
run;
Don't ignore the fact that once you have succeeded in making the desired change, your data is no longer sorted by the values in variable NUMERO. You won't be able to use "BY NUMERO" in the resulting data set.
You might prefer to keep the NUMERO variable unchanged, and simply add the desired new dates. Or generate a new variable with values like "DEBUT', 'FIN' and some other indicator of a middle date.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.