BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Frk_bolsen
Calcite | Level 5

In sas 9.4 i have a dataset looking something like this:

 

idstartdateenddateindateoutdateowner
49810622.01.201909.11.201929.01.2018.803
49810622.01.201909.11.201901.01.201926.05.2023584
49954723.01.201909.11.201913.02.2018.803
49954723.01.201909.11.201901.01.2019.584

where . is missing outdate.

 

I want to manipulate this in two ways:

1. In cases where the outdate is missing, and it is not the last id record, I want to use the indate for the record after as the outdate.

2. In cases where the the outdate is missing, and it is the last record for that id, i want to put 09.11.2023.

 

The code should produce this dataset:

animal_idstartdateenddateindateoutdateowner
49810622.01.201909.11.201929.01.201801.01.2019803
49810622.01.201909.11.201901.01.201926.05.2023584
49954723.01.201909.11.201913.02.201801.01.2019803
49954723.01.201909.11.201901.01.201909.11.2023584

 

Note that id's can have more that two rows each in the real dataset.

 

So far I have this code,

proc sort data=have; by id indate outdate; run;

data want;
set have;
by id indate outdate;
if id = lag(id) and lag(outdate) = . then outdate2 = indate;
if last.id and outdate = . then outdate = 23323;
run;

which produce the acquired results for problem 2, and store the correct indate for use in problem 1, but I do not know how to insert it in the row above.

 

Any help is appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
SAS Super FREQ

We'll need to use the LAG function to get values from other rows. Because LAG only accesses values from previous rows of data, I'll sort the data in reverse chronological order for processing. Here's the code: 

 

 

data have;
	infile datalines dsd dlm='|';
	input id	startdate:ddmmyy10. enddate:ddmmyy10. 
				indate:ddmmyy10. outdate:ddmmyy10. owner;
	format startdate -- outdate ddmmyyp10.;
datalines;
498106|22.01.2019|09.11.2019|29.01.2018|.|803
498106|22.01.2019|09.11.2019|01.01.2019|26.05.2023|584
499547|23.01.2019|09.11.2019|13.02.2018|.|803
499547|23.01.2019|09.11.2019|01.01.2019|.|584
499547|23.01.2020|09.11.2020|13.02.2010|.|803
499547|23.01.2021|09.11.2021|01.01.2021|.|584
;

title "Have";
proc print data=have;
run;

/* LAG() only reads previous records, so first sort in reverse chronological order */
proc sort data=have;
	by id descending indate descending outdate;
run;

data want;
	set have;
	by id descending indate descending outdate;
	/* The LAG() function must be executed for every row, even if we don't need every value */
	_prevInDate=lag1(indate);
	/* If this is the first record for the ID and outdata is missing, set 11SEP2023*/
	if first.id and missing(outdate) then outdate='11SEP2023'd;
	/* Otherwise, if outdata is missing, use the previous indate */
	else if missing(outdate) then outdate=_prevInDate;
	drop _:;
run;

/* Sort the data back into chronological order */
proc sort data=want;
		by id indate outdate;
run;

title "Want";
proc print data=want;
run;

And here is the result:

 

HAVE
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 . 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2019 09.11.2019 13.02.2018 . 803
4 499547 23.01.2019 09.11.2019 01.01.2019 . 584
5 499547 23.01.2020 09.11.2020 13.02.2010 . 803
6 499547 23.01.2021 09.11.2021 01.01.2021 . 584
 
WANT
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 01.01.2019 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2020 09.11.2020 13.02.2010 13.02.2018 803
4 499547 23.01.2019 09.11.2019 13.02.2018 01.01.2019 803
5 499547 23.01.2019 09.11.2019 01.01.2019 01.01.2021 584
6 499547 23.01.2021 09.11.2021 01.01.2021 11.09.2023 584
Check out my Jedi SAS Tricks for SAS Users

View solution in original post

3 REPLIES 3
SASJedi
SAS Super FREQ

We'll need to use the LAG function to get values from other rows. Because LAG only accesses values from previous rows of data, I'll sort the data in reverse chronological order for processing. Here's the code: 

 

 

data have;
	infile datalines dsd dlm='|';
	input id	startdate:ddmmyy10. enddate:ddmmyy10. 
				indate:ddmmyy10. outdate:ddmmyy10. owner;
	format startdate -- outdate ddmmyyp10.;
datalines;
498106|22.01.2019|09.11.2019|29.01.2018|.|803
498106|22.01.2019|09.11.2019|01.01.2019|26.05.2023|584
499547|23.01.2019|09.11.2019|13.02.2018|.|803
499547|23.01.2019|09.11.2019|01.01.2019|.|584
499547|23.01.2020|09.11.2020|13.02.2010|.|803
499547|23.01.2021|09.11.2021|01.01.2021|.|584
;

title "Have";
proc print data=have;
run;

/* LAG() only reads previous records, so first sort in reverse chronological order */
proc sort data=have;
	by id descending indate descending outdate;
run;

data want;
	set have;
	by id descending indate descending outdate;
	/* The LAG() function must be executed for every row, even if we don't need every value */
	_prevInDate=lag1(indate);
	/* If this is the first record for the ID and outdata is missing, set 11SEP2023*/
	if first.id and missing(outdate) then outdate='11SEP2023'd;
	/* Otherwise, if outdata is missing, use the previous indate */
	else if missing(outdate) then outdate=_prevInDate;
	drop _:;
run;

/* Sort the data back into chronological order */
proc sort data=want;
		by id indate outdate;
run;

title "Want";
proc print data=want;
run;

And here is the result:

 

HAVE
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 . 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2019 09.11.2019 13.02.2018 . 803
4 499547 23.01.2019 09.11.2019 01.01.2019 . 584
5 499547 23.01.2020 09.11.2020 13.02.2010 . 803
6 499547 23.01.2021 09.11.2021 01.01.2021 . 584
 
WANT
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 01.01.2019 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2020 09.11.2020 13.02.2010 13.02.2018 803
4 499547 23.01.2019 09.11.2019 13.02.2018 01.01.2019 803
5 499547 23.01.2019 09.11.2019 01.01.2019 01.01.2021 584
6 499547 23.01.2021 09.11.2021 01.01.2021 11.09.2023 584
Check out my Jedi SAS Tricks for SAS Users
Frk_bolsen
Calcite | Level 5
Thank you! That worked brilliantly! Can I just ask, what is the point of the underscore in _PrevInDate that you drop later?
SASJedi
SAS Super FREQ

I'm a lazy programmer - I hate to type more than necessary. So whenever I create a variable I intend to drop, I start the name with an underscore. Then, I can use simple name prefix list syntax to drop them all:

/* This statement drops all variables that have a name beginning with '_' */
drop _:;

It's an old habit that I still find useful.

May the SAS be with you!
Mark 

Check out my Jedi SAS Tricks for SAS Users

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 979 views
  • 1 like
  • 2 in conversation