BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Frk_bolsen
Calcite | Level 5

In sas 9.4 i have a dataset looking something like this:

 

idstartdateenddateindateoutdateowner
49810622.01.201909.11.201929.01.2018.803
49810622.01.201909.11.201901.01.201926.05.2023584
49954723.01.201909.11.201913.02.2018.803
49954723.01.201909.11.201901.01.2019.584

where . is missing outdate.

 

I want to manipulate this in two ways:

1. In cases where the outdate is missing, and it is not the last id record, I want to use the indate for the record after as the outdate.

2. In cases where the the outdate is missing, and it is the last record for that id, i want to put 09.11.2023.

 

The code should produce this dataset:

animal_idstartdateenddateindateoutdateowner
49810622.01.201909.11.201929.01.201801.01.2019803
49810622.01.201909.11.201901.01.201926.05.2023584
49954723.01.201909.11.201913.02.201801.01.2019803
49954723.01.201909.11.201901.01.201909.11.2023584

 

Note that id's can have more that two rows each in the real dataset.

 

So far I have this code,

proc sort data=have; by id indate outdate; run;

data want;
set have;
by id indate outdate;
if id = lag(id) and lag(outdate) = . then outdate2 = indate;
if last.id and outdate = . then outdate = 23323;
run;

which produce the acquired results for problem 2, and store the correct indate for use in problem 1, but I do not know how to insert it in the row above.

 

Any help is appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
SAS Super FREQ

We'll need to use the LAG function to get values from other rows. Because LAG only accesses values from previous rows of data, I'll sort the data in reverse chronological order for processing. Here's the code: 

 

 

data have;
	infile datalines dsd dlm='|';
	input id	startdate:ddmmyy10. enddate:ddmmyy10. 
				indate:ddmmyy10. outdate:ddmmyy10. owner;
	format startdate -- outdate ddmmyyp10.;
datalines;
498106|22.01.2019|09.11.2019|29.01.2018|.|803
498106|22.01.2019|09.11.2019|01.01.2019|26.05.2023|584
499547|23.01.2019|09.11.2019|13.02.2018|.|803
499547|23.01.2019|09.11.2019|01.01.2019|.|584
499547|23.01.2020|09.11.2020|13.02.2010|.|803
499547|23.01.2021|09.11.2021|01.01.2021|.|584
;

title "Have";
proc print data=have;
run;

/* LAG() only reads previous records, so first sort in reverse chronological order */
proc sort data=have;
	by id descending indate descending outdate;
run;

data want;
	set have;
	by id descending indate descending outdate;
	/* The LAG() function must be executed for every row, even if we don't need every value */
	_prevInDate=lag1(indate);
	/* If this is the first record for the ID and outdata is missing, set 11SEP2023*/
	if first.id and missing(outdate) then outdate='11SEP2023'd;
	/* Otherwise, if outdata is missing, use the previous indate */
	else if missing(outdate) then outdate=_prevInDate;
	drop _:;
run;

/* Sort the data back into chronological order */
proc sort data=want;
		by id indate outdate;
run;

title "Want";
proc print data=want;
run;

And here is the result:

 

HAVE
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 . 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2019 09.11.2019 13.02.2018 . 803
4 499547 23.01.2019 09.11.2019 01.01.2019 . 584
5 499547 23.01.2020 09.11.2020 13.02.2010 . 803
6 499547 23.01.2021 09.11.2021 01.01.2021 . 584
 
WANT
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 01.01.2019 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2020 09.11.2020 13.02.2010 13.02.2018 803
4 499547 23.01.2019 09.11.2019 13.02.2018 01.01.2019 803
5 499547 23.01.2019 09.11.2019 01.01.2019 01.01.2021 584
6 499547 23.01.2021 09.11.2021 01.01.2021 11.09.2023 584
Check out my Jedi SAS Tricks for SAS Users

View solution in original post

3 REPLIES 3
SASJedi
SAS Super FREQ

We'll need to use the LAG function to get values from other rows. Because LAG only accesses values from previous rows of data, I'll sort the data in reverse chronological order for processing. Here's the code: 

 

 

data have;
	infile datalines dsd dlm='|';
	input id	startdate:ddmmyy10. enddate:ddmmyy10. 
				indate:ddmmyy10. outdate:ddmmyy10. owner;
	format startdate -- outdate ddmmyyp10.;
datalines;
498106|22.01.2019|09.11.2019|29.01.2018|.|803
498106|22.01.2019|09.11.2019|01.01.2019|26.05.2023|584
499547|23.01.2019|09.11.2019|13.02.2018|.|803
499547|23.01.2019|09.11.2019|01.01.2019|.|584
499547|23.01.2020|09.11.2020|13.02.2010|.|803
499547|23.01.2021|09.11.2021|01.01.2021|.|584
;

title "Have";
proc print data=have;
run;

/* LAG() only reads previous records, so first sort in reverse chronological order */
proc sort data=have;
	by id descending indate descending outdate;
run;

data want;
	set have;
	by id descending indate descending outdate;
	/* The LAG() function must be executed for every row, even if we don't need every value */
	_prevInDate=lag1(indate);
	/* If this is the first record for the ID and outdata is missing, set 11SEP2023*/
	if first.id and missing(outdate) then outdate='11SEP2023'd;
	/* Otherwise, if outdata is missing, use the previous indate */
	else if missing(outdate) then outdate=_prevInDate;
	drop _:;
run;

/* Sort the data back into chronological order */
proc sort data=want;
		by id indate outdate;
run;

title "Want";
proc print data=want;
run;

And here is the result:

 

HAVE
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 . 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2019 09.11.2019 13.02.2018 . 803
4 499547 23.01.2019 09.11.2019 01.01.2019 . 584
5 499547 23.01.2020 09.11.2020 13.02.2010 . 803
6 499547 23.01.2021 09.11.2021 01.01.2021 . 584
 
WANT
Obs id startdate enddate indate outdate owner
1 498106 22.01.2019 09.11.2019 29.01.2018 01.01.2019 803
2 498106 22.01.2019 09.11.2019 01.01.2019 26.05.2023 584
3 499547 23.01.2020 09.11.2020 13.02.2010 13.02.2018 803
4 499547 23.01.2019 09.11.2019 13.02.2018 01.01.2019 803
5 499547 23.01.2019 09.11.2019 01.01.2019 01.01.2021 584
6 499547 23.01.2021 09.11.2021 01.01.2021 11.09.2023 584
Check out my Jedi SAS Tricks for SAS Users
Frk_bolsen
Calcite | Level 5
Thank you! That worked brilliantly! Can I just ask, what is the point of the underscore in _PrevInDate that you drop later?
SASJedi
SAS Super FREQ

I'm a lazy programmer - I hate to type more than necessary. So whenever I create a variable I intend to drop, I start the name with an underscore. Then, I can use simple name prefix list syntax to drop them all:

/* This statement drops all variables that have a name beginning with '_' */
drop _:;

It's an old habit that I still find useful.

May the SAS be with you!
Mark 

Check out my Jedi SAS Tricks for SAS Users

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 701 views
  • 1 like
  • 2 in conversation