BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lucas
Calcite | Level 5

Hi all, 

I have table with 3 columns (id;datetime;status) and I need to create new table with periods between two statuses: CLOSE-REOPEN.

 

Have:

id datetime status
1 06SEP2020:12:00:00 open
1 06SEP2020:12:00:01 CLOSE
1 06SEP2020:12:00:02 s1
1 06SEP2020:12:00:03 s2
1 06SEP2020:12:00:04 REOPEN
1 06SEP2020:12:00:05 s1
1 06SEP2020:12:00:06 CLOSE
1 06SEP2020:12:00:07 s1
1 06SEP2020:12:00:08 REOPEN
2 06SEP2020:12:00:00 open
2 06SEP2020:12:00:01 CLOSE
2 06SEP2020:12:00:02 s1

 

Want:

id status1 status2 datetime_from datetime_to
1 CLOSE REOPEN 06SEP2020:12:00:01 06SEP2020:12:00:03
1 CLOSE REOPEN 06SEP2020:12:00:06 06SEP2020:12:00:07

 

Is it possible to achieve this with SQL Query? I can use only PROC SQL. It would be great to do this with one query but if it is not possible then more queries are also fine.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Sure. It can be achieved with a SQL query:

 

proc sql;
select 
    a.id,
    a.status,
    b.status,
    a.datetime as datetimeFrom,
    b.datetime as datetimeTo
from 
    have as a inner join
    have as b on a.id=b.id and b.datetime>a.datetime
where a.status = "CLOSE" and b.status="REOPEN"
group by a.id, a.datetime
having b.datetime-a.datetime = min(b.datetime-a.datetime);
quit;
PG

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

@Lucas wrote:

I can use only PROC SQL.


In my opinion, an unfortunate restriction. SQL doesn't really a good tool for finding the values immediately above a specific record. I'm sure someone will come along and write wizard-like SQL code that will be difficult to understand, that will get the job done.

 

I would use a DATA step, followed by PROC TRANSPOSE. So for anyone reading along who might not have this restriction, here is the code:

 

data times;
    set have;
    prev_time=lag(datetime);
    if status='CLOSE' then do;
    	sequence+1;
    	output_time=datetime;
    	output;
	end;
	else if status='REOPEN' then do;
	    output_time=prev_time;
	    output;
    end;
    drop prev_time datetime;
run;
proc transpose data=times out=want prefix=datetime_;
    var output_time;
    by id sequence;
    id status;
run;
data want;
    set want(drop=sequence _name_);
    if missing(datetime_reopen) then delete;
    format datetime: datetime16.;
run;
--
Paige Miller
PGStats
Opal | Level 21

Sure. It can be achieved with a SQL query:

 

proc sql;
select 
    a.id,
    a.status,
    b.status,
    a.datetime as datetimeFrom,
    b.datetime as datetimeTo
from 
    have as a inner join
    have as b on a.id=b.id and b.datetime>a.datetime
where a.status = "CLOSE" and b.status="REOPEN"
group by a.id, a.datetime
having b.datetime-a.datetime = min(b.datetime-a.datetime);
quit;
PG
PaigeMiller
Diamond | Level 26

This doesn't give the requested output in the column named datetimeTo

--
Paige Miller
PGStats
Opal | Level 21

Oops. You are right @PaigeMiller.

 

But then, what should be the value of datetimeTo? The most recent recorded timestamp before REOPEN (whatever the status) or the REOPEN timestamp minus 1 second? What happens if CLOSE and REOPEN are consecutive?

PG
PaigeMiller
Diamond | Level 26

Those are questions only @Lucas can answer.

 

This illustrates the importance of coming up with a realistic example to illustrate the problem.

--
Paige Miller
mkeintz
PROC Star

Yes, it can be done with SQL, but AFAICT SQL has to do a cartesian crossing of all CLOSE vs REOPEN rows to determine the reopen most closely following close.  It doesn't appear to take advantage of the fact that the data are ordered - ready made for a DATA step, which should be much faster. 

 

The user is being asked to show proficiency in SQL rather than efficiency (and relative simplicity) in data processing.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

If this is an assignment to help someone become proficient at SQL, then I hate the idea behind this assignment. Part of becoming proficient at SQL is understanding when SQL is not a good tool to use.

--
Paige Miller
Lucas
Calcite | Level 5
Briliant solution in easy way. Values datetimeTo are different than in my example but it's fine for me. Thank you very much @PGStats!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1090 views
  • 5 likes
  • 4 in conversation