BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lucas
Calcite | Level 5

Hi all, 

I have table with 3 columns (id;datetime;status) and I need to create new table with periods between two statuses: CLOSE-REOPEN.

 

Have:

id datetime status
1 06SEP2020:12:00:00 open
1 06SEP2020:12:00:01 CLOSE
1 06SEP2020:12:00:02 s1
1 06SEP2020:12:00:03 s2
1 06SEP2020:12:00:04 REOPEN
1 06SEP2020:12:00:05 s1
1 06SEP2020:12:00:06 CLOSE
1 06SEP2020:12:00:07 s1
1 06SEP2020:12:00:08 REOPEN
2 06SEP2020:12:00:00 open
2 06SEP2020:12:00:01 CLOSE
2 06SEP2020:12:00:02 s1

 

Want:

id status1 status2 datetime_from datetime_to
1 CLOSE REOPEN 06SEP2020:12:00:01 06SEP2020:12:00:03
1 CLOSE REOPEN 06SEP2020:12:00:06 06SEP2020:12:00:07

 

Is it possible to achieve this with SQL Query? I can use only PROC SQL. It would be great to do this with one query but if it is not possible then more queries are also fine.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Sure. It can be achieved with a SQL query:

 

proc sql;
select 
    a.id,
    a.status,
    b.status,
    a.datetime as datetimeFrom,
    b.datetime as datetimeTo
from 
    have as a inner join
    have as b on a.id=b.id and b.datetime>a.datetime
where a.status = "CLOSE" and b.status="REOPEN"
group by a.id, a.datetime
having b.datetime-a.datetime = min(b.datetime-a.datetime);
quit;
PG

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

@Lucas wrote:

I can use only PROC SQL.


In my opinion, an unfortunate restriction. SQL doesn't really a good tool for finding the values immediately above a specific record. I'm sure someone will come along and write wizard-like SQL code that will be difficult to understand, that will get the job done.

 

I would use a DATA step, followed by PROC TRANSPOSE. So for anyone reading along who might not have this restriction, here is the code:

 

data times;
    set have;
    prev_time=lag(datetime);
    if status='CLOSE' then do;
    	sequence+1;
    	output_time=datetime;
    	output;
	end;
	else if status='REOPEN' then do;
	    output_time=prev_time;
	    output;
    end;
    drop prev_time datetime;
run;
proc transpose data=times out=want prefix=datetime_;
    var output_time;
    by id sequence;
    id status;
run;
data want;
    set want(drop=sequence _name_);
    if missing(datetime_reopen) then delete;
    format datetime: datetime16.;
run;
--
Paige Miller
PGStats
Opal | Level 21

Sure. It can be achieved with a SQL query:

 

proc sql;
select 
    a.id,
    a.status,
    b.status,
    a.datetime as datetimeFrom,
    b.datetime as datetimeTo
from 
    have as a inner join
    have as b on a.id=b.id and b.datetime>a.datetime
where a.status = "CLOSE" and b.status="REOPEN"
group by a.id, a.datetime
having b.datetime-a.datetime = min(b.datetime-a.datetime);
quit;
PG
PaigeMiller
Diamond | Level 26

This doesn't give the requested output in the column named datetimeTo

--
Paige Miller
PGStats
Opal | Level 21

Oops. You are right @PaigeMiller.

 

But then, what should be the value of datetimeTo? The most recent recorded timestamp before REOPEN (whatever the status) or the REOPEN timestamp minus 1 second? What happens if CLOSE and REOPEN are consecutive?

PG
PaigeMiller
Diamond | Level 26

Those are questions only @Lucas can answer.

 

This illustrates the importance of coming up with a realistic example to illustrate the problem.

--
Paige Miller
mkeintz
PROC Star

Yes, it can be done with SQL, but AFAICT SQL has to do a cartesian crossing of all CLOSE vs REOPEN rows to determine the reopen most closely following close.  It doesn't appear to take advantage of the fact that the data are ordered - ready made for a DATA step, which should be much faster. 

 

The user is being asked to show proficiency in SQL rather than efficiency (and relative simplicity) in data processing.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

If this is an assignment to help someone become proficient at SQL, then I hate the idea behind this assignment. Part of becoming proficient at SQL is understanding when SQL is not a good tool to use.

--
Paige Miller
Lucas
Calcite | Level 5
Briliant solution in easy way. Values datetimeTo are different than in my example but it's fine for me. Thank you very much @PGStats!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1989 views
  • 5 likes
  • 4 in conversation