BookmarkSubscribeRSS Feed
AMFR
Quartz | Level 8

Hi all,

 

Below is a very small part of my big messy data

 

data temp;
 input ID Action $ TS_SER TS Verify Notecount;
 informat TS_SER datetime14. TS datetime14.;
 format TS_SER  datetime14. TS datetime14.;
 datalines;

252670 Sign 13DEC17:11:05 13DEC17:11:14 . .
252670 Perform 13DEC17:11:05 13DEC17:11:14 . .
252670 Verify 13DEC17:11:05 13DEC17:11:14  1 1
252670 Sign 13DEC17:15:27 13DEC17:15:41 . .
252670 Perform 13DEC17:15:27 13DEC17:15:41 . .
252670 Verify 13DEC17:15:27 13DEC17:15:41 1 2
252670 Sign 13DEC17:16:28 13DEC17:16:30 . .
252670 Perform 13DEC17:16:28 13DEC17:16:30 . .
252670 Verify 13DEC17:16:28 13DEC17:16:30 1 3
252670 Sign 13DEC17:16:29 13DEC17:16:30 . .
252670 Modify 13DEC17:16:29 13DEC17:16:30 . .
252670 Perform 14DEC17:07:32 14DEC17:07:34 . .
252670 Verify 14DEC17:07:32 14DEC17:07:34 1 4
252670 Sign 14DEC17:07:32 14DEC17:07:34 . .
252670 Perform 14DEC17:09:15 14DEC17:09:15 . .
252670 Verify 14DEC17:09:15 14DEC17:09:15 1 5
252670 Sign 14DEC17:09:15 14DEC17:09:15 . .
252670 Perform 14DEC17:14:18 14DEC17:16:29 . .
252670 Verify 14DEC17:14:18 14DEC17:16:29 1 6
252670 Sign 14DEC17:14:18 14DEC17:14:18 . .
252670 Sign 14DEC17:14:18 14DEC17:16:29 . .
252670 Sign 14DEC17:16:13 14DEC17:16:17 . .
252670 Perform 14DEC17:16:13 14DEC17:16:17 . .
252670 Verify 14DEC17:16:13 14DEC17:16:17 1 7
252670 Modify 14DEC17:16:29 14DEC17:16:29 . .
252670 Perform 14DEC17:16:29 14DEC17:16:29 . .
252670 Sign 14DEC17:16:29 14DEC17:16:29 . .
252670 Perform 14DEC17:19:12 14DEC17:19:15 . .
252670 Verify 14DEC17:19:12 14DEC17:19:15 1 8
252670 Sign 14DEC17:19:12 14DEC17:19:15 . .
252670 Perform 15DEC17:15:27 15DEC17:15:27 . .
252670 Verify 15DEC17:15:27 15DEC17:15:27 1 9
252670 Sign 15DEC17:15:27 15DEC17:15:27 . .
252670 Sign 16DEC17:10:38 16DEC17:10:38 . .
252670 Perform 16DEC17:10:38 16DEC17:10:38 . .
252670 Verify 16DEC17:10:38 16DEC17:10:38 1 10
318672 Sign 13OCT17:20:32 14OCT17:10:11 . .
318672 Modify 13OCT17:20:36 14OCT17:10:11  . .
318672 Modify 13OCT17:20:37 14OCT17:10:11 . .
318672 Modify 13OCT17:20:40 14OCT17:10:11 . .
318672 Modify 13OCT17:20:44 14OCT17:10:11 . .
318672 Modify 13OCT17:20:46 14OCT17:10:11 . .
318672 Modify 13OCT17:23:03 14OCT17:10:11 . .
318672 Sign 13OCT17:23:03 14OCT17:10:11 . .
318672 Sign 13OCT17:23:04 13OCT17:23:04 . .
318672 Verify 14OCT17:10:11 14OCT17:10:11 1 1
318672 Sign 14OCT17:10:11 14OCT17:10:11 . .
318672 Sign 14OCT17:10:26 14OCT17:10:52 . .
318672 Perform 14OCT17:10:26 14OCT17:14:09 . .
318672 Modify 14OCT17:10:52 14OCT17:14:09 . .
318672 Sign 14OCT17:10:54 14OCT17:10:54 . .
318672 Sign 14OCT17:10:54 14OCT17:14:09 . .
318672 Modify 14OCT17:10:54 14OCT17:14:09 . .
318672 Verify 14OCT17:10:54 14OCT17:14:09 1 2
;
run;

 

I want to retain value of notecount so my clean data would look like this (as below). I need to retain variable notecount  based on TS_SER and TS.


data temp1;
 input ID Action $ TS_SER TS Verify Notecount;
  informat TS_SER datetime14. TS datetime14.;
 format TS_SER datetime14. TS datetime14.;
 datalines;

252670 Sign 13DEC17:11:05 13DEC17:11:14 . 1
252670 Perform 13DEC17:11:05 13DEC17:11:14 . 1
252670 Verify 13DEC17:11:05 13DEC17:11:14  1 1
252670 Sign 13DEC17:15:27 13DEC17:15:41 . 2
252670 Perform 13DEC17:15:27 13DEC17:15:41 . 2
252670 Verify 13DEC17:15:27 13DEC17:15:41 1 2
252670 Sign 13DEC17:16:28 13DEC17:16:30 . 3
252670 Perform 13DEC17:16:28 13DEC17:16:30 . 3
252670 Verify 13DEC17:16:28 13DEC17:16:30 1 3
252670 Sign 13DEC17:16:29 13DEC17:16:30 . 3
252670 Modify 13DEC17:16:29 13DEC17:16:30 . 3
252670 Perform 14DEC17:07:32 14DEC17:07:34 . 4
252670 Verify 14DEC17:07:32 14DEC17:07:34 1 4
252670 Sign 14DEC17:07:32 14DEC17:07:34 . 4
252670 Perform 14DEC17:09:15 14DEC17:09:15 . 5
252670 Verify 14DEC17:09:15 14DEC17:09:15 1 5
252670 Sign 14DEC17:09:15 14DEC17:09:15 . 5
252670 Perform 14DEC17:14:18 14DEC17:16:29 . 6
252670 Verify 14DEC17:14:18 14DEC17:16:29 1 6
252670 Sign 14DEC17:14:18 14DEC17:14:18 . 6
252670 Sign 14DEC17:14:18 14DEC17:16:29 . 6
252670 Modify 14DEC17:16:29 14DEC17:16:29 . 6
252670 Perform 14DEC17:16:29 14DEC17:16:29 . 6
252670 Sign 14DEC17:16:29 14DEC17:16:29 . 6
252670 Sign 14DEC17:16:13 14DEC17:16:17 . 7
252670 Perform 14DEC17:16:13 14DEC17:16:17 . 7
252670 Verify 14DEC17:16:13 14DEC17:16:17 1 7
252670 Perform 14DEC17:19:12 14DEC17:19:15 . 8
252670 Verify 14DEC17:19:12 14DEC17:19:15 1 8
252670 Sign 14DEC17:19:12 14DEC17:19:15 . 8
252670 Perform 15DEC17:15:27 15DEC17:15:27 . 9
252670 Verify 15DEC17:15:27 15DEC17:15:27 1 9
252670 Sign 15DEC17:15:27 15DEC17:15:27 . 9
252670 Sign 16DEC17:10:38 16DEC17:10:38 . 10
252670 Perform 16DEC17:10:38 16DEC17:10:38 . 10
252670 Verify 16DEC17:10:38 16DEC17:10:38 1 10
318672 Sign 13OCT17:20:32 14OCT17:10:11 . 1
318672 Modify 13OCT17:20:36 14OCT17:10:11  . 1
318672 Modify 13OCT17:20:37 14OCT17:10:11 . 1
318672 Modify 13OCT17:20:40 14OCT17:10:11 . 1
318672 Modify 13OCT17:20:44 14OCT17:10:11 . 1
318672 Modify 13OCT17:20:46 14OCT17:10:11 . 1
318672 Modify 13OCT17:23:03 14OCT17:10:11 . 1
318672 Sign 13OCT17:23:03 14OCT17:10:11 . 1
318672 Sign 13OCT17:23:04 13OCT17:23:04 . 1
318672 Verify 14OCT17:10:11 14OCT17:10:11 1 1
318672 Sign 14OCT17:10:11 14OCT17:10:11 . 1
318672 Sign 14OCT17:10:26 14OCT17:10:52 . 2
318672 Perform 14OCT17:10:26 14OCT17:14:09 . 2
318672 Modify 14OCT17:10:52 14OCT17:14:09 . 2
318672 Sign 14OCT17:10:54 14OCT17:10:54 . 2
318672 Sign 14OCT17:10:54 14OCT17:14:09 . 2
318672 Modify 14OCT17:10:54 14OCT17:14:09 . 2
318672 Verify 14OCT17:10:54 14OCT17:14:09 1 2
;
run;

proc print data=work.temp1;
run;

 

 

Thank you all.

 

 

 

8 REPLIES 8
Ksharp
Super User

Here could give you a start.

 


data temp;
 input ID Action $ TS_SER TS Verify Notecount;
 informat TS_SER datetime14. TS datetime14.;
 format TS_SER  datetime14. TS datetime14.;
 datalines;
252670 Sign 13DEC17:11:05 13DEC17:11:14 . .
252670 Perform 13DEC17:11:05 13DEC17:11:14 . .
252670 Verify 13DEC17:11:05 13DEC17:11:14  1 1
252670 Sign 13DEC17:15:27 13DEC17:15:41 . .
252670 Perform 13DEC17:15:27 13DEC17:15:41 . .
252670 Verify 13DEC17:15:27 13DEC17:15:41 1 2
252670 Sign 13DEC17:16:28 13DEC17:16:30 . .
252670 Perform 13DEC17:16:28 13DEC17:16:30 . .
252670 Verify 13DEC17:16:28 13DEC17:16:30 1 3
252670 Sign 13DEC17:16:29 13DEC17:16:30 . .
252670 Modify 13DEC17:16:29 13DEC17:16:30 . .
252670 Perform 14DEC17:07:32 14DEC17:07:34 . .
252670 Verify 14DEC17:07:32 14DEC17:07:34 1 4
252670 Sign 14DEC17:07:32 14DEC17:07:34 . .
252670 Perform 14DEC17:09:15 14DEC17:09:15 . .
252670 Verify 14DEC17:09:15 14DEC17:09:15 1 5
252670 Sign 14DEC17:09:15 14DEC17:09:15 . .
252670 Perform 14DEC17:14:18 14DEC17:16:29 . .
252670 Verify 14DEC17:14:18 14DEC17:16:29 1 6
252670 Sign 14DEC17:14:18 14DEC17:14:18 . .
252670 Sign 14DEC17:14:18 14DEC17:16:29 . .
252670 Sign 14DEC17:16:13 14DEC17:16:17 . .
252670 Perform 14DEC17:16:13 14DEC17:16:17 . .
252670 Verify 14DEC17:16:13 14DEC17:16:17 1 7
252670 Modify 14DEC17:16:29 14DEC17:16:29 . .
252670 Perform 14DEC17:16:29 14DEC17:16:29 . .
252670 Sign 14DEC17:16:29 14DEC17:16:29 . .
252670 Perform 14DEC17:19:12 14DEC17:19:15 . .
252670 Verify 14DEC17:19:12 14DEC17:19:15 1 8
252670 Sign 14DEC17:19:12 14DEC17:19:15 . .
252670 Perform 15DEC17:15:27 15DEC17:15:27 . .
252670 Verify 15DEC17:15:27 15DEC17:15:27 1 9
252670 Sign 15DEC17:15:27 15DEC17:15:27 . .
252670 Sign 16DEC17:10:38 16DEC17:10:38 . .
252670 Perform 16DEC17:10:38 16DEC17:10:38 . .
252670 Verify 16DEC17:10:38 16DEC17:10:38 1 10
318672 Sign 13OCT17:20:32 14OCT17:10:11 . .
318672 Modify 13OCT17:20:36 14OCT17:10:11  . .
318672 Modify 13OCT17:20:37 14OCT17:10:11 . .
318672 Modify 13OCT17:20:40 14OCT17:10:11 . .
318672 Modify 13OCT17:20:44 14OCT17:10:11 . .
318672 Modify 13OCT17:20:46 14OCT17:10:11 . .
318672 Modify 13OCT17:23:03 14OCT17:10:11 . .
318672 Sign 13OCT17:23:03 14OCT17:10:11 . .
318672 Sign 13OCT17:23:04 13OCT17:23:04 . .
318672 Verify 14OCT17:10:11 14OCT17:10:11 1 1
318672 Sign 14OCT17:10:11 14OCT17:10:11 . .
318672 Sign 14OCT17:10:26 14OCT17:10:52 . .
318672 Perform 14OCT17:10:26 14OCT17:14:09 . .
318672 Modify 14OCT17:10:52 14OCT17:14:09 . .
318672 Sign 14OCT17:10:54 14OCT17:10:54 . .
318672 Sign 14OCT17:10:54 14OCT17:14:09 . .
318672 Modify 14OCT17:10:54 14OCT17:14:09 . .
318672 Verify 14OCT17:10:54 14OCT17:14:09 1 2
;
run;
proc sort data=temp;
by  id  TS;
run;
data want;
 merge temp 
 temp(keep=id TS_SER TS Notecount rename=(Notecount=want)
 where=(want is not missing));
 by id  TS;
run;
proc print;run;
AMFR
Quartz | Level 8

Thank you very much for your help. I am not getting desired results but I can try to work around. Please let me know if you find any other solution.

 

Kurt_Bremser
Super User

Where from do you get the "2" for ID=318672 and TS=14OCT17:10:52?

 

Other than that, try

proc sql;
create table want as
select
  ID, Action, TS_SER, TS, Verify,
  max(notecount) as notecount
from temp
group by id, ts
order by id, ts_ser
;
quit;
AMFR
Quartz | Level 8

it would be 2 because if you look at the Ser_Ts when it got verified.14OCT17:10:54, both timestamps are very close. Your codes are very helpful but not giving me desired results. I will try to work around through your codes

AMFR
Quartz | Level 8

Yes, you are absolutely right. First we need to define "Close". Honestly I am also confuse. Actually this is someone else projects who left and now I am responsible to continue working on it. It is an ongoing project and my supervisor wants me to continue working on it the same way previous person was doing. She used STATA for analysis, and I have no background knowledge of STATA. I got most of her work but only this part of coding is difficult to understand. Below are STATA codes, it might help you. What I am understanding that data is sorted at id, ts_ser, and ts and then notecount variable has retained based on if ((ts_ser=lag(ts_ser) or ((ts_ser-ts_ser) le  2 min))) or (ts=lag(ts) or ((ts-ts) le 2 min))). In codes below, I think 2000= 2minutes. I am hoping I am making some sense to you.

Thanks

 

 

 

set more off
 local i=1
 while `i'<7 {
  while inlist(`i',1,3,5) {
   local j=1
   sort id ts_ser ts
   while `j'<100 {
    replace notecount=notecount[_n+1] if notecount==. & notecount[_n+1]!=. & id==id[_n+1] & ///
     (ts_ser==ts_ser[_n+1] | abs(ts_ser-ts_ser[_n+1])<=2000 | ts==ts[_n+1] | abs(ts-ts[_n+1])<=2000)
    replace notecount=notecount[_n-1] if notecount==. & notecount[_n-1]!=. & id==id[_n-1] & ///
     (ts_ser==ts_ser[_n-1] | abs(ts_ser-ts_ser[_n-1])<=2000 | ts==ts[_n-1] | abs(ts-ts[_n-1])<=2000)
    local j=`j'+1
    }
   local i=`i'+1
   }
  while inlist(`i',2,4,6) {
   local j=1
   sort id ts ts_ser
   while `j'<100 {
    replace notecount=notecount[_n+1] if notecount==. & notecount[_n+1]!=. & id==id[_n+1] & ///
     (ts_ser==ts_ser[_n+1] | abs(ts_ser-ts_ser[_n+1])<=2000 | ts==ts[_n+1] | abs(ts-ts[_n+1])<=2000)
    replace notecount=notecount[_n-1] if notecount==. & notecount[_n-1]!=. & id==id[_n-1] & ///
     (ts_ser==ts_ser[_n-1] | abs(ts_ser-ts_ser[_n-1])<=2000 | ts==ts[_n-1] | abs(ts-ts[_n-1])<=2000)
    local j=`j'+1
    }
   local i=`i'+1
   }
  }

 

Kurt_Bremser
Super User

How about adding a cleaning step that aligns the TS timestamps:

data temp;
input
  ID
  Action $
  TS_SER
  TS
  Verify
  Notecount
;
informat
  TS_SER
  TS
    datetime14.
;
format
  TS_SER
  TS
    datetime14.
;
datalines;
252670 Sign 13DEC17:11:05 13DEC17:11:14 . .
252670 Perform 13DEC17:11:05 13DEC17:11:14 . .
252670 Verify 13DEC17:11:05 13DEC17:11:14  1 1
252670 Sign 13DEC17:15:27 13DEC17:15:41 . .
252670 Perform 13DEC17:15:27 13DEC17:15:41 . .
252670 Verify 13DEC17:15:27 13DEC17:15:41 1 2
252670 Sign 13DEC17:16:28 13DEC17:16:30 . .
252670 Perform 13DEC17:16:28 13DEC17:16:30 . .
252670 Verify 13DEC17:16:28 13DEC17:16:30 1 3
252670 Sign 13DEC17:16:29 13DEC17:16:30 . .
252670 Modify 13DEC17:16:29 13DEC17:16:30 . .
252670 Perform 14DEC17:07:32 14DEC17:07:34 . .
252670 Verify 14DEC17:07:32 14DEC17:07:34 1 4
252670 Sign 14DEC17:07:32 14DEC17:07:34 . .
252670 Perform 14DEC17:09:15 14DEC17:09:15 . .
252670 Verify 14DEC17:09:15 14DEC17:09:15 1 5
252670 Sign 14DEC17:09:15 14DEC17:09:15 . .
252670 Perform 14DEC17:14:18 14DEC17:16:29 . .
252670 Verify 14DEC17:14:18 14DEC17:16:29 1 6
252670 Sign 14DEC17:14:18 14DEC17:14:18 . .
252670 Sign 14DEC17:14:18 14DEC17:16:29 . .
252670 Sign 14DEC17:16:13 14DEC17:16:17 . .
252670 Perform 14DEC17:16:13 14DEC17:16:17 . .
252670 Verify 14DEC17:16:13 14DEC17:16:17 1 7
252670 Modify 14DEC17:16:29 14DEC17:16:29 . .
252670 Perform 14DEC17:16:29 14DEC17:16:29 . .
252670 Sign 14DEC17:16:29 14DEC17:16:29 . .
252670 Perform 14DEC17:19:12 14DEC17:19:15 . .
252670 Verify 14DEC17:19:12 14DEC17:19:15 1 8
252670 Sign 14DEC17:19:12 14DEC17:19:15 . .
252670 Perform 15DEC17:15:27 15DEC17:15:27 . .
252670 Verify 15DEC17:15:27 15DEC17:15:27 1 9
252670 Sign 15DEC17:15:27 15DEC17:15:27 . .
252670 Sign 16DEC17:10:38 16DEC17:10:38 . .
252670 Perform 16DEC17:10:38 16DEC17:10:38 . .
252670 Verify 16DEC17:10:38 16DEC17:10:38 1 10
318672 Sign 13OCT17:20:32 14OCT17:10:11 . .
318672 Modify 13OCT17:20:36 14OCT17:10:11  . .
318672 Modify 13OCT17:20:37 14OCT17:10:11 . .
318672 Modify 13OCT17:20:40 14OCT17:10:11 . .
318672 Modify 13OCT17:20:44 14OCT17:10:11 . .
318672 Modify 13OCT17:20:46 14OCT17:10:11 . .
318672 Modify 13OCT17:23:03 14OCT17:10:11 . .
318672 Sign 13OCT17:23:03 14OCT17:10:11 . .
318672 Sign 13OCT17:23:04 13OCT17:23:04 . .
318672 Verify 14OCT17:10:11 14OCT17:10:11 1 1
318672 Sign 14OCT17:10:11 14OCT17:10:11 . .
318672 Sign 14OCT17:10:26 14OCT17:10:52 . .
318672 Perform 14OCT17:10:26 14OCT17:14:09 . .
318672 Modify 14OCT17:10:52 14OCT17:14:09 . .
318672 Sign 14OCT17:10:54 14OCT17:10:54 . .
318672 Sign 14OCT17:10:54 14OCT17:14:09 . .
318672 Modify 14OCT17:10:54 14OCT17:14:09 . .
318672 Verify 14OCT17:10:54 14OCT17:14:09 1 2
;
run;

proc sort data=temp;
by id ts;
run;

data have; /* "clean" timestamps */
set temp;
by id ts;
retain current_time;
if first.id then current_time = ts;
if ts - current_time < 120 /* 2 minutes */
then ts = current_time;
else current_time = ts;
drop current_time;
run;

proc sql;
create table want as
select
  ID, Action, TS_SER, TS, Verify,
  max(notecount) as notecount
from have
group by id, ts
order by id, ts_ser
;
quit;
AMFR
Quartz | Level 8

Thank you very much for your suggestion. I am still not getting desired output after cleaning step. I am working on better understanding the data. Sometime I feel like I know what I should I do and after a second I am confuse. I am working on other project for now because it was not urgent. I will update you once I start working on it again.

Thanks again, you are very helpful.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 861 views
  • 1 like
  • 3 in conversation