saunvida
Calcite | Level 5

Hi everyone,

 

I'm working on SAS ESP Studio on SAS Viya 4 to create an online model using the Train window, and I need some guidance on handling streaming data.

I have location data (latitude, longitude) for several vehicles, which is read into ESP Studio a few rows at a time from a flat CSV file. I've assigned an opcode to each event: the first event for each vehicle has the opcode 'I' (insert), and subsequent events for that vehicle have 'U' (update). The goal is to maintain only the latest location ping for each vehicle at any given time. Here is a snapshot of the CSV file:

 

0001.png


The Source window handles the events correctly: it maintains a single record per vehicle by inserting the first event and then updating/deleting the previous record as new data comes in. The following snapshot shows the Source window results, filtered for vehicle 1:

 

0003.png
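
To make the intended behaviour concrete, here is a minimal plain-Python sketch of the keyed state I expect the Source window to hold. This is not ESP code, and I can't say how ESP implements it internally. It assumes the vehicle ID is the key; the `apply_events` helper, the tuple layout, and the coordinates are all made up for illustration.

```python
from typing import Dict, List, Tuple

# (opcode, vehicle_id, latitude, longitude); layout is an assumption
Event = Tuple[str, int, float, float]


def apply_events(events: List[Event]) -> Dict[int, Tuple[float, float]]:
    """Apply insert/update events to keyed state: one row per vehicle."""
    state: Dict[int, Tuple[float, float]] = {}
    for opcode, vehicle_id, lat, lon in events:
        if opcode == "I":
            if vehicle_id in state:
                raise ValueError(f"duplicate insert for vehicle {vehicle_id}")
            state[vehicle_id] = (lat, lon)
        elif opcode == "U":
            if vehicle_id not in state:
                raise KeyError(f"update before insert for vehicle {vehicle_id}")
            # The update replaces the previous ping for this vehicle.
            state[vehicle_id] = (lat, lon)
        else:
            raise ValueError(f"unsupported opcode {opcode!r}")
    return state


# Made-up sample events, not the real CSV contents.
events: List[Event] = [
    ("I", 1, 35.0, -78.0),
    ("I", 2, 36.0, -79.0),
    ("U", 1, 35.1, -78.1),
    ("U", 1, 35.2, -78.2),
]
latest = apply_events(events)
print(latest)
```

After all four events, the state holds exactly one row per vehicle, with vehicle 1 at its most recent position.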

 

As shown, the opcode of the latest record for vehicle 1 (row_key 2) changes to 'UB' (Update Block). My aim is to train the online model periodically, on the fly, using only the most recent event for each vehicle. However, when I pass the output from the Copy window to the Train window, I get an error stating that the input window for the Train window must produce only inserts.

 

0002.png

 

To work around this, I tried using a Remove State window to reassign the Insert opcode to both Insert and Update events. However, this produces a list of all events for a vehicle rather than just the most recent one.

Results of RemoveState window filtered for vehicle 1:

 

0004.png
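
The difference between what I'm getting and what I want can be sketched in plain Python. Again, this is not ESP code: both helper functions, the tuple layout, and the sample coordinates are hypothetical. The sketch only illustrates the two outcomes, not how any ESP window works internally.

```python
from typing import Dict, List, Tuple

# (opcode, vehicle_id, latitude, longitude); layout is an assumption
Event = Tuple[str, int, float, float]


def all_events_as_inserts(events: List[Event]) -> List[Event]:
    """What I observe: every event is relabelled 'I', so history piles up."""
    return [("I", vid, lat, lon) for _op, vid, lat, lon in events]


def latest_as_inserts(events: List[Event]) -> List[Event]:
    """What I want: only the newest ping per vehicle, emitted as inserts."""
    latest: Dict[int, Tuple[float, float]] = {}
    for _op, vid, lat, lon in events:
        latest[vid] = (lat, lon)  # later events overwrite earlier ones
    return [("I", vid, lat, lon) for vid, (lat, lon) in latest.items()]


# Made-up sample events, not the real CSV contents.
events: List[Event] = [
    ("I", 1, 35.0, -78.0),
    ("I", 2, 36.0, -79.0),
    ("U", 1, 35.1, -78.1),
    ("U", 1, 35.2, -78.2),
]

history = all_events_as_inserts(events)
snapshot = latest_as_inserts(events)
print(len(history), len(snapshot))
```

Here `history` contains three insert rows for vehicle 1, while `snapshot` contains one insert row per vehicle. The second shape is what I would like the Train window to receive each time it trains.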

 

How can I ensure that the Train window only receives the most recent event for each vehicle, so that my online model is trained correctly?

Any advice or suggestions would be greatly appreciated!

3 REPLIES
jbhattacharya
SAS Employee

Hi,

Can you please specify which online algorithm you are using?

 

jbhattacharya
SAS Employee

OK, I am not an analytics expert, but does the K-Means algorithm take into account a change to data it has already seen? Won't it treat the updated data as new data?

Could you explain your requirement in more detail? That may help me suggest a better approach. If you want, you can set up a call with me. My email address is joydeep.bhattacharya@sas.com
