What would this data look like (inputs, target, ids, etc)? Basically, data that I can assign to processes, or from which I can identify processes. The process mining literature usually refers to "event logs" and other event-oriented data, but practitioners usually extract and normalize their own event-log datasets from multiple, heterogeneous data sources. In several examples, the authors generate hospital event logs from the admissions, care-progression, or other data warehouses at a hospital. Another example might be process-mining a computer hardware company by aggregating event data from a defect-log system, an inventory system, and a communication system, and combining these sources into a single dataset describing the company's processes.

The benefit is that process mining can look across multiple data sources, rather than at a single one, to identify process characteristics of various kinds: bottlenecks, anomalies, or just overall organizational descriptions. No software system fully encompasses any organization's needs, so there is typically some sort of aggregation involved: taking multiple data sources and converting them into something that neatly describes processes (as events) and is consumable by common process mining methods.

Sorry, I should have clarified. I assume I'll have to do the aggregation/extraction myself, so I guess the question is where one might find organizational datasets of that scale, encompassing multiple data sources. Many companies and state agencies use broad ERP systems covering inventory, defect tracking, CRM, and so on. The difficulty is finding anything of that scale that is in the public domain and suitable for research.

You can imagine the benefit it might have for something like a large healthcare organization, whether state or private. I mention healthcare deliberately: there is so much data tracking in that field that it seems like a fruitful source of datasets. I just haven't found any yet.

Thanks for your response!