BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Kiwi_Tim
Fluorite | Level 6

I am a fairly inexperienced Sas Viya Visual Analytics user and I am trying to work out how to select a specific subset of my data.

I have a data table that includes Events, Event ID Numbers, Event Statuses, Tasks, Task ID Numbers, and Task Completion Statuses.

An Event may have no or many Tasks, but each Task can only relate to one Event. So the relationship between tasks and events is a 'many to one relationship'.

 

An Event can have one of 10 Event Statuses, which includes the status of 'Complete'.

A Task can be in one of 3 Task Completion Statuses: 'Complete', 'Incomplete', or 'missing' (no value entered).

 

What I wish to achieve is to select all Events that are not 'Complete' where none have an associated Task with a Task Completion Status of 'missing'.

 

I am from a geographic information systems background, where I primarily used ArcGIS. This would be very easy to achieve in ArcGIS using relational databases. You would set up an 'Events' table that includes a unique identifier (Event ID Number) and includes all the event attributes. You would then create a 'Tasks' table using a unique identifier (Task ID Number), and all the task attributes. Included in the Task table is the 'Event ID Number' which identifies which event a task is related to. You then make a relationship between the two tables using 'Event ID Number'. Any tasks selected will then select the related events in the Events Table.

 

So to achieve the selection I want, I would select all tasks in the tasks table that have a Task Completion Status of 'missing'. I then use this selection to select all the events in the Events Table that are related to the selected Tasks. I then invert the selection in the Events Table, so that I have now got a selection of events that have no tasks where the Task Completion Status is 'missing'. I now select from that selection all Events where the Event status is not 'Completed'. I now have all the event data selected that I am interested in.

 

As SAS VA does not uses relational tables, and instead has all the data in one table, I have not been able to subset the data using the above method.

 

I am guessing that I need to make a 'New Data Item', 'Calculated Item' (text not numerical), where I can apply a new label which has 2 states: 'Target' and 'Not Target'.

I tried using a Boolean IF...ELSE Operator along with an AND Operator, which looked like this:

 

IF ( ( 'Event Status'n NotIn ('Completion') ) AND ( 'Task Completion Status'n NotMissing ) )
RETURN 'Target'
ELSE 'Not Target'

 

This isolates all rows where Event status is not 'Completion' and where Task Completion Status is not 'missing, and labels them Target, and labels all other rows with Not Target, but I still end with Events that include Tasks that have a 'missing' Task Completion Status in my 'Target' group.

 

Any advice would be greatly appreciated.

 

Thanks,

Tim

 

1 ACCEPTED SOLUTION

Accepted Solutions
KeithM
SAS Employee

Expanding on Hunter's thoughts, I was able to get this to work.  I will attach the data and XML for the test report for your review.  Here is a list table in the report showing the results:

KeithM_0-1664373227600.png

Here is how it can be easily done.

1. From the original data on the Data tab (icon next to the source table), select "New data from aggregation"

2. For "Selected Items" ONLY add Event ID.  Then select the filter on the right pane and filter on "Active Task"

KeithM_1-1664373656150.png

3. Go back to the Data pane and select the original data.  On the icon next to the table, select "New Data from Join..."

Perform a "Full Join" on the aggregated table you created in step #2 on "Event ID" join condition

KeithM_2-1664373923614.png

4. On the Choose Columns button, Select all the data items from the source and ONLY the Event ID from the aggregated table.

KeithM_3-1664374035263.png

 

5. The Data_Join table will be the table you report on.  Last step is to create a calculated item for Event ID Filter.

Create a new calculated item.  If the Event ID is not equal to the Event ID on aggregated table then return Event ID.  The expression looks like the following:

KeithM_4-1664374322564.png

6. All you have to do now is use the Join table to create the list table

KeithM_5-1664374452752.png

 

Notes:  Attached is the Excel spreadsheet used and the XML for the report.  To implement this example do the following:

A. Import in the Excel table.

B. On a new report, select "CTRL + ALT + b" to view the xml.  Replace the current XML with the attached XML and select the LOAD button.

C.  The report will complain that the data is not there.  Ignore that.  Select the Data pane.  Select the icon next to the source data and select "Change...".  Select your imported table and the report should display.

 

KeithM_6-1664374716308.png

 

 

 

 

View solution in original post

14 REPLIES 14
MarkusWeick
Barite | Level 11

Hi @Kiwi_Tim,

principially your approach looks sensible to me. So the problem may be in the details. Could you please share a screenshot of the actual definition of the actual calculated item?

Please keep the community friendly.
Like posts you agree with or like. Mark helpful answers as “accepted solutions”. Generally have a look at https://communities.sas.com/t5/Getting-Started/tkb-p/community_articles
MarkusWeick
Barite | Level 11

Hi @Kiwi_Tim, on second thought. Maybe it's sufficient to select your subset through the filters of the VA objects.

Please keep the community friendly.
Like posts you agree with or like. Mark helpful answers as “accepted solutions”. Generally have a look at https://communities.sas.com/t5/Getting-Started/tkb-p/community_articles
Kiwi_Tim
Fluorite | Level 6

I used

IF ( ( 'EVENT_STATUS'n NotIn ('Completed') ) AND (
'TASK_COMPLETION_STATUS'n Missing ) )
RETURN 'Idle'
ELSE ' '

Which then types 'Idle' in every row where the above statement is true. This identifies rows in the table that do not have a Event Status of 'Completed' and have a Task Completion Status that is missing.

Each row in the table represents a  Task, so there are sometimes multiple rows for one event. 

There can be 'Idle' tasks and tasks without 'Idle' that belong to the same Event ID. So I wish to identify every Event ID if an event includes 1 or more tasks where the there is an 'Idle'.

 

I can't filter using the data attribute created by the above formula because that will only filter at the 'Task' level on the table. There will be tasks with no 'Idle' label that are retained after filtering and they may share an Event ID with other tasks that are filtered out. This is not what I want.

 

I want to filter at the Event level if one or more tasks belonging to the same event has the 'Idle' label. I'm part way there with the above formula. Any ideas will be gratefully received 🙂 

 

SASKiwi
PROC Star

I'd suggest that this type of calculation would be more easily implemented in the data preparation phase before loading your data into VA. SQL can easily add up the number of idle tasks per event, add that to each row of your VA load table then you can filter on idle task counts > 0. Without knowing what your input data looks like a query similar to this should work:

proc sql;
  create table want as
  select A.* 
        ,B.Idle_Task_Count
  from have as A
  left join
  (select  EventID
          ,count(TaskID) as Idle_Task_Count
   from tasks
   where TASK_COMPLETION_STATUS = ''
   group by EventID
  ) as B
  on A.EventID = B.EventID
  ;
quit; 

 

Kiwi_Tim
Fluorite | Level 6

Thanks SASKiwi,

That is a very good idea, but alas, the organisation I work for has limited the SAS service options for our group, so all I have at my disposal is report creation and the options available therein. I will continue to play with a Calculated Item.

 

 

SASKiwi
PROC Star

So there is no possibility of requesting a data enhancement from the team responsible for loading the data?! Failing that I suggest you look at Aggregated Measures which I think should be able to produce the counts you need. Unfortunately I don't have your version of VA so I can't really guide you further. Opening a Tech Support track might be your best option. 

Kiwi_Tim
Fluorite | Level 6

Hi again,

I was hoping an If Then nested statement might do it. What I have now is the following:

Kiwi_Tim_0-1663811745112.png

You can see there is one Event ID here with 7 separate tasks. I have now identified Tasks that are Active. If an Event does not include  an Active Task, I wish to write its Event ID in the column 'Event ID Filter', otherwise I want it flagged as missing

 

I used the following formula to create the Event ID Filter data : 

IF ( 'Task Activity'n NotIn ('Active Task') )

RETURN 'Event ID'n

ELSE .

 

This doesn't give me what I want, as in my screenshot above no values should be written in this instance as there is one or more 'Task Activity' with a value of 'Active Task'.

 

I would expect this kind of analysis to be very common, as I used to do this kind of thing frequently with relational databases in GIS.

 

As I'm a novice with SAS I'm still searching for a workable solution. 

MarkusWeick
Barite | Level 11

Hi @Kiwi_Tim,

please have a try with the following three step approach to identify all Events, that are not 'Complete' and where none have an associated Task with a Task Completion Status of 'missing':

  1. define a calculated item that is 1 if the Event Status is complete or the Task Completion Status is 'missing', and 0 else.
  2. define a second calculated item as aggregated sum of the first calulated item.
  3. create a list object with two columns „event ID“ and the second calulated item.

If the second calculated item is 0, the top criteria is fulfilled, 1 else.

 

Best

Markus

Please keep the community friendly.
Like posts you agree with or like. Mark helpful answers as “accepted solutions”. Generally have a look at https://communities.sas.com/t5/Getting-Started/tkb-p/community_articles
Kiwi_Tim
Fluorite | Level 6

Thanks Bruehl,

I tried your suggestion, but it is not working as I want. Thanks everybody for responding. 

HunterT_SAS
SAS Employee

Assuming I haven't misunderstood what you are trying to do (and there is a decent chance I have), I don't think this is going to work as long as you have to have both Event ID and Task ID in the same list table. With both columns present, there is no way to aggregate the way you're wanting, at least that I can find.

 

Borrowing on Bruehl's suggestion, I made up some sample data that looks like this:

HunterT_SAS_0-1664223521549.png

So let us say you want to be able to filter to Event ID 2 because it has zero active tasks. I then created a calculated item to say if Task Activity = Active Task, return 1, else return 0:

HunterT_SAS_1-1664223670287.png

 

Then I added a new List Table with just Event ID. I then added a local filter to this List Table like this:

HunterT_SAS_2-1664223730814.png

where I'm filtering on my new Task Activity Number = 0 AND it's filtering on aggregated values. This gives me a List Table that is filtered to just Event IDs that have zero active tasks:

HunterT_SAS_3-1664223787931.png

I believe this is more or less what Bruehl was getting at. 

I could then use this list table and create an action to another list table that has Task Details for example, so that clicking a row in my Event IDs table filters my Task table. Again once you add Task ID to this first Event table, the whole thing breaks but if you keep them in separate table objects, it might be doable.

HunterT_SAS_5-1664224348493.png

 

Another option I have not fully explored myself yet is making use of Aggregated Datasources to potentially create the relational tables you mentioned earlier. Assuming all of the relevant columns exist in the single table you currently have you, you could go to the Data Options menu and select "New data from aggregation of <table name>":

HunterT_SAS_4-1664224045874.png

 

 

Then you could pick only the event columns. Create table.

Repeat again and pick only the task columns plus the event ID column. Create that table.

Then maybe you could join these together in the way you're wanting, or work with them separately in separate report objects? 

One downside to this approach, there is a performance hit when creating tables like this. Each time you open the report this way, these tables are generated in the user's personal CAS library so there might be a delay opening the report each time, and these tables take up space on the CAS server too that you may or may not have.

I haven't fully thought this approach out yet though so this may not be feasible either, but just something to possibly explore.

 





 

 



MarkusWeick
Barite | Level 11

@HunterT_SAS,

I like that you are not giving up!

Please keep the community friendly.
Like posts you agree with or like. Mark helpful answers as “accepted solutions”. Generally have a look at https://communities.sas.com/t5/Getting-Started/tkb-p/community_articles
KeithM
SAS Employee

Expanding on Hunter's thoughts, I was able to get this to work.  I will attach the data and XML for the test report for your review.  Here is a list table in the report showing the results:

KeithM_0-1664373227600.png

Here is how it can be easily done.

1. From the original data on the Data tab (icon next to the source table), select "New data from aggregation"

2. For "Selected Items" ONLY add Event ID.  Then select the filter on the right pane and filter on "Active Task"

KeithM_1-1664373656150.png

3. Go back to the Data pane and select the original data.  On the icon next to the table, select "New Data from Join..."

Perform a "Full Join" on the aggregated table you created in step #2 on "Event ID" join condition

KeithM_2-1664373923614.png

4. On the Choose Columns button, Select all the data items from the source and ONLY the Event ID from the aggregated table.

KeithM_3-1664374035263.png

 

5. The Data_Join table will be the table you report on.  Last step is to create a calculated item for Event ID Filter.

Create a new calculated item.  If the Event ID is not equal to the Event ID on aggregated table then return Event ID.  The expression looks like the following:

KeithM_4-1664374322564.png

6. All you have to do now is use the Join table to create the list table

KeithM_5-1664374452752.png

 

Notes:  Attached is the Excel spreadsheet used and the XML for the report.  To implement this example do the following:

A. Import in the Excel table.

B. On a new report, select "CTRL + ALT + b" to view the xml.  Replace the current XML with the attached XML and select the LOAD button.

C.  The report will complain that the data is not there.  Ignore that.  Select the Data pane.  Select the icon next to the source data and select "Change...".  Select your imported table and the report should display.

 

KeithM_6-1664374716308.png

 

 

 

 

Kiwi_Tim
Fluorite | Level 6
Thanks to all those that assist with finding this solution. I had put this particular project aside but have just now comprehended the solution, which works well.
Sam_SAS
SAS Employee

Thanks for following up with us! It should help others in the future 😊

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Tips for filtering data sources in SAS Visual Analytics

See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 14 replies
  • 2752 views
  • 4 likes
  • 6 in conversation