Faceting on Multiple Entity Types



The Problem



One of the design requirements was that users be able to search and see results for multiple entity types simultaneously. I had also received feedback that users wanted to see the data before querying it. This posed two challenges: moving the initial filtering step to a later point in the user flow, and designing a faceting system that could accommodate hundreds or possibly thousands of facets. The existing faceting system was designed to accommodate the facets for a single entity type only, and would become unusable if too many entity types were selected.



The Solution



My solution was a multi-tiered faceting system that allowed the user to manage filters across multiple entity types. A universal search, as well as an entity-specific search, gave the user more granular control over their query. The facets also doubled as a legend for the graph, since many different entity types could be displayed at a given time.
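To make the tiering concrete, here is a minimal sketch in Python of how facets might be grouped by entity type and how a universal search could combine with entity-specific filters. The record shape, field names, and the functions build_facets and search are all hypothetical and only illustrate the idea; they are not the shipped implementation.

```python
from collections import defaultdict

# Hypothetical records: each has an entity type plus a few attributes.
records = [
    {"type": "Person", "name": "Ada", "city": "London"},
    {"type": "Person", "name": "Alan", "city": "London"},
    {"type": "Company", "name": "Acme", "city": "Paris"},
]

def build_facets(records):
    """Tier 1 is the entity type; tier 2 is field -> value -> record count."""
    facets = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for rec in records:
        for field, value in rec.items():
            if field != "type":
                facets[rec["type"]][field][value] += 1
    return facets

def search(records, universal=None, per_type=None):
    """Apply a universal text query, then any entity-specific filters."""
    per_type = per_type or {}
    results = []
    for rec in records:
        if universal is not None:
            # The universal search matches against every value in the record.
            text = " ".join(str(v) for v in rec.values()).lower()
            if universal.lower() not in text:
                continue
        # Entity-specific filters only apply to records of that type.
        filters = per_type.get(rec["type"], {})
        if all(rec.get(f) == v for f, v in filters.items()):
            results.append(rec)
    return results

facets = build_facets(records)
matches = search(records, universal="london", per_type={"Person": {"name": "Ada"}})
```

Nesting the counts under the entity type is what would let the interface collapse each type's facets into its own tier, so the user only sees the facets for the types they open.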



Large Data Sets



The Problem



The biggest challenge was deciding how to show results that could include millions of records. Previous research showed that some have attempted to solve this with pagination, but that presents a disorienting experience for the user: each page essentially displays a different graph, which defeats the purpose of showing connectivity. We also did not want the graph to get so dense that the user could not make sense of it. Additionally, there were concerns about performance.



The Solution



To reduce clutter in the graph, I developed the concepts of 'base' and 'related' entities. Records could then be clustered into groups by entity type, keeping the graph readable while still displaying relevant information. The more records within a group, the larger its node would appear on the graph. These groups could be expanded if the user was interested in investigating further.
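The grouping idea can be sketched roughly as follows in Python. This is an illustration under assumptions, not the actual implementation: the record shape, the functions cluster_nodes and expand, and the square-root sizing rule (chosen here so node area grows roughly in proportion to the record count) are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical sample data: nine Person records and one Company record.
sample_records = [{"type": "Person", "id": i} for i in range(9)]
sample_records.append({"type": "Company", "id": 100})

def cluster_nodes(records, min_radius=8.0, scale=4.0):
    """Collapse records into one node per entity type.

    The radius grows with the square root of the group size (an assumed
    sizing rule), so larger groups draw as larger nodes.
    """
    counts = Counter(rec["type"] for rec in records)
    return [
        {"type": t, "count": n, "radius": min_radius + scale * math.sqrt(n)}
        for t, n in counts.most_common()
    ]

def expand(records, entity_type):
    """Return the individual records behind one cluster node."""
    return [rec for rec in records if rec["type"] == entity_type]

nodes = cluster_nodes(sample_records)
people = expand(sample_records, "Person")
```

With this sample, the Person group collapses to a single node that draws larger than the Company node, and expanding it returns the nine underlying records.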


Because the most likely use case was a user looking for a specific record, we did not need to show millions of records at a time. That much information would make the graph impossible to make sense of. Showing a meaningful sample of the 1,000 most connected records would be sufficient for users who just wanted to explore the dataset, while users looking for something specific would continue narrowing their search with the facets until they found it.
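One way to pick such a sample is to rank records by how many connections they have and keep the top of the list. The Python sketch below assumes that "most connected" means highest degree in an undirected edge list; the function most_connected and the edge format are hypothetical.

```python
import heapq
from collections import Counter

def most_connected(edges, limit=1000):
    """Return up to `limit` record ids, ordered by number of connections.

    `edges` is an iterable of (record_a, record_b) pairs. Records that
    appear in no edge are never returned, since they have no connections.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # nlargest avoids sorting the whole dataset when limit is small.
    return heapq.nlargest(limit, degree, key=degree.get)

# Hypothetical edges: "a" has three connections, "b" and "c" two, "d" one.
example_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
top_two = most_connected(example_edges, limit=2)
```

Applying a facet would simply shrink the edge list before this step, so the sample stays meaningful as the user narrows the search.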