Exploring Cultural Heritage through Data
What do the icons in the dashboard mean?
At the lower end of the screen, a series of tabs offers different ways to present and visualise your search results.
The first tab, which is the main content area, resembles a typical search engine: it lists the matching documents, sorted by recency, which can include news articles, web pages or social media postings.
Clicking on one of the results previews the document in the right sidebar, including a link that opens the original document in another browser tab. For each previewed document, metadata (general source, sentiment value and location) is provided at the bottom.
The tag cloud shows the keywords associated with your search terms. Colours reflect the associations specific to a particular search term. Keywords in grey relate to several or all of the search terms. It is straightforward to extend this comparative analysis by providing additional search terms with the plus icon in the right sidebar. In this view the right sidebar gives an overview of the matching documents, the same list that is shown in the main content area of the first tab. This helps to better understand the origin of individual keywords and to quickly find the original content.
Clicking on a word in the tag cloud applies an additional filter that restricts the list of search results to only those documents that also contain the clicked word. A small icon next to the filter status message allows you to remove this restriction and again show the entire set of documents.
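The keyword filter described above can be pictured with a minimal sketch. It is not the dashboard's actual implementation: the `Document` class, the `filter_by_keyword` function and the sample data are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical search result with its extracted keywords."""
    title: str
    keywords: set[str]


def filter_by_keyword(documents, keyword):
    """Keep only the documents that also contain the clicked keyword."""
    wanted = keyword.lower()
    return [
        doc for doc in documents
        if wanted in {k.lower() for k in doc.keywords}
    ]


# Invented sample results, for illustration only.
docs = [
    Document("Museum reopens", {"museum", "heritage"}),
    Document("Festival season", {"festival", "music"}),
    Document("Heritage funding", {"heritage", "budget"}),
]

filtered = filter_by_keyword(docs, "Heritage")
print([doc.title for doc in filtered])
```

Removing the filter corresponds to going back to the full `docs` list.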
The next tab provides a keyword graph, an alternative way to visualise associated keywords and see the strongest semantic associations within the search results. Its hierarchical display summarises how each of the search terms is perceived in the surrounding debate. Providing additional search terms automatically extends the graph.
Clicking on one of the graph nodes applies a filter that restricts the list of search results in the right sidebar to the ones that also contain that keyword.
The trend chart plots the frequency of mentions for the chosen search terms over a specific timeframe. A story detection component labels each peak with the top three keywords during that time. This illustrates the evolution of major topics related to the search terms over time and allows the user to detect newly emerging topics and discussion points.
The peak labels can be clicked to apply a filter and restrict the previewed content in the right sidebar to focus only on this storyline.
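As a rough illustration of how peaks in a trend chart might be labelled, the following sketch counts mentions per day, picks days that exceed both neighbours, and labels each with its three most frequent keywords. The peak rule, the function names and the sample data are assumptions; the dashboard's story detection component may work quite differently.

```python
from collections import Counter


def daily_counts(mentions):
    """Count mentions per day; `mentions` is a list of (day, keywords) pairs."""
    return Counter(day for day, _ in mentions)


def find_peaks(counts, days):
    """Return the days whose count is strictly higher than both neighbours."""
    values = [counts.get(day, 0) for day in days]
    return [
        days[i]
        for i in range(1, len(days) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    ]


def label_peak(mentions, day, top_n=3):
    """Label a peak with the most frequent keywords on that day."""
    keyword_counts = Counter(
        kw for d, keywords in mentions if d == day for kw in keywords
    )
    return [kw for kw, _ in keyword_counts.most_common(top_n)]


# Invented sample data: one mention per document.
days = ["2024-05-01", "2024-05-02", "2024-05-03"]
mentions = [
    ("2024-05-01", ["museum"]),
    ("2024-05-02", ["museum", "exhibition", "opening"]),
    ("2024-05-02", ["museum", "exhibition"]),
    ("2024-05-02", ["museum"]),
    ("2024-05-03", ["exhibition"]),
]

counts = daily_counts(mentions)
for peak in find_peaks(counts, days):
    print(peak, label_peak(mentions, peak))
```

Clicking a peak label would then correspond to keeping only the mentions from that day that share the label's keywords.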
The story graph (a Streamgraph visualisation) is another visual method to present the emergence and evolution of distinct stories around the search term. Here, each story is a cluster of related documents, plotted around a vertically centred axis. The size of an area indicates how many documents belong to a particular story.
As in the trend chart, stories are represented by three keywords and each story can be clicked to apply an additional filter.
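To make the idea of plotting around a vertically centred axis more concrete, here is a small sketch of one common streamgraph layout, in which the stack of stories at each time step is centred on y = 0. Whether the dashboard uses exactly this baseline is an assumption, and `centred_layers` and the sample counts are hypothetical.

```python
def centred_layers(story_counts):
    """Stack stories symmetrically around y = 0 for each time step.

    `story_counts` maps a story label to a list of document counts,
    one per time step. Returns, for each story, a list of
    (lower, upper) bounds of its area.
    """
    steps = len(next(iter(story_counts.values())))
    bounds = {story: [] for story in story_counts}
    for t in range(steps):
        total = sum(counts[t] for counts in story_counts.values())
        lower = -total / 2  # centre the whole stack on the axis
        for story, counts in story_counts.items():
            upper = lower + counts[t]
            bounds[story].append((lower, upper))
            lower = upper  # the next story sits on top of this one
    return bounds


# Invented document counts for two stories over two time steps.
stories = {"a": [2, 4], "b": [2, 0]}
layers = centred_layers(stories)
print(layers)
```

The height of each band (`upper - lower`) equals the number of documents in that story at that time step, which matches the reading of area size given above.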
The story view allows you to explore the stories that are visualised in the story graph in more detail. It works similarly to familiar news aggregators, where each story has a lead article and a number of related documents, again summarised by three descriptive keywords. For each story, a rich set of metadata is extracted. This includes the origin of the story in terms of publication time and author. The impact of the story is then evaluated by analysing the temporal distribution of related publications. This analysis also helps to identify the best keywords to summarise the content of a story.
As in the main content area, clicking on any of the related documents activates the full document preview in the right sidebar.
The cluster map offers an intuitive way to group related search results by topic. By identifying similar documents, it helps to better understand the structure of online coverage and other large document collections. The cluster map arranges documents by their semantic similarity, using clustering algorithms. The colours reveal whether the documents of a particular cluster stem from just one of the search terms or from multiple queries. Each node’s size is proportional to the reach of the document’s original source (a CNN.com article, for example, is rendered larger than a report published on a local community site).
Each group of nodes is described by three keywords, and clicking on a single node reveals the underlying document in the right sidebar.
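The proportional node sizing can be illustrated with a short sketch. The linear scaling, the `node_sizes` function, the `max_size` parameter and the reach figures are all assumptions made for the example; they are not taken from the dashboard.

```python
def node_sizes(reach_by_source, max_size=40.0):
    """Scale node sizes linearly so the largest reach gets `max_size`."""
    max_reach = max(reach_by_source.values())
    return {
        source: max_size * reach / max_reach
        for source, reach in reach_by_source.items()
    }


# Hypothetical reach values, not real audience figures.
reach = {"national-news.example": 1000, "local-community.example": 250}
sizes = node_sizes(reach)
print(sizes)
```

With these invented numbers, the source with a quarter of the reach is drawn at a quarter of the size.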
The final tab shows the regional distribution of search results as a geographic map, with the ability to zoom into the map to generate a more fine-grained display. This makes it possible to inspect local coverage and to compare the coverage of topics across different areas. The size of the circles reflects the number of results that refer to a specific location.
Selecting a circle lists, in the right sidebar, the documents that reference the nearby location.
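The relationship between circles and documents on the map can be sketched as a simple grouping by location. The function names and the sample results below are hypothetical, and the sketch ignores how the dashboard resolves nearby places when zooming.

```python
from collections import Counter, defaultdict


def circle_counts(results):
    """Count results per location; the count drives the circle size."""
    return Counter(location for _, location in results)


def group_by_location(results):
    """Collect the document titles that reference each location."""
    grouped = defaultdict(list)
    for title, location in results:
        grouped[location].append(title)
    return dict(grouped)


# Invented (title, location) pairs.
results = [
    ("Opera premiere", "Vienna"),
    ("Museum night", "Vienna"),
    ("Harbour festival", "Hamburg"),
]

location_counts = circle_counts(results)
by_location = group_by_location(results)
print(location_counts["Vienna"], by_location["Vienna"])
```

Selecting the circle for a location would then correspond to looking up its entry in `by_location`.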