Introducing Dact 1.6

Oct 15, 2012

Tag: nlp

It’s that time again! We have just released a new version of Dact, our tool for viewing and searching Alpino treebanks. In this blog post, we will give a short description of the improvements in Dact 1.4 and 1.6. In these versions, we have primarily tweaked the user interface to make it more appealing to new users.

Treebanks are just one click away

Previously, Dact would open with a blank window, and you had to download or open a corpus via the application menu. In Dact 1.4, you are welcomed by a sheet that lets you pick a Dact-managed treebank, whether or not it is available on your computer. If the treebank is not available locally, Dact will retrieve it for you and open it.

Opening a corpus in Dact

As of version 1.6, Dact also supports treebanks with millions of sentences, such as Lassy Large. Such treebanks are typically distributed as a directory of .dact files and can be opened with the Open directory… button on the corpus sheet.

Display of semantic roles

We have also modified Dact’s tree drawing logic so that it can display inline HTML styling, such as font colors and styles. We have immediately put this functionality to good use by coloring semantic role annotations in the SoNaR treebank purple.

Semantic roles

Cookbook and Query URIs

We now provide an extensive cookbook, which contains a large repository of example queries. Quite often, you can start with one of these queries and modify it to suit your needs.

Dact now also supports dact: URIs. These URIs can be used to make clickable Dact queries on a website or in an e-mail. The cookbook makes extensive use of these dact: URIs, so that you can run a sample query with one click. Dact URIs are currently supported on Mac OS X and Linux.

Query pipelines

Dact uses XML database technology to process queries quickly. For some complex queries, the query processor fails to optimize the query correctly. For instance, suppose that the following query is slow:

//node[%obj1_drinken%]

We might know that this query only returns trees with the lemma drinken. In such cases, we can use a simpler query to make a preselection and pipe the result to the slower query. For instance:

//node[@lemma='drinken'] +|+ //node[%obj1_drinken%]

The +|+ operator can be used to combine any number of queries. However, database indexing is only used for the first query in the pipeline.
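To illustrate why a cheap preselection helps, here is a minimal Python sketch of the pipeline idea. It is not Dact's implementation: the two tiny trees, the run_pipeline helper, and the stand-in for the slow query are all made up for this example, and it uses Python's standard ElementTree module, which supports only a small subset of XPath.

```python
import xml.etree.ElementTree as ET

# Two tiny, made-up trees in an Alpino-like format (hypothetical data).
TREES = {
    "s1.xml": "<alpino_ds><node cat='top'>"
              "<node rel='hd' lemma='drinken'/>"
              "<node rel='obj1' lemma='koffie'/>"
              "</node></alpino_ds>",
    "s2.xml": "<alpino_ds><node cat='top'>"
              "<node rel='hd' lemma='eten'/>"
              "<node rel='obj1' lemma='brood'/>"
              "</node></alpino_ds>",
}


def run_pipeline(trees, queries):
    """Apply the queries left to right, keeping only trees that match.

    Returns the names of the surviving trees and, for each query,
    the number of trees it had to be evaluated on.
    """
    candidates = {name: ET.fromstring(xml) for name, xml in trees.items()}
    evaluations = []
    for query in queries:
        evaluations.append(len(candidates))
        # Only trees with at least one matching node go on to the next query.
        candidates = {
            name: root
            for name, root in candidates.items()
            if root.findall(query)
        }
    return sorted(candidates), evaluations


# Cheap preselection first, then a stand-in for the slower query.
# (ElementTree needs relative paths, hence the leading dot.)
CHEAP = ".//node[@lemma='drinken']"
SLOW_STAND_IN = ".//node[@rel='obj1']"  # assumption: not the real macro

matches, evaluations = run_pipeline(TREES, [CHEAP, SLOW_STAND_IN])
print(matches)       # trees that satisfy both queries
print(evaluations)   # how many trees each query was run on
```

In this sketch the second query only runs on the trees that survived the first one, which is the saving the pipeline is after when the first query is cheap and selective.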

Full-screen support in Mac OS X

Last, but not least, Dact finally provides full-screen support on Mac OS X. This means that you can run Dact full-screen on a separate desktop without showing the title bar or the Dock. This is especially useful on Macs with smaller screens.

Availability

Dact 1.6 is available from the Dact website.