The Wolf’s Lair. This seems to be the meaning of Wolfenbüttel, according to some research I did on the Internet before coming to the HAB. No need to be scared – Wolfenbüttel is one of the few cities in Germany that has preserved a large number of charming historical houses. Twisted facades with nested timbers and small frozen canals provided a wonderful setting for my research visit here. Wolfenbüttel is a quiet little town where everything you need is within walking distance, which allowed me to be surprisingly punctual for all the conferences, concerts and social meetings organised at the HAB.
My research on image similarity search algorithms began with some emblem volumes preserved here at the HAB. In the past, I faced the very frustrating task of searching for similar images in 17th-century publications. At that time (almost six years ago) I was obliged to leaf through many books to find what I was looking for, and the results, I must admit, were poor and unsatisfying.
Nowadays libraries are developing multiple tools that let computers do what is usually a very boring and time-consuming job for scholars. What better occasion to return to my old problem with images? This time I wanted to understand how computers »see« them, and how the »magic« of image similarity search happens. My studies in »Computing in the Humanities« at the University of Bamberg enabled me to turn a new page and look at the other side of the problem, while the MWW scholarship in Digital Humanities allowed me to concentrate fully on my research and gave me the chance to deepen my IT skills.
Once at the HAB, I had the opportunity to meet librarians and visiting scholars with enviable expertise on emblems. I discovered that emblems are incredibly complex – a composition of text and graphic elements capable of expressing elaborate ideas and revealing hidden meaning. I also discovered that if I wanted to do something on images I had to start examining the emblem’s pictura and not emblems tout court. Yes, it is always better to be exact when it comes to terminology!
Therefore, I essentially focused on the emblem’s pictura and I largely used the digitised emblems dataset of the Emblematica online project.
After clarifying the goal of my research project, I was able to dive into my research question: How does a computer search for similar objects in images? Suppose a scholar wants to examine emblems with a certain topic, let’s say, »love«. She/he will probably start looking for Cupido (the iconographic expression of love). This is what the problem looks like through the »eyes« of a machine:
A Cupido blurred and down-sampled
Computers (and let us not forget that this behaviour was inspired by how the human brain processes images) need to blur and downsize images to capture their essential structure. They also need further algorithms, such as the difference of Gaussians, to choose which areas of the image describe it best and can serve as key points or interest points. Once this process is completed, what we obtain is an image turned into a series of numbers called descriptors.
Interest areas or key points detected on Cupido
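To make the blurring, down-sampling and difference-of-Gaussians steps more tangible, here is a minimal sketch in plain Python. It is only an illustration on a toy image, not the pipeline used in this project: the function names, the blur strength `sigma=1.0` and the ratio `k=1.6` are all assumptions chosen for the example, and a real detector would search across several scales and filter out weak or unstable points.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel covering about 3 sigma per side."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Blur every row with the kernel, repeating edge pixels at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    blurred = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(blurred), kernel))

def downsample(image):
    """Halve the resolution by keeping every second pixel."""
    return [row[::2] for row in image[::2]]

def difference_of_gaussians(image, sigma=1.0, k=1.6):
    """Subtract a mildly blurred copy from a more strongly blurred copy."""
    fine = gaussian_blur(image, sigma)
    coarse = gaussian_blur(image, sigma * k)
    return [[c - f for c, f in zip(row_c, row_f)]
            for row_c, row_f in zip(coarse, fine)]

def strongest_response(dog):
    """Return (row, col) of the pixel with the largest absolute response."""
    best = max((abs(v), r, c)
               for r, row in enumerate(dog)
               for c, v in enumerate(row))
    return best[1], best[2]

# Toy 9x9 greyscale image: one bright dot on a dark background.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0

small = downsample(gaussian_blur(image, 1.0))   # blurred, half-size copy
dog = difference_of_gaussians(image)
print(len(small), "x", len(small[0]))
print(strongest_response(dog))                  # candidate key point
```

On this toy image the strongest response falls on the bright dot itself, which is the kind of blob-like spot such a detector proposes as a key point.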
With this data it is already possible to perform some research on image similarity. Below one can see how descriptor matching works.
The top 20 matches between key points. This allows a computer to recognise an object in an image.
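Descriptor matching of this kind can be sketched as a brute-force nearest-neighbour search. The example below is an assumption-laden illustration, not the project's actual code: the helper `best_matches`, the Euclidean distance and the tiny three-dimensional descriptors are all hypothetical (real descriptors usually have many more dimensions).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matches(query_desc, target_desc, top_k=20):
    """For each query descriptor find its nearest target descriptor,
    then keep the top_k pairs with the smallest distance."""
    matches = []
    for qi, q in enumerate(query_desc):
        ti, dist = min(
            ((ti, euclidean(q, t)) for ti, t in enumerate(target_desc)),
            key=lambda pair: pair[1],
        )
        matches.append((dist, qi, ti))
    matches.sort()          # smallest distance first
    return matches[:top_k]

# Hypothetical descriptors for a query object and for an emblem's pictura.
query = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
target = [[1.0, 0.1, 0.0], [0.0, 0.9, 0.1], [0.9, 0.9, 0.9]]

for dist, qi, ti in best_matches(query, target, top_k=2):
    print(f"query key point {qi} -> target key point {ti} (distance {dist:.3f})")
```

Keeping only the closest pairs is what makes it possible to show the best few matches: the smaller the distance, the more alike the two image regions are according to their descriptors.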
What else? Since descriptors tend to produce a large amount of data, I applied a further layer called a “Bag of Visual Words” (BOVW) to transform descriptors into a single vector, which is easier to manage and produces better results. To intuitively understand what a BOVW does, we can take its name literally.
Cupido and his Bag of Visual Words
From each image, it is possible to extract image patches (our “visual words”) and their associated feature vectors. We count the times each visual word appears in the image and a histogram is generated (our bag of “words”). The similarity between two images is now based on their histograms.
Cupido's histogram as a query object, compared with the histogram of the related emblem's pictura
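The counting and comparing steps can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the vocabulary of visual words is written by hand here (in practice it is usually learned by clustering descriptors from many images, for example with k-means), and cosine similarity is just one possible way to compare two histograms. All names and numbers are hypothetical.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bovw_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs, normalised to sum to 1."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def cosine_similarity(h1, h2):
    """1.0 means identical word proportions, 0.0 means no shared words."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)

# Hand-made vocabulary of three visual words (an assumption for this sketch).
vocabulary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Hypothetical descriptors for the query object and two candidate images.
cupido = [[0.1, 0.0], [0.9, 0.1], [1.0, 0.0], [0.1, 0.9]]
emblem = [[0.9, 0.0], [1.0, 0.2], [0.0, 1.0], [0.2, 0.1]]
unrelated = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]]

h_cupido = bovw_histogram(cupido, vocabulary)
h_emblem = bovw_histogram(emblem, vocabulary)
h_unrelated = bovw_histogram(unrelated, vocabulary)

print(h_cupido, h_emblem, h_unrelated)
print(cosine_similarity(h_cupido, h_emblem))
print(cosine_similarity(h_cupido, h_unrelated))
```

Each image is reduced to one short vector, so comparing two images becomes a single vector comparison instead of matching every descriptor against every other.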
And now? There are many other algorithms able to perform image similarity searches. Recent studies combine feature descriptors with BOVW and a Support Vector Machine (a form of supervised learning). Other scholars are exploring the performance of a CNN (Convolutional Neural Network) on images. Most of the studies focus on photos of real-world objects, which are very interesting for specific commercial purposes (e.g. recognising and classifying clothes, book covers, furniture etc.) or in the field of robotics. However, I think that the complex and variegated world of hand-printed images will always be an interesting and challenging test bench for these algorithms.