Implementing Algorithms to Extract Features of Manuscripts

Swati (Karlsruhe): Today I implemented some algorithms for text segmentation of the images, and now I am facing some challenges in extracting features such as text area, text width, page width and page height from the segmented images. It is so much fun to work with all the others on the team (Hannah, Philipp, Danah and Celia), and talking to them during the weekly telco eases my stress level.

We are the best team and are working very well together to make our project “eCodicology” a big success.
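To give an idea of what these features look like in practice, here is a minimal sketch in Python. It is not the project's actual code. It assumes that segmentation has already produced a binary mask in which 1 marks a text pixel, and it takes "text area" to mean the area of the bounding box around all text pixels. The names `LayoutFeatures` and `measure_layout` are made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LayoutFeatures:
    page_width: int   # pixels
    page_height: int  # pixels
    text_width: int   # width of the bounding box around all text pixels
    text_height: int  # height of that bounding box
    text_area: int    # text_width * text_height (an assumed definition)


def measure_layout(mask: List[List[int]]) -> Optional[LayoutFeatures]:
    """Measure page and text-block size from a binary segmentation mask.

    mask[y][x] is 1 for a pixel classified as text and 0 otherwise.
    Returns None if the mask is empty or contains no text pixels.
    """
    if not mask or not mask[0]:
        return None
    page_height = len(mask)
    page_width = len(mask[0])

    # Rows and columns that contain at least one text pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(page_width) if any(row[x] for row in mask)]
    if not rows:
        return None

    text_width = cols[-1] - cols[0] + 1
    text_height = rows[-1] - rows[0] + 1
    return LayoutFeatures(
        page_width, page_height, text_width, text_height,
        text_width * text_height,
    )


# Toy "page" of 6 x 5 pixels with a small text block in the middle.
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
features = measure_layout(toy_mask)
print(features)
```

On real scans the hard part is of course the segmentation itself, plus converting pixels to millimetres, which needs the scan resolution.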

 

Today’s Telephone Conference

Celia (Darmstadt): Nearly every Tuesday, our project members in Trier, Darmstadt and Karlsruhe meet in a telephone conference (telco) via Skype. Every time, it gives us a fresh look at the news and progress of each work package. By lucky chance, our conference fell on this year’s Day of DH. Today it was my turn to chair the telco, and we had a special guest, Jochen Graf from Cologne, who answered some questions about the annotation of images.

I take this opportunity to express a few thoughts on computer-aided collaboration. Regular virtual collaboration via email, telephony software or web conference services is something I only got to know when I started to work for DH projects. For many of us, it is a daily experience we have got used to over the years. Without services like Skype or DFNVC, many tasks in our routine work could hardly be managed. Especially in the global world of Digital Humanities, where completely diverse disciplines and communities are connected across national borders, digital teamwork has become an integral part of our work. How much times have changed since the days of low-speed dial-up internet!

 

Open Annotation Data Model and TEI

Philipp (Trier): During our telephone conference I got to know the Open Annotation Data Model, because our guest from the University of Cologne, who is working on a tool for the semantic annotation of images, asked us about it. For our metadata about medieval books we use a MySQL database and XML according to the TEI. Much of this information refers to the codex as a whole and not to single pages or parts of pages. But the software that we are developing in our project should recognize defined layout features on digital images, and it might be interesting to store these results as annotations on the image as well as in the metadata in our XML files. The question arises whether it is possible to map these two approaches without loss.
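For the simplest case, a rectangular region on a page image, the mapping question can be tried out in a few lines of Python. This is only a sketch under assumptions: the annotation is a plain dictionary whose keys follow our reading of the Open Annotation draft (a fragment selector with a media-fragment value of the form `xywh=x,y,w,h`), the TEI side is a `zone` element with `ulx`, `uly`, `lrx` and `lry` coordinates, and the image URI, the label `textblock` and all function names are invented for the example. The exact property names should be checked against the specification before use.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


def to_annotation(image_uri: str, box: Box, label: str) -> Dict:
    """Describe a region as an Open-Annotation-style dictionary (assumed shape)."""
    x, y, w, h = box
    return {
        "@type": "oa:Annotation",
        "hasBody": {"@type": "cnt:ContentAsText", "chars": label},
        "hasTarget": {
            "@type": "oa:SpecificResource",
            "hasSource": image_uri,
            "hasSelector": {
                "@type": "oa:FragmentSelector",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


def to_tei_zone(box: Box, label: str) -> ET.Element:
    """Describe the same region as a TEI zone (upper-left / lower-right corners)."""
    x, y, w, h = box
    return ET.Element("zone", {
        "ulx": str(x), "uly": str(y),
        "lrx": str(x + w), "lry": str(y + h),
        "type": label,
    })


def box_from_annotation(annotation: Dict) -> Box:
    """Read the rectangle back from the fragment selector value."""
    value = annotation["hasTarget"]["hasSelector"]["value"]
    x, y, w, h = (int(n) for n in value.split("=", 1)[1].split(","))
    return (x, y, w, h)


def box_from_tei_zone(zone: ET.Element) -> Box:
    """Read the rectangle back from the zone's corner coordinates."""
    ulx, uly = int(zone.get("ulx")), int(zone.get("uly"))
    lrx, lry = int(zone.get("lrx")), int(zone.get("lry"))
    return (ulx, uly, lrx - ulx, lry - uly)


# Hypothetical text block on a hypothetical page image.
box = (120, 200, 800, 1100)
annotation = to_annotation("http://example.org/images/page-0001.jpg", box, "textblock")
zone = to_tei_zone(box, "textblock")

print(annotation["hasTarget"]["hasSelector"]["value"])
print(ET.tostring(zone, encoding="unicode"))
```

In this toy case the rectangle survives the trip in both directions. That says nothing yet about the harder part of the question, namely metadata that describes the whole codex rather than a region on one page.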

 

Scanning Fragments and Redesigning the Homepage

Philipp (Trier): This morning I had a meeting with my colleague who works at the Stadtbibliothek/Stadtarchiv Trier. We talked about the possibility of scanning fragments that have been separated from the codices. They should be measured automatically, and the computer should suggest which codex each fragment once belonged to, because this information is missing in some cases.

Afterwards, we had a meeting with our web designer. He showed us his first drafts for the new homepage of the “Virtuelles Skriptorium St. Matthias”. The new design should give the page a fresh, modern look, and we like his drafts very much. The search for codices and contents will become the central function presented on the page, and it should also become more comfortable to use.

 

What is eCodicology?

…started in May 2013 as a follow-up project of the DFG-funded project Virtuelles Skriptorium St. Matthias. In the Virtual Scriptorium, medieval manuscripts of the Benedictine Abbey of St. Matthias in Trier were digitised and brought together. These digital copies are further utilised in eCodicology with the aim of obtaining new codicological data from the existing digital images of medieval manuscripts.
eCodicology is a BMBF-funded joint research project of the Technical University of Darmstadt, the Karlsruhe Institute of Technology and the University of Trier. The main purpose of the project is the development, testing and optimization of new algorithms for the identification of macro- and micro-structural layout elements of the manuscript pages in order to enrich their metadata in XML format according to the TEI P5 standard. The already existing manuscript descriptions can thus be enriched automatically.
The project uses various image processing and feature extraction techniques that make it possible to detect and extract layout features of digitized manuscript pages. As a result, humanities scholars can analyze and find new, hidden relationships in 170,000 pages of medieval manuscripts.
eCodicology goes beyond the established standards of the virtual reconstruction of historic libraries, which aim at the reunion, textual preparation and presentation of the collection. The project also aspires to reusability in future projects. The algorithms proven on the manuscripts of St. Matthias can serve as a starting point for analyses of further manuscript collections.

We are a team of four young scholars with different study backgrounds and different tasks within the project. While Celia, Hannah and Philipp are humanists by training who are becoming more and more experienced in new digital technologies, Swati is our computer scientist and is currently learning a lot about the work of and with humanists and about Latin medieval manuscripts.

During Day of DH we are trying to give you an inside view of the work within our project.

 

 

Starting Day of DH in Trier

Hannah: My day started with a bicycle ride up to the University of Trier, which is located on top of the Tarforst heights. But the effort is always rewarded by cycling through one of the most historic areas of Trier, the Altbachtal. Today a garden plot can be found there, but it used to be a temple district more than 2000 years ago, in the very early days of Trier. And of course you have a stunning view of the vineyards surrounding the valley.

Having arrived at the Trier Center for Digital Humanities, I check my email account to see if any of my colleagues from the project have news for me. Since we are working from Trier, Darmstadt and Karlsruhe, most communication is via email, but once a week we meet for a telephone conference, and today is telco day! I’m already very excited to hear all the news about their work packages and to talk about future tasks for our project.