Workshop: Free Stuff For Devs
Use Images, Text, Webarchive and Catalogue Data from the Austrian National Library in Jupyter Notebooks

Do you want to analyse historical newspapers with Python? Does training your CNN on historical postcard images sound nifty to you? Do you want to search within the Austrian Webarchive from the comfort of your home? We've got you covered!
We use prepared (and pre-shared) Jupyter Notebooks to illustrate:
- The data the Austrian National Library has to offer (for free)
- Which Python libraries make accessing and processing these data easier
- Some example applications using these data within Jupyter
Participants can either follow the guided tour through some of the shared Notebooks together with the group, or work through the provided material at their own pace and ask questions as they arise.
We'll publish a requirements.txt and the selected Notebooks one week before the workshop, and the slides one day before the workshop, here:
https://labs.onb.ac.at/gitlab/labs-team/pydays19
Preliminary Rough Outline
- Overview Workshop
- Metadata & Catalogue
- Overview data formats, container formats, protocols
- Example SRU
- Example data harvesting OAI-PMH
- Example SPARQL
- Images & Text
- Overview IIIF
- Overview OCR formats
- Example download OCR text
- Example download pre-resized images for machine learning
- Example create IIIF collection from SPARQL query result
- Webarchive
- Overview Webarchive, API and content
- Example Wayback search via API
- Example full text search via API
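As a rough illustration of the protocols named in the outline, the following sketch builds request URLs for SRU, OAI-PMH and the IIIF Image API using only the standard library. The parameter names follow the public specifications (SRU 1.2 `searchRetrieve`, OAI-PMH `ListRecords`, the IIIF Image API path pattern). The base URLs, the helper function names and the sample identifier are hypothetical placeholders and are not the library's actual endpoints, which are covered in the workshop Notebooks.

```python
from urllib.parse import quote, urlencode

# Hypothetical endpoints for illustration only; the real service URLs
# are not given in this abstract.
SRU_BASE = "https://example.org/sru"
OAI_BASE = "https://example.org/oai"
IIIF_BASE = "https://example.org/iiif/image"


def sru_search_url(cql_query, maximum_records=10, start_record=1):
    """Build an SRU 1.2 searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return SRU_BASE + "?" + urlencode(params)


def oai_list_records_url(metadata_prefix="oai_dc", set_spec=None,
                         resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for harvesting."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it is sent with the
        # verb only, to continue a previous, incomplete list response.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return OAI_BASE + "?" + urlencode(params)


def iiif_image_url(identifier, region="full", size="max", rotation=0,
                   quality="default", fmt="jpg"):
    """Build an IIIF Image API URL:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier is percent-encoded so it stays a single path segment.
    safe_id = quote(identifier, safe="")
    return f"{IIIF_BASE}/{safe_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


if __name__ == "__main__":
    print(sru_search_url("title=Wien"))
    print(oai_list_records_url())
    # "!224,224" asks the server for a best-fit image within 224x224 pixels,
    # a size often used as input for image classifiers.
    print(iiif_image_url("img 1", size="!224,224"))
```

Requesting images at the target size through the IIIF size parameter lets the server do the resizing, which keeps downloads small when preparing a machine learning dataset.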
Requirements for Participants
- Laptop
- Connectivity
- Python 3
- Working Jupyter Notebook installation
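One possible way to check the software requirements above before the workshop is a short script like the following. It is a minimal sketch: the function name `check_environment` is made up for this example, and it assumes that a working Jupyter Notebook installation is importable as the `notebook` package, which may differ for other setups such as JupyterLab.

```python
import importlib.util
import sys


def check_environment(modules=("notebook",)):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    # The workshop material targets Python 3.
    if sys.version_info < (3,):
        problems.append("Python 3 is required")
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment looks fine" if not issues else "\n".join(issues))
```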
Material
We'll publish a requirements.txt and the selected Notebooks one week before the workshop, and the slides one day before the workshop, here:
https://labs.onb.ac.at/gitlab/labs-team/pydays2019
Language
Slides and Notebooks are in English. The workshop will be held in English (or in German, if all participants prefer that).
Presenters
Georg Petz is the senior software developer of the Austrian National Library's R&D department. Stefan Karner is the software developer of the ONB Labs project.
Links
https://labs.onb.ac.at
Info
Day: 03.05.2019
Start: 10:00
Duration: 02:00
Room: F4.07
Track: PyDays Workshops
Language: en
Speakers
- Stefan Karner
- Georg Petz