Current status: as soon as xsane finishes scanning the page, #yagf crashes. I think I'm going to use tesseract directly, from the command line. Or at least I'll try that for one page and, if it works, find a workflow to scan all 100 pages with my old scanner, then use a Python script like the one suggested by @vickysteeves for all the nitty-gritty details. Better than improvised bash hacking!
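For what it's worth, here is a minimal sketch of what such a script could look like. It is not the script @vickysteeves suggested, just an illustration under some assumptions: the scans are saved as PNG files with zero-padded names (e.g. `page_001.png`) so that sorting by filename gives page order, the `tesseract` binary is on the PATH, and the language is English. The function names and the folder layout are made up for this example.

```python
import subprocess
from pathlib import Path


def tesseract_command(image: Path, out_base: Path, lang: str = "eng") -> list[str]:
    """Build the command line: tesseract IMAGE OUTPUTBASE -l LANG.

    tesseract appends ".txt" to the output base on its own.
    """
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_directory(scan_dir: Path, out_dir: Path, lang: str = "eng") -> list[Path]:
    """Run tesseract on every PNG in scan_dir, in filename order.

    Returns the paths of the text files tesseract is expected to write.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for image in sorted(scan_dir.glob("*.png")):
        out_base = out_dir / image.stem
        # check=True stops at the first failing page instead of skipping it silently
        subprocess.run(tesseract_command(image, out_base, lang), check=True)
        results.append(out_dir / (image.stem + ".txt"))
    return results
```

Trying it on a single page first would mean putting one scan into a folder (say `scans/`) and calling `ocr_directory(Path("scans"), Path("text"))`; if the resulting text file looks reasonable, the same call handles the remaining pages.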