Python Scripting with Scribus

Autoquote.py

This is the mother or perhaps older sister of en+emdash.py. I'm not going to go through this one as carefully as before, and in fact I have truncated what you see here from the original to shorten the list of languages which are indicated. You can find Autoquote.py in Scribus if you want all these details.

As I said in the previous chapter, this came about when someone was searching for an automated way to convert typewriter quotes, both single and double, to typographic quotes (also called "curly quotes"). Once I had the logic worked out, I created a wiki page about it, and was then challenged to make it usable for other languages. This led to a LOT of research, since different languages use different typographic characters. The French and Russians, and some Germans use guillemets (« and »), and there are differences in how they are used also. In the end there are 23 possible language choices.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# File: quotes.py - changes typewriter quotes to typographic quotes

"""
USAGE

You must have a document open, and a text frame selected.
There will be a valueDialog asking for your language for the quotes, 
the default is 'en', but change the default to suit your needs.
Detected errors shut down the script with an appropriate message.

"""
import scribus

if scribus.haveDoc() > 0:
    c = 0
    lang = scribus.valueDialog("Choose by language or country", 'Language: af, be, ch, cs, de, de-g, en, es, et, fi, fr,\n hu, is, lt, mk, nl, pl, ru, se, sk, sl, sq and uk\n are current choices','en')
    if (lang == 'en'):
        lead_double = u"\u201c"
        follow_double = u"\u201d"
        lead_single = u"\u2018"
        follow_single = u"\u2019"
    elif (lang == 'de'):
        lead_double = u"\u201e"
        follow_double = u"\u201c"
        lead_single = u"\u2019"
        follow_single = u"\u201a"
    elif (lang == 'de-g'):    # German - inverted guillemets for quotes
        lead_double = u"\u00bb"
        follow_double = u"\u00ab"
        lead_single = u"\u2019"
        follow_single = u"\u201a"
    elif (lang == 'fr'):      
        lead_double = u"\u00ab"
        follow_double = u"\u00bb"
        lead_single = u"\u2018"
        follow_single = u"\u2019"  # am hoping this will cover contractions like je t'aime

    else:
        scribus.messageBox('Language Error', 'You need to choose an available language', scribus.ICON_WARNING, scribus.BUTTON_OK)
        sys.exit(2)
        
else:
    scribus.messageBox('Usage Error', 'You need a Document open', scribus.ICON_WARNING, scribus.BUTTON_OK)
    sys.exit(2)

if scribus.selectionCount() == 0:
    scribus.messageBox('Scribus - Usage Error',
        "There is no object selected.\nPlease select a text frame and try again.",
        scribus.ICON_WARNING, scribus.BUTTON_OK)
    sys.exit(2)
if scribus.selectionCount() > 1:
    scribus.messageBox('Scribus - Usage Error',
        "You have more than one object selected.\nPlease select one text frame and try again.", scribus.ICON_WARNING, scribus.BUTTON_OK)
    sys.exit(2)
textbox = scribus.getSelectedObject()
pageitems = scribus.getPageItems()
boxcount = 1
for item in pageitems:
    if (item[0] == textbox):
        if (item[1] != 4):
            scribus.messageBox('Scribus - Usage Error', "This is not a textframe. Try again.", scribus.ICON_WARNING, scribus.BUTTON_OK)
            sys.exit(2)
contents = scribus.getTextLength(textbox)
while c <= (contents -1):
    if ((c + 1) > contents - 1):
        nextchar = ' '
    else:
        scribus.selectText(c+1, 1, textbox)
        nextchar = scribus.getText(textbox)
    scribus.selectText(c, 1, textbox)
    char = scribus.getText(textbox)
    if (len(char) != 1):
        c += 1
        continue
    if ((ord(char) == 34) and (c == 0)):
        scribus.deleteText(textbox)
        scribus.insertText(lead_double, c, textbox)
    elif (ord(char) == 34):
        if ((prevchar == '.') or (prevchar == ',') or (prevchar == '?') or (prevchar == '!') or (prevchar == '…')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_double, c, textbox)
        elif ((ord(prevchar) == 39) and ((nextchar != ' ') and (nextchar != ',') and (nextchar != '.'))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_double, c, textbox)
        elif ((nextchar == '.') or (nextchar == ',')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_double, c, textbox)

        elif ((prevchar == ' ') or ((nextchar != ' ') and (ord(nextchar) != 39))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_double, c, textbox)
        else:
            scribus.deleteText(textbox)
            scribus.insertText(follow_double, c, textbox)
            
    if ((ord(char) == 39) and (c == 0)):
        scribus.deleteText(textbox)
        scribus.insertText(lead_single, c, textbox)
    elif (ord(char) == 39):
        if ((prevchar == '.') or (prevchar == ',') or (prevchar == '?') or (prevchar == '!')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_single, c, textbox)
        elif ((ord(prevchar) == 34) and ((nextchar != ' ') and (nextchar != ',') and (nextchar != '.'))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_single, c, textbox)
        elif ((prevchar != ' ') and (ord(prevchar) != 34) and (nextchar != ' ')):
            scribus.deleteText(textbox)
            scribus.insertText(follow_single, c, textbox)
        elif ((prevchar == ' ') or ((nextchar != ' ') and (ord(nextchar) != 34))):
            scribus.deleteText(textbox)
            scribus.insertText(lead_single, c, textbox)
        else:
            scribus.deleteText(textbox)
            scribus.insertText(follow_single, c, textbox)
            
    c += 1
    prevchar = char

scribus.setRedraw(1)
scribus.docChanged(1)
endmessage = 'Successfully ran script\n Last character read was '+str(char) # Change this message to your liking
scribus.messageBox("Finished", endmessage, icon=scribus.ICON_NONE, button1=scribus.BUTTON_OK)

In retrospect I can't recall what led me to the indexing method of analyzing the contents of the text frame. I do know that one of the concerns was not to disrupt any control characters, things like newline characters, or various control characters used to change font styles. So what I'm doing here is plucking out characters to inspect them, then delete/change them or just moving on.

After that, then there was the major question of how to decide when to insert a left typographic quote and when to insert a right quote. Some obvious ones seemed to be when the quote was the first character of a paragraph it should be left and if it's the last, a right. There also comes up word boundaries as defined by a space before and a space after. The end of a sentence could be detected by punctuation as a sign for a right quote. I also tried to keep track of whether the last quote was a left or a right.

In the end, all I can say is that there was a lot of trial and error involved with this in trying to think about the use-case situations. One kind of issue I never figured out was with something like a sentence beginning with 'Twas. It should be a right quote, but where is the logic for that? Probably only a very rare situation. Still, I was left with a little anxiety, which is indicated by the last two commands, where I use a message to show what the last character analyzed was, so the user could be sure he made it all the way through the text. After all, it's easy to imagine that a text frame might have a quotation mark as its last character.

Something the script made me aware of is that word processors are very good but imperfect at assigning typographic quotes, and in some situations my script outperforms them.