Channel: Peter of the Norse Thinks He’s Smarter than You

Printing vs. the web


At Radio 1190, one of the things we do is print an insert for each CD case with the information we use. In addition to a brief review, there’s also extra info about the songs, like the length and tempo. This insert has to be 12cm by 12cm to fit in the case. While CSS has a cm measurement, no browser respects it.

First, I should point out that the standard unit of measurement for fonts on the computer, the point, is a real world thing. There are 72 points in an inch, just like there are 12 inches in a foot. So according to that 72pt = 1in = 2.54cm = 25.4mm. But look at what happens when you try to put that in a browser:
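To make the arithmetic concrete, here is a minimal sketch of those conversions in plain Python. The variable names are only illustrative; as far as I know, reportlab.lib.units defines inch, cm and mm along the same lines, which is where the `cm` used later comes from.

```python
# Physical units expressed in points (72 points per inch).
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54

inch = POINTS_PER_INCH      # 1 in = 72 pt
cm = inch / CM_PER_INCH     # 1 cm is roughly 28.35 pt
mm = cm / 10                # 1 mm is roughly 2.83 pt

# The CD insert is 12cm on a side; in points that is:
insert_side = 12 * cm
print(round(insert_side, 2))  # 340.16
```

Multiplying a number by `cm` is just a conversion to points, so `12 * cm` and `120 * mm` describe the same length.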

MMMM

Depending on the browser you’re using, the Ms might be different sizes. For even more fun, try printing them. I would set things up to be perfect on one browser, and they’d be slightly off on another. It was really annoying when it would be lined up perfectly on the screen, and have gaps on paper. And don’t forget that some browsers fill to fit the page.

There’s only one thing for it: I’ll have to use PDF.

Since I’m using Python, I’ll have to use ReportLab. It is the most schizophrenic API for Python I’ve ever seen. In fact the only thing I’ve seen that’s worse is PostScript. Which probably explains things. I suspect that ReportLab is tightly fitted around PDF. But these problems do make it difficult to work with.

I’m going to talk about only two parts of ReportLab: paragraphs and tables. Both are, in ReportLab’s terminology, “Platypus Flowables”. For the details of what that means, see the official documentation. Note that the official documentation starts with Canvas, which is the low level implementation. That’s how you program, but not how you learn. So I’m only going to use the high level stuff.


Paragraphs are objects that know how to break lines. They also support a simple subset of XHTML so that you can do some styling. The signature is Paragraph(text, style, bulletText=None). Ignoring bulletText, it should be obvious how it works. The only gotcha is, style has to be a ParagraphStyle object. I don’t think there’s a difference between ParagraphStyle and a dict, so I’m forced to conclude that there’s some PDF equivalent that needs to be encoded. To add needless complexity, ParagraphStyle requires a name attribute that is only used by StyleSheet1s. That extra 1 is not a typo. It is an example of what’s wrong with this API. A StyleSheet1 is a dict that only accepts ParagraphStyle and uses its name as the key. I decided that normal dicts are good enough for me.

    from reportlab.lib.styles import ParagraphStyle
    from reportlab.lib.units import cm

    # A plain dict keyed by style name, used instead of a StyleSheet1.
    styles = {'normal': ParagraphStyle(name='normal', 
                        fontName='Times-Roman', 
                        fontSize=0.5*cm, 
                        leading=0.4*cm),
              'small': ParagraphStyle(name='small', 
                       fontName='Times-Roman', 
                       fontSize=0.3*cm, 
                       leading=0.3*cm),
              'review': ParagraphStyle(name='review', 
                        fontName='Times-Roman', 
                        fontSize=10, 
                        leading=10.2),
              'song': ParagraphStyle(name='song', 
                      fontName='Times-Roman', 
                      fontSize=0.5*cm, 
                      alignment=1)}  # 1 means centered

Note that alignment is 1. That means center. There’s a constant somewhere, but I didn’t want to dig for it.

Assuming that obj is an Album model, creating the content of the table cells is simple.

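    from xml.sax.saxutils import escape  # assumption: any XML-escaping helper will do
    from reportlab.platypus import Paragraph

    album = Paragraph(escape(obj.album), styles['normal'])
    artist = Paragraph(escape(obj.artist.artist), styles['normal'])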

Now things get interesting. The artist and album need to fit in a table cell. We know the exact dimensions of that cell because we specified it. (It’s the whole reason we went to PDF in the first place.) We can find the dimensions by calling wrap with the available width. wrap is usually called by the surrounding frame, but we can do it ahead of time to see what happens.

    artist.wrap(5.8*cm, 1*cm)
    if artist.minWidth() > 5.8*cm or len(artist.getActualLineWidths0()) > 2:
        artist = Paragraph(artist.text, styles['small'])

minWidth is calculated from the longest word in this font. There are artists and albums that contain a run-on word. We also know that the cell height only allows for two lines. So if there’s more, that’s a problem. We switch to a smaller font. You can’t change the style after the wrap, and you can’t rewrap, so we need to create a new one.

The rest of the text areas are much the same, except for the review. Since it can have bold and italic and carriage returns, we can’t escape it. We’ll have to make sure that only the supported tags are used. It’s a little bit long and obvious so I won’t quote it.
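Since the real code isn't quoted, here is one possible shape for it, not the actual implementation. It assumes the only markup worth keeping is b, i and br, escapes everything else, and turns line breaks into br tags. The name clean_review is made up for this sketch.

```python
import re
from xml.sax.saxutils import escape

# Assumption: bold and italic are the only inline tags we want to keep.
ALLOWED_TAGS = ("b", "i")


def clean_review(text):
    """Escape everything, then restore the small set of allowed tags."""
    safe = escape(text)  # turns &, < and > into entities
    for tag in ALLOWED_TAGS:
        # Bring back <b>, </b>, <i>, </i> (normalised to lower case).
        safe = re.sub(r"&lt;(/?)%s&gt;" % tag, r"<\1%s>" % tag,
                      safe, flags=re.IGNORECASE)
    # Bring back explicit line breaks, then convert raw newlines too.
    safe = re.sub(r"&lt;br\s*/?&gt;", "<br/>", safe, flags=re.IGNORECASE)
    safe = safe.replace("\r\n", "\n").replace("\r", "\n")
    return safe.replace("\n", "<br/>")


print(clean_review("<b>Loud</b> & <script>x</script>\nfine"))
```

This does not check that the tags are balanced, so a stray opening b with no closing one would still upset the paragraph parser. A real version would need to handle that.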


Table and TableStyle work to make tables. While there have to be differences because it’s for tables and not paragraphs, it seems entirely unrelated. Written by a different group of people. Who should have their computers taken away. First, TableStyle is unnecessary. Any place you can use a TableStyle, you can use a list of tuples. Second, everything is done with lists of tuples. But not just any tuples. Specially formatted tuples. With sub-tuples.

Let’s look at the constructor: Table(data, colWidths=None, rowHeights=None, style=None, splitByRow=1, repeatRows=0, repeatCols=0). data is a list of equal length rows. Everything in a row has to be a Flowable (including Paragraphs or other Tables) or a string. There is a way to make data span cells, but sub-tables are simpler. And strings don’t wrap, so unless it’s a single word or phrase, you’ll probably want a paragraph. colWidths and rowHeights determine the sizes of the cells. If you leave them blank, ReportLab will make educated guesses. The last three raise NotImplementedError when changed.

Table styles are built up from “commands”. Commands are tuples of the form (property, (starting_column, starting_row), (ending_column, ending_row), value, ...).

Properties are always strings in capitals, and the value or values depend on the property. A negative column or row means from the end like slices. For example, ('ALIGN', (0,0), (-1,-1), 'CENTER') makes everything center aligned. Notice how the value is a string while in the paragraphs it’s an int. Also, every command has the start and end cells. Since most cells get multiple properties at once, it’s more than a little bit redundant.
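One way to cut down on the repetition is a tiny helper that takes the cell range once and expands it into commands. This is a sketch, not part of ReportLab: commands_for is a made-up name, and it assumes a multi-value property is passed as a tuple.

```python
def commands_for(start, end, **properties):
    """Expand one cell range plus several properties into style commands."""
    commands = []
    for name, value in properties.items():
        # A tuple holds several values for one property; anything else is one value.
        values = value if isinstance(value, tuple) else (value,)
        commands.append((name, start, end) + values)
    return commands


# Everything centered both ways; the first row in bold 10pt.
everything = commands_for((0, 0), (-1, -1), ALIGN='CENTER', VALIGN='MIDDLE')
header = commands_for((0, 0), (-1, 0), FONT=('Times-Bold', 10))
style = everything + header
print(style)
```

The result is an ordinary list of tuples, so it can go wherever a table style is expected.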

Actually creating the table is easier than with HTML. The cell sizes are absolute and borders and padding don’t affect them. It makes precise layouts much easier. Which is good for our purposes.


The whole thing sucks. I want a replacement, but don’t have the time or skills to make it myself. If you know of a free PDF generator, I’d like to see it.

