Wednesday, February 18, 2009

Project Report: BookMaker




Okay, this is a quick little project that I started yesterday and finished today.

As many of you know, I've taken to doing a lot of my original writing in Google Docs. It's convenient because I've got easy access anywhere that I have internet access, and I don't have to keep track of which version of my document is the current version.

The drawback is that nobody really wants to read a book on Google Docs, so if I want any feedback I've got to print it out. And as you probably already know, Google Docs is not friendly for printing.

Of course, my ultimate goal is publication, and while friends and family can probably live with the endless scroll of a Google Docs print job, I need something more polished for editors and agents.

What I've been doing is maintaining two copies of the document. When I finished the book, I copied everything from my working copy online into a Word document as a reading copy. Since then, whenever I've wanted to make changes (adding conversations to clarify plot points, or just fixing typos), I've had to make the same changes in both documents, and that's a pain. My other choice is to keep updating only my working copy, and then, every time someone asks for a reading copy, copy everything back into the Word document and clean it up all over again.

To my delight, Google provides a Python library that makes it easy to log in, access all my Google Docs, and even download and modify their contents from within Python. To simplify the maintenance process (and for future convenience), I wrote a program that fetches a document by filename, parses the HTML, and feeds it into a Word template. To maintain text formatting I had to do a little cleanup afterward (removing HTML italics tags and using Word's Selection object to italicize the contained text, that sort of thing), but mostly the template and the HTML parser handled everything.
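To give a feel for the HTML-parsing step, here's a minimal sketch using only the standard library's `html.parser`. It isn't the BookMaker code itself: the `RunCollector` class and `html_to_runs` function are names made up for illustration, and it assumes the document's italics arrive as plain `<i>` or `<em>` tags. It just splits a fragment of HTML into (text, is_italic) runs, which is the information needed to reapply italics later.

```python
from html.parser import HTMLParser


class RunCollector(HTMLParser):
    """Collect (text, is_italic) runs from a fragment of HTML (illustrative only)."""

    ITALIC_TAGS = {"i", "em"}  # assumption: italics are marked with these tags

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.runs = []
        self._italic_depth = 0  # > 0 while inside one or more italic tags

    def handle_starttag(self, tag, attrs):
        if tag in self.ITALIC_TAGS:
            self._italic_depth += 1
        elif tag == "br":
            self.runs.append(("\n", False))  # line break becomes a paragraph marker

    def handle_endtag(self, tag):
        if tag in self.ITALIC_TAGS and self._italic_depth > 0:
            self._italic_depth -= 1
        elif tag == "p":
            self.runs.append(("\n", False))  # end of paragraph

    def handle_data(self, data):
        if data:
            self.runs.append((data, self._italic_depth > 0))


def html_to_runs(html):
    """Return a list of (text, is_italic) tuples for the given HTML string."""
    parser = RunCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.runs


if __name__ == "__main__":
    for run in html_to_runs("<p>She was <i>not</i> amused.</p>"):
        print(run)
```

Each run could then be typed into Word with the right italic setting, instead of stripping the tags in a separate cleanup pass.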

To make this work, I had to use two special libraries: Google's gdata, and win32com, a library that lets me control Windows applications (in this case, Microsoft Word) from within Python. I learned a lot about both libraries over the last couple of days, and I'm excited about how much they can do.
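For anyone curious what driving Word from Python looks like, here's a rough sketch, not the actual BookMaker code. It assumes the text has already been broken into a list of (text, is_italic) pairs, with a "\n" entry marking a paragraph break; the function names `write_runs` and `open_word_selection` are hypothetical. `write_runs` only relies on the `Font.Italic`, `TypeText`, and `TypeParagraph` members of Word's Selection object, so it can be tried with a stand-in object on a machine without Word. Note that it sets italics while typing, which is a simpler route than the strip-the-tags-and-italicize-afterward cleanup described above.

```python
def write_runs(selection, runs):
    """Type (text, is_italic) runs at the cursor of a Word Selection object.

    `selection` is expected to behave like Word's Selection COM object;
    a "\n" run starts a new paragraph.
    """
    for text, italic in runs:
        if text == "\n":
            selection.TypeParagraph()
            continue
        selection.Font.Italic = bool(italic)  # switch italics on or off for this run
        selection.TypeText(text)
    selection.Font.Italic = False  # leave the cursor in a non-italic state


def open_word_selection(template_path):
    """Start Word, create a document from a template, and return its Selection.

    Requires Windows, Microsoft Word, and the pywin32 package;
    `template_path` is whatever template file you want to use.
    """
    import win32com.client  # imported here so the rest runs without pywin32

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = True
    word.Documents.Add(Template=template_path)
    return word.Selection
```

On a Windows machine with Word installed, something like `write_runs(open_word_selection(path_to_template), runs)` would type the text into a fresh document based on that template.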

Right now, the program can format the document with any of three templates: a Reading Copy with narrow margins and a small font to keep printing costs down, a Markup with wide margins and double-spaced lines for easier feedback, and a Submission in Proper Manuscript Format, which looks pretty ugly to people accustomed to typeset pages but is apparently popular with editors. It's nice to be able to quickly generate any of those looks without having to maintain the content separately.
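Picking between the three looks can be as simple as a lookup table. The sketch below is only an illustration: the template filenames, the `templates` directory, and the `template_path` function are all made up, since the real files could be named anything.

```python
import os

# Hypothetical filenames; substitute whatever your Word templates are called.
TEMPLATES = {
    "reading": "ReadingCopy.dot",
    "markup": "Markup.dot",
    "submission": "Submission.dot",
}


def template_path(layout, template_dir="templates"):
    """Return the path of the Word template for a layout name (case-insensitive)."""
    key = layout.strip().lower()
    if key not in TEMPLATES:
        choices = ", ".join(sorted(TEMPLATES))
        raise ValueError("Unknown layout %r; expected one of: %s" % (layout, choices))
    return os.path.join(template_dir, TEMPLATES[key])
```

Keeping the layout names in one table means adding a fourth look later is a one-line change.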

There's a limit on the size of Google Docs, and I've bumped into it with a couple of books, so I'll eventually want to add multi-file support. I could also make a GUI client so it's easier to enter a username, password, and document title (right now you just edit the script directly). None of that is necessary, though. Even the multi-file stuff can be handled with a quick copy-paste in Word.

So I'm calling this one finished at version 1.0.

Status: Stable, v. 1.0
