PPTools/Ppgen

From DPWiki

Ppgen: post-processing generator

Overview

Post-processing has been difficult for many people because of the HTML required for books produced at DP. With new considerations added for handheld formats, the task is even more daunting. I set out to see how simple a markup I could create that would allow post-processing without the PPer ever having to look at the HTML. A single source file would generate both the text and HTML versions, and the HTML would be PPV-approved to DP standards.

At all stages, I kept the needs of a beginning post-processor in mind. The first choice was whether the markup should be “presentational” or “semantic.” A presentational markup describes what something looks like: how it is presented. A semantic markup describes what something is and lets the generator decide how it should look.

Presentational markup uses a simple vocabulary to describe how something looks, such as margin changes, indents, and floated blocks. There is no keyword for “poetry,” for example, because that is what something “is.” Since no semantic name has to be defined for every construction, the user has much more flexibility in deciding how a document will look.

A piece of software takes a book marked up in this simple format and produces Text and HTML. The HTML is compatible with ebookmaker for DP/PG use. The text meets the usual standards for line length, wrapping, chapter heading spacing, etc. The name of the software program is “ppgen.”

Related Documents

Post-Processing Workflow

Ppgen Checklists

User-contributed checklists used for post-processing with ppgen.

Basic PPGEN Checklist for use with Guiguts

Tutorials

Lessons

Online Tools

Some online tools are available to help in checking your project, and we recommend their use by all PPers using ppgen. All can be found in the DP Post-Processing Workbench:

  • PPtext, which includes:
    • PPgutc, which performs various checks like those found in gutcheck.
    • PPlev, which helps detect possible scannos by highlighting words that are within a Levenshtein edit distance of 1 from each other.
    • PPjeeb, which helps highlight possible h/b errors (e.g., he vs be).
    • PPtxt, which performs various checks on the utf8.txt output file produced by ppgen.
    • PPscan, a smart quote checker.
    • PPspell, a spell checker. (Note: PPers may, in addition, want to install the Aspell program and appropriate dictionaries.)
  • PPhtml, which includes:
    • PPppv, which runs various checks similar to those that a PPVer might run.
    • PPlink
  • PPsmq, a smart quote reformatter, which will convert straight quote marks and apostrophes to the curly forms more common in printed material.
  • PPcomp, a program that compares the current state of the text with the way it looked at some time in the past.
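
To illustrate the kind of check PPlev is described as performing, here is a minimal sketch in Python that finds pairs of words within a Levenshtein edit distance of 1 of each other. It is not PPlev's actual code; the function names levenshtein and close_pairs and the sample word list are made up for this example.

```python
from itertools import combinations


def levenshtein(a, b):
    """Return the Levenshtein edit distance between strings a and b."""
    # Dynamic programming over one row at a time: previous[j] holds the
    # distance between the first i-1 characters of a and the first j of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def close_pairs(words):
    """Return pairs of distinct words that are exactly one edit apart."""
    unique = sorted(set(words))
    return [(a, b) for a, b in combinations(unique, 2)
            if levenshtein(a, b) == 1]


if __name__ == "__main__":
    # Hypothetical word list; a real check would use the words of the book.
    sample = ["the", "tho", "he", "be", "house"]
    for pair in close_pairs(sample):
        print(pair)
```

Pairs such as “he” and “be” or “the” and “tho” differ by a single edit, which is why a report like this can point at likely scannos. Comparing every pair of words is slow for a whole book, so this sketch is only suitable for small word lists.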

Program Files

Downloadable program files are available for ppgen and for some additional tools. They can be found on GitHub, Dropbox, or other sites as noted below, and may be freely downloaded for installation on your system.

  • The production version of the post-processing generator, ppgen.py, is here: ppgen
  • The development version of ppgen is here: ppgen
  • A scanno detection tool, ppscannos1.py, is available from the ppscannos1 page. (Note: ppgutc, mentioned above, may include everything that ppscannos1 provides, and more.)


Note: ppgen and the other downloadable tools listed above require Python 3. They will not work with the version of Python (Python 2.7) delivered with Guiguts. If you do not have Python 3 installed, you can download it from the python.org download site.
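
If you are not sure which Python a given command starts, a short script like the following sketch can tell you. The helper name is_python3 is invented for this example, and the script avoids Python 3-only syntax on purpose so that it also runs, and reports the problem, under Python 2.7.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given interpreter version is Python 3 or later."""
    return version_info[0] >= 3


if __name__ == "__main__":
    found = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    if is_python3():
        print("Python " + found + " found: this interpreter is Python 3.")
    else:
        print("Python " + found + " found: ppgen requires Python 3.")
```

Save it under any name you like and run it with the same command you intend to use for ppgen.py; the message shows which interpreter that command actually invokes.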