Create your PDF with WeasyPrint

 06 January 2020   Arnaud

Category / Key words: Python / Python, WeasyPrint, PDF.


WeasyPrint is an open source library released under the "BSD License 2.0" that exports HTML to PDF (it relies on libcairo under the hood). It is developed by Kozea, a company based in Lyon (France) and a web expert for health workers. The library is currently at version 51 and has been under active development since 2011.

This article introduces this tool, which I find simple and effective. I will show you its basic features and give you a few tricks to quickly export your first PDFs. For a complete tour, read the official documentation.

Installation

On Debian, everything is available in the repositories:

sudo apt-get install build-essential python3-dev python3-pip python3-setuptools python3-wheel python3-cffi libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info

Then install WeasyPrint itself with pip:

pip3 install WeasyPrint==51

On Windows, installation is possible but more complex because Pango and Cairo aren't available by default: see the documentation for details.

Working steps

In summary, rendering a document goes through these steps:

  1. the HTML tree is fetched and parsed
  2. stylesheets are fetched (provided by the user, in the page or by URL)
  3. stylesheets are applied
  4. elements are transformed into boxes with fixed dimensions, respecting the stacking rules
  5. boxes are drawn on the PDF
  6. metadata (bookmarks and hyperlinks) is added
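
These steps roughly map onto two calls of the Python API: `render()` does the layout and returns a document made of pages, and `write_pdf()` draws it. Here is a minimal sketch, assuming WeasyPrint is installed; the function name `render_then_write` is hypothetical, and the mapping of steps to calls in the comments is my own reading, not an official description.

```python
def render_then_write(html_string, pdf_path):
    """Lay out html_string, write the result to pdf_path, return the page count.

    Hypothetical helper, shown only to illustrate the two phases.
    """
    # Imported here so that defining the helper does not require WeasyPrint.
    from weasyprint import HTML

    # Roughly steps 1 to 4: parse the HTML, fetch and apply the stylesheets,
    # then lay the elements out as boxes on pages.
    document = HTML(string=html_string).render()

    # Roughly steps 5 and 6: draw the boxes and add bookmarks and hyperlinks.
    document.write_pdf(pdf_path)
    return len(document.pages)
```

Calling `render_then_write('<h1>The title</h1>', '/tmp/example.pdf')` should write the file and return the number of pages.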

Usage

Basic usage:

weasyprint <local HTML file or URL> output.pdf

A quick tour of the parameters:

--format [png or pdf]
defines the export format.
--stylesheet <local file or URL>
adds a CSS stylesheet; several stylesheets can be passed and are applied in the given order.
--base-url <URL>
if your document contains relative URLs, sets the root URL that WeasyPrint uses to turn them into full URLs.
--attachment <file>
attaches any file to your PDF; the reader software opens it with an external program according to its type.
--debug
by default, WeasyPrint displays every error; the debug option also shows every step of the rendering.

The command line is a good way to quickly create a PDF, but testing incremental changes with it can be burdensome (see Reload a PDF while developing your CSS).
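
If you call the command line from a script, the options above can be assembled programmatically. Here is a minimal sketch, assuming the `weasyprint` executable is on your PATH; the helper names `build_weasyprint_command` and `export_pdf` are hypothetical, and only the flags described above are used.

```python
import subprocess


def build_weasyprint_command(source, output, stylesheets=(), base_url=None,
                             attachments=(), debug=False):
    """Return the argument list for a `weasyprint` command-line call.

    Hypothetical helper: it only uses the flags described above.
    """
    command = ["weasyprint"]
    # Stylesheets are applied in the order they are given.
    for stylesheet in stylesheets:
        command += ["--stylesheet", stylesheet]
    if base_url is not None:
        command += ["--base-url", base_url]
    for attachment in attachments:
        command += ["--attachment", attachment]
    if debug:
        command.append("--debug")
    # Positional arguments come last: the input, then the output file.
    command += [source, output]
    return command


def export_pdf(source, output, **options):
    """Run WeasyPrint and raise CalledProcessError if the export fails."""
    subprocess.run(build_weasyprint_command(source, output, **options),
                   check=True)
```

For instance, `export_pdf("input.html", "output.pdf", stylesheets=["print.css"])` would run `weasyprint --stylesheet print.css input.html output.pdf`.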

Appearance

Layout

By default, PDFs are rendered in A4 format. To change the output format:

@page {
size: Letter; /* e.g. portrait, landscape, Letter */
margin: 2.5cm;
}
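
The same rule can also be supplied from Python rather than written in the stylesheet of the page. Here is a minimal sketch, assuming WeasyPrint is installed and that the page itself does not set a conflicting @page rule; `page_rule` and `export_with_page_rule` are hypothetical helper names.

```python
def page_rule(size="A4", margin="2.5cm"):
    """Build an @page rule as a string (hypothetical helper)."""
    return f"@page {{ size: {size}; margin: {margin}; }}"


def export_with_page_rule(html_path, pdf_path, size="A4", margin="2.5cm"):
    """Render html_path to pdf_path with the given page size and margin."""
    # Imported here so page_rule() stays usable without WeasyPrint installed.
    from weasyprint import CSS, HTML

    page_css = CSS(string=page_rule(size, margin))
    HTML(filename=html_path).write_pdf(pdf_path, stylesheets=[page_css])
```

For example, `export_with_page_rule("input.html", "output.pdf", size="Letter landscape")` would export the page in landscape Letter format.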

Font

By default, I declare fonts in the HTML code, but WeasyPrint is perfectly able to fetch a remote font from a stylesheet:

@font-face {
font-family: ForkAwesome;
src: url(https://cdn.jsdelivr.net/npm/fork-awesome@1.1.7/fonts/forkawesome-webfont.woff2);
}
html {
font-family: ForkAwesome;
}

Reload a PDF while developing your CSS

The rendering of your HTML in the browser may differ from the PDF, so check the WeasyPrint export often after modifying your CSS to avoid unpleasant surprises.

To reload automatically after each modification, use the embedded web server:

python -m weasyprint.tools.navigator

While developing, enter the URL of your page in the form file://<relative or absolute path of your HTML page>, e.g. file:///home/user/input.html

You then just have to refresh the page to see the rendering in your browser.
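
If you prefer to keep a PDF viewer open instead of the browser, a small polling script can re-export the file whenever the sources change. This sketch is an alternative to the embedded server, not a WeasyPrint feature: `newest_mtime` and `watch` are hypothetical names, and it assumes WeasyPrint is installed and that your viewer reloads the PDF when the file changes.

```python
import os
import time


def newest_mtime(paths):
    """Return the most recent modification time among paths."""
    return max(os.stat(path).st_mtime for path in paths)


def watch(html_path, css_path, pdf_path, interval=1.0):
    """Re-export pdf_path each time html_path or css_path is modified."""
    # Imported here so newest_mtime() works without WeasyPrint installed.
    from weasyprint import HTML

    last_export = 0.0
    while True:
        modified = newest_mtime([html_path, css_path])
        if modified > last_export:
            HTML(filename=html_path).write_pdf(pdf_path,
                                               stylesheets=[css_path])
            last_export = modified
            print(f"Exported {pdf_path}")
        time.sleep(interval)
```

Running `watch("input.html", "print.css", "output.pdf")` loops until interrupted with Ctrl+C.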

Python API

Stylesheets and HTML pages can be passed to the WeasyPrint API (through its two objects, CSS and HTML) as paths, URLs or raw text, thanks to "duck typing". Let's spell out the named parameters in an example:

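from weasyprint import HTML, CSS

# The source can also be HTML(url='http://example.fr') or HTML(filename='template/test.html')
html = HTML(string='<h1>The title</h1>')

css = CSS(string='''
    h1 {
        color: red;
    }''')

# Write the PDF to a file, applying the stylesheet
html.write_pdf('/tmp/example.pdf', stylesheets=[css])

# Without a target, the PDF is returned as bytes (write_png() works the same way)
pdf_bytes = html.write_pdf()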

Serve PDF in a Django app

The documentation mentions django-weasyprint, an app that provides a mixin and a generic class wrapping WeasyPrint while respecting Django conventions.

To create a view that exports a PDF:

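import time

from django_weasyprint import WeasyTemplateView


class CustomPrintableView(WeasyTemplateView):
    # Template rendered to HTML, then converted to PDF
    template_name = 'example.html'

    def get_context_data(self, **kwargs):
        # Keep the default context and add the current timestamp
        context = super().get_context_data(**kwargs)
        context['time'] = str(time.time())
        return context

    def get_pdf_filename(self):
        # File name suggested to the browser for the download
        return 'example.pdf'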

Conclusion

I hope this article has shown you the power of this tool, which not only exports PDFs but also lets you control them easily and precisely.