vc2_conformance.string_utils: String formatting utilities

The vc2_conformance.string_utils module contains a selection of general-purpose string formatting routines.

indent(text, prefix='  ')

Indent the string ‘text’ with the prefix string ‘prefix’.

Note

This function is provided partly because Python 2.x doesn’t include textwrap.indent() in its standard library and partly to provide an indent function with sensible defaults (i.e. 2 character indent, and always indent every line).
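The behaviour described in the note can be illustrated with a minimal sketch. This is not the module's implementation: indent_sketch is a hypothetical name, and prefixing blank lines is an assumption drawn from the phrase "always indent every line". It is contrasted with Python 3's textwrap.indent(), which by default leaves whitespace-only lines unprefixed.

```python
import textwrap


def indent_sketch(text, prefix="  "):
    """Hypothetical stand-in for indent(): prefix every line, blank or not."""
    return "\n".join(prefix + line for line in text.split("\n"))


sample = "first\n\nsecond"

# Every line receives the prefix, including the empty middle line.
print(repr(indent_sketch(sample)))

# textwrap.indent() (Python 3 only) skips the whitespace-only line by default.
print(repr(textwrap.indent(sample, "  ")))
```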

ellipsise(text, context=4, min_length=8)

Given a string which contains very long sequences of the same character (e.g. long mostly constant binary or hex numbers), produce an ‘ellipsised’ version with some of the repeated characters replaced with ‘…’.

Exactly one shortening operation is carried out (on the longest run), so provided the original string length is known, the ellipsised version introduces no ambiguity.

For example:

>>> ellipsise("0b10100000000000000000000000000000000000001")
"0b1010000...00001"

Parameters
text : str

String to ellipsise.

context : int

The number of repeated characters to retain before and after the ellipsis.

min_length : int

The minimum number of characters to bother replacing with ‘…’. A run is therefore left unchanged unless it contains at least 2*context + min_length repeated characters.
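The interaction between context and min_length can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the module's code: ellipsise_sketch is a hypothetical name, the tie-breaking rule (first longest run wins) is assumed, and the threshold is taken to be inclusive at 2*context + min_length.

```python
import re


def ellipsise_sketch(text, context=4, min_length=8):
    """Hypothetical stand-in for ellipsise(); shortens only the longest run."""
    # Collect every maximal run of a single repeated character.
    runs = list(re.finditer(r"(.)\1*", text, flags=re.DOTALL))
    if not runs:
        return text  # Empty input: nothing to shorten.

    # Pick the longest run (assumption: the first one wins on a tie).
    longest = max(runs, key=lambda m: m.end() - m.start())
    run_length = longest.end() - longest.start()

    # Assumed threshold: runs shorter than this are left untouched.
    if run_length < 2 * context + min_length:
        return text

    # Keep `context` characters of the run on each side of the ellipsis.
    return (
        text[: longest.start() + context]
        + "..."
        + text[longest.end() - context :]
    )


print(ellipsise_sketch("0b101" + "0" * 36 + "1"))  # 0b1010000...00001
print(ellipsise_sketch("a" * 15))  # unchanged: below the default threshold of 16
```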

ellipsise_lossy(text, max_length=80)

Given a string which may not fit within a given line length, truncate the string by replacing its middle with an ellipsis.
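A minimal sketch of middle truncation is shown below. The details are assumptions rather than documented behaviour: ellipsise_lossy_sketch is a hypothetical name, the ellipsis is written as three dots, the retained characters are split as evenly as possible between head and tail, and max_length is assumed to be at least 3.

```python
def ellipsise_lossy_sketch(text, max_length=80):
    """Hypothetical stand-in for ellipsise_lossy(); assumes max_length >= 3."""
    if len(text) <= max_length:
        return text  # Already fits: nothing is lost.

    ellipsis = "..."
    keep = max_length - len(ellipsis)  # Characters of the original to retain.
    head = (keep + 1) // 2             # The head gets the extra character.
    tail = keep - head

    # Guard against text[-0:], which would return the whole string.
    return text[:head] + ellipsis + (text[-tail:] if tail else "")


long_text = "a" * 50 + "b" * 50
print(ellipsise_lossy_sketch(long_text, max_length=11))  # aaaa...bbbb
```

Unlike ellipsise(), this operation discards arbitrary characters, so the original string cannot be reconstructed from the result.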

split_into_line_wrap_blocks(text, wrap_indented_blocks=False)

Deindent and split a multi-line markdown-style string into blocks of text which can be line-wrapped independently.

For example, given a markdown-style string defined like so:

'''
    A markdown style title
    ======================

    This is a string with some initial indentation
    and also some hard line-wraps inserted too. This
    paragraph ought to be line-wrapped as an
    independent unit.

    Here's a second paragraph which also ought to be
    line wrapped as its own unit.

    * This is a bulleted list
    * Each bullet point should be line wrapped as an
      individual unit (with the wrapping indented
      as shown here).
    * Notice that bullets don't have a newline
      between them like paragraphs do.

    1. Numbered lists are also supported.
    2. Here long lines will be line wrapped in much
       the same way as a bulleted list.

    Finally:

        An indented block will also remain indented.
        However, if wrap_indented_blocks is False, the
        existing linebreaks will be retained (e.g. for
        markdown-style code blocks). If set to True,
        the indented block will be line-wrapped.
'''

This will be split into independently line wrappable segments (as described).

Returns
blocks : [(first_indent, rest_indent, text), …]

A series of wrappable blocks. In each tuple:

  • first_indent contains a string which should be used to indent the first line of the wrapped block.

  • rest_indent contains a string which should be used to indent all subsequent lines in the wrapped block. This will be the same length as first_indent.

  • text contains a string free of indentation and newlines.

An empty block (i.e. ("", "", "")) will be included between each paragraph in the input so that the output maintains the same vertical whitespace profile.

wrap_blocks(blocks, width=None, wrap_indented_blocks=False)

Return a line-wrapped version of a series of text blocks as produced by split_into_line_wrap_blocks().

Expects a list of (first_line_indent, remaining_line_indent, text) tuples to output.

If ‘width’ is None, assumes an infinite line width.

If ‘wrap_indented_blocks’ is False (the default), indented (markdown-style) code blocks will not be line wrapped, while other indented blocks (e.g. bullets) will be.
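The way the two indent strings in each tuple drive wrapping can be sketched with the standard library's textwrap.fill(). This is an assumption-laden illustration, not the module's implementation: wrap_blocks_sketch is a hypothetical name, the blocks are written by hand rather than produced by split_into_line_wrap_blocks(), and the special handling of indented code blocks is omitted.

```python
import textwrap


def wrap_blocks_sketch(blocks, width=None):
    """Hypothetical stand-in for wrap_blocks(); ignores code-block handling."""
    wrapped = []
    for first_indent, rest_indent, text in blocks:
        if width is None:
            # Infinite line width: emit the block on a single line.
            wrapped.append(first_indent + text)
        else:
            wrapped.append(
                textwrap.fill(
                    text,
                    width=width,
                    initial_indent=first_indent,
                    subsequent_indent=rest_indent,
                )
            )
    return "\n".join(wrapped)


# Hand-written blocks: a bullet, an empty separator block, another bullet.
blocks = [
    ("* ", "  ", "one two three four"),
    ("", "", ""),
    ("* ", "  ", "five"),
]
print(wrap_blocks_sketch(blocks, width=12))
```

Because rest_indent has the same length as first_indent, continuation lines of a bullet line up under the bullet's text rather than under the bullet marker.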

wrap_paragraphs(text, width=None, wrap_indented_blocks=False)

Re-line-wrap a markdown-style string with hard-line-wrapped paragraphs, bullet points, numbered lists and code blocks (see split_into_line_wrap_blocks()).

If ‘width’ is None, assumes an infinite line width.

If ‘wrap_indented_blocks’ is False (the default), indented (markdown-style) code blocks will not be line wrapped, while other indented blocks (e.g. bullets) will be.