Commit 080a20c1 authored by Kai Willadsen

misc: Avoid string copies during filtering (bgo#768300)

When we switched over to doing better regex filtering and highlighting
of ignored regions, we changed the way we were applying filters from a
simple multiple-regex approach to a merged-span based approach. This is
fine, except that this also changed the way we sliced the existing text
to produce the filtered version.

Prior to this commit, we removed each span of matching filtered text
by concatenating the string slices on either side of it. This is
extremely slow in Python, in part because every concatenation
allocates a new string. With this patch, we use the more idiomatic
approach of collecting all of the text sections that we want to keep
and concatenating them in a single join operation at the end.
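
As a rough illustration of the difference (a minimal sketch, not the
actual Meld code), the two approaches might look like the following.
The function names are hypothetical, and the spans are assumed to be
sorted, non-overlapping (start, end) offsets into the text.

```python
def remove_spans_concat(text, spans):
    """Slow pattern: every removal copies the whole remaining text."""
    # Work backwards so earlier offsets stay valid after each cut.
    for start, end in reversed(spans):
        text = text[:start] + text[end:]
    return text


def remove_spans_join(text, spans):
    """Faster pattern: collect the kept sections, then join once."""
    kept = []
    offset = 0
    for start, end in spans:
        kept.append(text[offset:start])  # text before this span
        offset = end                     # skip over the filtered span
    kept.append(text[offset:])           # trailing text after the last span
    return "".join(kept)
```

Both functions return the same result. The first builds a new string
of nearly full length for every span, while the second copies each
kept character only a small, fixed number of times.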

The test case in bgo#768300 was previously extremely slow (I gave up
waiting), but with this change takes a few seconds.

This commit also switches up the role of the "cutter" function, which
now only applies changes rather than being expected to modify the
text. Text modification is carried out by apply_text_filters itself,
since it can do so much more efficiently.
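
One way to picture this division of labour is the sketch below. It is
an assumption-laden illustration rather than the real
apply_text_filters: the names filter_text, merge_spans and on_span are
hypothetical, only whole-regex matches are considered, and the
callback is simply told about each filtered span while the function
itself builds the filtered text.

```python
import re


def merge_spans(spans):
    """Merge overlapping or touching (start, end) spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            last_start, last_end = merged[-1]
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged


def filter_text(text, regexes, on_span=None):
    """Return text with all regex matches removed.

    on_span (hypothetical) is called once per merged span; it does not
    modify or return the text.
    """
    spans = [
        m.span()
        for regex in regexes
        for m in regex.finditer(text)
        if m.start() != m.end()  # ignore empty matches
    ]
    spans = merge_spans(spans)

    if on_span is not None:
        for start, end in spans:
            on_span(start, end)

    # Text modification happens here, with a single join at the end.
    kept = []
    offset = 0
    for start, end in spans:
        kept.append(text[offset:start])
        offset = end
    kept.append(text[offset:])
    return "".join(kept)
```

A caller could, for example, pass a callback that records or
highlights each span, and use the return value as the filtered text.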
parent 760f63ac