Hi,

Following up on this thread from last year (https://www.monetdb.org/pipermail/users-list/2016-April/009107.html): the issue was that the pcre-based regex replacement did not implement back-references in the replacement string.
Needing this feature, I have so far used the pcre implementation through an R function. It works well, but pulling in R just for this is a bit odd, especially because R-core has a pretty dumb dependency on the whole TeX distribution, and building MonetDB Docker images that have to include R and TeX just because of regex isn't a neat option.
I have also tried both the re and regex packages via Python, but this is very slow compared to pcre.

So I finally decided to put a few hours into the MonetDB implementation and added support for back-references.
They can be indicated with both \1 and $1 syntax. Back-reference 0 is the whole match. Out-of-bound back-references are empty strings.

While at it:
- pcre.replace replaces all matches, so I made a variant pcre.replacefirst to replace only the first match.
- the BAT implementation was ignored because of a missing "module batpcre" in pcre.mal, so it always fell back to manifold on the string version. Fixed that.
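
For illustration only, here is a minimal Python sketch of the semantics described above: both \N and $N are accepted, reference 0 is the whole match, out-of-bound references expand to an empty string, and a flag selects between replacing all matches and only the first. This is not the C code from the patch. The names expand_backrefs and replace_sketch are hypothetical, and accepting multi-digit references is an assumption of the sketch.

```python
import re

# Matches \N or $N in a replacement template.
# Assumption of this sketch: N may have more than one digit.
_BACKREF = re.compile(r'[\\$](\d+)')


def expand_backrefs(repl, match):
    """Expand \\N and $N in repl using the groups captured by match."""
    def group_text(ref):
        n = int(ref.group(1))
        # Reference 0 is the whole match; out-of-bound references are empty.
        if n > match.re.groups:
            return ""
        # A group that exists but did not participate yields None.
        return match.group(n) or ""
    return _BACKREF.sub(group_text, repl)


def replace_sketch(s, pattern, repl, first_only=False):
    """Replace all matches of pattern in s, or only the first one."""
    count = 1 if first_only else 0  # count=0 means "replace all" in re.sub
    return re.sub(pattern, lambda m: expand_backrefs(repl, m), s, count=count)


if __name__ == "__main__":
    print(replace_sketch("mbr_overlap", r"([a-z]+)_([a-z]+)", r"[\1]-[$2]"))
    print(replace_sketch("mbr_overlap", r"([a-z])", r"[\1]", first_only=True))
```

The replacement is built by a callback rather than passed to re.sub as a template, so that Python's own handling of \N does not interfere and an out-of-bound reference does not raise an error.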

Also, I'm not sure why no SQL functions seem to be declared for this.
I use the following:

CREATE FUNCTION pcre_replace(s string, pattern string, repl string, flags string) RETURNS string EXTERNAL NAME pcre."replace";
CREATE FUNCTION pcre_replacefirst(s string, pattern string, repl string, flags string) RETURNS string EXTERNAL NAME pcre."replace_first";

Perhaps they could come by default?

Please feel free to use and modify the attached patch (Jul2017) if you think it's useful. I've tested it quite a bit and found no issues.


Here are some examples:


sql>select name, pcre_replace(name, '([a-z]+)_([a-z]+)', '[\\1]-[\\2]', '') as replaced from sys.functions limit 2;
+-------------+---------------------------------------------+
| name        | replaced                                    |
+=============+=============================================+
| mbr_overlap | [mbr]-[overlap]                             |
| mbr_overlap | [mbr]-[overlap]                             |
+-------------+---------------------------------------------+
2 tuples (4.650ms)


sql>select name, pcre_replacefirst(name, '([a-z])', '[\\1]', '') as replaced from sys.functions limit 2;
+-------------+---------------------------------------+
| name        | replaced                              |
+=============+=======================================+
| mbr_overlap | [m]br_overlap                         |
| mbr_overlap | [m]br_overlap                         |
+-------------+---------------------------------------+
2 tuples (4.728ms)

sql>select name, pcre_replace(name, '([a-z])', '[\\1]', '') as replaced from sys.functions limit 2;
+-------------+-----------------------------------------------------------------------------------------------------+
| name        | replaced                                                                                            |
+=============+=====================================================================================================+
| mbr_overlap | [m][b][r]_[o][v][e][r][l][a][p]                                                                     |
| mbr_overlap | [m][b][r]_[o][v][e][r][l][a][p]                                                                     |
+-------------+-----------------------------------------------------------------------------------------------------+
2 tuples (10.360ms)