# re match.group(1) raises IndexError in Sage, works in pure Python

In sage -python:

>>> import re
>>> ST = "foobarsws"
>>> matches = re.match("(?:foo)?(bar)?(sws)?",ST)
>>> matches.groups()
('bar', 'sws')
>>> matches.group(0)
'foobarsws'
>>> matches.group(1)
'bar'
>>> matches.group(2)
'sws'


In Sage:

sage: import re
sage: ST = "foobarsws"
sage: matches = re.match("(?:foo)?(bar)?(sws)?",ST)
sage: matches.groups()
('bar', 'sws')
sage: matches.group()
'foobarsws'
sage: matches.group(0)
IndexError: no such group
sage: matches.group(1)
IndexError: no such group
sage: matches.group(2)
IndexError: no such group


Any ideas?


This one has gotten me, too. It's very annoying.

(2012-06-19 02:45:48 +0100)

See [here](http://www.sagemath.org/doc/faq/faq-usage.html#i-have-type-issues-using-scipy-cvxopt-or-numpy-from-sage).

(2012-06-19 06:39:28 +0100)

@bk322 - right, if I'd gotten a type error I would have thought of this, but the IndexError is what made it so weird, because that is exactly the error you'd expect when you access a nonexistent group (as matches.group(3) would be above, I guess).

(2012-06-19 09:35:55 +0100)


Never mind.

sage: matches.group(int(1))
'bar'


Leaving this up for those who might stumble. I'll try to think of a better title for the question so people find it.

This seems bad, though. The preparser turns the literal 1 into a Sage Integer, and re doesn't accept that as a group number. I suppose there isn't any way for the preparser to notice when we are calling "pure Python" code and, if it gets an error of some kind, to at least retry without the Integer conversion? Maybe that would lead to more chaos than we'd like.
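For anyone who wants to see the effect without Sage, here is a rough pure-Python sketch. FakeInteger is a made-up stand-in, not Sage's Integer: it only assumes that the wrapped value is neither a plain int nor otherwise usable as an index, which is how the situation above behaves.

```python
import re


class FakeInteger:
    """Hypothetical stand-in for a wrapped integer; NOT Sage's Integer."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # Allows int(FakeInteger(1)), mirroring the int(1) workaround.
        return self.value


matches = re.match("(?:foo)?(bar)?(sws)?", "foobarsws")

try:
    # Not a plain int, so re does not treat it as a group number.
    matches.group(FakeInteger(1))
    outcome = "no error"
except IndexError as exc:
    outcome = "IndexError: {}".format(exc)

print(outcome)
# Converting back to a plain int restores the expected lookup.
print(matches.group(int(FakeInteger(1))))
```

The explicit int(...) call plays the same role as matches.group(int(1)) at the Sage prompt.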


Or, shorter: matches.group(1r). The r suffix tells the preparser to leave the literal as a raw Python int.

(2013-08-27 23:51:23 +0100)

For others that stumble across this question, you can turn the Sage preparser off by calling preparser(False). Then the code above works just fine at the Sage command line:

sage: preparser(False)
sage: import re
sage: ST = "foobarsws"
sage: matches = re.match("(?:foo)?(bar)?(sws)?",ST)
sage: matches.groups()
('bar', 'sws')
sage: matches.group()
'foobarsws'
sage: matches.group(0)
'foobarsws'
sage: matches.group(1)
'bar'


To some extent, this might be a flaw in Python's interface. There's an ambiguity in what matches.group is supposed to do:

matches.group(i)


should have the same effect as (for a group number i >= 1)

matches.groups()[i-1]


or as

matches.groupdict()[i]


depending on the nature of i. As it happens, the C code decides based on whether i passes PyInt_Check (i.e., really is a Python int). Integer(1) is a Sage Integer, not a Python int, so it fails that check and is interpreted as a key into the explicitly named groups. Unfortunately, the rules for what a named group can be called are rather strict, so an Integer can never be valid as such a key:

sage: matches = re.match("(?:foo)(?P<1>bar)(?P<2>sws)?",ST)
error: bad character in group name


It's documented that group names must follow the same rules as valid Python identifiers. Valid use is:

sage: matches = re.match("(?:foo)(?P<a>bar)(?P<b>sws)?",ST)
sage: matches.group('a')
'bar'
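
As a quick check of the two kinds of lookup, this pure-Python sketch (run outside Sage, so no preparser is involved) compares group() against groups() and groupdict(), and confirms that a numeric group name is rejected at compile time. The variable names are just for illustration.

```python
import re

ST = "foobarsws"
matches = re.match("(?:foo)(?P<a>bar)(?P<b>sws)?", ST)

# Numbered lookup: group(i) lines up with groups()[i - 1].
by_number = matches.group(1)
by_position = matches.groups()[0]

# Named lookup: group(name) lines up with groupdict()[name].
by_name = matches.group("a")
by_dict = matches.groupdict()["a"]

print(by_number, by_position, by_name, by_dict)

# A group name must be a valid identifier, so "1" is refused.
try:
    re.compile("(?P<1>bar)")
    compiled = True
except re.error:
    compiled = False

print(compiled)
```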


So the problem is a muddled interface: matches.group(i) must choose whether to interpret i as an integer or as a string. There is of course no good answer if i is neither, and indeed the code hardly tries anything.

In the last example, matches.group(i) should work for any i that compares equal to 'a' or 'b' and hashes consistently with it.
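
A small sketch of that last remark, assuming CPython's behaviour of looking the key up in the pattern's name-to-number dictionary. NameLike is a hypothetical class invented for this illustration; it is neither a str nor an int, but it compares equal to a group name and has the same hash.

```python
import re


class NameLike:
    """Hypothetical key object that equals, and hashes like, a string."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return self.name == other

    def __hash__(self):
        # Same hash as the underlying string, so dict lookups can find it.
        return hash(self.name)


matches = re.match("(?:foo)(?P<a>bar)(?P<b>sws)?", "foobarsws")

# Looked up by equality and hash among the named groups.
via_object = matches.group(NameLike("a"))
print(via_object)

# A name that matches no group still ends in IndexError.
try:
    matches.group(NameLike("c"))
    missing = "no error"
except IndexError:
    missing = "IndexError"
print(missing)
```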
