I was actually working on something similar recently (Regular Expression Replacement in Sage). I got my function working and modified it to suit your needs.
Please see the code and test output below or in this Sage Worksheet.
Note that this is only a work-in-progress solution, far from a catch-all. Your LaTeX will likely be much more complicated than the sample equation you posted, and the regular expression replacer needs to be made aware of all those syntactic nuances (especially pairs of enclosing delimiters). You can do this by altering the OrderedDict adict in class make_xlat or by passing extra rules as the udict argument to latex2func(s, udict). Also note that the replacer tries all of the rules against the LaTeX string at once, as a single combined pattern. So in its current form, there is no way to tell the replacer to find and replace one thing before another.
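One way around the ordering limitation is to apply the rules one at a time instead of as a single alternation. The following is a minimal sketch, independent of latex2func; the names ORDERED_RULES and translate_in_order and the \frac rule are made up for illustration, and the rule set is nowhere near complete.

```python
import re

# Hypothetical rule list: each (pattern, replacement) pair is applied to the
# whole string in order, so earlier rules can prepare the text for later ones.
ORDERED_RULES = [
    (r'\\left|\\right', ''),                               # drop sizing commands
    (r'\\frac\{([^{}]*)\}\{([^{}]*)\}', r'((\1)/(\2))'),   # needs the braces intact
    (r'\\cdot', '*'),
    (r'\^', '**'),
    (r'\s+', ''),
    (r'\{', '('),
    (r'\}', ')'),
    (r'(\d)([A-Za-z(])', r'\1*\2'),                        # implicit product: 31x -> 31*x
]

def translate_in_order(s, rules=ORDERED_RULES):
    """Run every rule over the full string before moving on to the next rule."""
    for pattern, repl in rules:
        s = re.sub(pattern, repl, s)
    return s

print(translate_in_order(r'f\left(x\right)=x^{3}-10 \cdot x^{2}+31x-30'))
```

Here the \frac rule runs while the braces are still present, and the implicit-multiplication rule runs last, after whitespace has been stripped. The trade-off is one pass over the string per rule, and nested constructs such as a fraction inside a fraction are still not handled.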
What you really need is an all-encompassing grammar for LaTeX equations. As @fredericc said, this is not a simple thing to do, but if you feel so inclined, I have found pyparsing to be quite capable in the past. Other popular parsing packages include PLY and PyBison.
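To give a feel for the grammar approach, here is a small hand-written recursive-descent evaluator in plain Python for a tiny subset of LaTeX: integers, single-letter variables, +, -, \cdot, implicit multiplication, ^, and (...) or {...} groups. It does not use pyparsing, PLY, or PyBison; those packages would let you state a grammar like this declaratively. All names (tokenize, Parser, evaluate_latex) are hypothetical, and the grammar itself is only an assumption about what a useful subset might look like.

```python
import re

TOKEN_RX = re.compile(r'\\left|\\right|\\cdot|\d+|[A-Za-z]|[-+*^(){}]')

def tokenize(s):
    # Split the input into tokens; \left and \right carry no meaning here.
    s = s.replace(' ', '')
    tokens, pos = [], 0
    while pos < len(s):
        m = TOKEN_RX.match(s, pos)
        if not m:
            raise ValueError('unexpected input at: ' + s[pos:])
        if m.group() not in (r'\left', r'\right'):
            tokens.append(m.group())
        pos = m.end()
    return tokens

class Parser(object):
    def __init__(self, tokens, env):
        self.tokens, self.pos, self.env = tokens, 0, env

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError('expected {!r}, got {!r}'.format(expected, tok))
        self.pos += 1
        return tok

    def expr(self):
        # expr := ['-'] term (('+' | '-') term)*
        negate = self.peek() == '-'
        if negate:
            self.take()
        value = -self.term() if negate else self.term()
        while self.peek() in ('+', '-'):
            if self.take() == '+':
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term := factor (['\cdot' | '*'] factor)*  (juxtaposition multiplies)
        value = self.factor()
        while True:
            tok = self.peek()
            if tok in (r'\cdot', '*'):
                self.take()
            elif tok is None or tok in ('+', '-', ')', '}'):
                return value
            value *= self.factor()

    def factor(self):
        # factor := atom ['^' atom]
        base = self.atom()
        if self.peek() == '^':
            self.take()
            return base ** self.atom()
        return base

    def atom(self):
        # atom := NUMBER | NAME | '(' expr ')' | '{' expr '}'
        tok = self.take()
        if tok.isdigit():
            return int(tok)
        if tok.isalpha():
            return self.env[tok]
        if tok in ('(', '{'):
            value = self.expr()
            self.take(')' if tok == '(' else '}')
            return value
        raise ValueError('unexpected token {!r}'.format(tok))

def evaluate_latex(s, **env):
    # Only the right-hand side of 'f(x)=...' is parsed.
    parser = Parser(tokenize(s.split('=', 1)[-1]), env)
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError('trailing input: {!r}'.format(parser.peek()))
    return value

print(evaluate_latex(r'f\left(x\right)=x^{3}-10 \cdot x^{2}+31x-30', x=2))
```

Because the parser knows which closing delimiter belongs to which opening one, mismatched pairs raise an error instead of silently producing a wrong expression. It also evaluates directly, so no eval is involved.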
import re
from collections import OrderedDict

def latex2func(s, udict=None):
    class make_xlat(object):
        def __init__(self, *args, **kwargs):
            # Keys whose value has no backreference are literal strings;
            # keys whose value contains \1, \2, ... are regular expressions.
            self.adict = OrderedDict((
                (r'\left'                    , ''),
                (r'\right'                   , ''),
                (r'\cdot'                    , '*'),
                ('^'                         , '**'),
                (' '                         , ''),
                ('{'                         , '('),
                ('}'                         , ')'),
                (r'(\d+)([A-Za-z]\d*\w*)'    , r'\1*\2'),
                (r'\((\d+)\)'                , r'\1'),
            ))
            self.adict.update(*args, **kwargs)
            self.grp_lookup = self.make_grp_lookup()
            self.rx = self.make_rx()

        def make_grp_lookup(self):
            # Map each capture-group index of the combined pattern to its rule.
            grps = {}
            for k, v in self.adict.items():
                i = max(grps.keys()) + 1 if grps else 0
                grps.update({i + j: k for j in range(len(re.findall(r'\\\d+', v)))})
            return grps

        def make_rx(self):
            # Join all rules into one alternation, escaping the literal keys.
            l = []
            for k, v in self.adict.items():
                if not re.search(r'\\\d+', v):
                    l.append(re.escape(k))
                else:
                    l.append(k)
            return re.compile('|'.join(l))

        def one_xlat(self, match):
            try:  # Simple string replacement
                return self.adict[match.group()]
            except KeyError:  # Subpattern group replacement
                i, m = zip(*[(i, v) for i, v in enumerate(match.groups()) if v is not None])
                # Assumes all indices belong to the same rule; if not, something went wrong
                t = self.adict[self.grp_lookup[i[0]]]
                for i, g in enumerate(m):
                    t = t.replace(r'\{}'.format(i + 1), g)
                return t

        def __call__(self, txt):
            return self.rx.sub(self.one_xlat, txt)

    translate = make_xlat(udict or {})
    # Repeat until a pass changes nothing (compare by value, not identity).
    while True:
        t = translate(s)
        if t == s:
            break
        s = t
    s = s.split('=')
    args = re.sub(r'.*\((.*)\)', r'\1', s[0])
    eqn = s[1]
    return eval('lambda {}: {}'.format(args, eqn))
_ = var('x')
tests = {
    r'f\left(x\right)=x^{3}-10 \cdot x^{2}+31x-30' : x**3-10*x**2+31*x-30,
}
for k, v in tests.items():
    show('$ {} $'.format(k), display=False)
    for _x in range(11):
        print('x = {}'.format(_x))
        print(' LaTeX Implementation: {}'.format(latex2func(k)(_x)))
        print(' Sage Implementation: {}'.format(v.substitute(x=_x)))
f(x)=x^3−10⋅x^2+31x ...
Well, LaTeX syntax is often ambiguous and imprecise, which makes it insufficient as input for computer algebra.