I would expect A[i][j] to be slower because it first has to create the vector A[i] and then extract an element from it, whereas A[i,j] does a single extraction without creating intermediate objects. There is overhead in constructing a mathematical object like a vector. Also, A[i,j] calls a single method, A.__getitem__, whereas A[i][j] performs two calls: it calls A.__getitem__ once to form the vector v = A[i], and then calls v.__getitem__ on that vector.
So if you care about performance, use A[i,j]. If you need the intermediate vector, use A[i].
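To make the two call paths concrete, here is a minimal pure-Python sketch. The `Matrix` and `Vector` classes are hypothetical stand-ins written for this illustration, not Sage's actual classes, and the counter `vectors_built` is an assumption added only to make the intermediate object visible. It models the dispatch described above, not Sage's real implementation or its timings.

```python
class Vector:
    """Hypothetical stand-in for a row vector (not Sage's class)."""

    def __init__(self, entries):
        # Copying the entries models the cost of constructing a new object.
        self.entries = list(entries)

    def __getitem__(self, j):
        return self.entries[j]


class Matrix:
    """Hypothetical stand-in for a matrix (not Sage's class)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
        self.vectors_built = 0  # counts intermediate Vector objects created

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # A[i, j]: one call, direct lookup, no intermediate object.
            i, j = key
            return self.rows[i][j]
        # A[i]: build and return a whole row vector.
        self.vectors_built += 1
        return Vector(self.rows[key])


A = Matrix([[1, 2], [3, 4]])

direct = A[1, 0]    # single Matrix.__getitem__ call
built_after_direct = A.vectors_built

chained = A[1][0]   # Matrix.__getitem__, then Vector.__getitem__
built_after_chained = A.vectors_built
```

Both expressions return the same entry, but only the chained form constructs a row object along the way. In an actual Sage session, you could compare the two forms with `timeit` on your own matrix to see how large the difference is in practice.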
Regarding the ellipsis notation: like much of Sage, it is designed for mathematicians. In mathematical writing, "1, 2, ..., 10" includes 10. By contrast, range is not a standard notion in mathematics, so there was no reason to change Python's behavior, and srange is Sage's version of range, so again there was no reason to deviate from Python. But mathematicians will expect [2, 4, .., 10] to include 10.
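The difference between the two conventions can be sketched in plain Python. The helper `inclusive_range` below is a hypothetical name introduced for this illustration; it only mimics the endpoint behavior of [2, 4, .., 10] for integers and is not how Sage implements the notation.

```python
def inclusive_range(first, second, last):
    """Hypothetical helper: integers first, second, ... up to and including last.

    The step is inferred from the first two terms, as in "2, 4, ..., 10".
    """
    step = second - first
    if step == 0:
        raise ValueError("first and second must differ")
    # Push the stop value one past `last` in the direction of travel,
    # so that `last` itself is included when the step lands on it.
    stop = last + (1 if step > 0 else -1)
    return list(range(first, stop, step))


python_style = list(range(2, 10, 2))      # endpoint excluded
math_style = inclusive_range(2, 4, 10)    # endpoint included
```

Here `python_style` stops at 8, while `math_style` ends with 10, which is what a reader of "2, 4, ..., 10" would expect.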