character encoding
How can I change the character encoding in the Sage notebook? I'm Hungarian, and in strings I need characters like á, ő, ű, etc., not \xc3, \xc5 and so on.
Apparently for now you may have to use the print command. print u"gömböc" works, indeed even without the u. See e.g. http://stackoverflow.com/questions/10...
To change the character encoding in a Sage notebook or any Python environment, you typically need to ensure that your source code files are saved with the correct encoding and that you handle string literals properly. For example, you can write the characters as Unicode escape sequences:
```python
text = "This is an example with special characters: \u00E1, \u0151, \u0171"
print(text)
```
By contrast, prefixing a string literal with the letter "r" makes it a raw string. Raw strings do not interpret escape sequences and treat backslashes literally, so the example below prints the text \u00E1, \u0151, \u0171 as written rather than the accented characters:
```python
text = r"This is an example with special characters: \u00E1, \u0151, \u0171"
print(text)
```
By using these techniques, you can ensure that your Hungarian characters are properly represented in your Sage notebook or Python code, allowing you to work with them directly without relying on escape sequences like \xc3 or \xc5.
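To make the difference between ordinary and raw string literals concrete, here is a minimal sketch. It assumes Python 3 (which current Sage versions are based on), where every str is a Unicode string; the variable names are only for illustration.

```python
# Python 3 sketch: escape sequences vs. raw strings.
escaped = "\u00E1, \u0151, \u0171"   # escapes are interpreted as characters
raw = r"\u00E1, \u0151, \u0171"      # backslashes are kept literally
literal = "á, ő, ű"                  # characters typed directly

print(escaped)             # á, ő, ű
print(raw)                 # \u00E1, \u0151, \u0171
print(escaped == literal)  # True
print(len(escaped), len(raw))  # 7 22
```

So the escaped form and the directly typed form are the same string, while the raw form is just text containing backslashes.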
EDITED. My original answer, "You can prefix a string with the letter u to mark it as a unicode string, e.g. u'gömböc'.", was not so helpful. The notes below are maybe more a related discussion than a proper answer.
You can prefix a string with the letter u to mark it as a unicode string.
If you are inputting unicode characters, this will affect how the string is encoded.
Here is what I get in the Sage REPL.
sage: 'Erdős'
'Erd\xc5\x91s'
sage: u'Erdős'
u'Erd\u0151s'
This shows a difference in the escape codes used for accented characters.
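For what it's worth, the two kinds of escape codes correspond to two views of the same text: \xc5\x91 are the UTF-8 bytes of ő, while \u0151 is its Unicode code point. A small sketch, assuming Python 3 (where the byte view has to be requested explicitly with encode):

```python
# Python 3 sketch: bytes (UTF-8) vs. code points for the same text.
name = "Erdős"
utf8_bytes = name.encode("utf-8")

print(utf8_bytes)                  # b'Erd\xc5\x91s'
print(ascii(name))                 # 'Erd\u0151s'
print(hex(ord("ő")))               # 0x151
print(len(name), len(utf8_bytes))  # 5 6
```

The string has 5 characters but 6 bytes, because ő takes two bytes in UTF-8.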
Apparently @kcrisman's indication to use print is the key to properly displaying unicode strings.
sage: print 'Erdős'
Erdős
sage: print u'Erdős'
Erdős
The role of the u prefix is not so apparent here.
The u prefix is useful if you are using unicode escape codes in the string.
sage: print 'Erd\u0151s'
Erd\u0151s
sage: print u'Erd\u0151s'
Erdős
The other version:
sage: print 'Erd\xc5\x91s'
Erdős
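The same conversions can be done explicitly. This is a sketch assuming Python 3, where byte strings need a b prefix and have to be decoded before they behave as text; the variable names are made up for the example.

```python
# Python 3 sketch: getting from escaped forms back to readable text.
utf8_bytes = b"Erd\xc5\x91s"        # UTF-8 bytes, as in 'Erd\xc5\x91s' above
text = utf8_bytes.decode("utf-8")   # interpret the bytes as UTF-8
print(text)                         # Erdős
print(text == "Erd\u0151s")         # True

# Text that literally contains a backslash followed by u0151
# (as printed by the non-u version above) can be converted
# with the standard unicode_escape codec.
literal_escape = r"Erd\u0151s"
print(literal_escape)               # Erd\u0151s
print(literal_escape.encode("ascii").decode("unicode_escape"))  # Erdős
```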
A character encoding tells the computer how to interpret raw zeroes and ones as characters. It usually does this by pairing numbers with characters. Words and sentences in text are built from characters, and these characters are grouped into a character set. There are many different character encodings in use at present, but the ones we deal with most frequently are ASCII, 8-bit encodings, and Unicode-based encodings.
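As a rough illustration of "pairing numbers with characters", the sketch below (assuming Python 3) shows the Unicode code point of each Hungarian letter from the question and how it is stored under a Unicode-based encoding (UTF-8) and an 8-bit one. ISO-8859-2 is picked here only as an example of an 8-bit encoding that covers Hungarian; it is not mentioned in the thread.

```python
# Python 3 sketch: one character, several numeric representations.
for ch in "áőű":
    code_point = ord(ch)                    # number assigned by Unicode
    as_utf8 = ch.encode("utf-8")            # Unicode-based encoding
    as_latin2 = ch.encode("iso-8859-2")     # 8-bit encoding (one byte per char)
    print(ch, hex(code_point), as_utf8, as_latin2)

# ASCII only covers code points below 128, so it cannot encode ő.
try:
    "ő".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("ő fits in ASCII:", ascii_ok)         # ő fits in ASCII: False
```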
Mercal
Asked: 2015-04-16 10:57:45 +0100
Last updated: Sep 15 '23
See also the presumably identical http://ask.sagemath.org/question/8249...
And see also https://groups.google.com/forum/#!top...