
character encoding

asked 2015-04-16 10:57:45 +0200

czsan

How can I change the character encoding in the Sage notebook? I'm Hungarian, and in strings I need characters like á, ő, ű, etc., not \xc3, \xc5 and so on.



See also the presumably identical

kcrisman ( 2015-04-16 16:48:15 +0200 )
kcrisman ( 2015-04-28 20:29:22 +0200 )

3 Answers


answered 2015-04-16 16:53:19 +0200

kcrisman

Apparently for now you may have to use the print command.

print u"gömböc"

works, indeed even without the u. See e.g.




You are right, print is the key to displaying a string properly. Prefixing u to the string makes it a unicode string instead of a byte string, which might matter for the subsequent use of the string.

slelievre ( 2015-04-16 19:08:42 +0200 )
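To illustrate why the choice might matter for later use, here is a minimal sketch in Python 3 syntax. This is an assumption on my side: the thread uses Python 2, where a plain 'gömböc' is a byte string and u'gömböc' is a unicode string; in Python 3 the types bytes and str play those two roles. The variable names are only illustrative.

```python
# Python 3 sketch: bytes stands in for Python 2's plain str,
# str stands in for Python 2's unicode.
text = "gömböc"                # text string: 6 characters
raw = text.encode("utf-8")     # UTF-8 byte string: each ö takes 2 bytes

print(len(text))   # 6
print(len(raw))    # 8

# Slicing the text keeps characters whole ...
print(text[:2])    # gö
# ... while slicing the bytes can cut a character in half.
print(raw[:2])     # b'g\xc3'
```

So lengths, indices and slices differ between the two kinds of string, even though both print the same way.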

answered 2015-04-16 12:14:31 +0200

slelievre

updated 2015-04-17 13:27:53 +0200

EDITED. My original answer

You can prefix a string with the letter u to mark it as a unicode string, e.g. u'gömböc'.

was not so helpful. The notes below are maybe more a related discussion than a proper answer.

You can prefix a string with the letter u to mark it as a unicode string. If you are inputting unicode characters, this will affect how the string is encoded.

Here is what I get in the Sage REPL.

sage: 'Erdős'
sage: u'Erdős'

This shows a difference in the escape codes used for accented characters.
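For readers on a Python 3 based Sage, the two kinds of escape codes can be reproduced roughly as follows. This is a sketch under my own assumptions: the string is encoded as UTF-8, and ascii() is used to mimic how Python 2 echoed a unicode string.

```python
name = "Erd\u0151s"                 # the text string Erdős (ő is U+0151)
utf8_bytes = name.encode("utf-8")   # the same name as UTF-8 bytes

# Echoing a byte string shows one escape per byte of the UTF-8 encoding.
print(repr(utf8_bytes))   # b'Erd\xc5\x91s'
# ascii() shows one escape per character, using its Unicode code point.
print(ascii(name))        # 'Erd\u0151s'
# print() writes the characters themselves.
print(name)
```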

Apparently @kcrisman's indication to use print is the key to properly displaying unicode strings.

sage: print 'Erdős'
sage: print u'Erdős'

The role of the u prefix is not so apparent here.

The u is useful if you are using unicode escape codes in the string.

sage: print 'Erd\u0151s'
sage: print u'Erd\u0151s'

The other version:

sage: print 'Erd\xc5\x91s'
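
The behaviour of the two escape styles can also be checked in Python 3 syntax. In this sketch (Python 3 is my assumption; the commands above are Python 2), \u0151 is only interpreted inside a text literal, while \xc5\x91 are the two UTF-8 bytes of the same character. The variable names are illustrative.

```python
expected = "Erd" + chr(0x151) + "s"   # Erdős, built from the code point

# \u0151 inside a text literal is a single character.
from_escape = "Erd\u0151s"

# In a byte string the six characters \u0151 stay literal (as in a
# Python 2 string without the u prefix); the unicode_escape codec
# interprets them explicitly.
literal = b"Erd\\u0151s"
from_literal = literal.decode("unicode_escape")

# \xc5\x91 is the UTF-8 encoding of the same character.
from_utf8 = b"Erd\xc5\x91s".decode("utf-8")

print(len(literal))   # 10
print(from_escape == expected, from_literal == expected, from_utf8 == expected)
# True True True
```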


The response is u'g\xf6mb\xf6c'

czsan ( 2015-04-16 12:26:16 +0200 )

answered 2015-08-07 09:24:42 +0200

A character encoding tells the computer how to interpret raw zeroes and ones as characters. It usually does this by pairing numbers with characters. Words and sentences in text are built from characters, and these characters are grouped into a character set. Many character encodings are in use at present, but the ones we deal with most frequently are ASCII, 8-bit encodings, and Unicode-based encodings.
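
As a small illustration of pairing numbers with characters, the following Python 3 sketch encodes two of the Hungarian letters from the question in a few encodings. The choice of encodings (Latin-1, UTF-8, ISO 8859-2) and the variable names are mine, for illustration only.

```python
o_umlaut = "\u00f6"         # ö
o_double_acute = "\u0151"   # ő

# Each character has a Unicode code point (a number) ...
print(hex(ord(o_umlaut)), hex(ord(o_double_acute)))   # 0xf6 0x151

# ... and each encoding maps that number to one or more bytes.
print(o_umlaut.encode("latin-1"))        # b'\xf6'
print(o_umlaut.encode("utf-8"))          # b'\xc3\xb6'
print(o_double_acute.encode("utf-8"))    # b'\xc5\x91'

# An 8-bit encoding covers at most 256 characters: Latin-1 has no ő,
# while ISO 8859-2 (Central European) stores it in a single byte.
try:
    o_double_acute.encode("latin-1")
    in_latin1 = True
except UnicodeEncodeError:
    in_latin1 = False
print(in_latin1)                                   # False
print(len(o_double_acute.encode("iso-8859-2")))    # 1
```

This also suggests where the \xc3 and \xc5 in the question come from: they are the first bytes of UTF-8 encoded accented letters, shown as escapes instead of being decoded.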




Asked: 2015-04-16 10:57:45 +0200

Seen: 1,831 times

Last updated: Aug 07 '15