2

character encoding

asked 2015-04-16 03:57:45 -0500

czsan

How can I change the character encoding in the Sage notebook? I'm Hungarian, and in strings I need characters like á, ő, ű, etc., not \xc3, \xc5 and so on.


Comments

See also the presumably identical http://ask.sagemath.org/question/8249...

kcrisman ( 2015-04-16 09:48:15 -0500 )

3 answers

1

answered 2015-04-16 09:53:19 -0500

kcrisman

Apparently for now you may have to use the print command.

print u"gömböc"

works, indeed even without the u. See e.g. http://stackoverflow.com/questions/10...


Comments

1

You are right, print is the key to printing a string properly. Prefixing u to the string changes how it is encoded, which might matter for the subsequent use of the string.

slelievre ( 2015-04-16 12:08:42 -0500 )
0

answered 2015-08-07 02:24:42 -0500

A character encoding tells the computer how to interpret raw zeroes and ones as characters, usually by pairing numbers with characters. Words and sentences in text are built from characters, and these characters are grouped into a character set. Many character encodings are in use, but the ones encountered most often are ASCII, 8-bit encodings, and Unicode-based encodings. More about: Character Encoding

Mercal

0

answered 2015-04-16 05:14:31 -0500

updated 2015-04-17 06:27:53 -0500

EDITED. My original answer

You can prefix a string with the letter u to mark it as a unicode string, e.g. u'gömböc'.

was not so helpful. The notes below are maybe more a related discussion than a proper answer.

You can prefix a string with the letter u to mark it as a unicode string. If you are inputting unicode characters, this affects how the string is stored: as Unicode code points rather than as encoded bytes.

Here is what I get in the Sage REPL.

sage: 'Erdős'
'Erd\xc5\x91s'
sage: u'Erdős'
u'Erd\u0151s'

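This shows a difference in the escape codes used for accented characters: without the u, the ő appears as its two UTF-8 bytes (\xc5\x91); with the u, it appears as the single code point \u0151.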
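The two results in the session above are the same text in two representations. As a minimal sketch, assuming Python 3 (where every str is Unicode and bytes is a separate type, unlike the Python 2 session shown in this answer), the relationship between them can be checked like this; the variable names are illustrative only:

```python
# Assumes Python 3; names are illustrative, not part of the original answer.
name = 'Erd\u0151s'                # the text 'Erdős', written with the code point U+0151

utf8_bytes = name.encode('utf-8')  # the byte form that the plain 'Erdős' literal echoed
print(utf8_bytes == b'Erd\xc5\x91s')
print(utf8_bytes.decode('utf-8') == name)
print(len(name), len(utf8_bytes))  # number of characters, then number of bytes
```

The character count and the byte count differ because ő takes two bytes in UTF-8.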

Apparently @kcrisman's suggestion to use print is the key to properly displaying unicode strings.

sage: print 'Erdős'
Erdős
sage: print u'Erdős'
Erdős

The role of the u prefix is not so apparent here.

The u is useful if you are using unicode escape codes in the string.

sage: print 'Erd\u0151s'
Erd\u0151s
sage: print u'Erd\u0151s'
Erdős

The other version, with the UTF-8 byte escapes in a plain string:

sage: print 'Erd\xc5\x91s'
Erdős
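The escape-code examples above reach the same text by two routes: a \u escape interpreted inside a unicode string, and UTF-8 bytes that get decoded on display. A hedged sketch of the same idea, assuming Python 3 (not the Python 2 session used in this answer), with a raw string standing in for the case where \u is left uninterpreted:

```python
# Assumes Python 3, where text (str) and bytes are distinct types.
from_escape = 'Erd\u0151s'                     # \u escape, interpreted in a text literal
from_bytes = b'Erd\xc5\x91s'.decode('utf-8')   # UTF-8 bytes, decoded explicitly
literal_backslash = r'Erd\u0151s'              # raw string: backslash-u stays as typed

print(from_escape == from_bytes)   # both routes give the same text
print(literal_backslash)           # the escape is shown literally, not as ő
```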

Comments

The response is u'g\xf6mb\xf6c'

czsan ( 2015-04-16 05:26:16 -0500 )
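A note on that response: it is the escaped representation of the string, not a different string. In a unicode string, \xf6 denotes the code point U+00F6, which is ö. A minimal sketch, assuming Python 3, where ascii() is used here as a stand-in for the escaped echo of the Python 2 session (without the u prefix):

```python
# Assumes Python 3; ascii() stands in for the escaped echo seen in the Python 2 session.
word = 'g\xf6mb\xf6c'      # \xf6 is the code point U+00F6, i.e. ö

print(word == 'gömböc')    # the escaped and the typed spelling are the same string
escaped = ascii(word)      # escaped representation, ASCII characters only
print(escaped)
print(word)                # displays the accented letters on a UTF-8 terminal
```

Only the representation differs; print shows the characters themselves.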

