Handling unicode characters

arawra
Senior Member
Posts: 190
Joined: Fri Jun 21, 2013 6:51 am

Postby arawra » Sat May 03, 2014 2:45 pm

I've run into a bit of a pain with pickling and loading pickles. When I store information that includes a player's name, I believe I need to make sure the strings are properly formatted so I don't get errors when the pickle is loaded.

This happened when I was trying to load a pickle from ES:P that had names with unicode characters in them. I was able to go through and format them so that the pickle loads in Python 3, but it seems strange that the pickle would load in Python 2 and not in Python 3, where all strings are unicode by default.

I'm wondering what causes this behavior; I'm assuming it has to do with how the file is processed by open().

I'd also like to know how to handle names with unicode in SP, as I believe it was updated to Python 3.
L'In20Cible
Project Leader
Posts: 1533
Joined: Sat Jul 14, 2012 9:29 pm
Location: Québec

Postby L'In20Cible » Sat May 03, 2014 5:53 pm

Well, it is hard to say what the main problem is without any code to reproduce it. I did some testing on my side and it is working just fine...

Postby arawra » Sat May 03, 2014 7:42 pm


Postby L'In20Cible » Sat May 03, 2014 8:23 pm

Syntax: Select all

>>> data = pickle.load(open('playerdict.txt','rb'), encoding='utf-8')
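As a hedged illustration of the call above: the `encoding` argument tells Python 3 how to decode the 8-bit strings stored in a Python 2 pickle, and its default is ASCII, which is why names with non-ASCII characters fail to load. The sketch assumes the names were stored as UTF-8. `PY2_PICKLE` is a hand-written stand-in for the real file and `load_py2_pickle` is a hypothetical helper; neither comes from the thread.

```python
import pickle

# Hand-written protocol-2 pickle equivalent to the Python 2 dict
# {'name': 'Qu\xc3\xa9bec'} (8-bit strings holding UTF-8 bytes).
# Assumption: a stand-in for the contents of playerdict.txt.
PY2_PICKLE = b"\x80\x02}U\x04nameU\x07Qu\xc3\xa9becs."


def load_py2_pickle(raw, encoding="utf-8"):
    """Unpickle Python 2 data, decoding its 8-bit strings with `encoding`."""
    return pickle.loads(raw, encoding=encoding)


# With the default encoding (ASCII) the non-ASCII bytes cannot be decoded.
default_error = None
try:
    pickle.loads(PY2_PICKLE)
except UnicodeDecodeError as exc:
    default_error = exc

# Naming the encoding the strings were stored in makes the load succeed.
data = load_py2_pickle(PY2_PICKLE)
```

If the original strings were written in another encoding (for example latin-1), that name would have to be passed instead; UTF-8 is only the assumption made here.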

Postby arawra » Sat May 03, 2014 9:03 pm

Still not a good solution :\

For now, I just pickled a second dictionary that converted or replaced the unicode characters.

Syntax: Select all

>>> for x,y in data.items():
...     if 'name' in data[x]: print(data[x]['name'])
...
name1
name2
name3
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "D:\Python33\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 3-7: character maps to <undefined>
>>>
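One way to build the kind of sanitized second dictionary mentioned above is sketched here, assuming it is acceptable to drop accents and to replace any other non-ASCII character with `?`. `to_ascii`, `sanitize_names` and the sample `players` data are hypothetical; only the dict-of-dicts layout with a `'name'` key is taken from the session above.

```python
import unicodedata


def to_ascii(name):
    """Strip accents, then replace any remaining non-ASCII character with '?'."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.encode("ascii", errors="replace").decode("ascii")


def sanitize_names(data):
    """Return a copy of `data` whose 'name' values are plain ASCII."""
    clean = {}
    for key, info in data.items():
        info = dict(info)  # shallow copy so the original entry is untouched
        if "name" in info:
            info["name"] = to_ascii(info["name"])
        clean[key] = info
    return clean


# Hypothetical sample data in the same shape as the pickled dictionary.
players = {
    "STEAM_0:0:1": {"name": "Qu\u00e9bec"},
    "STEAM_0:0:2": {"name": "\u2605pro\u2605", "kills": 3},
}
clean = sanitize_names(players)
```

This loses information, so it only suits display or logging; the original dictionary should stay the source of truth.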

Postby L'In20Cible » Sat May 03, 2014 9:21 pm

This will open your file in UTF-8 but it won't change the encoding of the strings it contains. print encodes its output with sys.stdout.encoding, which is a console code page (cp850 for example; your traceback shows cp437), and that code page cannot represent those characters. To print such a name you will have to encode it yourself, with an error handler that replaces whatever the console cannot display.
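The error in the traceback comes from print encoding text for the console, not from pickle. A minimal sketch of one way around it follows, assuming it is acceptable to show escape sequences for characters the console code page lacks. `console_safe` is a hypothetical helper, not something from the thread or from Source.Python.

```python
import sys


def console_safe(text, encoding=None):
    """Return `text` with characters the console cannot encode escaped.

    `encoding` defaults to the console encoding; ASCII is used as a
    fallback when stdout does not report one.
    """
    encoding = encoding or sys.stdout.encoding or "ascii"
    # Round-trip through the console encoding: characters it supports
    # survive unchanged, the rest become \uXXXX style escapes.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


# cp437 is the code page named in the traceback; it has no star character.
print(console_safe("name \u2605\u2605", "cp437"))
```

Passing `errors="replace"` instead would print `?` for each unsupported character, which is shorter but loses the information about which character it was.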
Doldol
Senior Member
Posts: 200
Joined: Sat Jul 07, 2012 7:09 pm
Location: Belgium

Postby Doldol » Sun May 04, 2014 7:55 pm

If the goal is to print, wouldn't this be what he's after then?

Syntax: Select all

data = pickle.load(open('playerdict.txt','rb'), encoding="bytes")


Preserve the data as bytes instead of decoding it as UTF-8 first? But I'm not an expert at this.
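For reference, with `encoding="bytes"` every 8-bit string from a Python 2 pickle comes back as a `bytes` object, dictionary keys included, so the data still has to be decoded before it can be used or printed as text. The sketch below assumes the strings hold UTF-8. `PY2_PICKLE` is a hand-written stand-in for the real file and `decode_tree` is a hypothetical helper.

```python
import pickle

# Hand-written protocol-2 pickle equivalent to the Python 2 dict
# {'name': 'Qu\xc3\xa9bec'}. Assumption: stands in for playerdict.txt.
PY2_PICKLE = b"\x80\x02}U\x04nameU\x07Qu\xc3\xa9becs."

# Keys and values are both left as raw bytes.
raw = pickle.loads(PY2_PICKLE, encoding="bytes")


def decode_tree(obj, encoding="utf-8"):
    """Recursively decode bytes found in dicts, lists and tuples."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {
            decode_tree(key, encoding): decode_tree(value, encoding)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return type(obj)(decode_tree(item, encoding) for item in obj)
    return obj


decoded = decode_tree(raw)
```

Loading as bytes is mainly useful when the stored encoding is unknown or mixed, since it postpones the decoding decision instead of failing during the load.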

Postby arawra » Sun May 04, 2014 8:03 pm

Doldol wrote: If the goal is to print, wouldn't this be what he's after then?

Syntax: Select all

data = pickle.load(open('playerdict.txt','rb'), encoding="bytes")


Preserve the data as bytes instead of decoding it as UTF-8 first? But I'm not an expert at this.


This will open your file in UTF-8 but it won't change the encoding of the strings it contains.


Answer was in previous message.
