Porting Python Code¶
Introduction¶
As of January 2015, WindRiver python code has been made compatible with both the python2 and python3 interpreters. This document lists the main problems we faced during the migration.
Many sites already list what to do, depending on the philosophy of the migration.
We decided to keep a single code base that is compliant with both python2 and python3, using the 2to3 utility.
Before using 2to3¶
The tool introduced some errors:
print¶
When print is used as a statement, the file produced by 2to3 is empty. A simple file containing:
def myprint(message):
    print '** ' + str(message) + '**'
Is emptied by the 2to3 tool, and only contains:
Non
To fix that, first change all the print calls into function calls rather than statements:
def myprint(message):
    print('** ' + str(message) + '**')
This should be enough.
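As a small illustration (the format_message helper is a hypothetical name introduced here, not part of the original code), keeping print to a single parenthesized argument should give the same output on both interpreters, since python 2 reads the parentheses as a plain expression:

```python
def format_message(message):
    # Build the whole line first so that print receives exactly one argument.
    return '** ' + str(message) + '**'


def myprint(message):
    # One parenthesized argument: a function call on python 3,
    # a print statement followed by an expression on python 2.
    print(format_message(message))


myprint('done')
```

With several comma-separated arguments, python 2 would print a tuple instead, unless the module starts with from __future__ import print_function.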
raise¶
The issue introduced by 2to3 here is roughly the opposite of the print problem.
raise is a statement, not a function. Removing the parentheses around its argument should fix the problem. If you do not, then the following code:
raise(MyException.message('My message'))
Is truncated to:
raise MyException
by 2to3. To solve that, it is better to write:
raise MyException.message('My message')
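A minimal sketch of the corrected form, assuming MyException.message is a class method that builds the exception instance (its definition is not shown in this document, so the one below is hypothetical):

```python
class MyException(Exception):
    @classmethod
    def message(cls, text):
        # Hypothetical factory: returns an exception instance carrying the text.
        return cls(text)


def fail():
    # raise used as a statement, without enclosing parentheses
    raise MyException.message('My message')


try:
    fail()
except MyException as exc:
    # "except ... as ..." is accepted by python 2.6+ and python 3
    caught = str(exc)
```

Written this way, the statement is valid on both interpreters and leaves nothing for 2to3 to rewrite.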
Running 2to3¶
After running 2to3 on the python tree, the major changes are:
Old code | New code | Potential issues |
---|---|---|
unicode() | str() | Unicode disappeared |
dict.keys() | list(dict.keys()) | Line too long |
cStringIO | io.StringIO | String encoding |
StandardError | Exception | |
long | int | Data type check |
basestring | str | Data type check |
xrange | range | |
__nonzero__(self): | __bool__(self): | NonZero disappeared |
Issues¶
Unicode disappeared¶
The unicode type disappeared, so neither
>>> unicode(x)
nor
>>> isinstance(x, unicode)
is possible with python 3. One of the main issues this raises is the use of StringIO objects.
Python 2 | Python 3 |
---|---|
>>> import io
>>> f = io.StringIO()
>>> f.write('toto')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'
>>> f.write('toto'.decode('utf-8'))
4L
|
>>> import io
>>> f = io.StringIO()
>>> f.write('toto')
4
>>> f.write('toto'.decode('utf-8'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
|
To write compatible code, a subclass of io.StringIO must be created:
>>> class CompatStringIO(io.StringIO):
...     def write(self, s):
...         if hasattr(s, 'decode'):
...             # Should be python 2
...             return int(super(CompatStringIO, self).write(s.decode('utf-8')))
...         else:
...             return super(CompatStringIO, self).write(s)
...     def getvalue(self):
...         return str(super(CompatStringIO, self).getvalue())
...
>>> f = CompatStringIO()
>>> f.write('mine')
4
>>> f.getvalue()
'mine'
>>> f.close()
Line too long¶
After the changes, some lines may become too long to be PEP8 compliant. Some code modifications may be needed to keep the code PEP8 compliant.
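One way to win characters back, sketched here on a hypothetical settings dictionary, is to drop the .keys() call where it is redundant: iterating over a dictionary already yields its keys on both interpreters.

```python
settings = {'beta': 2, 'alpha': 1}

# Form produced by 2to3
keys_converted = list(settings.keys())

# Shorter equivalents
keys_short = list(settings)
keys_sorted = sorted(settings)

# Plain iteration needs neither list() nor .keys()
total = sum(settings[name] for name in settings)
```

The explicit list() is only needed when the keys are indexed, or when the dictionary is modified while looping over it.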
String encoding¶
This was the major threat. Briefly:
Python 2 | Python 3 | Compatible code |
---|---|---|
>>> e = bytearray([55, 56, 57])
>>> e
bytearray(b'789')
>>> str(e)
'789'
>>>
|
>>> e = bytearray([55, 56, 57])
>>> e
bytearray(b'789')
>>> str(e)
"bytearray(b'789')"
>>>
|
>>> e = bytearray([55, 56, 57])
>>> e
bytearray(b'789')
>>> str(e.decode())
'789'
>>>
|
To get the same behaviour, decode() must be called explicitly. This works for both python 2 and 3.
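The same idea can be wrapped in a small helper. This is only a sketch: to_text and its encoding parameter are hypothetical names, and UTF-8 is an assumed default.

```python
def to_text(data, encoding='utf-8'):
    # bytes and bytearray both provide decode() on python 2 and python 3
    # (on python 2, bytes is simply an alias of str).
    if isinstance(data, (bytes, bytearray)):
        return data.decode(encoding)
    # Anything else is assumed to be text already.
    return data


from_array = to_text(bytearray([55, 56, 57]))
from_bytes = to_text(b'789')
```

Calling the helper at the boundaries where binary data enters the program keeps str() conversions out of the rest of the code.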
Data type check¶
In some cases, data manipulation may depend on the data type. As some types have disappeared in python 3 (long, unicode, basestring…), checking for the data type needs some attention.
Running 2to3 converts the following Python 2 code into python 3:
Python 2 | Python 3 |
---|---|
>>> isinstance(u'mystr', basestring)
|
>>> isinstance(u'mystr', str)
|
But the two forms do not behave the same way, in particular when the converted code is run with a python 2 interpreter:
Python 2 | Python 3 |
---|---|
>>> isinstance(u'mystr', basestring)
True
>>> isinstance(u'mystr', str)
False
|
>>> isinstance(u'mystr', basestring)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'basestring' is not defined
>>> isinstance(u'mystr', str)
True
|
To make this work for both python 2 and 3, and keep the same behavior, a small piece of code checking the interpreter version may be needed:
>>> import sys
>>> if sys.version_info[0] >= 3:
...     strings = (str,)
...     ints = (int,)
... else:
...     ints = (int, long)
...     strings = (basestring,)
...
>>> isinstance(u'mystr', strings)
True
The same problem applies to int and long values, which may be gathered in an ints variable.
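A short sketch of how the two tuples might be used together. The describe function is a hypothetical example, not part of the migrated code:

```python
import sys

if sys.version_info[0] >= 3:
    strings = (str,)
    ints = (int,)
else:
    strings = (basestring,)  # only evaluated on python 2
    ints = (int, long)       # only evaluated on python 2


def describe(value):
    # isinstance accepts a tuple of types, so one check covers both interpreters
    if isinstance(value, strings):
        return 'string'
    if isinstance(value, ints):
        return 'integer'
    return 'other'


kinds = [describe('mystr'), describe(2 ** 70), describe(1.5)]
```

Note that 2 ** 70 is a long on python 2 and an int on python 3; the ints tuple hides that difference.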
NonZero disappeared¶
With python 2, the object method called by the bool() function is object.__nonzero__(). This has disappeared with python 3, and the method called by bool() is now object.__bool__().
Running 2to3 replaces object.__nonzero__() with object.__bool__(), which is a problem when using a python 2 interpreter.
Python 2 | Python 3 |
---|---|
>>> class Test(object):
...     def __init__(self):
...         self._value = 0
...     def __nonzero__(self):
...         return self._value != 0
...
>>> bool(Test())
False
|
>>> class Test(object):
...     def __init__(self):
...         self._value = 0
...     def __nonzero__(self):
...         return self._value != 0
...
>>> bool(Test())
True
|
In order to write code compatible with both python 2 and 3, both __nonzero__() and __bool__() should be present.
>>> class Test(object):
...     def __init__(self):
...         self._value = 0
...     def __nonzero__(self):
...         return self._value != 0
...     def __bool__(self):
...         return self.__nonzero__()
...
>>> bool(Test())
False
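A slightly shorter variant of the same idea, shown here as a sketch, defines the python 3 method and binds the python 2 name to it with a class-level alias. The value parameter of __init__ is an addition made for this example.

```python
class Test(object):
    def __init__(self, value=0):
        self._value = value

    def __bool__(self):
        # Looked up by bool() on python 3
        return self._value != 0

    # Looked up by bool() on python 2; same function object as __bool__
    __nonzero__ = __bool__


empty = bool(Test())
filled = bool(Test(3))
```

Since 2to3 renames __nonzero__ definitions, the alias has to be checked again after each run of the tool.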
Cmp Disappeared¶
With python3, the cmp() function has disappeared, along with the associated __cmp__(self, other) methods. Ordering should instead be implemented with the rich comparison methods, such as __lt__(self, other). So, for objects implementing only the __cmp__(self, other) method, comparison is broken, as shown below:
Python 2 | Python 3 |
---|---|
>>> class Test(object):
...     def __init__(self):
...         self._value = 0
...     def __cmp__(self, other):
...         if self._value < other._value:
...             return -1
...         elif self._value > other._value:
...             return 1
...         return 0
...
>>> a = Test()
>>> # comparison tests
... b = Test()
>>> b._value = 1
>>> a < b
True
|
>>> class Test(object):
...     def __init__(self):
...         self._value = 0
...     def __cmp__(self, other):
...         if self._value < other._value:
...             return -1
...         elif self._value > other._value:
...             return 1
...         return 0
...
>>> a = Test()
>>> # comparison tests
... b = Test()
>>> b._value = 1
>>> a < b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: Test() < Test()
|
A possible fix, so that this class can be compared with both python 2 and 3, is to also define the __lt__(self, other) method in the object. This covers the < operator (and > through reflection); the other operators need their own methods.
>>> class Test(object):
...     def __init__(self):
...         self._value = 0
...     def __cmp__(self, other):
...         if self._value < other._value:
...             return -1
...         elif self._value > other._value:
...             return 1
...         return 0
...     def __lt__(self, other):
...         # Any negative result from __cmp__ means "less than"
...         return self.__cmp__(other) < 0
...
>>> a = Test()
>>> # comparison tests
... b = Test()
>>> b._value = 1
>>> a < b
True
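When all comparison operators are needed, one option is functools.total_ordering, available in python 2.7 and python 3.2 onwards. The sketch below is an alternative to the __cmp__ based class above, not the approach used during the migration; the value parameter is added for the example.

```python
import functools


@functools.total_ordering
class Test(object):
    def __init__(self, value=0):
        self._value = value

    def __eq__(self, other):
        return self._value == other._value

    def __ne__(self, other):
        # Defined explicitly: python 2 does not derive != from __eq__
        return not self == other

    def __lt__(self, other):
        return self._value < other._value


a = Test(0)
b = Test(1)
results = (a < b, a <= b, b > a, b >= a, a != b, a == Test(0))
```

The decorator derives <=, > and >= from __eq__ and __lt__. On python 3, defining __eq__ without __hash__ makes instances unhashable, which matters if they are used as dictionary keys or set members.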
Division returns float values¶
When dividing two integers, python2 returns an integer, whereas python3 always returns a float. This may be problematic when the result of a division is used as an index into a sequence:
Python 2 | Python 3 |
---|---|
>>> l = [1, 2, 3]
>>> a = 3/2
>>> a
1
>>> l[a]
2
|
>>> l = [1, 2, 3]
>>> a = 3/2
>>> a
1.5
>>> l[a]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not float
|
A simple way to have this code work on both python 2 and 3 is to call the int function.
>>> l = [1, 2, 3]
>>> a = int(3/2)
>>> a
1
>>> l[a]
2
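An alternative sketch, not taken from the original migration, uses the floor division operator //, which returns an integer for integer operands on both interpreters:

```python
items = [1, 2, 3]

# Floor division returns an int on both python 2 and python 3.
middle = len(items) // 2
value = items[middle]

# Floor division rounds towards minus infinity, as python 2's "/" did
# for integer operands.
floored = -3 // 2
```

The two approaches differ for negative operands: int(-3/2) gives -2 on python 2 but -1 on python 3, because int() truncates the float towards zero, while -3 // 2 gives -2 on both.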
Conclusion¶
Writing code which is compatible with both python 2 and 3 interpreters is not that simple, and it exposes a lot of issues around string conversions. Those issues may be hiding deeper problems, so although the effort is time-consuming, it may unveil problems which would otherwise take ages to investigate.
A common way to solve the biggest threats is to write a compat.py file and define the common types there:
import io
import sys


class CompatStringIO(io.StringIO):
    def write(self, s):
        if hasattr(s, 'decode'):
            # Byte string (e.g. python 2 str): decode before writing
            return int(super(CompatStringIO, self).write(s.decode('utf-8')))
        else:
            return super(CompatStringIO, self).write(s)

    def getvalue(self):
        return str(super(CompatStringIO, self).getvalue())


if sys.version_info[0] >= 3:
    import builtins
    import queue as queuemodule
    strings = (str,)
    ints = (int,)
else:
    import __builtin__ as builtins
    import Queue as queuemodule
    ints = (int, long)
    strings = (basestring,)


def bytes2str(data):
    # Indexing python 3 bytes yields ints; indexing a python 2 str yields characters.
    # Empty input matches neither branch and returns None.
    if data and isinstance(data[0], int):
        return ''.join('%c' % (chr(b)) for b in data)
    elif data:
        return str(data)


StringIO = CompatStringIO
builtinlist = builtins.list
queue = queuemodule
Using this compat.py module may then look like:
from compat import StringIO
b = StringIO()
b.write('mine')
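As an alternative to testing sys.version_info, a module such as compat.py could also pick the right names by trying the python 3 spelling first. This is only a sketch of that pattern, not what the module above does, and text_types is a hypothetical name:

```python
try:
    import queue  # python 3 name
except ImportError:
    import Queue as queue  # python 2 name

try:
    text_types = (basestring,)  # exists on python 2 only
except NameError:
    text_types = (str,)  # python 3

pending = queue.Queue()
pending.put('job')
first = pending.get()
```

This style tests for the feature itself rather than for a version number, at the cost of slightly more verbose code.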