h***@public.gmane.org
2013-03-02 03:57:14 UTC
Status: New
Owner: ----
New issue 220 by st...-D1FsSgkXXbW1h8DMDp+***@public.gmane.org: cannot read html5lib in jython
http://code.google.com/p/html5lib/issues/detail?id=220
What steps will reproduce the problem?
Reproducible in Jython 2.5.2 and Jython 2.7b1:
import html5lib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lib/html5lib/__init__.py", line 14, in <module>
    from html5parser import HTMLParser, parse, parseFragment
  File "lib/html5lib/html5parser.py", line 33, in <module>
    import inputstream
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 48-54: illegal Unicode character
What is the expected output? What do you see instead?
Expected: html5lib imports without error. Instead, Jython cannot read inputstream.py.
Please provide any additional information below.
inputstream.py contains Unicode character literals in the range
0xD800-0xDFFF that are not part of a surrogate pair ("unpaired
surrogates"), and Jython's 'unicodeescape' codec rejects them.
The Jython bug for this has been closed as wont-fix: http://bugs.jython.org/issue1836
It may be necessary to modify inputstream.py so that it does not use these
Unicode character literals when running in Jython.
n.b. a test for Jython:
import platform
JYTHON = (platform.system() == 'Java')
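One possible shape for that change, as a rough sketch only: build the surrogate range at runtime instead of writing it as a literal, and leave it out when running in Jython. The name make_invalid_char_re and the control-character range are made up for illustration and are not taken from inputstream.py. Whether Jython accepts unichr() for these code points is an assumption I have not checked, which is why the sketch skips the range there entirely.

```python
import platform
import re

JYTHON = (platform.system() == 'Java')

try:
    unichr  # Python 2 and Jython
except NameError:
    unichr = chr  # Python 3

def make_invalid_char_re(include_surrogates):
    # Hypothetical helper, not part of html5lib.
    # The base class is an illustrative range of control characters only.
    char_class = u"[\\x01-\\x08"
    if include_surrogates:
        # Build the range from code points at runtime so the source file
        # contains no surrogate escape literal for the compiler to decode.
        char_class += unichr(0xD800) + u"-" + unichr(0xDFFF)
    return re.compile(char_class + u"]")

# Leave the surrogate range out when running in Jython.
invalid_char_re = make_invalid_char_re(include_surrogates=not JYTHON)
```

With this, the module at least imports in Jython, at the cost of not detecting unpaired surrogates there.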
--
You received this message because this project is configured to send all
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings
--
You received this message because you are subscribed to the Google Groups "html5lib-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to html5lib-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To post to this group, send an email to html5lib-discuss-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
Visit this group at http://groups.google.com/group/html5lib-discuss?hl=en-GB.
For more options, visit https://groups.google.com/groups/opt_out.