Instances clean the document of each of the possible offending
elements. The cleaning is controlled by attributes; you can
override attributes in a subclass, or set them in the constructor.
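
The two configuration routes described above can be sketched with a small stand-in class. This is a minimal illustration under stated assumptions, not the library's code: `MiniCleaner` and `StrictCleaner` are hypothetical names, only a handful of the documented attributes are reproduced, and the rejection of unknown keywords is an assumed behaviour.

```python
class MiniCleaner:
    """Hypothetical stand-in, not the documented class: defaults live on the class."""

    scripts = True
    comments = True
    style = False
    add_nofollow = False

    def __init__(self, **kw):
        # Accept only keywords that name an existing attribute, so a typo
        # such as `stlye=True` fails loudly instead of being ignored.
        for name, value in kw.items():
            if not hasattr(self, name):
                raise TypeError(f"Unknown parameter: {name}={value!r}")
            setattr(self, name, value)


class StrictCleaner(MiniCleaner):
    # Subclass route: change the defaults for every instance.
    style = True
    add_nofollow = True


per_instance = MiniCleaner(style=True)   # constructor route
strict = StrictCleaner(comments=False)   # both routes combined
```

The constructor route changes one instance only (`MiniCleaner.style` stays `False`), while the subclass route changes the default for everything built from `StrictCleaner`.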

__init__(self, **kw)
    x.__init__(...) initializes x; see help(type(x)) for signature

allow_follow(self, anchor)
    Override to suppress rel="nofollow" on some anchors.

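One way such an override might be used is sketched below. Several things here are assumptions: that a true return value means the anchor is left without rel="nofollow", that the hook receives an element with a `get` method, and the `TRUSTED_HOSTS` set and `add_nofollow` helper, which are hypothetical. The standard-library `xml.etree.ElementTree` stands in for the real document tree.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

TRUSTED_HOSTS = {"example.org"}  # hypothetical list of hosts we vouch for


def allow_follow(anchor):
    # Return True to leave the anchor alone (assumed meaning of the hook).
    host = urlparse(anchor.get("href", "")).hostname
    return host in TRUSTED_HOSTS


def add_nofollow(root):
    # Hypothetical driver: mark every anchor that the hook does not exempt.
    for anchor in root.iter("a"):
        if allow_follow(anchor):
            continue
        rel = anchor.get("rel", "").split()
        if "nofollow" not in rel:
            rel.append("nofollow")
        anchor.set("rel", " ".join(rel))


root = ET.fromstring(
    '<div><a href="http://example.org/a">in</a>'
    '<a href="http://spam.test/b">out</a></div>')
add_nofollow(root)
links = [(a.get("href"), a.get("rel")) for a in root.iter("a")]
```
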
kill_conditional_comments(self, doc)
    IE conditional comments basically embed HTML that the parser
    doesn't normally see. We can't allow anything like that, so
    we'll kill any comments that could be conditional.

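A rough sketch of how a comment could be recognised as "possibly conditional" follows. The regular expression and the `looks_conditional` helper are assumptions made for illustration; the pattern actually used by the method is not shown on this page.

```python
import re

# Assumed pattern: "[if <anything>]>" at any position in the comment text.
CONDITIONAL_COMMENT = re.compile(r"\[if[\s\n\r]+.*?\][\s\n\r]*>", re.I | re.S)


def looks_conditional(comment_text):
    """Return True if the comment body could be an IE conditional comment."""
    return CONDITIONAL_COMMENT.search(comment_text) is not None


# Comment bodies as a parser would hand them over (without <!-- and -->).
comments = [
    "[if IE]><script>alert(1)</script><![endif]",
    "[if lt IE 9]><link href='old.css'><![endif]",
    " just a note for maintainers ",
]
flags = [looks_conditional(text) for text in comments]
```

Erring on the side of removal matches the docstring's intent: anything that could be conditional is treated as if it were.
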
_kill_elements(self, doc, condition, iterate=None)

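This method carries no description here, but its signature suggests a "remove every element for which `condition` is true" helper, with `iterate` restricting which elements are visited. The sketch below works under that assumption only; `kill_elements` is a hypothetical standalone function built on the standard-library `xml.etree.ElementTree`, not the method itself.

```python
import xml.etree.ElementTree as ET


def kill_elements(root, condition, iterate=None):
    """Remove every element matching `condition`, keeping trailing text."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # Collect first, remove afterwards: mutating the tree while iterating
    # over it could skip elements.
    doomed = [el for el in root.iter(iterate)
              if el is not root and condition(el)]
    for el in doomed:
        parent = parent_of[el]
        if el.tail:
            # Hand the trailing text to the previous sibling, or to the
            # parent's text if there is none, so it survives the removal.
            index = list(parent).index(el)
            if index == 0:
                parent.text = (parent.text or "") + el.tail
            else:
                previous = parent[index - 1]
                previous.tail = (previous.tail or "") + el.tail
        parent.remove(el)


root = ET.fromstring(
    "<div><p>keep</p><script>bad()</script>tail<p>also</p></div>")
kill_elements(root, lambda el: el.tag == "script")
result = ET.tostring(root, encoding="unicode")
```
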
_substitute_comments(...)
    sub(repl, string[, count = 0]) --> newstring
    Return the string obtained by replacing the leftmost non-overlapping
    occurrences of pattern in string by the replacement repl.

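The docstring above is that of a compiled pattern's `sub` method, which suggests the attribute is a bound `sub` of some regular expression. The sketch below assumes that expression matches CSS comments; the pattern itself is a guess and is not taken from this page.

```python
import re

# Assumed pattern: a CSS comment, matched non-greedily across newlines.
substitute_comments = re.compile(r"/\*.*?\*/", re.S).sub

# Called like any sub(): replacement first, then the string to process.
cleaned = substitute_comments("", "expre/* hidden */ssion(alert(1))")

# The optional count limits how many leftmost matches are replaced.
first_only = substitute_comments("", "a/*1*/b/*2*/c", 1)
```
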
_has_sneaky_javascript(self, style)
    Depending on the browser, stuff like e x p r e s s i o n(...)
    can get interpreted, or expre/* stuff */ssion(...). This
    checks for attempts to do stuff like this.

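The kind of check described above could look roughly like this. It is a simplified sketch, not the method's actual body: the normalisation steps, both regular expressions and the two keywords searched for are assumptions chosen to cover the obfuscations the docstring mentions.

```python
import re

_strip_comments = re.compile(r"/\*.*?\*/", re.S).sub  # assumed pattern
_strip_whitespace = re.compile(r"\s+").sub            # assumed pattern


def has_sneaky_javascript(style):
    """Return True if the CSS text hides a script-like construct."""
    style = _strip_comments("", style)    # expre/* x */ssion -> expression
    style = style.replace("\\", "")       # remove backslashes used as escapes
    style = _strip_whitespace("", style)  # e x p r e s s i o n -> expression
    style = style.lower()
    return "javascript:" in style or "expression(" in style


spaced = has_sneaky_javascript("width: e x p r e s s i o n(alert(1))")
commented = has_sneaky_javascript("width: expre/* stuff */ssion(alert(1))")
harmless = has_sneaky_javascript("color: red")
```

Normalising first and searching afterwards keeps the check short: each obfuscation is undone once, and a plain substring test does the rest.
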
Inherited from object:
    __delattr__, __format__, __getattribute__, __hash__, __new__,
    __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__,
    __str__, __subclasshook__

scripts = True
javascript = True
comments = True
style = False
links = True
meta = True
page_structure = True
processing_instructions = True
embedded = True
frames = True
forms = True
annoying_tags = True
remove_tags = None
allow_tags = None
kill_tags = None
remove_unknown_tags = True
safe_attrs_only = True
safe_attrs = frozenset(['abbr', 'accept', 'accept-charset', 'a...
add_nofollow = False
host_whitelist = ()
whitelist_tags = set(['embed', 'iframe'])
_tag_link_attrs = {'a': 'href', 'applet': ['code', 'object'], ...